Image-based Analysis to Study Plant Infection with Human Pathogens

Our growing awareness that contaminated plants, fresh fruits and vegetables are responsible for a significant proportion of food poisoning with pathogenic microorganisms reinforces the need to understand the interactions between plants and human pathogens. Today we understand that those pathogens do not merely survive on or within plants; they actively infect plant organisms by suppressing their immune system. Studies on the infection process and disease development have mainly used physiological, genetic, and molecular approaches, and image-based analysis adds another method to this toolbox. Employed as an observational tool, it has the potential for objective and high-throughput approaches, and together with other methods it will very likely become part of data fusion approaches in the near future.


Phytopathometry in Use
Identification and quantification of plant diseases are required for adequate plant protection, the determination of crop losses, and the design of breeding strategies in agriculture [1]. The use of images in disease control and survey has a long tradition. Already 90 years ago, aerial pictures taken from airplanes were used to study crop diseases in fields in the USA [2,3]. Since then, image-based detection of disease symptoms has constantly improved. Today, not only detection but also very sophisticated and informative analysis is possible. Such analyses can, and do, use the full spectrum of electromagnetic radiation. However, the vast majority use images based on the UV, visible and infrared spectra. Image-based analysis is also a powerful tool in studies of plant physiology, especially of the responses to pathogen attack at the organism or tissue level.
The different aspects of image-based detection and measurement of disease symptoms in plants are under constant development and were reviewed in several recent publications [1,4,5]. In 1966, E. C. Large introduced the general term phytopathometry to describe the quantification of plant disease [6]. A few decades later, Nutter and coworkers, together with the American Phytopathological Society, defined several other terms related to measurable symptoms of plant diseases [7,8]. Among them are "disease severity", the proportion or percentage of a sample unit (fruit, plant or field) showing symptoms; "disease incidence", the proportion of diseased individual plants or plant organs within the total number of assessed individuals; and "disease prevalence", the proportion of fields, areas or countries in which the disease was detected. The term "disease intensity", which relates to the amount of disease in the host population, was also introduced. Many studies concentrate on disease severity, describing the distribution of symptoms caused by pathogens on plant organs (leaves, stems, roots, etc.) or in plant populations at field, forest or grassland scale.
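The three proportions defined above can be written down directly. The following is a minimal sketch; the function names and the example counts are illustrative assumptions and do not come from the cited definitions, which only fix the meaning of each ratio.

```python
def _proportion(part, whole):
    """Return part / whole after basic sanity checks."""
    if whole <= 0:
        raise ValueError("the reference quantity must be positive")
    if not 0 <= part <= whole:
        raise ValueError("the part must lie between 0 and the reference quantity")
    return part / whole


def disease_severity(symptomatic_area, total_area):
    """Proportion of one sample unit (e.g. leaf pixels) that shows symptoms."""
    return _proportion(symptomatic_area, total_area)


def disease_incidence(diseased_units, assessed_units):
    """Proportion of diseased plants or organs among all assessed individuals."""
    return _proportion(diseased_units, assessed_units)


def disease_prevalence(fields_with_disease, fields_surveyed):
    """Proportion of fields (or areas) in which the disease was detected."""
    return _proportion(fields_with_disease, fields_surveyed)


# Hypothetical numbers: 1200 of 4800 leaf pixels are symptomatic,
# 20 of 80 plants are diseased, 3 of 12 fields are affected.
severity = disease_severity(1200, 4800)
incidence = disease_incidence(20, 80)
prevalence = disease_prevalence(3, 12)
```

In an image-based workflow, severity is the quantity most directly obtained from a classified leaf image, since it is simply the share of leaf pixels assigned to the symptomatic class.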
The visible symptoms observed on plants and caused by the propagation of a pathogen might be based on different physiological phenomena. For instance, fruit or leaf soft rot diseases caused by diverse bacteria from the Erwinia, Pseudomonas, Bacillus, or Clostridium groups [9] are the result of the disintegration of plant tissues by bacterial enzymes. These enzymes are secreted into the surroundings and cause destruction of the middle lamella, followed by maceration of cell walls and the cellular content. Many fungal pathogens exhibiting a necrotrophic lifestyle (Botrytis spp., Alternaria spp., or Rhizoctonia spp.) also rely on active degradation of host tissues, which causes clearly visible disease symptoms [10]. Such symptoms are often referred to as necrosis. On the other hand, biotrophic or hemibiotrophic pathogens may trigger intense activation of a defense mechanism known as the hypersensitive response (HR). HR occurs within a few hours or days after inoculation and results in localized cell death. Very often HR is the consequence of the so-called effector-triggered immunity (ETI), which occurs when the plant recognizes the effector proteins injected by the pathogen into the plant cell [11]. The function of this rapid cell death is to counteract the systemic spreading of the pathogen. Although necrosis and HR originate from different mechanisms, both result in a change in the appearance of leaves or other tissues. Those morphological differences can be easily visualized using the visible part of the electromagnetic spectrum by analog or digital photography [1]. Necrosis and HR are, however, not the only possible outcomes of a pathogen attack. Upon recognition of a pathogen, plants may close their stomata and thereby restrict access to the mesophyll tissue [12][13][14]. Because stomata mediate gas exchange and control evaporation from the inner leaf surface, stomatal closure results in an increase of leaf temperature [15]. Those temperature differences can be assessed using, for example, infrared imaging. In the same manner, pathogens affecting plant metabolism can influence the content of chlorophyll and other pigments, which in turn changes the plant's autofluorescence and can be visualized using near-UV spectrum imaging. Taken together, plant physiology and plant reactions to pathogen attack offer multiple possibilities for an image-based assessment of changes and hence for the detection and measurement of disease symptoms.

Plants as Source for Human Pathogens
Numerous pathogenic bacteria seem to have a fairly broad spectrum of host organisms. Among these, Salmonella spp., Pseudomonas spp., Klebsiella spp., enterohemorrhagic Escherichia coli (EHEC), and others efficiently proliferate in animal and plant organisms [16][17][18]. Salmonella enterica is one of the main causes of food-borne poisoning today. Salmonellosis is a constant threat to human health, not only in developing but also in developed countries. A large study conducted in 2007 showed that in the UK, the Netherlands, Germany and Ireland 0.1 to 2.3% of pre-cut products were contaminated with Salmonella bacteria [19]. Another European study from 2009 revealed that 2.5% of fresh produce was contaminated with Salmonella [19]. In the USA, one out of six citizens is estimated to become infected by eating contaminated food [20]. Salmonella infections have not declined in the last 15 years, making the non-typhoidal strains the leading cause of food poisoning. In cases related to domestic food poisoning in the USA, salmonellosis was responsible for 35% of the hospitalizations and 28% of the deaths [21]. Poultry and eggs are commonly associated with Salmonella outbreaks; however, 20% of infections from 2004 to 2008 were linked to other sources, including sprouts, leafy greens, roots, grain-beans, fruits and nuts [20]. The assumption that Salmonella passively survives on plants after occasional contamination has changed in the last few years. Research on the interaction between plants and these bacteria suggests an active infection process [22][23][24][25][26][27][28][29][30][31].
To overcome the host immune system, S. enterica uses diverse effector proteins that interact with components of the host immune system and inhibit or abolish its action. Effectors are usually injected into the host cytoplasm by Type III Secretion Systems (T3SSs). These secretion apparatuses function as molecular needles and allow the translocation of bacterial proteins (e.g. effectors) into the host cytoplasm [32]. Salmonella has two T3SSs, which secrete different yet overlapping sets of effector proteins that function at different stages of the infection. Given its importance for human health, the suppression of the animal immune system by Salmonella has been studied very intensely. To date, 44 effectors that are injected into animal host cells are known, and for many of them the function and the target proteins have been identified [33]. Interestingly, bacterial effectors often target signaling cascades that are important regulators of the immune response in both animals and plants.
For instance, the SpvC effector from Salmonella spp. is a phosphothreonine lyase that dephosphorylates and thereby deactivates the ERK1/2 kinases, key regulators of the animal immune system [34][35][36]. Another effector, the integral membrane protein SseF [37], is responsible together with SseG for the formation of Salmonella-induced filaments, elongated tubular structures within which the bacteria reside in animal cells. In plant cells, SseF is recognized and triggers the above-discussed HR [38].
Several Salmonella effectors have homologues in phytopathogenic bacteria; for example, HopAI1 is a homologue of SpvC in Pseudomonas spp. [39], and HopAO1 is a functional homologue of SptP [40]. Even so, the function of Salmonella proteins during the inactivation of the plant immune system remains elusive. Nonetheless, it is very tempting to speculate that the biochemical features of those effectors are conserved between animal and plant hosts. This would provide Salmonella and other pathogenic bacteria with an efficient toolbox for the suppression of the plant immune system [18]. Such suppression has already been reported. A recent study on the interaction between tobacco plants and Salmonella Typhimurium showed that, in contrast to living bacteria, dead bacteria elicited an oxidative burst and pH changes in tobacco cells [31]. A similar response was provoked by the invA mutant, which lacks one of the T3SSs [31]. Those results suggest that Salmonella depends on the secretion of effectors to actively suppress tobacco immune responses. Two transcriptome analyses performed after inoculation of Arabidopsis plants with the wild-type S. Typhimurium strain 14028s and the T3SS mutant prgH revealed a similar scenario [30,41]. The prgH mutant, which like invA lacks one of the T3SSs, induced the expression of more genes than the wild-type bacteria, and the majority of these genes were related to defense responses, suggesting that the wild-type bacteria are able to suppress the expression of a set of defense-related genes. Moreover, mutants impaired in their T3SSs were less virulent towards Arabidopsis plants than wild-type bacteria [30,42].
Taken together, recently published results indicate that Salmonella uses plants as alternative hosts and that these bacteria could, similarly to the infection in animals, actively suppress the plant defense mechanisms. Whether these bacteria use the same or different effectors to achieve this goal is not yet clear. It seems reasonable, however, to conclude that Salmonella requires T3SSs during its interaction with plants.

How to Use Image-Based Analysis to Study Infection with Human Pathogens
Visible symptoms caused by Salmonella on plant leaves depend on several factors: i) the recognition of bacterial effectors, as in the case of SseF, which triggers the HR as part of the ETI response [38]; ii) the suppression of ETI and therefore of the HR, as indicated by the inability of T3SS mutants to do so [42,43]; iii) the serotype of the strain: strains belonging to the S. enterica serogroup E4 (O: 1, 3, 19), namely Cannstatt, Krefeld, Liverpool, and Senftenberg, induced chlorosis and wilting in Arabidopsis leaves, whereas strains lacking the O-antigen (Typhimurium, Enteritidis, Heidelberg and Agona, as well as strains of S. enterica subspecies arizonae and diarizonae) did not cause any visible symptoms [44]; and iv) no less important, the plant species itself: infiltration of tobacco, for example, had no impact on the macroscopic appearance of the leaves [31], whereas in Arabidopsis, infiltration with Salmonella eventually caused the death of infiltrated leaves [29,42].
Despite, or maybe because of, the multitude of interactions between this bacterium and plants, changes in the appearance of leaves and other tissues are a good way to address very different questions. Virulence towards plants is the obvious one. In addition, using the plethora of genetic tools available for many plants and pathogenic bacteria, recent studies have begun to uncover more detailed information on the interactions between plants and microorganisms previously considered human or animal pathogens (for review see [16,41,45,46,47]).

Visible Spectrum Analysis: Image-Based Classification
Plant leaves and other tissues can be analyzed efficiently using images taken with standard visible-spectrum cameras. Although acquiring images is nowadays easily accomplished, the automatic inspection of those data is still a challenging task. Usually, the algorithms involved need to perform several preprocessing steps before the actual classification can be done. These steps comprise a transformation of the color space, the detection of leaves in images (segmentation), the extraction of features from the raw image data, and finally the actual classification. The latter assigns a predefined class, e.g. healthy or infected, to each pixel. Below we describe how those steps can be implemented.

Color Spaces
Cameras designed for the visible spectrum usually deliver color images using the RGB color model with three main colors: red, green and blue [48]. However, this color model is not well suited for image analysis, since even slight variations of the illumination affect all color channels. Because of this problem, a color model transformation is usually applied as a preprocessing step. As suitable models, the HSI or HSV models (with the channels hue, saturation and intensity or value) or the I1I2I3 model have been proposed by many authors [1,42,49]. The drawback of the HSI and HSV models is the hue channel, which is defined via color angles in the range of 0° (red) to 360° (red), so that both extreme values correspond to the same color. In order to avoid this problem, the I1I2I3 color model was developed [50]. The transformation from RGB to I1I2I3 is linear and therefore very fast. The intensity information is transformed into the first channel. The relevant color information is encoded in the second and third channels (Fig. 1). With this preprocessing, the segmentation of the captured images into e.g. "leaf" and "non-leaf" areas is far more robust.
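The transformation can be sketched in a few lines. The coefficients below are an assumption: they follow the commonly cited linear definition of the I1I2I3 model, I1 = (R + G + B)/3, I2 = (R - B)/2 and I3 = (2G - R - B)/4, which is not spelled out in the text above. The helper hue_degrees is a hypothetical name used only to illustrate the wrap-around of the hue channel.

```python
import colorsys


def rgb_to_i1i2i3(r, g, b):
    """Linear RGB -> I1I2I3 transform (assumed, commonly cited coefficients)."""
    i1 = (r + g + b) / 3.0        # intensity
    i2 = (r - b) / 2.0            # red-blue color difference
    i3 = (2.0 * g - r - b) / 4.0  # green versus the mean of red and blue
    return i1, i2, i3


def hue_degrees(r, g, b):
    """Hue angle of an RGB color (components in [0, 1]) in degrees."""
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue * 360.0


# A greenish pixel: the color information ends up in I2 and I3.
leaf_pixel = rgb_to_i1i2i3(30, 120, 30)

# A gray pixel carries no color information: I2 and I3 are zero.
gray_pixel = rgb_to_i1i2i3(100, 100, 100)

# Two almost identical reds lie at opposite ends of the hue scale.
hue_a = hue_degrees(1.0, 0.0, 0.01)
hue_b = hue_degrees(1.0, 0.01, 0.0)
```

The last two lines illustrate why the hue channel is awkward for thresholding or clustering: the two reds differ by one percent in a single channel, yet their hue values are almost 360° apart.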

Segmentation
The problem of extracting relevant objects from an image can be seen as the segmentation of an image into two regions, foreground and background. All pixels labeled as foreground count as part of an object and are therefore candidates for further analysis. Image segmentation is a common task in computer vision, and many solutions have been proposed. The best solutions have been offered by variational approaches. Here, an energy or cost functional is formulated and minimized. The minimizer of this functional is the desired segmented image. Three main classes of variational approaches for image segmentation exist. The first one is level sets [51][52][53]. Their main advantage is that the energy functional is formulated and minimized in the continuous domain, i.e. the functional is solved independently of the actual image resolution. This enables more natural segmentation boundaries. On the other hand, the local minimization of the energy functional used within the level set framework does not necessarily lead to globally optimal solutions. The second class is graph cuts [54][55][56][57], with two main advantages: the computation time is very short, even for large images, and the obtained solution is guaranteed to lie within a defined bound around the globally optimal solution of the minimization problem. The main disadvantage of these approaches is the discrete formulation on a graph, which leads to discretization errors. The third class combines the benefits of the first two: total variation (TV) minimization, which uses the total variation norm. Chan et al. proposed this method in 2004 for the segmentation of intensity-based images using a transformed version of the Mumford-Shah model [58]. Since then many improvements have been presented, establishing TV as the standard technique in this field [59,60]. The only drawback of these approaches is their relatively high computational burden. Nevertheless, the computation can usually be scheduled as parallel tasks and processed on a graphics processing unit (GPU), which for most applications leads to acceptable execution times. In the case of leaf images, the best results were obtained using images of cut-off leaves on a rather homogeneous background. Recently, however, an approach was proposed for segmenting leaves in images of intact plants, aiming at the identification of leaf features such as trichomes or anthocyanin accumulation [61].
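The idea of segmentation as energy minimization can be illustrated with a deliberately simplified stand-in. The sketch below is not a TV, level set or graph cut method: it minimizes only the data term of a two-phase piecewise-constant energy, i.e. the sum of squared deviations of each pixel from the mean of its region, and it omits the boundary-length (regularization) term that the variational methods add. Without that term the alternating minimization reduces to a two-cluster k-means on pixel values. The function name and the convention that the brighter region is the foreground are assumptions made for illustration.

```python
def two_phase_segmentation(image, max_iterations=50):
    """Split a single-channel image (list of rows) into two regions.

    Alternates between (1) assigning each pixel to the region whose mean
    value is closer and (2) updating both region means. Returns the binary
    foreground mask and the two region means.
    """
    pixels = [value for row in image for value in row]
    c_bg, c_fg = min(pixels), max(pixels)  # initial region means

    def is_foreground(value):
        return (value - c_fg) ** 2 < (value - c_bg) ** 2

    for _ in range(max_iterations):
        foreground = [v for v in pixels if is_foreground(v)]
        background = [v for v in pixels if not is_foreground(v)]
        if not foreground or not background:
            break  # constant image: nothing to separate
        new_fg = sum(foreground) / len(foreground)
        new_bg = sum(background) / len(background)
        if new_fg == c_fg and new_bg == c_bg:
            break  # assignments no longer change
        c_fg, c_bg = new_fg, new_bg

    mask = [[is_foreground(value) for value in row] for row in image]
    return mask, c_fg, c_bg


# Toy 4 x 4 channel: a bright 2 x 2 "leaf" on a dark background.
toy_image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.8, 0.9, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
leaf_mask, leaf_mean, background_mean = two_phase_segmentation(toy_image)
```

Because every pixel is labeled independently of its neighbors, this sketch is sensitive to noise; the regularization term of the variational methods is what enforces spatially coherent regions with smooth boundaries.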

Features
Several approaches to classify leaf regions into different states, e.g. infected vs. non-infected, have been proposed. All of them require the transformation of the raw pixels into a feature space. This reduces the dimension and enables better decision boundaries. Features can be separated into two sets: pixel-wise and region-wise. Pixel-wise features usually use only the color information, e.g. the I2 and I3 values after color transformation [62]. Region-wise features summarize the information in the neighborhood around the central pixel, e.g. as a color histogram, first and second moments, or gradient distributions [63,64].
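As an illustration of region-wise features, the following minimal sketch computes the mean, the variance and a normalized histogram of one channel in a square window around a pixel. The window radius, the number of bins, the value range and the function name are illustrative assumptions; the second moment is taken here as the central moment (variance).

```python
def region_features(channel, row, col, radius=1, bins=4, value_range=(0.0, 1.0)):
    """Feature vector [mean, variance, histogram...] of a square window.

    `channel` is a single image channel given as a list of rows. The window
    is clipped at the image border, so border pixels use fewer neighbors.
    """
    height, width = len(channel), len(channel[0])
    values = [
        channel[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    count = len(values)
    mean = sum(values) / count                                # first moment
    variance = sum((v - mean) ** 2 for v in values) / count   # second central moment

    low, high = value_range
    histogram = [0] * bins
    for v in values:
        index = int((v - low) / (high - low) * bins)
        histogram[min(max(index, 0), bins - 1)] += 1  # clamp to valid bins
    histogram = [h / count for h in histogram]        # normalize to sum 1

    return [mean, variance] + histogram


# Toy channel with uniform values: variance 0, all mass in one histogram bin.
uniform_channel = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
center_features = region_features(uniform_channel, 1, 1)
```

A pixel-wise feature vector, in contrast, would simply consist of the channel values of the pixel itself, for example its I2 and I3 values.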

Classification
Given a feature set for the extracted image region, the image analysis algorithm has to assign the pixel or region to one or more classes, depending on the task. Typically this is done via supervised learning, for which a pre-labeled data set must exist. From this data set the algorithm learns how the different classes appear in images.
Two main classification approaches have been proposed in the field of image analysis:

• Neural Networks
Since the work of Rosenblatt, Neural Networks (NNs) have been a popular tool for classification in images [65]. In 2011, Al-Hiary and coworkers used a NN to distinguish between healthy and unhealthy leaf regions [63]. Since the 2000s, other methods have also been used to avoid the computational burden of NNs. However, the concept of deep learning has brought NNs back into the focus of current research, making them more efficient and applicable to various pattern recognition problems [66,67].

• Support Vector Machines
Since the work of Vapnik [68], Support Vector Machines (SVMs) have become a powerful tool for classification and regression analysis. Based on a training set, an SVM typically builds a hyperplane that separates the classes of the training set in the feature space. In addition, using the kernel trick, SVMs can be applied to data that are not linearly separable. In our previous report, an SVM was used to classify pixels belonging to infected and non-infected areas of Arabidopsis leaves inoculated with Salmonella. In this study, the accuracy of the SVM was higher than that of Bayesian classifiers [42]. Also in a study of diseases on grapes, an SVM performed better than the NN approach [69].
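To make the notion of a separating hyperplane concrete, the following minimal sketch trains a linear SVM by full-batch sub-gradient descent on the regularized hinge loss. This is only one of several ways to fit a linear SVM and is not the procedure of any study cited above; the function names, hyperparameters and the toy feature values are illustrative assumptions.

```python
def train_linear_svm(samples, labels, lam=0.01, learning_rate=0.1, epochs=200):
    """Fit w, b by minimizing lam/2 * |w|^2 + mean(max(0, 1 - y * (w.x + b))).

    `samples` is a list of feature vectors, `labels` a list of -1 / +1.
    """
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # sample violates the margin: hinge loss is active
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane w.x + b = 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy two-dimensional color features (made-up, standardized values):
# +1 = "unhealthy" pixels, -1 = "healthy" pixels.
features = [(2, 2), (3, 2), (2, 3), (-2, -2), (-3, -2), (-2, -3)]
classes = [1, 1, 1, -1, -1, -1]
weights, bias = train_linear_svm(features, classes)
new_pixel_class = predict(weights, bias, (2.5, 2.5))
```

For data that are not linearly separable in the original feature space, the same decision rule is applied after an implicit mapping defined by a kernel function; that case is not covered by this sketch.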

Using Image Analysis to Study Plant Defense Responses to Human Pathogens
The fact that plants, e.g. Arabidopsis, inoculated with human pathogens exhibit disease symptoms allows the use of image-based approaches to monitor the infection process. In addition, it allows studying the defense mechanisms employed by plants to fight the pathogen as well as the strategies used by those pathogens to suppress the plant immune system. One of the possible examples was studied during the inoculation of Arabidopsis plants with Salmonella mutants lacking the T3SS apparatus [42]. In this study, symptoms (defined as a color change of the observed leaves) caused by the different mutants were used to reveal that this bacterium might use a similar strategy to suppress human and plant immune systems.
Different color variation models can be employed to distinguish the "healthy" and "unhealthy" regions in leaf images. A probabilistic algorithm employing a Gaussian Mixture Model (GMM) and a Bayesian classifier for the classification of disease symptoms in Arabidopsis plants has already been presented [49]. However, results from Bayes-like classifiers can be inaccurate, because the estimation of a robust GMM from real data is not always possible. To overcome these limitations, a different classification strategy was proposed. In order to classify pixels of leaf images, the proposed algorithm used the color feature space as input information for an SVM [42]. An overview of the steps described in that paper is presented in Fig. 2. First, a segmentation method was applied to obtain a binary image with only foreground and background information. Each pixel belonging to the foreground region was then given as input to a linear SVM classifier, which predicted the class to which it belongs. After classification of all foreground pixels, neighborhood information was used to revise the result for pixels classified as "unhealthy". The GMM approach reached a correct classification rate of 91.5%; the SVM approach improved on this and achieved a correct classification rate of 95.8% (Fig. 3) [42]. Nonetheless, it should be noted that the results discussed above were obtained with a labeled data set. If only unlabeled data sets are available, Bayesian classifiers are very well applicable because they can be used within an unsupervised learning strategy [70].
Fig. 2. Overview of the algorithm proposed in [42]. An Arabidopsis leaf with an almost monochromatic background was the input for the algorithm. A segmentation method was applied to identify pixels belonging to the leaf. Those pixels were classified using a linear SVM classifier. The output of the classifier was further refined through a neighborhood-check method.
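The last step of this pipeline, the neighborhood check, can be sketched as follows. The exact rule used in [42] is not described here, so the criterion below is an assumption made for illustration: a pixel classified as "unhealthy" keeps its label only if at least a given number of its eight neighbors are also "unhealthy"; otherwise it is relabeled as "healthy". The label constants, the function name and the threshold are hypothetical.

```python
BACKGROUND, HEALTHY, UNHEALTHY = 0, 1, 2


def neighborhood_check(labels, min_unhealthy_neighbors=2):
    """Relabel isolated "unhealthy" pixels as "healthy".

    `labels` is a 2-D list of per-pixel class labels. The input is not
    modified; a refined copy is returned.
    """
    height, width = len(labels), len(labels[0])
    refined = [row[:] for row in labels]
    for r in range(height):
        for c in range(width):
            if labels[r][c] != UNHEALTHY:
                continue
            # Count "unhealthy" pixels among the (up to) eight neighbors.
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        if labels[rr][cc] == UNHEALTHY:
                            count += 1
            if count < min_unhealthy_neighbors:
                refined[r][c] = HEALTHY
    return refined


# Toy label image: one isolated "unhealthy" pixel and one 2 x 2 lesion.
classified = [
    [0, 1, 1, 1, 0],
    [1, 2, 1, 1, 1],
    [1, 1, 1, 2, 2],
    [1, 1, 1, 2, 2],
    [0, 1, 1, 1, 0],
]
cleaned = neighborhood_check(classified)
```

In the toy example the isolated pixel is relabeled, while the compact lesion is kept, which reflects the intuition that single-pixel detections are more likely to be classification noise than disease symptoms.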

Hyperspectral Analysis of Plant Diseases
In the field of remote sensing, hyperspectral analysis has become a major tool for the surveillance of large agricultural areas. Here, the cameras acquire images not only in the visible or the infrared spectrum, but in multiple spectral bands. The main advantage is that different physiological conditions of plants can be identified in different bands and the classification therefore becomes more reliable. Hyperspectral analysis can measure data related to growth, disease infestation, water availability, or fertilization. The Normalized Difference Vegetation Index (NDVI) [71] is used to determine whether living green vegetation is present in images. Besides this index, other indices have been proposed [72,73]. Hyperspectral sensors are usually big and expensive, and have therefore been deployed mainly on satellites. However, small hyperspectral cameras (with a limited number of bands) are now becoming available and are already used for plant inspection at a smaller scale [74]. Nevertheless, the potential of hyperspectral analysis has so far not been used in image-based studies of plant infections with human pathogens.
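The NDVI is computed per pixel from the reflectance in the near-infrared (NIR) and red bands as NDVI = (NIR - Red) / (NIR + Red). A minimal sketch follows; the function names, the handling of pixels without signal and the threshold value are illustrative assumptions, since suitable thresholds depend on the sensor and the scene.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index of one pixel, in [-1, 1]."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band (assumed convention)
    return (nir - red) / total


def vegetation_mask(nir_band, red_band, threshold=0.3):
    """Mark pixels whose NDVI exceeds an (arbitrary, illustrative) threshold."""
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Toy reflectance values: green vegetation reflects strongly in the NIR band
# and absorbs red light, whereas bare soil reflects both similarly.
nir_band = [[0.50, 0.30], [0.45, 0.25]]
red_band = [[0.10, 0.28], [0.08, 0.25]]
mask = vegetation_mask(nir_band, red_band)
```

Healthy green tissue yields values close to 1, whereas values near zero indicate soil or senescent material, which is why the index is used as a simple indicator of living vegetation.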

Conclusions
The potential of image-based analysis in the detection and monitoring of plant disease is well accepted. The use of these techniques to study plant infections with potentially human pathogenic microorganisms is a new and developing area. It is very tempting to speculate that in the near future, image analysis will be joined by other related techniques such as non-imaging spectroradiometry, high resolution mass spectrometry imaging [75], or high throughput protein-protein interactome studies [76]. Such combined systems will allow very informative data fusion approaches and are likely to advance our knowledge of those interactions. Such knowledge is of great importance for our own health and therefore for new strategies aiming at the prevention of food-borne diseases.
Fig. 3 (C): Higher accuracy can be achieved by using the SVM classifier.