Aberystwyth University Watching plants grow

Abstract: We present a comprehensive overview of image processing and analysis work done to support research into the model flowering plant Arabidopsis thaliana. Besides the plant's importance in biological research, using image analysis to obtain experimental measurements of it is an interesting vision problem in its own right, involving the segmentation and analysis of sequences of images of objects whose shape varies between individual specimens and also changes over time. While useful measurements can be obtained by segmenting a whole plant from the background, we suggest that the increased range and precision of measurements made available by leaf-level segmentation makes this a problem well worth solving. A variety of approaches have been tried by biologists as well as computer vision researchers. This is an interdisciplinary area and the computer vision community has an important contribution to make. We suggest that there is a need for publicly available datasets with ground truth annotations to enable the evaluation of new approaches and to support the building of training data for modern data-driven computer vision approaches, which are those most likely to result in the kind of fully automated systems that will be of use to biologists.


Introduction
The plant Arabidopsis thaliana (Arabidopsis) is an important subject in the biological sciences.
While it is not a useful crop plant (though it is related to cabbage) it has many advantages as a subject of study. It was the first plant to have its genome fully sequenced [1] and, being small and fast to grow to maturity (the first flowers might appear five weeks after sowing), it is easy to grow in large numbers for experimental study. One area for which it is well suited is the study of the phenotype, the characteristics of an organism as influenced by its genetic constitution (its genotype) and its environment. As the phenotype is more variable than the genotype, being affected by the environment and also the passage of time, its study requires a large amount of data. There is a great deal of interest in methods that help with the (non-destructive) acquisition and understanding of this volume of data [2]. Biologists need to measure variations in size between specimens, using such measures as fresh and dry weight and leaf area. Traditionally, taking these measurements entails harvesting the plant, so an experimenter can only get one set of measurements from each specimen. If all measurements are taken at the end of an experiment then differences in rates of growth will be lost [3]. As there is a close correlation between plant weight and leaf area [4], using images to measure leaf area gives biologists a useful measure of the size of the plant, and using images as the basis of measurement means measurements can be taken non-destructively. This leads to an explosion in the amount of data that can be obtained [5]. How best to process and analyse the resulting images is itself an open and active research area and so is of interest to specialists in computer vision as well as to biologists. The factors that make it an interesting computer vision problem include:

• The shape of a plant and its leaves varies from specimen to specimen.
• The shape of the plant and its leaves changes over time, both in the short term (the possibility of leaves being blown in the wind and, over a longer period, daily changes of leaf orientation from day to night) and in the long term (the plant's growth).
• As the plant grows, new leaves appear, changing the overall shape of the plant.
• As leaves appear, they start to overlap and occlude each other and knowledge of the extent of this occlusion is important to the accuracy of measurements obtained.
• It is a problem of tractable scale.
• A successful solution will be of interest to the biological community.

IET Computer Vision
Image analysis has been used in the study of Arabidopsis by computer vision researchers and biologists since 1992 [6], and the time is ripe for a comprehensive review of the approaches that have been adopted. We present such a review and make some suggestions as to what is required to advance the state of the art. Much work in this area is carried out using integrated phenotyping platforms that are capable of capturing images of the growing plants periodically, as well as weighing them (together with the soil and pots in which they are growing) and managing watering [7]. There is a review of how such platforms have been used in phenotyping in [8], but that does not discuss the image analysis techniques used. An introduction to the use of image analysis in plant phenotyping is to be found in [9]. That work does not concentrate on Arabidopsis, so while its introduction is broader in scope than the present paper, we present a fuller review of image analysis as applied to Arabidopsis.

Some biological background
As this paper will be read by members of the computer vision community rather than biologists, some biological background will help place the work reviewed in context. A successful approach to image analysis of Arabidopsis will give biologists a useful range of measurements, of a useful degree of accuracy, obtained from living plants, to inform study of their phenotypes. Phenotyping is the process of obtaining measurements of specimens. It should involve substantial populations to accommodate individual variation, hence the interest in "high throughput phenotyping" [10]. It is an important way of amassing data for phenomics, the study of the plant's growth, performance and composition [2]. In the case of plants, this involves growing large numbers of plants and periodically measuring them. If images can be used to obtain these measurements, the plant survives for more measuring in future. This is valuable as the passage of time is one of the factors that affects a phenotype, and how different phenotypes are so affected is of interest.
Arabidopsis is a "rosette plant", which means that a young plant grows its leaves from a very short stem in a circular pattern, with all the leaves at much the same height above the soil and oriented close to horizontal. This means that measurements can be taken using top-down 2D photographs, ignoring the effect of having leaves at different distances from the camera. However, as Fig. 1 shows, leaves overlap and occlude each other as new leaves grow immediately above older ones. Each leaf consists of two parts, the petiole ("leaf stem") and the blade. The image in Fig. 1 was obtained from the Photon Systems Instruments PlantScreen Phenotyping installation at Aberystwyth University. It is typical of the images obtained from plant phenotyping systems and of the visible light images in the publicly available datasets. Some experimenters use other approaches, such as growing the plants in vitro, in agar. There are example images of some other growth media in Wikimedia Commons (https://commons.wikimedia.org/wiki/Category:Arabidopsisthaliana). Besides growth, the leaves move between light and dark periods, being angled more upright in the dark. This upward orientation is known as "hyponasty" and these diurnal changes in hyponasty are of interest. A way of obtaining measurements from living plants is needed to study this behaviour, so this is another area where image analysis is used [11]. The opposite of hyponasty (that is, a downward curvature) is epinasty. After five or six weeks, the plant will grow a flower stem and produce flowers. This grows to a noticeable height, so top-down 2D photographs of the flower stem are less useful, but most phenotyping work (at least with Arabidopsis) is concerned with young plants in their leaf-growing phase, so in addition to its advantages as a model plant for biologists, Arabidopsis is also an easy plant to photograph.
As outlined in the introduction, the most basic measurement of interest to biologists is the leaf area of the plant. This is shown to have a close correlation with the overall size of the plant expressed in terms of its fresh or dry weight [4], though the occlusion of leaves causes this relation to change as the plant grows. A plant's capacity for photosynthesis (and so "primary production" of

Fig. 2. Three views of the same Arabidopsis plant, each taken six days after the previous, showing how leaf occlusion increases with age.
organic matter) is also related to leaf area [12]. If a sequence of images taken at known intervals is available, then rates of growth of the plant can be derived from it [13]. There is also interest in the compactness of the rosette, the proportion of the area within the convex hull of a plant that is leaf [14]. Further to these measurements, leaf-level measurements allow more sophisticated representation of variation. The number of leaves provides a mapping to the plant's growth stage [15], and the area of individual leaves can be found [16] and their rates of growth derived, as can the rate of photosynthesis in an individual leaf [17], using chlorophyll fluorescence imaging. Chlorophyll emits light on returning to an inactive state and this is an indicator of the level of photosynthesis in that area of the plant. If the plant is lit by a specific wavelength to which it is sensitive and that wavelength is filtered out by the imaging sensor, this fluorescence can be traced in the resulting image. See [18, 19] for overviews of this technique. As the amount of chlorophyll fluorescence is associated with the level of photosynthetic activity, this technique enables an approach like [17] to show which leaves are most active. There has also been work on quantifying variations in the shape of the leaves between genotypes [20].
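Rosette compactness as defined above (leaf area over convex hull area) is straightforward to compute from a binary plant mask. The following is a minimal pure-Python sketch (the function names are our own, not taken from any of the cited systems): it collects foreground pixel coordinates, builds their convex hull with Andrew's monotone chain, and divides the pixel count by the hull's shoelace area. Pixel counts and polygon areas disagree slightly at region boundaries, so the ratio is only an approximation.

```python
def convex_hull(points):
    # Andrew's monotone chain algorithm; returns hull vertices in order
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(hull):
    # Shoelace formula for the area of a simple polygon
    a = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        a += x1 * y2 - x2 * y1
    return abs(a) / 2.0

def compactness(mask):
    # mask: 2D list of 0/1 pixels; returns plant area / convex hull area.
    # Note: pixel counts vs. polygon areas differ at the boundary, so a
    # solid shape can score slightly above 1; use consistently across plants.
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    hull_area = polygon_area(convex_hull(pts))
    return len(pts) / hull_area if hull_area else 1.0
```

A dense rosette mask scores higher than one with widely separated leaves, which is the discriminative property exploited in [14].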

Image analysis for plant phenotyping
To extract useful measurements the plants need separating from the background, so we will consider this area before looking at the more advanced problem of obtaining measurements of individual leaves. This plant-level segmentation limits the measurements to those based on the area of the 2D projection of the whole plant. Leaf-level segmentation increases the range and precision of available measurements and is the aim of most recent work. Across this division between plant- and leaf-level approaches we will discuss work in terms of three areas:

• The degree of automation, or how much human interaction is needed to obtain a useful segmentation. If a system is to be capable of managing a high throughput of images, it is clearly desirable to avoid the need for human interaction, at least during the running of experiments, as this creates a bottleneck [21]. In addition, reducing the human element in image analysis reduces variation from human subjectivity [22]. While complete automation of image analysis is the aim of most approaches, some work accepts that human input is required for more accurate results and seeks to simplify the input as far as possible [23].
• The type(s) of images used, their dimensionality and image modality (or modalities), such as visible light or infrared. As Arabidopsis is a rosette plant, useful information can be obtained from a top-down 2D image. However, some approaches do make use of specialist imaging techniques to obtain and use depth information, such as the laser rangefinder in [24]. While many approaches use the visible spectrum, it is common to use infra-red (IR) or near infra-red (NIR) [3]. Plants are not sensitive to these wavelengths, so they can be used to photograph plants at night without them reacting. Further, plants strongly reflect infra-red light, so there is high contrast between plant and background.
• Temporal information, specifically whether sequences of images are used in image processing or whether the method is designed to work on a single image. As one area of interest to biologists is the change in plants over time, it is common to arrange an experimental set-up to take time-lapse sequences of images. Successive images might be taken over any period from a few minutes (in [25] an image is taken every 2 minutes) to once a day [21]. In the latter case, the images should be taken at the same time each day to eliminate as far as possible the effects of diurnal variations in plant appearance. The daily images of [21] are taken three hours after the start of daylight, plus or minus half an hour. Here we distinguish between approaches such as [26] that use information from the sequence of images in segmentation and those like [27] that simply use time-lapse information to derive temporal experimental data, such as rates of growth, while processing each image independently. Another pragmatic question here is the ability of a system to process the volume of data generated.
While the range of measurements supported by an approach is important, other criteria for evaluation are worth consideration. A useful set of criteria for the evaluation of plant image analysis systems comes from [28]. They suggest an approach should:

1. accommodate most laboratory settings
2. not rely on an explicit scene arrangement
3. handle multiple plants in an image
4. tolerate low resolution and out-of-focus discrepancies
5. be robust to changes in the scene
6. require minimal user interaction, in both setting up and training
7. be scalable to large populations or high sampling rates
8. offer high accuracy and consistent performance
9. be automated with high throughput.
These criteria are aimed specifically at approaches intended for use with a variety of laboratory setups, so they are less applicable to methods tied to a particular image capture setup.
In addition to these criteria, some authors stress the low cost of their approach [16]. While this paper focuses on imaging approaches that generate plant-level images, we observe in passing that specialist imaging techniques have been used to analyse Arabidopsis at the cell level. Examples include the use of dark field microscopy (in which only light scattered by the subject is detected) in [29] to capture the patterns of veins in a prepared Arabidopsis leaf. A 3D cell-level model of the leaf was obtained by [30] using confocal microscopy. High resolution X-ray computed tomography generated a 3D model of the plant that resolves the cells in [31], while optical projection tomography was introduced in [32].
Many of the approaches included in this review are to be found in the online database of plant image analysis software [33].

Example datasets
The usual measure of performance of an image analysis approach is comparison with hand-annotated "ground truth" images. For plant-level segmentation these will be binary silhouettes of the plants, while for leaf-level work the leaves will be individually coloured as in Fig. 3. The recent Leaf Segmentation Challenge dataset [34, 35] has two sets of top-down 2D visible light images of Arabidopsis, one grown for 3 weeks and one for 7, and one set of images of tobacco. These are images of single plants rather than the multiple plants per image output by a high throughput phenotyping system. They have been selected to provide examples of the kinds of problems faced in attempting to segment images of plants growing in soil, such as the growth of moss. Although the datasets were captured using time lapse [35], the images used were taken many hours apart rather than the few minutes separating successive images in a high throughput approach such as [36]. This does mean the dataset has examples of different stages of growth. Not all images have been annotated and not all annotations have been released. Some have been held back for future competitions, so only a few hundred of the tens of thousands of plant images have publicly available annotations. The Leaf Segmentation Challenge itself is an annual competition in which training and test sets of images are released to participants and the participants' solutions are evaluated against the test set ground truth images using a set of evaluation functions. These compare the numbers of differently labelled regions to model a count of leaves and use the degree of overlap between regions in the image to be tested and the ground truth (the Dice coefficient [37, 38]) as a measure of accuracy of segmentation. Previous years' images and evaluation functions are freely available.
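The overlap measure used in such evaluations, the Dice coefficient, is simple to state for labelled regions. The sketch below is our own illustrative code, not the official challenge scripts: it computes Dice between two pixel sets and a one-directional best-match average over leaves, in the spirit of the best-Dice matching used for leaf-level evaluation.

```python
def dice(a, b):
    # Dice coefficient between two sets of (x, y) pixel coordinates
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def best_dice(pred_labels, gt_labels):
    # For each ground-truth leaf, take the best-matching predicted leaf
    # and average the scores (one direction of a symmetric best-Dice).
    # pred_labels / gt_labels: dicts mapping leaf id -> set of pixels.
    scores = []
    for g in gt_labels.values():
        scores.append(max((dice(p, g) for p in pred_labels.values()), default=0.0))
    return sum(scores) / len(scores) if scores else 1.0
```

A perfect segmentation scores 1.0; splitting one ground-truth leaf into two predicted regions lowers the score, which is how over-segmentation is penalised.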
The MSU-PID database [39] has top-down images of Arabidopsis and black bean obtained using visible light, infra-red, chlorophyll fluorescence and a depth camera. Plants were grown in soil covered by a layer of black foam to ease background separation. Images were taken at the end of each hour through a 16-hour day, with the RGB and depth images taken simultaneously, the chlorophyll fluorescence image 2 minutes later and the IR image 2 minutes after that. No movement was observed between images. Images were taken over a 9-day period. There are 16 plants in each Arabidopsis image, so there are 2160 Arabidopsis plant images in each modality.
Four frames a day were annotated for ground truth for leaf tip location, leaf segmentation and leaf consistency over time so there are 576 plant images annotated for each modality.
It is hoped that more such datasets will become available in future, as they are an important resource if researchers without access to their own source of images are to play a part in this field. The availability of shared datasets is also an important driver of improvement in the performance of approaches to a specific problem, as evidenced by the Pascal Visual Object Classes Challenge [40]. The aim of the Leaf Segmentation Challenge is to drive a similar improvement in this field.

Plant-level image analysis
Until recently, most work on image analysis for phenotyping of Arabidopsis had considered approaches that depend only on obtaining data from images of a whole plant, with no further segmentation. This restricts measurements to things like overall leaf area and such characteristics of the rosette as a whole as the size of the convex hull, from which the compactness of the rosette can be obtained [21]. In [14], 19 rosette area and shape measurements are extracted from segmented plants taken from a population of different genotypes. Segmentation was done with a nearest neighbour approach, using two sets of (RGB) colour intensities, foreground and background. Image pixels that match these example sets are chosen and used as starting points for classifying neighbouring pixels. Morphological corrections, erosion and dilation, are then used to remove small, isolated regions. Their statistical analyses showed area and compactness (defined as area of plant over area within its convex hull) to be useful discriminants between genotypes. They did not take leaf occlusion into account, so the plant area will be below the true value but will indicate the area of unshaded leaf. Neither did they count the leaves. For greater precision, some way is needed of accounting for the occluded areas of leaves unless analysis is restricted to plants so young that the problem does not arise [27]. The overall rate of growth can be derived from the plant area measured across a series of images taken at known intervals.
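A nearest-neighbour pixel classification of this kind, followed by a morphological clean-up, can be sketched in a few lines. This is a simplified illustration of the general technique, not the code of [14]; the exemplar colours and the single erosion pass are our own assumptions.

```python
def dist2(c1, c2):
    # squared Euclidean distance between two RGB triples
    return sum((a - b) ** 2 for a, b in zip(c1, c2))

def classify(pixel, fg_examples, bg_examples):
    # label a pixel 1 (plant) or 0 (background) by its nearest exemplar colour
    d_fg = min(dist2(pixel, e) for e in fg_examples)
    d_bg = min(dist2(pixel, e) for e in bg_examples)
    return 1 if d_fg < d_bg else 0

def segment(image, fg_examples, bg_examples):
    # image: 2D list of (r, g, b) tuples -> binary mask
    return [[classify(px, fg_examples, bg_examples) for px in row] for row in image]

def erode(mask):
    # 4-connected erosion: keep a pixel only if all its neighbours are
    # foreground, removing small isolated regions and thin bridges
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] and all(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))):
                out[y][x] = 1
    return out
```

In practice erosion would be paired with dilation (a morphological opening) so that surviving regions regain their original extent.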

Degree of automation
For high throughput phenotyping, the ideal is to eliminate human interaction, at least during an experiment. Many approaches at least attempt to achieve this. "Rosette Tracker" [13] segments the plant using a hue value sufficiently close to the plant hue. This can either be a default value or one chosen by the user clicking on a plant region. This helps the system manage differences in light colour between installations. As suggested by their list of desirable attributes quoted earlier, [28] sought to eliminate user input as far as possible. The only input they need is the number of plants in an image and the grouping of different genotypes (if applicable). They start segmentation by using k-means clustering to find the plants and use this as the basis for an active contour model. As they wanted to incorporate a variety of image features, they used an active contour model for vector-valued images. A vector-valued active contour model is also used by [41], with a map of "saliency" used as the starting point for the active contours. Saliency is derived from the distance between a pixel's colour and the distribution of colours in the image and its closeness to a distribution of similar-coloured pixels. This is used to inform the initial active contour.
In practice some user input is often required, though this might be only once per sequence of images. PhenoPhyte users [22] are asked to set threshold values to get the best possible segmentation of plant from background, although earlier steps in segmentation are performed automatically. The automated steps are enabled by normalisation of the visible light image using a specific colour chart included in the frame. This allows the colour balance and scale of the image to be normalised, so segmentation is more consistent and measurements can be in real-world units. Such approaches allow fine tuning of the system to get the best results on a given image set and, once this is done, individual images can be segmented with no user input. Some approaches are semi-automatic, requiring some input from the user. In [3] plants are grown in Petri dishes (several plants to a dish) and photographed hourly using NIR. This allows the plant regions to be segmented from the background using an empirically derived (and manually checked) threshold value. To identify and segment the individual plants, a mask is made (by the user) showing the locations of the plants to be included. The largest "object" region corresponding to each of these mask dots is used as the starting point for watershed segmentation of the thresholded image to obtain a distinct region for each plant.
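The core idea of growing each plant's region outward from a user-supplied seed can be illustrated with multi-source region growing over the thresholded foreground. This is a deliberate simplification of marker-based watershed (a full watershed would additionally order the flooding by image gradients or a distance transform); the function name and seed format are our own.

```python
from collections import deque

def grow_from_seeds(mask, seeds):
    # Multi-source BFS over foreground pixels: each foreground pixel gets
    # the label of the nearest seed (in BFS order), a simplified stand-in
    # for marker-based watershed segmentation.
    # mask: 2D list of 0/1; seeds: list of (y, x) seed positions.
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    q = deque()
    for label, (y, x) in enumerate(seeds, start=1):
        labels[y][x] = label
        q.append((y, x))
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                labels[ny][nx] = labels[y][x]
                q.append((ny, nx))
    return labels
```

Connected foreground shared between two seeds is split roughly along the midline between them, giving one labelled region per plant.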
One way of capturing human expertise while avoiding the need for interaction during an experiment is to use a machine learning approach with a set of expertly annotated training data images.

In [42], users are asked to generate training data by clicking to classify pixels in a training image. These annotations were used to try alternative approaches to classifying test data, either using support vector machines or using a manually generated discrimination boundary. The latter approach was more successful.

Type of image
The choices in how to capture images for analysis are essentially whether to use 2D or 3D imaging (and if 3D, how to obtain the third dimension) and what wavelength(s) to use. Arabidopsis being a rosette plant, useful measurements can be obtained from a 2D top-down image and there is arguably less need of 3D imaging than for plants with a more complex morphology. However, some approaches do capture depth information in various ways. A top-down 2D image cannot capture the angle of leaves relative to the horizontal plane, so in [24] a laser rangefinder is used to measure variations in the orientation of the faces of leaf blades. This is done alongside a 2D colour camera. Dornbusch et al. [11] use a laser scanner to track changes in hyponasty. The scanner generates a 2.5D image that has depth values recorded as changes in intensity. This image is converted to a 3D point cloud. Comparison of sequences of these enables changes in leaf orientation to be traced over time.
One approach to gaining depth information from a 2D camera is to take multiple views.In [43] this is done as required to correct the area of leaves from the overhead image for the leaf angle.
The other major area of choice is the wavelength(s) to use. If an image analysis system is to be applicable in a variety of installations, as is the intention for PhenoPhyte [22], the visible spectrum is preferable, simply because any digital camera can be used for capture. This approach is intended to analyse images from a variety of sources and, to accommodate the resulting variability in the images, users are asked to include a specific colour chart in the frame. This can be used to derive both scale and colour information. They also ask that users adhere to standards of image capture, such as avoiding shadows.
Use of the visible spectrum allows colour information to be used in segmenting the plant from the background; [42] uses transformed colour information for plant-from-background segmentation. There are good biological reasons for choosing alternatives to visible spectrum imaging. A common approach is to use near infra-red (NIR) [3]. As plants are not sensitive to NIR, time-lapse sequences can be carried out 24 hours a day and can capture changes in plant behaviour between light and dark periods, such as when in the day relatively high rates of growth occur [44]. Plants also reflect a lot of IR light, resulting in strong contrast between plant and background. These advantages of NIR make it a common modality in the plant sciences. As digital camera sensors are typically sensitive to NIR, it is blocked by a filter inside the camera, but this filter can be removed so that a camera is sensitive to both visible and NIR light. If a daylight filter is used on the camera and NIR light emitting diodes are used for illumination, then images taken by day and night will be similar while not interfering with the plants' diurnal cycles [3]. As plants reflect NIR light, it may even be possible to use simple intensity thresholding for plant-from-background segmentation, as is done by [3].
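Where [3] use an empirically derived threshold, a common automatic alternative for such high-contrast single-channel images is Otsu's method, which chooses the threshold maximising between-class variance of the intensity histogram. The sketch below is a generic illustration, not code from any cited system.

```python
def otsu_threshold(values, levels=256):
    # Otsu's method: pick the threshold t that maximises the
    # between-class variance w_bg * w_fg * (m_bg - m_fg)^2.
    # values: iterable of integer intensities in [0, levels).
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w_bg = sum_bg = 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]              # background weight (count <= t)
        if w_bg == 0:
            continue
        w_fg = total - w_bg          # foreground weight
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        m_bg = sum_bg / w_bg         # background mean
        m_fg = (sum_all - sum_bg) / w_fg  # foreground mean
        var = w_bg * w_fg * (m_bg - m_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

Pixels with intensity above the returned threshold would be taken as plant; on bimodal NIR histograms the chosen value sits at the lower mode, cleanly separating the two classes.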
Another use of infra-red light is to backlight roots grown in agar. There is a good description of such a setup in [45]. This is done to trace the extent to which a root that is positioned horizontally changes angle to grow downward. This is of interest as root morphology is an indicator of a plant's ability to absorb nutrients from the soil and its drought resistance. One driver of this is the tendency of roots to exhibit positive gravitropism; that is, they grow in the direction of gravitational pull [46]. Segmentation in these cases is followed by either finding the centre line of the root [47] or the tip [48] and using principal components analysis to find the angle of the root near the tip. A similar approach is used by [49] to trace early growth of the stem. There are obvious difficulties in photographing roots growing in soil, but [50] have developed a platform to allow image capture and analysis of roots grown in a "rhizotron". In this context, a rhizotron is a soil-filled box with an angled glass front. As a plant grows in the box, some of the roots will come up against the glass front, so they can be photographed. Not all of the root will grow against the glass, of course, and manual tracing needs to be done to fill in other sections, so this approach is inevitably highly interactive.
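For 2D point sets, the principal-components step reduces to finding the orientation of the dominant eigenvector of the 2x2 covariance matrix, which has a closed form. The sketch below (our own, not the code of [47] or [48]) would be applied to the pixel coordinates of the segmented root near its tip.

```python
import math

def principal_angle(points):
    # Angle (radians, measured from the x axis) of the first principal
    # component of 2D points, via the closed-form orientation of the
    # dominant eigenvector of the covariance matrix [[sxx, sxy], [sxy, syy]].
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return 0.5 * math.atan2(2 * sxy, sxx - syy)
```

Points scattered along a 45 degree line give an angle of pi/4; a perfectly horizontal root gives 0, so the deviation from horizontal can be read off directly.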
An approach need not be tied to a specific type of image. The Integrated Analysis Platform (IAP) of [51] allows users to build a four-stage pipeline for image pre-processing, segmentation, feature extraction and post-processing, and this pipeline includes modules appropriate for visible light, NIR and chlorophyll fluorescence images. These images can be correlated, so a segmentation mask generated using the fluorescence image, for example, can be applied to the corresponding visible light image. It also supports multiple views (side and top view) and, being open source, allows users to add their own specific pipeline modules if required. "Rosette Tracker" [13] can be used with NIR or chlorophyll fluorescence images. In these cases the hue histogram is replaced by the intensity histogram.
These various specialist image modalities all result in a single-channel image that can be rendered as greyscale, so they have similar outputs, but with different contrast and different features highlighted. Chlorophyll fluorescence images have a strong contrast between plant and background, so creating a background mask is straightforward, at least if imaging is well controlled. Simple thresholding is used with chlorophyll fluorescence imaging in [52]. Where a system accommodates multiple modalities, advantage can be taken of these different features to support the processing of each. Chlorophyll fluorescence images might be used to create a background mask to be used with a corresponding visible light image [51], for example.
While most 2D imaging of Arabidopsis for plant phenotyping uses top-down views of the rosettes of immature plants, [53] extract the branching structure of the stem of a harvested mature plant photographed pressed against a black silk background, an approach that simplifies the segmentation problem but, as the plant needs handling, has its limitations for high throughput work.

Temporal information
Although many experimental installations take time-lapse images, few researchers make use of information from a series of images to inform segmentation of an image. Besides the colour and texture information used in their active contour modelling, [28] also use a plant model expressed as a Gaussian mixture, and this model is updated with the results of previous segmentations. In [54] segmentation was done using a background threshold on a filtered image. They used a growth chamber whose lighting changed to model diurnal and seasonal changes of light, and the image filtering had to account for this. They used each day's midday image as the starting point for setting that day's filter values, and the midday values in turn were derived from the previous day's. A default value was used at the start of the sequence.
A long-standing area of interest for biologists has been tracing diurnal changes in the degree of hyponasty; for example, [6] dates from 1992. One question biologists wanted to answer was whether this behaviour was a response to the stimulus of the change in light level or was driven by the plant's circadian rhythm (its internal clock) [55]. As hyponasty is a behavioural trait, non-destructive measurement is a requirement. Side views of Arabidopsis grown in agar were photographed in continuous visible light by [55], demonstrating that the diurnal change in hyponasty was driven by circadian rhythms. Engelmann et al. [6] tried something similar with top views but found side views more appropriate. Both relied on available software to manage isolating the objects and tracing the changes.

Leaf-level analysis
While biologically useful measurements can be obtained from images of plants separated from the background, further segmentation of a plant's rosette into the individual leaves can be expected to increase both the range and precision of measurements that can be obtained. At a basic level, the number of leaves is a useful indicator of a plant's maturity, as in the growth stages of Arabidopsis in [15]. Leaves are counted in [56] without performing a full segmentation, so areas of leaves cannot be obtained. Leaf-level analysis might enable a more precise assessment of the area of leaves shaded by other leaves (so reducing the plant's photosynthetic activity) than the simple measure of rosette compactness in [14]. While [4] show that leaf area is a good indicator of a plant's dry weight (and that, in turn, is a good indicator of its fresh weight), they also note that the overlapping of leaves affects this relationship as the plant grows. Leaf-from-leaf segmentation should give a clearer indication of the degree of overlap in a given example and allow variations in the degree of overlap for a given overall plant size to be captured, which a simple mathematical model of the expected increase in hidden plant area due to overlapping leaves cannot. Such a unitary model will tend to mask possible variations between different genotypes. Given a temporal sequence of images, leaf-from-leaf segmentation allows the rate of growth of individual leaves to be assessed [23], while a plant-level model has its apparent rate of growth influenced by the increasing number of mature leaves. Another area of interest is measuring the photosynthetic activity of individual leaves by segmentation of chlorophyll fluorescence images [17]. This increase in the range and precision of possible measurements has motivated recent work on leaf-level segmentation, but it remains a challenging problem because of the changing shapes and orientation of the leaves and particularly the partial occlusion of leaves. In [57], several approaches were found to perform better at plant than at leaf segmentation.
Work on quantifying leaf shape characteristics has tended to simplify segmentation by using images of leaves removed from the plant and photographed against a plain background. This allows thresholding and morphological filtering to do much of the segmentation, though [20] also make use of Canny edge detection and snakes. They extract measurements such as length, width, area, perimeter length and the curvature (bending energy) of the perimeter from 500 evenly spaced points on the perimeter. The distance between serrations and the depth of indentations is found by [58] (not using Arabidopsis), while [59] find size, width and tip-to-base asymmetry. There has also been work on leaf segmentation concerned with extracting a single leaf to recognise the plant by its leaf shape [60, 61]. Here we are concerned with approaches that segment the leaves as part of the plant with the aim of extracting biologically useful measurements.
It is possible to take an intermediate approach between full leaf-level segmentation and plant-level measurements. The high-throughput phenotyping platform for plant growth modelling and functional analysis (HPGA) in [62] seeks to sidestep the challenge of leaf segmentation while still deriving some estimate of the degree of overlap between leaves, to improve the accuracy of plant area measurement relative to other approaches. An even simpler approach to this was taken by [16], where morphological operations were used to remove bridges between leaves and the variation from an elliptical shape was used to provide an approximation to the degree of overlap between leaves. This approach would seem to be restricted to very young plants whose leaves are close to elliptical in shape and where there is little overlap.
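The ellipse-based approximation of [16] can be sketched as follows: fit an ellipse to the rosette mask from its second-order moments and treat the shortfall of the measured area against the fitted ellipse's area as a proxy for hidden or overlapped leaf area. This is our own minimal reconstruction of the idea, not the authors' code, and all names are illustrative:

```python
import math

def ellipse_deviation(mask):
    """Deviation of a binary rosette mask from its moment-fitted
    ellipse, as a rough proxy for leaf overlap (a sketch of the
    idea in [16]; function and variable names are our own)."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v]
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    # central second moments (covariance of the pixel positions)
    mxx = sum((x - cx) ** 2 for x, y in pts) / n
    myy = sum((y - cy) ** 2 for x, y in pts) / n
    mxy = sum((x - cx) * (y - cy) for x, y in pts) / n
    det = mxx * myy - mxy ** 2
    # a uniform ellipse with this covariance has area 4*pi*sqrt(det)
    ellipse_area = 4.0 * math.pi * math.sqrt(det)
    return 1.0 - n / ellipse_area

# a filled disc is a special case of an ellipse, so deviation ~ 0
disc = [[(x - 32) ** 2 + (y - 32) ** 2 <= 20 ** 2 for x in range(64)]
        for y in range(64)]
```

For a mask that really is elliptical (here a disc) the deviation is close to zero; a rosette with overlapping or protruding leaves would depart further from its fitted ellipse.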
Where an approach incorporates leaf-from-leaf segmentation it is often the case that this is treated as a second stage after plant-from-background segmentation [36, 63]. Indeed it is quite feasible to regard these processes as entirely distinct and use plant-from-background segmentation to generate a binary mask that can be applied to (say) the original image, eliminating all the uninteresting data before using different characteristics for leaf-from-leaf segmentation. In [64], pixel classification is used to sort plant from background and a watershed approach is then used to segment individual leaves. The pixel classification uses six features: red, green, blue, "excess green" (2G − R − B), the variance-filtered green channel and the gradient-magnitude-filtered green channel. A feedforward neural network was found to be the best classifier. The seeds for the watershed leaf segmentation were determined by Euclidean distance from the background mask derived in plant segmentation, and areas were merged according to an empirically obtained threshold value.
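The six pixel features used by [64] are straightforward to compute. The sketch below assumes an image held as a grid of (R, G, B) tuples and uses a small square neighbourhood for the variance and a central-difference gradient; [64] do not specify their filters in this form, so treat the windowing details as illustrative:

```python
def pixel_features(img, x, y, k=1):
    """Six per-pixel features in the spirit of [64]: R, G, B,
    excess green (2G - R - B), local variance of the green channel,
    and a simple gradient magnitude of the green channel.
    `img` is a 2D grid of (R, G, B) tuples; names are illustrative."""
    r, g, b = img[y][x]
    exg = 2 * g - r - b
    # green values in a (2k+1)x(2k+1) neighbourhood, clamped at borders
    h, w = len(img), len(img[0])
    win = [img[j][i][1]
           for j in range(max(0, y - k), min(h, y + k + 1))
           for i in range(max(0, x - k), min(w, x + k + 1))]
    mean = sum(win) / len(win)
    var = sum((v - mean) ** 2 for v in win) / len(win)
    # central-difference gradient magnitude on the green channel
    gx = img[y][min(w - 1, x + 1)][1] - img[y][max(0, x - 1)][1]
    gy = img[min(h - 1, y + 1)][x][1] - img[max(0, y - 1)][x][1]
    grad = (gx * gx + gy * gy) ** 0.5
    return [r, g, b, exg, var, grad]

img = [[(0, 120, 0)] * 3 for _ in range(3)]  # a uniform green patch
```

Each pixel's feature vector would then be fed to the classifier (a feedforward neural network in [64]) to label it plant or background.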
Recently the Leaf Segmentation Challenge [35] has raised the profile of this problem, such that some work focuses on the segmentation itself without worrying about how well it extracts biologically useful information [63]. This is built on by [65], in which different machine learning approaches were tried. They evaluated the importance of different geometrical features as well as comparing the performance of machine learning algorithms in leaf counting and segmentation. They used the three Leaf Segmentation Challenge datasets and found different approaches worked best for the different sets. It is perhaps unfortunate that the Leaf Segmentation Challenge datasets contain temporally sparse images of single plants, as this encourages the development of approaches that do not remove the human from the loop. Such approaches do not lend themselves to scaling for the high throughput systems needed to perform biologically useful experimentation. In [23], users are asked to mark each leaf to provide the seeds of a graph-based segmentation procedure using a random walk. While this simplifies interaction, the user has to mark each leaf in each image. Even with this simple interaction, a human user will not be able to keep up with a high throughput phenotyping platform.
Deep learning has also been applied to the Leaf Segmentation Challenge datasets, as an example of "instance segmentation", where the aim is to identify and segment each example of some class of object in an image. Both [66] and [67] model a sequential approach to object counting using recurrent neural networks, as these are capable of keeping track of those objects that have already been identified. While the ways they build their systems differ, both are inspired by modelling how humans count objects and both have achieved state-of-the-art scores at counting leaves. They use the leaf, not the plant, as the object whose instances are to be counted.

Degree of automation
For high throughput systems, an automated image analysis system is required to eliminate the human input bottleneck, though this need not rule out a calibration stage that establishes the best parameters for a set-up when initialising an installation or a specific experiment. Some otherwise automated tools require fine tuning to get the best results on a given image set, even though once these settings are found, individual images can be segmented with no user input. Different threshold values for watershed merging were needed by [64] for the three different image sets in the Leaf Segmentation Challenge data [35]. Similarly, the approach developed at Nottingham and described in [57], which uses SLIC superpixels for both colour-based plant segmentation (using the Lab colour space) and watershed-based leaf segmentation, has five parameters that were empirically derived and can be adjusted either for a dataset (as was done in the work in [57]) or per image.
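The kind of per-dataset calibration described above can be as simple as a grid search over candidate parameter values against a small hand-labelled set. The sketch below is a generic illustration of that tuning loop, not the procedure actually used in [64] or [57]; `segment` and the data are placeholders:

```python
def calibrate_threshold(candidates, segment, labelled):
    """Pick the merge threshold that best reproduces hand labels on
    a small calibration set, then reuse it for the whole experiment.
    `segment(image, t)` returns a list of leaf regions; `labelled`
    pairs each calibration image with its hand-counted leaf number.
    All names are illustrative, not from the cited papers."""
    def error(t):
        return sum(abs(len(segment(img, t)) - n_leaves)
                   for img, n_leaves in labelled)
    return min(candidates, key=error)
```

Once the threshold is fixed for a dataset, individual images can then be segmented with no further user input, which is the distinction the text draws between calibration and per-image interaction.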
An automated system might still require the user to mark a point of interest on the plant itself. In [44] the user has the option of marking the leaf whose growth pattern is to be traced with a small dot of inert paint. If this is not done, the user has to select the leaf tip in the images, so the system is not fully automatic. Some work has been done with semi-automated approaches that achieve segmentation with some user interaction with each image [23]. While this is not useful for high throughput phenotyping, it can help when generating training data and ground truth for the training and evaluation of more automated approaches.
Several approaches use machine learning for image segmentation as a way of capturing human expertise while avoiding the need for human interaction during experiments. The training data will typically be a set of images with hand-annotated masks to show the leaf-level segmentation. The Leaf Segmentation Challenge dataset [35] is an example. [63] learns 3D histograms for plant segmentation by pixel classification, while [65] seeks to refine this approach by a comparison of various machine learning approaches. In [68], machine learning is combined with model-based knowledge where leaves are modelled using two Gaussian clusters, one for the blade and one for the petiole (the stem of the leaf). In addition, the petioles are constrained to emerge from the centre of the plant. For training, non-occluded leaves were selected by hand. Plants were over-segmented and the segmentations pruned. They invariably found a leaf counting error to involve identifying too few leaves. The training data used by [56] is simpler, as it needs images together with the number of leaves in each image rather than a fully annotated image. They are attempting a simpler task, leaf counting rather than segmentation. One aspect of these approaches is that the training set images were taken from the same image dataset as the test set, so [63] used three training sets of images, one for each of the three datasets in the Leaf Segmentation Challenge data. This suggests the approach would need retraining for each new set of images, or at least for each new image source; where the environment is closely controlled, the image sets from successive experiments can be expected to be sufficiently similar.

Type of image
Many approaches that perform leaf-level segmentation and analysis use 2D visible light images, partly driven by the availability of such images in the Leaf Segmentation Challenge dataset [35].
This allows a relatively cheap set-up [16] as any digital camera can be used. Using the visible spectrum allows a choice of how to use colour information. In [16] different approaches for daytime and night-time images are used. For daytime images, Wu quantisation [69] was used to extract 16 colour clusters from the image. This choice of 16 clusters is empirical, and was found to result in a green cluster that could be retrieved as being plant. Gravel in the soil was found to affect this, and the authors suggest spreading a layer of white sand to make the background less variable. Green light was used for night-time imaging as this wavelength does not trigger daytime detection by the plants. It is also strongly reflected by plants, so promotes contrast between plant and background. Like NIR, however, this restricts imaging to a single channel, so [16] compensate when segmenting night-time images: the loss of colour information is offset by using the last daylight (and so colour-segmented) image to initialise the segmentation of the monochrome night-time image. The use of the green channel in [56] also amounts to using a selected single-channel image.
As plants are not sensitive to it, NIR is used to capture diurnal effects such as regular changes in the rate of growth [44]. While using NIR might increase the contrast between plant and soil, the loss of colour information itself affects how segmentation can be carried out. In [44], a single leaf, whose pattern of growth is indicated by movement of the leaf tip, is chosen manually, either by marking the actual leaf or by selecting the leaf tip in each image. Some approaches use chlorophyll fluorescence imaging, which is associated with photosynthetic activity. This gives a single-channel image that can be treated like any other from an image processing point of view. In this way [62] uses Canny edge detection to find leaf edges and Sobel edge detection to segment plant from background. In [70], a simple threshold is sufficient to segment plant from background, as chlorophyll fluorescence imaging gives a high contrast between plant and background. The threshold was found empirically, so can be expected to need resetting on changing the experimental setup. The plant's contour is used to identify areas of petiole and so to separate individual leaves. This approach fails to handle cases where leaves overlap, so is restricted to very young plants. In [17], individual leaves in a chlorophyll fluorescence image are matched against a template library of different shapes, sizes and orientations and the best matches selected. The library can be obtained from edge maps of chosen "typical" leaf shapes, transformed to create the range of sizes and orientations. This approach is claimed to identify both of an overlapping pair of leaves provided the overlap is less than 23% of the area of the smaller leaf.
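For high-contrast chlorophyll fluorescence images, plant-from-background segmentation really can be a single comparison per pixel, as in [70]. A minimal sketch, assuming a 2D grid of intensities and an empirically chosen threshold `t`:

```python
def threshold_mask(fluor, t):
    """Plant-from-background by a single empirical threshold, as is
    feasible for high-contrast chlorophyll fluorescence images [70].
    `fluor` is a 2D grid of intensities; `t` would be re-tuned for
    each experimental setup, as the text notes."""
    return [[v > t for v in row] for row in fluor]

def plant_area(mask):
    """Projected plant area in pixels from a binary mask."""
    return sum(v for row in mask for v in row)
```

The simplicity is the point: the empirically chosen `t` is the only parameter, which is also why it needs resetting whenever the imaging setup changes.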
Besides the choice of wavelength(s) to use, another possibility is to use a 3D imaging technique to allow different characteristics to be captured, such as the direction of the leaf blade surface in [24]. They use a laser range finder alongside a colour 2D image; the rangefinder image is used to separate the plant from the background and combined with the colour image to create a coloured polygonal model of the plant, or at least of its upward-facing surfaces. This model is used to extract orientation information for quantifying the direction of the blade surface and the epinasty (downward curvature) of the leaf. A light field camera is used to get depth information in [36]. This generates a depth image and a 2D focus image, the latter being used for segmentation both of plant from background and of individual leaves. This is then used to generate landmarks in the depth image.
The depth image allows measurements of leaf angle, and so from the time lapse series differences in hyponasty between genotypes can be established. One pragmatic consideration here is that approaches using specialist imaging equipment might be tied to a specific platform [36], in contrast to approaches intended for use anywhere [51]. Stereo imaging is used in the approach to finding leaf growth curves in [71]. They worked with tobacco (also a rosette plant), not Arabidopsis. They over-segment the image and merge the segments using angular and depth information from the stereo disparity map obtained by block matching. Some work has been done using a depth camera to extract 3D information ([72], not Arabidopsis), this being a relatively low cost approach. The MSU-PID database includes depth camera images of Arabidopsis [39].
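One way leaf angle can be recovered from a depth image is to fit a plane to the depth samples over a leaf blade and take its inclination. The following is a generic least-squares sketch, not the specific method of [24], [36] or [71]; all names are illustrative:

```python
import math

def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through (x, y, z)
    depth samples, as one could take over a leaf-blade patch."""
    # build the 3x3 normal equations (A^T A) m = A^T z
    sx = sy = sz = sxx = syy = sxy = sxz = syz = 0.0
    n = len(points)
    for x, y, z in points:
        sx += x; sy += y; sz += z
        sxx += x * x; syy += y * y; sxy += x * y
        sxz += x * z; syz += y * z
    m = [[sxx, sxy, sx, sxz],
         [sxy, syy, sy, syz],
         [sx,  sy,  n,  sz]]
    # Gaussian elimination with partial pivoting
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        s = m[r][3] - sum(m[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / m[r][r]
    return sol  # a, b, c

def inclination_deg(a, b):
    """Angle between the fitted blade plane and the horizontal."""
    return math.degrees(math.atan(math.hypot(a, b)))
```

Tracking this angle over a time lapse series is then one way hyponasty (upward leaf movement) could be quantified, as [36] do from their light field depth data.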
Another aspect of dealing with depth is demonstrated by [16], where the camera's limited depth of field is addressed by taking several frames at different focus distances and merging them before segmentation. A very wide aperture was needed because the fluorescent lights used gave insufficient illumination, and this restricted the depth of field of the images. They do not make use of depth from defocus.
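Focus merging of this kind can be sketched as keeping, per pixel, the value from whichever frame is locally sharpest. This is a generic focus-stacking illustration under the assumption that local variance is a usable sharpness measure; [16] do not publish their merging rule in this form:

```python
def focus_stack(frames, k=1):
    """Merge frames taken at different focus distances by keeping,
    at each pixel, the value from the locally sharpest frame
    (sharpness = neighbourhood variance). A generic sketch of
    focus stacking, not the exact algorithm of [16]."""
    h, w = len(frames[0]), len(frames[0][0])

    def local_var(f, x, y):
        win = [f[j][i]
               for j in range(max(0, y - k), min(h, y + k + 1))
               for i in range(max(0, x - k), min(w, x + k + 1))]
        mean = sum(win) / len(win)
        return sum((v - mean) ** 2 for v in win) / len(win)

    return [[max(frames, key=lambda f: local_var(f, x, y))[y][x]
             for x in range(w)]
            for y in range(h)]
```

An out-of-focus frame has low local variance everywhere, so the merged image is drawn from whichever frame resolved each region most sharply.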

Temporal information
There are two aspects to be considered regarding the use of temporal information: whether temporal information is used to inform image segmentation and analysis, and, more simply, whether an approach is capable of managing the bulk of the data and keeping up with its arrival. In the parametric probabilistic active contours approach of [73], the result of one frame's segmentation is used as the initialisation for the active contours in the next frame in the sequence. This work was done with thermal images of sugar beet leaves, not Arabidopsis.
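The frame-to-frame initialisation idea can be illustrated without active contours: seed the segmentation of the new frame from the previous frame's mask and grow the region over plant-like pixels. A deliberately simplified sketch, in which region growing stands in for the probabilistic active contours of [73]:

```python
from collections import deque

def propagate_mask(prev_mask, frame, t):
    """Initialise the segmentation of a new frame from the previous
    frame's mask: pixels of the old mask that are still plant-like
    seed a 4-connected region growing pass over the new frame.
    A generic sketch, not the method of [73]."""
    h, w = len(frame), len(frame[0])
    seeds = deque((x, y) for y in range(h) for x in range(w)
                  if prev_mask[y][x] and frame[y][x] > t)
    mask = [[False] * w for _ in range(h)]
    for x, y in seeds:
        mask[y][x] = True
    while seeds:
        x, y = seeds.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            i, j = x + dx, y + dy
            if (0 <= i < w and 0 <= j < h and not mask[j][i]
                    and frame[j][i] > t):
                mask[j][i] = True
                seeds.append((i, j))
    return mask
```

Because growth between adjacent frames is small, the previous mask is usually a good initialisation, which is what makes such sequential schemes attractive for time lapse data.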
The value of using temporal information in image analysis is demonstrated by [26]. This paper extends the approach in [17] by analysing the first image in a sequence in the same way as the earlier paper but analysing subsequent images using the result of the previous image rather than the template library. They demonstrate a quantifiable improvement in performance over their earlier approach. Temporal information is also used in [74] where, once the edges of leaves have been identified by isolating smoothly curved edge segments and grouping them based on a leaf model, growth is traced over a sequence of images by finding a minimum spanning tree, constrained by prohibiting paths that join initially distinct nodes. These are fitted to a logistic growth curve and leaf occlusions can be identified by the failure to fit a single curve. This work was done using tobacco but the authors suggest it could be applied to Arabidopsis, given a suitable leaf model for edge segment grouping.
One aspect of temporal information is the possibility of using different lighting, and therefore different approaches to segmentation, for daylight (colour) images and night-time (monochrome) ones. The last daylight image of a plant is used by [16] to initialise a level set segmentation of a monochrome night-time image (lit by green light), compensating for the loss of colour information.
Temporal information is also used to capture derivative measurements, and this can be distinct from the segmentation itself. A system that extracts leaf area from each time step independently can still use the difference in leaf area between successive images, and knowledge of the period between them, to derive absolute and relative growth rates. Further, if the periodic measurements are fitted to a growth model so that the change in area can be treated as continuous, a continuous record of growth rates can be derived [62]. Phytotyping4D [36] has a temporal resolution of 12 minutes but temporal neighbours are not used to inform the segmentation of an image. As they use a light field camera to capture depth information, they can also trace changes in hyponasty from the time lapse data. The approach of [56] counts leaves in a few seconds and so is claimed to be suitable for high throughput systems, but their results are from images of a single plant, while a high throughput system will typically image several plants in a frame. Even where temporal information is not used in segmentation, the need to manage the volume of data, as well as possible time constraints on performance, remain factors in the design of systems that are to be capable of high throughput.
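Deriving relative growth rate from independently extracted leaf areas follows the standard definition RGR = (ln A2 − ln A1) / (t2 − t1); a minimal sketch with illustrative names:

```python
import math

def relative_growth_rate(areas, times):
    """Relative growth rate between successive leaf-area
    measurements: RGR = (ln A2 - ln A1) / (t2 - t1). This is the
    standard definition; variable names are ours."""
    return [(math.log(a2) - math.log(a1)) / (t2 - t1)
            for (a1, t1), (a2, t2) in zip(zip(areas, times),
                                          zip(areas[1:], times[1:]))]
```

This is an example of a derivative measurement that needs only per-frame areas and timestamps, so it places no requirement on the segmentation itself being temporally informed.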

Conclusion
The development and use of computer vision for plant sciences, and specifically for phenotyping, is at an interesting stage. On the one hand, the development of high throughput phenotyping platforms is driving the rapid adoption of computer vision, while on the other, there remains a good deal of scope for improvements in the state of the art. Specifically, it is to be hoped that developments in leaf-level segmentation and analysis will enable a wider range of more precise measurements to be made available, and that the development of systems that do not rely on a human in the loop will help bridge the gap between genotype and phenotype. For the computer vision community to advance this state of the art, we suggest the following is required:
• Data sharing. While Arabidopsis image datasets are becoming available, there is a need for data that more closely resembles the output of high throughput platforms; that is, time lapse data with images containing multiple plants. The importance of shared data and evaluation techniques has been shown in other fields of computer vision.
• Benchmark ground truth.This is necessary so the data can be used to evaluate different approaches and for generating training data for approaches that incorporate machine learning.
• Automated systems.For a system to be capable of timely handling of the volume of data generated in phenotyping (where multiple samples must be included in an experimental set), the human input bottleneck must be avoided.
• Modern data-driven computer vision techniques with machine learning at the heart.
• Focused co-operation between the computer vision and biological communities.This will help to eliminate wasteful duplication of effort and to direct research towards solutions that will be most beneficial to the phenomics community.
Until these developments happen, biologists will be left with bespoke systems tailored to particular datasets and image capture setups.By advancing the state of the art in computer vision systems for plant phenotyping we can help biologists in research that might have a vital impact on world food production.


Fig. 3. Left to right: the Arabidopsis plant from Fig. 1, plant-level ground truth from the phenotyping machine's automatically generated mask, and hand-drawn leaf-level ground truth.

is a freely available open source tool that is intended for use with a variety of imaging setups. Where the visible spectrum is used, the image's hue histogram is modelled as a mixture of Gaussians and those pixels closest to the Gaussian whose mean value is the right shade of green are classified as plant. They start with two Gaussian mixtures and increase the number until one has a mean

