High-throughput time-stretch imaging flow cytometry for multi-class classification of phytoplankton

Time-stretch imaging has been regarded as an attractive technique for high-throughput imaging flow cytometry, primarily owing to its real-time, continuous ultrafast operation. Nevertheless, two key challenges remain: (1) sufficiently high time-stretch image resolution and contrast are needed for visualizing the sub-cellular complexity of single cells, and (2) the ability to unravel the heterogeneity and complexity of highly diverse populations of cells – a central problem of single-cell analysis in the life sciences – is required. We here demonstrate an optofluidic time-stretch imaging flow cytometer that delivers both features, in the context of high-throughput multi-class (up to 14 classes) phytoplankton screening and classification. Based on comprehensive feature extraction and selection procedures, we show that the intracellular texture/morphology revealed by high-resolution time-stretch imaging plays a critical role in improving the accuracy of phytoplankton classification, as high as 94.7%, based on a multi-class support vector machine (SVM). We also demonstrate that high-resolution time-stretch images, which allow exploitation of additional feature domains, e.g. Fourier space, enable further sub-population identification – paving the way toward deeper learning and classification based on large-scale single-cell images. Beyond biomedical diagnostics, this work is anticipated to find immediate applications in marine and biofuel research.

© 2016 Optical Society of America

OCIS codes: (180.0180) Microscopy; (170.3880) Medical and biological imaging; (100.0100) Image processing; (120.0120) Instrumentation, measurement, and metrology; (170.7160) Ultrafast technology.


Introduction
Harnessing group velocity dispersion (GVD) without suffering from the associated dispersive loss, through optical amplification, optical time-stretch (also known as dispersive Fourier transformation) enables continuous, single-shot spectrally-encoded measurements at ultrafast rates beyond MHz in real time [1]. Apart from its originally perceived applications in high-speed optical communication, its diverse potential has progressively invigorated many other arenas, notably high-throughput single-cell imaging [2][3][4][5]. A unique feature empowered by optical time-stretch imaging is the generation of enormous real-time image data, from which a wealth of image-based information on individual cells can be retrieved, such as texture, geometry and morphology, at an unprecedented throughput. Such complex high-dimensional image information is innately favorable for automated classification, and thus screening applications, with the use of machine learning. Prior work integrating machine learning and time-stretch imaging successfully demonstrated binary (two-class) classification, such as screening cancer cells from blood cells [6]. Nevertheless, biological cells are by nature highly heterogeneous and diverse in cell type, state and thus function. In order to fully exploit the potential of time-stretch imaging for single-cell analysis, which has proven impact in both clinical diagnostics and fundamental cell biology research, two requirements have to be met: (1) sufficiently high time-stretch image quality (resolution and contrast) for visualizing the complexity of cellular and subcellular structures; (2) automated multi-class classification and analysis for revealing the heterogeneity of the cell populations.
With the aim of addressing these issues and thus further widening the scope of time-stretch imaging, we here demonstrate ultrafast label-free (bright-field) imaging flow cytometry of phytoplankton, combined with the capability of automated multi-class (14 classes) support vector machine (SVM) classification, thanks to the high-resolution time-stretch images, from which up to 44 cellular features can be extracted for classification.
Phytoplankton (also called microalgae) are photosynthetic micro-organisms and are the key primary building blocks of the aquatic ecosystem [7]. The ability to enumerate and characterize phytoplankton is of considerable value for environmental monitoring [8-10], e.g. detection of harmful algal bloom (HAB) species [11], and for screening microalgal candidates as a renewable and sustainable source for biodiesel production [12]. Due to their extreme diversity in genus and species, phytoplankton are highly heterogeneous in size, shape and morphology/texture. Current gold standards fall short of combining high throughput and high accuracy for large-scale phytoplankton detection, identification and characterization [9,13]. For instance, traditional microscopy provides high-resolution images, and thus detailed inspection of individual phytoplankton, at the expense of imaging throughput [14]. On the other hand, standard flow cytometry provides high-throughput single-cell interrogation (up to 100,000 cells/s), which is, however, insufficient for taxonomic classification because of the absence of imaging capability. This predicament is partially alleviated by incorporating image sensors in flow cytometers [15][16][17][18][19][20][21]. However, the imaging throughput has to be compromised (~1,000's of cells/s) by the inherent speed limitation of the image sensors. To this end, optical time-stretch imaging is ideal for addressing the unmet need for high-throughput imaging flow cytometry of phytoplankton [13]. To fully exploit this potential of time-stretch imaging, as discussed earlier, the image resolution has to be sufficient for automated multi-class taxonomic classification without sacrificing the imaging speed – the central motivation of this work. In addition to automated multi-class classification, we also exploit other feature domains, e.g. based on Fourier analysis, to identify the heterogeneity within a single population, i.e. sub-population, of phytoplankton, to exemplify the necessity of high-quality time-stretch imaging.

Principle of microfluidic time-stretch imaging
The optical time-stretch imaging flow cytometer developed in this work is an integrated system consisting of a customized microscope and a microfluidic system, as shown in Fig. 1. The configuration is similar to that demonstrated in [3], except that we perform wavelength–time mapping before space–wavelength mapping instead. In the system, a broadband pulsed beam from a home-built mode-locked fiber laser [22] (center wavelength = 1064 nm, bandwidth = 10 nm, repetition rate = 11 MHz) is launched into a 10-km-long single-mode fiber, within which the pulses are time-stretched with a high GVD of 0.38 ns/nm, i.e. the single-shot spectra are transformed into time waveforms. An in-line home-built optical amplifier module based on ytterbium-doped fibers provides an optical power on-off gain as high as 30 dB to compensate for the dispersive loss. The time-stretched pulsed beam is then coupled into the microscope through a diffraction grating (groove density of 1200 lines/mm), which spatially disperses the beam into a one-dimensional (1D) spectral shower, i.e. line-scan illumination, focused by a high-numerical-aperture (NA) objective lens (NA = 0.75) onto the microfluidic channel. Individual phytoplankton cells flow at a high speed of ~2 m/s along the channel, orthogonal to the 1D line-scan beam (scanning at the laser repetition rate, i.e. 11 MHz). Made of polydimethylsiloxane (PDMS), the microfluidic channel is custom-designed with an asymmetric curved section that generates an inertial flow-focusing effect, such that the cells flow in a single focused stream in the imaging section (with a dimension of 80 µm × 80 µm (height × width)) at high speed [3]. The time-stretch microscope is built in a double-pass configuration, i.e. the spectral shower double-passes two objective lenses, which ensures that the space–wavelength mapping, and thus the image resolution and quality, is preserved throughout the imaging path.
As a result, the spatial information of the imaged phytoplankton in each line scan is encoded in the wavelengths of the spectral shower, which is detected as a serial temporal waveform by a high-bandwidth single-pixel photodetector and a real-time oscilloscope (bandwidth = 16 GHz, sampling rate = 80 GS/s) for subsequent image reconstruction and off-line multi-class classification by SVM. In principle, time-stretch can be performed either before or after spectral encoding [23]. Nevertheless, there are two key differences between the two approaches: (1) the latter performs single-shot illumination, in which the exposure time is estimated from the time-bandwidth product of each spectrally-resolvable sub-pulse (typically ~10's of ps), whereas the former functions as an all-optical laser beam scanner, in which the exposure time is determined by the temporal width of the stretched waveform (~10's of ns).
(2) The input intensity to the time-stretch module in the two configurations can differ by several orders of magnitude, which implies that different amplifier designs are required to optimize the SNR performance. With the aim of maximizing the SNR (critical for accurate image classification) while minimizing photodamage due to excessive illumination power, we configured the time-stretch module prior to spectral encoding. In this configuration, the input power to the time-stretch process is higher and thus results in a better noise figure in amplification, compared to the previous work which implemented time-stretch after spectral encoding [3]. In this case, the illumination power per resolvable point (~10 mW) creates a negligible photodamage effect on living cells at the wavelength regime of 1 μm.
When time-stretch is performed after spectral encoding, the exposure time of each resolvable spectral sub-pulse is determined by the time-bandwidth product of a transform-limited pulse, i.e. 4.4 ps. In contrast, when the time-stretch process is performed before spectral encoding, the exposure time of each resolvable spectral sub-pulse can be estimated as the product of the GVD and the bandwidth δλ of each sub-pulse. Given a sub-pulse bandwidth of 0.23 nm and a GVD of 0.38 ns/nm, the exposure time of each point is now ~87.4 ps. Operating in a laser line-scanning mode, the effective exposure time of one line scan is 3.8 ns, given the total bandwidth of 10 nm. Despite the longer exposure time compared to [3], it creates essentially no motion blur in the ultrafast flow at a speed of ~1 m/s, i.e. the motion blur (~4 nm) is far smaller than the image resolution (~1.59 µm) in our case. Note that as phytoplankton in general exhibit higher contrast than the largely transparent mammalian cells, contrast-enhancement techniques, e.g. phase-gradient contrast [3], are not necessary in this work. The time-stretch imaging system presented here is thus operated in the bright-field mode.
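The exposure-time and motion-blur figures above follow directly from the quoted system parameters, and can be verified with a short back-of-the-envelope sketch (all numerical values are taken from the text, not independently measured):

```python
# System parameters quoted in the text
gvd_ns_per_nm = 0.38         # GVD of the 10-km single-mode fiber (ns/nm)
sub_pulse_bw_nm = 0.23       # bandwidth of each resolvable spectral sub-pulse
total_bw_nm = 10.0           # total laser bandwidth
flow_speed_m_per_s = 1.0     # cell flow speed in the imaging section (~1 m/s)

# Exposure time per resolvable point = GVD x sub-pulse bandwidth
t_point_ps = gvd_ns_per_nm * sub_pulse_bw_nm * 1e3   # -> ~87.4 ps

# Effective exposure time of one line scan = GVD x total bandwidth
t_line_ns = gvd_ns_per_nm * total_bw_nm              # -> 3.8 ns

# Motion blur = distance travelled by a cell during one line scan
blur_nm = flow_speed_m_per_s * (t_line_ns * 1e-9) * 1e9  # -> ~3.8 nm

print(t_point_ps, t_line_ns, blur_nm)
```

The resulting blur of ~4 nm is indeed some three orders of magnitude below the ~1.59 µm image resolution, consistent with the claim of negligible motion blur.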

Imaging performance
Our system is capable of imaging cells with sufficiently high resolution and contrast for resolving cellular structures. The image resolution of time-stretch imaging is generally governed by three limiting regimes [24]: (i) the spatial-dispersion-limited resolution δx_spatial, which is governed by the spectral resolution of the diffraction grating, (ii) the stationary-phase-approximation limit δx_SPA, which is defined for the wavelength–time mapping process, and (iii) the photodetector-limited resolution δx_det. In the current system, the resolution values limited by these three regimes are evaluated as δx_spatial ≈ 1.59 μm, δx_SPA ≈ 0.89 μm and δx_det ≈ 0.54 μm. Therefore, the resolution along the line scan is currently limited by the spatial dispersion, i.e. 1.59 µm. On the other hand, the image resolution along the flow direction is evaluated as δy ≈ 1.33 μm, following the analysis in ref [3]. We employed the optical time-stretch imaging flow cytometer to capture a total of more than 10,000 images of the cultured phytoplankton (14 species) (Carolina Biological Supply), as highlighted in Fig. 2. Our optical system is shown to be capable of revealing a great variety of sizes, shapes and morphologies of algal cells, for instance the flotation spines of Scenedesmus sp. and the helical structure of Spirulina major. More significantly, the characteristic intracellular textures of the phytoplankton, such as the vacuoles, pyrenoids and chloroplasts, are clearly visualized, as highlighted in Fig. 2. Note again that single-cell imaging with this level of resolution and contrast is particularly necessary for the image-based classification discussed in the next section.

Image processing and feature extraction
Subsequent to image reconstruction are image segmentation and feature extraction for phytoplankton classification. In the image segmentation, each grayscale image is first transformed into a local entropy map (B in Fig. 3) and then into a binary image by thresholding, followed by erosion (C in Fig. 3) and hole-filling (D in Fig. 3) operations. Geometrical parameters, e.g. the size and equivalent spherical diameter of the segmented binary mask, are extracted and used to identify debris/fragments, i.e. objects with diameters < 1 µm or less than one-fourth of the mean size of the imaged phytoplankton population, or occupying > 90% of the field-of-view (FOV) of the image. We further perform manual screening to filter out residual debris. These non-phytoplankton events are pre-filtered before subsequent machine learning. For feature extraction, geometrical features of the phytoplankton, such as area, perimeter, major and minor axes, and thus circularity and elongation factor (see Table 1 and Table A1 in Appendix for detailed descriptions), are extracted from the binary masks. In addition, seven image (pixel-intensity) moment invariants (also called Hu's moments), which are scale-, translation- and rotation-invariant [25], are calculated from both the binary mask and the blob image at the intermediate stage (C in Fig. 3), resulting in 14 feature parameters. Texture information of each image is retrieved by analyzing both the 1D (pixel-intensity) histogram and the gray-level co-occurrence matrix (GLCM), which are obtained from the gray-scale image considering only the pixels inside the smallest bounding box. The 1D probability density function (PDF) of the segmented gray-scale image pixel intensities gives rise to five image properties: grey-level standard deviation, skewness, entropy, energy and module. Compared to the 1D histogram analysis, the GLCM is a 2D statistical measure of the pixel intensities from which texture information can be characterized [26].
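The segmentation pipeline (grayscale image → local entropy map → thresholded binary image → erosion → hole filling) can be sketched as follows. The window size and threshold here are illustrative assumptions, not the paper's exact values:

```python
import numpy as np
from scipy import ndimage

def segment_cell(gray, entropy_win=9, thresh=0.5):
    """Sketch of the segmentation steps A->D described in the text."""
    # (B) local entropy map: Shannon entropy of the gray levels inside a
    # sliding window centered on each pixel
    def local_entropy(values):
        _, counts = np.unique(values, return_counts=True)
        p = counts / counts.sum()
        return -(p * np.log2(p)).sum()

    ent = ndimage.generic_filter(gray, local_entropy, size=entropy_win)
    ent = (ent - ent.min()) / (np.ptp(ent) + 1e-12)   # normalize to [0, 1]

    # (C) threshold the entropy map, then erode to remove speckle
    binary = ndimage.binary_erosion(ent > thresh)

    # (D) fill interior holes to obtain the final binary mask
    return ndimage.binary_fill_holes(binary)
```

Geometrical parameters such as area and equivalent spherical diameter would then be measured on the returned mask for the debris pre-filtering step.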
The GLCM is constructed by tallying how often pixel pairs with specific values occur in an image (64 gray levels in our case). The pixel separation/offset in each pair is set to three pixels, corresponding to half of the diffraction-limited resolution, in four directions (vertical: 0°, horizontal: 90°, diagonal: 45° and 135°). The statistics – the mean and the dynamic range of each image property (contrast, correlation, energy, homogeneity and entropy) – are calculated from the GLCMs along the four directions, leading to a vector of 10 features. We further extract information about the granularity of the cells from the normalized local-entropy-filtered image, following the same 1D histogram and GLCM analysis. This results in 11 more features: six image properties from the 1D histogram (with the mean gray-level intensity in addition), and five mean values of the image properties over the four directions of the GLCM. In total, each segmented cell generates up to 44 features, including 18 features describing its geometry and shape, and 26 features describing its morphology. Detailed descriptions of the extracted features are given in Table 1 and Table 3 in the Appendix.
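The GLCM construction and a representative subset of the derived properties (correlation is omitted here for brevity) can be sketched in plain NumPy. The direction labels follow the common image-processing convention and the offsets are the 3-pixel separation stated above; the quantization scheme is an illustrative assumption:

```python
import numpy as np

def glcm_features(gray, levels=64, offset=3):
    """Quantize to `levels` gray levels, tally co-occurring pixel pairs at the
    given offset along four directions, and derive texture properties."""
    q = np.floor(gray / gray.max() * (levels - 1)).astype(int)
    # (row, col) steps for 0, 45, 90 and 135 degrees
    steps = [(0, offset), (-offset, offset), (-offset, 0), (-offset, -offset)]
    feats = {}
    for (dr, dc), name in zip(steps, ("0", "45", "90", "135")):
        glcm = np.zeros((levels, levels))
        rows, cols = q.shape
        for r in range(max(0, -dr), min(rows, rows - dr)):
            for c in range(max(0, -dc), min(cols, cols - dc)):
                glcm[q[r, c], q[r + dr, c + dc]] += 1
        p = glcm / glcm.sum()                  # normalized co-occurrence matrix
        i, j = np.indices(p.shape)
        feats[name] = {
            "contrast": ((i - j) ** 2 * p).sum(),
            "energy": (p ** 2).sum(),
            "homogeneity": (p / (1 + np.abs(i - j))).sum(),
            "entropy": -(p[p > 0] * np.log2(p[p > 0])).sum(),
        }
    return feats
```

The per-direction means and dynamic ranges of these properties would then form the 10-element GLCM feature vector described above.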

Feature selection
A total of 44 image features, each with a normalized score/value ranging from 0 to 1, are extracted (see Table 1) to represent the geometries, morphologies and textures of the cell images. Prior to supervised learning for multi-class classification, feature ranking and selection are conducted to avoid feeding duplicate and irrelevant features into the machine learning model, which could cause misclassification and extend the computation time for training. In this work, we employ the bagging (or bootstrap aggregation) approach based on the random forest algorithm to evaluate and rank the importance of the features [27,28]. In brief, a random forest is a large collection of decision trees (500 trees in this work), each of which is trained (or "grown") with a random subset of the images (2/3 of the total image count), called a bootstrap sample. For each grown tree, the error rate of predicting the observations left out of the bootstrap sample, termed the out-of-bag (OOB) samples (1/3 of the total image count), is recorded. Next, we perform a random permutation of the values of the kth image feature across the OOB samples, which are fed to the tree again to measure a new prediction error rate. The average increase in the error rate across all trees is used as a measure for ranking the importance of the kth image feature in the random forest. The key feature of bagging in random forests is the ability to take the ensemble average of all decorrelated trees and thus reduce the variance, favoring a robust and unbiased feature selection process [27,28]. Our feature-ranking result shows that the low-resolution features, e.g. the geometrical parameters (highlighted in yellow in Fig. 4(a)), have comparably lower importance. In contrast, the texture-rich features (highlighted in blue in Fig. 4(a)), which can only be retrieved with high-resolution imaging, generally dominate the high-rank positions in the feature importance.
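The permutation step of the OOB importance measure can be illustrated generically: shuffle one feature across the held-out samples and record how much the prediction error increases. The `CentroidClassifier` below is a toy stand-in for a single tree in the forest (the actual work uses decision trees), kept minimal so the permutation logic stands out:

```python
import numpy as np

def permutation_importance(model, X, y, rng):
    """For each feature, permute its values across the held-out samples and
    record the increase in prediction error over the baseline."""
    base_err = np.mean(model.predict(X) != y)
    importance = np.zeros(X.shape[1])
    for k in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, k] = rng.permutation(Xp[:, k])   # break the feature-label link
        importance[k] = np.mean(model.predict(Xp) != y) - base_err
    return importance

class CentroidClassifier:
    """Toy stand-in learner: assigns each sample to the nearest class mean."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        d = ((X[:, None, :] - self.means_[None]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]
```

In the full scheme, this increase is averaged over all 500 trees (and normalized by its standard deviation across trees) to produce the ranking in Fig. 4(a).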
For feature selection, we then implement the multi-class SVM with 10-fold cross-validation and evaluate the classification accuracy against the number of selected features, taken in descending order of importance, i.e. starting from the most important feature. We note that the increase in classification accuracy generally slows down when more than 16 features are selected and reaches its maximum when the 26 most important features are chosen (Fig. 4(b)).

Fig. 4. (a) Feature importance ranking (see Table 3 in Appendix): the importance value of each feature is the increase in mean squared error (MSE) averaged over all 500 classification trees in the forest, divided by the standard deviation taken over the trees [27,28]. The low-resolution features are highlighted as yellow bars in the plot – most of which show low importance for classification. (b) 10-fold cross-validated accuracy of the multi-class SVM against the number of selected features, taken in descending order of importance.

Multi-class SVM classification of phytoplankton
The multi-class SVM classification is implemented with the one-versus-one coding approach [29]. Specifically, we employ the error-correcting output codes (ECOC) multi-class model with a one-vs-one coding matrix [29][30][31], which constructs N = n(n−1)/2 binary SVM classifiers (n = 14 is the number of classes, i.e. N = 91). The one-vs-one ECOC approach is chosen because of its high accuracy at the expense of a marginal increase in computation time due to the large number of binary classifiers [29]. Each binary classifier is trained with two groups of data coded as "+1" and "−1", and disregards the remaining data, coded as "0". It then generates an output code containing a sign, "+" for one class or "−" for the other, and a value representing the level of confidence, to be compared with the labeled code [31]. In the case of three-class classification, implementing the one-vs-one strategy results in 3 binary SVM classifiers (represented by the columns in Table 2), each of which is assigned a different code word.

Table 2. Assigned Code Words in the One-Vs-One Strategy
By implementing the loss-weighted decoding scheme with the hinge loss function, each image is classified as the closest matching case [32], i.e. by finding the smallest difference between the generated output code and the originally assigned code. The nature of the SVM is to find the best separating hyperplane in a high-dimensional space within a certain margin represented by the cost function. Due to the nonlinear relationship between the features and the class label, each binary SVM classifier first transforms the data into a higher-dimensional space using a kernel function, the radial basis function in this case, and then finds the hyperplane. The multi-class SVM model is therefore optimized with cross-validation by grid-searching two important parameters – the kernel scale γ in the kernel function and the cost parameter C – to avoid overfitting [30]. To measure the performance of the optimized SVM model, 10-fold cross-validation is implemented to compute the classification accuracy for each species. It involves three steps: (1) the data set is randomly partitioned into 10 subsets; (2) the SVM model is trained with 9 subsets and validated with the remaining subset to obtain the classification accuracy; and (3) step (2) is repeated for 10 rounds to evaluate the averaged accuracy. The key results of the multi-class one-vs-one SVM classification of the phytoplankton images are shown in Fig. 5. In general, the averaged 10-fold cross-validated classification accuracy across all 14 species is 94.7%, and the accuracies of the individual species are all well above 80% (Figs. 5(a) and 5(b)). It is noteworthy that classification based on only the low-resolution features, primarily extracted from the binary mask (i.e. those highlighted in yellow in Fig. 4(a)), yields accuracies of less than 60% for most of the algal species.
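The one-vs-one coding matrix of Table 2 and the loss-weighted hinge decoding can be sketched compactly. This is an illustrative reconstruction of the scheme described above (the actual binary classifier scores would come from the trained RBF SVMs):

```python
import numpy as np
from itertools import combinations

def ovo_coding_matrix(n_classes):
    """One column per class pair: the two paired classes are coded +1 / -1,
    all other classes are coded 0 (cf. Table 2)."""
    pairs = list(combinations(range(n_classes), 2))
    M = np.zeros((n_classes, len(pairs)), dtype=int)
    for j, (a, b) in enumerate(pairs):
        M[a, j], M[b, j] = 1, -1
    return M

def decode(scores, M):
    """Loss-weighted decoding with the hinge loss max(0, 1 - m*s): assign the
    class whose code word incurs the smallest average loss against the binary
    classifier scores."""
    hinge = np.maximum(0.0, 1.0 - M * scores[None, :])
    w = np.abs(M)                       # zero-coded classifiers are ignored
    loss = (w * hinge).sum(axis=1) / w.sum(axis=1)
    return int(np.argmin(loss))
```

For the 14-class problem this yields a 14 × 91 coding matrix, matching N = n(n−1)/2 = 91 binary classifiers.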
It clearly demonstrates the significance of high-resolution time-stretch imaging for accurate and precise multi-class classification with a high-throughput measurement capability. Several species show more than 5% misclassification and are confused with other species, notably Navicula and Chaetoceros gracilis, and Prorocentrum and Thalassiosira (Figs. 5(a) and 5(b)). Primarily, the misclassification may be attributed to the small cell sizes of these species, e.g. Navicula and Chaetoceros gracilis (average diameter < 5 µm), which result in an insufficient number of resolvable image pixels (~3 diffraction-limited resolvable pixels) to reveal the rich morphological features. Moreover, some species have similar texture heterogeneities that render the present descriptors of geometry and morphology insufficiently sensitive to classify them at high accuracy. Nevertheless, this problem could be mitigated by exploiting texture/feature information in other forms or domains (apart from the techniques described in this work, i.e. Table 1 and Fig. 4), namely simple Fourier analysis, wavelet texture analysis (WTA) [33], direct multivariate image analysis (MIA) [34], or even convolutional neural networks [35]. This could be significant for deeper multi-class classification, not only at the species or genus level, but also at the sub-type level within the same class of population, and thus allows higher classification accuracy and specificity.

Fig. 5. Comparison of the accuracy of the multi-class SVM classification using the low-resolution features (i.e. the geometric and shape features highlighted in yellow in Fig. 4(a)) and the selected 26 features. The labels refer to 1: Selenastrum capricornutum; 2: Chlamydomonas; 3: Scenedesmus; 4: Chaetoceros gracilis; 5: Gymnodinium; 6: Navicula; 7: Prorocentrum; 8: Thalassiosira; 9: Merismopedia; 10: Spirulina major; 11: Chlorella; 12: Euglena; 13: Synura; 14: Lyngbya.
To exemplify this idea, we examine the time-stretch images of the classified Scenedesmus and identify a small sub-population (~24%) showing hollow gelatinous structures, in stark contrast to the majority of the population, which is rich in texture (Fig. 6). However, these two sub-populations are obscured by the present SVM model despite the high classification accuracy of Scenedesmus (95.3%), implying that additional features are required to identify them. To this end, we explore the spatial-frequency domain of the images, i.e. we perform 2D Fourier analysis of individual Scenedesmus images. By extracting the averaged values within three annular rings of the 2D power spectrum and using them as the new classifiers, the dense and hollow sub-types are well distinguishable (Fig. 6). We anticipate that deeper learning could be achieved by further exploiting various feature domains, as mentioned in [33][34][35][36].
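The annular-ring Fourier features can be sketched as follows; the choice of equal-width rings up to the Nyquist radius is an assumption here, as the paper does not specify the ring boundaries:

```python
import numpy as np

def annular_ring_features(gray, n_rings=3):
    """Average the 2D power spectrum within concentric annular rings centered
    on the DC term, yielding one spatial-frequency feature per ring."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    rows, cols = gray.shape
    y, x = np.indices((rows, cols))
    r = np.hypot(y - rows / 2, x - cols / 2)      # radius from the DC term
    r_max = min(rows, cols) / 2
    edges = np.linspace(0, r_max, n_rings + 1)
    return np.array([spectrum[(r >= lo) & (r < hi)].mean()
                     for lo, hi in zip(edges[:-1], edges[1:])])
```

Intuitively, a texture-rich (dense) cell concentrates relatively more power in the outer, high-spatial-frequency rings than a smooth, hollow one, which is what makes these three values usable as sub-type classifiers.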

Concluding remarks
To fully exploit the capability of ultrafast single-cell imaging for practical high-throughput imaging flow cytometry, we have demonstrated an optofluidic time-stretch imaging system that achieves high image resolution and contrast for visualizing the sub-cellular complexity of single cells – a critical feature allowing accurate multi-class classification and analysis for revealing the heterogeneity across different populations of cells, and even within the same population. Specifically, the system was tested to classify 14 different species of phytoplankton based on a multi-class SVM with a high accuracy of 94.7% and at an imaging throughput beyond 10,000 cells/sec.
The current image-based classification is based entirely on bright-field image contrast. Other enhanced image contrasts can also be accessed and can facilitate better classification, such as differential interference contrast [37], phase-gradient contrast [3] and quantitative phase contrast [38,39]. Notably, quantitative phase imaging has been proven to quantify the protein content as well as the biophysical properties of cells [38][39][40]. Such additional parameters can also be extracted for single-cell analysis of intracellular biochemical content, e.g. lipid content, and for improving the classification accuracy. This is supported by a recent work which demonstrated interferometry-based quantitative-phase time-stretch imaging for binary classification of cells using a neural network algorithm [6]. When the high-resolution capability is enabled in quantitative-phase time-stretch imaging, deep neural networks for multi-class classification could readily become feasible. In addition, recent efforts have also shown that time-stretch imaging can be made even more versatile by incorporating fluorescence detection capability. As a result, not only the biophysical and morphological information, but also the biomolecular signatures of single cells, e.g. chlorophyll or other fluorescent markers in the phytoplankton, can be extracted at high throughput [41] – further enriching the collection of features to be used for classification.
The current work could find an immediate application in advancing environmental and toxicological studies of phytoplankton, which are intertwined with algal blooms on the local scale as well as climate change on the global scale. In contrast to the traditional methods of microalgae enumeration, identification and sizing, which are laborious, time-consuming and prone to erroneous classification, time-stretch imaging flow cytometry could enable automated real-time monitoring and detection of HAB (or red tide) [11] based on deep learning of an enormous amount of single-phytoplankton images. Moreover, it can also be utilized for large-scale screening and characterization of naturally occurring high-lipid microalgal species, a critical step for the design and optimization of biofuel production strategies [12].

Appendix (excerpt from Table 3) – Feature 12, Morphology, standard deviation of grey level: the standard deviation of the gray-level intensity calculated from the probability density function (PDF), σ = [Σ_{g=0}^{L−1} (g − μ)² h(g)/M]^{1/2}, where h(g) is the number of pixels with grey level g (ranging from 0 to L−1), μ is the mean grey level, and M is the total number of pixels inside the mask.