High-throughput, image-based flow cytometry and clustering method for phenotyping 3 heterogeneous cell populations

Image-based cell profiling has become a common tool to identify phenotypic changes in cells 22 exposed to various stimuli. To apply this approach to any research organism, we developed 23 Image3C (Image-Cytometry Cell Classification), a tool that enables clustering of single cells based 24 on their intrinsic phenotypic features by combining image-based flow cytometry with cell cluster 25 analysis. We conducted a morphology analysis of hematopoietic tissue from zebrafish and a 26 phagocytosis experiment. Here, Image3C could identify major hematopoietic cell lineages and, in 27 addition, cells with specific functions, which abundance can be statistically compared between 28 different treatments. To test the versatility of Image3C, we also clustered hemocytes of the apple 29 snail Pomacea canaliculata obtaining results consistent with those collected by classical 30 histochemical approaches. These experiments illustrate how Image3C can be used to classify and 31 visualize heterogenous cell population obtained from either invertebrates or vertebrates without 32 the need of antibodies or molecular databases. 33


Abstract 21
Image-based cell profiling has become a common tool to identify phenotypic changes in cells 22 exposed to various stimuli. To apply this approach to any research organism, we developed 23 Image3C (Image-Cytometry Cell Classification), a tool that enables clustering of single cells based 24 on their intrinsic phenotypic features by combining image-based flow cytometry with cell cluster 25 analysis. We conducted a morphology analysis of hematopoietic tissue from zebrafish and a 26 phagocytosis experiment. Here, Image3C could identify major hematopoietic cell lineages and, in 27 addition, cells with specific functions, which abundance can be statistically compared between 28 different treatments. To test the versatility of Image3C, we also clustered hemocytes of the apple 29 snail Pomacea canaliculata obtaining results consistent with those collected by classical 30 histochemical approaches. These experiments illustrate how Image3C can be used to classify and 31 visualize heterogenous cell population obtained from either invertebrates or vertebrates without 32 the need of antibodies or molecular databases. 33

Main text 34
Modern technologies used to analyze individual cells and subsequently cluster them based on 35 morphology, cell surface protein expression or transcriptome similarities are powerful methods for 36 high-throughput analyses of biological processes at single cell-resolution. Recent advances in 37 image-based cell profiling and single cell RNA-Seq (scRNA-Seq) allow quantification of 38 phenotypic differences in cell populations and comparisons of cell type composition between 39 samples 2 . While studies that use traditional research organisms (e.g. mouse, rat, human or fruit fly) 40 benefit from these methods due to the availability of mature genomic platforms and established 41 antibody libraries, the lack of such resources in non-traditional organisms prevents extensive use 42 of single-cell based methods to interrogate their biology. In these cases, classical histochemical 43 methods are often used to identify and characterize specific cells, but the quantification analysis 44 of specific cell types can be affected by both observer bias 3 and a dearth of quantitative frameworks 45 for making determination of cell classes. 46 To make the analysis of cell heterogeneity accessible to research organisms lacking genetic 47 platforms or extensive species-specific reagents, we developed Image3C. Our method analyzes, 48 visualizes and quantifies the composition of cell populations by using cell-intrinsic features and 49 generic, non-species-specific fluorescent probes (e.g., Draq5 or other vital dyes), thus eliminating 50 observer bias. Image3C is an extremely versatile method that is virtually applicable to any research 51 organism from which dissociated cells can be obtained. By taking advantage of morphology and/or 52 function-related fluorescent probes, Image3C can analyze single cell suspensions derived from any 53 experimental design and identify different constituent cell populations. Image3C combines 54 modern high-throughput data acquisition through image-based flow cytometry, advanced 55 clustering analysis and statistics to compare the cell composition between different samples. 56  Table S1 for details).

106
Next, we sought to determine whether Image3C can be used to detect clusters whose relative 107 abundance significantly changes after specific experimental treatments. We performed a standard 108 phagocytosis assay using hematopoietic cells from zebrafish, which were stained with Draq5 and 109 incubated with CellTrace Violet labeled Staphylococcus aureus (CTV-S. aureus) and 110 dihydrorhodamine-123 (DHR), a reactive oxygen species that becomes fluorescent if oxidized 111 All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint (Supplemental Methods). The DHR was used as a proxy for cell activation to report oxidative 112 bursting as a consequence of phagocytosis. As control, we inhibited phagocytosis through 113 cytoskeletal impairment by CCB incubation or through incubation at lower temperature (i.e. on 114 ice). 115 Events collected on the ImageStream® X Mark II (Amnis Millipore Sigma) were analyzed with 116 our pipeline and clustered in 26 distinct clusters using intensities of morphological and fluorescent 117 features (see Table S1), such as nuclear staining, S. aureus phagocytosis and DHR positivity (Fig.  118 2a). Professional phagocytes were defined by their ability to take up CTV-S. aureus and induce a 119 reactive oxygen species (ROS) response (DHR positive) 8 . To compare between samples incubated 120 with CTV-S. aureus and the respective control samples we used the statistical analysis pipeline 121 from Image3C, which is based on a negative binomial regression model (Fig. 2b). In zebrafish, 122 professional phagocytes are mainly granulocytes and monocytic cells and can be discriminated 123 from each other based on morphological differences (i.e. cell size, granularity and nuclear shape) 9 . 124 By combining the statistical analyses and the visual inspection of the cell galleries (Data File S3) 125 and intensity of morphological and fluorescent intensities (Data File S2), we identified 4 clusters 126 of professional phagocytes: granulocytes within cluster Dr4, Dr12 and Dr13 and monocytic cells 127 in cluster Dr21 (Fig. 2a, 2b). The morphology of cells in cluster Dr12 is characteristic of 128 phagocytic neutrophils (Fig. 2a) that become adhesive and produce extracellular traps upon 129 recognition of bacterial antigens 10 . Overall relative abundance of professional phagocytes is 5-130 All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint 10% (Fig. 2c), which is in line with previous studies that estimated the number of professional 131 phagocytes in hematopoietic tissue of adult zebrafish using classical morphological approaches 9 . 132 It is interesting to note that CCB selectively affects cell viability based on cell identity (Fig. 2b). 133 We found all erythrocyte containing clusters had a significantly higher count in the CTV-S. aureus 134 samples when compared to the CTV-S. aureus + CCB controls (Fig. 2b). Cluster analysis revealed 135 that erythrocytes are almost absent in samples incubated with CCB (Data File S2), while there is 136 a significant increase of dead and apoptotic cells (Fig. 2b, Table S2). Both outcomes are likely due 137 to reduced cell viability of erythrocytes upon CCB incubation. Moreover, we excluded the 138 possibility of higher cell death in the professional phagocytes upon CCB incubation, since we 139 The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint found here pseudo-phagocytes (phagocytes with DHR response but no internalized CTV-S. 140 aureus) to be significantly more abundant (Fig. 2b, Table S2). 141 Next, we inhibited phagocytosis by incubating the hematopoietic cells on ice (Supplemental 142 Methods) and compared the effectiveness of inhibition with the CCB control (Fig. 2c, Table S3). 143 We found that temperature inhibition of phagocytosis only affects adhesive neutrophils (cluster 144 Dr12), probably through the inhibition of adhesion, while CCB effectively blocks phagocytosis in 145 all professional phagocytes in zebrafish hematopoietic tissue (Fig. 2c). resemble the Group I hemocytes previously described using a classical morphological approach 157 11 . The second category is constituted by larger cells with lower nuclear-cytoplasmic ratio and 158 abundant membrane protrusions (clusters Pc1, Pc6, Pc7 and Pc9). Likely, these cells correspond 159 to the previously described Group II hemocytes that include both granular and agranular cells 11 . 160 To identify which of these clusters are enriched with granular cells, the intensities of the 161 morphological features related to cytoplasm texture provided by Image3C were compared between 162 All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint  Table S1 for details). (c) Box plot of event relative 190 abundance within each cluster following the same color-code used in Fig. 2a (Table S4). In the phagocytosis treatment 203 compared to the ice control, however, only cluster 27442 has a significantly higher relative 204 abundance of professional phagocytes (Table S5). 205 The data analysis with Image3C clearly highlighted that the classical phagocytic inhibitors, 206 CCB or EDTA, commonly used in controls for phagocytosis experiments, result in a drastic change 207 of cell morphology, a consequence not easily detectable by other methods and often overlooked. 208 In the present work, these changes significantly modified the overall cell cluster number and 209 distribution, and it must be taken into consideration in any study of morphological features of cells 210 with phagocytosing properties. Furthermore, when determining differences between experimental 211 treatments, Image3C necessarily combines images and data from all the treatments for clustering 212 (Supplemental Methods). Therefore, experiments meant to classify and analyze only innate cell 213 morphologies present in a tissue should be carried out separately from experiments where one or 214 more treatments are likely to significantly affect cell morphology in an unanticipated manner (e.g. 215 CCB or EDTA incubation). This would prevent treatment effects being conflated with innate 216 morphology differences among unperturbed cell types. 217 In summary, we have developed a powerful new method to analyze the composition of any cell 218 population obtained from any research organism of interest at single cell resolution without the 219 need for species-specific reagents such as fluorescently tagged antibodies (multicolor 220 All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint Image3C pipeline and associated R-scripts with input from CW. RP, ACB, AA and CW analyzed 243 and interpreted the data. RP, AA and ACB wrote the paper. All authors read and edited the paper.

Collection of zebrafish whole kidney marrow (WKM)
Twelve-month-old, wild type, female, adult zebrafish were euthanized with cold 500 mg/L MS-222 solution for 5 min. Kidneys were dissected as previously described 6

Collection of apple snail hemocytes
Specimens of the apple snail Pomacea canaliculata (Mollusca, Gastropoda, Ampullariidae) were maintained and bred in captivity, in a water recirculation system filled with artificial freshwater (2.7 mM CaCl2, 0.8 mM MgSO4, 1.8 mM NaHCO3, 1:5000 Remineralize Balanced Minerals in Liquid Form [Brightwell Aquatics]). The snails were fed twice a week and kept in a 10:14 light:dark cycle. Seven wild type adult snails, 7-9 months old and with a shell size of 45-60 mm were starved for 5 days before the hemolymph collection 11 . The withdrawal was performed applying a pressure on the operculum and dropping the hemolymph directly into an ice-cold tube.
The hemolymph was not pooled but the cells collected from each animal were individually analyzed. The hemolymph was immediately diluted 1:4 in Bge medium + 10% fetal bovine serum (FBS) and then centrifuged at 500 rcf for 5 min. The pellet of cells was resuspended in 100 µl of All rights reserved. No reuse allowed without permission.

Morphology Assay
The P. canaliculata hemocytes were stained with 5 µM Draq5 (Thermo Fisher Scientific) for 10 min, moved to ice and subsequently run one by one on the ImageStream® X Mark II (Amnis Millipore Sigma), where 10,000 nucleated and focused events were recorded for each sample. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint mM HEPES for apple snail cells 14 . After 2 h and 30 min we added 5 µM dihydrorhodamine-123 (DHR) (Thermo Fisher Scientific) to the cell suspension to stain cells positive for reactive oxygen species (ROS) production. To control for this treatment with DHR, we incubated the cells with 10 ng/mL phorbol 12-myristate 13-acetate (PMA) to artificially induce ROS production. At 2 h and 50 min since the beginning of incubation with CTV-S. aureus, all the samples were stained with 5 µM Draq5 for 10 min. After 3 h incubation with bacteria, cells were moved and stored on ice and subsequently run on the ImageStream® X Mark II (Amnis Millipore Sigma), where 10,000 nucleated and focused events were recorded for each sample.

Data collection on ImageStream® X Mark II
Following cell preparation, data were acquired from each sample on the ImageStream® X Mark II (Amnis Millipore Sigma) at 60x magnification, slow flow speed, using 633, 488 and 405 nm laser excitation. Bright field was acquired on channels 1 and 9. DHR (488 nm excitation) was collected on channel 2, CTV-S. aureus (405 nm excitation) on channel 7 and Draq5 (633 nm excitation) on channel 11. SSC was acquired on channel 6.

Data analysis
Raw image data from the ImageStream® X Mark II system was compensated, background was subtracted, and features were calculated using IDEAS 6.2 software (Amnis/Millipore). Feature intensities for all cells and samples were then exported from IDEAS into FCS files for processing in R. See github repository and Table S1 for a full list of features used for each organism and a more detailed description of processing steps. Briefly, exported FCS files were processed in R 7 to trim redundant features with high correlation values, fluorescence intensity features were All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint transformed using the estimateLogicle() and transform() functions from the flowCore package 4,15 , and DNA intensity features were normalized to remove intensity drift between samples using the gaussNorm function from flowStats 16 . The processed data was exported from R 7 using writeflowSet() function in flowCore package 4,15 .
Data and clustering results were then imported into the Vortex clustering environment for Xshift k-nearest-neighbor clustering 1 . During the import into Vortex, all features were scaled to 1SD to equalize the contribution of features towards clustering. Clustering was performed in Vortex with a range of k values, typically from 5 to 150, and a final k value chosen using the 'find elbow point for cluster number' function in Vortex and with visual confirmation of the result that over or under-clustering did not occur. Force directed graphs of a subset of cells in each experiment's file set were also generated in Vortex and cell coordinates in the resultant 2d space were exported along with graphml representation of the force directed graph. After clustering and generation of force directed graphs, tabular data was exported from Vortex that included a master table of every cell event and its cluster assignment and original sample ID, as well as a table of the average feature intensities for each cluster and counts of cells per cluster and per sample.
Clustering results were further analyzed and plotted in R 7 by merging all cell events and feature intensities with cluster assignments, and force directed graph X/Y coordinates. Using this merged data and the graphml file exported from Vortex, new force directed graphs were created per treatment condition using the igraph package 17 in R, statistical analysis of differences in cell counts per cluster by condition were performed using negative binomial regression of cell counts per cluster, plots of statistics results and other results generated (see github repository for details), and csv files containing cell and sample ID, feature intensities, X/Y coordinates in force directed and minimum spanning tree plots were exported for each sample in the experiment set for merging All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint results into daf files in FCS Express Plus version 6 (DeNovo software), which allowed visualization of cell images by cluster and by sub setting of regions within the force directed graphs.
Analysis of daf files was performed in FCS Express by opening daf files and using the "R add parameters" transformation feature to merge the csv files generated above with the daf file feature intensity and image sets. This allowed the generation of image galleries of cells within each cluster and additional analysis in the style of traditional flow cytometry (i.e., gating on 2d plots of features of interest) to explore the clustering results and identify candidate clusters and populations of interest.

Statistic
Negative binomial regression was performed on tables of cell counts per cluster, per sample and plots were generated using the edgeR package, which was developed for RNAseq analysis, but includes generally applicable and user-friendly wrappers for regression and modeling analysis and plotting of results.
All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/603035 doi: bioRxiv preprint Supplemental Tables   Table S1:       All rights reserved. No reuse allowed without permission.