Information‐Distilled Generative Label‐Free Morphological Profiling Encodes Cellular Heterogeneity

Abstract Image‐based cytometry faces challenges due to technical variations arising from different experimental batches and conditions, such as differences in instrument configurations or image acquisition protocols, impeding genuine biological interpretation of cell morphology. Existing solutions, often necessitating extensive pre‐existing data knowledge or control samples across batches, have proved limited, especially with complex cell image data. To overcome this, “Cyto‐Morphology Adversarial Distillation” (CytoMAD), a self‐supervised multi‐task learning strategy that distills biologically relevant cellular morphological information from batch variations, is introduced to enable integrated analysis across multiple data batches without complex data assumptions or extensive manual annotation. Unique to CytoMAD is its “morphology distillation”, symbiotically paired with deep‐learning image‐contrast translation—offering additional interpretable insights into label‐free cell morphology. The versatile efficacy of CytoMAD is demonstrated in augmenting the power of biophysical imaging cytometry. It allows integrated label‐free classification of human lung cancer cell types and accurately recapitulates their progressive drug responses, even when trained without the drug concentration information. CytoMAD also allows joint analysis of tumor biophysical cellular heterogeneity, linked to epithelial‐mesenchymal plasticity, that standard fluorescence markers overlook. CytoMAD can substantiate the wide adoption of biophysical cytometry for cost‐effective diagnosis and screening.


Introduction
Recent breakthroughs in label-free singlecell imaging have highlighted the significance of biophysical cytometry in understanding functional cellular heterogeneity in complex biological systems. [1]4][5][6][7][8] These imaging technologies are gaining traction in pharmaceutical industries and life science laboratories, providing critical mechanistic insights into cellular functions that may be concealed in molecular assays. [9,10]owever, cellular imaging experiments are notoriously complicated by batch effects that result from differences in imaging system configurations, image acquisition protocols, and experimental conditions.Specifically, in microfluidic imaging flow cytometry, batch-to-batch variations manifest from various sources, encompassing 1) the optical system factors such as power instability of the laser source and the noise originated from photodetection and signal amplification in the system; 2) the microfluidic aspects associated with the variations in the microfluidic chip fabrication quality that could potentially result in the image distortion/aberration.These unwanted technical variations are no exception in biophysical imaging cytometry, where they obscure genuine biological signals and hinder robust integrated data analysis across multiple datasets.Consequently, minimizing batch effect is of paramount importance, yet challenging, for improving data reproducibility and accurately capturing biological information. [11]atch normalization is a widely used strategy for imagebased batch correction. [12,13]However, current methods, including those based on machine learning techniques, face several limitations that can hinder their effectiveness in correcting batch effects.First, many methods require a priori knowledge or assumptions about the statistical distributions within each batch to align the data across different batches.Second, some methods necessitate the presence of a common control sample (e.g., a negative control) across all batches for normalization.In parallel, a few methods have been developed to address batch effect correction in single-cell omics analysis. [14,15]However, the inherent differences between 1D sequencing data and 2D (or even 3D) images, together with the significant diversity and complexity of biological image data structures, limit the direct applicability of these methods in image-based cellular/tissue analysis.Furthermore, these approaches typically fall short in generating batch-effectfree images directly for downstream analysis.Consequently, disentangling batch effects from biological image datasets remains a challenging task in imaging cytometry and is more complex compared to traditional omics data.This underscores the need to develop more effective batch effect correction methods tailored to image-based data.
Here, we introduce a new generative deep-learning pipeline, cyto-morphology adversarial distillation (CytoMAD).This approach not only enables batch-effect correction, but also allows simultaneous label-free image contrast translation to reveal additional cellular information.18] In contrast to previous deep-learning batch correction approaches [13,19,20] or image translation approaches, [21][22][23][24][25] Cy-toMAD offers three key unique attributes: 1) flexibility in modeling complex, non-linear data distributions, enabling correction of various batch effects without distributional assumptions; 2) accurate generation of QPI that is applicable to batch effect correction by learning to translate (augment) images across batches while conserving the biological content; 3) it simultaneously allows self-supervised batch-corrected morphological profiles for integrated downstream analysis.
We demonstrate CytoMAD's diverse capabilities in various applications, including accurate joint analysis across multiple batches for label-free classification of human lung cell types, functional drug-treatment assays for morphological changes in response to a panel of drugs at different concentrations (even the drug concentration is excluded in training), and integrated biophysical cellular analysis of tumor biopsies from multiple nonsmall cell lung cancer (NSCLC) patients.Our results showcase CytoMAD's versatility and utility for a wide array of applications in cell biology and biomedical research.

Overall Framework of CytoMAD
CytoMAD is a multi-task, generative deep-learning model designed to address batch effects in imaging cytometry, with selfsupervised learning characteristics.Specifically, its capability to help separate biological information from technical confounders is infused in the process image-contrast translation (i.e., QPI outputs predicted from BF images) as well as representation learning (i.e., generation of morphological features/profiles) (Figure 1a).Hence, it allows robust integrated analyses across multiple data batches downstream, including joint-dataset classification, biophysical marker discovery, and morphological profile interpretation (Figure 1b).
The model consists of two interconnected components: 1) the batch-aware generative-adversarial-network-based (GAN-based) image generation and 2) the morphology distillator (Figure 1c) (see the details in Methods).The GAN-based backbone in Cy-toMAD facilitates image generation and contrast translation for augmented cellular information.Prior to implementing the morphology distillator, the backbone model is pre-trained and guided by a feedback mechanism to achieve accurate image contrast conversion from bright field (BF) to quantitative phase images (QPI).Note that QPI formation bypasses the need for common complex optical QPI instrumentation (e.g., interferometric/holographic modules and multiplexed detection modules) and computationally-intensive QPI reconstruction algorithms. [26]When image contrast translation is not required, a convolutional autoencoder architecture can be used by equalizing the input and output target image of this GAN-based backbone.From this pretrained GAN-based backbone, the resultant morphological features at the GAN's bottleneck layer (denoted as no-CytoMAD-profile at this stage) and the output QPI (denoted as no-CytoMAD-images at this stage) would serve as the inputs for training the morphology distillator in the next stage.As the no-CytoMAD-profile is derived from the input image and subsequently utilized in reconstructing the output image, it effectively encodes information from both BF and QPI.Here single cell images are captured by a multi-modal imaging flow cytometer, called multi-ATOM, [2,4,7,27] operating at a high imaging throughput of ≈10 000 s cells sec −1 (see the system details in Methods).
CytoMAD distinguishes itself from the common generative model by integrating a self-supervised morphology distillatory.This component consists of a set of classification networks, including batch classifiers and cell type/state classifiers (Figure 1c).Both types of classifiers work symbiotically to disentangle batch variations from biological information (see the training strategy in Methods).The classifier duo is pre-trained based on no-CytoMAD-profiles and no-CytoMAD-images to identify the batch and cell-type information.To suppress batch distortion and enhance biological information at the phenotypic features and cellular images with minimal alterations, these classifiers are implemented at both the bottleneck region and the image output of the GAN-based backbone (Figure 1).Note that the model parameters of batch classifiers and cell-type/state classifiers are frozen during training, while only the GAN-based backbone parameters are updated in every epoch for ensuring batch-correction without losing the genuine image information (Methods).The batch distillation power of CytoMAD can be further enhanced by dividing the batch classifiers at the bottleneck region into multiple mini-classifiers and periodically retraining them at preset intervals (Figure 1c) (e.g., every 10 epochs (Methods)).Hence, the self-supervised element of the model comes from the continuous refinement of the GAN-based backbone, guided by the outputs from the morphology distillatory, as well as the parameters of the classifier duo.This process allows the model to internally adjust and optimize based on the CytoMAD-profiles and classifications it generates during training, without direct external labels guiding the adjustment.
In summary, the self-supervised morphology distillator forms a dynamic feedback system with the GAN-based image translation to separate batch information from biological variations of interest, ultimately achieving batch-distilled phenotypic features and cell images.For the sake of clarity, we denoted the morphological profiles and cell images generated from the full model of CytoMAD as CytoMAD-profiles and CytoMAD-images respectively.We also assessed the performance of the classical GAN-based backbone without batch-aware morphology distillator (denoted as no-CytoMAD) for comparison, with its resultant phenotypic features and cellular images denoted as no-CytoMAD-profiles and no-CytoMAD-images.

Label-Free Identifications of Human Lung Cancer Cell Types Across Data Sets
We first evaluated CytoMAD's performance on classifying seven human lung cancer cell lines representing three major lung cancer types based on their label-free biophysical morphologies: lung squamous cell carcinoma (LUSC) (i.e., H520, H2170), adenocarcinoma (LUAD) (i.e., H358, H1975, HCC827), and small cell carcinoma (SCLC) (i.e., H69, H526).Three different image batches were captured (on different dates) for each cell line, deliberately introducing complex batch-to-batch variations arising from factors such as variations in the optical systems and microfluidic aspects.These diverse datasets are to test CytoMAD's capability to identify the shared morphological profiles within the same lung cancer types, and thus faithfully distinguish different lung cancer types.
We also compared the performance from the full model of CytoMAD and that of the classical GAN-based backbone without batch-aware morphology distillator (denoted as no-CytoMAD).The average SSIM for QPI of no-CytoMAD-images and CytoMAD-images was 0.9473 and 0.9305, respectively, in-dicating high structural similarity between the generated QPI and the ground truth QPI, and thus reliable image contrast conversion.The average RMSE values for QPI generated of no-CytoMAD-images and CytoMAD-images were 0.0519 and 0.0654, suggesting accurate phase value reconstructions.The similar SSIM and RMSE values in these two cases imply comparable reconstruction performance after adding the morphology distillation module in CytoMAD.The high SSIM and low RMSE values in CytoMAD QPI confirm the reliable image contrast translation from BF to QPI.
In the morphology distillator module, we trained cell type classifiers using CytoMAD-profile (or CytoMAD's bottleneck latent features) (Figure 2c) and CytoMAD-images (Figure S1, Supporting Information) to assess biological information preservation while reducing batch-to-batch differences.Only one batch was used for training and validation, and classification performance was tested on two unseen batches (See Methods).For the cell type classifier based on CytoMAD-profile (Figure 2c), the across-batch classification accuracy was 0.2487 without CytoMAD (i.e., no-CytoMAD-profile), and it significantly improved to 0.7846 with CytoMAD, with all cell lines achieving >0.72 prediction accuracy.Classification performance based on CytoMAD-images was similar, with accuracies of 0.3280 without CytoMAD (i.e., no-CytoMAD-images) and 0.7768 with CytoMAD (Figure S1, Supporting Information).These substantial improvements in acrossbatch cell type classification, based on both CytoMAD features and images, demonstrate biological information preservation and reliability.Furthermore, this highlights CytoMAD's batchdistillation power to overcome batch-to-batch variations.
We used Uniform Manifold Approximation and Projection (UMAP) for visualizing the mitigation of batch-to-batch variations and conservation of biologically relevant cell type information.UMAP analysis was conducted on the CytoMADprofile across major lung cancer types (i.e., LUAD, LUSC, and SCLC) (Figure 2d) and different lung adenocarcinoma cell types (i.e., H358 (KRAS G12C ), H1975 (EGFR L858R/T790M ), HCC827 (EGFR Del19 )) (Figure 2e).Without CytoMAD, we observed multiple clusters within the same type, especially LUAD (Figure 2d), indicating an obvious batch effect which obscures the difference among the three major cell type populations.In contrast, Cy-toMAD merges different batches of the same type into single clusters, and the three major cancer types became distinct.Likewise, in the LUAD cell type study (Figure 2e), without CytoMAD, multiple clusters within the same subtypes suggested strong batch differences.With CytoMAD, clusters unified, and the three subtypes appeared distinct.These findings reaffirm CytoMAD's capability to minimize batch effects through batch-distilled phenotypic profiling.
We further investigate the interpretability of the CytoMADprofile in order to gain the model transparency and credibility, which are particularly pertinent in biomedical diagnosis.Specifically, we correlated the CytoMAD-profile with the handcrafted biophysical phenotypes of cells from the CytoMAD output QPI (i.e., CytoMAD-images) and input BF images.These handcrafted biophysical phenotypes of cells (a total of 84) features (Note S1, Supporting Information) were extracted based on a hierarchical morphological feature extraction approach, [4,7] which has recently shown promises in label-free single-cell morphological profiling.We identified the important CytoMAD-profile in classifying seven lung cancer cell lines based on feature importance and correlated them with hand-crafted biophysical phenotypes, which are categorized into 3 groups (i.e., bulk, global texture and local texture of biophysical morphology) (Figure S2a, Supporting Information).Specifically, hierarchical clustering grouped the selected CytoMAD-profile into five clusters (Figure S2b, Supporting Information).Notably, one group strongly correlated with global phenotypes, while another correlated with local biophysical phenotypes, offering insights into the biological relevance of CytoMAD-profile (Figure 2f).
We proceeded to analyze CytoMAD results for variations across major lung cancer types and LUAD cell types, as in the UMAP analysis.We compared biophysical phenotype distributions (readout from CytoMAD-images) across different cancer types (Figure 2g) and adenocarcinoma cell types (Figure 2h).Notable differences were observed in certain biophysical phenotypes across the three major cancer types (Figure 2g).They included the cell dry mass (bulk feature, effect size d = 0.40), opacity variance (global texture of cell opacity, effect size d = 0.57), and QP entropy mean and QP fiber mean (local textures of quantitative phase, or equivalently dry-mass density, effect size d = 0.44 and d = 0.30 respectively).In contrast, differences among LUAD cell types (Figure 2h) were more subtle, suggesting relatively similar cellular morphology across LUAD types compared to variations across major lung cancer types (Figure 2h; Figure S3, Supporting Information).

Delineating Cellular Responses to Drug Treatments with CytoMAD
Morphological profiling is now becoming a promising technique in drug screening, yet batch effects pose a notable challenge.We assessed the CytoMAD approach using label-free drug response assays on LUSC (H2170) cells treated with docetaxel, afatinib, and gemcitabine at various concentrations across two batches, i.e., 5 concentration levels and 1 negative control)forming 18 unique drug treatment conditions.Notably, during the CytoMAD model training, we provided only batch identifiers and drug types, purposely withholding drug concentration data.This approach was chosen to challenge the model's capability for self-supervised learning of concentration-dependent morphological features (see Figure S4, Supporting Information for detailed training strategy).
In a comparative analysis of single-cell, label-free input images (BF and QPI) and CytoMAD QPI images (Figure 3a), we observed that CytoMAD QPI closely resembled the ground truth QPI, with a high SSIM of 0.9241 and low RMSE of 0.0098 (Figure S5, Supporting Information).These results reinforces the reliability of CytoMAD's image contrast conversion in capturing contrast details pertinent to drug response assays.
To explore how well CytoMAD mitigates batch effects while maintaining critical cellular information, we performed UMAP analyses on negative control samples (DMSO) and IC50 samples of different drug treatments (Figure 3b).Prior to CytoMAD processing (i.e., no-CytoMAD) (orange area), samples from different batches appeared distinctly on the UMAP plot, indicative of batch effects.After applying CytoMAD (green area), the variance attributed to batch effects was significantly reduced among control samples, evidenced by the convergence of samples from six batches into a unified cluster.Interestingly, when assessing the drug-treated IC50 samples, CytoMAD facilitated the formation of well-defined, drug-specific clusters (Green area in Figure 3b), accentuating its ability to discern treatment-specific morphological signatures while minimizing batch-related noise.
To further quantify the CytoMAD's ability to delineate treatment responses, we next trained the drug treatment classifiers on IC50 samples.The training and validation sets consisted of a single batch with 2000 and 500 cells per treatment, respectively (Note S5, Supporting Information).The across-batch classification, tested on an unseen batch with 2500 cells per treatment, achieved accuracies of 0.91 with CytoMAD-profile, considerably higher than 0.43 with no-CytoMAD-profile (Figure 3c).This significant improvement underscores CytoMAD's effectiveness in reducing batch differences while preserving treatment distinctions.
To enhance the interpretability of CytoMAD's batch correction capabilities, we quantify the change in each biophysical phenotypic value after CytoMAD using a metric called, correction ratios (see Methods) (Figure 3d; Figure S6, Supporting Information).We identify the biophysical phenotypes that demonstrate significant correction ratios, particularly the global features such as dry mass variance and QP entropy centroid displacement.These phe-notypes are consistently corrected across all three drug treatment categories.
CytoMAD's morphology-distillation ability was quantified by measuring differences in the hand-crafted biophysical phenotypes (extracted from CytoMAD images) among batches for each drug treatment (Figure 3e; Figure S7, Supporting Information).Based on the correction ratios of each biophysical phenotypes (see Methods) (Figure 3d; Figure S6, Supporting Information), we identified phenotypes with strong and weak batch corrections (Figure 3e; Figure S7, Supporting Information).Interestingly, we observed that the features related to high spatial-frequency information (e.g., DMD contrast 3, and QP entropy centroid displacement) exhibit higher correction ratio.It could indicate to that Cy-toMAD are trained to more effectively correct/reduce the highfrequency image noise in different batches.
Another evidence of CytoMAD's effectiveness is the increased symmetry in violin plots after applying CytoMAD (comparing the left three violins with the right three) (Figure 3e).The CytoMAD plots exhibited a better symmetry when comparing the left and right violins, signifying a substantial reduction in inter-batch discrepancies, particularly for some biophysical phenotypes.
Based on our quantitative analysis that compared phenotypic distributions across drug treatments (represented by the violins of different colors in Figure 3e), the preserved patterns of the phenotypic distributions before and after CytoMAD show its ability to retain intrinsic cellular information for better discriminating the effect of different drug treatments.This is further corroborated by effect size measurements (after CytoMAD) for certain biophysical phenotypes across treatments include global dry-mass-density texture features (DMD contrast 4, effect size d = 0.19; phase range, effect, size d = 0.24) and local dry-massdensity texture features (QP entropy variance, effect size d = 0.17) (Figure 3e), which demonstrated considerable treatment-specific differences.
We further visualized how CytoMAD reduces the batch differences in different biophysical phenotypes along the drug concentrations (Figure 3f).By analyzing the trends of the biophysical phenotypes with different drug concentrations, we observe that CytoMAD effectively reduced the range of uncertainty in most of the biophysical phenotypes, as showed in the reduced shaded area (from orange to green) with CytoMAD, compared to that with no CytoMAD (Figure 3f; Figure S8, Supporting Information).It thus suggests CytoMAD's capability to decrease batch differences in the biophysical phenotypes of cells, such as area (bulk features), phase kurtosis (global texture features), and QP entropy mean (local texture features).Importantly, similar trends along drug concentrations between the orange (without CytoMAD) and green (with CytoMAD) shaded areas demonstrate the preservation of progressive changes in the CytoMAD model, despite excluding the concentration information during training.All the above findings suggest insights into distinct labelfree morphological responses to these drug treatments, consistent with their known different mechanisms of action (MoA).

Label-Free Single-cell CytoMAD Analysis of NSCLC Biopsies
Investigating tumorigenesis in non-small-cell lung cancer (NSCLC), the leading cause of cancer-related mortality worldwide, is essential for unraveling the complex biological processes underlying tumor invasion, metastasis, and therapy resistance, all of which contribute to poor patient outcomes.Among these processes, epithelial-mesenchymal plasticity (EMP), particularly epithelial-mesenchymal transition (EMT), plays a pivotal role in driving tumor malignancy and invasiveness, [28] characterized by the loss of epithelial markers and the gain of mesenchymal markers such as vimentin.In this study, we employ the CytoMAD method to investigate whether label-free biophysical cell morphologies can effectively capture subtle changes associated with EMP or related phenotypes in NSCLC biopsies from different patients.
In contrast to the previous demonstrations where lung cancer cell types (Figure 2) and cell states in response to drug treatments (Figure 3) are well defined, clinical biopsy samples are typically more heterogeneous.Thus, obtaining hand-labeled training data sets would costly or impractical.In order to more robustly train CytoMAD to better account for batch effects across highly heterogeneous clinical biopsies (seven patient samples of lung adenocarcinoma obtained on different dates in this study), we developed a different learning approach for CytoMAD (Methods).In brief, we first trained and tested CytoMAD to perform an auxiliary task of classifying tumor biopsy versus blood samples from each patient (upper panel of Figure 4a).The CytoMAD model, pre-trained with this auxiliary task, was then further trained to analyze the biophysical morphologies indicative of resected tumor and normal biopsies, as well as EMP-like phenotypes in the biopsies (lower panel of Figure 4a).In essence, this learning strategy leverages basic knowledge acquired from the initial, broader auxiliary task, in order to augment the model's capability to make finer distinctions in the downstream tasks.
Setting up this auxiliary task is not our ultimate intent, but instead to equip the model with an initial capability to learn biologically relevant representations of cell morphology through a "simple" classification task involving distinct samples (i.e., bloodversus-tumor).Indeed, tumor cells and blood cells are generally very distinct in morphological characteristics, e.g., cell size, average quantitative phase (i.e., dry mass density) across the cell (Figure S9, Supporting Information).Such a marked distinction in cell types thus enables CytoMAD more easily to recognize and thus distill the unwanted batch effects.Furthermore, the fact that both samples are intrinsically heterogeneous could let the model be aware of the genuine biological heterogeneity of the clinical samples -preventing the model from being overfit.This approach is beneficial for the model to be further trained and tested more effectively with the downstream tasks.
It is crucial to note that the labels for "tumor-versus-blood" are essentially weak labels specifying the origins of the cells, i.e., tumor biopsy or blood sample, instead of the exact cell types within these heterogeneously composed samples.Weak labelling, which has been used in weakly-supervised/self-supervised, [29][30][31] is particularly useful to alleviate the burden of obtaining hand-labeled data sets, which can be costly or impractical as in our study where both the tumor and blood samples could comprise of diverse cell types.Furthermore, blood samples are easily accessible through the minimally invasive procedures.Thus, they represent the clinically practical samples for training CytoMAD to discern batch effect effects (i.e., imaging system configurations, image acquisition protocols etc.) in the presence of tumor heterogeneity.We stress that the normal lung biopsy data was held out in training to assess the robustness of CytoMAD by analyzing unseen data.Based on the feature correction ratio analysis (similar to Figure 3d), we identified the group of biophysical phenotypes that consistently show significant correction ratios, particularly the global and local features (e.g., Fit Texture Skewness/Mean/Centroid displacement) (Figure S10, Supporting Information) across all blood, tumor and normal tissue samplessuggesting that the patterns of the feature corrections are consistent from auxiliary task to downstream task (tumor versus normal tissue sample).
The image translation performance (BF to QPI) proved to be robust across heterogeneous tumor biopsies, blood samples, and even unseen normal lung biopsies, achieving an average SSIM of 0.8881 and an RMSE of 0.0084 (Figure 4b; Figure S11, Supporting Information).We further evaluated CytoMAD's capability in correcting batch-to-batch variations among different patient batches based on the label-free CytoMAD-profile (Figure 4c,d; Figure S12, Supporting Information).After applying CytoMAD, different batches, which were originally largely segregated (especially in tumor samples), exhibited improved merging, demonstrating the effectiveness of our approach in addressing batch effects.
Next, we investigated the CytoMAD-profile (i.e., the biophysical cell morphologies) of the tumor and normal lung biopsies (lower panel of Figure 4a).Using the fluorescence detection module integrated with the same imaging cytometry system (Methods), we first identified that significant cell populations overexpress both epithelial cell adhesion molecule (EpCAM+) and vimentin (Vim+) (i.e., EpCAM+Vim+), as well as epithelial cell adhesion molecule without vimentin (referred to as EpCAM+) (Supplementary Figure S13).Despite there are currently no specific molecular markers can universally define the mesenchymal state in all EMT programs, [28] fluorescence labels targeting Ep-CAM and Vim are broadly regarded as valuable markers indicative of EMP -a partial EMT during which cells express with a mixture of epithelial and mesenchymal phenotypes. [28]With this knowledge, we sought to further train the CytoMAD model (which was pre-trained with the auxiliary task) to learn and analyze the biophysical morphologies of the major populations of EpCAM+ and EpCAM+Vim+ cells in the normal and tumor biopsies, that has not been comprehensively investigated.
Three additional, unseen patients (denoted as patients 5, 6, and 7) were later used to test the generalizability of the model across new patients, resulting in a total of 7 patient samples (Methods and Note S7, Supporting Information).These three additional patients were not part of the CytoMAD training process, serving as phenotypic distribution with CytoMAD.f) Batch distance on biophysical phenotypes along drug concentration.Selected biophysical phenotypes, with their corresponding values in different drug treatments, were plot along the drug concentration.Each line on the plot represents a batch of each drug treatments and the shaded area between lines denotes the phenotypic distance between batches, with orange color denoting no CytoMAD and green denoting with CytoMAD.
a real-life reference to test the model's generalizability in downstream biological analysis.This scenario mimics the application of CytoMAD to new patient data without retraining the model, illustrating its effectiveness with previously unseen patients.
Based on the gating for EpCAM+ and EpCAM+Vim+ cells (Figure S13, Supporting Information), we noted that various patient data batches exhibited reduced segregation in the UMAP embedding, showcasing the batch correction efficacy achieved by CytoMAD (Figure S14, Supporting Information).Notably, despite the exclusion of the three additional patients from the Cy-toMAD training set, their incorporation into the UMAP analysis also revealed a decreased level of batch segregation.This finding highlights CytoMAD's capacity to effectively handle new, unseen patient data, showcasing its robust generalizability.The observed less-segregated clusters in the UMAP plot not only served as a validation but also emphasized CytoMAD's efficacy in minimizing batch-to-batch variations across the patient datasets, even in the presence of new, previously unseen patient data.
To analyze how the CytoMAD profile can distinguish the tumor and normal tissues (Figure 4e,f), we first investigated the qualitative single-cell visualizations which display a great degree of cellular heterogeneity across different patients (for both tumor and normal samples) even after the batch effect correction by Cy-toMAD (Figure 4e).While such heterogeneity will later be studied and verified in the context of EMP-like phenotypes, it is more sensible and practical to next evaluate the ability of CytoMAD to distinguish between tumor and normal tissue on the sample (patient) level (given by the weak labels as mentioned earlier).
To this end, we investigated the ensemble average of the CytoMAD-profile for each tumor/normal tissue sample in the UMAP embedding, from which we observe the general segregation of normal and tumor tissues across all seven patients (Supplementary Figure S15).Furthermore, our logistic regression analysis, with bootstrapping, based on the CytoMAD profiles demonstrated a high classification accuracy, as quantified by the area under the curve (AUC) of 0.82 in the receiver operating characteristic curve (ROC) plot (Figure 4f; Figure S15, Supporting Information), significantly higher than the AUC (0.62) achieved in the case of no-CytoMAD.This demonstration shows that label-free biophysical phenotyping of cells using CytoMAD might represent a cost-effective approach for rapid malignancy detection and assessment of tumor samples -echoing the recent work on label-free biophysical phenotyping of tumor biopsy. [32]hile exhaustive profiling of all cell types within these heterogeneously composed samples is rather challenging with labelfree biophysical phenotyping, the observed cellular heterogeneity can still be analyzed in the context of the EMP-like phenotypes, using the available reference markers EpCAM and Vim (Figure 4g,h).Specifically, we next tested the feasibility of using CytoMAD to identify the specific populations of cells with known molecular phenotypes, i.e., EpCAM+ and the EpCAM+Vim+ cells.This could facilitate establishment of the biological grounding of label-free biophysical profile, extracted by CytoMAD, related to EMP (Figure 4g; Figure S16, Supporting Information).
While the label-free CytoMAD-profile could generally reveal the segregation of the EpCAM+ and the EpCAM+Vim+ cells, we also observed a great degree of inter-patient variations (Figure S16, Supporting Information).To delve deeper, we employed a deep neural network leveraging the label-free CytoMADprofiles to distinguish between EpCAM+ and EpCAM+Vim+ cells across all seven patients' data.The ROC analysis revealed a substantial improvement in classification performance post-CytoMAD, with AUC increasing from 0.88 to 0.97.Moreover, the CytoMAD model's predictions of population ratios between EpCAM+ and EpCAM+Vim+ cells closely matched the ground truth for most patients (Figure 4g).
We further validated CytoMAD's performance in handling heterogenous clinical samples by including EpCAM-Vim-cells in the classification task, i.e., to distinguish three cell populations: EpCAM+, EpCAM+Vim+, and EpCAM-Vim-(Methods).These proportions sufficiently capture the critical phenotypes along the EMT spectrum, with EpCAM+ denoting an epithelial phenotype, EpCAM+Vim+ indicating a transitional or hybrid EMT state, and EpCAM-Vim-highlighting non-epithelial/nonmesenchymal cells within the tumor microenvironment (such as immune cells, fibroblasts, or endothelial cells).Indeed, our analyses showed that CytoMAD profiles can successfully distinguish these populations, further substantiating the biological relevance An auxiliary task was first conducted to pre-train the CytoMAD (upper panel).In this task, the tumor and blood data from 4 patients were employed to train CytoMAD and to guide the disentanglement of valuable biological information from batch distortion during morphological distillation.Normal lung tissue samples, along with data from 3 additional NSCLC patients were not included in this auxiliary task training, in order to assess CytoMAD's performance and test its generalizability across new patient populations.These samples were utilized for our downstream task (Lower panel), focusing on the population analysis involving both normal and tumor samples and EMP-cells analysis.b) Images of NSCLC patient samples from multi-ATOM.The multi-ATOM label-free images of BF and QPI were reported.Color bar shows the phase values.Scale-bar: 10 μm. of the label-free biophysical profiles captured by CytoMAD, particularly relating to the EMP process (see Figures S15 and S16, Supporting Information).In this study, it is encouraging to observe that CyoMAD could reveal the correlation between biophysical morphologies and EMP-like phenotype (given by the markers of EpCAM and Vim).This result might indicate the potential of using label-free biophysical cell morphologies to infer these molecular-specific cell phenotypes, distilled from the batch effect.Specific studies, involving molecular assays (e.g., transcriptomics), with a larger cohort of patients are needed to further establish a more concrete link between the biophysical phenotypes of the targeted EMP-like cells with tumour malignancy, metastatic potential and even survival rate.The fact that CytoMAD can detect the small amount of cell population indicative of exhibiting the EpCAM+Vim+ phenotype (≈4%) in earlystage NSCLC patients may suggest early evidence for the phenotypic transformation required for metastasis.Note that high grade tumors were found in some patients (Note S7, Supporting Information).Indeed, previous studies have shown the EMT transcriptional transformation in the early cancer stages. [33,34]o explore further whether CytoMAD reveals additional biophysical insights that are otherwise absent in fluorescence markers.We examined the EpCAM+Vim+ cells and discovered three distinct subpopulations (Figure 4h; Figure S18, Supporting Information).Visually, subpopulation 1 predominantly consisted of smaller cells, while subpopulations 2 and 3 comprised progressively larger cells (Figure 4h).Furthermore, we observed distinct phenotypic differences between the three subpopulations based on the 84 biophysical features extracted from the CytoMAD output cell images (Note S1, Supporting Information).Based on the hierarchical feature categorization adopted earlier (Figure 2f), we found that the biophysical phenotypic differences between subpopulation 1 and subpopulations 2-3 (quantified by z-score) are mainly manifested by the bulk feature of Area and the global textures of dry mass density (e.g., Phase skewness, dry mass radial distribution).Conversely, subpopulations 2 and 3 exhibited similarities or less pronounced differences when compared to subpopulation 1. Noteworth distinctions between subpopulations 2 and 3 primarily stemmed from cell orientation and cell fit texture, which captures the statistical moments of the profile that highlight the high spatial frequency of the phase.These findings demonstrate that the EpCAM+Vim+ cells, which might represent EMP, exhibit further heterogeneity in the biophysical traits that are not captured by the standard EpCAM and Vim markers.These findings thus underscore the power of label-free biophysical imaging, coupled with CytoMAD, in delineating cell heterogeneity and providing added values in analyzing tumor environments in the clinical samples.We note that, using the imageactivated sorting modality recently developed, [17,[35][36][37][38][39][40][41][42][43] any of these cell populations can be isolated according to the biophysical morphological features and subsequently analyzed for their molecular signatures (e.g., transcriptomes).

Discussion
In summary, CytoMAD introduces a new generative and integrative deep-learning approach that effectively tackles batch effects in image-based cytometry, while simultaneously facilitating image contrast translation to unveil additional cellular informa-tion and deep-learning morphological profiling.This work emphasizes its utility in the burgeoning field of biophysical cytometry, where CytoMAD augments BF image data to obtain quantitative biophysical phenotypes, such as cell mass, mass density, and their subcellular local and global distributions.This has profound implications in simplifying complex optical instrumentation required for conventional QPI operations, such as interferometric/holographic modules and multiplexed detection modules. [26]s a result, CytoMAD could pave the way for the broader adoption of biophysical cytometry across a wide range of applications, as evident by our demonstrations including accurate label-free classification of human lung cell types, functional drug-treatment assays, and biophysical cellular analysis of tumor biopsies from patients with early-stage NSCLC.
Batch effects have been extensively studied in other single-cell data modalities (e.g., single-cell omics), but remain largely uncharted in cell imaging, with few exceptions. [13,20]In the current study, our primary emphasis was on addressing inter-batch variations, operating under the assumption that intra-batch variations are negligible.This assumption is supported by our comprehensive analysis of intra-batch variations in another comparable flow imaging cytometry technique [6,44,45] (Figure S19, Supporting Information).CytoMAD's unique ability to harness the power of self-supervised learning allows it to disentangle batch variations from biologically relevant morphological information during cross-modality image translation, enabling robust integrative image-based analysis across batches.We emphasize that Cy-toMAD is designed to mitigate rather than completely eradicate batch-to-batch variations.We showed that CytoMAD could still preserve subtle intra-batch heterogeneity (e.g., morphologically distinct populations of cells within the same batch (Figure S20, Supporting Information).This could ensure biological traits are retained while adeptly minimizes batch effects, enhancing data integration (e.g., the significant improvement in classification accuracy in Figures 2c and 3c) without indiscriminately conflating batches.
Without imposing any a priori assumptions on complex data distributions or requiring extensive manual annotation, Cy-toMAD delivers accurate QPI generation from BF images, as well as self-supervised batch-corrected morphological profiling for downstream analysis.All these aspects set CytoMAD apart from existing batch correction algorithms (Figure S21, Supporting Information).Our benchmarking revealed that CytoMAD achieved a high across-batch accuracy of 0.91 in feature-based classification, on par with or outperforming the top-performing batch correction methods [15] : HARMONY, [14] MNN [46] and Seurat, [47] which are commonly used in the single-cell omics fields for batch effect correction.It should be noted that these batchcorrection methods are primarily designed for 1D single-cell omics data (e.g., gene count data), not for the 2D morphological image data that CytoMAD handles (Figure S21, Supporting Information).CytoMAD's tailored approach for 2D image data maintains the integrity of morphological information, crucial for accurate downstream biological analysis.In contrast, the batch correction methods designed for omics-data often employ dimensionality reduction, such as principal component analysis, as a pre-processing step.This step, however, could potentially obscure/distort subtle yet significant morphological features, which could impact the direct interpretation of cell morphology (Figure S21, Supporting Information).As a result, we stress that the comparisons with the existing methods for omics data should be regarded as indicative rather than definitive due to the absence of a direct, like-for-like benchmark.
Notably, CytoMAD can accurately predict progressive morphological changes in response to drug concentration trends, even without prior annotation (Figure 3).Utilizing blood and tumor classification as an auxiliary task, CytoMAD effectively corrected batch effects and broadly distinguished the normal and tumor biopsies (Figure 4e,f).We stress that, in this clinical study, biophysical phenotyping of cells by CytoMAD is not meant to provide exhaustive fine-grained classifications of all cell types within the heterogeneously composed tumor and normal tissue samples.Instead, motivated by that substantial body of work has shown a strong correlation between malignancy and the biophysical properties of cells, [32,[48][49][50][51] we evaluated the ability of Cy-toMAD to exploit this correlation for detecting malignancy in human tissue biopsies, i.e., distinguishing between tumor and normal tissue, bypassing the need for staining or expert assessment.On the other hand, to study the correlation between biophysical phenotype and specific cell types within the samples, separate model training with targeted known molecular markers is still required, similar to our study of label-free morphologies correlation with the EpCAM and vimentin phenotypes in NSCLC biopsies (Figure 4g,h).
We envision that this method might be used to provide an objective and diagnostic scoring system that could detect pathological changes in biopsies in an unbiased, and cost-effective manner.It could particularly be advantageous when combined with a rapid biopsy processing pipeline, such as the recently reported protocol of enzyme-free mechanical dissociation. [32]In this case, CytoMAD, together with our high-throughput imaging flow cytometer, could serve as a fast and label-free approach that may aid the intra-operative detection of tissue malignancy. [32]uture investigations on larger patient cohorts could substantially boost the generation of massive patient datasets obtained by CytoMAD analysis, composed of 10-100′s thousands of cell images and high-dimensional morphological information.We thus anticipate that CytoMAD could be further generalized to learn the correlation between the biophysical phenotype single-cell data and tumour malignancy, metastatic potential and even survival rate.
Despite its promising results, CytoMAD could offer several opportunities for further development.First, the crossmodality image translation/augmentation could be extended to other forms of contrast translation, particularly involving fluorescence image contrast, such as BF to fluorescence, [21,22,24] QPI to fluorescence, [23,52,53] or fluorescence to colorized BF. [54,55] This augmentation approach could help establish a knowledge base that correlates and transfers molecular specificity into labelfree morphological phenotypes of cells and tissues.Second, enhancing the interpretability of CytoMAD morphological profiles would enable more intuitive deciphering of underlying biological processes and mechanisms.Potential strategies to improve interpretability in CytoMAD include incorporating feature attribution methods (e.g., Layer-wise Relevance Propagation (LRP), [56] Gradient-weighted Class Activation Mapping (Grad-CAM) [57] ) to visualize influential regions within input images, providing insights into learned representations; or integrating disentangled learning techniques within the autoencoder architecture to learn more interpretable and independent features that might be more readily linked to underlying biology.
Overall, our findings demonstrate that CytoMAD heralds a new development of deep-learning-based, batch-aware morphological profiling of cells.Its application in biophysical cytometry highlights its potential for accurate and insightful biophysical investigations into complex biological processes.It might enrich our understanding of cellular functions and inspire the discovery of cost-effective biomarkers for diagnostic and therapeutic purposes.

Experimental Section
Overall Framework of CytoMAD: CytoMAD was a generative deeplearning model tailored for batch-aware cell morphology distillation.It was built upon a conditional GAN (cGAN) of a Pix2Pix architecture [58] integrated with a set of classification networks, altogether enabling robust image contrast translation and distillation of biological phenotypes from batch variations.CytoMAD offers both batch-distilled phenotypic features (i.e., CytoMAD-profile) and cellular images (i.e., CytoMAD-images) as model output.These results could further be utilized for downstream biological analysis.In general, the CytoMAD model comprises two synergistic functions: i) the GAN-based backbone for image generation and contrast translation, and ii) the classifier-guided, batch-aware morphology distillation (Figure 1b).

Pre-Training CytoMAD for Image Contrast Translation:
With cGAN as the backbone, the CytoMAD model comprises a generator network for image-to-image translation and a discriminator classifier for optimizing the generator prediction through a feedback mechanism.Before implementing the batch-aware module, the model wass initially pre-trained for image generation and contrast translation.
The generator takes cell images captured in a particular image contrasts (e.g., brightfield (BF)) as the model input.The images are then directed to the encoder and undergo multiple layers of 2D convolutional layers, batch normalization layers, and mathematical activation functions.These layers condense the biological information from the input images into a 1D array at the bottleneck layer.Such an array encodes the important features representing the input cell images and, thus, serves as the cellular phenotypic features, which is referred to no-CytoMAD-profile at this pre-training stage.The output images at this pre-training stage, called no-CytoMADimages, are reconstructed based on this 1D information with an additional image contrast translation step (i.e., from BF to quantitative phase image (QPI) in this paper).The array passes through multiple deconvolutional layers, batch normalization layers, and mathematical activation equations at the decoder for image reconstruction and translation.Skip-in layers are implemented between the encoder and decoder for better image feature preservation (Note S2, Supporting Information).
For the discriminator network, it improves the image reconstruction/translation of the generator by distinguishing its predicted images from original target images (i.e., original "ground-truth" QPI (Figure 1)), hence, forming a feedback mechanism with the classification loss for optimizing the generator parameters.Hence, the CytoMAD model can be pre-trained for accurate image reconstruction and contrast conversion from BF to QPI.The contrast translation serves as an additional feature of the CytoMAD for augmented cellular information, in conjunction with its batch-aware capability.In the applications where image contrast conversion was not required, convolutional autoencoder architecture could be adopted by equalizing the input and output target images of the cGAN backbone.
Classifier-Guided Batch-Aware Morphology Distillation: CytoMAD differs from the existing GAN models by the implementation of morphology distillator.The morphology distillator was a set of classification networks, which learn in synergy to minimize batch-to-batch variations while preserving biological information in the images.These classification networks take a crucial role here through 2 types of classifiers: batch classifiers and cell-type/state classifiers (Figure 1).
These classifiers were first pre-trained based on no-CytoMAD-profile and no-CytoMAD-images to identify the batch and cell-type information in the cGAN backbone.They were then implemented at both the bottleneck region and the output of the generator model, with the frozen models' parameters, for guiding the next training round of batch correction in the generator model.By freezing the classifiers' parameters and including the morphology distillatory loss into the CytoMAD loss function, it was enforced the generator to recognize and remove batch information while preserving cell-type information with minimal alterations with the no-CytoMAD-profile and no-CytoMAD-images.
At the bottleneck region of the pre-trained generator, the batch classifiers aim to reduce the batch-to-batch variations, while the cell-type/state classifiers learn to preserve the cellular variations in a 1D feature format, which was referred to the CytoMAD-profile.These classifiers follow the basic framework of neural networks, which was detailed in Note S3 (Supporting Information).Since the output images were reconstructed chiefly based on the bottleneck 1D features, these classifiers promote the disentanglement between the batch information and cell morphological information.Hence, they facilitate the morphology-distillation process, uncovering valuable morphological information in both the 1D CytoMAD profile and 2D batch-aware cell images (i.e., CytoMAD-images).
Additional batch and cell-type/state classifiers based on convolutional neural networks were also present at the end of the generator (Note S3, Supporting Information).They guide the reconstruction of batch-aware cell images (CytoMAD-images) and remove the batch information induced by the encoder-decoder skip-in layers.Employing multiple batch classifiers and periodically retraining them at predetermined intervals (e.g., every 10 epochs) in the CytoMAD model could facilitate and expedite the batch correction procedure.
During the CytoMAD training, the model parameters of batch classifiers and cell-type/state classifiers within the morphology distillator were frozen, while only the generator's parameters and the discriminator's parameters were updated in every epoch for batch-correction and ensuring the image prediction accuracy.
Overall, all these classification networks contribute to the CytoMAD loss function L CytoMAD . where L GAN is the loss of GAN-backbone model, with W gen and L gen denoting the weighting and the mean square error loss of generator model respectively, W dis and L dis denoting the weighting and the binary cross entropy loss of discriminator model.L cnn is the loss of convolutionalneural-network-based classifier models, with W Bcnn and L Bcnn denoting the weighting and the categorical cross-entropy loss of batch classifier model, W Ccnn and L Ccnn denote the weighting and the categorical cross-entropy loss of cell type classifier model.L nn is the loss of neural-network-based classifier models, with W Bnn and L Bnn denoting the weighting and the categorical cross-entropy loss of batch classifier model, W Cnn and L Cnn denoting the weighting and the categorical cross-entropy loss of cell type classifier model.All these networks form an integrated dynamic feedback system with the GAN-based image translation for extracting the batch information from the biological variations of interest and, eventually, achieving batch-aware capability for morphological CytoMAD profiling and the image contrast translation.
Assessment of CytoMAD Performance: Image Contrast Translation Accuracy: It was employed two metrics to evaluate the performance image translation (BF to QPI) by quantifying the similarity and difference in image pixel values between the original images and the CytoMAD images (i.e., the phase values).
Structural Similarity Index Measure: Structural similarity index measure (SSIM) was a perceptual metric broadly adopted for quantifying similarities within pixels between images. [59]It evaluates how well the image structure in CytoMAD images was preserved from original target images (i.e., QPI).Given that the valuable biological information resides in the cell region and the downstream analysis was conducted based on this region, the SSIM values were calculated and reported based on the cell area only through cell segmentation to differentiate cell bodies from the background.A high SSIM value of approaching 1 represents a high similarity between images.
Root Mean Square Error: Root Mean Square Error (RMSE) was also used for computing the pixel to pixel values difference between the original images and the CytoMAD images.Similarly, RMSE were reported based on the cell region only for comprehensive study.
Assessment of CytoMAD Performance: Morphology Distillator: The batch effect removal efficiency of CytoMAD through the visualization given by Uniform Manifold Approximation and Projection (UMAP) and quantification of biophysical phenotypic profile correction was examined.
UMAP Analysis of Batch-Effect Reduction and Biological Information Preservation: To evaluate the ability to mitigate the batch-to-batch variations and to simultaneously preserve the biological information, UMAP analyses were conducted based on the no-CytoMAD-profile and CytoMADprofile.This visualization enables the observation of the degree of batch mixing across the multiple batches, and hence, visualizing the batch effect removal efficiency of CytoMAD.At the same time, it also allows direct visualization of the how well the data of the same cell types/state across the multiple batches could cluster together in UMAP -an indication of the ability to preserve the biologically relevant information in the data.
Quantification of Batch-Effect Reduction in the Biophysical Profiles: In addition to UMAP examination, it was also conducted quantitative analysis to assess the batch effect reduction efficiency.This was achieved by first extracting the biophysical morphological profiles from both the CytoMADimages and the original QPI.This profile parametrized a catalog of 84 biophysical features of cells following a spatial hierarchical manner, including bulk phenotypes (e.g., area, circularity), global phenotypes (e.g., dry mass density, attenuation density) and local phenotypes (e.g., BF entropy, phase entropy). [4,7]It was thus quantify the mean values of each biophysical feature within each batch of samples.It was then defined "batch distance" as the difference in mean values across batches of the same samples.The batch distance of each biophysical features between the original QPI and the CytoMAD images was further compared.The reduction in batch distance thus indicates the decrease of batch-to-batch variations across samples.
Here the rationale of utilizing biophysical profile instead of bottleneck CytoMAD profile for assessing the batch-effect reduction performance is that biophysical profiles are constructed based on the well-defined geometrical metrics as well as human interpretable features.This makes them well-suited for drawing biological interpretation in the detailed morphological analysis (e.g., Figure 2f-h, Figure 3d-f, Figure 4d).By computing the batch distance reduction based on biophysical profile from CytoMAD images, it could gain further insights into which specific biophysical phenotypes were more prone to batch effect and demonstrate stronger batch correction with CytoMAD.This analysis offers valuable cues on the impact of batch effect on different biophysical profile and the efficacy of CytoMAD in overcoming these differences on each biophysical feature (Figures S6-S8 To better assess the preservation of biological information while reducing the batch-to-batch differences, the cell type/state classifiers were trained with one batch/selected batches of samples only and tested using unseen batches to evaluate the across-batch classification performance. Multi-ATOM Imaging: Multi-ATOM combines the time-stretch imaging technique [60,61] and phase gradient multiplexing method to retrieve complex optical field information (including the BF and quantitativephase contrasts) of the cells at high speed in an interferometry-free manner (Figure 1a).Detailed working principle and experimental configuration were reported previously. [2,60]In brief, a wavelength-swept laser source was first generated by a home-built all-normal dispersion (ANDi) laser (centered wavelength: 1064 nm; bandwidth: ≈10 nm; repetition rate: 11 MHz; pulse width = ≈12 ps).The laser pulses were temporally stretched in a single-mode dispersive fiber, and were then amplified by an ytterbium-doped fiber amplifier module. [62]The pulsed beam was subsequently launched to and spatially dispersed by a diffraction grating into a 1D line-scan beam which was projected orthogonally onto the cells flowing in the customized microfluidic channel.This line-scan beam was transformed back to a single collimated beam after passing through a doublepass configuration formed by a pair of objective lenses (N.A. = 0.75/0.8).Afterwards, the beam conveying phase-gradient information of the cell was split into 4 replicas by a one-to-four de-multiplexer, where each beam profile was half-blocked by a knife edge from 4 different orientations (left, right, top and bottom) respectively.Recombining the 4 beams by a fourto-one fiber-based time-multiplexer, its were able to detect the line-scan phase-gradient information in 4 directions in time sequence at high speed by a single-pixel photodetector (electrical bandwidth = 12 GHz).The digitized data stream was processed by a real-time field programmable gate array (FPGA) based signal processing system (electrical bandwidth = 2 GHz, sampling rate = 4 GSa s −1 ) for primary cell detection and image segmentation with a processing throughput of >10 000 cells s −1 in real-time.These segmented phase-gradient images of cells were sent to four data storage nodes (memory capacity > 800 GB) through four 10G Ethernet links, which were reconstructed to 2D complex field information following a complex Fourier integration algorithm, detailed elsewhere. [27]The multi-ATOM system also includes a fluorescence detection module, similar to the previous work. [4]In brief, two continuous wave (CW) lasers (wavelength: 488 and 532 nm) were employed to generate line-shaped fluorescence excitation, that were spatially overlapped with the multi-ATOM illumination.The two epi-fluorescence signals were detected by two photomultiplier tubes (PMT) separately.In the analog electronics backend, we multiplexed the PMT-detected signals by frequency modulation (11.8 and 35.4 MHz respectively, using a multichannel direct digital synthesizer).The multiplexed signals were then separated by digital demodulation and low-pass filtering.The same FPGA was configured to synchronously obtain the signal from multi-ATOM and fluorescence detection from each single cell at high-speed.
Microfluidic Chip Fabrication: The detailed microfluidic chip fabrication steps can be referred to our previous work. [4]In brief, the microfluidic channel was fabricated using standard soft lithography on a silicon wafer mold.A layer of photoresist was applied to a silicon wafer using a spin coater, followed by a two-step soft-bake.The photoresist was then patterned with a maskless soft lithography machine.The exposed photoresist was post-baked and developed.Subsequently, a polydimethylsiloxane (PDMS) precursor was applied to the silicon wafer and cured to form the channel.Post-curing, the channel was demolded and prepped for plastic tubing insertion.The channel was bonded to a glass slide using oxygen plasma, followed by an oven bake to strengthen the bonding.The microfluidic chip contains a serpentine channel structure, critical for robust in-focus single-cell imaging, which included 8 repeated units.The channel dimension at the imaging section was 30 mm × 60 mm.
Datasets: Effective training of deep learning models generally necessitates large datasets.In this regard, the high-throughput nature of the multi-ATOM imaging flow cytometry plays a vital role in generating large-scale, label-free cell images at ultrahigh-throughputs of > 10 000 cells s −1 .Prior to CytoMAD training, these large-scale single-cell image datasets (captured from in vitro cultured cells, as well as the clinical patient samples) were first screened by a pre-processing pipeline for data quality control (Figure S22, Supporting Information).Cell segmentation was first applied to the BF and QPI images captured by multi-ATOM to generate the cell body masks, which facilitated the definition of biophysical phenotypes as well as a set of focusing factors.A total 84 biophysical phenotypes, which represent the cell morphological properties, were defined and grouped into 3 hierarchical categories: bulk phenotypes (e.g., area, circularity), global phenotypes (e.g., dry mass density, attenuation density) and local phenotypes (e.g., BF entropy, phase entropy).Furthermore, it was also defined a set of cell focusing factors from the segmented masks that quantify the image focusing quality and thus to help exclude out-offocused cell image.These focusing factors, considering various aspects such as gradient-based measures and intensity statistics within the cell regions, aim to assess the sharpness and clarity of cellular features.By incorporating both spatial gradients and pixel intensities, these factors provide a comprehensive quantitative evaluation of image focus quality.All these parameters were then used to generate 2D scatter plots for preliminary visual image inspection and cell gating to exclude cells of out-of-focused or debris.Subsequently, all the remaining cells within the population were selected for the CytoMAD training/testing and the downstream analysis.This pre-processing pipeline ensured data quality across all multi-ATOM imaging datasets, laying a robust foundation for quantitative cell analysis.
To produce multiple batches of data for the CytoMAD's demonstration, cell images were collected on different dates using multi-ATOM.This approach intentionally introduced inherent, intricate batch-to-batch variations attributed from various sources, including 1) the optical system factors such as power instability of the laser source (e.g., the intensity noise of the pulsed laser) and the noise originated from photodetection and signal amplification in the system; 2) the microfluidic aspects associated with the variations in the microfluidic chip fabrication quality that could potentially result in the image distortion/aberration.This deliberately introduced complex batch-to-batch variations, aiming to validate CytoMAD's ability in minimizing batch effects and provide a realistic scenario regarding integrated analysis across datasets.
Lung Cancer Cell Lines: In this study, a total of 7 lung cancer cell lines were imaged by multi-ATOM and analyzed on 7 different days, producing 3 batches of ≈120 000 cells per cell line (i.e., >1 000 000 single-cell images in total, each of which consists of two label-free contrasts: BF and QPI).These cell lines represent three major lung cancer types: lung squamous cell carcinoma (LUSC) (i.e., H520, H2170), adenocarcinoma (LUAD) (i.e., H358, H1975, HCC827), and small cell carcinoma (SCLC) (i.e., H69, H526).The CytoMAD model was trained with 1000 cells per cell line per batch, validated with 200 cells per cell line per batch and tested with ≈40 000 cells per cell line per batch (see Note S5, Supporting Information).
Drug Assays of H2170 Cells: In this experiment, H2170 cells were treated with three drugs, which have different mechanisms of action (MoA), i.e., Docetaxel as microtubule stabilizing agent, [63] Afatinib as tyrosine kinase inhibitor in targeted therapy [64] and Gemcitabine as antimetabolite [65] ), each with 5 concentration levels and a negative control with dimethyl sulfoxide (DMSO) treated for 24 h (see Note S6, Supporting Information).They were imaged using multi-ATOM for single-cell BF and QPI images on 6 days, forming 2 batches with ≈100 000 cells per drug.Basically, this dataset consists of 2 batches of data, with each batch containing 3 different drug treatments and each treatment comprising 6 different concentration conditions.This resulted in 18 unique drug treatment conditions in each batch.
In contrast to the previous lung cancer sub-type classification, this drug assay study focused on the ability of CytoMAD to reveal subtle and progressive changes in response to different drug concentration.The Cy-toMAD model was trained with 1000 cells per drug treatment conditions per batch, validated with 500 cells per cell line per drug treatment conditions per batch, and tested with 5000 cells per drug treatment conditions per batch (see Note S5, Supporting Information).To examine the capability of CytoMAD in preserving progressive changes along drug concentration, only the batch information and the drug treatment types (i.e., docetaxel, afatinib, gemcitabine, and control) were provided to the model for guiding the batch-aware morphology distillation.The drug concentration information was held out to assess the power of CytoMAD.
Lung Cancer Patients Samples: We recruited and collected samples from 7 NSCLC patients diagnosed with adenocarcinoma in the Queen Mary Hospital of Hong Kong, with each of them composed of resected lung tumors tissue, normal lung tissue and 9 mL of peripheral blood samples (see Note S7, Supporting Information).Written consents for clinical care and research purposes were obtained from the donors.The research protocol was approved by the University of Hong Kong/Hospital Authority Hong Kong West Cluster Institutional Review Board (UW 19-278).After the standard sample pre-processing (e.g., disaggregating tissue into single-cell suspension, red blood cell lysis), the patient samples were imaged with multi-ATOM on separated dates, forming 7 batches with 2 labelfree contrast images of BF and QPI of ≈180 000 cells.For each patient, their samples (i.e., tumor sample, normal lung tissue sample and peripheral blood sample) were collected on the same days and under the same optical system settings.Thus, negligible batch effect across these samples from the same patient.
CytoMAD model was trained based on tumors and blood samples from patients, comprising 1000 and 500 cells per sample per patient for training and validation respectively.Its performance was then assessed with >120 000 cells from resected tumors and peripheral blood, in addition to ≈56 000 cells from unseen (held-out) patients' normal lung tissue (see Note S5, Supporting Information).Similar to the demonstrations of lung cancer cell type classification (Figure 2) and drug assay (Figure 3), here CytoMAD was trained to generate the batch-corrected morphological profiles (i.e., CytoMAD-profile) and predict the QPI (i.e., CytoMAD-images) from the input BF images.
In order to robustly train CytoMAD to account for batch effects across highly heterogeneous clinical biopsies, it was first set up an auxiliary task for training CytoMAD to recognize the technical batch effect.The auxiliary task was an initial learning phase where the model was first trained and tested on a simpler, yet related, classification problem.In this case, the auxiliary task was to differentiate between distinct sample types (i.e., blood versus tumor sample).This task was not the ultimate intent, but instead to enable the model to first learn biologically relevant representations of cell morphology and thus distill the unwanted batch effect.This was analogous to a model trained initially to distinguish between broad categories of vehicles (e.g., cars, trucks, buses).Once this general distinction was mastered with this auxiliary task, the model was further refined (trained and tested) to identify more specific types, such as different models of cars.This progression from general to specific leverages foundational knowledge acquired from the initial, broader task and enhances the model's capability to make finer distinctions.
To ensure that an auxiliary task can effectively establish a basic classification capability that was batch effect aware in the model, two key criteria are needed: 1) The auxiliary task itself should be "simple" enough to distinguish two biologically different samples such that the model could recognized more easily the technical batch effect.2) The task should allow the model to recognize the heterogeneous nature of the clinical samples.This could prevent the model from being overfit and this was beneficial for more challenging downstream tasks.
Based on these rationales, classification of blood-versus-tumor-sample as the auxiliary task from which the batch effect can be learned and thus corrected was chosed.On one hand, both samples from the same patient were biologically distinct enough.On the other hand, both samples were heterogeneous enough (as mentioned above) to help the model to learn the genuine biological heterogeneity of the clinical sample.It should be noted that instead of seeking for the cell-type-specific ground-truth labels within tumor (as well as normal tissue) biopsies and blood samples, which was generally not easily defined in the clinical samples given the intrinsic heterogeneity of biopsy, it was sought for weak labels of tumor and blood samples, i.e., the label "tumor" is broadly referred to all the cells from tumor biopsies and the label "blood" was referred to the cells derived from blood samples.31] In this study, the pre-trained CytoMAD (with the auxiliary task) was further trained and tested with downstream tasks (i.e., classification of tumor versus normal tissue "samples"; and EMT-related analysis) (Figure 4a).Note that the downstream tasks will still require further training of the model that was pre-trained with the auxiliary task.To further assess the pre-trained model's applicability to new cases, it was included 3 new, unseen patients (denoted as patients 5, 6, and 7) with >100 000 cells in the subsequent analysis.These patients were not part of the initial training datasets and were included to validate the model's ability to generalize and perform robust biological analysis in real-world scenarios.
In the analysis of EpCAM-Vim-, EpCAM+ and EpCAM+Vim+ cells (Figure S17, Supporting Information), it was noted that, due to the insignificant population of cells (≈1%) exclusively expressing Vim (i.e., Vim+ only), which could introduce bias in model training and classification outcomes, The analysis on three well-represented cell populations (Figure S17, Supporting Information) was focused.These populations include EpCAM-Vim-cells (>85%), EpCAM+ cells (≈10%) and EpCAM+Vim+ cells (≈4%).To facilitate a fair classification analysis, a downsampling strategy for the predominant EpCAM-Vim-population in the training phase was implemented.

Figure 1 .
Figure 1.Main framework of CytoMAD.(a) Inputs and outputs of CytoMAD.CytoMAD takes in brightfield (BF) cell images from multiple batches as model input.It enables robust image contrast conversion (from BF to QPI) and distills underlying biologically relevant morphological profiles (i.e., CytoMAD profile) from batch distortions.It thus allows robust "batch-aware" downstream morphological profiling and analysis.b) General workflow of CytoMAD.Without CytoMAD, the batch effect confounds the downstream analysis (e.g.classification of cell type A and B from three datasets/batches).CytoMAD corrects the batch effect and thus allows integrated analysis (e.g., classification, profiling and interpretation of the CytoMAD profiles).c) Deeplearning architecture of CytoMAD.The model comprises of two main parts, the GAN-based morphological profiler and the morphology distillator.The GAN-based backbone takes in BF and converts it into QPI output.The discriminator classifies the CytoMAD output from the ground truth and serves as a feedback mechanism to achieve accurate image contrast conversion.The morphology distillator introduces the self-supervised batch-aware characteristic to CytoMAD through a set of classification networks.The classification networks comprise of batch classifiers and cell type/state classifiers, altogether dynamically enabling disentanglement of batch variations from biological information at both the bottleneck morphological features (CytoMAD profile) and output images.

Figure 2 .
Figure 2. Batch Distillation Performance with CytoMAD on Lung Cancer Cell Lines Dataset.a) Lung cancer cell lines images from multi-ATOM and CytoMAD results.The multi-ATOM label-free images of BF and QPI, and the CytoMAD-images (i.e., CytoMAD QPI) of the 7 types of lung cancer cell lines were reported.Color bar shows the phase values.Scale-bar: 10 μm.FOV: 45 μm × 45 μm.b) Violin plots on SSIM and RMSE distributions.The SSIM and RMSE distributions of no-CytoMAD-images are indicated in orange, while CytoMAD-images are indicated in green.The white circles represent the average values of the distributions.c) Confusion matrix on lung cancer cell types classification with CytoMAD-profile.Left: Confusion matrix of no-CytoMAD-profile.Right: Confusion matrix with CytoMAD-profile.d) UMAP across major lung cancer types.UMAP were computed with 63 000 cells data (i.e., 7000 cells per cell type per batch).Every data point represents an individual cell.Top-row: Different colors are used to indicate different cancer types.Bottom-row: Different colors are used to indicate different batches.Left: UMAP of no-CytoMAD-profile.Right: UMAP with CytoMAD-profile.e) UMAP across lung adenocarcinoma cell types.UMAP were computed with 63,000 cells data (i.e., 7000 cells per cell type per batch).Every data point represents an individual cell.Top-row: Different colors are used to indicate different adenocarcinoma cell types.Bottom-row: Different colors are used to indicate different batches.Left: UMAP of no-CytoMAD-profile.Right: UMAP with CytoMAD-profile.f) Selected absolute correlation profiles of between biophysical phenotypes and selected CytoMAD-profile group (see Figure S2b, Supporting Information for complete profile).The absolute correlation values are reported, with green color denoting a high correlation and black denoting a low correlation.Different colors are used to represent the 3 categories of the biophysical phenotypes.g) Violin plots across major lung cancer types.h) Violin plots across lung adenocarcinoma cell types.

Figure 3 .
Figure 3. Drug Treatment Analyses with CytoMAD Batch Distillation.a) H2170 drug response images from multi-ATOM and CytoMAD results.The multi-ATOM label-free images of BF and QPI, and the CytoMAD-images (i.e., CytoMAD QPI) of the H2170 drug treatment response were reported.Color bar shows the phase values.Scale-bar: 10 μm.FOV: 45 μm × 45 μm.b) UMAP across DMSO negative control samples and across IC50 samples.UMAP were computed with 60 000 cells data (i.e., 5000 cells per batch without CytoMAD, and 5000 cells per batch with CytoMAD).Every data point represents an individual cell.The orange circle denotes data without CytoMAD (i.e., no CytoMAD), while the green circle denotes data with CytoMAD.Top-row: UMAP across DMSO negative control samples.Bottom-row: UMAP across IC50 samples.Left: Different colors are used to indicate different drug treatment samples.Right: Different colors are used to indicate different batches.c) Confusion matrix on IC50 samples with CytoMAD-profile.Left: Confusion matrix of no-CytoMAD-profile.Right: Confusion matrix with CytoMAD-profile.d) Batch distance correction ratio after CytoMAD.The relative changes in batch distance of selected biophysical phenotypes under each drug treatments after implementing CytoMAD were quantified as the correction ratio (see Figure S6, Supporting Information for complete profile).Biophysical phenotypes with high relative changes are reported, with red color denotes a reduction of batch distance and the blue color denotes an increase in batch distance after CytoMAD.e) Violin plots of biophysical phenotypes across IC50 samples.Different colors are used to indicate different drug treatment samples.The 2 sides of each violins represent 2 different batches and the white circle at the center line each violin denotes the average biophysical phenotype values between the 2 batches.The 3 violins on the left side of each plot indicates the biophysical phenotypic distribution without CytoMAD, while the 3 violins on the right side indicates the biophysical

Figure 4 .
Figure 4. Pilot Study on NSCLC with CytoMAD.a) NSCLC experimental pipeline with CytoMAD.With the collected samples (i.e., resected lung tumor tissue, normal lung tissue and peripheral blood samples) from a total of 7 NSCLC patients, they were imaged with multi-ATOM (BF and QPI images).An auxiliary task was first conducted to pre-train the CytoMAD (upper panel).In this task, the tumor and blood data from 4 patients were employed to train CytoMAD and to guide the disentanglement of valuable biological information from batch distortion during morphological distillation.Normal lung tissue samples, along with data from 3 additional NSCLC patients were not included in this auxiliary task training, in order to assess CytoMAD's performance and test its generalizability across new patient populations.These samples were utilized for our downstream task (Lower panel), focusing on the population analysis involving both normal and tumor samples and EMP-cells analysis.b) Images of NSCLC patient samples from multi-ATOM.The multi-ATOM label-free images of BF and QPI were reported.Color bar shows the phase values.Scale-bar: 10 μm.FOV: 45 μm × 45 μm.c) Auxiliary tasks of classifying tumor biopsy versus blood samples.The UMAP on NSCLC blood and tumor samples were reported.Different colors are used to indicate different batches/patients.Top-left: UMAP on patients' blood samples with no-CytoMAD-profile.Top-right: UMAP on patients' blood samples with CytoMAD-profile.Bottom-left: UMAP on patients' tumor samples with no-CytoMAD-profile.Bottom-right: UMAP on patients' tumor samples with CytoMAD-profile.d) UMAP on tumor and normal lung tissue samples.Different colors are used to indicate different batches.e) UMAP on tumor and normal lung tissue samples of each patient.Different colors are used to indicate different sample types.The distributions of each class are projected on the sides for visualization the highly overlapping populations.f) AUC of the ROC analysis of the patient-level classification between tumor and normal tissues by bootstrapping (number of bootstraps: 1000).The AUC of the classifier is significantly improved from the case without CytoMAD from 0.65 to 0.82 with CytoMAD.g) Molecular marker analysis of tumor and normal lung tissue samples.The UMAP, ROC and predicted population ratio of label-free molecular markers classification are reported.The true population ratio between EpCAM+ and both+ cells is presented as a reference and different colors are used to denote the different molecular markers.h) Subpopulation analysis in EMP cells.The UMAP of EMP cells (i.e., EpCAM+ & Vim+) are colored based on cell density.The BF and QPI of the subpopulations are reported.Scale-bar: 10 μm.FOV: 45 μm × 45 μm.The z-score of the subpopulations are reported, with green color denoting a positive z-score and brown denoting a negative z-score.Different colors are used to represent the 3 categories of the biophysical phenotypes.
Figure 4. Pilot Study on NSCLC with CytoMAD.a) NSCLC experimental pipeline with CytoMAD.With the collected samples (i.e., resected lung tumor tissue, normal lung tissue and peripheral blood samples) from a total of 7 NSCLC patients, they were imaged with multi-ATOM (BF and QPI images).An auxiliary task was first conducted to pre-train the CytoMAD (upper panel).In this task, the tumor and blood data from 4 patients were employed to train CytoMAD and to guide the disentanglement of valuable biological information from batch distortion during morphological distillation.Normal lung tissue samples, along with data from 3 additional NSCLC patients were not included in this auxiliary task training, in order to assess CytoMAD's performance and test its generalizability across new patient populations.These samples were utilized for our downstream task (Lower panel), focusing on the population analysis involving both normal and tumor samples and EMP-cells analysis.b) Images of NSCLC patient samples from multi-ATOM.The multi-ATOM label-free images of BF and QPI were reported.Color bar shows the phase values.Scale-bar: 10 μm.FOV: 45 μm × 45 μm.c) Auxiliary tasks of classifying tumor biopsy versus blood samples.The UMAP on NSCLC blood and tumor samples were reported.Different colors are used to indicate different batches/patients.Top-left: UMAP on patients' blood samples with no-CytoMAD-profile.Top-right: UMAP on patients' blood samples with CytoMAD-profile.Bottom-left: UMAP on patients' tumor samples with no-CytoMAD-profile.Bottom-right: UMAP on patients' tumor samples with CytoMAD-profile.d) UMAP on tumor and normal lung tissue samples.Different colors are used to indicate different batches.e) UMAP on tumor and normal lung tissue samples of each patient.Different colors are used to indicate different sample types.The distributions of each class are projected on the sides for visualization the highly overlapping populations.f) AUC of the ROC analysis of the patient-level classification between tumor and normal tissues by bootstrapping (number of bootstraps: 1000).The AUC of the classifier is significantly improved from the case without CytoMAD from 0.65 to 0.82 with CytoMAD.g) Molecular marker analysis of tumor and normal lung tissue samples.The UMAP, ROC and predicted population ratio of label-free molecular markers classification are reported.The true population ratio between EpCAM+ and both+ cells is presented as a reference and different colors are used to denote the different molecular markers.h) Subpopulation analysis in EMP cells.The UMAP of EMP cells (i.e., EpCAM+ & Vim+) are colored based on cell density.The BF and QPI of the subpopulations are reported.Scale-bar: 10 μm.FOV: 45 μm × 45 μm.The z-score of the subpopulations are reported, with green color denoting a positive z-score and brown denoting a negative z-score.Different colors are used to represent the 3 categories of the biophysical phenotypes.
, Supporting Information).Across-Batch Cell Type/State Classification: The capability of biological information preservation was evaluated through cell type/state classification using both the CytoMAD profiles and CytoMAD images.Deep neural networks were used for the classification based on CytoMAD profiles.The model consists of 3 dense layers with 75, 50 and 25 nodes, interconnected with rectified linear unit (ReLU) activation function.As for CytoMAD image classification, the convolution neural network was employed, which was a 5 layers model for image-based classification, with each of the layers composed of 2D convolution, batch normalization, leaky ReLU activation functions and max pooling operation.Both the deep neural network models and the convolution neural network models were trained for 100 epochs, utilizing softmax function as the output activation function and categorical cross-entropy loss as the loss function.