Integrative analysis of multi‐omics data reveals the heterogeneity and signatures of immune therapy for small cell lung cancer


Small cell lung cancer (SCLC) is highly invasive and lethal. Genomic studies are beginning to characterize its genetic determinants and heterogeneity. Here we performed RNA-seq and whole exome sequencing (WES) on 19 Chinese SCLC clinical tumor specimens. Integrating these data with two other public cohorts (n = 129 for RNA-seq, n = 171 for WES), we carried out gene co-expression network analysis and classified the tumors into four subgroups: ASCL1-high, NEUROD1-high, a novel variant subtype with high CCSP/CC10 expression, and a fourth subtype with high expression of NOTCH family and inflammatory genes. We further found that this fourth subtype was characterized by overexpression of immune-related pathways and immunosuppressive factors, and we refer to it as the immune subtype. In addition to transcriptomic signatures, specific genomic alterations were significantly enriched in each subtype. We built a machine learning model to predict the subtypes of new specimens from transcriptomic data. In particular, we found that POU2F3 protein was effective in distinguishing patients likely to benefit from immunotherapy, i.e. those of the immune subtype, which was further validated in an independent dataset of chemo-resistant patients who received second-line immunotherapy. Collectively, our work systematically studied the transcriptomic and genomic heterogeneity of SCLC and characterized an immune subtype that displayed features of sensitivity to immune checkpoint-based therapies.

certain SCLC might originate from club cells. Hereafter, we named cluster 4 SCLC-C. Cluster 1 showed low ASCL1 and NEUROD1 expression and high POU2F3 and NOTCH2 expression (Figure 1C,D). Gene set enrichment analysis (GSEA) demonstrated that the interferon-gamma response, interferon-alpha response, inflammatory response and IL-6/JAK/STAT3 signaling were enriched in this cluster (Figure 1F), which confirmed its immune-related character. We therefore named cluster 1 the 'immune subtype' (SCLC-I).
Further analysis showed enrichment of immune-related pathways, such as IL-10 signaling and other interleukin-related pathways, in SCLC-I versus the other subtypes (Figure 2A, Figure S4B). Multiple immunoinhibitory factors, such as PD1, IL-10, IDO1, CD96 and BTLA, were highly expressed in SCLC-I (Figure 2B,E, Figure S4). Immune cell infiltration is considered a primary immune signature and is strongly associated with the clinical outcome of cancer immunotherapies. 9 Using ImmuCellAI, 10 we found that the abundances of dendritic cells, macrophages, induced regulatory T cells (iTreg) and CD8+ T cells were significantly increased in SCLC-I (Figure 2C). Moreover, we observed that most chemokines, such as CXCL10, CCL17 and CCL18, were up-regulated in the SCLC-I subtype (Figure 2D). Taken together, the SCLC-I subtype is characterized by activation of immune checkpoint molecules and infiltration of immunosuppressive cells such as iTreg, which might facilitate immune escape.
To explore the unique genetic alterations of each subtype, we analyzed 115 samples for which both genomic sequencing and RNA-sequencing data were available. We found that each subtype harbored specific gene mutations; for example, LRP1, CD163, MME and ABCB1 mutations were frequently observed in SCLC-I (Figure 3A). In comparison with other samples, the SCLC-I group showed [...] (Figure 3B,C), which indicated increased genomic instability. SCLC-I had more gene amplifications on Chr2, Chr6, Chr11, Chr12 and Chr19, whereas SCLC-A had more gene amplifications on Chr17, SCLC-N had more amplified genes on Chr9 and Chr21, and SCLC-C had more amplified genes on Chr14 (Figure 3E, Table S2). Over 3000 genes were significantly amplified with high alteration frequency (>50%) in SCLC-I (Figure 3D, Table S2). Further analysis showed that the cholesterol biosynthesis I pathway was significantly enriched in the SCLC-I subtype (Figure 3F). Consistent with the GSEA result based on gene expression profiles (Figure 1F), WNT/beta-catenin signaling was enriched in both the SCLC-I and SCLC-N subtypes. Collectively, these data show that the SCLC-I subtype has a higher level of genomic instability and that each subtype harbors unique gene mutations and copy number variations.
FIGURE 1 Subtyping 129 human small cell lung cancer (SCLC) using weighted gene co-expression network. (A) Heatmap of eigenvectors of 17 network modules from weighted correlation network analysis (WGCNA). We combined three datasets after removing the batch effect with ComBat from the 'sva' package in R. We divided the network into modules according to the correlations between the genes. The minimum module size was set to 10. We obtained 1016 genes for calculating the adjacency matrix (power = 3). The co-expression network was finally partitioned into 17 modules. All samples were divided into four clusters according to the module eigengenes by hierarchical clustering analysis (Euclidean distance, ward. [...]
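As a minimal illustration of the adjacency step mentioned in the Figure 1 legend (power = 3), the following Python sketch raises absolute pairwise Pearson correlations to the soft-threshold power. The unsigned form |cor|^power, the toy gene names and the helper functions are assumptions made for illustration only; the source does not state whether a signed or unsigned network was used, and the reported analysis was performed with WGCNA on 1016 genes.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, power=3):
    """Unsigned adjacency a_ij = |cor(i, j)| ** power (assumed form).

    expr maps a gene name to its expression values across samples.
    """
    genes = sorted(expr)
    adjacency = {}
    for g in genes:
        for h in genes:
            if g == h:
                adjacency[(g, h)] = 1.0
            else:
                adjacency[(g, h)] = abs(pearson(expr[g], expr[h])) ** power
    return adjacency


# Hypothetical toy expression matrix: four genes measured in four samples.
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with GENE_A
    "GENE_C": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with GENE_A
    "GENE_D": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with GENE_A
}
adjacency = soft_threshold_adjacency(toy_expr, power=3)
print(round(adjacency[("GENE_A", "GENE_D")], 3))
```

Raising the correlation to a power greater than 1 suppresses weak correlations relative to strong ones, so a correlation of 0.8 contributes far less connectivity than a correlation near 1.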
To determine whether our findings have clinical relevance, we then sought a biomarker for SCLC-I. Based on feature selection with a random forest model, we found that 10 genes (POU2F3, ANXA1, LRMP, GFI1B, SLC7A14, PHYHIPL, MAP2, SYP, KCNK3 and CPE) were the most important in distinguishing SCLC-I from the other subtypes (Figure 4A, SM 6 and Figure S6A). Using these genes to build random forest models evaluated on 100 different sets of testing data, we obtained an average prediction accuracy of 92.74% and an average area under the curve (AUC) of 93.32% (Figure 4B, Table S3). Among the 10 genes used for SCLC-I prediction, POU2F3 stood out as the most significantly up-regulated gene with high expression (Figures 4C and 1C and Table S5). Moreover, the SCLC-P samples identified in the previous study 6 were all included in SCLC-I, with up-regulated immune-related pathways (Figure S6C).
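To make the averaged metrics concrete, the sketch below computes accuracy and AUC for each test split and then averages them across splits. It is an illustrative Python example, not the code used in this study: the function names, the 0.5 decision threshold and the toy labels and scores are all assumptions, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == l for p, l in zip(predicted, labels)) / len(labels)


def average_metrics(runs, threshold=0.5):
    """Average accuracy and AUC over a list of (labels, scores) test splits."""
    accs = [accuracy(labels, scores, threshold) for labels, scores in runs]
    aucs = [roc_auc(labels, scores) for labels, scores in runs]
    return sum(accs) / len(accs), sum(aucs) / len(aucs)


# Hypothetical predictions from two random test splits (1 = SCLC-I, 0 = other).
toy_runs = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
    ([1, 0, 0, 1], [0.8, 0.2, 0.3, 0.7]),
]
mean_accuracy, mean_auc = average_metrics(toy_runs)
print(mean_accuracy, mean_auc)
```

Reporting the mean over many random splits, rather than a single split, reduces the dependence of the estimate on one particular partition of a small cohort.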
We further collected a cohort of 28 relapsed SCLC samples from patients receiving immunotherapy or chemo-immunotherapy (Table S4, SM 7) and performed immunohistochemical staining of POU2F3 in these specimens (Figure 4D, Table S4). Our data showed that patients with high POU2F3 expression exhibited a significantly improved objective response rate (ORR) to immunotherapy, with a high AUC of 0.813 (Figure 4E). Moreover, the POU2F3 protein level was positively correlated with patient prognosis (Figure 4F). Importantly, two patients with high POU2F3 levels showed dramatic regression of lung tumours (Figure 4G,H). The positive response to immunotherapy suggested potentially strong immune cell infiltration in POU2F3-high SCLC. Together, these results support the view that SCLC-I patients are more sensitive to immunotherapy and that POU2F3 might serve as a biomarker for SCLC immunotherapy.
FIGURE 4 Patients with high POU2F3 levels respond well to second-line immunotherapy. (A) Importance of the 10 features from 100-times random sampling trainings. We built a random forest classifier based on the transcriptomic data to predict the small cell lung cancer (SCLC)-I subtype. We first performed 100 random samplings to divide the data into training and testing sets. We chose the 1000 genes with the highest coefficient of variation in the training data to build each model. Then, we obtained the gene list ranked by Gini index. We counted the [...]
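The gene pre-selection described in the Figure 4A legend (repeated random splits, followed by choosing the genes with the highest coefficient of variation in each training set) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the training fraction of 0.7, the random seed, the toy expression values and all function names are hypothetical, and the random forest training and Gini ranking that follow in the actual workflow are omitted.

```python
import random
import statistics
from collections import Counter


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean (0 if mean is 0)."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return statistics.stdev(values) / abs(mean)


def top_variable_genes(expr, train_idx, k):
    """Return the k genes with the highest CV among the training samples."""
    cv = {
        gene: coefficient_of_variation([values[i] for i in train_idx])
        for gene, values in expr.items()
    }
    return sorted(cv, key=cv.get, reverse=True)[:k]


def selection_counts(expr, n_samples, n_rounds=100, train_fraction=0.7,
                     k=1000, seed=0):
    """Count how often each gene is selected across repeated random splits.

    train_fraction and seed are assumptions; the source does not report them.
    """
    rng = random.Random(seed)
    n_train = int(n_samples * train_fraction)
    counts = Counter()
    for _ in range(n_rounds):
        train_idx = rng.sample(range(n_samples), n_train)
        counts.update(top_variable_genes(expr, train_idx, k))
    return counts


# Hypothetical toy data: three genes measured in six samples.
toy_expr = {
    "GENE_FLAT": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],   # no variation
    "GENE_HIGH": [1.0, 10.0, 1.0, 10.0, 1.0, 10.0],
    "GENE_MID": [4.0, 6.0, 4.0, 6.0, 4.0, 6.0],
}
counts = selection_counts(toy_expr, n_samples=6, n_rounds=100, k=2)
print(counts.most_common())
```

Computing the coefficient of variation on the training samples only, as the legend describes, keeps the held-out test samples from influencing which genes enter each model.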
In conclusion, our work systematically uncovers the transcriptomic and genomic heterogeneity in SCLC and characterizes a novel immune subtype with high sensitivity to immunotherapy. We identified POU2F3 as a potential biomarker with good predictive power for assessing SCLC immunotherapy response. Gay et al. identified an inflamed subtype of SCLC that shows a significant overall survival (OS) benefit relative to all other subtypes with combined chemotherapy and immunotherapy. 7 In our study, the immune subtype seems to correspond to a combination of SCLC-P and SCLC-I from the Gay et al. cohorts. Although we used a different method and different biomarkers to identify this special subtype of SCLC, both our study and that of Gay et al. demonstrate the potential of SCLC re-clustering for current clinical immunotherapy. The small size of the validation cohort is a limiting factor. Future clinical efforts and larger cohorts are required to validate the effectiveness of immunotherapy in this subtype.