Cross-platform comparison of immune-related gene expression to assess intratumor immune responses following cancer immunotherapy

Neoadjuvant immunotherapy can induce immune responses within the tumor microenvironment. Gene expression can be used to assess responses with limited amounts of conventionally-fixed patient-derived samples. We aim to assess the cross-platform concordance of immune-related gene expression data. We performed comparisons across three panels in two platforms: Nanostring nCounter® PanCancer Immune Profiling Panel (nS), HTG EdgeSeq Oncology Biomarker Panel (HTG OBP) and Precision Immuno-Oncology Panel (HTG PIP). All tissue samples of 14 neoadjuvant GMCSF treated, 14 neoadjuvant Provenge treated, and 12 untreated prostate cancer patients were radical prostatectomy (RP) tissues, while 6 prostatitis patients and 6 non-prostatitis subjects were biopsies. For all 52 patients, more than 90% of the common genes were significantly correlated (p<0.05) and more than 76% of the common genes were highly correlated (r>0.5) between any two panels. Co-inertia analysis also demonstrated high overall dataset structure similarity (correlation>0.84). Although both dimensionality reduction visualization analysis and unsupervised hierarchical cluster analysis for highly correlated common genes (r>0.9) suggested a high-level of consistency across the panels, there were subsets of genes that were differentially expressed across the panels. In addition, while the effect size of the differential testing for neoadjuvant treated vs. untreated localized prostate cancer patients across the panels were significantly correlated, some genes were only differentially expressed in the HTG panels. Finally, the HTG PIP panel had the best classification performance among the 3 panels. These differences detected may be a result of the different panels or platforms due to their technical setting and focus. Thus, researchers should be aware of those potential differences when deciding which platform and panel to use. Introduction Gene expression profiling (GEP) is a powerful tool with a wide range of purposes, including identifying relevant immune resistance mechanisms in the tumor microenvironment [1] and developing predictive biomarkers for active immunotherapy [2,3]. There is a current need to perform Jo ur na l P re -p ro of Journal Pre-proof highly multiplexed molecular testing to assess RNA expression from a variety of clinically relevant sample types, including: formalin-fixed, paraffin embedded (FFPE) tissue, biofluids, cell lysates, and extracted RNA. One limitation of gene expression profiling for molecular researchers is the difficulty of nucleic acid extraction from small and sometimes highly degraded samples and the introduction of bias by cDNA synthesis. In the case of biofluids, large amounts of this sample type are required to produce enough isolated RNA to test by traditional methods. Two platforms commonly used for targeted GEP analysis are the HTG Molecular Diagnostics (HTG) EdgeSeq Platform and the nanoString nCounter Platform. HTG has developed the HTG EdgeSeq assays, a coupling of RNA extraction-free, nuclease protection assay with next-generation sequencing (NGS) quantification of up to 2600 genes with a partially automated workflow. NanoString has developed the nCounter gene expression assays, which utilizes fluorescent-based opticalbarcode targeted against genes of interest for a non-amplified measurement of gene expression. The nanoString technology is able to detect gene expression of up to 800 genes with an automated workflow. While both assays utilize short DNA based probes to hybridize to targets, HTG is able to boost both the number of probes available in each panel as well as to utilize the high sensitivity and wide dynamic range available with NGS detection. NanoString on the other hand is able to avoid amplification bias and shorten hands on workflow time by using fluorescent-based barcodes. Here, we compare the immune-related gene expression data across three panels: the HTG EdgeSeq Oncology Biomarker Panel (HTG OBP) and Precision Immuno-Oncology Panel (HTG PIP), and the nanoString nCounter® PanCancer Immune Profiling Panel (nS Immune) to assess the cross-platform concordance. Both HTG OBP and HTG PIP are research use only (RUO) gene expression profiling assays that measure expression of mRNAs involved in but not limited to oncology, immune pathway mapping and immune landscape assessment. HTG OBP detects 2,560 transcripts associated with Jo ur na l P re ro of Journal Pre-proof tumor biology, highlighting 24 groups and pathways. The probe set for HTG PIP comprises 1,392 nuclease protection probes, including 4 negative control probes (ANT), 4 positive control probes (POS), and 10 process control probes. Gene groupings include current drug targets, lymphocyte lineage markers, B, T and NK cell activity markers, DNA repair genes, Toll-like receptors, interleukin and chemokine markers, and proteasome-associated genes. Both HTG panels are compatible with clinically relevant sample types such as FFPE, isolated cells, PAXgene and extracted RNA. The nS Immune panel is also an RUO multiplexed gene expression panel designed to quantify 770 genes. It is also fully compatible with clinically relevant sample types such as fresh-frozen tissue, FFPE tumor sections, isolated immune cell populations such as PBMCs, and cell lysates. NanoString has included 109 genes that are associated with 24 immune cell types and populations. The nS Immune panel also includes a unique cell profiling feature that uses gene expression data from 46 human genes to quantify the relative abundance of 14 immune cell types: B and T cells, Th1, T-regs, CD45 and CD8+ T cells, exhausted CD8+ T cells, cytotoxic and dendritic cells, macrophages, mast cells, neutrophils, natural killer (NK) cells, and NK CD56dim cells. Materials and Methods Patients Neoadjuvant GM-CSF (NeoGM) was a single-center phase 1 study of neoadjuvant GM-CSF prior to planned radical prostatectomy (RP) in patients with localized prostate cancer (NCT00305669) [4]. All patients underwent RP within 5 days after the last dose of GM-CSF. Tissue samples of neoadjuvant Provenge (NeoProv) were derived from a single-arm, multicenter phase II trial of neoadjuvant sipuleucel-T treatment in men with localized prostate cancer administered prior to planned treatment (NCT00715104) [5,6]. Subjects received sipuleucel-T at the standard two-week intervals for three planned doses prior to RP. Other patients were a group of patients who were at risk of harboring prostate cancer, based on digital rectal examination and/or serum prostatic-specific antigen (PSA), Jo ur na l P r ro of Journal Pre-proof assessed at the University of California San Francisco (UCSF) Medical Center, or UCSF Comprehensive Cancer Center. All the tissue samples of 14 NeoGM treated, 14 NeoProv treated, and 12 untreated prostate cancer patients (Untreated) were RP tissues, while the tissues of 6 prostatitis patients and 6 non-prostatitis subjects were biopsies. All subjects gave written informed consent to participate in the protocol approved by the Institutional Review Boards (IRBs) of each participating institution. Gene Expression RNA was extracted from FFPE RP sections using PureLinkTM FFPE Total RNA Isolation Kit (Invitrogen) and was evaluated for gene expression using the nCounter® PanCancer Immune Profiling Panel (NanoString Technologies, Inc.); and using the HTG EdgeSeq Oncology Biomarker Panel (HTG OBP) and Precision Immuno-Oncology Panel (HTG PIP) (HTG Molecular Diagnostics, Inc.). Data Standardization and Normalization For simple sample correlations (e.g., replicate comparisons) and quality control metrics, we use a simple per sample standardization called the log2(CPM): log2( rgi+0.5 Ri+1 × 10)

highly multiplexed molecular testing to assess RNA expression from a variety of clinically relevant sample types, including: formalin-fixed, paraffin embedded (FFPE) tissue, biofluids, cell lysates, and extracted RNA. One limitation of gene expression profiling for molecular researchers is the difficulty of nucleic acid extraction from small and sometimes highly degraded samples and the introduction of bias by cDNA synthesis. In the case of biofluids, large amounts of this sample type are required to produce enough isolated RNA to test by traditional methods.
Two platforms commonly used for targeted GEP analysis are the HTG Molecular Diagnostics (HTG) EdgeSeq Platform and the nanoString nCounter Platform. HTG has developed the HTG EdgeSeq assays, a coupling of RNA extraction-free, nuclease protection assay with next-generation sequencing (NGS) quantification of up to 2600 genes with a partially automated workflow. NanoString has developed the nCounter gene expression assays, which utilizes fluorescent-based opticalbarcode targeted against genes of interest for a non-amplified measurement of gene expression. The nanoString technology is able to detect gene expression of up to 800 genes with an automated workflow. While both assays utilize short DNA based probes to hybridize to targets, HTG is able to boost both the number of probes available in each panel as well as to utilize the high sensitivity and wide dynamic range available with NGS detection. NanoString on the other hand is able to avoid amplification bias and shorten hands on workflow time by using fluorescent-based barcodes.
Here, we compare the immune-related gene expression data across three panels: the HTG EdgeSeq assessed at the University of California San Francisco (UCSF) Medical Center, or UCSF Comprehensive Cancer Center. All the tissue samples of 14 NeoGM treated, 14 NeoProv treated, and 12 untreated prostate cancer patients (Untreated) were RP tissues, while the tissues of 6 prostatitis patients and 6 non-prostatitis subjects were biopsies. All subjects gave written informed consent to participate in the protocol approved by the Institutional Review Boards (IRBs) of each participating institution.

Gene Expression
RNA was extracted from FFPE RP sections using PureLinkTM FFPE Total RNA Isolation Kit (Invitrogen) and was evaluated for gene expression using the nCounter® PanCancer Immune Profiling Panel (NanoString Technologies, Inc.); and using the HTG EdgeSeq Oncology Biomarker Panel (HTG OBP) and Precision Immuno-Oncology Panel (HTG PIP) (HTG Molecular Diagnostics, Inc.).

Data Standardization and Normalization
For simple sample correlations (e.g., replicate comparisons) and quality control metrics, we use a simple per sample standardization called the log 2 (CPM): Where, is the number of sequence reads for probe ( ) and sample (i).
Normalization is a construct that is most influential in gene expression differential testing. Here we considered the median normalization suggested originally by Anders and Huber [7]. A scaling factor for each sample is obtained for each gene (over genes) and samples, , which is calculated as the median gene level expression value for each sample-gene count adjusted by the geometric mean over all genes. Any genes without expression over all samples are necessarily excluded from this scaling calculation. The formula for the scaling factor for the th sample is: where, is the raw count for the th sample and th gene.
The scaling factor is then used to modify the original gene count to obtain the normalized count value, The resulting data, , will be used for differential testing analysis.

Statistical Analysis
For each of the common genes, the pairwise relationships between each two panels were assessed by applying Pearson correlation coefficient (r) on both raw and CPM data. The RV coefficient was used to measure the overall relationship across all common genes for each two panels. The disadvantage of only examining common genes present across the platforms is that data from biologically significant genes may be lost if a gene is not represented on the platforms examined. Coinertia analysis (CIA) is a multivariate method that identifies co-relationships in multiple datasets J o u r n a l P r e -p r o o f Journal Pre-proof which contain the same samples [8]. CIA is not limited to the analysis of datasets containing the same number of genes, thus uses all the genes in each platform. CIA simultaneously finds ordinations (dimension reduction diagrams) from the datasets that are most similar by finding successive axes from the two datasets with maximum covariance. UMAP [9], a dimensionality reduction technique, was applied to visualize the comparisons across the platforms. Unsupervised hierarchical cluster analysis with Euclidean distance was utilized on the data from each platform.
We performed differential testing between treated patients (NeoProv or NeoGM) vs. untreated patients within each panel by Limma [10]. We compared the effect size, log 2 of fold change (FC) across the panels by Pearson correlation coefficient, where FC was calculated as the ratio of the normalized expression. In addition, we assessed the classification performance across all 3 panels, where five-fold crossed validation was used. Data were randomly split into 5 subsets of similar sizes.
For each run, by using 4 folds of the data (training data), genes were selected by SVM-RFE [11].
Selected genes were classified by SVM [12]. The 5 th fold of the data (validation data) was used to calculate the classification accuracy and area under the curve (AUC) of receiver operating characteristic (ROC) curve.
We predicted the cell fractions for each sample within each panel by applying the cell composition deconvolution method [13], which uses advanced statistical methods and reference samples from NCBI GEO to build the gene expression signature matrix. We assessed the correlation of the predicted cell fractions between the panels by Pearson correlation coefficient and then compared the predicted cell fractions between treated patients (NeoProv or NeoGM) and untreated patients within each panel by Wilcoxon rank sum test.
between pairs of ordinations on each axis. This, therefore, presents a good summary of the costructure between the any two panels studied. For all 3 comparisons, the correlation between the first axes (F1) of the two ordinations was 0.89, and the correlation between second axes (F2) of the two ordinations was 0.84-0.87. These high values partly result from the maximization of covariance or the product of the correlation and the squared variances projected onto the co-inertia axes. These results show that the expression data of all genes across all three panels are highly concordant.

Divergences among high correlated genes
In addition to direct comparison of gene expression data, we used UMAP to compare the global structure of the expression data of all genes across 3 panels for the 52 patients simultaneously. In We also assessed the gene expression data within a clinical context. First, we performed differential testing between the neoadjuvant treated patients (NeoProv or NeoGM) and untreated patients within each panel by Limma [10]. The volcano plots (Figures 2A-C) present -log 10 (pvalue) vs. log 2 (FC) based on the data from each panel, where FC (fold change) was directly calculated by Limma and stood for the difference of expression of the common genes between two cohorts. Supplementary Table 1 presents the list of the genes identified as significantly differentially expressed between samples from treated and untreated patients based on the 3 panels. We compared the effect size, log 2 (FC) across the panels (Figure 2). Overall, the log 2 (FC) across the panels were significantly correlated (p<0.001). The correlation varied with the highest correlation between HTG PIP and nS Immune (r = 0.6, p < 0.001), while the correlation was 0.4 and 0.45 between HTG PIP and HTG OBP, and between nS Immune and HTG OBP, respectively ( Figure 2D, 2E and 2F). Due to the small study cohort size, each panel could only identify a small amount of significantly differentiated genes (p<0.05). Some genes were only identified as significant in one panel (blue or green dots) while some genes were identified as significant across multiple panels (red dots). For example, 8 genes were significant in all 3 panels, while 16 genes detected as significantly expressed in both HTG OBP and PIP but not in nS Immune ( Figure 2G). Only one gene (PLK3) and 5 genes (MR1, S100A8, THBD, TNFAIP3, TREM1) was overlapped between nS Immune and HTG OBP or PIP respectively. 28 genes were detected by the nS Immune panel itself, 82 gene by HTG PIP only and 232 genes by HTG OBP only. Although the number of genes within each panel differ, the percentage of detection of significant genes were similar between the two HTG panels (PIP = 8% and OBP = 10 %), while nS Immune was slightly lower at 5%. In general, the effect size, log 2 (FC) across the panels were significantly but moderately correlated. There were common significant genes detected by different platforms but some that were only detected by the HTG platform.
In addition, we evaluated the classification performance across the 3 panels by classification J o u r n a l P r e -p r o o f Journal Pre-proof accuracy and AUC. Due to the small sample size, we utilized a 5-fold crossed validation and selected the top 30 genes within each run for each panel. Table 3 shows that for almost each run, HTG PIP had the highest accuracy and AUC, while nS Immune always had the lowest classification accuracy (Table 3). The average taken over 5 runs showed that the HTG PIP panel performed best. The heatmap is a plot of the selection frequency within each panel, where only the genes that were selected at least 3 times among the 5 runs across 3 panels were retained ( Figure 2H).  Table 2). Interestingly, the predicted proportions of Dendritic cells (DCs) (7% vs. 5%, p=0.032), Treg (4% vs. 2%, p=0.002) and memory CD8 T cell (33% vs. 36%, p=0.018) from HTG OBP were found significantly different between treated and untreated patients (Supplementary Table 3), which were not observed in the other two panels.

Discussion
Clinical and research sample sizes are limited while the demand for additional molecular information