Comparative RNA-Sequencing Analysis Benefits a Pediatric Patient With Relapsed Cancer

Clinical detection of sequence and structural variants in known cancer genes points to viable treatment options for a minority of children with cancer.1 To increase the number of children who benefit from genomic profiling, gene expression information must be considered alongside mutations.2,3 Although high expression has been used to nominate drug targets for pediatric cancers,4,5 its utility has not been evaluated in a systematic way.6 We describe a child with a rare sarcoma that was profiled with whole-genome and RNA sequencing (RNA-Seq) techniques. Although the tumor did not harbor DNA mutations targetable by available therapies, incorporation of gene expression information derived from RNA-Seq analysis led to a therapy that produced a significant clinical response. We use this case to describe a framework for inclusion of gene expression into the clinical genomic evaluation of pediatric tumors.

Clinical detection of sequence and structural variants in known cancer genes points to viable treatment options for a minority of children with cancer. 1 To increase the number of children who benefit from genomic profiling, gene expression information must be considered alongside mutations. 2,3 Although high expression has been used to nominate drug targets for pediatric cancers, 4,5 its utility has not been evaluated in a systematic way. 6 We describe a child with a rare sarcoma that was profiled with whole-genome and RNA sequencing (RNA-Seq) techniques. Although the tumor did not harbor DNA mutations targetable by available therapies, incorporation of gene expression information derived from RNA-Seq analysis led to a therapy that produced a significant clinical response. We use this case to describe a framework for inclusion of gene expression into the clinical genomic evaluation of pediatric tumors.

CASE SUMMARY
Patient 1 was diagnosed at 8 years of age with a left tentorial-based CNS sarcoma after a 2-week history of nausea, lethargy, and diplopia. Clinical workup confirmed that the tumor was primary to the brain (Figs 1A and 1B). Histology revealed a mitotically active, epithelioid-to-spindled cell tumor in patternless sheets, interrupted by thick fibrous bands and foci of necrosis (Figs 1C to 1D). Immunohistochemistry revealed diffuse positivity for vimentin, desmin, neuron-specific enolase, epithelial membrane antigen, and CD99 (Figs 1E to 1H). Focal immunohistochemical positivity was observed for pan-cytokeratin (AE1/AE3) and synaptophysin. The tumor was negative for glial fibrillary acidic protein (GFAP), Wilms tumor 1 (WT1), myo-D1, myogenin, smooth muscle actin, nonphosphorylated and phosphorylated neurofilament protein, CD34, CD31, HMB-45, S-100, leukocyte common antigen, and BAF47/INI-1 (retained nuclear positivity). The Ki67 proliferative index was 9%. A diagnosis of desmoplastic small round cell tumor (DSRCT) was favored initially. 7 Because EWSR1 breakapart fluorescence in situ hybridization confirmed an EWSR rearrangement but concomitant WT1 breakapart fluorescence in situ hybridization was negative, the molecular criterion for DSRCT was not met, and a final diagnosis of poorly differentiated sarcoma, not otherwise specified, was rendered. The patient received six cycles of induction chemotherapy-ifosfamide, carboplatin, and etoposide-followed by autologous stem-cell transplantation with a high-dose preparative regimen of carboplatin, thiotepa, and etoposide as well as 54 Gy of focal radiation to the location of the original tumor. After a 2-year remission, the tumor recurred with numerous pulmonary lesions in all lobes. The histologic characteristics of the metastasis were identical to the primary tumor. The patient enrolled in the Personalized OncoGenomics (POG) 3 study, which offers whole-genome sequencing (WGS) and transcriptome sequencing and analysis to identify drivers and potential therapeutic options of relapsed solid tumors for children and adults in British Columbia.
Biopsy material from a lung metastasis was characterized with WGS and RNA-Seq, and peripheral blood was characterized with WGS. 3 The analysis of the sequencing data revealed an EWSR1-ATF1 gene fusion (Appendix Fig A1, online only); although this finding is most suggestive of a clear cell sarcoma, no immunohistochemical support for this diagnosis could be established. 8,9 The POG team identified three somatic variants of unclear therapeutic significance: PDGFRA p.V299F, PRKCB p.D341N and SVIL p.L1374R. No germline singlenucleotide variants with established cancer relevance were detected. 10 Although no therapy was available to target the EWSR1-ATF1 fusion protein directly, the POG RNA-Seq-derived gene expression analysis identified high expression of downstream genes IL6 and JAK1. The finding of the JAK1 overexpression was corroborated by comparative RNA-Seq analysis at the University of California Santa Cruz.

COMPARATIVE RNA-SEQ ANALYSIS
In accordance with the US Food and Drug Administration guidelines, 6 we focused on relative rather than absolute gene expression levels and sought to develop a framework for the analysis of RNA-Seq data from patients. We compared the RNA-Seq-derived tumor gene expression profile of patient 1 with similarly derived profiles of 10,668 samples that represented 38 pediatric and adult tumor types studied by the The Cancer Genome Atlas (TCGA) and Therapeutically Actionable Research to Generate Effective Treatments (TARGET). 11,12 RNA-Seq reads from different laboratories were reanalyzed with a single computational pipeline to reduce batch effects. 13 We searched for tumors in this homogeneously processed compendium in which expression profiles were similar to those of patient 1 by using TumorMap. 14 The tumor gene expression profile for patient 1 resembled lung cancers (Fig 2A), the site of the metastasis. The metastatic sample contained 76% tumor cells, estimated by a POG pathologist, which suggested that most of the gene expression information came from the tumor cells. Lung adenocarcinoma (LUAD) samples formed four groups in the TumorMap ( Fig 2B) and the sarcoma of patient 1 clustered with the 354 LUAD tumors of the terminal respiratory unit and proximal-inflammatory molecular subtype (Fig 2C), associated with the activation of receptor tyrosine kinases (RTK). 15 To define the transcriptional programs that drove placement of the patient's tumor with the lung cancers, we conducted Gene Set Enrichment Analysis 16 with genes differentially expressed between the LUAD cluster that contained the tumor of patient 1 (n = 354) and the remaining samples in the compendium (n = 10,314); we also repeated this analysis and compared the cluster for patient 1 with the remaining LUAD samples (n = 529). Both analyses revealed the overexpression of members of the IL6/JAK/STAT3 signaling pathway (Appendix Fig A2), which suggests that the activation of shared signaling programs likely contributed to the tumor transcriptional phenotype of patient 1 in addition to the site of the metastatic sample. We next searched for genes that were significantly overexpressed in the patient's tumor compared with the whole compendium and compared with only the sarcomas by using outlier statistics 3,17 (Data Supplement). To explicitly subtract the effect of the lung cell expression, we also searched for outlier genes compared with 529 LUAD tumors; 787 genes, including druggable targets JAK1, ALK, NTRK1, and CCND1, emerged as overexpression outliers in all analyses (Data Supplement).

MOLECULAR RATIONALE FOR CLINICAL DECISION MAKING
We speculated that the activation of RTKs contributed to JAK over-expression in patient 1's tumor. 18,19 Increased expression of ATF1 and its transcriptional targets, TOP2A, CALCA, and IL6, was observed, presumably as a result of constitutive transcriptional activation by EWSR1-ATF1 ( Fig 3A; Appendix Fig A3). ATF1 can activate the transcription of JAK1, 8 providing another potential mechanism for the observed high expression of the IL6/JAK/STAT3 pathway. Consolidating the fusion-based and the RTK-based mechanisms of IL6/JAK/STAT3 activation, we reconstructed a candidate pathway driving patient 1's cancer (Fig 3, Appendix  Fig A4). The POG molecular tumor board suggested targeting either JAK (with ruxolitinib) or ALK (with crizotinib). A decision was made to use ruxolitinib given (1) the over-expression of ATF1 target genes, (2) the over-expression of JAK1, and (3) the available pediatric dosing information. 20 In addition, ruxolitinib was favored over crizotinib because it targets downstream of both EWSR1-ATF1 and the overexpressed RTKs, whereas crizotinib only targets the RTKs. We recognize that a combination therapy targeting both ALK and JAK may have been appropriate on the basis of the molecular findings. However, we were unable to use drug combinations that have not been through phase 1 testing, highlighting the need for more pediatric combination therapy trials.

CLINICAL RESPONSE
At the initiation of therapy with single-agent ruxolitinib, patient 1 had severe nausea and lethargy, was mostly bed-ridden, and had a Lansky play-performance score 21 of 60 ( Fig  4). Within a week of ruxolitinib initiation, his mother reported a dramatic improvement in his energy level and complete resolution of his nausea. The patient tolerated this therapy without significant toxicity and had stabilization of the previously rapidly growing lung nodules by RECIST, 22 and his Lansky score improved to 90 to 100 for 5 months. The patient then exhibited a sudden enlargement in one lung lesion, detected during routine imaging, although most of the other lesions remained stable. Ruxolitinib was discontinued, and focal palliative radiation was ascopubs.org/journal/po JCO™ Precision Oncology 3 administered to the one lesion in the left lower lobe for pain control. Within 2 months of ruxolitinib discontinuation, the symptoms of nausea, extreme fatigue, and weight loss returned, and the lung lesions progressed. The family requested that ruxolitinib be restarted for quality of life, and the patient again showed dramatic improvement in clinical status and an unexpected prolonged period of stable disease until dose reduction because of myelosuppression was required. After the dose reduction, rapid progression of the pulmonary lesions resulted in death 23 months after the original relapse.

DISCUSSION
To our knowledge, this is the first report of a pediatric patient with cancer who benefited 4 ascopubs.org/journal/po JCO™ Precision Oncology  from cross-tumor gene expression comparisons. Cross-tumor analyses have been used in the TCGA 11 and POG studies 3,23,24 ; however, a computational framework is necessary for their clinical implementation. This case is also, to our knowledge, the first report of use of a JAK inhibitor to treat a sarcoma. Previous functional studies implicated STAT3 as an oncogene in sarcomas 25 ; the current case report builds on this work and prompts investigation into the potential clinical utility of targeting this pathway. Of note was the patient's marked and rapid clinical response to treatment, which suggests that response may have been related to the modulation of cytokine expression by the medication. Although the clinical benefit of ruxolitinib was apparent, it was challenging to quantify. Ultimately, a randomized clinical trial is necessary to assess the benefit of molecular approaches compared with the standard of care.
The case also highlights tumor heterogeneity: although the majority of the metastases remained stable, one lesion rapidly became resistant. Despite documented progressive disease, this case benefited from resumption of the medication: the patient's clinical status and many metastatic lesions were responsive to retreatment.  A serial molecular analysis of the heterogeneous lesions could inform the mechanisms of resistance; however, it was not pursued because of the family's wishes. To characterize the intratumor heterogeneity of therapeutic response, we consider follow-up biopsies, and those decisions are weighed against the risks to the patient.

Reference compendium
We obtained The Cancer Genome Atlas (TCGA) 11 13 We extracted tumor samples from these data sets and combined them into a single cohort that contained multiple adult and pediatric tumor types (N = 10,668). The expression of 18,357 protein-coding genes was measured (Data Supplement).

Gene expression outlier analysis
Gene-level reads per kilobase of transcript per million mapped reads (RPKM) expression measurements for patient 1's tumor were generated according to the previously published method. 3 We used these data to compute FPKMs by dividing each value by two. We proceeded to quantile normalize the expression values, using theoretical exponential distribution (rate parameter = 1) as a background distribution (Bolstad BM, et al: Bioinformatics 19:185-193, 2003). During this procedure, we normalized expression quantiles within each tumor sample to match quantiles of an exponential distribution (rate parameter = 1). We then performed gene expression outlier analysis 3,17 to identify transcripts significantly enriched in the patient's tumor compared with 10,668 cancer samples in the reference compendium (pan-cancer outlier analysis). We also performed the same analysis using only the 529 lung adenocarcinoma (LUAD) tumors as a reference (LUAD-only outlier analysis) and using 262 sarcoma tumors as a reference (sarcoma-only outlier analysis). Gene expression outliers were identified as described, 17 with the exception of the use of a more stringent interquartile range of 2.0. We analyzed the outlier genes for enrichment of specific pathways and signaling networks that could be targeted by available therapies using MSigDB . The similarity space is represented as a graph and is used as an input into OpenOrd. OpenOrd treats the similarities as spring constants and searches for a configuration among the samples that produces an arrangement to minimize the spring tension of the system as much as possible. We use hexagonal packing for space conservation in the projected two-dimensional plane. For each sample in the full correlation matrix, we extracted samples with top six correlation values to compose a sparse matrix of the top six nearest neighbors. We used this sparse matrix to construct a sparse similarity graph for the samples in the cohort and applied the OpenOrd method to derive the initial (x, y) positions in the map.
Furthermore, to avoid overlapping and crowding samples in the dense graph components, OpenOrd (x, y) coordinates are snapped to their nearest hexagon to arrange all of the samples on a tiling of regular hexagons. With OpenOrd (x, y) coordinates, each sample is placed in a grid cell. If the predetermined cell is occupied, the sample is snapped to an empty grid cell within a minimal distance from the original cell. Multiple samples that compete for a location will thus spiral around a central hexagon in the neighbors around the central location. Therefore, dense clumps are separated so that they can be viewed on approximately the same scale as the distances that separate them. Hexagons were selected as the shape for the grid cell to illustrate that there are no inherently preferred axis-aligned directions in the OpenOrd output.
Google Maps Application Programming Interface (API; https://developers.google.com/maps/documentation/javascript/reference) is used to load and visualize the resulting layout in a browsing environment. The API provides the ability to interactively navigate, zoom, and explore various annotations of locations on the map, analogous to Google Maps and Google Earth applications.
We applied the TumorMap method to the reference cohort of 10,668 tumors together with the tumor of patient 1 by using transcriptional profiles of 18,357 genes (Data Supplement). Of note, the reference compendium contained 262 heterogeneous sarcomas from the TCGA 11 cohort.

Statistical Robustness of TumorMap Placement
The TumorMap method belongs to the family of nearest neighbor classification methods. It projects the similarity space of the high-dimensional genomic profiles into Euclidean space by using only top neighbors of every sample in the cohort. We refer to these top neighbors as the local neighborhood of a given sample. Therefore, it is important to evaluate how robust these local neighborhoods are under small perturbations. Specifically, we wanted to assess whether the local neighborhood of patient 1 remains stable when only subsets of genes are used to compute pairwise similarities between samples.
We subsampled, without replacement, gene expression features at 80% of the original gene features. We repeated this procedure 1,000 times and computed the patient's local neighborhood across all N = 1,000 subsampled spaces. We then compared each of these local neighborhoods computed under perturbation to the true local neighborhood computed with the complete data set. We computed a local neighborhood specificity (LNspecificity) score as follows: where S is a set of nearest neighbors for either true or subsampled computation and |S| is the nearest neighbor set cardinality.
This score represents average overlap, as a fraction, of the true local neighborhood and the perturbed local neighborhoods across all subsampling iterations. The higher this score is, the more overlap we see across all the iterations and the more similarities we see among local neighborhoods under perturbations. The SP score of patient 1 was 0.885, which indicated that 88.5% of the top neighbors were consistently the same across 1,000 perturbations to the gene expression profiles, from which the TumorMap visualization was computed.

Identification of Genes Associated With the Patient 1 Cluster in TumorMap
We identified genes differentially expressed between the LUAD cluster containing patient 1 and the 10,668 tumor samples in the compendium by using the Linear Model for Microarray Analysis method (Ritchie ME, et al: Mucleic Acids Res 43:e47, 2015). The gene set enrichment analysis (GSEA) 16 of the resulting list of genes revealed that IL6/JAK/STAT3 signaling pathway annotation was significantly enriched among these genes (false discovery rate q value, 1.787 e−4 with 39 genes in the leading edge).
We also identified differentially expressed genes between the LUAD cluster containing patient 1 and the remaining LUAD tumors in the compendium (Data Supplement). The GSEA analysis 16 of this gene list revealed that members of the IL6/JAK/ STAT3 signaling pathway were still driving the creation of the LUAD cluster containing patient 1's tumor (false discovery rate q value, 0 with 36 genes in the leading edge).

Gene Expression Percentile Analysis
For the druggable targets in the reconstructed candidate driver pathway , we computed the ranked percentiles of the gene expression levels in patient 1's tumor compared with the TCGA LUAD (n = 529) and TCGA SARC (n = 264) cohorts. For each sample in the cohort as well as for patient 1, we ranked genes relative to the expression of all genes within that sample (ie, the highest expressed gene was ranked as 1). The percentiles are displayed in Figure 3

Differential Expression Analysis Relative to Normal Tissues
For all of the genes in the reconstructed candidate driver pathway that are targetable by cancer drugs (Fig 3) we computed the fold change between the expression in patient 1's tumor and in normal cells. We obtained normal tissue expression data for 16 different tissues from the Illumina Human Body Map 2.0 database (GEO Accession No. GSE30611). For each gene, we mean-aggregated normal expression into a single value and computed log change of the expression of that gene in patient 1 compared with the aggregated normal value.

Histologic Techniques and Immunohistochemistry
The tumor tissue was routinely fixed in 10% buffered formalin and then was processed in an automated fashion. Processed tissue was then embedded in paraffin wax, and the resultant blocks were sectioned at 4 µm for both hematoxylin and eosin staining and immunohistochemistry. All microscopic slides were prepared via standard automated techniques. Immunohistochemical slides were stained on the Ventana BenchMark XT Autostainer (Ventana Medical Systems, Tucson, AZ) with the Ventana iVIEW universal DAB detection kit. Primary antibodies used for immunohistochemistry are noted in Table A1 and the Data Supplement. All primary antibodies were either diluted or received as a ready to use prediluted solution from the relevant vendor.

Fluorescence In Situ Hybridization
Fluorescence in situ hybridization analysis was performed on formalin-fixed paraffin-embedded tumor tissue. Overall, 200 nuclei were quantified with the EWSR1 (22q12) dual-color breakapart probe (Vysis; Downers Grove, IL). Likewise, 200 nuclei were quantified with the WT1 (11p13) dual-color breakapart probe (prepared at the BC Cancer Agency). Standard, internally derived thresholds were followed to determine definitive presence of a translocation that involved the EWSR1 and WT1 loci.    Fig A4. Full candidate pathway that represents molecular drivers of tumorigenesis in the sarcoma of patient 1. We reconstructed this pathway on the basis of the outlier analysis, differential gene expression analysis, copy number information and literature mining (Appendix Methods). A simplified version of this pathway is presented in Figure 3A. Both EWSR1-ATF1 and receptor tyrosine kinases PDGFRB, NTRK1, ALK, and FGFR1 can contribute to the activation of IL6/JAK/STAT3 signaling. All gene expression outliers depicted in this figure were significant in all three comparisons: patient 1 versus all cancers, patient 1 versus lung adenocarcinomas, and patient 1 versus sarcomas. JAK1, the molecular target of ruxolitinib, is indicated with a yellow lightning bolt. TCGA, The Cancer Genome Atlas.