A c-Myc-regulated stem cell-like signature in high-risk neuroblastoma: A systematic discovery (Target neuroblastoma ESC-like signature)

c-Myc dysregulation is hypothesized to account for the ‘stemness’ – self-renewal and pluripotency – shared between embryonic stem cells (ESCs) and adult aggressive tumours. High-risk neuroblastoma (HR-NB) is the most frequent, aggressive, extracranial solid tumour in childhood. Using HR-NB as a platform, we performed a network analysis of transcriptome data and presented a c-Myc subnetwork enriched for genes previously reported as ESC-like cancer signatures. A subsequent drug-gene interaction analysis identified a pharmacogenomic agent that preferentially interacted with this HR-NB-specific, ESC-like signature. This agent, Roniciclib (BAY 1000394), inhibited neuroblastoma cell growth and induced apoptosis in vitro. It also repressed the expression of the oncogene c-Myc and the neural ESC marker CDK2 in vitro, which was accompanied by altered expression of the c-Myc-targeted cell cycle regulators CCND1, CDKN1A and CDKN2D in a time-dependent manner. Further investigation into this HR-NB-specific ESC-like signature in 295 and 243 independent patients revealed and validated the general prognostic index of CDK2 and CDKN3 compared with CDKN2D and CDKN1B. These findings highlight the very potent therapeutic benefits of Roniciclib in HR-NB through the targeting of c-Myc-regulated, ESC-like tumorigenesis. This work provides a hypothesis-driven systems computational model that facilitates the translation of genomic and transcriptomic signatures to molecular mechanisms underlying high-risk tumours.


MYCN-dependently expressed gene signature via meta-analysis of expression profiles
We collected a total of 556 primary patients from four independent studies to derive this signature. Raw gene expression profile data (Affymetrix.CEL files or Agilent.txt files) and the clinical information for patients were retrieved from seven datasets at different GEO 1 or ArrayExpress 2 , comprised of 325 patients of advanced stage (stages 3 and 4) neuroblastoma with outcome records (Supplementary Table 1).
Expression measurements were transformed to the log2 level if they were not originally done so by the authors. The downloaded Affymetrix exon arrays were processed using the Bioconductor aroma.affymetrix package for each dataset 3 . For the custom designed Agilent arrays, we re-annotated the probes to the human genome assembly hg37 by integrating the annotations derived by the BLAST algorithm and that by the DAVID online annotation tools 4 . To increase statistical power, we pooled samples measured on the same platform into a big dataset after normalization 5 . COMBAT 6 was applied to adjust the batch effects.
For differential expression analysis, we performed variance-stabilizing transformation with the Linear Models for Microarray Data (limma) method on the two pooled datasets, respectively. We selected the probe with the lowest p-value to represent the transcriptional change of its designed gene if multiple probes were mapped to the same gene on the array. We then applied Stouffer meta-analysis to combine the linear model resultant statistics into jointed p-values. For multiple testing correction, we considered the genes with probes meeting a false discovery rate (FDR) less than 0.001 with agreement on the orientation of deregulation across all datasets for downstream analyses. Additionally, we included significant genes but only measured by one platform for this meta-analysis to maximum the candidate gene signatures.
Finally, we split the identified MYCN-dependently expressed genes into two sets of gene signatures. The MA genes were higher-expressed in MYCN-amplified (MA) patients than in MYCN-nonamplified (MN) patients with high-risk neuroblastoma (HR-NB), while the MN genes were lowexpressed in MA patients with HR-NB (i.e., higher-expressed in MN patients) than in MA patients. Note that we are interested in the collective effects of the MA genes compared to the MN genes rather than the expression of any individual genes.

Define transcriptional "HR genes" and "LR genes" from MN patients
To identify the genes overexpressed in high-risk patients, we analysed six cohorts collected from nine gene expression data-sets containing patients with normal MYCN copy number. Since all patients with MYCN amplification will be categorized into the high-risk group, we focused on the patients with normal MYCN copy number to make the comparison between high-risk and low-risk groups. Again, we combined data-sets with a small sample size (< 30) after using COMBAT 6 to adjust the batch effects. We used the limma test to identify genes that showed significance in at least one studied cohort (FDR<0.05, fold change>1.3, 1.5 or 1.8 (Supplementary Table S1) and prioritized up to 2000 differentially expressed genes in each cohort. When multiple probesets were designed for one gene in a platform, we kept the one with the lowest p-value and then adjusted the significance of multiple tests among all genes.
We excluded genes showing conflicting orientations of fold changes across multiple cohorts, and calculated the average fold change if a gene presenting the same statistically significant orientation across multiple cohorts.

Three well-known MYCN-independent prognostic signatures in HR-NB 7-10
Asgharzadeh et al. reported a 55-probeset (52-gene) signature stratifying MYCN non-amplified stage 4 patients with poor outcome from those with a high survival probability 7 . Comparing the mRNA expression patterns of high-risk patients with those expression patterns of low-risk patients, Vermeulen et al. identified 59 prognostic genes for NB in at least two of seven cohorts 8 . A year later, the same group purified these 59 genes into a 42-gene signature to be robustly prognostic in independent primary NB, even in HR-NB 9 . Additionally, Valentijn et al. identified a 157-gene signature of MYCN targets by shRNA-mediated silencing MYCN in NB cells, and showed that its prediction of poor prognosis to be more powerful than MYCN-amplification or the mRNA expression of MYCN 10 .

Collecting genomic variations in high-risk NB
Genes with somatic mutations in exons were collected from two publications including the study of the National Cancer Institute's Therapeutically Applicable Research to Generate Effective Treatments (TARGET). From 240 patients with HR-NB, a recent study identified a spectrum of 5291 candidate somatic mutations in the coding regions of 3960 genes. Among them, 268 potential RH-NB causing mutations of 195 genes have been verified 11 . Another study of 87 untreated primary NB tumours of all stages, including 48 patients with HR-NB, revealed 157 genes harbouring verified somatic variations with a somatic score larger than 0.1 12 . From the first study, we collected 182 genes that harbour somatic missense-mutations with evidence, as a coding region missense mutation will mostly impact the gene function. To maximum the research candidates, we also collected the patients with HR-NB from the second study if the they harboured the verified somatic mutations. Additionally, we included LMO1 with the germline variant associated with HR-NB for a following query for prognostic signature (reviewed by 13 ) because LMO1 genotypes are significantly associated with metastatic and high-risk status in HR-NB 14 .

Regulatory network analysis
Gene members of the over-represented genesets were linked if their protein products interact in the STRING (v10) 15 PPI databases. We used Cytoscape to visualize the network and the Bioconductor igraph package to calculate the network degree. The network hubs were identified by both empirical and theoretical tests. Specifically, to evaluate the empirical degree significance, we randomly sampled the same number of gene symbols from the STRING database (STRINGdb) and calculated the empirical significance after B=2000 rounds of simulations, which was calculated by P = # of (simulated degree > observed degree)/B. Meanwhile, we calculated the theoretical significance for each protein in the network using the Fisher's Exact test (FET). This theoretical test compared the observed network-degree in the NB-associated subnetwork to the background degree in the STRING database. The significance was identified with an odds ratio larger than 2 and a p-value less than 1e-5.

MYC occupancy at gene promoters in HR-NB
The following data were retrieved from ENCODE (hg19 assembly): DNase-seq in the SK-N-SH cells: wgEncodeOpenChromDnaseSknshBaseOverlapSignal.bigWig ChIP-seq for cMyc in NB4 cells: wgEncodeSydhTfbsNb4CmycStdSig.bigWig ChIP-seq for MAX in NB4 cells: wgEncodeSydhTfbsNb4MaxStdSig.bigWig Genomic view were generated using Bioconductor packages Gviz and GenomicRAnges, and the gene coordinates were retrieved from the UCSC genome assembly hg19.
Cells were seeded in 96-well plates (30,000 cells/well) in RPMI-1640 medium containing 10% FBS. The absorbance of the formazan at 490 nm were measured directly from the 96-well plates. Cells (100 μl/well) were seeded at seeding densities of 3x104 into 96 well microtitre plates and allowed to adhere for 16 h.
Cells were treated with different dose of Roniciclib in 100 μl solution. To monitor the conversion of MTS to formazan, cells was assessed on a daily basis by adding 20 μl of MTS. Following 1 h incubation period with MTS, the absorbance at 490 nm was recorded using an ELISA plate reader. The background was at zero cells/well, and DMSO was used as vehicle control.

Detection of apoptosis
To determine the number of dead cells in population after Roniciclib treatment, active caspase 3 was detected using the Caspase-3 DEVD-R110 Fluorometric HTS Assay Kit (Biotium, Fremont, CA). The substrate was completely hydrolyzed by the enzyme in two successive steps. Cleavage of the first DEVD peptide resulted in the monopeptide Ac-DEVD-R110 intermediate, which had absorption and emission wave-lengths similar to those of R110 (λ abs /λ abs = 496/520 nm). Hydrolysis of the second DEVD peptide released the dye R110, leading to a substantial fluorescence increase. R110 was used for generating a standard curve to calculate the amount of the substrate conversion.
The SY5Y cells were placed in 100 μl culture medium per well of a black 96-well plate in the same cell number as described for the cell viability assay. Cells were treated by different concentration of Roniciclib to induce apoptosis. Caspase-3 activity was detected by adding 100 ul of assay buffer to 100 ul cells in culture medium in each well at different time points. The plate was incubated at 37°C for 60 minutes and fluorescence with 470 nm excitation and 520 nm emission.    (2005).

Figure S1. The associations between prognosis, genomic variants, and transcriptional MYCNassociation in HR-NB.
A) Venn diagram of overlaps among the HR genes, the prognostic 42-gene classifier, and the MA genes (top sub-panel) or the MN genes (bottom sub-panel). By comparing these two sub-panels, we show that HR genes recapture both MA and MN genes and are significantly enriched for the MA genes. However, from the viewpoint of the prognostic 42-gene classifier, it significantly recaptures both MA and MN genes and shows preference for the MN genes.

B)
Due to the heterogeneity of NB, no one gene was recaptured among all three well-known prognostic signatures (the 52-, 42-and 157-gene signatures), and only three genes were recaptured by two prognostic signatures. However, our MA and MN genes recapture the majority of these three well-known prognostic signatures. Note that the 42-gene signature is further derived from the 59-gene signature.

C)
The results of Fisher's exact tested (FET) for the overlap among the MA-dependently expressed signature (the merger of MA and MN genes) and four published HR-NB prognostic signatures. A background of 19,000 measured human gene symbols are used for the tests.     A) Volcano-plots of significant gene expression changes in HR-NB compared with LR-NB. Each subpanel represents one independent cohort, and the number of patients is given in parentheses. We used the limma test to identify differential expression from the expression profiles of only MN patients. When significant, we highlighted the expression change in genes CDKs, CDKNs, and CCND1.

B)
Results of Western blotting for c-Myc and n-Myc. The LAN1 cells were stimulated by vehicle control (DMSO), BAY1000394 (1 μM) or left alone for the times indicated. The n-Myc protein levels were analysed by immunoblotting using anti-n-Myc antibody (ThermoFisher, PA5-17403). β-actin was used as a loading control.
C) RT-PCR experimental results for MYCN and CDK4 in two HR-NB cell lines. Quantitative changes in mRNA expression was calculated between dimethyl sulfoxide (DMSO) and Roniciclib-treated cells. These relative expression levels were normalized to the mean of GAPDH. The data shown are the mean±SD of replicate experiments after normalization to the non-treated control, followed by the number of biological replicates. Statistics were calculated by a two-tailed Welch's corrected t-test.

B)
Results of RT-PCR. The MCF7 cells were quantitatively analysed to examine mRNA levels between DMSO-and Roniciclib-treated cells. These relative expressions were normalized to the mean of GAPDH. The data present the mean±SD of replicate experiments after normalization to the non-treated control. The p-value of a two-tailed Welch's corrected t-test is given in each scenario, followed by the number of biological replicates.

Figure S8. The mRNA expression changes in SK-N-SH cells.
A) Roniciclib selectively inhibits CDK2 rather than CDK4 in the SK-N-SH cells.

B)
The mRNA expression of both c-Myc and n-Myc response to Roniciclib at 72 h in the SK-N-SH.

C)
Roniciclib induces the mRNA expression of CCND1 significantly.

D)
Roniciclib substantially decreases the mRNA expression of CDKN1A and CDKN2D along with CDK2 after an early phase (24 h) of treatment.
In each sub-panel, quantitative changes in mRNA expression was calculated between DMSO-and Roniciclib-treated cells. These relative expression levels were normalized to the mean of GAPDH. The data present the mean±SD of 2-3 replicate experiments after normalization to the non-treated control. Statistics were calculated by a two-tailed Student t-test, followed by the number of biological replicates.

GSE27608 / GSE21713
1,2,3,4,4s exon #: we exclueded 8 overlaped, 1 withnot MYCN annotated, and 6 MYCN gain pts. *: Pts with possiblly Intermediate-risk are considered due to lack of age and other clinical information. For all other datasets, we excluded patients with intermideate risk and without MYCN status or the patients reported by the authors as "low-risk" but labelled with MYCN-amplification.