Somatic mutations of activating signalling, transcription factor, and tumour suppressor are a precondition for leukaemia transformation in myelodysplastic syndromes

Abstract The transformation biology of secondary acute myeloid leukaemia (AML) arising from myelodysplastic syndromes (MDSs) is still not fully understood. We performed paired self‐controlled sequencing, including targeted, whole exome, and single‐cell RNA sequencing, in a cohort of MDS patients to search for AML transformation‐related mutations (TRMs). Thirty‐nine target genes were analysed in paired samples from 72 patients with MDS who had undergone AML transformation. Targeted sequencing showed that 64 of 72 (88.9%) patients presented TRMs involving signalling pathway activation, transcription factors, or tumour suppressors. In most of these 64 patients (62.5%, 40 cases), the TRMs emerged at the point of leukaemic transformation. Paired whole exome sequencing in three patients revealed presumptive TRMs that were not included in the reference targets. No patient developed AML solely by acquiring mutations involved in epigenetic modulation or ribonucleic acid splicing. Single‐cell sequencing of one paired sample indicated that activation of cell signalling was related to TRMs. TRMs defined by targeted sequencing were limited to a small set of seven genes (in order: NRAS/KRAS, CEBPA, TP53, FLT3, CBL, PTPN11, and RUNX1), which accounted for nearly 90.0% of the TRMs. In conclusion, somatic mutations involved in signalling, transcription factors, or tumour suppressors appeared to be a precondition for AML transformation from MDS.

Myelodysplastic syndromes (MDSs) represent a haematopoietic state between that of normal individuals and that of patients with leukaemia. Indeed, one-third of patients with MDS progress to secondary AML (sAML), defined by bone marrow blasts exceeding 20%. Unlike de novo AML, sAML shows unique genetic features. 6 First, there is often an obvious pre-AML stage, where somatic gene mutations result in initial events (clonal haematopoiesis) and/or driver events (development of MDS phenotypes), largely involving epigenetic regulation and ribonucleic acid (RNA) splicing. 7,8 Second, despite preliminary findings on the role of late-stage gene mutations involving various signalling pathways or transcription during sAML development, 9,10 including our primary consideration of sAML-related mutations, 11  Answering these questions will help us gain new insights into sAML pathogenesis and explore new targeted therapy strategies.
Logically, only systematic findings, rather than sporadic reports, can answer these questions. A systematic finding in this study means that the data should come from a large cohort, which helps to focus attention on mutations that occur at a high rate in sAML transformation. In addition, paired self-controlled data are required to show which mutations initiate MDS development and which other mutations trigger sAML transformation. Finally, this study focused on the mutated genes related to sAML transformation rather than on overall MDS clonal evolution. 9 In general, the mutations that occur during the early/middle stage of MDS (such as MDS-specific gene mutations involving epigenetic regulation and RNA splicing) serve to initiate clonal haematopoiesis and maintain the MDS phenotype. In theory, only late events involving signalling, transcription, or tumour suppression could start the sAML transformation process. 7 Therefore, this study aimed to observe what happens, in terms of gene mutations, when MDS develops into sAML.
Paired self-controlled samples were acquired at MDS diagnosis and immediately after AML transformation. Sequencing of 39 target genes in all samples was followed by whole exome sequencing in several patients whose targeted sequencing did not identify novel mutations. An additional paired sample was subjected to single-cell RNA sequencing. Using these techniques, we obtained useful data regarding the biology of sAML transformation and suggest novel strategies to block the leukaemic transformation of MDS.

| Sample collection
From January 2004 until October 2020, our department sequenced DNA from over 800 MDS bone marrow samples. During clinical follow-up, when a patient showed potential clinical progression from MDS to sAML, targeted sequencing was repeated to search for TRMs. Whole exome sequencing was performed when targeted sequencing did not yield conclusive results for the latest mutation events and when adequate residual DNA extracts were available. Single-cell RNA sequencing was used to examine the underlying transformation dynamics. Diagnoses of MDS and sAML were established in strict accordance with the World Health Organization criteria, and the CMML subset was also included in this study according to the FAB classification. 13,14 Clinical and haematological data were recorded after informed consent was obtained in accordance with the Declaration of Helsinki. This study was approved by the hospital review board of the Shanghai Jiao Tong University Affiliated Sixth People's Hospital.

| Genomic DNA preparation, target enrichment, and sequencing
Genomic DNA (gDNA) was extracted using the DNeasy Blood and Tissue Kit (Qiagen) according to the manufacturer's protocol.
Genomic DNA was sheared using the Covaris® system (Covaris), and the DNA samples were prepared using the TruSeq DNA Sample Preparation Kit (Illumina) according to the manufacturer's protocol.
Regarding the probe design, both coding and regulatory regions of the target genes were included in the custom panel. The regulatory regions comprised promoter regions (defined as 2 kb upstream of the transcription start site), the 5′ untranslated region (5′-UTR), and intron-exon boundaries (50 bp). Custom capture oligos were designed using the SureDesign website of Agilent Technologies (Agilent).
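As an illustration of the regulatory-region definitions above, the following minimal sketch computes a 2 kb promoter window and a 50 bp flank around an exon. It is not the SureDesign workflow used in the study. The function names, the 1-based inclusive coordinate convention, and the choice to pad each exon by 50 bp on both sides are assumptions made for this example.

```python
from typing import Tuple

PROMOTER_BP = 2000   # 2 kb upstream of the transcription start site (from the text)
BOUNDARY_BP = 50     # 50 bp at intron-exon boundaries (from the text)


def promoter_interval(tss: int, strand: str, upstream: int = PROMOTER_BP) -> Tuple[int, int]:
    """Return the promoter window (1-based, inclusive) upstream of the TSS.

    "Upstream" depends on the strand: lower coordinates for "+",
    higher coordinates for "-". (Hypothetical helper.)
    """
    if strand == "+":
        return (max(1, tss - upstream), tss - 1)
    if strand == "-":
        return (tss + 1, tss + upstream)
    raise ValueError("strand must be '+' or '-'")


def padded_exon(start: int, end: int, flank: int = BOUNDARY_BP) -> Tuple[int, int]:
    """Extend an exon by `flank` bp on each side to cover its boundaries.

    Assumption: the 50 bp boundary is taken on both sides of every exon.
    """
    return (max(1, start - flank), end + flank)
```

For example, a plus-strand gene with its transcription start site at position 10000 would receive the promoter window 8000 to 9999 under these assumptions.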

| Whole-exome sequencing
The gDNA library was prepared using a TruSeq DNA Sample Preparation Kit (Illumina) in accordance with the manufacturer's protocol. In-solution exome enrichment was performed using a TruSeq

| Sequencing data processing, variant calling, and annotation
Before variant calling, the raw sequence reads were mapped to the reference genome (hg19). Duplicate reads were marked and removed to mitigate biases introduced by amplification, and base quality scores were recalibrated using the Genome Analysis Toolkit (GATK). The Ensembl VEP and vcf2maf tools were applied to generate MAF-format files for somatic mutation annotation, and the ANNOVAR tool was used to annotate the frequency of variants in population databases. Variants were identified as low-frequency functional mutations if they had a frequency of <0.1 in the ExAC03 database, <0.01 in the 1000 Genomes database, and <0.05 in the GeneskyDB database. Variants carrying one of the following functional annotations were then extracted: "Frame_Shift_Del/Ins", "In_Frame_Del/Ins", "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation", "Splice_Site" or "Translation_Start_Site".
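The frequency and annotation filter described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the `Variant` class and its field names are hypothetical, and treating a variant that is absent from a database (frequency `None`) as passing that database's threshold is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Functional classes retained, as listed in the text (MAF-style names).
FUNCTIONAL_CLASSES = {
    "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
    "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
    "Splice_Site", "Translation_Start_Site",
}

# Population-frequency cut-offs stated in the text.
MAX_EXAC = 0.1
MAX_1000G = 0.01
MAX_GENESKY = 0.05


@dataclass
class Variant:  # hypothetical record; field names are illustrative
    gene: str
    classification: str
    exac_af: Optional[float]      # None = not present in the database
    kg_af: Optional[float]        # 1000 Genomes frequency
    genesky_af: Optional[float]


def _below(freq: Optional[float], limit: float) -> bool:
    # Assumption: a variant missing from a database counts as rare.
    return freq is None or freq < limit


def is_low_frequency_functional(v: Variant) -> bool:
    """True if the variant passes all three frequency cut-offs and
    carries one of the retained functional annotations."""
    return (
        v.classification in FUNCTIONAL_CLASSES
        and _below(v.exac_af, MAX_EXAC)
        and _below(v.kg_af, MAX_1000G)
        and _below(v.genesky_af, MAX_GENESKY)
    )
```

A variant must pass all three frequency conditions at once, so a single database reporting a frequency at or above its cut-off is enough to exclude it.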

| Definition of the presumed transformation-related mutation
When we analysed the paired data, mutations that pre-existed at MDS diagnosis or newly emerged at sAML transformation were considered sAML TRMs if they met the following conditions:
1. They must be involved in at least one of three functional pathways, namely, active signalling, myeloid transcription, or tumour suppression.
2. They emerged after sAML transformation (higher weight) or pre-existed at MDS diagnosis (lower weight). Newly emerged mutations were preferentially considered as the presumed TRMs.
3. When ≥2 suspicious mutations co-existed as candidate TRMs, the biologically more aggressive one (active signalling > myeloid transcription > tumour suppressor) or the one with the lower variant allele frequency (VAF) among the newly emerged mutations (indicating the latest emergence) was defined as the TRM. Occasionally, more than one mutation could be presumed to be a TRM.
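The three conditions above can be read as a ranking rule, sketched here in a few lines. This is only one possible interpretation: the source says the more aggressive mutation "or" the one with lower VAF is chosen, and using pathway rank first with VAF as the tie-breaker is an assumption of this sketch. The `Mutation` class and the pathway labels are hypothetical, and the function returns a single candidate even though the text notes that occasionally more than one mutation could be presumed.

```python
from dataclasses import dataclass
from typing import List, Optional

# Lower rank = biologically more aggressive (ordering taken from the text).
PATHWAY_RANK = {"signalling": 0, "transcription": 1, "tumour_suppressor": 2}


@dataclass
class Mutation:  # hypothetical record for one mutation in a paired sample
    gene: str
    pathway: str         # e.g. "signalling", "epigenetic", "splicing"
    vaf: float           # variant allele frequency at sAML transformation
    newly_emerged: bool  # True if absent at MDS diagnosis


def presumed_trm(mutations: List[Mutation]) -> Optional[Mutation]:
    """Pick one presumed TRM, or None if no mutation qualifies."""
    # Condition 1: restrict to the three functional pathways.
    candidates = [m for m in mutations if m.pathway in PATHWAY_RANK]
    if not candidates:
        return None
    # Condition 2: newly emerged mutations take precedence.
    newly = [m for m in candidates if m.newly_emerged]
    pool = newly or candidates
    # Condition 3 (assumed ordering): pathway rank first, then lower VAF.
    return min(pool, key=lambda m: (PATHWAY_RANK[m.pathway], m.vaf))
```

Under this reading, a patient carrying only epigenetic or splicing mutations yields no presumed TRM, which matches the way such patients are reported as lacking TRMs by targeted sequencing.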

| Statistical analysis
Statistical analyses were conducted using SPSS software version 18.0. Kaplan-Meier analysis was used to evaluate survival time and time to progression. All p-values were based on two-sided tests, and p-values less than 0.05 were considered statistically significant.
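For readers unfamiliar with the Kaplan-Meier method, the product-limit estimator can be sketched in a few lines. This is a generic illustration, not the SPSS procedure used in the study; the function name and the input layout (parallel lists of follow-up times and event indicators) are assumptions.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death or progression) was observed,
             0 if the patient was censored at that time
    Returns (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_leaving
    return curve
```

Censored patients do not lower the curve themselves, but they reduce the number at risk for all later event times.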

| Targeted sequencing
Paired samples acquired from 72 patients with MDS before and after transformation to sAML were analysed using targeted sequencing. Tier 1 (hot-spot mutations) and Tier 2 (potentially pathogenic but unconfirmed mutations) variants were included in this study.
The somatic mutations identified are presented in Table 1.
Of course, 72 cases are much lower than the actual number of

TABLE 1 Results of the targeted sequencing from the paired samples of 72 patients
Abbreviations: AML, acute myeloid leukaemia; CMML1/2, chronic myelomonocytic leukaemia 1/2; NA, no available data; RA, refractory anaemia; RAEB-1/2, refractory anaemia with excess blasts-1/2; RARS, refractory anaemia with ring sideroblasts; RCMD, refractory cytopenias with multilineage dysplasia.

3. The total defined TRM number was more than 64 because sometimes, more than one mutation could be considered a TRM for a patient, and occasionally the TRMs for one case could be related to both signalling and transcription. As seen in Tables 1 and 2 and

SETBP1 in two patients, respectively. The next set of mutations comprised tumour suppressor genes: TP53 mutations in nine patients, WT1 mutations in three patients, and an NPM1 mutation in one patient.
The TRM events seemed to be highly enriched in seven genes.  (Table 2). However, ASXL1/BCOR/TET2 mutations were most commonly accompanied by transformation-related events when the transformation occurred (Table 2).

| Whole-exome sequencing
Samples from three of the eight patients whose targeted sequencing showed no presumed TRMs but for whom sufficient DNA extract was still available were further subjected to whole exome sequencing (WES) (patients starred [*] in Table 1). Figure 4

| Single-cell RNA sequencing
As mentioned earlier, abnormal cell signalling induced by mutations in genes such as the RAS genes or PTPN11 may be critical for the transformation of MDS into AML. However, it is still unclear whether a RAS mutation is required to activate RAS signalling pathways.
To explore this question, we used single-cell RNA sequencing to study the association between RAS mutation and RAS signalling in UPN4674 and UPN4763 (before and after disease progression). NRAS and PTPN11, in which mutations occurred during the AML stage, are core genes in RAS signalling. As shown in Figure 5A, B, one patient presented several abnormal cell types (orange, UPN4674; blue, UPN4763 in Figure 5A). Gene classification analysis showed that aberrant granulocyte-monocyte progenitors (GMP), common myeloid progenitors (CMP), megakaryocyte-erythroid progenitors (MEP), and monocytes (Mono) were present during the AML stage (Figure 5B). We focused on differences in RAS signalling in the GMP population. Integrated analysis based on single-cell sequencing indicated that several RAS signalling-related genes were expressed at high levels in GMP after disease progression (Figure 5C). These genes have been reported to participate in the activation of RAS signalling. 18 Similarly, RAS signalling-related genes such as FLT3, INSR, and CDC42 were also expressed at high levels in the MEP, CMP, and Mono groups after disease progression (Figure 5D). These genes are closely associated with cell proliferation. Interestingly, the apoptosis-related gene BAD was down-regulated in the GMP, MEP, CMP, and Mono groups after disease progression. These data suggest that TRM gene mutations induce the redistribution of clonal cells, which leads to an increased number of morphological blasts and monocytes via the activation of cell signalling, and further leads to AML transformation.

FIGURE 2 Evolution route for six cases that were analysed by an additional sequencing assay between MDS diagnosis and AML transformation.

| DISCUSSION
The pathogenesis of sAML differs from that of de novo AML in several respects, including clinical progression and prognosis. 19 Despite poor response to contemporary therapies, including stem-  Figure 2). When TRMs emerged in these patients, their disease was still at the low-risk or pre-AML stage but transformed to AML quickly. Finally, for one patient (UPN 4674), immediately after the occurrence of the AML phenotype, accompanied by NRAS  Most of the TRM clones showed linear evolution from the founding clones (29 cases, compared to five cases with linear evolution in a pre-existing subclone and six cases by clone sweeping) (Figure 3). This is somewhat different from previous reports, 23,24 which may be because a subset of the patients did not represent the clonal progression process, such as the 24 cases whose TRMs mutations, they emerged as common partners of TRMs (Table 2), possibly playing a role in the transformation process. More research should be conducted to explore the relationship between BCOR mutations (which usually occur in patients with normal chromosomes) and sAML, 26 especially in patients with sAML and RAS mutations as TRMs (Table 2). Some mutated genes, such as RUNX1 (highly prevalent among patients with chromosome 7 abnormalities), 27 could be responsible for the MDS phenotype or for driving AML progression. These dynamic genetic features contribute to the complexity and heterogeneity of this disease and may become key to understanding the occurrence of sAML.
Together with TP53, mutations in NRAS/KRAS, CEBPA, FLT3, CBL, PTPN11 and RUNX1 (a total of seven genes) accounted for almost 90% of the patients in whom TRMs were defined by targeted sequencing. KIT mutation was not detected as a TRM in our group of cases, and NPM1 was detected in only one case. The latter could be attributed to the favourable response of MDS patients who harbour NPM1 mutations to HMAs (decitabine), which blocks sAML transformation. 28 We identified CEBPA as one of the most common TRMs (accounting for nearly one-sixth of all TRMs), which expands the candidate genes for possible targeted treatment. In addition, FLT3 mutations in this study rarely involved FLT3-ITD (only one of the eight cases), a pattern often observed in de novo AML. Therefore, it appears that targeted therapy directed at point mutations would be beneficial for patients with this kind of TRM. Given the rapid development of mutation-specific targeted therapy in recent years, we hope that AML transformation can be pharmacologically blocked in patients with MDS at high risk of transformation through regular monitoring for TRMs and the administration of the corresponding effective targeted therapy.
There are several limitations to this study. This was not a prospective study. Therefore, some data from AML-transformed cases could not be collected. Moreover, not all progression-related genes were included in the 39 targets in this study, which may have resulted in the exclusion of some useful information. A more rigorous design is needed to fully explore the sAML transformation process.
In summary, somatic mutations in signalling pathway activation, transcription factor, or tumour suppressor genes appear to be a precondition for AML transformation in MDS. The high propensity to acquire TRMs is worthy of further research so that novel targeted therapies can be developed.

FUNDING INFORMATION
This study was supported by the National Natural Science Foundation of China (grant nos. 81770120 and 81770122).

CONFLICT OF INTEREST
The authors do not have any conflict of interest to declare.

DATA AVAILABILITY STATEMENT
The data are available from the corresponding author upon request.

PATIENT CONSENT
Clinical and haematological data from patients were recorded following informed consent in accordance with the Declaration of Helsinki.

PERMISSION TO REPRODUCE MATERIAL FROM OTHER SOURCES
Not applicable.

CLINICAL TRIAL REGISTRATION
Not applicable.