Whole Exome Sequencing of Simultaneous Primary Multifocal Lung Cancer (SMPLC): Case Report and Bioinformatics Analysis


Background: Understanding the molecular mechanism of synchronous multifocal lung cancer (SMLC) is of great significance for the differential diagnosis of intrapulmonary metastasis (IM) and synchronous multiple primary lung cancer (SMPLC). Recently, next-generation sequencing (NGS) has become a useful tool for understanding SMLC. Case presentation: In this study, two lesions from a 61-year-old man with lung cancer were examined by whole exome sequencing (WES), and the relationship between the lesions was analyzed at the molecular level. Lesion 1 was adenocarcinoma and lesion 2 was squamous cell carcinoma. Gene mutations and copy number variations (CNVs) differed between the two lesions, and the genome of lesion 2 was more unstable. Clonal evolution analysis showed no obvious evolutionary relationship between the two lesions, indicating that they were independent double primary lesions. Bioinformatics analysis revealed that the altered genes of the two lesions were inconsistent in function and pathway. Principal component analysis (PCA) was performed using The Cancer Genome Atlas (TCGA) and GTEx databases; based on the genes altered in the two lesions, tumor samples were significantly separated from the control group, and the alterations of TP53 and EGFR in the TCGA database were further described. Conclusions: These results indicate that NGS may provide new ideas for SMLC classification.


Background
Lung cancer is the leading cause of cancer-related deaths worldwide. 1 An increasingly prominent phenomenon is that more and more lung cancer patients first present with multinodular lesions, which is called SMLC. Patients with metastatic tumors are at a late clinical stage, lose the opportunity for surgery, and have a poor prognosis, whereas most SMPLCs are early-stage lung cancers. Reports have shown that the 5-year survival rate of stage I SMPLC can be as high as 75.8%. 2 Therefore, correctly differentiating whether these multiple lesions are SMPLC or IM determines the clinical staging, the formulation of a reasonable treatment plan, the evaluation of the patient's prognosis, and an in-depth understanding of the disease. However, due to the high heterogeneity of lung cancers and the limitations of detection methods, the classification of SMLC has always been difficult, and its prognosis and pathogenesis are rarely studied.
Recently, the development of molecular pathology has provided many useful methods and tools for SMLC research. Researchers extract DNA from tumor cells for detailed and precise analyses in order to trace the genetic origin of the tumor. With NGS technology, we can do more than simply detect mutation sites in known, specific cancer genes.
In this study, we report a case of SMPLC whose lesions had different clinical features, histopathological morphology, and mutation sites in key genes. Using whole exome sequencing of the 2 lesions in this case, we compared the molecular differences between the lesions and combined the results with an analysis of public databases. These results provide an idea for the identification of SMPLC.

Case Presentation
The patient was a 61-year-old man. Chest CT showed 2 lesions in the lung, located in different lobes, with different pathological morphology and no lymph node metastasis. Pathology identified the case as SMPLC. Lesion 1, located in the right middle lobe, was histologically classified as adenocarcinoma (Fig. 1A). Lesion 2, located in the right lower lobe, was histologically classified as squamous cell carcinoma (Fig. 1B). Whole exome sequencing was performed on the 2 lesions and 1 control (Shihe, Nanjing, China). The library was prepared with the Hyper Prep Kit (Kapa) and sequenced on the HiSeq 4000 NGS platform (Illumina). After the raw sequencing data passed quality control, the somatic mutations and copy number variations of the lesions were analyzed.
The somatic mutations and CNVs were analyzed based on the NGS data (Figure 2A). TCGA is a cancer genomics database that includes over 20,000 primary cancer and matched normal samples spanning 33 cancer types. We used PCA to compare the expression of the altered genes of lesion 1 and lesion 2 between the LUAD/LUSC groups and the control groups from the TCGA database; the GTEx database was used as a normal control.
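The PCA comparison described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the expression matrix below is randomly generated placeholder data, the group sizes are arbitrary, and the function name `pca_project` is hypothetical. It only shows how samples described by a panel of altered genes can be projected onto the first two principal components so that tumor and control groups can be compared.

```python
import numpy as np

def pca_project(expr, n_components=2):
    """Project samples (rows) onto the top principal components.

    expr: 2-D array, one row per sample, one column per gene.
    Returns the sample scores and the fraction of variance
    explained by each retained component.
    """
    centered = expr - expr.mean(axis=0)           # center each gene
    # SVD of the centered matrix; rows of vt are the principal axes
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T       # sample coordinates
    explained = (s ** 2) / np.sum(s ** 2)         # variance ratios
    return scores, explained[:n_components]

# ASSUMPTION: synthetic placeholder data, NOT TCGA or GTEx values.
rng = np.random.default_rng(0)
n_genes = 55                                      # size of the lesion 1 gene panel
tumor = rng.normal(loc=1.0, scale=1.0, size=(30, n_genes))
normal = rng.normal(loc=0.0, scale=1.0, size=(30, n_genes))
expr = np.vstack([tumor, normal])
labels = np.array(["tumor"] * 30 + ["normal"] * 30)

scores, explained = pca_project(expr)
for group in ("tumor", "normal"):
    pc1_mean = scores[labels == group, 0].mean()
    print(f"{group}: mean PC1 = {pc1_mean:.2f}")
```

With real data, `expr` would hold normalized expression values for the altered genes, and the scores would typically be plotted as a scatter plot colored by group.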
The 55 altered genes in lesion 1 showed significantly different expression between the LUAD group and the control group (Fig. 5A). Likewise, the 140 altered genes in lesion 2 showed significantly different expression between the LUSC group and the control group (Fig. 5B). Figure 6 shows the overlap of gene alterations in lesion 1 and lesion 2. In lesion 1, no gene was affected by both CNV and mutation. In lesion 2, TP53 was the only gene affected by both CNV and mutation, carrying both a missense mutation and a copy number loss. No CNV gene was shared between lesion 1 and lesion 2, and EGFR was the only mutated gene shared by the two lesions.
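The comparison of mutated and CNV-affected gene lists described above amounts to a set intersection, which can be sketched as follows. Only EGFR and TP53 are taken from the text; the names `GENE_A` to `GENE_F` are hypothetical placeholders, and the helper `shared_genes` is an illustrative assumption rather than the tool used in this study.

```python
def shared_genes(*gene_sets):
    """Return the sorted list of genes present in every input set."""
    return sorted(set.intersection(*(set(g) for g in gene_sets)))

# ASSUMPTION: placeholder gene lists; only EGFR and TP53 come from the text.
lesion1_mut = {"EGFR", "GENE_A", "GENE_B"}
lesion1_cnv = {"GENE_C", "GENE_D"}
lesion2_mut = {"EGFR", "TP53", "GENE_E"}
lesion2_cnv = {"TP53", "GENE_F"}

print(shared_genes(lesion1_mut, lesion1_cnv))   # CNV and mutation, lesion 1
print(shared_genes(lesion2_mut, lesion2_cnv))   # CNV and mutation, lesion 2
print(shared_genes(lesion1_mut, lesion2_mut))   # mutations shared by both lesions
print(shared_genes(lesion1_cnv, lesion2_cnv))   # CNV genes shared by both lesions
```

With these placeholder lists the four comparisons reproduce the pattern reported in the text: an empty overlap within lesion 1, TP53 within lesion 2, EGFR between the mutation lists, and no shared CNV gene.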
We further analyzed the alterations of TP53 and EGFR in the TCGA database. In LUAD, 74% and 60% of samples had TP53 and EGFR alterations, respectively (Fig. 7A). Heterozygous alterations and gains were the main alteration types of TP53 and EGFR in LUAD. In LUSC, 92% and 68% of samples had TP53 and EGFR alterations, respectively (Fig. 7B). Mutations and gains were the main alteration types of TP53 and EGFR in LUSC. For TP53, wild type was the most common status in both LUAD and LUSC, and missense mutations accounted for most of the mutations in LUAD (Fig. 7C). For EGFR, wild type accounted for more than half of the samples in both LUAD and LUSC (Fig. 7D), and missense mutations were the most frequent among all detected mutations.

Discussion
At present, the most widely used differential diagnosis criterion is the clinicopathological criterion for defining multiple primary lung tumors, first published by the pathologists Martini and Melamed in 1975. 3 Owing to the limited techniques available, this criterion relies only on histological type. However, as understanding has improved, more and more shortcomings of this criterion have been exposed. Lung cancer is highly heterogeneous, and sometimes two or three different tissue morphologies are present in the same lesion. On the other hand, even if the lesions show similar histological types, they may carry different molecular changes, and these changes are sufficient to indicate their independent origin. Another standard commonly used in clinical practice is the TNM classification system. Paradoxically, a series of studies analyzing multifocal lung tumors showed that SMPLC may have a good prognosis. 4 Therefore, relying solely on histopathological diagnostic criteria may overestimate the stage of some patients, leading to incorrect staging and treatment. Although the 2007 version of the American College of Chest Physicians (ACCP) guidelines revised and optimized the Martini and Melamed criteria by adding genetic and molecular information, the lack of high-level medical evidence means that the diagnosis, treatment, and prognosis of multifocal lung cancer remain highly controversial, and conflicting opinions persist. 5,6
EGFR is currently the most common treatment target in lung cancer. The frequency of EGFR mutation was nearly 50% in LUAD patients and 9.9% in LUSC patients from the Asia-Pacific region. 7 It has been found that in LUAD with known EGFR mutations, the surrounding normal lung epithelial cells also carry the same mutations. 8 This suggests that EGFR mutation occurs early in the development of lung cancer and can serve as an important reference for analyzing the origin of lung cancer.
Yatabe et al. examined multiple sections and multiple samples of the same tumor in a large series of LUAD and tested them for EGFR mutations. 9 They found that no case of lung cancer had multiple different EGFR mutations. At the same time, a paired study of confirmed primary and metastatic lesions showed a high degree of consistency between the EGFR mutations of primary and metastatic tumor tissue. Regarding multiple lung adenocarcinomas, one study performed EGFR molecular testing on the tumor tissues of four patients and found that two of them had exactly the same mutation sites, while the other two were not consistent, reflecting the complexity of the origin of multiple lung tumors. 10 Some scholars used array comparative genomic hybridization (aCGH) and somatic mutation detection (EGFR) to analyze the spectrum of genomic changes in lung cancer patients. The results showed that the concordance rate of copy number changes in the metastatic group was higher than that in the primary group (55.5% vs. 19.6%, P = 0.04). 11
TP53 is a well-known tumor suppressor, and TP53 mutation is usually related to tumor occurrence, development, and poor prognosis. 12 Dual TP53 and EGFR mutations were found in 41% of NSCLC patients, and TP53 missense mutation led to significantly lower response rates and shorter PFS in patients receiving EGFR TKI therapy. 13 However, the high frequency of TP53 mutations in tumors limits their usefulness in distinguishing SMLC. Recently, studies have used EGFR mutations to classify multifocal lung cancer in small samples, but the results were unsatisfactory because of common limitations such as a low mutation rate. Therefore, the detection of a single gene is far from meeting the needs of classification and traceability. In one study, researchers identified 27 cases of SMPLC among 122 cases based on EGFR and K-ras mutation analysis. 14 However, there are also different opinions.
Takamochi et al. classified 36 patients based on EGFR and K-ras mutations, analyzed their prognosis, and concluded that there was no difference in survival rates between the two groups. 15 Some researchers proposed combining more genes, including p53, p16, p27, and c-erbB2, to establish a differential expression mathematical model for differentiating multiple primary from metastatic lung cancers. 16 Another study of East Asian lung adenocarcinoma found that the combined probability of a mutation in any one of the four genes EGFR, K-ras, c-erbB2, and ALK can reach 90%. 17,18
In this study, we used WES to capture the various alterations in tumor tissue. We compared the two histologically different lesions and found that they differed in mutations and CNVs. Among the altered genes, we identified EGFR and TP53 as two specific genes. We used the TCGA database to further show the alterations of these two genes in the general population. Combining a single patient sample with public data could provide more information. However, more samples are necessary to find the key features for the identification of SMPLC.

Conclusion
Overall, this study describes a rare case of SMPLC involving adenocarcinoma and squamous cell carcinoma. We observed the histological features and sequenced the whole exome of the two lesions. Further bioinformatics analysis of gene mutations, CNVs, clonal evolution, and KEGG pathways provides clues for better understanding the intrinsic molecular characteristics of SMLC and for carrying out accurate differential diagnosis.

Abbreviations
CNV: copy number variation; IM: intrapulmonary metastasis; NGS: next-generation sequencing; SMLC: synchronous multifocal lung cancer; SMPLC: synchronous multiple primary lung cancer; TCGA: The Cancer Genome Atlas; WES: whole exome sequencing

Declarations

Ethics approval and consent to participate
This publication was approved by the Ethics Committee of the Affiliated Suzhou Hospital of Nanjing Medical University.

Consent for publication
Consent for publication was obtained from the patient.

Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Competing interests
The authors declare that they have no competing interests.

Funding
This work was supported by the funds of the Suzhou Science and Technology Bureau Project (sys2018084) and Talent Project of Jiangsu Province (WSN-256).

Authors' contribution
Donglin Zhu carried out the case analyses and wrote the manuscript. Ming Hong prepared the histopathological sections and performed the genetic analysis. Jinghuan Lv conceived the study and drafted the manuscript. All authors read and approved the final manuscript.