A Pilot Prospective Study of Plasma Cell-Free DNA Whole-Genome Sequencing Identified Chromosome 7p Copy Number Gains as a Specific Biomarker for Early Lung Cancer Detection

Hongjie Yu, Department of Thoracic Surgery, the First Affiliated Hospital of Soochow University
Xiaojun Yu (18362738721@163.com), The First Affiliated Hospital of Soochow University, https://orcid.org/0000-0002-7318-0220
Jia Tang, Department of Thoracic Surgery, First Affiliated Hospital of Soochow University
Xun Lu, Department of Thoracic Surgery, First Affiliated Hospital of Soochow University
Haitao Ma, Department of Thoracic Surgery, First Affiliated Hospital of Soochow University

Conclusions: Chromosome 7p copy number gains might be a useful peripheral blood tumor biomarker for lung cancer detection.

Introduction
Lung cancer is the leading cause of cancer-related death worldwide. Although lung cancer diagnosis and treatment have improved in recent years, the 5-year survival rate across all lung cancer patients is still less than 20% [1]. The main reason for poor survival is that lung cancer is still diagnosed at a relatively advanced stage. A large prospective clinical trial applying low-dose CT (LDCT) scans for early lung cancer discovery has shown great survival benefits. However, LDCT scanning produces a very large number of false positives [2]. About 94% of LDCT-identified nodules are actually benign changes, which do not need further treatment. Usually, invasive diagnostic approaches are applied to those patients. Hence, non-invasive approaches are needed to complement LDCT and reduce over-diagnosis.
Chromosome 7 aneuploidy has been reported as an early event in lung cancer development. Chromosome 7 amplifications have been frequently reported in lung cancer tissue profiling studies, and also in bronchial in-situ hybridization studies. Chromosome 7 aneuploidy can be found in both malignant tissues and pre-cancerous lesions, but not in healthy tissues, which supports the view that chromosome 7 copy number changes might be an early event in lung cancer tumorigenesis [3][4][5]. However, tissue biopsy is an invasive and complicated procedure, and it might sometimes cause tumor metastasis.
In recent years, chromosome aneuploidy detection using plasma cell-free DNA (cfDNA) has been applied in prenatal testing, with minimal false positives and false negatives [6]. Like fetal tissues, tumors continuously shed DNA into the peripheral bloodstream. Cancer somatic mutations, such as EGFR mutations, have been successfully detected as biomarkers indicating potential benefit from targeted therapies. In addition to somatic point mutations, chromosomal copy number changes have also been detected in breast cancer [7], hepatocellular carcinoma [8] and lung cancer [9]. However, most of the complex copy number changes occur in advanced-stage tumors, so they show limited power for early cancer detection. Here we investigate whether chromosome 7 aneuploidy detected in plasma cfDNA can be used for early lung cancer detection.

Materials And Methods
Funding
This work is supported by the science and technology fund social development project of Jiangsu Province (No. BE2015640).

Ethics statement
The design and methods of the current study involving human subjects were clearly described in a research protocol. The Institutional Review Board (IRB) of Soochow University approved the study. All recruited subjects signed written informed consent.

Subjects
Twenty-six ovarian cancer patients and thirty-four benign ovarian tumor patients were admitted to the Department of Thoracic Surgery of the First Affiliated Hospital of Soochow University. All patients were chemotherapy-naive or radiotherapy-naive. Blood samples were collected before surgery for cfDNA extraction and CA125 measurement.

Declarations
The authors have no conflicts of interest to declare.

Next generation sequencing
Total genomic DNA and cfDNA were isolated from tissue samples and plasma using the Amp Genomic DNA Kit (TIANGEN) and the QIAseq cfDNA Extraction kit (Qiagen), respectively. Next generation sequencing was performed as previously described [18,19]. Genomic DNA was fragmented to an average size of 300 bp (cfDNA was not fragmented), and 100 ng of fragmented genomic DNA (10 ng for cfDNA) was used to prepare sequencing libraries (NEBNext Ultra II). Sequencing adaptors with 8 bp barcodes were then ligated to the DNA fragments and amplified by PCR. Purified sequencing libraries were sequenced by massively parallel sequencing on the Illumina HiSeq X Ten platform. For each sample, 4G of raw sequencing data was filtered and aligned to the human reference genome.

Statistical analysis
The R package 'DNAcopy' was used to analyze copy number changes. A P value of <0.05 was considered statistically significant for binary segmentation. The absolute segment value was used for further analysis. The sensitivity and specificity of UCAD were estimated using Receiver Operating Characteristic (ROC) curves. For categorical variables, the chi-square test was used as appropriate. All statistical analyses were performed using SPSS 17.0.
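The ROC-based estimate of sensitivity and specificity described above can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the study. The function names and the toy scores are hypothetical, and the assumption that a sample is called positive when its score is at or above a threshold is ours.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive.

    labels: 1 = cancer, 0 = control.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs from a threshold sweep."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy values, NOT data from this study.
TOY_SCORES = [0.0035, 0.0028, 0.0022, 0.0015, 0.0012, 0.0008]
TOY_LABELS = [1, 1, 1, 0, 1, 0]

if __name__ == "__main__":
    print(sensitivity_specificity(TOY_SCORES, TOY_LABELS, threshold=0.0020))
    print(auc(roc_points(TOY_SCORES, TOY_LABELS)))
```

Sweeping the threshold over every observed score gives the full curve, and a single operating point can then be chosen according to how many false positives are acceptable.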

Results
Fresh plasma samples were collected from 8 lung cancer patients. All patients gave signed informed consent. DNA was successfully extracted from all samples, with DNA concentrations ranging from 0.19 to 0.49 ng/µL. Sequencing libraries were prepared with the standard NEB protocol and then sequenced on the Illumina X10. On average, 10G of data was collected for each sample.

Patient characterization
As shown in Table 1, samples were collected from 16 cancer patients, 3 patients with benign lesions and 18 healthy controls. UCAD detection was performed before surgery in 12 adenocarcinoma and 4 squamous cell carcinoma patients. Tumors showed a significantly higher UCAD chr7 score than the control group, and advanced tumors showed a higher UCAD chr7 score than stage T1abN0M0 tumors. Tumor biomarkers were almost entirely negative in all 8 samples. There was no obvious correlation between UCAD chr7 score and patient age, histological subtype or molecular subtype (EGFR/KRAS/ALK/ROS1/PIK3CA status), nor between UCAD chr7 score and smoking status.
The reads were mapped to the human reference genome hg19. Genomic coverage was then counted using samtools mpileup. We then calculated the average coverage for each 200 kb bin. The circular binary segmentation algorithm was then used to detect significant genomic breakpoints.
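The binning and breakpoint steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline. It reads "200k" as 200,000 bp, it averages only over positions that carry a depth value, and `best_split` is a single least-squares split standing in for circular binary segmentation (it has no permutation test and finds at most one breakpoint). All function names are hypothetical.

```python
from statistics import mean

BIN_SIZE = 200_000  # assumption: "200k" means 200,000 bp


def bin_coverage(depths, bin_size=BIN_SIZE):
    """Average per-base depth in fixed-size bins for one chromosome.

    depths: iterable of (position, depth) pairs with 1-based positions,
    for example parsed from per-position pileup output.
    Returns {bin_index: mean depth over the positions seen in that bin}.
    """
    sums, counts = {}, {}
    for pos, depth in depths:
        b = (pos - 1) // bin_size
        sums[b] = sums.get(b, 0) + depth
        counts[b] = counts.get(b, 0) + 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


def best_split(values):
    """Index k that best splits values into values[:k] and values[k:].

    The best split minimises the summed squared deviation of each side
    from its own mean. Simplified stand-in for circular binary
    segmentation: one breakpoint only, no significance test.
    """
    best_k, best_cost = None, float("inf")
    for k in range(1, len(values)):
        left, right = values[:k], values[k:]
        left_mean, right_mean = mean(left), mean(right)
        cost = sum((v - left_mean) ** 2 for v in left)
        cost += sum((v - right_mean) ** 2 for v in right)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


if __name__ == "__main__":
    # Hypothetical normalised bin coverage along one chromosome:
    # three bins with elevated coverage followed by four baseline bins.
    toy_bins = [1.1, 1.1, 1.1, 1.0, 1.0, 1.0, 1.0]
    print(best_split(toy_bins))
```

In a real analysis the split would be applied recursively and each candidate breakpoint tested for significance, which is what the DNAcopy package provides.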
As shown in Figures 1 and 2, chromosomal breakpoints were commonly found in centromere regions (vertical dashed lines). The short arm of chromosome 7 generally showed higher coverage than the long arm, indicating short-arm copy gains. Much lower chromosomal imbalance was found in non-tumor controls.

..., CA153, CYFRA21-1, CA50, CA199, CA72-4 and SCC. Four of 12 patients (33.3%) had a mild increase in CYFRA21-1, and one patient (8.33%) had a mild increase in CEA. The other 7 were negative for all 13 tumor biomarkers. As shown in Table 3, UCAD chr7p gain is an independent predictor of lung malignancy.

Discussion
All 8 cancer samples had a UCAD chr7 score larger than 0.0020. However, 3 of them did not reach statistical significance. A potential reason is variation in coverage across the chromosome due to PCR amplification bias, sequencer bias, etc. PCR efficiency differs with sequence composition; for example, high-GC regions might be preferred by specific PCR reactions. Similarly, for Illumina sequencers, GC bias is considered the major cause of coverage bias. In addition to GC content, the secondary structure of cfDNA fragments might be another major factor contributing to coverage bias. cfDNA fragments are around 170 base pairs long, which allows complicated secondary structure to form. Hence, PCR reaction conditions need to be further optimized for chromosome 7 copy number detection.
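GC-related coverage bias of the kind discussed here is often reduced computationally as well as experimentally. The Python sketch below shows one common approach, scaling each bin by the median coverage of bins with similar GC content. It is offered only as an illustration; the study does not state that such a correction was applied, and the function name, the 5% GC stratum width and the toy values are assumptions.

```python
from statistics import median


def gc_correct(coverage, gc_fraction, stratum_width_pct=5):
    """Scale bin coverage so every GC stratum shares the overall median.

    coverage: per-bin coverage values.
    gc_fraction: per-bin GC content as a fraction between 0 and 1.
    Bins are grouped into strata of stratum_width_pct percentage points.
    """
    def stratum(gc):
        return int(round(gc * 100)) // stratum_width_pct

    overall = median(coverage)
    groups = {}
    for cov, gc in zip(coverage, gc_fraction):
        groups.setdefault(stratum(gc), []).append(cov)
    stratum_median = {key: median(vals) for key, vals in groups.items()}

    corrected = []
    for cov, gc in zip(coverage, gc_fraction):
        m = stratum_median[stratum(gc)]
        # Leave the bin unchanged if its stratum has no usable coverage.
        corrected.append(cov * overall / m if m > 0 else cov)
    return corrected


if __name__ == "__main__":
    # Hypothetical bins: the two high-GC bins have inflated coverage.
    toy_coverage = [10, 12, 20, 24]
    toy_gc = [0.40, 0.41, 0.55, 0.56]
    print(gc_correct(toy_coverage, toy_gc))
```

After correction, bins that differed only because of GC content end up at comparable coverage, so residual differences between the short and long arm are less likely to reflect amplification bias alone.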
Copy number variations detected in non-cancer patients
Minor chromosomal changes were found in non-cancer patients, which might indicate background copy number variation among non-cancer individuals. According to previous reports, slight chromosome aneuploidy might be found in pre-cancerous lesions. To establish the baseline across non-tumor individuals, a large-scale baseline study might be needed.
We also compared the UCAD chr7 score with traditional lung cancer tumor biomarkers, such as CEA and SCC. Traditional tumor biomarkers were rarely positive in the current dataset. In comparison, all tumor patients showed a higher UCAD chr7 score than control samples. The results suggest that the UCAD chr7 score might be another useful non-invasive peripheral biomarker for monitoring lung cancer. The UCAD chr7 score could even be used to screen for cancer as a complement to LDCT and tumor biomarkers.
In this research, the discovery was made in a small-scale prospective study including 8 cancer patients and 4 healthy controls. Although the preliminary data are encouraging, a large prospective clinical trial is still needed to confirm the discovery.