Size profile of cell-free DNA: A beacon guiding the practice and innovation of clinical testing

Cell-free DNA (cfDNA) has pioneered the development of noninvasive prenatal testing and liquid biopsy, and its emerging applications include organ transplantation, autoimmune diseases, and many other disorders. The size profile of cfDNA is a crucial biological property and is essential for its clinical applications; therefore, a thorough understanding of the characteristics and potential applications of the cfDNA size profile is needed. Methods: Based on recent research, we summarized the size profile of cfDNA in pregnant women, tumor patients, transplant recipients, and systemic lupus erythematosus (SLE) patients to explore the common features. We also summarized the applications of the size profile in the pre-analytical phase, in the analytical phase for novel assays, and in the preparation of quality control materials (QCMs). Results: The size profile of cfDNA shared common features across populations and was distributed in a “ladder” pattern with a dominant peak at ~166 bp. However, cfDNA showed slightly different characteristics depending on its tissue of origin. The dominant peaks of fetal and maternal cfDNA fragments in pregnant women were at 143 bp and 166 bp, respectively. The plasma cfDNA in tumor patients, transplant recipients, and SLE patients had a peak at around 166 bp. In the pre-analytical phase, the size profile served as a vital indicator of specimen eligibility, thus supporting the successful implementation of assays. More importantly, in the analytical phase the size profile had the potential to be used to enrich short fragments, calculate fetal fraction, detect fetal abnormalities, and predict tumor progression, and it could also guide the preparation of QCMs. Conclusions: This review summarizes the characteristics and potential applications of the cfDNA size profile, giving clinical researchers a basis for developing novel assays and extending the application of cfDNA.


Introduction
Cell-free DNA (cfDNA) was first discovered in human serum and subsequently extracted from urine, cerebrospinal fluid, and pleural fluid in the past few decades [1,2]. The cfDNA derived from fetal and tumor tissues has greatly facilitated the development of noninvasive prenatal testing (NIPT), liquid biopsy, and other potential applications, thus holding promise for noninvasive detection of fetal abnormalities or tumor characterization at an early stage. Recently, cfDNA has been shown to be non-randomly fragmented and to have a specific pattern of nucleosome distribution with associated preferred end signatures [3,4]; therefore, a comprehensive understanding of the features, mechanisms, and potential use of the size profile of cfDNA is a promising scientific area.

Ivyspring International Publisher
As a crucial biological property of cfDNA, the size profile has been assessed by a variety of methods, such as gel electrophoresis, atomic force microscopy (AFM), quantitative real-time PCR (qPCR), and massively parallel sequencing (MPS); the last two are relatively robust and precise (Figure 1) [5]. Another approach to evaluating the size profile is DNA integrity, which is measured by qPCR with long and short amplicons (e.g., >300 bp and <100 bp) [6] and is calculated as the ratio of the number of long to short DNA fragments. Various studies have demonstrated that fetal cfDNA is shorter than maternal cfDNA in pregnant women and, therefore, that the size of fetal cfDNA can be used to detect fetal abnormalities. The measured size profile in cancer patients differs slightly depending on the method, the type and stage of the tumor, and the position of the cfDNA [7,8], and should be interpreted carefully. In addition, the size profile of cfDNA has been widely applied in several fields, including quality control in laboratory practice, enrichment of short DNA fragments, the determination of the fetal fraction (FF) and detection of fetal abnormalities in NIPT, the prediction of tumor progression in liquid biopsies, and the assessment of allograft damage in transplantation. Hence, a summary of the cfDNA size profile in different populations and of its applications is required to demonstrate the significance of cfDNA.
In this review, we first discuss and summarize the size profile of cfDNA and its underlying mechanism in different populations. We then focus on the applications of the size profile in the pre-analytical and analytical phases in the laboratory and on its guidance in the preparation of quality control materials (QCMs). We believe that a thorough understanding of the size profile of cfDNA and its applications could help assure reliable results in clinical practice and provide valuable information for the further development of assays.
Figure 1. The plasma cfDNA in pregnant women contains fetal and maternal cfDNA, primarily derived from fetal tissues and the maternal hematopoietic system. Similarly, the plasma cfDNA in tumor patients contains tumor and non-tumor cfDNA. The plasma cfDNA in transplant recipients contains donor-derived and recipient-derived cfDNA, and that in SLE patients contains cfDNA from lupus erythematosus tissue and hematopoietic cfDNA. These cfDNA fragments, which are typically bound to histones or transcription factors, are released into the peripheral blood. After extraction, the size profile of cfDNA fragments can be assessed by electrophoresis, atomic force microscopy, qPCR with different amplicons, and sequencing, which produce different forms of results to represent the size profile of cfDNA. qPCR: quantitative real-time PCR; SLE: systemic lupus erythematosus.

Pregnant women
The analysis of the size profile of fetal cfDNA in maternal plasma is relatively straightforward because fetal and maternal cfDNA fragments are easy to distinguish by using specific approaches: (1) The SRY gene, which is located only on chromosome Y, can be used to analyze fetal cfDNA in the plasma of a pregnant woman with a male fetus [9]. (2) The methylation status of specific genes is a distinguishing marker. For example, the CpG sites of the SERPINB5 gene promoter are hypomethylated in placental tissues but almost completely methylated in maternal blood cells; the former is the source of fetal cfDNA while the latter releases maternal cfDNA [10].
With the recent developments in detection technologies, analysis of the size profile of cfDNA fragments is becoming more precise (Table 1). Li et al. first found that fetal cfDNA was shorter than maternal cfDNA; using fluorescent PCR, the former was usually <300 bp while the latter was >1000 bp [11]. The authors concluded that fetal cfDNA could be enriched by size selection with a length threshold of ~300 bp [11], which was significant for the development of NIPT. However, subsequent studies have shown that the lengths of fetal and maternal cfDNA reported by Li et al. were not accurate. Chan et al. reported that the plasma cfDNA in pregnant women mainly ranged between 145-201 bp by qPCR using different amplicons [9]. By paired-end sequencing, cfDNA in pregnant women was found to have a dominant peak at around 162 bp and a minor peak at around 340 bp [12]. Fetal cfDNA identified by chromosome Y sequences was rarely longer than 250 bp and was mostly <150 bp [12]. Another sequencing study reported similar results [13], confirming the feasibility of separating fetal cfDNA from maternal plasma cfDNA based upon the size profile, although the length threshold for separation is shorter than ~300 bp. At present, MPS makes the analysis of the cfDNA size profile more unequivocal; fetal cfDNA was reported to have a peak at approximately 143 bp with a strong 10 bp periodicity, while the dominant peak of maternal cfDNA was at approximately 166 bp and its 10 bp periodicity was extremely weak [14,15], which is a convincing current view of the size of fetal and maternal cfDNA. It is of note that long cfDNA fragments exist in healthy individuals, whereas next-generation sequencing can only detect cfDNA fragments of <1000 bp, and qPCR also cannot detect long cfDNA fragments [5]. Therefore, analyzing the entire size profile of cfDNA by either MPS or qPCR alone is insufficient in healthy individuals, especially pregnant women.
A recent study using nanopore sequencing found that maternal plasma cfDNA fragments ranged from 76 to 5776 bp and that approximately 0.06%-0.3% of the cfDNA fragments were longer than 1000 bp, which could not be detected by MPS or qPCR [16]. Because the application of cfDNA mainly relies on fragments of <1000 bp, MPS is still an effective approach to accurately evaluate the size profile of cfDNA fragments despite this unavoidable limitation.
The mechanism underlying the size difference between maternal and fetal cfDNA has been analyzed in recent studies. The distribution of nucleosomal DNA should be mentioned first. The nucleosome consists of a core, linker histones, and linker DNA. The nucleosome core is composed of an octamer of four types of core histone proteins wrapped by 147 bp of DNA, and the linker DNA has a mean size of 20 bp and ranges between 0-80 bp [15] (Figure 2A). It has been reported that fetal cfDNA is mainly cleaved at the border of or within the nucleosome core, whereas maternal cfDNA is mostly cut within the linker region [4]; hence, the difference lies in the trimming of the ~20 bp linker DNA (Figure 2B). Also, the 10 bp periodicity was reported to be related to the structural periodicity of the helical repeat of the DNA double helix [15]. The main cause of the different cleavage sites in maternal and fetal genomes lies in different nucleosome structures regulated by DNA methylation and histones [17]. Fetal cfDNA has been shown to originate only from hypomethylated placental tissues, while maternal cfDNA is derived from methylated hematopoietic and hepatic tissues [18][19][20]. Hypomethylated DNA tends to be less densely associated with histones and is more accessible to enzymatic degradation [21][22][23]; therefore, the nucleosome cores in placental cells are more easily cleaved by enzymes, which makes fetal cfDNA shorter than maternal cfDNA.
The cleaving enzymes in vivo are complex; the caspase-3-dependent enzyme and Dnase1l3 may play an important role [24,25]. It was reported that the deletion of Dnase1l3 resulted in an increase in cfDNA fragments <120 bp and in multi-nucleosomal cfDNA molecules; however, the changes involved only a small fraction of DNA molecules [25]. A recent study found different effects of enzymes in the generation of cfDNA, including the caspase-3-dependent enzyme, Dnase1l3, and Dnase1 [26], which revealed the dynamic generation mechanism of cfDNA through in vitro experiments for the first time.

Tumor patients
The cfDNA in tumor patients is a mixture of cfDNA derived from tumor and non-tumor tissues, and distinguishing them is challenging. Nevertheless, two main methods have been developed: (1) The xenografted mouse model can be used to differentiate between human tumor and non-tumor cfDNA fragments [27]. (2) Copy number variations (CNVs) involving a whole or a large part of a chromosome arm are relatively common in tumors; for a chromosome arm with amplification, the contribution of tumor cfDNA to plasma would increase, whereas for a chromosome arm with deletion, its contribution would decrease. Thus, the chromosome arm-level z-score analysis (CAZA) approach developed by Lo et al. could infer tumor cfDNA based upon the information on CNVs in specific tumors [5,28].
Recently, the size profile of cfDNA in tumor patients has been depicted more precisely by using advanced approaches (Table 1). By using AFM, 80% of cfDNA in colorectal cancer patients was found to be <145 bp [29]. Mouliere et al. showed by qPCR that cfDNA with KRAS mutation was more fragmented than wild-type cfDNA in colorectal cancer patients [6]. Also, by using MPS, it was reported that short cfDNA fragments preferentially carried the tumor-associated aberrations in hepatocellular carcinoma patients [28]. Therefore, tumor cfDNA was considered to be shorter than non-tumor cfDNA. As analyzed precisely by MPS, the size profile of cfDNA in tumor patients peaked mainly at 166-168 bp, with smaller peaks at a periodicity of 10 bp in the range of 40-166 bp [27,28,30] (Figure 2C).
The size of the cfDNA varies subtly with the methods, types and stages of tumors, and positions of cfDNA. With respect to various methods, using a cursory agarose gel electrophoresis, cfDNA size in tumor patients was determined to have "ladder" patterns with the main length of approximately 180 bp [31]. MPS and qPCR were considered relatively robust approaches to analyze size profiles; however, using these methods, the precise size of the main peak was subtly different. The tumor cfDNA was mainly <150 bp using qPCR in xenografted mice; for instance, human cfDNA from hepatocellular carcinoma and glioblastoma was mainly at 134-144 bp and the background of mice cfDNA was distributed at 167 bp [7,27]. However, compared with qPCR, cfDNA in tumor patients measured by MPS was found to be longer and had a prominent peak at 166 bp [28]. Another compelling study performed whole-genome sequencing (WGS) of cfDNA in plasma, indicating that the median overall size of cfDNA in tumor patients was around 163.8 bp [32]. The reason for this variation between the two methods is unclear. A crucial difference could be that the molecules that are short or degraded with nicks in either strand can be effectively recovered through the qPCR procedures, but not readily detectable by double-stranded DNA library construction of sequencing [30,33]. The size of tumor cfDNA in plasma varies in different types and stages of tumors. It was reported that the size of cfDNA in patients with colorectal, ovarian, breast, head and neck, and melanoma cancers was shorter than that in patients with renal, glioblastoma, and bladder cancers because of the higher concentration of tumor cfDNA [34,35]. Furthermore, the cfDNA was observed to be around 176.5 bp in locally advanced pancreatic cancer patients, which was longer than 167 bp found in metastatic patients [8]. 
Similar observations were made in breast cancer as the short cfDNA was more frequent in the plasma of metastatic than early-stage cancer patients [36]. With regard to cfDNA position, analysis of cfDNA from sources other than plasma would be valuable in specific tumors, such as cerebrospinal fluid (CSF) from brain cancer patients and pleural effusion from lung cancer patients [37,38]. In a recent study, the cfDNA of a size <150 bp was reported in more than 50% of CSF samples in glioma patients, but in less than 20% of plasma samples, and the size of cfDNA in urine was the shortest compared with CSF and plasma [39]. Besides, seminal cfDNA fragments longer than 1000 bp were reported to be more common in prostate cancer patients than in healthy controls [40]. These results indicated different subnucleosomal fragmentation patterns of cfDNA by other nucleases in the CSF, urine, and seminal fluid.
Genome-wide hypomethylation is frequently observed in tumors, such as breast, hepatocellular carcinoma, nasopharyngeal, neuroendocrine, and lung tumors [41][42][43]. Thus, the hypomethylation of tumor tissues may lead to higher accessibility to the enzymatic degradation and the shorter tumor cfDNA fragments, which is similar to the production mechanism of fetal cfDNA in pregnant women.

Transplantation and systemic lupus erythematosus (SLE) patients
Recent reports have indicated that cfDNA in plasma derived from the non-hematopoietic system was shorter than that from the hematopoietic system, and a 10 bp periodicity in size below approximately 143 bp could be observed in both cases [44] (Figure 2C). Therefore, in recipients of non-hematopoietic tissue transplants, such as liver transplants, the donor-derived cfDNA, which was released mainly from the donated organ, was <150 bp and shorter than the recipient-derived cfDNA, which was from the hematopoietic system [44,45]. A recent study of sex-mismatched liver transplant recipients, which used a Y chromosome capture methodology, reported that donor-derived cfDNA was shorter than recipient-derived cfDNA [46]. The situation was the opposite in hematopoietic transplant recipients because the recipient-derived cfDNA originated mainly from the non-hematopoietic tissues while the donor-derived cfDNA was from the hematopoietic system.
The size profile of cfDNA in SLE patients is also important. In active SLE patients, the height of the 166 bp peak was reduced and short fragments were elevated, especially molecules <115 bp, which could contribute to more than 84% of the total cfDNA in plasma [22] (Figure 2C). It is of note that the genome-wide methylation densities in SLE groups showed significant reductions compared with those in healthy individuals (70.1% vs. 74.3%, p<0.05) [22]; thus, the production mechanism of short cfDNA in SLE was similar to that in pregnant and tumor populations.

Application of size profile of cfDNA
Quality control in the pre-analytical phase
The size profile of cfDNA has a specific pattern in each population, and deviations from this pattern, such as an increased or decreased size profile, indicate unqualified specimens in the pre-analytical phase. The DNA integrity based on qPCR with amplicons of <80 bp and >250 bp is the recommended method for quality control in the pre-analytical phase [47] and ranges from 0.3 to 0.8 in healthy individuals. Increased DNA integrity might indicate contamination by the buffy coat, and decreased DNA integrity (<0.1 assessed by qPCR with amplicons of 300 bp/60 bp in healthy individuals) implies degraded samples [29,48] (Figure 3).
Figure 3. (A) In the pre-analytical phase, the size profile is a crucial indicator to evaluate the eligibility of cfDNA. Improper isolation of plasma leads to contamination with gDNA and an increased size profile or DNA integrity. Improper preservation of plasma and improper extraction of cfDNA lead to DNA degradation and a decreased size profile or DNA integrity. (B) In the analytical phase, the size profile can be applied to enrich short fragments to increase the proportion of short cfDNA. As fetal cfDNA is shorter than maternal cfDNA, the FF is linearly dependent on the size ratio; FF can be deduced by calculating the size ratio of the number of short to long cfDNA fragments. Also, the size profile of cfDNA has the potential to detect fetal abnormalities, predict tumor progression, and predict allograft damage. (C) Size profile is vital to assess the reliability and similarity of the quality control materials. Samples collected in the clinic, random DNA fragments induced by physical shear, and specific DNA fragments cleaved by enzymes are identical, weakly similar, and highly similar, respectively, to the real samples in size profile. FF: fetal fraction.

Increased size profile of cfDNA
The proper size profile of cfDNA after extraction should have a dominant peak at around 146-166 bp with no contamination by genomic DNA (gDNA). An increase in the size profile is mainly due to leukocyte lysis, which releases gDNA and increases the proportion of long fragments. Improper selection of specimen types and improper separation of plasma are the main reasons. First, plasma is a better sample type than serum for cfDNA detection because clotting during the preparation of serum significantly increases the observed size of cfDNA [48]. Second, the improper selection of collection tubes can lead to contamination by gDNA [49]. Last, the plasma isolation procedure should also be of concern because delayed separation of plasma and low-speed protocols are associated with gDNA contamination [50]. Several measures have been proposed to avoid this phenomenon. For the selection of specimen types, plasma should be preferred [48], and cell-stabilizing tubes should be used whenever possible; for instance, cfDNA was reported to be more stable in Streck tubes than in BD Vacutainer K2EDTA tubes [51]. More importantly, blood tubes should be placed upright to reduce hemolysis, sloshing should be avoided, and temperature fluctuations should be minimized [52]. During the isolation phase, blood samples should be stored at 4°C and be processed within 4 h after collection [53]. If the isolation needs to be delayed, the samples should be stored at 4°C in a K2EDTA tube for one day [54]. In addition, a two-step, high-speed plasma separation procedure is required; the recommended protocol is 1600 × g for 10 min at 4°C followed by 16000 × g for 10 min at 4°C [47]. Also, the plasma should be removed carefully after the first centrifugation to avoid contamination by the buffy coat [47].

Decreased size profile of cfDNA
A decrease in the size profile is mainly due to the degradation of cfDNA in the samples. After the collection of blood specimens, repeated freeze-thaw cycles of plasma reduce the DNA integrity [48]. Nuclease contamination is also a non-negligible factor that influences the size profile of cfDNA [47]. During cfDNA extraction, methods differ in their ability to bind small cfDNA fragments, resulting in a bias in the extracted DNA fragments. Extraction methods are generally based on columns, magnetic particles, or precipitation. It has been reported that precipitation-based methods generally cause relatively less DNA fragmentation [55]. To standardize the preservation and extraction process, Meddeb et al. suggested that plasma should be preserved for the final experimental purpose without repeated freeze-thaw cycles [47]. Appropriate selection of an extraction kit is essential [56], and the cfDNA extracts should be stored at -20°C or -80°C to reduce degradation [47]. If the DNA integrity is calculated to be less than 0.1 in healthy individuals, the cfDNA in the sample is considered to be degraded [34] and cannot be used for subsequent experiments.
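The DNA integrity check described in this section can be sketched in a few lines. This is a minimal illustration, not a validated laboratory procedure: the function names are hypothetical, the thresholds (0.3-0.8 as the expected range, <0.1 as degraded) are simply the values quoted above, and those values were reported for different amplicon pairs, so a real laboratory would need to establish its own cut-offs for its own assay.

```python
def dna_integrity(long_amplicon_qty: float, short_amplicon_qty: float) -> float:
    """Ratio of long-amplicon to short-amplicon qPCR quantities."""
    if short_amplicon_qty <= 0:
        raise ValueError("short-amplicon quantity must be positive")
    return long_amplicon_qty / short_amplicon_qty


def classify_sample(integrity: float,
                    degraded_below: float = 0.1,
                    expected_range: tuple = (0.3, 0.8)) -> str:
    """Map a DNA integrity value to a coarse pre-analytical verdict.

    The default thresholds are assumptions taken from the text above.
    """
    low, high = expected_range
    if integrity < degraded_below:
        return "degraded"
    if integrity > high:
        return "possible gDNA contamination"
    if integrity < low:
        return "below expected range"
    return "within expected range"


# Toy quantities (arbitrary units), invented for illustration only.
print(classify_sample(dna_integrity(500, 1000)))  # integrity 0.5
print(classify_sample(dna_integrity(50, 1000)))   # integrity 0.05
```

Keeping the thresholds as parameters rather than constants reflects the point made above that the reference values depend on the amplicon lengths used.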

Assay innovations in the analytical phase
Size-based enrichment for cfDNA
Low levels of fetal or tumor cfDNA often result in detection failure in NIPT or liquid biopsy, so the enrichment of short fragments is vital in the workflow. Generally, short fragments can be selected before or after the sequencing phase. Before sequencing, PCR analysis based on short and long amplicons is an efficient approach. Short amplicons can be amplified from both long and short cfDNA fragments, whereas long amplicons can be amplified only from long fragments; therefore, using shorter amplicons captures more of the fragments of interest. Compared to conventional assays, the amount of fetal cfDNA was almost 1.6 times higher when short amplicons of 50 bp were used [57]. In another report, when PCR was performed with short amplicons of 64 bp, the FF was successfully increased from 18% to 38% [43]. In addition, single-stranded library preparation and hybrid-capture (SLHC) is also a robust selection approach before sequencing. Some cfDNA fragments in tumor patients are degraded with nicks in the strand and cannot be captured by the conventional method. SLHC first denatures cfDNA into single-stranded fragments after end repair, and library construction is then performed on the single-stranded fragments, including the short degraded cfDNA [33,58]. It has been shown that SLHC could efficiently recover and enrich short cfDNA fragments (<100 bp) and raise the detection rate from 45% to 75% [33]. After sequencing, in silico size selection can be applied during the read-pair positioning process. Once the unprocessed sequencing reads were mapped to the reference, the method selected the reads of interest ranging from 90 bp to 150 bp, leading to a 2-fold enrichment in 95% of tumor patients [35]. The selection of short cfDNA increases the relative abundance of fetal or tumor fragments at the cost of a potential loss of long fragments, a limitation that has not been fully discussed in the literature.
In summary, the size profile plays an indispensable role in the enrichment, thereby increasing the level of cfDNA of interest and decreasing the rate of detection failure.
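The in silico size selection step mentioned above amounts to a length filter on mapped fragments. The sketch below assumes fragments are already available as (chromosome, start, end) tuples with half-open coordinates; the function name and the toy fragments are hypothetical, and the 90-150 bp window is the one quoted in the text. It does not reproduce the pipeline of the cited study.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates, so length = end - start.
Fragment = Tuple[str, int, int]


def select_by_size(fragments: List[Fragment],
                   min_len: int = 90,
                   max_len: int = 150) -> List[Fragment]:
    """Keep only fragments whose length lies in [min_len, max_len] bp."""
    return [f for f in fragments if min_len <= f[2] - f[1] <= max_len]


# Toy fragments, invented for illustration only.
fragments = [
    ("chr1", 1000, 1166),  # 166 bp, typical mono-nucleosomal length
    ("chr1", 2000, 2143),  # 143 bp
    ("chr2", 500, 620),    # 120 bp
    ("chr2", 900, 1240),   # 340 bp, di-nucleosomal
    ("chr3", 100, 180),    # 80 bp, below the window
]
selected = select_by_size(fragments)
print([end - start for _, start, end in selected])
```

In practice the fragment coordinates would come from properly paired alignments, and the discarded long fragments are exactly the loss noted at the end of the preceding paragraph.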

Same cleavage pattern
To explore the specific end coordinates of cfDNA fragments, the windowed protection score (WPS) can be used. WPS refers to the number of cfDNA fragments that have no endpoints in a given 120 bp genome window minus those with endpoints in that window [18]. WPS analysis shows that cfDNA fragments have a stable cleavage pattern, with the endpoints clustering densely adjacent to the boundary of the nucleosome core or within the linker region [3,18]. Therefore, cfDNA fragments have a series of preferred genomic coordinates, but these sites on the sides of nucleosomal DNA are limited, so some fragments share the same length and breakpoints. That is, cfDNA in plasma is naturally duplicated to a certain degree (Figure 2D). Moreover, artificially duplicated cfDNA fragments are introduced by PCR during library construction [59], and background base errors are generated, which is a serious barrier to liquid biopsy. Another severe barrier is the low level of mutant tumor cfDNA in patients [60]. In the sequencing phase, duplicate cfDNA fragments are removed to reduce errors; because naturally occurring duplicates are not specially tagged, they are removed as well, which exacerbates the low concentration of cfDNA and leaves the level in many patients much lower than the detection threshold.
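The WPS definition above can be made concrete with a small sketch. It rests on one stated assumption: "fragments that have no endpoints in the window" is read here as fragments that completely span the window, so fragments lying entirely outside the window are ignored. The function name, the inclusive coordinate convention, and the toy fragments are hypothetical.

```python
from typing import Iterable, Tuple


def windowed_protection_score(fragments: Iterable[Tuple[int, int]],
                              center: int,
                              window: int = 120) -> int:
    """WPS at one position on a single chromosome.

    fragments: (start, end) pairs giving the first and last base of each
    fragment, inclusive. The score is the number of fragments spanning the
    whole window minus the number with at least one endpoint inside it.
    """
    half = window // 2
    w_start, w_end = center - half, center + half
    spanning = 0
    with_endpoint = 0
    for start, end in fragments:
        if start < w_start and end > w_end:
            spanning += 1
        elif w_start <= start <= w_end or w_start <= end <= w_end:
            with_endpoint += 1
        # Fragments entirely outside the window contribute nothing.
    return spanning - with_endpoint


# Toy fragments, invented for illustration; the window is [140, 260].
toy = [(100, 266), (120, 280), (180, 340), (400, 560)]
print(windowed_protection_score(toy, center=200))
```

Positions protected by a nucleosome would be expected to give high scores, while positions near preferred cleavage sites give low or negative ones.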
Based on the cleavage pattern and characteristics of cfDNA, Newman et al. recognized the importance of identifying naturally duplicated cfDNA. They designed the molecular barcode technology to add a unique molecular index (UMI) at the ends of the cfDNA fragments, where the UMI is sufficiently diverse to ensure that each cfDNA molecule can be labeled differently [59]. Naturally repeated DNA fragments are not removed when filtering the errors because of the presence of barcodes, leading to a 15-fold reduced error rate and an improvement of error-free regions from 90% to 98% [59]. Consequently, cfDNA fragments with the same size and endpoints indicate naturally repeated fragments, guiding the utilization of unique barcodes during de-duplication.
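The effect of barcodes on de-duplication can be illustrated with a toy example. This is only a sketch of the general idea and not the method of Newman et al.: the function name, the read tuples, and the barcode strings are invented, and real pipelines also tolerate sequencing errors in the barcode and build consensus sequences within each family.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (chromosome, start, end, umi)
Read = Tuple[str, int, int, str]


def group_into_families(reads: Iterable[Read], use_umi: bool = True) -> Dict[tuple, int]:
    """Group reads into duplicate families and count the reads in each.

    Without UMIs, reads sharing chromosome, start and end collapse into one
    family; with UMIs, the barcode is part of the key, so molecules that
    share endpoints but carry different barcodes stay separate.
    """
    families: Dict[tuple, int] = defaultdict(int)
    for chrom, start, end, umi in reads:
        key = (chrom, start, end, umi) if use_umi else (chrom, start, end)
        families[key] += 1
    return dict(families)


# Toy reads: two distinct molecules share the same endpoints on chr1
# (a natural duplicate); one of them was also copied once by PCR.
reads = [
    ("chr1", 1000, 1166, "ACGT"),
    ("chr1", 1000, 1166, "ACGT"),  # PCR copy of the read above
    ("chr1", 1000, 1166, "TTAG"),  # different molecule, same endpoints
    ("chr2", 500, 643, "GGCA"),
]
print(len(group_into_families(reads, use_umi=False)))
print(len(group_into_families(reads, use_umi=True)))
```

Coordinate-only de-duplication merges the two chr1 molecules into one, whereas the barcode-aware grouping keeps the naturally duplicated molecule while still collapsing the PCR copy.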

Calculation of FF
The FF calculation is an indispensable part of NIPT, and several calculation methods based on chromosomes Y and X, SNPs, and SeqFF have been developed recently [61]; size-based and nucleosome track-based approaches can also be used to calculate FF. As fetal cfDNA is shorter than maternal cfDNA, the size-based method calculates the relative proportions of short and long cfDNA fragments (100-150 bp and 163-169 bp, respectively) to determine the FF [13]. The FF thus derived is highly consistent with that determined from Y chromosome sequences (r=0.827, p<0.0001). In summary, the size-based method requires only shallow-depth sequencing of maternal plasma DNA and is moderately accurate in conventional NIPT.
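The size-based FF estimate can be sketched as a size ratio followed by a linear calibration. The short and long windows are those quoted above; everything else is an assumption for illustration: the function names are hypothetical, the calibration points are invented and deliberately exactly linear, and in practice the line would be fitted on samples whose FF is known from an independent method such as chromosome Y reads.

```python
from typing import Sequence, Tuple


def size_ratio(lengths: Sequence[int],
               short: Tuple[int, int] = (100, 150),
               long: Tuple[int, int] = (163, 169)) -> float:
    """Number of short fragments divided by number of long fragments."""
    n_short = sum(short[0] <= x <= short[1] for x in lengths)
    n_long = sum(long[0] <= x <= long[1] for x in lengths)
    if n_long == 0:
        raise ValueError("no fragments in the long window")
    return n_short / n_long


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented calibration data: size ratios and reference FF values (%).
calib_ratios = [0.2, 0.3, 0.4]
calib_ff = [5.0, 10.0, 15.0]
slope, intercept = fit_line(calib_ratios, calib_ff)

# Toy fragment lengths for one sample: 2 short and 4 long fragments.
sample_lengths = [120, 140, 166, 165, 168, 167, 200, 90]
ratio = size_ratio(sample_lengths)
estimated_ff = slope * ratio + intercept
print(round(ratio, 3), round(estimated_ff, 2))
```

Because the estimate depends only on fragment lengths, it works regardless of fetal sex, which is one reason a size-based estimate complements the chromosome Y-based one.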
Another method of cfDNA size profile is the nucleosome track-based approach, which is based on the different start sites of sequence reads from fetal and maternal cfDNA fragments because only maternal cfDNA involves linker DNA [62]. In the sequencing data, the reads involving linker DNA could be clearly recognized by identifying those that start in the regions over 73 bp upstream and downstream of the nucleosome core [62]. Therefore, the frequency of reads involving linker DNA can be exploited to calculate the FF [14]. However, compared with the size-based method, the correlation between this method and the chromosome Y-based method is low (r=0.636). This method is cost-effective and does not rely on fetal gender, but the accuracy should be further developed in the future [61].

Application in the detection of aneuploidy
Traditional noninvasive prenatal detection of aneuploidy mainly includes chromosome count-based methods and single nucleotide polymorphism (SNP)-based methods [63]; the size-based method is also a potential and novel approach. In theory, when extra copies of a fetal chromosome are present, the relative proportion of fetal cfDNA derived from that chromosome increases, shortening the size profile of cfDNA from this chromosome in plasma. Therefore, the size-based method calculates the ratio of cfDNA fragments with short sizes (e.g., <150 bp) to all the sequenced fragments from the targeted chromosome in the sample, and compares it with the reference proportion in pregnant women carrying diploid fetuses to obtain a z-score [13]. If the cfDNA fragments of a chromosome in the sample are significantly shorter than expected (e.g., z-score >3), the risk of trisomy of this chromosome is higher [64]. As chromosome count analysis is the common method for detecting aneuploidy, combining count- and size-based z-scores should be an accurate and rigorous scheme. Zhang et al. combined the count-based method with size-based algorithms to obtain a more accurate z-score to facilitate the detection of fetal trisomy. When 180 cases were tested by this combined method, the sensitivity and specificity increased from 75% to 80% and from 98.86% to 99.43%, respectively, after the size-based correction at 100 bp [65]. Moreover, the sensitivity and specificity of the count-based z-score with the 130 bp size-based correction both reached 100%, which was better than with the 100 bp correction. Therefore, the combination of count- and size-based analysis would enhance the detection of fetal aneuploidy in NIPT [65,66]. However, the optimal size cut-off for the analysis still needs to be determined.
Also, the current method can detect fetal aneuploidy with a cut-off value of 3-4% FF [61]; therefore, lower FF plasma samples should be considered when confirming the performance of the size-based method in the future.
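The size-based z-score described in this section can be sketched as follows. The <150 bp cut-off and the z-score >3 flag are the example values given in the text; the function names and all numbers in the toy data are invented, and a real assay would build its reference from a large set of euploid pregnancies and would usually normalize further.

```python
from statistics import mean, stdev
from typing import Sequence


def short_fraction(lengths: Sequence[int], cutoff: int = 150) -> float:
    """Fraction of fragments from one chromosome shorter than the cut-off."""
    if not lengths:
        raise ValueError("no fragments supplied")
    return sum(x < cutoff for x in lengths) / len(lengths)


def size_z_score(sample_fraction: float,
                 reference_fractions: Sequence[float]) -> float:
    """z-score of the sample's short fraction against euploid references."""
    mu = mean(reference_fractions)
    sd = stdev(reference_fractions)  # sample standard deviation
    return (sample_fraction - mu) / sd


# Invented short-fragment fractions for the target chromosome in
# reference (euploid) pregnancies, and one test sample.
reference = [0.20, 0.21, 0.19, 0.20, 0.20]
z = size_z_score(0.23, reference)
print(round(z, 2), "flagged" if z > 3 else "not flagged")
```

A count-based z-score could be computed with the same `size_z_score` helper on read-count fractions, which is one simple way to place the two signals on a common scale before combining them.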

Application in the detection of CNVs
The principle of the size-based method of fetal aneuploidies can also be used in the detection of CNVs. Since the size of fetal cfDNA is shorter than that of the mother, the presence of fetal micro-deletion or micro-duplication would lengthen or shorten the size profile of cfDNA released from that chromosome in the maternal plasma [61]. As was the case with aneuploidy detection, after WGS or targeted sequencing, the ratio of cfDNA fragments with short sizes (e.g., <150 bp) of the target chromosome in the sample was calculated and compared with the ratio of reference to acquire the z-score [67]. The result indicated that the size-based algorithm correctly identified 17 out of 18 cases with CNVs ranging between 3-40 Mb [67]. The sensitivity and specificity have not been studied in large-scale experiments. The size-based method is feasible in theory but has not been widely verified in many studies. Therefore, at present, the combination of traditional and size-based methods is an effective and comprehensive approach to detect fetal CNVs.

Potential application in liquid biopsy
Tumor cfDNA with the mutation information was reported to have the potential to detect tumors and predict drug therapy [68,69]. Similarly, the size of cfDNA in tumor patients has been shown to be shorter in advanced tumor stages and metastasis, and can be applied to monitor the evolutionary dynamics and prognosis of tumors. It has been reported that the cfDNA size in healthy control samples was longer (mean 176.5 bp, range 168-185 bp) than that in local pancreatic cancer samples (mean 170 bp, range 167-173 bp, p=0.001), and was the shortest in metastatic patients (mean 167 bp, range 148-180 bp, p<0.001) [36], indicating that the size of cfDNA was highly related to the progression and metastasis of tumors. The data also indicated that a fragment size of <167 bp before treatment was significantly associated with shorter progression-free survival (p=0.002) and overall survival (p=0.001) [36]. Similarly, the ratio of short (50-166 bp) to large (167-250 bp) cfDNA fragments had a significant association with poor survival in renal cell carcinoma [70] as well as in hepatocellular carcinoma, prostate, and primary and metastatic breast cancer patients [28,37,71]. The analysis of cfDNA size profile in the early stages of the tumor by calculating the ratio of short to long fragments or calculating DNA integrity is a simple, rapid, and economical method to determine prognostic information. Analysis of size profile to predict the progression of the tumor has only been verified in some tumors, and it is worthwhile to evaluate other tumors. In conclusion, cfDNA size profile may be a potential biomarker for monitoring the prognosis of tumors.

Potential application in transplantation and other diseases
Accurate and early assessment of allograft damage is vital for the long-term survival of transplant patients. Recently, several prospective studies have shown that a high amount of graft-derived cfDNA is associated with allograft rejection in liver and kidney transplantation [72,73]. However, little information is available on the relationship between cfDNA size and allograft damage. Because the cfDNA derived from the graft is shorter than that from hematopoietic cells, an increased proportion of short cfDNA is speculated to herald allograft rejection. It was reported that a high ratio of short (105-145 bp) to long (160-170 bp) cfDNA fragments in liver transplantation pointed towards an early trend of allograft damage. Moreover, the assessment of allograft damage based on the ratio of short to long cfDNA fragments was highly consistent with that based on cfDNA quantified by chromosome Y (p<0.0001) and on routine liver function enzymes (p<0.0001) [46]. Therefore, size analysis of graft-derived cfDNA may be a potential approach to assessing allograft damage, although large-scale analysis and validation are needed. In addition, the quantity of cfDNA has been shown to be related to autoimmune diseases and myocardial infarction [74,75]. Additional studies on the relationship between the size profile of cfDNA and these diseases may be valuable.

Guidance in preparation of QCMs
Although recent advances have improved the performance of NIPT and liquid biopsy, these assays remain challenging because of the low concentrations of cfDNA, varied detection settings, and complex workflows. Thus, there is an urgent need for QCMs for proficiency testing and quality control [76,77]. Currently, several types of QCMs are available for cfDNA analysis, such as the clinical samples used by the College of American Pathologists in its proficiency testing programs, ultrasonically fragmented samples [78], and enzymatically digested samples [79]. The size profile of cfDNA is an important indicator for evaluating whether QCMs successfully simulate real samples (Figure 3).
Clinical samples reflect the characteristics of real cfDNA, but they are difficult to obtain. Random DNA fragmentation induced by physical shearing results in a broad size profile and a random distribution pattern, which differs from that of real cfDNA and does not reflect its true signature. In contrast, QCMs for tumor cfDNA prepared by MNase digestion had a dominant peak at 147 bp, which successfully simulated real tumor cfDNA in vitro [79]. Furthermore, based on the differences in size and cleavage sites between maternal and fetal cfDNA, the QCMs for NIPT digested by DNA fragmentation factor (DFF) and MNase contained a mixture of DNA with dominant peaks at 162 bp and 146 bp, respectively, which successfully simulated the maternal and fetal cfDNA fragments [24,80]. Therefore, the size profile is a vital indicator for assessing the eligibility of QCMs. As the mechanisms underlying cfDNA generation are further explored, these materials can be prepared to resemble real samples more closely.
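A simple way to illustrate such a check is to locate the dominant peak of a QCM size distribution and compare it with the peak expected for the material being simulated. The sketch below is an assumption-laden example: the function names, the tolerance of 5 bp, and the fragment sizes are hypothetical, and a real evaluation would compare the whole distribution rather than a single peak.

```python
from collections import Counter


def dominant_peak(sizes):
    """Most frequent fragment size in bp; ties resolve to the smaller size."""
    if not sizes:
        raise ValueError("no fragment sizes supplied")
    counts = Counter(sizes)
    return min(counts, key=lambda s: (-counts[s], s))


def peak_matches(sizes, expected_bp, tolerance_bp=5):
    """True if the dominant peak lies within `tolerance_bp` of `expected_bp`.

    The tolerance is a hypothetical acceptance criterion, not a published one.
    """
    return abs(dominant_peak(sizes) - expected_bp) <= tolerance_bp


# Illustrative sizes for an MNase-digested QCM (bp).
qcm_sizes = [146] * 5 + [147] * 8 + [160] * 2 + [300]
print(dominant_peak(qcm_sizes))      # 147
print(peak_matches(qcm_sizes, 147))  # True
```

A sonicated sample with a broad, random size distribution would typically show no sharp mode, which is one reason a single-peak comparison should be treated only as a first screening step.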

Conclusions
The past few decades have witnessed rapid progress in the understanding of the cfDNA size profile and its application in precision medicine [79]. The size profile of cfDNA is widely applied in NIPT, liquid biopsy, and the preparation of QCMs in the laboratory. In the pre-analytical phase, the size profile serves as a vital indicator for evaluating the eligibility of specimens and ensuring the successful implementation of experiments. In the analytical phase, the size profile can be applied to the enrichment of short fragments, the calculation of FF, the detection of fetal abnormalities, and the prediction of tumor progression and of graft rejection in transplantation. It also plays an important role in evaluating the similarity between QCMs and real samples. Thus, the advantages of cfDNA have inspired and broadened its applications in a variety of areas.
Studies on the size profile of cfDNA have paved the way for understanding the mechanism of its generation. The tissue of origin and footprint analysis of cfDNA are also hotspots of current research and are expected to expand the current understanding of cfDNA and facilitate its implementation in novel assays [19]. Some applications of the size profile are still at an exploratory stage. For instance, the cfDNA size profile might be applied to predict the progression and prognosis of tumors and to serve as a novel diagnostic indicator for transplantation, myocardial infarction, SLE, and severe sepsis. However, these clinical applications await long-term validation studies. Despite the obstacles and the unknowns, the size profile of cfDNA fragments still holds good prospects for guiding innovative assays and offers hope for precision medicine.