Internal validation of an improved system for forensic application: a 41-plex Y-STR panel

Abstract
Y-chromosome short tandem repeats (Y-STRs) have a unique role in forensic investigation. However, low–medium mutating Y-STRs cannot meet the requirements for male lineage differentiation in inbred populations, whereas rapidly mutating (RM) high-resolution Y-STRs might cause unexpected exclusion of paternal lineages. Thus, combining Y-STRs with low and high mutation rates helps to distinguish male individuals and lineages in family screening and analysis of genetic relationships. In this study, a novel 6-dye, 41-plex Y-STR panel was developed and validated, which included 17 loci from the Yfiler kit, nine RM Y-STR loci, 15 low–medium mutating Y-STR loci, and three Y-InDels. Developmental validation was performed for this panel, including size precision testing, stutter analysis, species specificity analysis, male specificity testing, sensitivity testing, concordance evaluation, polymerase chain reaction inhibitor analysis, and DNA mixture examination. The results demonstrated that the novel 41-plex Y-STR panel, developed in-house, was time efficient, accurate, and reliable. It showed good adaptability to direct amplification of a variety of case-type samples. Furthermore, adding multiple Y-STR loci significantly improved the system's ability to distinguish related males, making it highly informative for forensic applications. In addition, the data obtained were compatible with the widely used Y-STR kits, facilitating the search and construction of population databases. Moreover, the addition of Y-InDels with short amplicons improves the analysis of degraded samples.

Key Points
A novel multiplex comprising 41 Y-STRs and three Y-InDels was developed for forensic application.
The multiplex includes rapidly mutating Y-STRs and low–medium mutating Y-STRs and is compatible with many commonly used Y-STR kits.
The multiplex is a powerful tool for distinguishing related males, familial searching, and constructing DNA databases.


Introduction
The non-recombining region of the human Y chromosome (NRY) is a highly informative haplotype system. Theoretically, male individuals in the same paternal lineage share identical NRY genetic information, apart from gradually accumulated mutations [1]. Genetic markers on the Y chromosome are now widely used in human genetics. Y-chromosomal short tandem repeats (Y-STRs) have proven to be excellent markers in forensic casework, such as patrilineal relationship analysis in kinship testing and mixture deconvolution in sexual assault cases [2]. However, because the Y chromosome does not recombine, the discriminatory power of a limited number of Y-STRs is insufficient, especially for male differentiation in an inbred population [3,4]. Several Y-STR systems with >20 Y-STRs have been developed [5–13], some of which include rapidly mutating (RM) Y-STRs (mutation rates >1 × 10⁻²) that can yield high-resolution paternal lineage differentiation. However, the main limitation of RM Y-STRs is that they may also lead to false exclusion of male individuals from the same familial lineage [14]. Low-medium mutating Y-STRs, in contrast, have been shown to play a vital role in familial searching [6]. Therefore, we aimed to design a novel panel using Y-STRs with both low and high mutation rates to better distinguish male individuals in pedigree screening and analysis of genetic relationships.
In this study, a novel 41-plex Y-STR panel was developed for co-amplifying 41 Y-STRs plus three Y-InDels in a 6-dye configuration. The system contains all 17 Y-STRs from the Yfiler kit [15], and the inclusion of nine RM Y-STRs improved the discrimination of related males [16], while the 15 low-medium mutating Y-STRs made it suitable for familial searching. This novel panel was found to be compatible with PowerPlex Y23 [17] and the Yfiler Plus PCR Amplification kit [18], providing powerful haplotype recognition and data comparison compatibility. In addition, the amplicons of the three Y-InDels were designed to be very short, providing information even for highly degraded samples. Since their mutation rates are close to zero, the Y-InDels may also contribute to familial searching. Two internal quality controls, IPC60 and IPC500, were added to monitor polymerase chain reaction (PCR) efficiency and sample quality. To validate the 41-plex Y-STR panel, we tested size precision, stutter effect, species and male specificity, sensitivity, DNA mixtures, concordance, PCR inhibition, and case-type samples, and performed a population study following the guidelines of the Scientific Working Group on DNA Analysis Methods [19].

DNA samples
This study was approved by the Ethics Committee of the Academy of Forensic Science, Ministry of Justice, P. R. China. A total of 595 peripheral blood samples were obtained from Jilin Han males (n = 233) and Jilin Korean males (n = 362). Informed consent and a declaration of genetic relationships with other participants were obtained from each recruited volunteer. Genomic DNA was extracted from the blood samples using the QIAamp DNA Blood Mini kit (Qiagen, Hilden, Germany), followed by quantification using the Qubit® dsDNA High Sensitivity Assay kit and a Qubit® 3.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). Control DNA 9948 was purchased from Promega (Madison, WI, USA) and was used for sensitivity and inhibition studies. ddH2O was used as the negative control. For real casework samples, six types of samples (semen stains, saliva, buccal swabs, hair with follicles, nails, and muscle tissue) were obtained from our laboratory (Academy of Forensic Sciences, Shanghai, China).

PCR amplification and electrophoresis
PCR amplification was performed on the GeneAmp 9700 PCR system (Thermo Fisher Scientific). To determine the optimal annealing temperature for PCR, 44 qualified primer pairs were amplified in triplicate at different annealing temperatures (58 °C, 60 °C, and 62 °C) under the recommended 28 cycles with 1 ng of positive-control DNA 9948. At the optimized annealing temperature, three PCR cycle numbers (26, 28, and 30) were tested to determine the appropriate number of PCR cycles for the new panel.
The 3500XL Genetic Analyzer (Thermo Fisher Scientific) was used to analyze the PCR products at the default settings. The dye set of J6 Matrix Standards (Peoplespot Technology Ltd, Beijing, China) was used for spectral calibration. The T500 (orange) was used as the internal size standard. The fragments of T500 were 65, 70, 80, 100, 120, 140, 160, 180, 200, 225, 250, 275, 300, 330, 360, 390, 420, 450, 490, and 500 bp. Samples for capillary electrophoresis (CE) were prepared by adding 1 μL of allelic ladder or amplified product and 0.5 μL of T500 to 8.5 μL of deionized Hi-Di™ Formamide (Thermo Fisher Scientific). The samples were then denatured at 95 °C for 3 min and chilled on ice before CE detection. Separation was performed in POP-4 polymer (Thermo Fisher Scientific) under the following operating conditions: injection at 3 kV for 10 s and electrophoresis at 15 kV for 1,500 s. GeneMapper® ID-X v1.2 (Thermo Fisher Scientific) was used to analyze the results with a peak height threshold of 150 relative fluorescence units (RFU) for genotyping.
The allelic ladder was developed as previously described [22]. Briefly, the alleles reported in STRbase (https://strbase.nist.gov) and YHRD (http://www.yhrd.org/) and the data included in our previous works [21,23–25] were used to determine locus ranges. At each locus, the PCR product of each allele was cloned into the T-vector and validated by Sanger sequencing. The PCR products of the cloned alleles were diluted, mixed, analyzed, and balanced to generate an allelic ladder for each locus; the locus ladders were then mixed in appropriate proportions to form an allelic ladder for the panel [26] (Figure 1). All alleles present in the ladder were validated by Sanger sequencing. The Panel and Bin files were programmed, and GeneMapper® ID-X v1.2 (Thermo Fisher Scientific) was used for genotyping. The nomenclature of the Y-STR alleles followed the latest recommendations of the DNA Commission of the International Society for Forensic Genetics [27].

Sizing precision and stutter study
Sizing precision was tested by calculating the average base pair size and standard deviation (SD) for each allele from 10 injections of the allelic ladder on the same CE platform. Stutter peaks are common artifacts of the PCR process. Fifty male samples were analyzed on a 3500XL Genetic Analyzer to assess the effects of the stutter peaks. Peaks that differed from the true allele by one repeat unit (n ± 1) were considered stutter peaks. Stutter values were calculated by dividing the height of each stutter peak by that of the true allele. In this work, the analytical threshold for stutter peak height was set to 50 RFU. A stutter filter (average stutter value plus 3 SDs) was also configured.
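The stutter calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names and the peak heights are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not state which SD estimator was used.

```python
from statistics import mean, stdev

STUTTER_THRESHOLD_RFU = 50  # analytical threshold for stutter peaks (from the text)

def stutter_ratio(stutter_height, allele_height):
    """Stutter peak height divided by the height of the true (parent) allele."""
    if allele_height <= 0:
        raise ValueError("allele height must be positive")
    return stutter_height / allele_height

def stutter_filter(ratios):
    """Locus filter: mean stutter ratio plus 3 SDs (sample SD is an assumption)."""
    return mean(ratios) + 3 * stdev(ratios)

# Hypothetical (stutter RFU, allele RFU) pairs for one locus; not study data.
peaks = [(210, 2400), (180, 2100), (260, 2750), (150, 1900), (40, 900)]
ratios = [stutter_ratio(s, a) for s, a in peaks if s >= STUTTER_THRESHOLD_RFU]
print(f"n = {len(ratios)}, filter = {stutter_filter(ratios):.4f}")
```

Peaks below the 50 RFU threshold are dropped before the ratios are averaged, so the last hypothetical pair does not contribute to the filter.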

Sensitivity
To evaluate the sensitivity of the 41-plex Y-STR panel, we measured the allele call rate for control DNA 9948 with inputs of 2, 1, 0.5, and 0.25 ng and 125, 62.5, 31.25, and 15.625 pg.
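The eight inputs form a twofold dilution series from 2 ng down to 15.625 pg. The sketch below reproduces the series and shows one plausible way to compute an allele call rate; the function names and the allele counts are hypothetical, and the definition of call rate as called alleles over expected alleles is an assumption.

```python
def twofold_series(start_pg, n_points):
    """DNA inputs (pg) obtained by repeatedly halving the starting input."""
    return [start_pg / 2**i for i in range(n_points)]

def call_rate(alleles_called, alleles_expected):
    """Fraction of the expected alleles that were called (assumed definition)."""
    if alleles_expected <= 0:
        raise ValueError("alleles_expected must be positive")
    return alleles_called / alleles_expected

inputs_pg = twofold_series(2000, 8)
print(inputs_pg)

# Hypothetical counts for illustration only; not study data.
print(f"call rate: {call_rate(38, 44):.1%}")
```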

Species specificity
DNA from nine common animals (cat, chicken, cow, dog, duck, long-tailed macaque, pig, rabbit, and sheep) was extracted from saliva or tissue samples. The standard PCR protocol with 1 ng of animal template DNA was used to evaluate the species specificity of the new panel. All procedures were conducted in accordance with the United Kingdom Animals (Scientific Procedures) Act 1986 (https://www.gov.uk/guidance/research-and-testing-using-animals) and approved by the Ethics Committee of the Academy of Forensic Science, Ministry of Justice, P.R. China.

Male specificity
Ten female samples were used to verify male specificity with a DNA input of 1 ng. In addition, mixed DNA samples from females and males were used to assess male specificity by mixing 125 ng of female DNA with 125, 12.5, 1.25, or 0.125 ng of male DNA 9948 (male:female ratios of 1:1, 1:10, 1:100, and 1:1,000).

DNA mixtures
Two random samples, M1 and M2, were selected for the male-male mixture study. A total of 1 ng of the mixture in known ratios of 1:1, 1:3, 3:1, 1:9, 9:1, 1:19, and 19:1 was amplified. Each mixture was tested in triplicate to reduce accidental errors and ensure the accuracy of the results.
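For orientation, the amount of DNA each donor contributes to the 1 ng total at each ratio can be worked out as follows. This is a small illustrative sketch; the function name is hypothetical, and which donor (M1 or M2) corresponds to the first term of each ratio is not stated in the text.

```python
from fractions import Fraction

# Mixing ratios listed in the text, in the order written there.
RATIOS = [(1, 1), (1, 3), (3, 1), (1, 9), (9, 1), (1, 19), (19, 1)]

def contributor_inputs_pg(ratio, total_pg=1000):
    """Split an integer total input (pg) between two donors at the given ratio."""
    first, second = ratio
    unit = Fraction(total_pg, first + second)  # exact amount per ratio part
    return float(first * unit), float(second * unit)

for ratio in RATIOS:
    first_pg, second_pg = contributor_inputs_pg(ratio)
    print(f"{ratio[0]}:{ratio[1]} -> {first_pg:g} pg + {second_pg:g} pg")
```

At 1:19, the minor donor contributes only 50 pg, which is below the 125 pg input at which the sensitivity study reports complete profiles.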

Concordance evaluation
The Yfiler Plus kit, Yfiler kit, PowerPlex Y23 kit, and the 41-plex Y-STR panel were each used to genotype 50 unrelated males. The genotypes at the loci shared among the systems were compared and found to be consistent.

PCR inhibition study
Three common PCR inhibitors, hematin, humic acid, and nigrosine (Sigma-Aldrich, Darmstadt, Germany), were used to assess the performance of the 41-plex Y-STR panel. High-concentration stock solutions were prepared by dissolving each inhibitor in 0.1 N NaOH (nigrosine and hematin) or DNA suspension buffer (humic acid) and were further diluted in ddH2O to obtain working stocks. The quantity of control DNA 9948 was kept constant at 1 ng, with the inhibitors at the following concentrations: 20, 40, 60, 80, 100, and 150 ng/μL of humic acid; 100, 150, 200, 250, 300, 400, and 500 ng/μL of nigrosine; and 100, 200, 300, 500, and 750 μmol/L of hematin. The analysis was performed in triplicate for each condition.

Casework sample testing
Real casework samples (semen stains, saliva, buccal swabs, hair with follicles, nails, and muscle tissue) were also amplified with the 41-plex Y-STR panel under standard conditions.

Population study
A total of 595 unrelated males from the Jilin Han and Jilin Korean populations were genotyped using the 41-plex Y-STR panel under standard conditions. Allele and haplotype frequencies were calculated by direct counting. The gene diversity (GD) was computed as $\mathrm{GD} = n\,(1 - \sum_a P_a^2)/(n - 1)$ [28], where $n$ is the total number of samples and $P_a$ is the frequency of the $a$-th allele at the locus. Haplotype diversity (HD) was computed in the same way, as $\mathrm{HD} = n\,(1 - \sum_i P_i^2)/(n - 1)$ [29], where $P_i$ is the frequency of the $i$-th haplotype and $n$ is the total sample number. The match probability (MP) was computed as the sum of the squared haplotype frequencies, $\mathrm{MP} = \sum_i P_i^2$. Discrimination capacity (DC) was determined as $\mathrm{DC} = h/n$, where $h$ is the total number of unique haplotypes.
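The forensic parameters defined above are straightforward to compute by direct counting. The following Python sketch is illustrative only: the function names and the toy haplotypes are hypothetical, and h is taken here to mean the number of distinct haplotypes observed, which is one reading of "unique haplotypes".

```python
from collections import Counter

def diversity(observations):
    """GD or HD: n * (1 - sum of squared frequencies) / (n - 1)."""
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are required")
    counts = Counter(observations)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n * (1 - sum_sq) / (n - 1)

def match_probability(haplotypes):
    """MP: sum of the squared haplotype frequencies."""
    n = len(haplotypes)
    return sum((c / n) ** 2 for c in Counter(haplotypes).values())

def discrimination_capacity(haplotypes):
    """DC = h / n, with h taken as the number of distinct haplotypes (assumption)."""
    return len(set(haplotypes)) / len(haplotypes)

# Toy haplotypes (tuples of allele calls at three loci); not study data.
haps = [(14, 30, 11), (14, 30, 11), (15, 29, 10), (13, 31, 11)]
print(diversity(haps), match_probability(haps), discrimination_capacity(haps))
```

The same `diversity` function gives GD when it is passed the allele calls at a single locus and HD when it is passed whole haplotypes. When every haplotype in a sample is different, HD and DC both equal 1 and MP equals 1/n.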

Construction of the 41-plex Y-STR panel
In this study, a total of 41 Y-STR loci were selected to construct a robust panel able to co-amplify 10 RM Y-STRs, 31 low-medium mutating Y-STRs, and three Y-InDels in a single PCR procedure. The detailed information on all the loci is presented in Table 1, and the primer information is presented in Supplementary Table S1.
After construction, the 41-plex Y-STR panel was optimized for a 10-μL reaction volume comprising 2 μL of 5× Master Mix, 2 μL of 5× Primer Mix, 1 μL of template DNA or a 1.2-mm punch of a blood card sample, and ddH2O to a final reaction volume of 10 μL. The PCR samples were amplified in MicroAmp® Optical 96-well reaction plates using the GeneAmp PCR system 9700 under the following conditions: 95 °C for 2 min; 28 cycles of 94 °C for 5 s, 60 °C for 90 s, and 62 °C for 60 s; 60 °C for 5 min; and a final hold at 15 °C. The profile of positive-control DNA 9948 is shown in Supplementary Figure S1.
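The water volume per reaction follows from the component volumes given above. The sketch below works it out and scales the reagents for a batch; the function names are hypothetical, the 10% pipetting overage is an illustrative assumption rather than part of the protocol, and a direct-amplification punch is assumed to add no liquid volume.

```python
# Per-reaction volumes (μL) taken from the text.
MASTER_MIX_UL = 2.0
PRIMER_MIX_UL = 2.0
FINAL_VOLUME_UL = 10.0

def water_volume_ul(template_ul=1.0):
    """ddH2O needed to bring one reaction to 10 μL (use template_ul=0 for a punch)."""
    water = FINAL_VOLUME_UL - MASTER_MIX_UL - PRIMER_MIX_UL - template_ul
    if water < 0:
        raise ValueError("components exceed the final reaction volume")
    return water

def batch_volumes_ul(n_reactions, template_ul=1.0, overage=0.10):
    """Pooled reagent volumes; `overage` is a hypothetical pipetting allowance."""
    factor = n_reactions * (1 + overage)
    return {
        "5x Master Mix": round(MASTER_MIX_UL * factor, 2),
        "5x Primer Mix": round(PRIMER_MIX_UL * factor, 2),
        "ddH2O": round(water_volume_ul(template_ul) * factor, 2),
    }

print(water_volume_ul())      # extracted DNA template
print(water_volume_ul(0.0))   # 1.2-mm punch, assumed to add no volume
print(batch_volumes_ul(10))
```

Template DNA is left out of the pooled mix because it is added to each well individually.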

Sizing precision and stutter study
Sizing precision was evaluated, and the SD was below the targeted 0.1 bp for all alleles (Figure 2), indicating that the precision of the 41-plex Y-STR panel was better than that reported in validation studies of other Y-STR kits [13,15,18] and sufficient to distinguish microvariants or ladder peaks.
Stutters, caused by strand slippage, are common byproducts of PCR amplification [30–32]. To avoid complications in interpreting the profiles, the expected stutter ratio of each STR was assessed for the novel panel. We tested 50 male samples with 1 ng DNA inputs using the 41-plex Y-STR panel to obtain the stutter ratios. The average stutter ratio plus 3 SDs was used to set the stutter filter file for GeneMapper® ID-X v1.2 (Table 2). Peaks one repeat unit shorter (n − 1) or longer (n + 1) than the true allele peak (n) were the dominant stutter products. The results showed that the trinucleotide-repeat locus DYS481 had the highest n − 1 stutter filter value (0.2468), whereas DYS385 had the highest n + 1 stutter filter value (0.2101).

Sensitivity studies
The sensitivity of the 41-plex Y-STR panel was assessed by serial dilution of control DNA 9948 from 2 ng to 15.625 pg in triplicate. Complete Y-STR profiles were obtained with DNA inputs ≥125 pg (Figure 3). Allele dropouts were observed when the DNA input decreased to 62.5 pg.

Species specificity
The 41-plex Y-STR panel was used to test nonhuman genomic DNA samples from cat, chicken, cow, dog, duck, long-tailed macaque, pig, rabbit, and sheep. The results showed no cross-reactions above 150 RFU within any locus range, except for the long-tailed macaque (Supplementary Figures S2 and S3). For this monkey species, an "OL" peak with a height of 1,705 RFU was found at DYS389II (Supplementary Figure S4), which is unlikely to interfere with human genotyping. These results demonstrated the high human specificity of the 41-plex Y-STR panel, although samples mixed with primate DNA should be analyzed with caution.

Male specificity
When processing mixed samples with DNA from both females and males, the specificity for the male component is critical for a given Y-STR system. The 41-plex Y-STR panel produced one to three reproducible artifacts, marked as "OL", in the 10 female samples (Supplementary Figure S5); these fell outside the bin set and would not affect genotyping. In addition, as shown in Supplementary Figures S6 and S7, although there was a general decrease in peak heights, the 41-plex Y-STR panel could still detect the male component at an extreme male:female mixing ratio of 1:1,000.

DNA mixture examination
DNA mixtures are frequently encountered in routine forensic work. Here, two male donors were used to evaluate the performance of the 41-plex Y-STR panel on DNA mixtures. The genotypes of samples M1 and M2 are shown in Supplementary Table S2. They shared only eight alleles, which made them well suited to mixture profile discrimination. The peak heights of the minor alleles declined as the mixing ratio became more extreme. Complete genotypes of the minor donor at nonoverlapping and nonstutter positions were observed at mixing ratios of 1:1, 1:3, and 3:1 (Figure 4). The profiles generated at the mixing ratios of 1:1, 1:3, and 3:1 are shown in Supplementary Figures S8-S10. Allele loss was found at ratios of 1:9, 9:1, 1:19, and 19:1, at which an average of 98.57%, 90.48%, 70.48%, and 60.95% of the unique minor profile was observed, respectively. These findings indicate that the 41-plex Y-STR panel allows all alleles to be correctly detected for 1:1, 1:3, or 3:1 male:male DNA mixtures.

Inhibition study
Mock inhibition samples with 1 ng of male control DNA 9948 containing varying concentrations of humic acid, nigrosine, and hematin were prepared to evaluate the inhibitor tolerance of the 41-plex Y-STR panel. Complete profiles were observed with ≤60 ng/μL of humic acid, ≤250 ng/μL of nigrosine, and ≤200 μmol/L of hematin (Figure 5). At higher concentrations, allelic dropouts were observed at the loci with longer fragment sizes, whereas shorter fragments were preferentially detected. These results indicated that the 41-plex Y-STR panel had greater resistance to the PCR inhibitors than the 36-plex Y-STR system [7] and the Yfiler® kit [15].

Case-type sample testing
Apart from blood samples, various biological materials are encountered in routine forensic work. Here, the 41-plex Y-STR panel was applied to several nonhematological case-type samples to evaluate its ability to obtain reliable genotypes from common case-type samples. Each sample yielded a full genotype, indicating that the 41-plex Y-STR panel was reliable and suitable for forensic application. The 41-plex Y-STR panel demonstrated good adaptability: it could be used for direct amplification of various samples, such as blood cards, saliva cards, FTA cards, and cotton swabs (data not shown), as well as for amplification of extracted template DNA.

Population study
A total of 595 unrelated males recruited from the Chinese Jilin Han and Jilin Korean populations were genotyped at 41 Y-STRs and three Y-InDels, which generated 595 distinct haplotypes (Supplementary Table S3). We identified 20 copy number variations (CNVs) and 42 microvariants, involving 11 single-copy loci (DYS522, DYS570, DYS576, etc.) and three multicopy loci (DYS444, DYS447, and DYS449). All samples with variants are listed in Supplementary Table S4. The allele frequencies and GDs of all 41 Y-STRs and three InDel markers were calculated (Supplementary Tables S5 and S6). As shown in Supplementary Figure S11, the number of observed alleles for the 41 Y-STRs ranged from four (DYS391 and DYS437) to 18 (DYS527a/b), and the GD values ranged from 0.1128 (DYS645) to 0.9732 (DYS385a/b). In addition, the GD values of the Y-STR loci added relative to the Yfiler® Plus kit, such as DYS522, DYS444, DYS557, DYF404S1a/b1, DYS447, and DYS527a/b, were all >0.6585, indicating high gene diversity at these loci. Moreover, all Y-STR markers were used to analyze the forensic parameters in the two populations. Using the marker sets of the Yfiler, PowerPlex Y23, and Yfiler Plus amplification systems (17, 23, and 27 Y-STR loci, respectively), we calculated the standard forensic parameters (HD, DC, and MP) for the 233 Jilin Han samples and the 362 Jilin Korean samples (Table 3). With the 41-plex Y-STR panel, more unique haplotypes were observed than with the three commercial Y-STR kits, and HD and DC were improved. Compared with Yfiler®, the 41-plex Y-STR panel demonstrated higher power of discrimination, with DC increasing from 0.973109 to 1.

Conclusion
The 41-plex Y-STR panel developed in this study is a 6-dye multiplex that combines 41 Y-STRs and three Y-InDels, and it could complement, or even substitute for, the commonly used Yfiler® kit, PowerPlex® Y23 System, and Yfiler® Plus kit. Compared with commonly used Y-STR systems, the co-amplification of 41 Y-STRs and three Y-InDel markers not only enhanced the effectiveness of the system but also offered advantages such as high HD, high DC, and data compatibility. In addition, a series of validation studies indicated that the panel possessed stable sizing precision, male and species specificity, and high sensitivity.
Overall, these results demonstrate the robustness and validity of the 41-plex Y-STR panel for forensic sampling and indicate that it could be a convenient and reliable tool for distinguishing related males, familial searching, and constructing DNA databases.

Authors' contributions
Siyu Chai and Min Li developed the concept of the research and wrote the original draft of the manuscript. Ruiyang Tao, Ruocheng Xia, Qianqian Kong, Yiling Qu, Liqin Chen, Shiquan Liu, and Chengtao Li collected data and performed the analysis. Pengyu Chen and Suhua Zhang supervised and secured funds for the research. All authors contributed to manuscript revision and approved the final version.