Population data and phylogenetic analysis of 37 Y-STR loci in the Hui population from Yunnan Province, Southwest China

Abstract
Background: Previous studies of Y-chromosomal short tandem repeat (Y-STR) polymorphism in the Hui focussed on Northwest China. The population genetic characteristics of the Chinese Hui residing in Yunnan province, Southwest China, remain unclear.
Aim: To provide genetic data for 37 Y-STRs in the Chinese Hui population of Yunnan province, and to investigate population genetic relationships between the Chinese Hui and another 26 populations from China and neighbouring countries.
Subjects and methods: In total, 324 unrelated healthy male individuals were genotyped using the Goldeneye™ Y Plus PCR Amplification Kit. Genetic relationships between populations were analysed using YHRD's AMOVA tools.
Results: A total of 279 haplotypes were detected, of which 244 were unique. The overall haplotype diversity (HD) and discrimination capacity (DC) were 0.9989 and 0.8611, respectively. The gene diversity (GD) ranged from 0.0544 (DYS645) to 0.9656 (DYS385).
Conclusions: The population comparison indicated that Muslim populations (Hui, Salar and Uighur) showed closer genetic affinity to one another than to the other populations. Our results could be applied in forensic practice and population genetic studies.


Introduction
The Y chromosome is inherited exclusively through the male line, which makes it useful in paternity identification, especially in paternity screening or in identifying the male suspect in sexual assault cases (Dekairelle and Hoste 2001; Yin et al. 2022). Y-chromosome-specific short tandem repeats (Y-STRs) are the mainstream genetic markers and are now widely used in forensic casework. Recently, Y-STR haplotype frequencies and diversities have been reported for a large number of groups, but population data for the Yunnan Hui are rare and the relationship between the Yunnan Hui and other reference populations is unclear.
China is a large country comprising 56 ethnic groups, and it is important to obtain population genetic data from all of them to shed light on their genetic characteristics and population relationships. Yunnan is a multi-ethnic province with 25 ethnic groups, 15 of which (Bai, Hani, Dai, Lisu, Lahu, Va, etc.) are ethnic minorities unique to the province. The Hui are the third largest and most widely distributed ethnic minority in China, with a population of 11,377,914 according to the 2020 census, accounting for 0.7943% of the total population of the country. The Hui are mainly distributed across the Ningxia Hui Autonomous Region, the Xinjiang Uygur Autonomous Region, and Gansu, Qinghai, Henan, Hebei, Shandong and Yunnan provinces. Historical records show that the Hui originated from Muslim groups migrating from west and central Asia and interacted for a long time with other groups, such as the Mongolians, the Uighur and most Han groups. The Hui speak Mandarin Chinese and write in Chinese characters (Lewis 2009). The location of the population sampled in this study is shown in Supplementary Figure S1.

Sample preparation
A total of 324 blood samples were collected from unrelated healthy male Hui individuals living in Yunnan province. All participants had ancestors who had lived in Yunnan for at least three generations, and all signed informed consent prior to participating in this study. This study was approved by the Medical Ethics Committee of Kunming Medical University. Genomic DNA was isolated from FTA cards using the Chelex-100 method (Walsh et al. 1991).

PCR amplification and capillary electrophoresis
Thirty-seven Y-STR loci were amplified using the Goldeneye™ Y Plus STR kit (Peoplespot Corporation) according to the manufacturer's instructions on a GeneAmp PCR System 9700 (Applied Biosystems, USA). The amplified products were separated and detected by capillary electrophoresis on an ABI Prism 3500 Genetic Analyser (Applied Biosystems, USA). The raw data were then analysed using GeneMapper™ ID-X software (Applied Biosystems, USA). DNA typing and the assignment of nomenclature were based on the ISFG recommendations (Bar et al. 1997; Lincoln 1997). Control DNA (9948) and ddH2O were used as positive and negative controls, respectively. The laboratory has participated in and passed the YHRD quality control. The haplotypes obtained in this study have been submitted to the YHRD under population accession number YA004738.

Analysis of data
Allele frequencies and haplotype frequencies were estimated by the direct counting method. Single-marker GD was calculated according to Nei with the formula $GD = \frac{n}{n-1}\left(1 - \sum_i P_i^2\right)$, where n is the total number of samples and $P_i$ is the relative frequency of the i-th allele (Clegg 1987). Haplotype diversity (HD) was calculated in the same way, using haplotype frequencies in place of allele frequencies. The discrimination capacity (DC) was calculated by dividing the number of different haplotypes by the total number of haplotypes observed (that is, the number of individuals typed). An analysis of molecular variance (AMOVA) was carried out to estimate pairwise genetic distances (Rst) and associated P values between the Yunnan Hui and the other 26 populations. A two-dimensional multidimensional scaling (MDS) plot was generated from the pairwise Rst values. Both AMOVA and MDS were conducted with the YHRD online statistical tool (http://www.yhrd.org). In accordance with the algorithm of YHRD's AMOVA tools, all haplotypes with intermediate alleles, null alleles or multiple alleles were removed from this analysis.
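The exclusion rule described above can be pictured with a small sketch. The data layout (a dict mapping each locus to a list of allele-call strings), the sample names and the function name `is_usable` are assumptions made purely for illustration; they do not reflect YHRD's internal implementation, and multi-copy loci are deliberately left out of the toy data.

```python
def is_usable(haplotype):
    """Return True only if every locus has exactly one integer allele call.

    `haplotype` is a hypothetical structure: {locus name: [allele strings]}.
    """
    for calls in haplotype.values():
        if len(calls) != 1:      # null allele (no call) or multiple alleles
            return False
        if "." in calls[0]:      # intermediate allele such as "19.2"
            return False
    return True


# Hypothetical toy haplotypes over two single-copy loci.
samples = {
    "H1": {"DYS448": ["19"], "DYS576": ["18"]},
    "H2": {"DYS448": ["19.2"], "DYS576": ["18"]},       # intermediate allele
    "H3": {"DYS448": [], "DYS576": ["18"]},             # null allele
    "H4": {"DYS448": ["19"], "DYS576": ["18", "19"]},   # multiple alleles
}

kept = [name for name, hap in samples.items() if is_usable(hap)]
print(kept)  # ['H1']
```

Under this rule, haplotypes carrying microvariants at any locus would not contribute to the Rst estimates, even though they are counted in the forensic parameters.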

Forensic parameters of the 37 Y-STR loci system when applied to the Yunnan Hui population
The distributions of allele frequency and gene diversity values of the 33 single-copy Y-STR loci are listed in Supplementary Table S1. A total of 243 alleles were observed, and the corresponding allelic frequencies ranged from 0.0031 (for alleles occurring only once) to 0.9722 (allele 8 at DYS645). Among the 33 single-copy Y-STR loci, DYS449 was the most informative locus with a GD of 0.8921, while DYS645 was the least informative locus with a GD of 0.0544. We compared these GD values with published data for other Hui populations, namely Guizhou Hui (Wang Jing et al. 2021), Ningxia Hui (Zhu et al. 2014), Gansu Hui (Meisen 2019), Liaoning Hui (Guo 2017) and Xinjiang Hui (Li et al. 2020). The GD values of the 15 loci common to these datasets did not show obvious differences among the compared populations (see Supplementary Figure S2).
The allelic combination distributions of the four multi-copy loci are presented in Supplementary Table S2. All four multi-copy Y-STRs, DYS527, DYF404S1, DYS385 and DYF387S1, had higher GD than most of the single-copy Y-STRs, with diversity values of 0.9501, 0.9261, 0.9656 and 0.9482, respectively.
The haplotype distribution is shown in Supplementary Table S3. A total of 279 different haplotypes were found in 324 individuals, of which 244 (87.46%) were unique; 29 haplotypes appeared twice, three haplotypes occurred three times, two haplotypes were observed in four individuals each and one haplotype was observed in five individuals. The overall HD value was 0.9989, with a DC value of 0.8611. Microvariant alleles were found at loci DYS448 (19.2), DYS527 (20.2 and 20.2, 21.2), DYS576 (19.1), DYF404S1 (13, 13.2 and 13.2, 14) and DYS385 (13, 14.1). Multiple microvariants were also found at loci DYS458 (20.2, 21.2, 22.2, 23.2 and 24.2), DYS627 (17.2 and 19.2) and DYS518 (36.2 and 37.2) (see Supplementary Table S4).
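The haplotype counts reported above are sufficient to recompute HD and DC directly. The following is a minimal sketch that applies the Nei formula from the Analysis of data section to those counts; the variable and function names are illustrative and not taken from any software used in the study.

```python
# Haplotype frequency spectrum reported in the text:
# {number of individuals sharing a haplotype: number of such haplotypes}
spectrum = {1: 244, 2: 29, 3: 3, 4: 2, 5: 1}


def nei_diversity(counts):
    """Nei's diversity, n * (1 - sum(p_i^2)) / (n - 1), from raw counts."""
    n = sum(counts)
    sum_p_sq = sum((c / n) ** 2 for c in counts)
    return n * (1 - sum_p_sq) / (n - 1)


# Expand the spectrum into one count per distinct haplotype.
hap_counts = [size for size, number in spectrum.items() for _ in range(number)]

n_individuals = sum(hap_counts)   # total haplotypes observed
n_haplotypes = len(hap_counts)    # distinct haplotypes
hd = nei_diversity(hap_counts)
dc = n_haplotypes / n_individuals

print(n_individuals, n_haplotypes)          # 324 279
print(f"HD = {hd:.4f}, DC = {dc:.4f}")      # HD = 0.9989, DC = 0.8611
```

The same `nei_diversity` function would give single-locus GD if it were passed per-allele counts instead of per-haplotype counts.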

Genetic relationship between the Yunnan Hui population and reference populations in China or neighbouring countries
Twenty-four Chinese populations and three other Asian populations (South Korea, Northern Thailand and North Vietnam), containing a total of 9,371 haplotypes in the YHRD database (see details in Supplementary Table S5), were included to explore the genetic relationships. Pairwise genetic distances (Rst) and associated P values among the 27 populations were calculated with the AMOVA online tool available on the YHRD, and the results are shown in Supplementary Table S6. No significant genetic differentiation was observed between the Yunnan Hui and the other Muslim ethnic groups, namely Ningxia Hui, Qinghai Hui, Qinghai Salar and Xinjiang Uighur (P values > 0.000142, the significance threshold after Bonferroni correction). Based on Rst values, Sichuan Tibetan is the most distantly related to Yunnan Hui (Rst = 0.2469), followed by Qinghai Tibetan (Rst = 0.1947), while Ningxia Hui is the most closely related to Yunnan Hui (Rst = 0.0006), followed by Qinghai Hui (Rst = 0.0091) and Qinghai Salar (Rst = 0.0099). Autosomal STR data comparing the Yunnan Hui with other populations give similar results: the Yunnan Hui showed genetic affinity with the Ningxia Hui and Qinghai Hui populations (Zhang et al. 2022). This is consistent with historical records of the Mongolian-Song war about 770 years ago, which prompted a number of Hui people to migrate from Ningxia to Yunnan.
The population relationships are displayed in an MDS plot (Figure 1). In the MDS plot, the Chinese Muslim (Hui, Salar and Uighur) populations and the Mongolian populations were closely related and were located along the horizontal axis of the plot. A similar pattern was also observed for the Liaoning Hui and Ningxia Hui populations (Zhu et al. 2014; Guo 2017). The result is in line with the historical record that the Hui people underwent centuries-long admixture with Salars, Uighurs and Mongolians in China (Bai 2007), indicating that the Yunnan Hui is a mixed population that had greater opportunity for gene exchange with Salar, Uighur and Mongolian groups than with other groups in China. Additionally, the Hui, Salar and Uighur populations share the Islamic faith, as well as similar customs and habits, which is consistent with the genetic proximity observed in the present study. Further studies with larger sample sizes and marker sets are needed to reveal more genetic information about the Hui populations and to confirm their genetic background and relationship with Salars and Uighurs.
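The threshold of 0.000142 quoted above is consistent with dividing a nominal significance level by the number of pairwise comparisons among 27 populations. The sketch below reproduces that arithmetic; the nominal level of 0.05 is an assumption, since the text does not state it.

```python
from math import comb

n_populations = 27                  # Yunnan Hui plus 26 reference populations
n_pairs = comb(n_populations, 2)    # number of pairwise Rst comparisons

alpha = 0.05                        # assumed nominal level (not stated in the text)
threshold = alpha / n_pairs         # Bonferroni-corrected threshold

print(n_pairs)                      # 351
print(f"{threshold:.6f}")           # 0.000142
```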

Conclusions
In summary, our study indicates that these 37 Y-STR loci are genetically polymorphic and discriminating in the Yunnan Hui population and could enrich the basic genetic information available for forensic application and population genetic studies. MDS analysis demonstrated that the Yunnan Hui population showed stronger genetic affinities with other Chinese Hui from Ningxia and Qinghai, and even with other Chinese Muslim populations such as the Salar and Uighur.