BasePhasing: a highly efficient approach for preimplantation genetic haplotyping in clinical application of balanced translocation carriers

Background Preimplantation genetic testing (PGT) has already been applied in carriers of chromosomally balanced translocations to improve the clinical outcome of assisted reproduction. However, traditional methods cannot distinguish embryos carrying a translocation from those with a normal karyotype prior to implantation.
Methods To solve this problem, we developed a method named “Chromosomal Phasing on Base level” (BasePhasing), which is based on the Infinium Asian Screening Array-24 v1.0 (ASA) and a specially designed phasing pipeline. First, by comparing the number of single nucleotide polymorphism (SNP) loci at different minor allele frequencies (MAFs) and in continuous 2Mbp windows between the ASA chip and the Karyomap-12 chip, we verified whether ASA could be adopted for genome-wide haplotype linkage analysis. In addition, whole genome amplification (WGA) of 3–10 cells of the GM16457 cell line was used to verify whether the ASA chip could be used for testing WGA products. Finally, two balanced translocation families were used to carry out BasePhasing and to validate the feasibility of its clinical application.
Results The average number of SNP loci in each window of ASA (473.2) was more than twice that of Karyomap-12 (201.2). The concordance rate of SNP loci between genomic DNA and WGA products was about 97%. The known 5.3Mbp deletion in cell line GM16457 was detected in both genomic DNA and WGA products, and genome-wide haplotype linkage analysis was performed successfully. In the two balanced translocation families, 18 blastocysts were analyzed, of which 8 were unbalanced and the other 10 were balanced or chromosomally normal. Two embryos were successfully transferred back to the patients, and prenatal cytogenetic analysis of amniotic fluid was performed in the second trimester. The results predicted by BasePhasing and by prenatal diagnosis were fully consistent.
Conclusions The Infinium ASA bead chip based BasePhasing pipeline shows good performance in balanced translocation carrier testing. With its simple operating procedure and accurate results, we demonstrate that BasePhasing is currently one of the most suitable methods for distinguishing balanced embryos from structurally normal embryos of translocation carriers in PGT.
Electronic supplementary material The online version of this article (10.1186/s12920-019-0495-6) contains supplementary material, which is available to authorized users.


Background
Balanced translocation is one of the most frequent indications for preimplantation genetic testing (PGT). It occurs at an incidence of 1/500 to 1/625 in the general population and up to 1/20 in patients who have a history of repeated IVF failure or recurrent miscarriage [1]. Although carriers often have normal phenotypes, their risk of producing unbalanced gametes is high (typically approximately 70%) because of abnormal segregation of the rearranged chromosomes during meiosis [2,3]. Unbalanced gametes can lead to apparent infertility [4], recurrent miscarriage [5,6] or other congenital abnormalities [4,7]. Therefore, it is of great significance to prevent balanced translocations from being passed to the next generation by means of assisted reproductive technology (ART) and PGT.
The first PGT case for translocations, reported in 1998, used fluorescence in situ hybridization (FISH) [8]. However, the application of FISH is limited by technical problems such as ambiguous signals, complex operation, and the ability to examine only a limited number of chromosomes [9][10][11]. With the development of technology, aCGH/SNP array [12,13] and whole genome sequencing [14,15] methods have been employed for balanced translocation detection. Although these traditional PGT methods can clearly identify embryos with unbalanced translocations or aneuploidies, they can hardly distinguish balanced embryos from structurally normal ones. Over the past 5 years, researchers have tried mate pair sequencing [16,17], MicroSeq-PGD [18], MaReCs [19], and long read sequencing [20] to obtain the precise breakpoints of balanced translocations, followed by identification of normal embryos through PCR-Sanger sequencing or breakpoint-based linkage analysis. However, identifying precise breakpoints in highly repetitive and variable translocation regions remains unstable and inaccurate.
In our previous study [21], we successfully used preimplantation genetic haplotyping (PGH) to distinguish balanced from structurally normal embryos prior to implantation for both reciprocal and Robertsonian translocation carriers, along with genetic screening of all 23 pairs of chromosomes. However, the SNP microarray used contained relatively few SNP loci. In another study [22], the researchers chose the Human CytoSNP-12 BeadChip, which had the same limitation. Therefore, obtaining a highly efficient approach to distinguish normal embryos from those with a balanced translocation karyotype in clinical practice remains a challenge. In this study, building on our previous PGH approach, we established a new method named "Chromosomal Phasing on Base level" (BasePhasing). A bead chip with more SNP loci (approximately 700 K SNPs; Illumina Infinium Asian Screening Array-24 v1.0, ASA) was used to perform BasePhasing, and a new analysis pipeline was developed for ASA data. BasePhasing was validated using whole genome amplification products of cell line GM16457, and two balanced translocation families were also analyzed. The accuracy of the method was confirmed by conventional amniotic fluid karyotyping in the second trimester.

ASA assessment for BasePhasing
The basic microarray technical data for ASA and Karyomap-12 were downloaded from the Illumina official website (https://www.illumina.com). The number of informative SNPs is what matters most for linkage analysis, because it directly determines the accuracy of haplotype classification. For a SNP to be informative, one parent must have a heterozygous genotype and the other must have a homozygous genotype. Whether this holds depends on the specific family and is also affected by the distribution of SNP frequencies. Within the same family, SNPs with a high MAF are more likely to be informative.
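The informative-SNP criterion above, and the reason a higher MAF helps, can be illustrated with a minimal Python sketch. The two-letter genotype encoding ("AA", "AB", "BB") and both function names are assumptions made for illustration, and the expected fraction additionally assumes Hardy-Weinberg proportions and unrelated parents, which the text does not state.

```python
def is_informative(parent_1: str, parent_2: str) -> bool:
    """True when exactly one parent is heterozygous and the other is homozygous.

    Genotypes are two-character strings such as "AA", "AB" or "BB"
    (an encoding assumed for this sketch).
    """
    def is_het(genotype: str) -> bool:
        return genotype[0] != genotype[1]

    return is_het(parent_1) != is_het(parent_2)


def expected_informative_fraction(maf: float) -> float:
    """Chance that a biallelic SNP is informative for a random couple.

    Assumes Hardy-Weinberg proportions and unrelated parents:
    P(heterozygous) = 2pq, so P(exactly one parent heterozygous) = 2h(1 - h).
    """
    het = 2.0 * maf * (1.0 - maf)
    return 2.0 * het * (1.0 - het)


# The expected fraction grows with MAF over the range 0 to 0.5.
for maf in (0.05, 0.1, 0.3, 0.5):
    print(maf, round(expected_informative_fraction(maf), 3))
```

Under these assumptions the expected fraction rises steadily with MAF, which matches the statement that high-MAF SNPs are more likely to be informative.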
Usually, the region within ±2Mbp of the breakpoints is chosen to avoid misinterpretation caused by recombination events that may occur during meiosis. More importantly, a sufficient number of informative SNPs can be obtained within a 2Mbp region to detect homologous recombination. Therefore, the whole human genome was divided into a large number of 2Mbp segments (called windows). By comparing the number of total SNPs and effective SNPs in each window, we assessed the analytical performance of the ASA bead chip.
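The windowing step can be sketched as follows. This is only an illustration in Python: the input layout (a list of chromosome and position pairs) and the function name are hypothetical, and consecutive non-overlapping windows are assumed; only the 2Mbp window size comes from the text.

```python
from collections import Counter

WINDOW_BP = 2_000_000  # 2 Mbp window size, as described in the text


def snps_per_window(positions, window_bp=WINDOW_BP):
    """Count SNPs in consecutive, non-overlapping windows.

    `positions` is an iterable of (chromosome, base_pair_position) tuples.
    This layout is an assumption of the sketch, not the chip manifest format.
    Returns a Counter keyed by (chromosome, window_index).
    """
    counts = Counter()
    for chrom, pos in positions:
        counts[(chrom, pos // window_bp)] += 1
    return counts


# Toy input: three SNPs on chr1 and one on chr5.
demo = [("chr1", 150_000), ("chr1", 1_999_999), ("chr1", 2_000_000), ("chr5", 4_100_000)]
counts = snps_per_window(demo)
print(counts[("chr1", 0)], counts[("chr1", 1)], counts[("chr5", 2)])
```

Windows without any probe simply do not appear in the result, which mirrors how regions such as centromeres drop out of the effective window count.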

Ethics statement
Written informed consent was obtained from each family and the study protocol was approved by the Ethics Committee for Human Subject research of the Obstetrics and Gynecology Hospital, Fudan University.

Samples preparation and DNA isolation
Cell line GM16457 was obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research, with karyotype 46,XX,del(18)(q22.3) and a known 5.3Mbp deletion in chr18. Two translocation carrier families who were to undergo assisted reproduction were enrolled at the Shanghai Ji Ai Genetics & IVF Institute in May 2018. Both families had a history of recurrent spontaneous abortion, infertility or pregnancies with chromosome anomalies. The translocation karyotypes were 46,XY,t(1;5)(q21;q35) and 46,XY,t(6;7)(q23;q34), respectively. Five milliliters of peripheral blood was collected from each couple and family member at recruitment.
For cell line GM16457 and the peripheral blood samples, high molecular weight DNA was isolated as described in the manufacturer's protocol (DNeasy Blood & Tissue Kit, QIAGEN, Germany).

Single cell preparation and WGA
Three to ten cells of the GM16457 cell line were isolated by micromanipulation under a dissection microscope (Olympus CKX41, Japan) using a finely pulled glass Pasteur pipette. For embryos at the blastocyst stage, three to ten cells were removed from the trophectoderm on day five of embryonic development. The biopsied cells were placed into 0.2 mL PCR tubes in a total volume of less than 4.0 μL. Whole genome amplification (WGA) was conducted by multiple displacement amplification (MDA) according to the manufacturer's instructions (Repli-g single cell kit, QIAGEN, Germany). Isothermal amplification was carried out at 30°C for 8 h, followed by enzyme inactivation at 65°C for 3 min.

SNP-array and analysis
The isolated DNA and WGA products were processed according to the manufacturer's instructions (Illumina, San Diego, CA, USA) and then scanned using an Illumina iScan Bead Array Reader. The microarray scanning results were processed using the B allele frequency and Log R Ratio in Genome Studio software (Illumina) and Karyo Studio software (Illumina) to analyze chromosome copy number. Genome-wide preimplantation genetic haplotyping (PGH) analysis based on the Illumina Human Karyomap-12 V1.0 microarray was performed as previously described [21].

BasePhasing
Based on informative SNPs and chromosomal phasing principles, we developed the BasePhasing pipeline, which was written in Perl (Practical Extraction and Reporting Language) and could derive clear haplotypes for each family member in linkage analysis (Fig. 1). Once all family samples had been genotyped on the ASA bead chip in a single test, the raw scanning data were imported into BasePhasing to produce chromosomal aneuploidy and haplotype results. As stated previously, to avoid misinterpretation, the region within ±2Mbp of the breakpoints was chosen for the analysis of balanced translocation carriers.
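The BasePhasing pipeline itself is written in Perl and its code is not reproduced here. The Python sketch below only illustrates the general idea of reference-based linkage analysis, under explicit assumptions: the reference is a parent of the carrier who also carries the translocation, genotypes are encoded as "AA"/"AB"/"BB" with "--" for no-calls, no recombination occurs inside the flanking region, and the function names are hypothetical.

```python
from collections import Counter


def is_called(genotype: str) -> bool:
    """True for a valid two-allele call made of 'A' and 'B' only."""
    return len(genotype) == 2 and set(genotype) <= {"A", "B"}


def carrier_allele_in_embryo(partner: str, embryo: str):
    """Infer which allele the embryo received from the carrier parent.

    Requires a homozygous partner; returns None when the embryo genotype
    lacks the partner's allele (for example allele drop-out after WGA).
    """
    partner_allele = partner[0]
    if partner[0] != partner[1] or partner_allele not in embryo:
        return None
    if embryo[0] == embryo[1]:
        return partner_allele
    return embryo[0] if embryo[1] == partner_allele else embryo[1]


def phase_embryo(snps):
    """Vote on which carrier haplotype an embryo inherited near a breakpoint.

    `snps` holds (carrier, partner, reference, embryo) genotype tuples for
    SNPs inside the flanking region (hypothetical input layout).
    """
    votes = Counter()
    for carrier, partner, reference, embryo in snps:
        if not all(map(is_called, (carrier, partner, reference, embryo))):
            continue
        # Usable SNP: carrier heterozygous, partner and reference homozygous.
        if carrier[0] == carrier[1] or partner[0] != partner[1] or reference[0] != reference[1]:
            continue
        allele = carrier_allele_in_embryo(partner, embryo)
        if allele is None:
            continue
        # The carrier's allele shared with the homozygous reference lies on
        # the haplotype the carrier inherited from that reference.
        key = "reference_haplotype" if allele == reference[0] else "other_haplotype"
        votes[key] += 1
    return votes


demo = [
    ("AB", "AA", "BB", "AB"),  # embryo got B from the carrier, reference is BB
    ("AB", "BB", "AA", "AB"),  # embryo got A from the carrier, reference is AA
    ("AB", "AA", "AA", "AA"),  # embryo got A from the carrier, reference is AA
    ("AA", "AB", "AA", "AB"),  # carrier homozygous: skipped
    ("AB", "AA", "BB", "--"),  # embryo no-call: skipped
]
votes = phase_embryo(demo)
print(votes)
```

In this toy example every usable SNP votes for the haplotype shared with the reference, which in a copy-number-balanced embryo would point to a balanced carrier rather than a structurally normal embryo. A real analysis would also need to handle recombination, genotyping error, and references of other types, such as an unbalanced embryo.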

ASA basic performance for BasePhasing
The ASA bead chip contained approximately 700 K SNPs, more than double the number on the Karyomap-12 bead chip. Although about half of the ASA SNPs had a MAF lower than 0.1, the total number of high-frequency SNPs was still greater than that of Karyomap-12 (Additional file 1: Table S1). The total number of windows in the whole genome was 1561, but some regions, such as centromeres and satellites, had no SNP probes, so the effective number of windows was 1476. The number of uniquely mapped SNPs in each window was determined (Table 1). Because typically only about 10-20% of SNPs are informative, we considered that each window should contain at least 50 SNPs to yield enough informative SNPs for haplotype analysis. Across the whole genome, 98.85% of the regions on ASA met this requirement, and the average number of SNPs per window was 473.2.
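The density requirement can be checked with a few lines of Python. This is a sketch only: the function name and the toy counts are invented for illustration, while the 50-SNP minimum and the 10-20% informative range are the figures quoted above.

```python
def window_summary(counts, min_snps=50, informative_fraction=(0.10, 0.20)):
    """Summarise per-window SNP counts against a minimum-density requirement.

    `counts` is a list of SNP counts, one per window that contains probes.
    """
    if not counts:
        raise ValueError("at least one window is required")
    passing = sum(1 for c in counts if c >= min_snps)
    low, high = informative_fraction
    return {
        "fraction_passing": passing / len(counts),
        "mean_snps": sum(counts) / len(counts),
        # Informative SNPs expected in a window that just meets the minimum.
        "expected_informative_at_min": (min_snps * low, min_snps * high),
    }


# Toy counts, not data from the study.
summary = window_summary([473, 12, 201, 650])
print(summary)
```

With the quoted figures, a window holding exactly 50 SNPs would be expected to contribute roughly 5 to 10 informative SNPs.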

Evaluation of BasePhasing feasibility
To assess ASA performance with single cell MDA products, we used both gDNA and single cell MDA products of cell line GM16457. GM16457 gDNA gave a call rate of 98.7% and a heterozygous call rate of 16.1%. Call rates were slightly lower in single cell MDA products, at about 96.3%, with a heterozygous call rate of about 14.8% (Table 2). We also compared SNP concordance between the cell line gDNA and its single cell MDA products. The concordance rate was up to ~97%, not only between gDNA and its MDA products but also among the MDA products themselves. In addition, the known 5.3Mbp deletion in cell line GM16457 was clearly detected in the MDA products (Fig. 2b).
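For readers who want to reproduce this kind of comparison, call rate and genotype concordance can be computed as in the sketch below. The exact definitions used in the study are not given in the text, so this Python example assumes that "--" marks a no-call, that genotypes at the same locus are written in the same allele order, and that concordance is measured only over loci called in both samples.

```python
NO_CALL = "--"  # assumed encoding for a failed genotype call


def call_rate(genotypes):
    """Fraction of loci with a genotype call."""
    return sum(g != NO_CALL for g in genotypes) / len(genotypes)


def concordance(sample_a, sample_b):
    """Fraction of identical genotypes among loci called in both samples."""
    both = [(a, b) for a, b in zip(sample_a, sample_b) if NO_CALL not in (a, b)]
    if not both:
        return float("nan")
    return sum(a == b for a, b in both) / len(both)


# Toy data: the second locus mimics allele drop-out in the amplified sample.
gdna = ["AA", "AB", "BB", "AB", "AA"]
wga = ["AA", "AA", "BB", "--", "AA"]
print(call_rate(wga), concordance(gdna, wga))
```

In the toy data the amplified sample has one no-call and one heterozygous locus read as homozygous, the two error types that typically lower call rate and concordance after single cell amplification.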
In addition to the number and distribution of SNPs, ASA also performed well for comprehensive chromosome screening (CCS). The CCS analysis relied mainly on genome-wide SNP allele frequency analysis. We gathered and compared bead chip data from 136 previous samples in our laboratory, including 43 samples run on ASA and 93 samples run on Karyomap-12. We found no difference between ASA and Karyomap-12 in LogRdev versus %Defects (Fig. 2c), which reflects copy number variation (CNV) performance for CCS.

BasePhasing clinical test
In this study, two balanced translocation families were enrolled. In family 1, the carrier's parents were normal, so the family's unbalanced embryos were used as the reference. In family 2, the carrier inherited the translocation from his mother, so the paternal grandparent was used as the reference. With BasePhasing, we obtained molecular karyotypes for all 18 biopsied blastocysts. Of the 18 diagnosed blastocysts, 8 were unbalanced and 10 were balanced or normal (this rate might be slightly higher than published data because of the small sample size).
BasePhasing analysis was performed on the 10 blastocysts and showed that 5 were balanced carriers and 5 were normal embryos. Sample information and BasePhasing results for the two families are listed in Table 3. The results of BasePhasing and PGH were identical (Fig. 3). For example, in family 2, both methods indicated that embryo-2, embryo-3 and embryo-5 were translocation carrier embryos and that embryo-1 and embryo-4 were structurally normal embryos.
In addition, the number of informative SNP loci within the ±2Mbp regions flanking the breakpoints and across the whole related chromosomes was calculated (Table 4). The ASA bead chip clearly provided more informative SNPs than Karyomap-12, both across the related whole chromosomes and in the regions flanking the breakpoints. For example, on chromosome 5 (0-180,915,260), there were 3774 SNPs available on the ASA chip, with an average of 83.4 informative SNPs per window (range 13 to 151), whereas there were only 2732 SNPs available on the Karyomap-12 chip, with an average of 60.4 informative SNPs per window (range 6 to 118).
We also calculated the number of informative SNP loci for family 2 across the whole human genome. The average number of informative SNPs within a ±2Mbp region was 85.9 for the ASA bead chip and 59.4 for the Karyomap-12 bead chip, so the ASA bead chip provided somewhat more informative SNPs (Fig. 4). Based on this property of the ASA bead chip, we asserted BasePhasing can ...
After the BasePhasing analysis was completed, embryo E1 in family 1 and embryo E4 in family 2 were transferred back to the patients. Both women became pregnant after embryo transfer, and cytogenetic analysis of amniotic fluid was performed in the second trimester. The results predicted by BasePhasing and by cytogenetic analysis of amniotic fluid cells were fully consistent.

Discussion
Theoretically, in balanced translocation carriers only the two kinds of gametes produced by alternate segregation, one with a normal karyotype and one with a balanced karyotype, can produce a viable conceptus; the remaining unbalanced gametes from other segregation patterns may lead to repeated miscarriage, infertility or newborns with congenital malformations [23][24][25]. Therefore, balanced translocation carriers are generally advised to use PGT to achieve a successful pregnancy. In 2016, Treff et al. [26] reported a new method that used unbalanced embryos as a reference to distinguish balanced translocation blastocysts from normal blastocysts based on SNP genotypes. Recently, Zhang et al. [21] and Wang et al. [22] used PGH to accurately distinguish balanced from normal embryos of balanced translocation carriers prior to implantation. Any SNP marker-based data can be used for PGH analysis, provided that there are sufficient informative SNPs flanking the breakpoint to establish haplotypes and perform family linkage analysis. Meanwhile, comprehensive chromosome screening (CCS) can also be completed by SNP allele frequency analysis in the same test.
In this study, we applied the ASA bead chip in the BasePhasing pipeline and obtained accurate PGT results. By comparing the number of SNP loci at different MAFs and in continuous 2Mbp windows between the ASA chip and the Karyomap-12 chip, we verified that ASA could be used for genome-wide haplotype linkage analysis. The average number of SNP loci in each window of ASA (473.2) was more than twice that of Karyomap-12 (201.2). Compared with previous studies, this approach has several advantages. First, large numbers of SNPs can be used for chromosome aneuploidy and haplotype linkage analysis, which supports the accuracy of the results. Second, because it requires neither precise localization of the translocation breakpoints nor a personalized design, our method is applicable to any kind of balanced translocation. Third, the bead chip experiment is relatively simple and the data analysis is convenient, which makes the method suitable for clinical work. In addition, ASA based BasePhasing offered more than twice as many loci at only about one quarter (22.5%) of the cost of the Karyomap-12 bead chip (Additional file 2: Table S2). BasePhasing would therefore be one of the most suitable methods currently available for PGT of balanced translocation carriers in clinical application. Nonetheless, the analysis requires a family member of the carrier or an unbalanced embryo as a reference. One limitation of our research is therefore that the method does not apply to patients who have a de novo translocation and no unbalanced embryo. To our knowledge, no current method can effectively overcome this difficulty.
Fig. 3 BasePhasing results of family 2 for the chromosomes involved in the balanced translocation breakpoints. a Results for chromosome 6. The smooth bar charts on the left were generated with BlueFuse Multi software from Karyomap-12 bead chip data; the scatter bar charts on the right were generated with BasePhasing from ASA bead chip data. b Results for chromosome 7, with the same layout as in a. SR region: the breakpoint of the balanced translocation (marked by a gray box). The differently colored histograms represent different haplotypes: blue and red represent the father's haplotypes, and orange and green represent the mother's haplotypes. In the embryos, the gray column represents the haplotype inherited from the normal parent. In the carrier's family member, the gray column represents the haplotype that was not passed on to the translocation carrier.

Conclusions
The Infinium ASA bead chip based BasePhasing pipeline shows good performance in PGT of balanced translocations. With its simple operating procedure and accurate results, we demonstrate that BasePhasing is currently one of the most suitable methods for distinguishing balanced embryos from structurally normal embryos of translocation carriers. We also present a strategy that can serve as a reference for extending newly developed bead chips to clinical application. However, the sensitivity and specificity of BasePhasing should be further validated in a larger sample. Furthermore, whether BasePhasing can be used to detect single-gene diseases or other genetic diseases deserves further investigation.

Additional files
Additional file 1: Table S1.