Next-Generation Sequencing Reveals Novel Genetic Variants (SRY, DMRT1, NR5A1, DHH, DHX37) in Adults With 46,XY DSD

Abstract Context The genetic basis of human sex development is slowly being elucidated, and >40 different genetic causes of differences (or disorders) of sex development (DSDs) have now been reported. However, reaching a specific diagnosis using traditional approaches can be difficult, especially in adults where limited biochemical data may be available. Objective We used a targeted next-generation sequencing approach to analyze known and candidate genes for DSDs in individuals with no specific molecular diagnosis. Participants and Design We studied 52 adult 46,XY women attending a single-center adult service, who were part of a larger cohort of 400 individuals. Classic conditions such as17β-hydroxysteroid dehydrogenase deficiency type 3, 5α-reductase deficiency type 2, and androgen insensitivity syndrome were excluded. The study cohort had broad working diagnoses of complete gonadal dysgenesis (CGD) (n = 27) and partially virilized 46,XY DSD (pvDSD) (n = 25), a group that included partial gonadal dysgenesis and those with a broad “partial androgen insensitivity syndrome” label. Targeted sequencing of 180 genes was undertaken. Results Overall, a likely genetic cause was found in 16 of 52 (30.8%) individuals (22.2% CGD, 40.0% pvDSD). Pathogenic variants were found in sex-determining region Y (SRY; n = 3), doublesex and mab-3–related transcription factor 1 (DMRT1; n = 1), NR5A1/steroidogenic factor-1 (SF-1) (n = 1), and desert hedgehog (DHH; n = 1) in the CGD group, and in NR5A1 (n = 5), DHH (n = 1), and DEAH-box helicase 37 (DHX37; n = 4) in the pvDSD group. Conclusions Reaching a specific diagnosis can have clinical implications and provides insight into the role of these proteins in sex development. Next-generation sequencing approaches are invaluable, especially in adult populations or where diagnostic biochemistry is not possible.

Conclusions: Reaching a specific diagnosis can have clinical implications and provides insight into the role of these proteins in sex development. Next-generation sequencing approaches are invaluable, especially in adult populations or where diagnostic biochemistry is not possible. This article has been published under the terms of the Creative Commons Attribution License (CC BY; https://creativecommons.org/licenses/by/4.0/). Freeform/Key Words: desert hedgehog, DSD, DHX37, sex determination, SRY, steroidogenic factor-1 It is now more than 25 years since SRY was identified as the main testis-determining gene in humans and mice and as a cause of 46,XY differences (disorders) of sex development (DSDs) [1][2][3]. Since then, at least 40 other genetic causes of DSDs have been reported, but the relative contribution to these genes within the clinical setting is poorly documented [4][5][6].
DSDs are usually considered in three broad categories: sex chromosome DSD, 46,XY DSD, and 46,XX DSD [7]. Although the diagnosis of specific forms of sex chromosome DSD and 46,XX DSD can usually be made with karyotyping or biochemical analysis (for congenital adrenal hyperplasia), the specific diagnosis of 46,XY DSD is often more challenging. 46,XY DSD is commonly divided into conditions affecting gonad (testis) development (e.g., complete gonadal dysgenesis, also known as Swyer syndrome) or conditions affecting androgen biosynthesis and action. Proximal genetic blocks in androgen biosynthesis are rare and usually affect the adrenal gland as well as the gonad (steroidogenic acute regulatory protein/STAR), P450 side-chain cleavage/CYP11A1, 3b-hydroxysteroid dehydrogenase type 3/HSD3B2, 17ahydroxylase/CYP17A1, P450 oxidoreductase/POR). More specific defects in testosterone biosynthesis [17b-hydroxysteroid dehydrogenase deficiency (17b-HSD) type 3 (HSD17B3)], in the conversion of testosterone to dihydrotestosterone [5a-reductase deficiency type 2 (SRD5A2)], or in androgen action [androgen insensitivity syndrome (AIS), androgen receptor] can sometimes be suspected on clinical or biochemical grounds, but genetic testing is becoming increasingly important given the overlap in clinical features among various causes [8,9].
To complicate diagnosis further, each condition can have a spectrum of genital phenotypes depending on the underlying variant and other modifying factors. This situation is especially apparent for changes in steroidogenic factor-1 (SF-1; encoded by NR5A1), where clinical and endocrine features ranging from complete gonadal dysgenesis through to hypospadias or male factor infertility can be seen [10,11].
Newer technologies such as high-throughput next-generation sequencing (NGS) now allow the analysis of many genes simultaneously in the DNA from one individual. Panel-based approaches and exome sequencing have been reported recently for the diagnosis of several endocrine conditions such as neonatal diabetes mellitus and primary adrenal insufficiency [12,13], as well as for DSDs [14][15][16][17][18][19][20].
In this study, we report the results of a targeted NGS analysis of a relatively large cohort of individuals with 46,XY DSD who had the working diagnosis of complete gonadal dysgenesis (CGD) (n 5 27) or what we term "partially virilized 46,XY DSD" (pvDSD) (n 5 25), a group that included partial gonadal dysgenesis (PGD) and those with a broad partial androgen insensitivity syndrome (PAIS) label (androgen receptor negative). We analyzed .150 genes that are either established or potential causes of 46,XY DSD, or candidate genes for 46,XY DSD based on our published "atlas" of differential transcriptomic expression during early human embryonic/fetal gonad development [21].

A. Study Cohort
The study was undertaken in a cohort of 52 individuals (46 sporadic, 3 sibling pairs) with 46,XY DSD of unknown molecular etiology attending adult DSD clinics at the University College London Hospitals NHS Foundation Trust (Fig. 1). This adult DSD clinic takes place in the setting of a Department of Women's Health. Therefore, the spectrum of DSD phenotype is predominantly female. Consecutive subjects were approached without clinical selection criteria providing a real-life clinical cohort. These individuals were part of a larger cohort of 400 women with 46,XY DSD, where 17b-HSD type 3 (HSD17B3) (n 5 26), 5a-reductase deficiency (5ARD) type 2 (SRD5A2) (n 5 30), and AIS, androgen receptor (n 5 170) had been prescreened ( Fig. 1) [5,6].
A total of 27 women had a working diagnosis of CGD and mostly presented with femaletypical genitalia and primary amenorrhea/absent puberty in adolescence (Swyer syndrome). They were recruited from a total cohort of 84 women with CGD.
Another 25 individuals presented with atypical genitalia and were brought up as girls with a working diagnosis of PGD or PAIS. They were recruited from a total of 53 women with Figure 1. Overview of the study cohort and summary of findings. this phenotype where the underlying diagnosis was unknown. Historically, many individuals have been labeled under the "umbrella" term PAIS, without genetic or biochemical evidence of the diagnosis. We have termed this group pvDSD. Individuals with adrenal insufficiency or high steroidogenic blocks/congenital adrenal hyperplasia, renal anomalies, or obvious syndromic associations of DSD were excluded.
Informed consent was given by all participants. Research Ethics Committee approval was obtained as part of the Reproductive Life Course Project (IRAS project ID 184646, REC reference 16/LO/0682) or study of the genetics of reproductive biology (07Q050824).

B. Genetic Analysis
DNA was extracted from peripheral blood leukocytes using standard methods and subjected to genetic analysis using the approaches outlined below.

B-1. Sanger sequencing for NR5A1
Analysis of NR5A1 was undertaken using standard Sanger sequencing with primers and conditions reported previously [22].

B-2. HaloPlex ® targeted NGS panel
A HaloPlex DNA targeted gene enrichment panel (Agilent Technologies, Santa Clara, CA) was designed using SureDesign software (www.agilent.com/genomics/suredesign). This panel included 168 known DSD-associated genes as well as potential candidate genes ( Table 1). The panel captured all coding exons and 100 bp of intronic flanking sequence of genes of interest with predicted target coverage .98.87%.
Genomic DNA samples (225 ng) were processed for Illumina sequencing according to the HaloPlex target enrichment system protocol (version D.5, Agilent Technologies) and as described previously [13]. Resultant libraries were then subjected to NGS using an Illumina MiSeq platform (Illumina, San Diego, CA). Raw FASTQ files were analyzed using SureCall (version 3.0.1.4) software (Agilent Technologies). Visual inspection of BAM files was also undertaken to ensure that deletions of key genes were not overlooked.  Underlined genes were sequenced using a Nonacus panel.

B-3. Nonacus ® targeted NGS panel
A further 12 potential known and candidate genes for DSD were analyzed using a Cell3 Target enrichment panel for NGS (Nonacus, Birmingham, UK) ( Table 1). This captured all coding exons, including a 100-bp intronic flanking sequence, with a predicted target coverage of 97.35%. Genomic DNA samples (100 ng) were prepared for Illumina sequencing according to Cell3™ Target, a cell-free DNA target enrichment system for NGS (Illumina sequencers; version 1.2.2 protocol). In brief, DNA was sheared by an enzymatic fragmentation before undergoing end repair and dA tailing. This enables the Illumina unique molecular identifier adapters to be ligated on both 5 0 and 3 0 ends. DNA was then purified using Agencourt AMPure beads to remove residual nonligated adapters. DNA was amplified using primers that bind to the ligated adapter. Libraries were hybridized with the customized probes before undergoing a further bead wash to remove nontargeted DNA. Targeted library DNA sequences were then amplified using primers binding specifically to sequences within the Illumina adapters. A final bead clean-up step was carried out before quantification of captured libraries. These were then sequenced on a MiSeq platform (Illumina). Raw FASTQ files were analyzed using a bioinformatic pipeline provided by Nonacus, and consequent variant analysis was carried out in Ingenuity Variant Analysis software (Qiagen, Valencia, CA). Inspection of BAM files was carried out using the Integrative Genomics Viewer [23].

B-4. Variant validation
Validation of key variants identified was performed using Sanger sequencing. Regions of interest were PCR amplified and sequenced using a BigDye Terminator version 1.1 cycle sequencing kit (Applied Biosystems, Foster City, CA) on an ABI 3130 sequencer (Applied Biosystems), and visualized using Sequencher version 5.2.4 (Gene Codes Corporation, Ann Arbor, MI). Control data for population genomic variation was obtained using the Genome Aggregation Database (gnomAD) (gnomAD, Cambridge, MA, https://gnomad.broadinstitute. org) [24].

C. Molecular Modeling and Simulation of DMRT-DNA Complexes
The molecular models of wild-type (WT) and mutant doublesex and mab-3-related transcription factor 1 (DMRT1) proteins were constructed using MODELLER, within the protein modeling module of Discovery Studio version 2.1 (Accelrys, San Diego, CA) [26]. The human DMRT1 sequence retrieved from the UniProt database (https://www.uniprot.org/uniprot/ Q9Y5R6) was used as a reference, and only residues 70 to 130 were considered for modeling purposes [27]. WT and a p.R80S mutant were modeled using the recently reported crystal structure of human DMRT1 (residues 70 to 130) in complex with DNA (PDB ID 4YJ0) as a template [28].
For each protein model, a set of 100 models was constructed and the best one according to the MODELLER internal PDF score was selected for simulations steps. The DMRT1 WT and p.R80S mutant in complex with DNA systems were generated using VMD version 1.93, by inserting each protein into a 110-3 70-3 80-Å box consisting of a classic TIP3P model for water molecules, neutralized with Na 1 or Cl 2 ions [29,30]. Periodic boundary conditions were imposed in all three directions, and the particle mesh Ewald method was used to account for full long-range electrostatic interactions within the selected boundary condition within a relative tolerance of 1 3 10 26 [31]. The final systems were composed of nearly 60,120 atoms.
Molecular dynamics simulations were carried out with the NAMD version 2.13 simulation package, using the CHARMM36 force field parameters for proteins [32][33][34]. The simulation was started with a brief energy minimization for 5000 steps, followed by 1 ns of heating with protein backbone sequential release of alpha carbon restraints (force constants were gradually reduced from 10 kcal/molÅ 2 to 0 kcal/molÅ 2 ) and 4 ns of equilibration, and 10 ns of production simulation for each protein was performed. The particle mesh Ewald method was used for full long-range electrostatics within a relative tolerance of 1 3 10 26 . A 12-Å cutoff was used to compute nonbonded interactions with a smooth switching function applied at a distance of 10Å. To impose the thermal exchange with an external thermostat, the isobaricisothermal ensemble with a constant number of particles, pressure, and temperature was used. Constant temperature was maintained by coupling the system to a thermal bath whose temperature is maintained via Langevin dynamics with a friction coefficient of 1 ps 21 . Constant pressure was maintained using a Langevin piston at a nominal value of 1 atm [35]. The SHAKE algorithm, with a tolerance of 1 3 10 28Å , was applied to constrain the length of all covalent bonds involving hydrogen, thus allowing the use of a 2 fs integration time. Trajectory analyses and measurements were performed using VMD version 1.9.3 [29]. By plotting Ca-root-mean-squared deviation along the molecular dynamic simulation, we assessed the structural equilibration reached by the models and the distances between the terminal heavy atom (Cz in Arg80 or O in the case of Ser80) and the nearest oxygen on the phosphate backbone. Figures were rendered using PyMol version 2.3.2 (Schrödinger, New York, NY; https://www.schrodinger.com).

D. Transient Gene Transcription Assays
Expression vectors containing the NR5A1/SF-1 p.G22D or p.L420P missense variants were generated by site-directed mutagenesis (QuikChange, Stratagene, Amsterdam, Netherlands) using a WT pCMX-NR5A1 template and validated by direct sequencing. Transient transfection studies were performed using Lipofectamine 2000 (Invitrogen, Paisley, UK) in 96-well plates and a Dual-Luciferase reporter assay system (Promega, Madison, WI), as reported previously [22]. Studies were performed in tsa201 human embryonic kidney cells by transfecting empty, WT, or mutant NR5A1/SF-1 expression vectors (2 ng per well; p.G22D, p.L420P) with the SF-1-responsive minimal promoter of Cyp11a linked to luciferase (100 ng per well). Twenty-four hours following transfection, cells were lysed and luciferase activity was assayed (Dual-Luciferase reporter assay system, Promega; FLUOstar Optima, BMG Labtech, Aylesbury, UK), with standardization for Renilla coexpression. Results are shown as the mean 6 SEM of three independent experiments, each performed in triplicate.

Results
Overall, a likely pathogenic variant was found in 16 individuals in the cohort (16 of 52, 30.8%).  Table 2) Three individuals had hemizygous pathogenic variants in SRY sex-determining region Y (SRY), with two changes affecting codons in the high-mobility group box (p.R62P, p.N65D) and one variant being a novel, complex insertion-deletion affecting the native stop codon of SRY (Table 2; Fig. 3A and 3B).  One individual with CGD was found to have a heterozygous p.R80S variant in the first zinc finger of DMRT1 (Table 2; Fig. 4A). This variant is predicted to be damaging and was not found in the gnomAD database (Table 2). To gain insight into the potential effect this variant has on DMRT1, detailed molecular dynamics simulation of both WT DMRT and the p.R80S mutant was performed. The R80 residue interfaces with DNA, binding through a hydrogen bond (H-bond) interaction with the phosphate backbone of thymidine 16 ( Fig. 4B and 4C). The p.R80S mutant loses H-bond contacts with this phosphate backbone of DNA and displays longer distances compared with WT (7.63 6 0.36Å vs 4.34 6 0.18Å) during simulation time ( Fig. 4C and 4D). The electrostatic potential of the surface does not change significantly (Fig. 4C).

A. Complete Gonadal Dysgenesis
One woman diagnosed with CGD was found to harbor a heterozygotic p.A280E variant in helix 3 of the ligand-binding domain (LBD) of SF-1 (NR5A1) (Table 2; Fig. 5A). Another woman with primary amenorrhea and absent uterus was found to have a homozygous p.F242L variant in DHH (Table 2; Fig. 6A).

B. pvDSD
A molecular diagnosis was attained in a greater proportion of individuals with a clinical phenotype of pvDSD (10 of 25; 40.0%) compared with CGD (Figs. 1 and 2; Table 2).
Five pathogenic variants, all in the heterozygous state, were found in NR5A1 (p.G22D, p.R281C, p.G328R, p.E367Sfs*15, p.L420P) (Table 2; Fig. 5A). These include four missense changes and a frameshift change affecting codons in the LBD. Transient transfection studies confirmed that two of these NR5A1 missense variants (p.G22D and p.L420P) likely cause loss of function (Fig. 5B), whereas the other two variants affect codons previously shown to be disrupted in individuals with DSD.
One woman with partial virilization and absent Müllerian structures was found to have compound heterozygous (p.A227V/p.R245P) variants in DHH, affecting a fairly localized region within the C-terminal domain close to the p.F242 codon described above (Table 2; Fig.  6A). Using IHC, we showed strong expression of DHH in interstitial cells of the testis just after the establishment of androgen biosynthesis in the human fetus (9 weeks postconception), with weaker expression in the Sertoli cells of primitive seminiferous tubules (Fig. 6B).
Finally, four individuals had pathogenic heterozygous variants in DEAH-box helicase 37 (DHX37). Three of these affect a p.R308Q hotspot, and these have recently been reported  (Table 2; Fig. 7) [40]. One additional variant (p.T477M) was identified, which affects a highly conserved amino acid in the Rec-A2 motif IV RNA-binding region and is predicted to be damaging and disease causing.

C. Candidate Genes for DSD
Analysis of candidate genes for DSD based on our experience of transcriptomic profiling did not reveal any clear pathogenic variants in those genes analyzed (Table 1). No likely pathogenic changes were found in the three sibling pairs studied.

Discussion
Reaching a specific diagnosis in 46,XY DSD can be challenging, as there are many different potential causes and, historically, these may have been grouped under umbrella terms such as Swyer syndrome (CGD) or pvDSD [41].
Although many of these patients have classic conditions such as 17b-HSD, 5ARD, and AIS, individuals may not have been fully investigated initially, and making a diagnosis can be difficult when the testes have been removed [5,6]. Reaching a specific diagnosis can sometimes have benefits for understanding the natural history of a condition, identifying associated features, defining likely inheritance and chances of other family members being affected, and in the long term for understanding tumor risk [8]. Therefore, a molecular approach to diagnosis is becoming increasingly important in both research and clinical settings [4,8,20].
In this study, we report our analysis of a large cohort of 46,XY women with DSD from a single center. Using targeted genetic analysis and biochemical profiling (e.g., urine steroid profiling by gas chromatography-tandem mass spectrometry), those with 17b-HSD, 5ARD and AIS had been previously screened out, leaving a cohort of .50 women who were recruited with a working diagnosis of CGD or pvDSD [5,6]. This is a typical scenario in clinical practice and will be increasingly common as adult DSD services, and teams are established to provide better long-term follow-up and support in a multidisciplinary setting [42][43][44].
Using an NGS targeted sequencing approach of known and candidate genes, we were able to reach a likely molecular diagnosis in 30.8% of the cohort as a whole, with a higher proportion of genetic diagnoses being reached in the pvDSD group (40.0%) compared with the CGD group (22.2%). This is similar to the other limited data available, but importantly note, that many studies to date have included individuals with conditions such as 17b-HSD, 5ARD, and AIS, and they have been less stringent than the current study that focuses on a cohort of adults with undiagnosed DSD.
Within our study cohort we found likely pathogenic variants in five different genes (SRY, DMRT1, NR5A1, DHH, DHX37), with some overlap in presentation between the two broad diagnostic groups. Our findings also provide insight into molecular aspects of these conditions.
As expected, hemizygous variants in SRY were found in three women with CGD who had presented with absent puberty and primary amenorrhea in adolescence. The prevalence of 11.5% of CGD is consistent with previous data and, as expected, the two missense variants (p.R62P, p.N65D) affect amino acids in the high-mobility group box of SRY, potentially in the region of a nuclear localization signal or calmodulin-binding motif [45,46]. Of interest, a third individual had a complex insertion-deletion event that removes the native stop codon of SRY and is predicted to result in the translation of a protein with an additional seven amino acids at the C-terminal end (p.L204fs*211 p.L204PLDKANG*). This alteration was not defined clearly on NGS sequencing, and a Sanger sequencing approach was needed to clarify the exact changes (Fig. 3B).
One individual with CGD was found to harbor a heterozygous p.R80S variant in the first zinc finger of DMRT1. DMRT1 has been known for several years to play a key role in sex determination in different species, but its role in human sex development beyond the 10q23 deletion syndrome is less clear [47]. More recently, a variant in DMRT1 (p.R111G) has been reported in association with testicular dysgenesis that affects the DMRT1 recognition helix [28]. Using a similar approach, it is likely that the p.R80S variant affects interactions between DMRT1 and the minor groove of DNA (Fig. 4B-4D). As well as being located in a key domain, this variant is highly conserved among species, is predicted to be damaging, and is not present in population databases. Using molecular modeling and simulation, we have shown that substitution of the larger, polar arginine residue with the smaller serine disrupts an H-bond between DMRT1 and DNA. It is expected that an additive effect on the loss of affinity that will depend on the stoichiometry of the DMRT1-DNA complex occurs, resulting in the final phenotypic effect. Of note, Wang et al. [14] recently reported a p.Y84C variant in DMRT1 associated with DSD, but concluded that the significance was unknown.
Six individuals in our cohort were found to have heterozygous missense (n 5 5) or frameshift (n 5 1) changes in SF-1 (NR5A1). SF-1/NR5A1 is a nuclear receptor transcription factor that was originally shown to be important in adrenal and gonad development [48]. However, it has emerged in the past 15 years that variants in SF-1/NR5A1 are one of the most common causes of 46,XY DSD, with phenotypes ranging from a complete gonadal dysgenesis scenario through various degrees of virilization and hypospadias to male factor infertility [10,11,22,49]. Most changes occur de novo or in a sex-limited dominant pattern, although primary ovarian insufficiency can also occur [50].
The prevalence of SF-1/NR5A1 variants in our cohort (6 of 52, 11.5%) is similar to other studies, with an enrichment seen in the pvDSD subgroup (19.2%) [22]. One of the probands had a family history consistent with sex-limited dominant inheritance. One of the variants discovered (p.G22D) affects the DNA-binding domain of SF-1/NR5A1 and was shown to impair SF-1 function, as did the p.L420P variant in the LBD (Fig. 5B). The other missense variants identified (p.A280E, p.R281C, p.G328R) have either been reported previously or affect previously disrupted codons, and all are predicted to be damaging [10,[51][52][53]. Although it is not clear whether SF-1 has a true biological ligand, crystallization has shown that codons 280/281 form a key region of helix 3 that interacts with corepressors such as NR0B1/NR0B2 (DAX-1, SHP) [54]. Our findings confirm that variants in SF-1/NR5A1 are still the most prevalent cause of 46,XY DSD currently known.
Another key gene emerging as a relatively prevalent cause of DSDs is DHH [55][56][57][58][59][60][61]. We identified DHH changes in two individuals with a recessive form of DSD who had homozygous (p.F242L) or compound heterozygous (p.A227V/p.R245P) variants, and with differing phenotypes. Disruption of DHH is sometimes associated with minifascicular neuropathy and was thought potentially to affect testicular interstitial/Leydig cell development in mice and humans, as well as Leydig-Sertoli cell interactions [56,59,62]. Furthermore, we have shown by IHC that DHH is expressed predominantly in developing Leydig cells in the human fetal testis at a critical early stage of development just after the onset of steroidogenesis (9 weeks postconception) (Fig. 6B) [21]. However, DHH may also play a role in Sertoli cells and Leydig-Sertoli cell interactions, as areas of gonadal dysgenesis or streak gonads have been reported and Dhh is expressed in mouse Sertoli cells in single-cell RNA sequencing, and DHH expression was seen in Sertoli cells, albeit at a lower level (Fig. 6B) [56,59,63,64]. Consistent with this variability, one woman had a CGD phenotype with a functional uterus and streak gonads, whereas the other had partially virilized genitalia and no Müllerian structures. The variants identified are located within a fairly localized region of DHH, but relatively little is known about the structural or biological effects of these changes.
Finally, four individuals (7.7% overall; 15.4% of pvDSD) in our cohort had heterozygous disruptive variants in DHX37. DHX37 is an RNA helicase and predicted ribosomal RNAbinding protein, but its exact biological function in testis development is still unclear. Variants in DHX37 are emerging as a relatively prevalent cause of a range of DSD phenotypes, including vanishing testis syndrome [40,65]. Three individuals in our cohort were found to have a recurrent p.R308Q variant in the RecA1 motif, and they have been included in a recent series describing the role of DHX37 in 46,XY DSD [40]. The p.R308Q variant is often de novo and has been found in diverse ancestral backgrounds. One additional person in our cohort with mild virilization and an absent uterus was found to have a p.T477M variant located within the RecA2 motif IV involved in RNA binding (Fig. 7). This variant occurs in a highly conserved codon/region and is a predicted disruptive event. Future studies are needed to elucidate the role of DHX37 in sex development, but the association of variants in this gene with 46,XY DSD is becoming well established.
Although we did not expect to discover variants involved in proximal steroidogenic blocks with adrenal or biochemical phenotypes (e.g., STAR, CYP11A1, HSD3B2, CYP17A1, POR, CYB5A) or associated with specific features (e.g., SOX9, campomelic dysplasia; GATA4/ ZFPM2, cardiac; WT1, renal), we did not find likely pathogenic variants in other DSDassociated genes such as MAP3K1, SOX8, ESR2, or ZNRF3. Furthermore, no clear pathogenic variants were found in candidate genes based on our transcriptomic studies of genes expressed in early human testis development [21]. In these studies, we used a modeling approach to identify genes that were either upregulated in a time-series data set with similar patterns to SOX9 (e.g., CITED1), or that were differentially expressed fetal testis genes or potential novel components of steroidogenesis. Although it might be expected to find variants in some of these candidate factors, in fact no clear causative defects were identified.
This study has certain limitations. First, a panel-based approach means that this study is more effective at identifying the relative prevalence of changes in known genes in this cohort. Whole-exome and whole-genome sequencing approaches are better for gene discovery. As costs reduce, analysis becomes more robust and more accurate counseling can be provided about the risk of incidental findings, these technologies are becoming the first-line strategy in many situations, especially where multiple family members are affected. It remains possible that nongenomic events such as methylation defects or complex gene-environment interactions may cause some forms of DSD, especially complete testicular dysgenesis where the diagnostic rate is still relatively low. Second, functional assays are not well established for several of these factors, although we think that the genetic evidence for causality presented here is strong. Finally, studying a predominantly adult cohort means that access to historic data or family members is limited. However, it is increasingly being recognized that adult DSD services need to be established in parallel with multidisciplinary pediatric services, so this work provides useful insight into the range of diagnoses that might be made.

Conclusion
NGS approaches are improving the diagnostic yield in individuals with complete and partial forms of gonadal dysgenesis, or in those who have been labeled as having "partial androgen insensitivity" previously. Reaching a specific genetic diagnosis can inform genetic counseling, particularly for families with ongoing consanguinity, and can help to identify associated comorbiditites. Lastly, as we gain more information on the life course of adults with DSD, so we can seek correlations between genotype and phenotype especially with regard to late onset features such as osteoporosis.