Three‐Dimensional Model of Human Nicotinamide Nucleotide Transhydrogenase (NNT) and Sequence‐Structure Analysis of its Disease‐Causing Variations

ABSTRACT Defective mitochondrial proteins are emerging as major contributors to human disease. Nicotinamide nucleotide transhydrogenase (NNT), a widely expressed mitochondrial protein, has a crucial role in the defence against oxidative stress. NNT variations have recently been reported in patients with familial glucocorticoid deficiency (FGD) and in patients with heart failure. Moreover, knockout animal models suggest that NNT has a major role in diabetes mellitus and obesity. In this study, we used experimental structures of bacterial transhydrogenases to generate a structural model of human NNT (H‐NNT). Structure‐based analysis allowed the identification of H‐NNT residues forming the NAD binding site, the proton canal and the large interaction site on the H‐NNT dimer. In addition, we were able to identify key motifs that allow conformational changes adopted by domain III in relation to its functional status, such as the flexible linker between domains II and III and the salt bridge formed by H‐NNT Arg882 and Asp830. Moreover, integration of sequence and structure data allowed us to study the structural and functional effect of deleterious amino acid substitutions causing FGD and left ventricular non‐compaction cardiomyopathy. In conclusion, interpretation of the function–structure relationship of H‐NNT contributes to our understanding of mitochondrial disorders.


Introduction
Nicotinamide nucleotide transhydrogenase (NNT, MIM #607878), a widely expressed integral protein of the inner mitochondrial membrane [Arkblad et al., 2002], catalyzes the transhydrogenation between NADH and NADP+ and the proton translocation across the mitochondrial membrane. Because of its role in maintaining the redox balance, NNT has a crucial role in defence against oxidative stress [Circu and Aw 2010]. Variations in NNT have recently been reported in patients with familial glucocorticoid deficiency (FGD, MIM#614736), a rare, life-threatening condition [Meimaridou et al., 2012;Yamaguchi et al., 2013;Novoselova et al., 2015] and in one patient with combined adrenal failure and testicular adrenal rest tumor [Hershkovitz et al., 2015]. Moreover, a reduction in human NNT (H-NNT) activity has been reported in the heart of patients with heart failure [Sheeran et al., 2010;Bainbridge et al., 2015] and knockout animal models suggest a role for Nnt in impaired insulin secretion, diabetes mellitus, and obesity [Freeman et al., 2006;Rydström 2006;Andrikopoulos 2010;Heiker et al., 2013]. Studies in mice with Nnt deletion and acute pulmonary infection by Streptococcus pneumoniae suggest that Nnt can also be a regulator of the macrophage-mediated inflammatory response and the defence against pathogens [Ripoll et al., 2012].
The increase in reactive oxygen species resulting from perturbation of the redox balance (oxidative stress) can lead to cell damage and apoptosis [Circu and Aw 2010], which contribute to aging, neurological disorders [Uttara et al., 2009], and cancer [Reuter et al., 2010]. NNT is, thus, a strong candidate in the search for novel genes involved in the pathogenesis of the above conditions. Several novel genetic variations are likely to be identified in NNT in the near future because of sequencing efforts such as the 100K genomes project. Understanding the structure and function of NNT is, therefore, of paramount importance in the prioritization and characterization of novel NNT genetic variations.
NNT exists as a homodimer [Pedersen et al., 2008]. Each NNT monomer is characterized by three domains: the NAD(H) binding domain (domain I); the transmembrane domain (domain II), which is responsible for anchoring NNT to the mitochondrial inner membrane and for proton translocation; and the NADP(H) binding domain (domain III). Domains I and III are located on the matrix side in mitochondria and on the cytoplasmic side in bacteria [Cotton et al., 2001]. Hydride transfer between NAD(H) and NADP(H) occurs at the interface between domains I and III and is coupled with proton translocation across the proton canal [Pedersen et al., 2008]. The crystal structure of domain I, with and without NAD(H), has been determined in Rhodospirillum rubrum [Prasad et al., 2002] and Escherichia coli [Johansson et al., 2005], whereas that of domain III has been determined in bovine [Prasad et al., 1999] and human transhydrogenase. Moreover, the structure of the domain I-III complex was determined in R. rubrum [Cotton et al., 2001]. A recent breakthrough has been the 6.9Å crystal structure of the Thermus thermophilus transhydrogenase (Tt-NNT) homodimer [Leung et al., 2015], which clarified how the three domains are positioned in relation to one another. Moreover, it showed the 180° flipping of domain III relative to domain I, which allows the NAD(H) and NADP(H) binding sites to come into close contact and exchange hydride.
We present a structural model of the H-NNT protein and a structural analysis of NNT deleterious amino acid substitutions causing FGD. Moreover, we present the results of a comparative sequence analysis among different species and a structural annotation of evolutionarily conserved motifs, for the purpose of identifying residues that are unlikely to tolerate substitutions.
A structural model of NNT domains was generated using the Phyre2 prediction program [Kelley et al., 2015]. Although the structure of the holo Tt-NNT was available [Leung et al., 2015], its resolution is very low and not appropriate for homology modeling. We, therefore, first modeled individual domains using the best template structures available. Modeling of individual domains was performed automatically using Phyre2 [Kelley et al., 2015], as detailed in Supp. Methods. A list of all templates used to generate the H-NNT model is presented in Table 1. The percentage relative surface accessibility area (rASA) of each amino acid was calculated by dividing its total surface area by that in the extended conformation (φ = ψ = 180°) of the Gly-X-Gly tripeptide. Residues were defined as solvent accessible if rASA > 7%, otherwise as buried [Worth and Blundell 2010]. Residues were assigned to the interface when at least one of their atoms was within 5Å of an atom on the interacting protein. Interface residues were defined as "interface core" when they became buried upon interaction; otherwise they were defined as "interface rim" [Chakrabarti and Janin 2002]. Disordered regions were predicted using the DISOPRED2 prediction program [Ward et al., 2004]. 3DLigandSite was used to predict residues interacting with dinucleotide coenzymes [Wass et al., 2010].
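The rASA definition and the 7% cut-off described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the surface areas in the example are made-up numbers rather than values from the H-NNT model. The reference area for the extended Gly-X-Gly tripeptide is passed in by the caller, since the source does not list the scale that was used.

```python
BURIED_THRESHOLD = 7.0  # percent rASA; cut-off stated in the text


def relative_asa(observed_area, extended_area):
    """Return rASA in percent: observed area / extended Gly-X-Gly area * 100."""
    if extended_area <= 0:
        raise ValueError("extended_area must be positive")
    return 100.0 * observed_area / extended_area


def classify_accessibility(rasa_percent, threshold=BURIED_THRESHOLD):
    """Label a residue 'accessible' if rASA > threshold, otherwise 'buried'."""
    return "accessible" if rasa_percent > threshold else "buried"


if __name__ == "__main__":
    # Illustrative areas in square angstroms (hypothetical, not from the model).
    rasa = relative_asa(observed_area=12.0, extended_area=200.0)
    print(rasa, classify_accessibility(rasa))
```

A residue sitting exactly at 7% is treated as buried here, because the text defines accessible residues by a strict inequality (rASA > 7%).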
For structural analysis, the following structural elements were considered: (i) salt bridges, defined as at least one pair of atoms on oppositely charged groups within a 4.5Å distance; (ii) hydrogen bonds (H-bond), defined as a donor-acceptor distance ≤2.5Å and an angle at the acceptor ≥90°; (iii) pi-pi stacking, defined as an interaction between two aromatic rings, where the distance between the ring centroids is <4.4Å and the angle between ring planes is <30° (face-to-face), or the distance is <5.5Å and the angle is between 60° and 120° (edge-to-face); (iv) disulfide bridges (S-S bridge), defined as the side chains of two cysteines at a 3.0Å distance. Pairs of cysteines at a greater distance were also considered as potentially forming an S-S bridge when their Cα-Cα distance was <10Å. We used the Cα distance to allow for errors in side-chain placement and a relatively high threshold to accommodate possible deviation of the backbone from the native structure.
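The geometric criteria for salt bridges and pi-pi stacking can be written as simple predicates. The sketch below is illustrative only: the function names and the toy coordinates are assumptions, and the edge-to-face angle window is read as an angle between 60° and 120°. How the charged-group atoms, ring centroids, and ring-plane angles are extracted from a structure is left out.

```python
import math

SALT_BRIDGE_CUTOFF = 4.5  # angstroms, from the text


def is_salt_bridge(positive_atoms, negative_atoms, cutoff=SALT_BRIDGE_CUTOFF):
    """True if any atom pair across the two charged groups is within cutoff."""
    return any(
        math.dist(p, n) <= cutoff
        for p in positive_atoms
        for n in negative_atoms
    )


def classify_stacking(centroid_distance, plane_angle):
    """Classify an aromatic ring pair from centroid distance (angstroms)
    and inter-plane angle (degrees); return None if neither criterion holds."""
    if centroid_distance < 4.4 and plane_angle < 30:
        return "face-to-face"
    if centroid_distance < 5.5 and 60 < plane_angle < 120:
        return "edge-to-face"
    return None


if __name__ == "__main__":
    # Toy coordinates (hypothetical), given as (x, y, z) in angstroms.
    print(is_salt_bridge([(0.0, 0.0, 0.0)], [(0.0, 0.0, 4.0)]))
    print(classify_stacking(4.0, 10.0))
    print(classify_stacking(5.0, 90.0))
```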
The NNT electrostatic potential was calculated using PBEQ program [Jo et al., 2008], which computes the protein electrostatic potential by solving the Poisson-Boltzmann equation. The presence of a signal peptide was investigated using SignalP 4.1 [Petersen et al., 2011]. The presence of a mitochondrial targeting sequence was predicted using TargetP 1.1 [Emanuelsson et al., 2007] and Predotar [Small et al., 2004]. Three variant phenotype prediction servers SIFT [Kumar et al., 2009], Polyphen2 [Adzhubei et al., 2010] and Suspect [Yates et al., 2014] were used to assess the damaging nature of single amino acid variations (SAVs). The logo was generated using Weblogo [Crooks et al., 2004].
The amino acid sequences for NNT homologs were retrieved from UniProt and Ensembl. Homologs were selected among those with high-coverage sequencing and complete genome assembly, from species spanning a broad evolutionary distance. For bacterial genomes, the sequences of the different elements of the operons were concatenated. The sequences were aligned using the G-INS-i algorithm implemented in the MAFFT package (global-global alignment) [Katoh and Standley 2013]. The alignment robustness was checked using GUIDANCE2 [Penn et al., 2010]. Amino-acid conservation profiles were analyzed using ConSurf. The sequence alignment was generated using ESPript 3.0 [Robert and Gouet 2014].
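To make the idea of a per-site conservation profile concrete, the sketch below scores each alignment column by the fraction of non-gap sequences that share the most common residue. This is a much simpler stand-in for illustration, not the method used by ConSurf, and the toy alignment is hypothetical rather than taken from the NNT homologs.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of non-gap sequences sharing the commonest residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


def invariant_columns(alignment):
    """0-based indices of columns where every non-gap residue is identical."""
    return [i for i, s in enumerate(column_conservation(alignment)) if s == 1.0]


if __name__ == "__main__":
    toy_alignment = ["GAGVLG", "GSGLIG", "GAG-MG"]  # hypothetical sequences
    print(column_conservation(toy_alignment))
    print(invariant_columns(toy_alignment))
```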

Overview of Human NNT
H-NNT is encoded by a single gene, whereas bacterial transhydrogenases are encoded by two (E. coli, UniProt IDs P07001 and P0AB67, GenBank: CAB37089.1 and CAB37090.1, respectively) or three genes (T. thermophilus, UniProt IDs Q72GR8, Q72GR9, Q72GS0, GenBank: AAS82122.1, AAS82121.1, AAS82120.1, respectively). Nevertheless, the structural organization of the NNT protein is similar across different species, with domain III always at the C-terminal part of the protein [Cotton et al., 2001;Leung et al., 2015]. H-NNT comprises three domains (Fig. 1): the N-terminal domain (domain I: residues 1-474), located in the mitochondrial matrix and containing a predicted mitochondrial targeting peptide (residues 1-39) and the NAD binding site; a transmembrane domain (domain II: residues 475-880), which spans the inner mitochondrial membrane; and a C-terminal domain (domain III: residues 881-1086), which contains the NADP binding site.
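The domain boundaries given above lend themselves to a small lookup that assigns any H-NNT residue position to its domain, which is convenient when annotating variants. The sketch below uses only the residue ranges stated in the text; the function names are hypothetical.

```python
# Domain boundaries (1-based, inclusive) as stated in the text.
DOMAINS = [
    ("domain I", 1, 474),
    ("domain II", 475, 880),
    ("domain III", 881, 1086),
]
TARGETING_PEPTIDE = (1, 39)  # predicted mitochondrial targeting peptide


def domain_of(position):
    """Return the H-NNT domain name containing a residue position."""
    for name, start, end in DOMAINS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside residues 1-1086")


def in_targeting_peptide(position):
    """True if the position lies in the predicted targeting peptide."""
    start, end = TARGETING_PEPTIDE
    return start <= position <= end


if __name__ == "__main__":
    for pos in (27, 193, 862, 977):
        print(pos, domain_of(pos), in_targeting_peptide(pos))
```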
The structural model of NNT was built by template-based modeling. A total of 924 (85%) of H-NNT residues were modeled (for 62 of these residues no coordinates were found in the template and their predicted secondary structure was automatically generated by Phyre2). For 162 residues, no template was available and no secondary structure was generated. Forty-one of these residues were predicted to be disordered (the detailed sequence-structure alignment is presented in Supp. Table S1). Each of the three domains was modeled using a different structural template (detailed in Table 1) and the structure of the individual domains is discussed separately. A multiple sequence alignment (MSA) between NNT and its close homologues is presented in Supp. Fig. S1 and the degree of conservation of the amino-acid sites between NNT and its close homologues is presented in Supp. Table S2. The Phyre2 confidence score, which reflects the probability that the chosen template is correct, was 100% in all cases.

NNT domain I
The 3D structure of domain I (residues Val59-Lys442) was modeled using the NAD transhydrogenase experimental structures of E. coli (PDB: 2bru, sequence identity with human sequence = 56%). This was preferred to that of T. thermophilus (PDB: 4izh, sequence identity = 41%) because of the higher sequence identity to H-NNT. The quality of the model was overall good (PROCHECK Ramachandran statistics: 1% in disallowed regions, 85% residues in favored regions and 14% in additionally allowed regions; ProSA Z-score -8.48, which is within the range of scores typically found for native proteins of similar size; Supp. Fig. S2). Superposition between H-NNT domain I and E. coli NNT template demonstrated that the two structures have the same overall conformation with minimal structural differences (r.m.s.d. = 0.60Å).
Several salt bridges, which are likely to contribute to structural stabilization of domain I, were identified: Glu64-Lys127, Lys127-Glu141, and Asp312-Lys338, which are also present in the E. coli template structure and Arg254-Glu310 and His210-Glu361, which appear to be specific to H-NNT. A structural analysis demonstrated the presence of a beta-alpha-beta fold (Rossmann fold), which is typically present in domains binding dinucleotide coenzymes [Kutzenko et al., 1998]. In particular, residues 234-239 formed the consensus sequence Gly-X-Gly-X-X-Gly located at the beta-alpha junction of the beta-alpha-beta fold [Hanukoglu 2015].
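A consensus such as Gly-X-Gly-X-X-Gly can be located in a sequence with a regular expression. The sketch below is a generic illustration: the function name is hypothetical and the toy sequence is invented, not the H-NNT sequence. The pattern is a parameter, so a variant consensus can be searched in the same way.

```python
import re


def find_motif(sequence, pattern="G.G..G"):
    """Return (1-based start, matched text) for every occurrence of pattern.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "MKVAVLGAGSAGLQAI"  # hypothetical, for illustration only
    print(find_motif(toy_sequence))
    print(find_motif("GSGAAV", pattern="G.G..[AV]"))
```

Positions are reported 1-based to match the residue numbering convention used for proteins.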
NAD was modeled with H-NNT domain I using 3DLigandSite, and residues that are predicted to interact with NAD are presented in Figure 2 and Supp. Figure S3. Because of the conformational changes occurring at the NAD binding site (as described below), the participation of additional H-NNT residues in NAD binding cannot be excluded. Amino acids 319-323 (residues LIPGK, which are located within a loop) and Arg182 are conserved between human and E. coli, as shown by sequence alignment (Supp. Figures S4A and S4B). Studies in E. coli show that these residues are crucial for NNT function. Comparison of experimental structures of E. coli domain I, determined both in the absence and presence of NAD and NADH, demonstrates that the side chains of Arg120, Leu257, and Pro262 (corresponding to Arg182, Leu319, and Pro321 in H-NNT, respectively) adopt different conformations according to the presence or absence of NAD and NADH. When domain I is bound to NAD, the side chain of Arg120 points toward NAD and interacts strongly with it, whereas in the apo domain I (with no substrate) and in the domain I-NADH complex, the side chain points in the opposite direction, away from NADH. Similarly, Leu257 and Pro262 can adopt at least three different conformations, according to the presence of NAD or NADH [Johansson et al., 2005]. Residues 181-194 form the so-called RQD loop. Residues Arg182, Val183, Thr184, and Gln187 are evolutionarily conserved and are predicted to interact with NAD (Fig. 2). In addition, Asp190 is located 3.8Å from the NAD C4 atom and could participate in NAD binding. Ser193 is also an invariable residue and is part of the enzyme catalytic site, which also involves the invariable Arg182, Gln187, and Asp190 (Supp. Fig. S5).
Residues 211-229 are arranged in a beta hairpin fold. Residues Phe211 and Phe215 are evolutionarily conserved, as demonstrated by the MSA and are predicted to form stacking interactions, thus, stabilizing the beta hairpin structure. The beta hairpin fold is a conserved structural feature of NNT as it is present in bacterial transhydrogenases. This suggests that the beta hairpin fold is involved in human NNT catalytic activity, similar to what is observed in bacteria [Johansson et al., 2005]. In addition, the beta hairpin fold participates in domain I homodimerization, as detailed below.
The last 35 amino acids of domain I (residues 437-472) are predicted to be disordered and no alpha helices or beta strands were predicted. This short polypeptide is likely to represent a linker between domains I and II. The linker is likely to be flexible, with the exception of its initial sequence (residues Pro437, Ala438, Pro439, Thr440, and Pro441), which is predicted to have a rigid structure. The MSA showed that residues P437, A438, and P439 are highly conserved among different species.
HUMAN MUTATION, Vol. 37, No. 10, 1074-1084, 2016

NNT domain II
Different species have different numbers of the transmembrane α helices (TM) in the transmembrane domain (domain II): 14 TMs in H-NNT, 13 TMs in E. coli, and 12 TMs in Tt-NNT. H-NNT domain II (residues Leu496-Met581 corresponding to TMs 2-4 and residues Asn618-Ile888 corresponding to TMs 6-14) was modeled using the Tt-NNT structure (PDB 4o9p, sequence identity to human protein = 33%; r.m.s.d. between template and model = 0.63Å). This allowed modeling of 349 out of 405 (86%) residues. TM1 and TM5 were not modeled due to the presence of only twelve TM helices in Tt-NNT (a cartoon of H-NNT domain II is presented in Supp. Fig. S6). The model accuracy was overall good (PROCHECK: Ramachandran statistics: 0% in disallowed regions, 94% residues in favored regions and 5% in allowed regions; ProSA Z-score -4.89, which is within the range of scores typically found for native proteins of similar size; see Supp. Fig. S7).
One of the most important features of domain II is the presence of the proton translocation canal. In T. thermophilus, the proton canal is formed by TM helices 2, 3, 7, 8, 11, 12, and has a hexagonal topology [Leung et al., 2015]. The same hexagonal topology is seen in the H-NNT proton translocation canal, which is formed by TM helices 3, 4, 9, 10, 13, and 14.
Mutagenesis studies in E. coli have identified residues within the proton canal that are essential for proton translocation through the mitochondrial membrane. Structural alignment between the human model and the bacterial templates allowed mapping and identification of these functionally and structurally important residues in H-NNT. Residues His707 (in TM9), Ser756 (in TM10), and Asn839 (in TM13) form part of the canal's interior face. They are highly conserved and correspond to residues His89, Ser139 and Asn222 in E. coli, respectively. Mutational studies in E. coli showed that these residues are essential for proton translocation activity [Holmberg et al., 1994;Bragg and Hou 2001]. This crucial function is likely to be retained by the corresponding residues in H-NNT. Alignment of TM helices forming the proton canal showed a high level of conservation at sequence and structural level. The conserved His524 (H450 in E. coli) is part of the loop between TM2 and TM3, which is located in the mitochondrial matrix. In E. coli, its substitution results in a markedly reduced proton translocation activity [Holmberg et al., 1994]. An additional residue, which, when substituted, inhibits the enzyme function possibly by altering its structure, is the conserved Met876 (Met259 in the beta subunit of E. coli) [Karlsson et al., 2003]. Although the majority of substitutions result in an inactive enzyme, it is worth noticing that substitution of the conserved Ser867, Ser868, and Ser873 (Ser250, Ser251, and Ser256 in E. coli, respectively) located in TM14 have been shown to enhance NNT activity possibly through an allosteric effect mediated by TM14 on NADP binding [Yamaguchi and Stout 2003;Karlsson et al., 2003].
Inhibition of NNT function can also result from amino acid substitutions that affect the tight packing of domain II alpha helices, which is driven by the interaction between the side chains of glycine and isoleucine/valine (the so-called "groove-ridge system"). Because of its special role in the packing of helices, glycine substitution with any other amino acid can result in profound structural modification and NNT malfunction. Accordingly, in vitro studies have demonstrated that substitution of the invariant Gly711 (TM9), Gly749 (TM10), Gly755 (TM10), Gly843 (TM13), and Gly850 (TM13) impairs proton translocation and transhydrogenation activity (Gly95, Gly132, Gly138, Gly226, and Gly233 in E. coli, respectively) [Yamaguchi and Stout 2003]. Moreover, substitution of the invariant Gly862, Gly866, and Gly869 (Gly245, Gly249, and Gly252 in the E. coli beta subunit, respectively), which form part of the last TM helix (TM14, which completes the hexagonal structure of the proton translocation canal), has been shown to reduce NNT function [Yamaguchi and Stout 2003;Karlsson et al., 2003], possibly through disruption of the groove-ridge system, which contributes to tight packing of TM14 against TM13 in human and bacterial transhydrogenases.
A partially disordered linker connects domains II and III (residues 880-912, of which residues 893-912 are predicted to be disordered). The disordered nature of the linker allows the 180° flip of domain III relative to domains I and II described by Leung et al. (2015). Of great interest are the highly conserved residues Asp830 (located in the cytoplasmic loop of domain II, between TMs 12-13, at the opening of the proton canal toward the mitochondrial matrix) and Arg882 (located within the linker). These two residues form a salt bridge, which is a conserved feature across species (a salt bridge is formed by the equivalent residues Asp202 and Arg254 in Tt-NNT [Leung et al., 2015] and βAsp213-βArg265 in E. coli [Althage et al., 2001]). In line with the conformational changes adopted by domain III in relation to its functional status, residues Asp830 and Arg882 have been shown to interact in the absence of NADP, but not in its presence [Althage et al., 2001]. Moreover, substitution of the equivalent residues in E. coli alters NNT proton translocation activity [Yamaguchi and Hatefi 1995]. The holo Tt-NNT solved with NADP [Leung et al., 2015] shows that the salt bridge formed by these two residues is adjacent to the entrance of the proton canal and that, in the face-down conformation (which is competent for proton translocation), the NADP binding site in domain III comes into close contact with the salt bridge and the proton canal. Residues Asp830 and Arg882 in H-NNT are likely to have the same crucial functional and structural role as described for the equivalent residues in Tt-NNT and E. coli.

NNT domain III
The NNT domain III (residues Pro902-Ser1083) bound to NADP was crystallized from human heart mitochondria in 2000 (PDB id: 1djl) [White et al., 2000], and an extensive structural analysis of this domain is described by White et al. (2000). We will therefore limit our analysis to the identification of structurally and/or functionally important residues in this domain.
Five salt bridges can be identified in this domain and are likely to help stabilize its structure (Asp915-Lys1079, His995-Asp996, Glu1018-Lys1059, Glu1031-Lys1034, Lys1070-Asp1074). Structurally, the NADP binding site is characterized by the Rossmann fold and residues 932-937 form the consensus sequence Gly-X-Gly-X-X-Ala/Val. The residues forming the NADP cleft, which makes contact with NADP, are shown in Figure 3.

H-NNT dimeric structure and open challenges
NNT assembles in the inner mitochondrial membrane as an asymmetric homodimer [Leung et al., 2015]. We used the H-NNT model to identify the homodimerization interface sites for domain I. In order to construct the homodimer structure for H-NNT domain I, the E. coli domain I complex was used as a template (PDB: 2bru, dI2-dIII structure). Two monomeric chains of H-NNT were superposed onto the E. coli dimeric structure and rotated using the E. coli complex coordinates. Superposition of equivalent domain I Cα atoms (r.m.s.d. = 0.80Å) between human and E. coli NNT demonstrated a high level of similarity between model and template. As expected, manual comparative analysis showed that the highest level of structural divergence was located within loop regions. The model allowed calculation of the 3D atomic coordinates of residues forming the domain I-domain I interface site for H-NNT. Residues forming this large protein-protein interface and their conservation score are reported in Figure 4. The majority of residues forming the beta hairpin fold were shown to be part of this large interface (Fig. 4A). Among these, Phe211, Gly212, Phe214, Phe215, Ala222, and Ala228 were predicted to be interface core residues, as they shifted from solvent accessible to fully buried (solvent inaccessible) upon dimer formation. Identification of interface core residues is of particular relevance, since they play a major role in protein-protein interaction and binding affinity [Chakrabarti and Janin 2002] and are a hot spot for disease-causing variations [David et al., 2012]. Although mutational studies in E. coli have shown that deletion of the beta hairpin does not impair the formation of the domain I dimer [Johansson et al., 2005], the interface residues that are part of the beta hairpin could still play a role in modulating the affinity of domain I dimer formation.
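The interface assignment and the core/rim distinction used in this analysis can be sketched as follows, assuming the criteria given in the methods: a residue is at the interface if any of its atoms lies within 5Å of an atom on the partner chain, and it is a core residue if it goes from solvent accessible to buried (rASA of 7% or less) on dimer formation. The function names, residue labels, coordinates, and rASA values are hypothetical.

```python
import math

INTERFACE_CUTOFF = 5.0  # angstroms
BURIED_THRESHOLD = 7.0  # percent rASA


def interface_residues(chain_a, chain_b, cutoff=INTERFACE_CUTOFF):
    """Return ids of chain_a residues with any atom within cutoff of chain_b.

    Each chain maps a residue id to a list of (x, y, z) atom coordinates.
    """
    partner_atoms = [atom for atoms in chain_b.values() for atom in atoms]
    return {
        res_id
        for res_id, atoms in chain_a.items()
        if any(math.dist(a, b) <= cutoff for a in atoms for b in partner_atoms)
    }


def core_or_rim(rasa_monomer, rasa_dimer, threshold=BURIED_THRESHOLD):
    """'core' if accessible in the monomer and buried in the dimer, else 'rim'."""
    if rasa_monomer > threshold and rasa_dimer <= threshold:
        return "core"
    return "rim"


if __name__ == "__main__":
    # Toy chains with one atom per residue (hypothetical coordinates).
    chain_a = {"res1": [(0.0, 0.0, 0.0)], "res2": [(20.0, 0.0, 0.0)]}
    chain_b = {"res1'": [(3.0, 0.0, 0.0)]}
    print(interface_residues(chain_a, chain_b))
    print(core_or_rim(rasa_monomer=35.0, rasa_dimer=2.0))
```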
The NNT transmembrane domain (domain II) has been crystallized as a dimer in T. thermophilus [Leung et al., 2015]. We generated the H-NNT domain II dimer using bacterial transhydrogenase 3D coordinates as a template (PDB 4o9t, sequence identity 34%). Sequence and structural alignment showed that H-NNT TM2, TM3, and TM4 correspond to Tt-NNT TM1, TM2, and TM3. Structural superposition between model and template shows that several residues on opposite TM1s in Tt-NNT and the equivalent TM2s in H-NNT can potentially take part in hydrophobic interactions. Nevertheless, the presence of two additional helices in H-NNT (TM1 and TM5, which could not be modeled due to the absence of a suitable template) could result in a different placement of TM2 within the inner mitochondrial membrane and therefore a different role in domain II dimerization. Moreover, participation of additional residues, located on TM1 and/or TM5, in the H-NNT domain II interface site cannot be excluded. A cartoon of the E. coli NNT domain II (which contains 13 TM helices), generated in 2001 based on literature data, suggests that TM1 may be located next to TM2, thus in close proximity to the domain II dimerization site [Meuller et al., 2001]. Nevertheless, in this cartoon, the spatial location of the helices surrounding the proton canal appears to be different from what was reported by Leung et al. using […] to contribute to the hexagonal topology of the proton canal. Because of this discrepancy, and in the absence of a solved structure for E. coli domain II, it is difficult to speculate on the location of TM1 in relation to other helices and to the dimerization site in H-NNT.

Figure 4. A: […] of the large interface site (in green) formed by two domain I monomers (presented in light and dark gray, respectively). The NAD binding sites, one on each monomer, are presented in orange. B: Residue His365, which harbors the p.His365Pro substitution, is a rim interface residue and is predicted to interact with Asp104 on the opposite domain I. C: Phe215, which harbors the p.Phe215Ser substitution, is a conserved interface core residue. Phe215 is predicted to form stacking interactions with Phe211 on the same chain. The name (res) and position on the NNT amino acid sequence (pos) of residues predicted to form the core and rim of the domain I dimer interface are also presented.

Sequence and structural analysis of NNT deleterious variations
The availability of the H-NNT model allowed analysis of the structural/functional consequences of deleterious NNT amino acid substitutions and their predicted effect on H-NNT is detailed below and in Table 2, Supp. Figure S1, and Supp. Figure S8A-Q.
p.Ser193Asn, homozygous in an FGD patient [Meimaridou et al., 2012]. Ser193 is an invariable residue that is part of the catalytic site of the proton-translocating transhydrogenase, which also involves the invariable residues Arg182, Gln187, and Asp190 (Supp. Fig. S5). Substitution of Ser138 in R. rubrum, which is equivalent to Ser193 in H-NNT, leads to inhibition of the transhydrogenation rate without changing the binding affinity of the transhydrogenase for NAD [Brondijk et al., 2006].
p.Gly200Ser, homozygous in patients with combined mineralocorticoid and glucocorticoid deficiency [Weinberg-Shukron et al., 2015]. Position 200 is an invariant Gly. It is located in an alpha helix, tightly packed against the alpha helix of Rossmann fold in domain I. Substitution of Gly to Ser is predicted to create a steric clash with this important structural motif. Moreover, this substitution introduces a hydrophilic residue in an otherwise hydrophobic environment.
p.Phe215Ser, homozygous in FGD patient [Yamaguchi et al., 2013]. Phe215 is a core residue of the dimerization interface of domain I. It is at the beginning of the beta hairpin structural motif. Phe215 stabilizes bacterial and human NNT structure by forming a stacking interaction with Phe211 on the same chain (Fig. 4B). Phe is a large, hydrophobic amino acid. Its substitution with serine would introduce a polar and much smaller amino acid, not able to form stacking interactions. This substitution is predicted to affect the H-NNT structure and in particular the dimerization site.
p.Asp277Tyr, identified in a patient with left ventricular noncompaction (LVNC, MIM#604169) [Bainbridge et al., 2015]. This position is predicted to be part of a loop in domain I. The loop in human NNT is longer than the corresponding loop in bacterial transhydrogenase and therefore a structural analysis could not be performed. Position 277 is not conserved; nevertheless, tyrosine is never observed in the MSA. This substitution is predicted to be deleterious by prediction programs (Table 2) and functional studies demonstrated that it causes a partial loss of NNT function in zebrafish larvae [Bainbridge et al., 2015].
p.Thr357Ala, found in compound heterozygosity in an FGD patient in addition to p.Met880Ter, which is predicted to cause nonsense-mediated NNT mRNA decay [Meimaridou et al., 2012]. Thr357 is a hydrophilic, highly conserved residue on the surface of bacterial and human NNT and participates in hydrogen bonding. Its substitution with the hydrophobic alanine disrupts the hydrogen bonding. This substitution is predicted to be damaging by SIFT and Suspect (scores 0 and 75, respectively) but not by Polyphen2 (score = 0.204).

Table 2. Conservation Score and Prediction Scores from Three Variant Prediction Servers for H-NNT Deleterious Amino Acid Substitutions and Rare Genetic Variations Reported in the EXAC Database
[…] probability that a mutation is deleterious and classifies it accordingly, in "possibly" or "probably" deleterious or tolerated. *, unreliable estimate due to high number of gaps in alignment. S Fig, Supp. Figure.
p.His365Pro, homozygous in an FGD patient [Meimaridou et al., 2012]. His365 is part of H-NNT domain I homodimerization site. It is an interface rim residue. His365 is likely to form a salt bridge with Asp104 on the other chain of the dimer (Fig. 4C). His365 is part of a loop and its substitution with proline is likely to be structurally damaging, as it prevents formation of the salt bridge and can introduce a rigid structure in an otherwise short flexible motif. Interestingly, although His365 and Asp104 are not conserved, the salt bridge at this position is. In E. coli, a salt bridge can be formed by residues Glu304 and Lys48 corresponding to His365 and Asp104, respectively. This substitution is predicted possibly damaging by Polyphen2 but benign by SIFT and Suspect (Table 2).
p.Tyr388Ser, reported in a patient with combined adrenal failure and testicular adrenal rest tumor [Hershkovitz et al., 2015]. Tyr388 is located in domain I outside the NAD binding cleft and the dimerization site. Tyr388 is likely to form a stacking interaction with Phe154, thus helping stabilizing the structure of domain I, similarly to what is observed in bacterial NNT (Tyr327 and Phe92, respectively). Substitution of tyrosine with serine is predicted to alter the structural stability of domain I.
p.Pro437Leu, found in compound heterozygosity in an FGD patient in addition to p.Gln557Ter, which is predicted to cause nonsense-mediated NNT mRNA decay [Meimaridou et al., 2012]. Pro437 is part of the highly conserved amino acid sequence PAP in the linker between domains I and II. Substitution with any other amino acid is likely to be structurally and functionally damaging.
p.Ala533Val, homozygous in an FGD patient [Meimaridou et al., 2012]. Ala533 is a highly conserved residue and contributes to the formation of an alpha helix (TM3 in domain II). Substitution with valine is predicted to cause structural damage, as valine is a poor alpha helix forming residue [Gregoret and Sauer 1998]. Moreover, substitution with valine is predicted to create a steric clash with the surrounding residues.
p.Gly664Arg, found in compound heterozygosity in an FGD patient in addition to p.Thr689LeufsTer320, which is predicted to cause nonsense-mediated NNT mRNA decay [Meimaridou et al., 2012]. Gly664 is an evolutionarily conserved residue in TM7 of the NNT transmembrane domain. This variation replaces the smallest in size, hydrophobic, neutral residue with the largest in size, hydrophilic, charged residue. These large physico-chemical changes are likely to be structurally damaging, especially since this amino acid change occurs in a hydrophobic environment.
p.Gly678Arg, found in a patient with FGD in compound heterozygosity with the p.Gly862Asp change described below [Meimaridou et al., 2012]. Gly678 is located in TM8 of the transmembrane domain. The MSA shows that Gly678 is not evolutionarily conserved. Nevertheless, this position is generally occupied by hydrophobic residues (valine, leucine, glycine, and alanine), which is consistent with the hydrophobic environment in which this residue is located. Similarly to the previous variation, Gly678Arg is therefore predicted to be structurally damaging.
p.Gly862Asp. Gly862 is an invariant residue located in TM14 of the transmembrane domain. It is part of an alpha helix, which is tightly packed with other TMs. Substitution of glycine with aspartic acid introduces major physico-chemical changes (aspartic acid is a charged, hydrophilic residue). Moreover, introduction of aspartic acid is likely to create a steric clash with nearby TMs and is thus predicted to be structurally damaging.
p.Leu977Pro, homozygous in an FGD patient [Meimaridou et al., 2012]. Leu977 is a highly conserved residue, located in domain III, outside the NAD binding site. Leu977 is part of an alpha helix and its substitution with proline is predicted to cause a kink in the alpha helix and loss of stability, causing structural damage to domain III.
HUMAN MUTATION, Vol. 37, No. 10, 1074-1084, 2016
p.Ala1008Pro and p.Asn1009Lys. Both variations were found in patients with FGD: p.Ala1008Pro was present in homozygosity, whereas p.Asn1009Lys was in compound heterozygosity with p.His370Ter, which is predicted to cause nonsense-mediated mRNA decay [Meimaridou et al., 2012]. Ala1008 and Asn1009 are invariant residues located at the NADP cleft in H-NNT domain III, directly interacting with NADP (Fig. 3B and C). Substitution of these crucial amino acids is predicted to alter NADP binding, thus greatly affecting H-NNT function.
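The physico-chemical reasoning applied to the three glycine substitutions in the transmembrane domain can be illustrated with a minimal sketch. It compares the wild-type and substituted residues using the Kyte-Doolittle hydropathy index and the formal side-chain charge at physiological pH. The function name `describe_substitution` and the choice of these two descriptors are assumptions made for illustration; they are not the analysis pipeline used in this study, and the tables cover only the residues needed here.

```python
# Illustrative sketch only; not the method used in the study.
# Kyte-Doolittle hydropathy index (positive values = hydrophobic).
HYDROPATHY = {"Gly": -0.4, "Arg": -4.5, "Asp": -3.5}
# Formal side-chain charge at physiological pH.
CHARGE = {"Gly": 0, "Arg": 1, "Asp": -1}


def describe_substitution(wild_type, variant):
    """Summarize the hydropathy shift and whether a charge is introduced."""
    return {
        "hydropathy_change": round(HYDROPATHY[variant] - HYDROPATHY[wild_type], 1),
        "introduces_charge": CHARGE[wild_type] == 0 and CHARGE[variant] != 0,
    }


# The three transmembrane glycine variants discussed above.
for wt, pos, var in [("Gly", 664, "Arg"), ("Gly", 678, "Arg"), ("Gly", 862, "Asp")]:
    print(f"p.{wt}{pos}{var}", describe_substitution(wt, var))
```

A negative hydropathy change combined with an introduced charge is consistent with the argument that these substitutions are poorly accommodated in a hydrophobic membrane environment. Steric effects are not captured by this sketch.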

Sequence and structural analysis of NNT rare SAVs
The following rare variants (detailed in Tables 2 and 3) have been identified in homozygosity in the 1000 Genomes Project [1000 Genomes Project Consortium et al., 2015] and are reported in the ExAC database:
p.Arg27His (rs34241095) Arg27 could not be mapped onto the H-NNT structure. The initial 39 amino acids in H-NNT may represent the mitochondrial targeting peptide. This short sequence is not present in bacterial transhydrogenases. Position 27 is not conserved and p.Arg27His is predicted benign by most prediction programs (Polyphen2 = 0, SIFT = 0.13, Suspect = 24). Nevertheless, one prediction program (TargetP) suggests that this substitution may affect the mitochondrial targeting peptide. The latter has been shown to be enriched in positively charged and hydroxylated residues, but no clear consensus sequence is known to date [Habib et al., 2007]. Since the mitochondrial targeting peptide is not well characterized, it is not possible to predict whether this amino acid change is neutral or deleterious.
p.Lys63Arg (rs35201656) Lys63 is a positively charged surface residue. The position is not conserved and can be occupied by arginine (Supp. Table S2). Lys63 is neither in the domain I dimerization site nor in proximity to the NAD binding site. The p.Lys63Arg substitution is likely to be tolerated (SIFT = 0.47 tolerated, Suspect = 28 tolerated, Polyphen2 = 0.895 damaging).
p.Thr589Ser (rs370370846) Thr589 is part of a loop located in the mitochondrial matrix between helices TM4 and TM5. The loop in human NNT is longer than the corresponding loop in bacterial transhydrogenase. This position was not modeled in the H-NNT model and could not be structurally analyzed. Threonine and serine have similar chemical properties (polar, non-charged, and capable of forming hydrogen bonds) and are often interchangeable. Threonine to serine substitution is predicted tolerated by most programs (Polyphen2 = 0.376 tolerated, SIFT = 0.09 tolerated, Suspect = 15 tolerated).
p.Leu663Phe (rs41271083) Leu663 is in the TM7 alpha helix of the transmembrane domain. Its substitution with phenylalanine is predicted damaging by two programs (Polyphen2 = 1 damaging, SIFT = 0.01 damaging, Suspect = 24 tolerated). Nevertheless, leucine and phenylalanine have similar chemical properties (both are hydrophobic, uncharged amino acids) and the MSA shows that, although this position is generally occupied by leucine, other residues, such as isoleucine and phenylalanine, can also be present (Supp. Table S2). Because the H-NNT model does not include all 14 TM helices, we cannot exclude that this substitution affects the packing of TM7 with TM1 or TM5 in H-NNT.
p.Thr731Met (rs75710404) Thr731 is part of a loop located in the mitochondrial intermembrane space between helices TM9 and TM10, and its substitution is unlikely to cause any structural damage. Methionine can be found at this position in the MSA and the threonine to methionine substitution is predicted benign by prediction programs (Polyphen2 = 0.035 tolerated, SIFT = 0.07 tolerated, Suspect = 21 tolerated).
p.Ile993Val (rs78818665) Ile993 is located within an alpha helix in domain III. It is not in proximity to the NADP cleft and its substitution is not predicted to affect NADP binding. Two prediction programs predict this substitution to be not tolerated (Polyphen2 = 0.804 possibly damaging, SIFT = 0.04 damaging, Suspect = 35 tolerated). Nevertheless, isoleucine and valine have similar chemical properties and valine can be seen, although rarely, at this position in the MSA.
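The agreement between prediction programs for the six rare SAVs above can be tallied with a short sketch. The scores are copied from the text. The cut-offs are assumptions made for illustration: SIFT scores of 0.05 or below are treated as damaging, a single Polyphen2 cut-off of 0.5 is chosen only because it reproduces the labels given in the text (it is not the program's own category boundary), and a Suspect cut-off of 50 is assumed. The names `SCORES`, `is_damaging` and `damaging_votes` are hypothetical.

```python
# Illustrative sketch only; scores are taken from the text, cut-offs are assumptions.
SCORES = {
    "p.Arg27His":  {"Polyphen2": 0.0,   "SIFT": 0.13, "Suspect": 24},
    "p.Lys63Arg":  {"Polyphen2": 0.895, "SIFT": 0.47, "Suspect": 28},
    "p.Thr589Ser": {"Polyphen2": 0.376, "SIFT": 0.09, "Suspect": 15},
    "p.Leu663Phe": {"Polyphen2": 1.0,   "SIFT": 0.01, "Suspect": 24},
    "p.Thr731Met": {"Polyphen2": 0.035, "SIFT": 0.07, "Suspect": 21},
    "p.Ile993Val": {"Polyphen2": 0.804, "SIFT": 0.04, "Suspect": 35},
}


def is_damaging(tool, score):
    """Apply the assumed per-tool cut-off to a single score."""
    if tool == "SIFT":
        return score <= 0.05   # low SIFT scores indicate intolerance
    if tool == "Polyphen2":
        return score > 0.5     # assumed cut-off matching the labels in the text
    if tool == "Suspect":
        return score >= 50     # assumed cut-off
    raise ValueError(f"unknown tool: {tool}")


def damaging_votes(variant):
    """Count how many of the three programs call the variant damaging."""
    return sum(is_damaging(tool, score) for tool, score in SCORES[variant].items())


for variant in SCORES:
    print(variant, f"{damaging_votes(variant)}/3 programs predict damaging")
```

Under these assumptions, only p.Leu663Phe and p.Ile993Val are called damaging by two of the three programs, which is where the structural and MSA evidence discussed above is used to reach a final interpretation.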

Discussion
This study describes the first structural model of human NNT, an enzyme crucial in the defence of cells against oxidative stress. Defects in NNT have not only been linked to FGD and LVNC, but are also strong candidates for several other human disorders [Freeman et al., 2006;Reuter et al., 2010;Ripoll et al., 2012]. The 3D model was used to identify key functional and structural H-NNT motifs and to gain insight into the structural and functional effects of deleterious amino acid substitutions causing glucocorticoid deficiency and LVNC cardiomyopathy, as well as of rare homozygous amino acid variations.
NNT is widely expressed and is likely to contribute to the pathogenesis of a wide range of medical conditions ranging from aging to cancer [Uttara et al., 2009;Reuter et al., 2010]. Therefore, identification of residues that represent susceptible positions for disease-causing variations is important. Although amino acid evolutionary conservation is a good indicator of the structural and functional importance of residues, it cannot inform us on the effects of amino acid substitutions on a biological system. One fourth of amino acid variations in the human genome are predicted to be deleterious by the most widely used SAV prediction tools [Yue and Moult 2006;Allali-Hassani et al., 2009;Adzhubei et al., 2010]. As an amino acid substitution may alter protein fitness by affecting its folding, its location within the cell, its interaction with other molecules, or its ligand binding and catalytic activity [Yates and Sternberg 2013;Stefl et al., 2013], additional methods are required for informing SAV prioritization.
Structures of individual transhydrogenase domains, as well as of the domain I-domain III complex, have been determined for bacterial transhydrogenases [Bergkvist et al., 2000;Prasad et al., 2002] and bovine NNT [Prasad et al., 1999]. Availability of 3D structures for individual domains and complexes allowed us to generate a model for H-NNT and to demonstrate how amino acid substitutions affect H-NNT fitness through a wide range of functional and structural mechanisms.
Without the H-NNT 3D model, molecular mechanisms could only be identified for two variations (p.Ala1008Pro and p.Asn1009Lys), which are located in the NADP binding site. The H-NNT 3D model allowed predictions of residues forming the NAD cleft. This would not have been possible without the aid of a 3D model. As NNT has a crucial role in the redox process, characterization of the NAD cleft becomes of paramount importance toward the understanding of NNT function and to guide in vitro studies. Knowledge of the residues predicted to form the NAD cleft also allowed characterization of variations, such as Ser193Asn.
Another example of the value of the H-NNT 3D model was the ability to identify the domain I interface site. Disruption of protein-protein interaction, which includes a protein's ability to dimerize, has recently been demonstrated to be an important mechanism in human disease [David et al., 2012;Nishi et al., 2013;Das et al., 2014]. We showed that the H-NNT dimer has a large interface region and that at least two NNT variations may affect NNT function by altering its dimerization site. In particular, we dissected the domain I interface into "core" and "rim" residues. Core residues are of great importance in establishing and maintaining protein-protein interaction and their substitution is unlikely to be tolerated [David and Sternberg 2015], as in the case of p.Phe215Ser.
Another important mechanism by which deleterious amino acid substitutions impair NNT function is the disruption of its ability to fold correctly within the mitochondrial inner membrane. This is likely to be the case for three variations in domain II (p.Gly664Arg, p.Gly678Arg, and p.Gly862Asp). We predicted that these substitutions would alter the "groove-hinge" system, which represents the basis for the packing of transmembrane helices.
A crucial feature of NNT is the ability of domain III to flip according to the NNT functional state (proton translocation across the membrane or hydrogen transfer). Leung et al. [2015] demonstrated in the structure of bacterial transhydrogenase (which is localized in the cytosol) that domain III cycles from a face-up orientation (NADP binding site oriented toward domain I and interacting with the NAD binding site) to a face-down orientation (NADP binding site oriented toward domain II, thus away from the NAD binding site). The opposite phenomenon would occur in the second monomer, thus conferring an asymmetric structure on the NNT dimer. Moreover, they proposed that the different orientations of domain III are associated with modifications in the orientation of the domain II proton canal (inward or outward facing) and in its ability to translocate protons across the mitochondrial membrane. The high level of structural similarity between the bacterial transhydrogenase structure and the human NNT model suggests that the same mechanism is likely to occur in the human NNT dimer, as previously suggested [Krengel and Törnroth-Horsefield 2015].
The flipping of domain III requires the absence of a rigid structure in the linker between domains II and III. Our structural analysis reveals that the amino acid sequence of this linker is highly disordered, thus supporting the hypothesis that this region is highly flexible. Identification of the amino acid sequence forming the linker in human NNT would not have been possible without the availability of a 3D model. Moreover, understanding the physical properties of the linker is of particular importance when analyzing amino acid substitutions occurring in this region. There is mounting evidence of the role of intrinsically disordered regions (IDRs) in human disease. Amino acid substitutions occurring in the hinge should be evaluated for their propensity to create a disorder-to-order transition, which would affect the hinge flexibility, as is often the case for deleterious mutations occurring in IDRs. Although no deleterious NNT variations are currently known to occur in this region, potentially deleterious amino acid substitutions in the hinge may be identified in the future.
In conclusion, structural biology can provide valuable information on the protein structure-function relationship, and integration of genetic analysis with protein 3D modeling can greatly enhance prioritization and interpretation of genetic variants. Analysis of the H-NNT 3D model and structural interpretation of its deleterious amino acid substitutions represent a powerful example. Moreover, availability of the H-NNT 3D model and identification of key structural and functional residues will prove valuable, as several novel NNT genetic variations are likely to be identified in the near future, not only as a cause of adrenal disorders, but also as a risk factor for a wide range of conditions, such as aging, inflammatory response, and cancer.