Evolution of Vertebrate Solute Carrier Family 9B Genes and Proteins (SLC9B): Evidence for a Marsupial Origin for Testis Specific SLC9B1 from an Ancestral Vertebrate SLC9B2 Gene

SLC9B genes and proteins are members of the sodium/lithium hydrogen antiporter family which function as solute exchangers within cellular membranes of mammalian tissues. SLC9B2 and SLC9B1 amino acid sequences and structures and SLC9B-like gene locations were examined using bioinformatic data from several vertebrate genome projects. Vertebrate SLC9B2 sequences shared 56-98% identity as compared with ~50% identities with mammalian SLC9B1 sequences. Sequence alignments, key amino acid residues and conserved predicted transmembrane structures were also studied. Mammalian SLC9B2 and SLC9B1 genes usually contained 11 or 12 coding exons with differential tissue expression patterns: SLC9B2, broad tissue distribution; and SLC9B1, being testis specific. Transcription factor binding sites and CpG islands within the human SLC9B2 and SLC9B1 gene promoters were identified. Phylogenetic analyses suggested that SLC9B1 originated in an ancestral marsupial genome from a SLC9B2 gene duplication event.
*Corresponding author: Roger S Holmes, Eskitis Institute for Drug Discovery and School of Natural Sciences, Griffith University, Nathan QLD 4111, Australia, Tel: +61 7 37356008; E-mail: r.holmes@griffith.edu.au
Received May 18, 2016; Accepted June 03, 2016; Published June 10, 2016
Citation: Holmes RS, Spradling-Reeves KD, Cox LA (2016) Evolution of Vertebrate Solute Carrier Family 9B Genes and Proteins (SLC9B): Evidence for a Marsupial Origin for Testis Specific SLC9B1 from an Ancestral Vertebrate SLC9B2 Gene. J Phylogen Evolution Biol 4: 167. doi:10.4172/2329-9002.1000167
Copyright: © 2016 Holmes RS, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Introduction
SLC9/NHE gene family members (solute carrier 9/sodium hydrogen exchanger) contribute to the maintenance of cellular and intracellular pH homeostasis and are predominantly responsible for the absorption of Na+ ions in the kidney and the GI tract [1][2][3]. SLC9 comprises three subfamilies: SLC9A, with 9 members (SLC9A1-SLC9A9), of which SLC9A1-SLC9A5 are localized in plasma membranes and SLC9A6-SLC9A9 in subcellular organelles; SLC9B, with two members, SLC9B1 and SLC9B2 [2], which are the subject of this study; and SLC9C, with two members localized in sperm, SLC9C1 and SLC9C2 [2,4,5].
A mitochondrial inner membrane Na+/H+ exchanger (NHA2 or NHEDC2; designated as SLC9B2) was originally identified in yeast (Saccharomyces cerevisiae) and bacteria (Escherichia coli), with the human homologue showing ubiquitous expression, particularly in tissues with high mitochondrial content, including the kidney distal convoluted tubule [6,7]. SLC9B2 has been identified as a contributor to sodium-lithium counter-transport activity (SLC), and may serve as a candidate gene for essential hypertension in human populations, particularly in urban black populations [7,8]. Moreover, genetic linkage studies of SLC activity in baboons have reported association with chromosome 5, the homologue to human chromosome 4 [9]. Human SLC9B2 has also been shown to reverse the Na+/H+ exchanger-null phenotype when expressed in Na+/H+ exchanger deficient yeast [10], which supports a major role for SLC9B2 as a sodium/hydrogen exchanger. A second gene (NHA1 or NHEDC1; designated as SLC9B1) has been located in tandem with SLC9B2 on human chromosome 4 and is specifically expressed in testis in mammals [7,11]. This is in contrast to Drosophila, for which two SLC9B-like genes (designated as NHA1 and NHA2) are widely expressed in epithelial tissues and play crucial roles in organismal ion homeostasis and as Na+/H+ exchangers [12].
This paper reports the predicted gene structures and amino acid sequences for several vertebrate SLC9B2 and mammalian SLC9B1 genes and proteins, the predicted structures for mammalian SLC9B1 and SLC9B2 proteins, and the structural, phylogenetic and evolutionary relationships for these genes and proteins. The results suggest that the mammalian SLC9B1 gene arose from a duplication of an ancestral mammalian SLC9B2 gene, with its appearance corresponding to the emergence of marsupial mammals during vertebrate evolution.

SLC9B gene and protein identification (SLC9B1 and SLC9B2)
BLAST studies were undertaken using web tools from the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) [13]. Protein BLAST analyses used vertebrate SLC9B2 and mammalian SLC9B1 amino acid sequences previously described [2,7,10,11] (Table 1; Tables S1 and S2). Predicted SLC9B-like protein sequences were obtained in each case and subjected to protein and gene structure analyses.
BLAT analyses were undertaken for each of the predicted SLC9B1 and SLC9B2 amino acid sequences using the UC Santa Cruz Genome Browser (http://genome.ucsc.edu) with the default settings to obtain the predicted locations for each of the vertebrate SLC9B-like genes, including exon boundary locations and gene sizes [14]. The structures for the major human SLC9B2 and SLC9B1 transcripts were obtained from AceView [15].

Alignments, structures and predicted properties of mammalian SLC9B2 and SLC9B1 proteins
Amino acid sequence alignments of vertebrate SLC9B2 and SLC9B1 proteins were undertaken using Clustal Omega [16]. Predicted secondary and tertiary structures for human and other mammalian SLC9B2 and SLC9B1 proteins were obtained using the SWISS-MODEL web-server [17] and the reported tertiary structure for the bacterial (Thermus thermophilus) Na/H antiporter [18] (PDB:4bwzA), with modeling residue ranges of 114-512 for human SLC9B2 and 94-502 for human SLC9B1 (Figure S1). Predicted transmembrane structures for human SLC9B2 and SLC9B1 proteins were obtained using web tools (http://au.expasy.org/tools/pi_tool.html). Identification of conserved domains for vertebrate SLC9B2 and SLC9B1 proteins was made using NCBI web tools [19].

Phylogeny studies and sequence divergence
Phylogenetic analyses were undertaken using the http://phylogeny.fr platform [20]. Alignments of vertebrate SLC9B2 and mammalian SLC9B1 sequences were assembled using PMUSCLE [21] (Table 1 and Supplementary Table S1). Alignment ambiguous regions were excluded prior to phylogenetic analysis, yielding alignments for comparisons of vertebrate SLC9B2 and mammalian SLC9B1 sequences. The phylogenetic tree was constructed using the maximum likelihood tree estimation program PHYML [22].

Results and Discussion
Alignments and possible roles of SLC9B2 amino acid sequences
The deduced amino acid sequences for baboon (Papio anubis), chicken (Gallus gallus), zebra fish (Danio rerio) and coelacanth (Latimeria chalumnae) SLC9B2 proteins are shown in Figure 1 together with the previously reported sequences for human and mouse SLC9B2 proteins [2,7] (Table 1). Alignments of human with other vertebrate SLC9B2 sequences examined were between 52-98% identical, suggesting that these are likely to be members of the same family of genes, whereas comparisons with vertebrate SLC9B1 proteins exhibited lower levels of sequence identity, suggesting that these are members of a distinct but related gene family (Table 1; Tables S1 and S2).
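The pairwise identities quoted above can be illustrated with a minimal sketch. It assumes that identity is counted over alignment columns in which both sequences carry a residue (gap columns are skipped); the paper does not state which denominator convention was used, so this choice is an assumption. The function name and the toy fragments are hypothetical and are not real SLC9B sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, for illustration only)
toy_a = "MKT-LLGDDA"
toy_b = "MRTALLG-DA"
print(percent_identity(toy_a, toy_b))
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different percentages for the same alignment, which is worth keeping in mind when comparing identity values across studies.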
The amino acid sequences for vertebrate SLC9B2 proteins contained between 537 (for human SLC9B2 isoform 1) and 564 (for zebra fish SLC9B2) residues (Figure 1). Previous studies have reported several key regions and residues for structurally related SLC9 Na+/H+ exchange proteins. These included an N-terminus region (residues 1-86), predicted to be external to the inner-mitochondrial membrane; hydrophobic transmembrane segments which may anchor the enzyme to the inner-mitochondrial membrane; and a C-terminal region (residues 513-537), predicted to be localized within the mitochondrial matrix. The N-terminal and C-terminal regions showed lower levels of amino acid sequence conservation, in contrast to the 13 identified hydrophobic regions (TM1-TM13), which are more highly conserved (Figure 1; Table 2). These regions were identified as transmembrane segments, also designated as 9B2:1/TM1-9B2:13/TM13 hydrophobic zones (Figure 2; Table 2). Several intra-hydrophobic sequences were also highly conserved, including the N-terminus region prior to 9B2:1/TM1 (residues 77-84, designated pre9B2:1); ITM1 (residues 107-112); ITM2 (residues 136-138); ITM4, containing a basic amino acid cluster (residues 193-208); ITM5 (residues 230-232); and ITM6, containing the active site (residues 255-311), particularly the double aspartate sequence (278-279) [4,23]. In addition, the C-terminus region immediately following the 9B2:13/TM13 transmembrane sequence (residues 514-519) was highly conserved as well (Table 2). Mutation analyses of a plasmid containing human SLC9B2, expressed in a transfected salt-sensitive strain of yeast cells, demonstrated a reduced ability to remove intracellular sodium when Val161 and Phe357 were mutated [23]. These residues were conserved among the mammalian SLC9B2 sequences examined, although conservative substitutions were observed in some lower vertebrate SLC9B2 sequences (Figure 1).
The roles for these conserved amino acid sequences located outside of the hydrophobic and transmembrane regions for SLC9B2 remain to be determined; however, it is proposed that the presence of multiple proline and glycine residues in these sequences may contribute to the sharp turns required for maintaining these transmembrane structures, particularly for the ITM1 and ITM6 sequences, which contained double or triple glycine residues (112-113 and 258-260, respectively), and the pre9B2:1 region, which contained a double proline sequence (79-80) (Figure 1; Table 2). A distinctive intra-hydrophobic region was observed for ITM4, which exhibited high basic amino acid content, namely K195, K198, K199 and K201. This positive ion cluster may assist in the transfer of salt ions across the inner-mitochondrial membrane or form a salt-ion with the double aspartate sequence (278-279DD), previously reported as the active site for SLC9B2 [3,7].
See Table 1 for sources of SLC9B-like sequences; *shows identical residues for SLC9B2 subunits; similar alternate residues; dissimilar alternate residues; predicted transmembrane residues are shown in blue and numbered in sequence TM1, TM2 etc.; predicted interhelical segments are numbered in sequence (ITM1, ITM2 etc.); active site residues are in green; predicted glycine zipper sequences are shaded pink; bold underlined font shows residues corresponding to known or predicted exon start sites; exon numbers refer to human SLC9B2 gene exons; key functional residues are shown in turquoise [23].
Table 2: Conserved key residues for human SLC9B2. Predicted key residues are identified based upon their positioning within transmembrane and/or inter-transmembrane structures; conserved residues are underlined; predicted Glycine Zipper sequences were identified and designated as GZ1 (GFILGFVLG), GZ2 (GGGYGVEKG) and GZ3 (GVATGSVLG); a highly basic amino acid residue region was observed within ITM4, shown in blue; single and double Gly (G) and Pro (P) sequences were identified which may support the formation of transmembrane helices; the Asp-Asp (DD) active site sequence is shown in red.
Figure 2: Comparisons of the predicted transmembrane structures for human SLC9B2 and SLC9B1. The transmembrane structures for the human SLC9B2 and SLC9B1 subunits are based on results [31]. Amino acid residues and predicted transmembrane structures are numbered in order from the N-terminus as 9B2:1, 9B2:2 etc. for SLC9B2 and 9B1:1, 9B1:2 etc. for human SLC9B1. The arrow indicates the position of the active site. Predicted transmembrane regions are shown as red bars. Pink and blue colors indicate external and internal inter-membrane regions.
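The glycine zipper pattern (GlyxxxGlyxxxGly) used to designate GZ1, GZ2 and GZ3 for human SLC9B2 can be expressed as a simple motif search. The following is a minimal sketch, not the procedure used in this study (which is not described in the text); the function name and the toy sequence are hypothetical, and only the three nine-residue motifs are taken from Table 2.

```python
import re

# GxxxGxxxG: glycine at positions i, i+4 and i+8.
# The zero-width lookahead lets overlapping motifs be reported as well.
GLYCINE_ZIPPER = re.compile(r"(?=(G.{3}G.{3}G))")


def find_glycine_zippers(sequence: str):
    """Return (start, end, motif) tuples, 1-based inclusive, for each GxxxGxxxG hit."""
    hits = []
    for match in GLYCINE_ZIPPER.finditer(sequence.upper()):
        start = match.start() + 1  # convert 0-based index to 1-based residue number
        motif = match.group(1)
        hits.append((start, start + len(motif) - 1, motif))
    return hits


# Motifs as listed for human SLC9B2 in Table 2
reported = {"GZ1": "GFILGFVLG", "GZ2": "GGGYGVEKG", "GZ3": "GVATGSVLG"}
for name, motif in reported.items():
    print(name, find_glycine_zippers(motif))

# Hypothetical toy sequence with GZ1 embedded between filler residues
toy_sequence = "AAA" + reported["GZ1"] + "PPP"
print(find_glycine_zippers(toy_sequence))
```

Applied to a full-length protein sequence, the same function would list every candidate zipper with its residue coordinates, which could then be compared against predicted transmembrane segments.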

Predicted structures for vertebrate SLC9B2 and human SLC9B1
Predicted secondary structures for vertebrate SLC9B2 and human SLC9B1 sequences were examined, which were consistent with the presence of 13 hydrophobic regions, including at least 10 transmembrane structures, symmetrically placed on either side of the double aspartate (278-279DD) active site sequence (Figures 1-3). Predicted tertiary structures for human SLC9B2 and SLC9B1 were also examined, based on the reported 3D structure for the sodium proton antiporter protein from Thermus thermophilus (also called NapA) (Figure S1) [18]. These transmembrane structures may be compared to pfam02080 domains [24], for which five transmembrane helices are found on either side of a 10 helical structure, predicted to form a pore involved in sodium proton exchange. The aspartate-alanine antiporter (aspT) from Tetragenococcus halophilus has also been shown to function with similar domain architecture [25].
A similar secondary structure was observed for human SLC9B1, although with a shorter N-terminus sequence (Figures 2 and 3) and a notable absence of the 3 glycine zipper sequences (GlyxxxGlyxxxGly) observed for human SLC9B2 (GZ1: 231-239; GZ2: 257-266; GZ3: 313-321) (Table 2; Figure 3). Glycine zipper sequences have been previously shown to encourage close helix-helix packing within transmembrane structures, and to facilitate the formation of membrane pores associated with channel formation [26,27], which is particularly relevant to the role that SLC9B2 plays as a mitochondrial inner membrane Na+/H+ exchanger. Vertebrate SLC9B2 sequence alignments also showed that GZ1 and GZ3 glycine zipper sequences were conserved during vertebrate evolution, whereas GZ2 sequences were not fully conserved, at least among the vertebrate sequences examined (Figure 1). It is not known what impacts, if any, these differences in glycine zipper sequences have on SLC9B2 and SLC9B1 Na+/H+ exchanger functions.

Gene Locations, exonic structures and regulatory sequences for vertebrate SLC9B2 and mammalian SLC9B1 genes
Table 1 and Supplementary Tables S1 and S2 summarize the predicted locations for vertebrate SLC9B2 and mammalian SLC9B1 genes based upon BLAT interrogations of several vertebrate genomes using the reported sequences for human SLC9B2 [7,28,29] and SLC9B1 [7,11] and the predicted sequences for other vertebrate SLC9B2 and mammalian SLC9B1 proteins [11,14]. The vertebrate SLC9B2 and mammalian SLC9B1 genes examined were transcribed on either strand, depending on the species. Figure 1 summarizes the predicted exonic start sites for human, baboon, mouse, chicken, zebra fish and coelacanth SLC9B2 genes, based on the 'a' isoform, with each having 11 or 12 coding exons, in identical or similar positions to those predicted for the human SLC9B2 gene. Figure 4 shows the predicted structures for the major human SLC9B2 and SLC9B1 transcripts. In each case, two major transcripts were observed, including the reference SLC9B2 sequences (AK172823 and BC047447), which were 3601 and 2945 bps in length, and the reference SLC9B1 sequences (BC136966 and AY461581), which were 1879 and 1755 bps in length. The two major human SLC9B2 transcript isoforms, designated as 'a' and 'b', encode identical proteins of 537 amino acids, with 11 coding exons and 1 or 2 non-coding exons, respectively (Table 1; Figure 4). The two major human SLC9B1 transcripts, also designated as 'a' and 'b', encoded proteins with 515 and 475 amino acids, with 11 and 10 coding exons respectively and 2 non-coding exons (Table 1; Figure 4). The first human SLC9B1 intron is much larger than intron 1 of human SLC9B2, which accounts for the observation that primate SLC9B1 genes were approximately double the size of the primate SLC9B2 genes (Table S1; Figure 4).
The human SLC9B2 promoter region contained several potential transcription factor binding sites, including FOXD3, which promotes neural crest development; FOXJ2, which serves as a transcriptional activator; and HNF3B, which is involved in the development of liver, kidney and notochord (Table S3). It would appear that the SLC9B2 gene promoter is well endowed with gene regulatory sequences which may contribute to the high levels of SLC9B2 expression in mammalian neural, liver and kidney cells and to the maintenance of this expression during development. In addition, the human SLC9B2 promoter region contained a large CpG island (CpG92), which may play an important role in SLC9B2 gene regulation during development.
See Table 1 for sources of human SLC9B2 and SLC9B1 sequences; *shows identical residues for SLC9B subunits; similar alternate residues; dissimilar alternate residues; predicted transmembrane residues are shown in blue and numbered in sequence TM1, TM2 etc.; predicted interhelical segments are numbered in sequence (ITM1, ITM2 etc.); active site residues are in green; predicted glycine zipper sequences are shown as GZ1, GZ2 and GZ3 for human SLC9B2; bold underlined font shows residues corresponding to known or predicted exon start sites; exon numbers refer to human SLC9B2 gene exons; proline residues are shaded in pink; glycine residues in turquoise.
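For readers unfamiliar with how a CpG island such as the one noted in the SLC9B2 promoter is typically characterized, the sketch below computes GC content and the observed/expected CpG ratio for a DNA sequence. The text does not state which island-calling criteria were applied; the thresholds used here (length greater than 200 bp, GC content of at least 50%, observed/expected CpG above 0.6) are commonly used defaults and are taken as an assumption. The function names and the toy sequence are hypothetical and unrelated to the actual SLC9B2 promoter.

```python
def cpg_metrics(dna: str) -> dict:
    """Length, GC fraction, CpG count and observed/expected CpG ratio for a DNA string."""
    seq = dna.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c_count + g_count) / length if length else 0.0
    # Observed/expected CpG = (CpG count * length) / (C count * G count)
    if c_count and g_count:
        obs_exp = (cpg_count * length) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return {
        "length": length,
        "gc_fraction": gc_fraction,
        "cpg_count": cpg_count,
        "obs_exp": obs_exp,
    }


def meets_island_criteria(metrics: dict, min_length: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the assumed length, GC content and obs/exp thresholds."""
    return (metrics["length"] > min_length
            and metrics["gc_fraction"] >= min_gc
            and metrics["obs_exp"] > min_obs_exp)


# Hypothetical CpG-rich toy sequence (300 bp)
toy_dna = ("CG" * 10 + "AT" * 5) * 10
toy_metrics = cpg_metrics(toy_dna)
print(toy_metrics, meets_island_criteria(toy_metrics))
```

In practice, island detection is usually run over sliding windows of a genomic region rather than on a single pre-selected sequence; the functions above only score one candidate interval.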

SLC9B2 and SLC9B1 tissue expression
Supplementary Figure S2 compares the tissue expression patterns for human and primate SLC9B2 and SLC9B1 genes respectively. Of particular interest are the distinct tissue expression profiles observed for these genes, with SLC9B2 having a wider tissue expression pattern and showing much higher levels of expression in kidney, brain and liver; but with SLC9B1 showing a testis-specific pattern in both human and rhesus macaque tissues. This is consistent with previous studies [2,3,11,14]. The higher levels and wider distribution patterns of expression for SLC9B2 have been associated with the Na(+)/H(+) exchanger, sodium-lithium counter transport, cation/proton antiporter and salt homeostasis roles across tissue organelle lipid bilayers, including the inner mitochondrial membranes of kidney, which has led to this being designated as a candidate gene for essential hypertension [7]. No specific role for the unique testis-specific SLC9B1 expression has been identified [11,31].

Phylogeny and divergence of vertebrate SLC9B2 and mammalian SLC9B1 genes and proteins
A phylogenetic tree (Figure 5) was calculated by the progressive alignment of 26 vertebrate and invertebrate SLC9B2 amino acid sequences with 17 mammalian SLC9B1 sequences, which was 'rooted' with the coelacanth (Latimeria chalumnae) SLC9B2 sequence (Table 1 and Tables S1 and S2). The phylogram showed clustering of the SLC9B-like sequences into groups consistent with their evolutionary relatedness, as well as separate groups for the vertebrate SLC9B2 and mammalian SLC9B1 sequences. These groups were significantly different from each other (with bootstrap values >93). It is apparent from this study of vertebrate SLC9B-like genes and proteins that SLC9B2 represents the vertebrate ancestral form while SLC9B1 appeared early in marsupial and eutherian mammalian evolution, for which a proposed common ancestor for these genes may have predated or coincided with the appearance of marsupial mammals during vertebrate evolution. In addition, the close localization of the SLC9B2 and SLC9B1 genes on human chromosome 4, as well as their similarities in sequence and exonic structure, suggested that the mammalian SLC9B1 gene evolved following a gene duplication event of the SLC9B2 gene, similar to that reported for many gene families [32][33][34].
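Bootstrap support of the kind reported for the SLC9B2 and SLC9B1 groups is the proportion of bootstrap replicate trees in which a given clade is recovered. The sketch below illustrates that bookkeeping only; it is not the PHYML implementation, and the replicate data, taxon labels and function name are hypothetical. Each replicate tree is represented simply as the set of clades (sets of taxon labels) it contains.

```python
from collections import Counter


def clade_support(replicate_clades):
    """Fraction of bootstrap replicates in which each clade appears.

    replicate_clades: list with one entry per replicate tree, each entry
    being a set of frozensets of taxon labels (the clades of that tree).
    """
    counts = Counter()
    for clades in replicate_clades:
        counts.update(set(clades))  # count each clade at most once per replicate
    total = len(replicate_clades)
    return {clade: count / total for clade, count in counts.items()}


# Hypothetical clades and four hypothetical replicate trees
b1_clade = frozenset({"human_SLC9B1", "opossum_SLC9B1"})
b2_clade = frozenset({"human_SLC9B2", "chicken_SLC9B2"})
mixed_clade = frozenset({"human_SLC9B2", "opossum_SLC9B1"})

replicates = [
    {b1_clade, b2_clade},
    {b1_clade, b2_clade},
    {b1_clade, b2_clade},
    {b1_clade, mixed_clade},
]

support = clade_support(replicates)
# Keep only clades recovered in at least 90% of replicates
well_supported = {clade for clade, value in support.items() if value >= 0.9}
print(support[b1_clade], support[b2_clade], well_supported == {b1_clade})
```

With 100 replicates, as used for the tree in this study, the same calculation gives support either as a count out of 100 or as a proportion between 0 and 1, depending on how it is reported.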

Conclusions
The results of the present study indicated that vertebrate SLC9B2 and mammalian SLC9B1 genes and encoded proteins represent a distinct gene and protein family of SLC9B-like proteins which share conserved sequences and active site residues with those reported for SLC9/NHE gene family members (solute carrier 9/sodium hydrogen exchanger), which contribute to the maintenance of cellular and intracellular pH homeostasis and are predominantly responsible for the absorption of Na+ ions in the kidney and the GI tract [1][2][3].
SLC9B1 is encoded by a single gene (SLC9B1) among the marsupial and eutherian mammalian genomes studied and is highly expressed in human testis; the gene usually contained 11 or 12 coding exons on the negative or positive strand, depending on the species. SLC9B2 is also encoded by a single gene (SLC9B2), which is more widely expressed in human tissues, with the highest levels being observed in kidney, liver and brain, and is coexpressed with a proximal gene encoding 3-hydroxybutyrate dehydrogenase (BDH2). Several transcription factor binding sites were localized within the human SLC9B2 gene promoter region, including FOXD3, FOXJ2 and HNF3B, which regulate gene expression and may contribute significantly to the high level of gene expression in kidney, liver and neural cells. In addition, this region contains a large CpG island which may also contribute to SLC9B2 gene regulation during development.
Predicted secondary and tertiary structures for vertebrate SLC9B2 and mammalian SLC9B1 proteins showed strong similarity with Na/H antiporters originally identified in yeast (Saccharomyces cerevisiae) and bacteria (Thermus thermophilus) [17] (PDB:4bwzA). Several major structural domains were apparent for vertebrate SLC9B2 and mammalian SLC9B1 proteins, including an N-terminal tail of varying lengths predicted to be external to the inner-mitochondrial membrane; hydrophobic transmembrane segments which may anchor the enzyme to the inner-mitochondrial membrane; and a C-terminal region (residues 513-537), predicted to be localized within the mitochondrial matrix. The N-terminal and C-terminal regions showed lower levels of amino acid sequence conservation, in contrast to the 13 identified hydrophobic regions (9B2:1-9B2:13), which are more highly conserved.
A phylogenetic study using 17 mammalian SLC9B1 and 26 SLC9B2 protein sequences indicated that the SLC9B1 gene appeared early in marsupial mammalian evolution, prior to or coincident with the appearance of marsupial mammals, and derived from a gene duplication event of the ancestral vertebrate (and invertebrate) SLC9B2 gene, which originated much earlier in evolution, since it has been reported in bird, reptile, amphibian, and fish genomes.
Derived from the AceView [15]; shown with capped 5'- and 3'-ends for the two isoform mRNA sequences, in each case; NM refers to the NCBI reference sequence; exons are in pink; numbers within the intron sequences refer to bps.
Figure 5: Phylogenetic tree of vertebrate SLC9B2 and mammalian SLC9B1 amino acid sequences. The tree is labeled with the SLC9B-like name and the name of the animal and is 'rooted' with the coelacanth (Latimeria chalumnae) SLC9B2 sequence. Note the 2 major groups corresponding to the SLC9B2 and SLC9B1 gene families. A genetic distance scale is shown. The number of times a clade (sequences common to a node or branch) occurred in the bootstrap replicates is shown. Only replicate values of 0.9 or more, which are highly significant, are shown, with 100 bootstrap replicates performed in each case. A proposed duplication event is shown arising from an ancestral invertebrate SLC9B2-like gene, generating the mammalian SLC9B1 gene.