A phylogenetic analysis between humans and D. melanogaster: A repertoire of solute carriers in humans and flies

The solute carrier (SLC) superfamily is the largest group of transporters in humans, with the role of transporting solutes across plasma membranes. The SLCs are currently divided into 65 families with 430 members. Here, we performed a detailed mining of the SLC superfamily and the recently annotated family of “atypical” SLCs in human and D. melanogaster using Hidden Markov Models and PSI-BLAST. Our analyses identified 381 protein sequences in D. melanogaster, of which 55 have not been previously identified in flies. In total, 11 of the 65 human SLC families were found not to be conserved in flies, while a few families are highly conserved, which perhaps reflects the families’ functions and roles in cellular pathways. This study provides the first collection of all SLC sequences in D. melanogaster and can serve as a SLC database to be used for classification of SLCs in other phyla.


Introduction
The environment inside the cell is well protected by biological membranes. The inner and outer compartments are still highly connected in that the cells have integral membrane-bound proteins that facilitate movement across the protective barrier that surrounds the cell and its organelles (Alenghat and Golan, 2013; Singer and Nicolson, 1972).
The second largest group of integral membrane-bound proteins is transporter proteins. Approximately one-third of the human genome encodes transporter proteins and they are expressed in all organs of the human body (Almen et al., 2009). Many transporter proteins, such as the primary and secondary active transporters as well as the passive transporters, work together to maintain the transportation chain between cells and within the cell. The SoLute Carrier (SLC) superfamily comprises secondary active and facilitative transporters that transport a wide range of solutes (sugars, amino acids, peptides, fats, ions, vitamins, drugs, and toxins) across membranes. With a few exceptions, the SLCs have in common that they rely on an electrochemical gradient over the membranes as the driving force for transport (Fredriksson et al., 2008; Hediger et al., 2013; Hediger et al., 2004). SLCs can roughly be divided into four categories: antiporters/exchangers, e.g. SLC9 (Donowitz et al., 2013); symporters/co-transporters, e.g. SLC5 (Wright, 2013); uniporters/facilitated transporters, e.g. SLC2 (Mueckler and Thorens, 2013); and orphan transporters (Zhang et al., 2018).
and the protein conservation findings from this model organism have contributed greatly to the research fields of neurobiology (Bellen et al., 2010). Moreover, in D. melanogaster there have been several publications investigating the function of orthologous proteins to 40 of the existing SLC families (Artero et al., 1998; Paik et al., 2017; Hua et al., 2010; Yin et al., 2017; Knight et al., 2010; Dourlen et al., 2015; Gai et al., 2016; Martin and Krantz, 2014; Sun et al., 2010; Southon et al., 2008; Hirata et al., 2012; Reynolds et al., 2009). In 2014, many of the SLCs in fruit fly were summarized with a focus on transport across the blood-brain barrier (Limmer et al., 2014). However, this study did not describe the evolutionary relationship to human SLCs. Also, in 2017, the fly database combined all published works into a D. melanogaster SLC table (flybase.org/reports/FBgg0000686.html, FB2021_02; Gene Group, 2021; Larkin et al., 2020). Still, this summary lacked information about the phylogenetic relationship to the human SLCs.
The HUGO Gene Nomenclature Committee (HGNC) provides a list of the SLC superfamily members (www.genenames.org) and according to their criterion a protein is assigned to a SLC family if the protein has at least 20 % sequence identity to at least one other member of that family (Hediger et al., 2004). Another way to classify proteins is through the Protein Family system (Pfam). Here, the classification is based on protein alignment of functional domains, which results in larger clans of protein sequences. Today the SLC families populate 12 clans, of which six contain more than one annotated SLC family (Perland and Fredriksson, 2017). More SLCs are being identified and recently the superfamily grew from 52 families with approximately 400 members to the current 65 families with 430 members. These can all be found in the SLC table database (slc.bioparadigms.org). Furthermore, in 2017, Perland and colleagues found that there are probably more proteins that will be categorized as SLCs due to their similarity to already existing members. This group of proteins was named atypical SLCs (Perland et al., 2017).
In this paper, we systematically mined the D. melanogaster proteome aiming to identify proteins with resemblance to human SLCs. The results were obtained by building Hidden Markov Models (HMM) based on the Pfam categorization of the SLC family using HMM BUILD from the HMMER package (Eddy and Pearson, 2011) and the Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST) (Altschul et al., 1997), and amino acid identities were analyzed using the European Molecular Biology Open Software Suite (EMBOSS) Needle (Li et al., 2015). Here, we provide the first phylogenetic analysis of all human SLCs and "atypical" SLCs and their homologues in D. melanogaster. In all, we identified 381 protein sequences in fly, of which 55 protein sequences have not yet been listed in the D. melanogaster SLC table.

Construction of hidden Markov models
Sequences belonging to each Pfam clan were combined into eleven multiple sequence alignments using the Multiple sequence Alignment software with Fast Fourier Transform (MAFFT) (Katoh et al., 2002). Hidden Markov Models (HMMs) were built from these alignments with HMM BUILD from the HMMER package (Eddy and Pearson, 2011) and were used to scan for homologues in the D. melanogaster proteome (BDGP6.pep.all) via HMM SEARCH, with a cutoff at E = 10. The results were manually curated: splice variants were removed and only the longest protein sequences were kept. A summary of all sequences identified in D. melanogaster can be found in Supplementary data 2.
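The filtering described above (an E-value cutoff followed by keeping only the longest protein per gene) can be illustrated with a minimal Python sketch. This is not the curation procedure used in the study, which was manual; the `Hit` record, its field names and the example identifiers are hypothetical and serve only to show the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search hit; the field names are hypothetical, chosen for illustration."""
    gene_id: str      # gene that the protein isoform belongs to
    protein_id: str   # identifier of the isoform
    evalue: float     # E-value reported for the hit
    sequence: str     # amino acid sequence of the isoform


def longest_isoform_per_gene(hits, max_evalue=10.0):
    """Drop hits above the E-value cutoff, then keep the longest protein per gene."""
    best = {}
    for hit in hits:
        if hit.evalue > max_evalue:
            continue  # did not pass the cutoff
        current = best.get(hit.gene_id)
        if current is None or len(hit.sequence) > len(current.sequence):
            best[hit.gene_id] = hit
    return sorted(best.values(), key=lambda h: h.gene_id)


# Invented example records, not data from the study.
hits = [
    Hit("geneA", "geneA-PA", 1e-30, "MKTLLV"),
    Hit("geneA", "geneA-PB", 1e-28, "MKTLLVAAGG"),
    Hit("geneB", "geneB-PA", 25.0, "MSS"),
    Hit("geneC", "geneC-PA", 0.5, "MAAAK"),
]
kept = longest_isoform_per_gene(hits)
print([h.protein_id for h in kept])
```

In this sketch, geneB is dropped because its only hit exceeds the cutoff, and geneA is represented by its longer isoform.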

Phylogenetic analysis
The human SLC proteins and the identified sequences in D. melanogaster were aligned a second time using MAFFT. The phylogenetic relationships of sequences belonging to the ANL, CPA/AT, DMT, Fz, IT, MtN3-like and Timbarrel Pfam clans were inferred using the Bayesian approach as implemented in MrBayes 3.2.7 (Huelsenbeck et al., 2001). MrBayes was run on a non-heated chain with two runs in parallel (nruns = 2) under the mixed amino acid model, with eight gamma categories and invgamma as the rate setting, for a total of 2,000,000 generations. Due to the great number of sequences, the phylogenetic relationships of human and D. melanogaster protein sequences belonging to the MFS and APC clans were calculated with Randomized Axelerated Maximum Likelihood (RAxML) (Stamatakis, 2014) using the GAMMAJTT amino acid model with 100 bootstrap replicates, and a consensus tree was calculated from these using the built-in consensus tree calculation. All trees are found as rooted rectangular trees with posterior probabilities or bootstrap values in Supplementary data 3. The horizontal lines (branches) in the phylogenetic trees represent evolutionary lineages changing over time, and their lengths are given as amino acid substitutions per site, i.e. the number of substitutions divided by the length of the sequence. All phylogenetic trees therefore contain a scale bar that represents the amount of genetic change.
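To make the unit "substitutions per site" concrete, the following Python sketch computes the uncorrected proportion of differing sites (the p-distance) between two aligned sequences. This is a simplification for illustration only: branch lengths estimated by MrBayes or RAxML come from substitution models that correct for multiple changes at the same site, so they are not equal to this raw proportion. The function name and the example sequences are hypothetical.

```python
def p_distance(aligned_a, aligned_b):
    """Uncorrected substitutions per site between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap column, not counted as a site
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# Invented toy alignment: one substitution over five ungapped sites.
print(p_distance("MKT-LV", "MRTALV"))
```

A scale bar of, for example, 0.2 then corresponds to a branch along which roughly one in five sites has changed, under the stated simplification.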

Global pairwise alignment
Global pairwise alignments using EMBOSS Needle (Li et al., 2015) were performed to investigate the amino acid identity between the human and the D. melanogaster homologous proteins and to establish that the protein sequences fulfilled the criterion of 20 % to 25 % amino acid identity, a criterion established by HGNC and Hediger et al. (Hediger et al., 2004). The results from the global pairwise alignments are summarized for all transporters with a clan affiliation in Supplementary data 4. The human and D. melanogaster homologues with the highest amino acid identity are presented in Tables 1-10, together with the human SLCs, their substrates and transporter types (slc.bioparadigms.org), as well as the function of the D. melanogaster proteins (flybase.org, FB2021_02; Gene Group, 2021; Larkin et al., 2020). Predicted functions were not included in the tables.
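The idea behind a global pairwise alignment and the identity criterion can be sketched in Python as follows. This is a minimal Needleman-Wunsch implementation with a simple match/mismatch score and a linear gap penalty, chosen for brevity; it is not EMBOSS Needle, which uses a substitution matrix and affine gap penalties, so the numbers it produces are not comparable to those in the supplementary data. Identity is assumed here to be the number of identical columns divided by the full alignment length, including gap columns. All function names and sequences are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of two sequences with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical columns divided by alignment length (gap columns included)."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


def meets_identity_criterion(identity, threshold=20.0):
    """True if the identity reaches the 20 % lower bound of the criterion."""
    return identity >= threshold


# Invented toy sequences, not real SLC proteins.
aligned = needleman_wunsch("MKTLV", "MKLV")
identity = percent_identity(*aligned)
print(aligned, identity, meets_identity_criterion(identity))
```

The threshold is a parameter so that the stricter 25 % bound can be tested in the same way.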

Protein PSI-BLAST using the blastp suite
A little more than a hundred protein sequences of the SLC family do not belong to a Pfam clan. For these, protein PSI-BLASTs (Altschul et al., 1990; Li et al., 2016) were performed to search for the most similar protein sequences to these human SLCs within the D. melanogaster proteome, Supplementary data 5. The same approach was used for the SLC64 belonging to the LysE transporter clan and the SLC45 family belonging to the MviN, MATE clan, as well as for SLCs populating the MFS and APC clans where the HMM failed to identify hits.
M.M. Ceder and R. Fredriksson
Table 1 Summary of human and D. melanogaster SLCs belonging to the Major Facilitator Superfamily clan. The human SLCs of MFS type are divided into subfamilies, members, substrate profile and transporter type, and the identified homologue/homologues that were most similar (through amino acid identity) to the human SLC are listed. One human SLC can be presented to have more than one homologue as the most similar, meaning that the D. melanogaster proteins presented next to a human SLC had the highest amino acid identity score to that SLC in the subfamily.

Results
In total, 368 SLCs and 13 atypical SLC homologues were identified in the D. melanogaster proteome, Fig. 1. A total of 55 previously unidentified protein sequences were found via the HMMs. The criterion currently used to assign a new protein sequence into a SLC subfamily is based on the work of Hediger et al. (Hediger et al., 2013; Hediger et al., 2004), where it is stated that "a transporter has been assigned to a specific SLC family if it has at least 20 to 25 % amino acid sequence identity to other members of that family". Therefore, the amino acid identity between the human and fly protein sequences was studied using the EMBOSS Needle pairwise alignment software (Li et al., 2015). The closest relative to each human SLC member, as well as the substrate profile and transport mechanism (co-transporter, antiporter, facilitator, channel, and orphan) of the human SLCs are presented in Tables 1-10 and all alignments are presented in Supplementary data 5. The average pairwise sequence identity of human SLCs, atypical SLCs and their members in flies was calculated by pairwise alignments between each human SLC member and the identified fly protein sequences within each family. Furthermore, an average per family was calculated. The average pairwise identities were above 20 % within each subfamily except for SLC10, SLC16, SLC22, SLC35, SLC39, SLC38 and SLC46, where the sequence identity was lower than 20 %. However, 51 protein sequences that were already annotated as SLCs in D. melanogaster were not identified by the HMMs used here. This issue most likely arises because the HMMs are biased by the Pfam clan affiliation, i.e. the number of members in each SLC subfamily in a Pfam clan varies greatly. Because the SLC subfamilies do not contain an equal number of members, smaller families with only one or a few protein sequences will affect the motifs that the HMMs are later built on less than larger families that consist of many protein sequences. This is observed for the HMM built using the members in the MtN3-like Pfam clan, where the SLC54 family has three members compared with the SLC50 family that only has one member with a rather short protein sequence, Figs. 1 and 5. Also, the amino acid sequence identity was already low within the SLC superfamily and often also within each SLC subfamily, which could contribute to the difficulty of directly identifying protein sequences without any complementary experiment. Furthermore, a few related protein sequences did not pass the cutoff at E = 10, e.g. two protein sequences belonging to the SLC29 family, and were therefore, unfortunately, not identified. The phylogenetic relationships between the human SLCs, as well as the putative SLCs, and the identified fly protein sequences are presented in Figs. 2-5.
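The per-family averaging described above can be sketched in Python as follows. The family labels and identity values in the example are invented placeholders, not values from Supplementary data 4 or 5, and the function names are hypothetical; the sketch only shows how a mean identity per family can be computed and compared with the 20 % threshold.

```python
from collections import defaultdict
from statistics import mean


def family_mean_identity(pairs):
    """Average the pairwise identities (in %) of all human-fly pairs per family.

    `pairs` is an iterable of (family, identity) tuples.
    """
    by_family = defaultdict(list)
    for family, identity in pairs:
        by_family[family].append(identity)
    return {family: mean(values) for family, values in by_family.items()}


def families_below(means, threshold=20.0):
    """Families whose mean identity falls below the threshold."""
    return sorted(family for family, value in means.items() if value < threshold)


# Invented placeholder values for two hypothetical families.
pairs = [
    ("FAM_A", 35.0), ("FAM_A", 25.0),
    ("FAM_B", 18.0), ("FAM_B", 15.0), ("FAM_B", 21.0),
]
means = family_mean_identity(pairs)
print(means, families_below(means))
```

Note that a family can have a mean below the threshold even when individual pairs exceed it, as FAM_B does in this toy example.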
Unfortunately, an HMM could not be used for the SLC64 family since the family only contains one protein sequence, which is not enough to perform the HMM BUILD. HMMs were also not built for the SLCs that lack a Pfam clan assignment (no clan SLCs). Moreover, the HMM built for the MviN, MATE-like clan (SLC47 and SLC62) could not identify any related protein sequences in D. melanogaster. To identify possible homologues in the fly proteome to these SLC families, protein PSI-BLASTs were performed with the blastp suite. The results of the PSI-BLASTs are found in Table 11 and Supplementary data 5.
Table 2 Summary of human and D. melanogaster SLCs belonging to the Amino Acid-Polyamine-Organocation clan. The human SLCs of APC type are divided into subfamilies, members, substrate profile and transporter type, and the identified homologue/homologues that were most similar (through amino acid identity) to the human SLC are listed. One human SLC can be presented to have more than one homologue as the most similar, meaning that the D. melanogaster proteins presented next to a human SLC had the highest amino acid identity score to a particular SLC in the subfamily. The table also presents the function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: Ions within brackets indicate the transport-coupled ion, F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan.) * Indicates a D. melanogaster homologue that has a higher amino acid identity to another member but that, with regard to annotation, function and location, is more similar to the human SLC that it is listed next to.

Solute carriers of major facilitator superfamily type in D. melanogaster
In humans, the MFS Pfam clan is populated by 20 SLC subfamilies, Fig. 1. In addition to these 20 SLC subfamilies, the atypical SLC protein sequences (Perland and Fredriksson, 2017; Perland et al., 2017) were included in the phylogenetic analysis of this subgroup of SLCs, Fig. 2.
In total, 162 protein sequences related to the SLCs of MFS type were identified from the fly proteome using the HMM, Fig. 2, and two sequences were manually added, one predicted orthologue to the atypical SLC MFSD11 and one orthologue to the SLC19 family. Out of these 164 sequences, 26 protein sequences were not listed in the D. melanogaster SLC table. Most of the identified members, approximately 78 % of the identified protein sequences, clustered to the SLC2 (25 sequences), SLC16 (15 sequences), SLC17 (26 sequences) and SLC22 (41 sequences) families. The protein sequence of CG7448 was identified as a homologue to the human SLC22 family; however, this protein was recently annotated as a pseudogene. It is still listed as a protein-coding gene in the fly proteome (BDGP6.pep.all) and therefore it was included here.
The SLC21 and SLC46 families were found to have eight members each in D. melanogaster, while most families (SLC15, SLC18, SLC19, SLC29, SLC33, SLC37, SLC49 and SLC45) had one to three related protein sequences each in D. melanogaster. For five families, SLC40, SLC43, SLC59, SLC60 and SLC61, no related protein sequences were identified either by HMMs or by PSI-BLASTs, Supplementary data 5. SLCs of MFS type in D. melanogaster are well studied and most of the identified proteins are already annotated as SLCs in D. melanogaster. However, the HMMs for SLC15, SLC16, SLC17, SLC19 and SLC22 identified sequences that were not previously listed.
For six of the SLC families populating the MFS clan, SLC2, SLC16, SLC17, SLC22, SLC29 and SLC46, protein sequences that do not fulfill the criterion of 20 % sequence identity were identified by the HMM, Supplementary data 4. This was most likely because these protein sequences have features and similarities with other related sequences in D. melanogaster. For instance, they have similar conserved motifs that contribute to the secondary and tertiary structures. The closest human homologues to each identified D. melanogaster protein sequence are presented in Table 1. Unfortunately, a majority of the identified and predicted homologues to the human SLCs of MFS type do not have a known function in D. melanogaster, which makes it more difficult to propose orthologous proteins between the two species. Moreover, some of the identified homologues do not even have a transporter function. However, there are identified protein sequences in D. melanogaster that exhibit a similar or the same function as their predicted human homologues. For example, members of the SLC18 family of vesicular amine transporters in human and fly both contain transporters (e.g. SLC18A2-dmVmat and SLC18A3-dmVAchT) that exchange monoamines such as dopamine, serotonin and acetylcholine, but the SLC18A2 homologue in fly (Vmat) does not transport epinephrine but octopamine, which has a similar function in invertebrates as epinephrine has in human (Sreedharan et al., 2011).
In 2017, Perland and Fredriksson suggested that there are more protein sequences that ought to be classified into the SLC superfamily due to their high resemblance to SLC subfamilies that already populate the MFS Pfam clan (Perland and Fredriksson, 2017). This suggestion was based on a study performed by Sreedharan, S. et al. (2010) that aimed to [...] (Fotiadis et al., 2013). Perland and Fredriksson suggested that 28 more protein sequences (MFSD1; MFSD2A-B; MFSD3; MFSD4A-B; MFSD5; MFSD6; MFSD6L; MFSD8; MFSD9; MFSD10; MFSD11; MFSD12; MFSD13A-B; MFSD14A-B; UNC93A-B1; SPNS1-3; SV2A-C; SVOP and SVOPL) are atypical SLCs of MFS type (Perland and Fredriksson, 2017). A couple of years later this was confirmed when 16 of these 28 protein sequences were classified into existing and new SLC subfamilies (slc.bioparadigms.org). The majority of the atypical SLCs, except MFSD9, MFSD12 and MFSD13A, were found to be conserved in D. melanogaster, Fig. 2 and Supplementary data 4. In total, 13 protein sequences were identified as atypical SLCs, but they do not cluster with any current SLC family, which suggests that they will either be classified into new SLC families or not be included in the SLC superfamily at all, Fig. 1.
In addition, when using the HMM of the MFS Pfam clan, eight protein sequences were identified in D. melanogaster, but they were not grouped into any existing SLC family and instead formed a separate cluster (group), Fig. 2 (blue text). Furthermore, no significant hits were found when performing protein PSI-BLAST against the human proteome, Supplementary data 5. The MFS Pfam clan is a heterogeneous group of proteins and even if the focus was only on the SLCs that populate the MFS Pfam clan, the protein sequences are still highly divergent. The HMM calculates the probability of amino acid shifts over time and the likelihood of how similar the collection of protein sequences is between species. These eight protein sequences could therefore have been identified by the model since their protein sequences resemble other sequences of SLCs of MFS type. This could possibly indicate that there are SLC subfamilies that are not conserved between human and other species or that they have not evolved in human.

Protein sequences identified to belong to the SLCs of amino acid-polyamine-organocation type in D. melanogaster
There are 11 SLC families populating the APC Pfam clan, Figs. 1 and 3. A total of 54 protein sequences were identified in D. melanogaster, Fig. 3. Homologous proteins between human and fly were identified in the SLC4, SLC5, SLC6, SLC7, SLC12, SLC17, SLC26, SLC32, SLC36 and SLC38 families, while no related D. melanogaster sequences were found for the SLC11 and SLC23 families.
In the D. melanogaster SLC table (flybase.org/reports/FBgg0000686.html, released FB2021_02; Gene Group, 2021), 23 additional orthologues were listed as homologues to the SLC5, SLC6, SLC7, SLC11, SLC26 and SLC36 families. Many of these protein sequences (18 in total) were not included in the phylogenetic tree since they did not pass the cutoff at E = 10 when performing the HMM SEARCH and were therefore removed. The remaining five of the 23 protein sequences (CG6928, CG8785, CG9702, Mvl and tadr) were not identified by the HMM. The related protein sequences identified via HMM SEARCH and the additional sequences listed in the D. melanogaster SLC table (flybase.org/reports/FBgg0000686.html, released FB2021_02; Gene Group, 2021), a total of 77 protein sequences, were included when the pairwise alignments were performed. A majority, 72 of the protein sequences, were found to fulfill the criterion established by HGNC (20 % amino acid identity to a member in the human SLC family), and 46 % of the related protein sequences identified in D. melanogaster had 35 % or higher amino acid sequence identity to their homologue in human, Supplementary data 4. Furthermore, several of the identified homologues in

Fly homologues to SLCs belonging to the drug/metabolite transporter clan and the ion transporter clan
Twenty-four protein sequences were identified to be related to the 49 members of the SLC35 (purple shade), SLC39 (green shade) and SLC57 (blue shade) families (DMT Pfam clan), Fig. 4A and Table 3. According to the D. melanogaster SLC table (flybase.org/reports/FBgg0000686.html, released FB2021_02; Gene Group, 2021), one additional sequence is annotated as an orthologue to SLC39. All the protein sequences fulfilled the criterion of being a SLC, Supplementary data 4. Three new members of the SLC35 family and one new member of the SLC57 family were identified in D. melanogaster.
The IT Pfam clan held, for a long time, only one SLC family (SLC13) until recently, when the SLC53 family was classified into the Pfam clan. Both the SLC13 and SLC53 families were found to have four related protein sequences in D. melanogaster, Fig. 4B. Two of the four protein sequences of the SLC13 family were listed as SLCs in the D. melanogaster SLC table, Indy and Indy-2, two proteins with known functions (Table 4), and they resembled the human SLC13A2 and SLC13A5, while the other two, CG7309 and CG33934, have not been listed as homologous proteins previously. According to the phylogenetic tree, the human SLC13 proteins cluster together and the fly protein sequences cluster together, suggesting that the proteins are more similar within the phylum rather than presenting a clear orthologous relationship, Fig. 4B (dark pink). All the protein sequences identified in D. melanogaster fulfilled the criterion to be classified as SLCs, Supplementary data 5. The SLC53 family consists of one member, SLC53A1, and it was found to share a recent common ancestor with CG7536 and CG10483. However, CG2901 and CG10481 were also identified as closely related protein sequences to the SLC53 family, Fig. 4B (light pink). All of them had over 35 % amino acid sequence identity to SLC53A1. The two proteins CG10483 and CG7536, which shared a more recent common ancestor with SLC53A1 than CG2901 and CG10481 did, had as high as 52 % and 54.2 % amino acid identity to SLC53A1, Supplementary data 5.

SLCs in D. melanogaster belonging to the ANL, CPA/AT, Fz, MtN3-like, thioredoxin-like and timbarrel clans
The fatty acid transporters of the SLC27 family were found to have three closely related protein sequences (Fatp1-3) in D. melanogaster, Table 5, but an additional 21 sequences were identified via the HMM and used when building the phylogenetic tree, Fig. 5A (green). Fatp1, Fatp2 and Fatp3 shared over 20 % amino acid sequence identity with the human SLC27 family members, Supplementary data 4.
The SLC9 family of sodium:proton exchangers was found to be well conserved in D. melanogaster, and all five known orthologues were identified with the HMM based on the SLCs of CPA/AT type, Fig. 5B (blue). However, little is reported about their function on flybase.org (FB2021_02; Larkin et al., 2020), but they exhibit similar functions as the human SLC9 members, Table 6. The HMM did not identify any fly protein sequences related to the second SLC family belonging to this clan, the SLC10 subfamily. Interestingly, when performing PSI-BLAST for this family, two uncharacterized proteins were found, CG9903 and CG11655. Both were listed as bile acid:sodium symporters on flybase.org (FB2021_02; Larkin et al., 2020), a function that the human SLC10 transporters possess, and they were found to share 21.1 % and 21.9 % amino acid sequence identity to human SLC10A3 and SLC10A5, Supplementary data 4. Therefore, these two protein sequences were manually added before the phylogenetic tree was built, Fig. 5B (light blue), as well as in Table 6.
Whether the SLC3 family should really be counted as a SLC subfamily has been debated. However, according to today's criterion this family is listed as a SLC subfamily according to the work published by Hediger et al. (Hediger et al., 2004) and hence, it was included in the analysis. The SLC3 family consists of two members, SLC3A1 and SLC3A2, that act as subunits to the SLC7 family (Fredriksson et al., 2003). In fly, one orthologue, CD98hc, has been described. CD98hc and 10 more protein sequences encoding maltose degrading enzymes were identified with the HMM search, Fig. 5C and Table 7. The maltose degrading enzymes have not been identified as members of the SLC3 [...]
Table 9 Summary of human and D. melanogaster SLCs belonging to the MtN3-like clan. The human SLCs of MtN3-like type are divided into subfamilies, members, substrate profile and transporter type, and the identified homologue/homologues that were most similar (through amino acid identity) to the human SLC are listed. One human SLC can be presented to have more than one homologue as the most similar, meaning that the D. melanogaster proteins presented next to a human SLC had the highest score to that particular SLC in the subfamily.
When performing the HMM search for homologues to the SLC50 subfamily, no related protein sequences in D. melanogaster were identified. However, quite recently it was suggested that slv is the orthologous protein to the human SLC50A1 (Limmer et al., 2014), and hence it was manually added to the alignment, Table 8, and the phylogenetic analysis, Fig. 5D. SLC50A1 and slv form a cluster in the MtN3-like clan-based tree and they were found to share 27.3 % amino acid sequence identity, Supplementary data 4. Furthermore, they share similar functions, Table 8, and could therefore be suggested to be orthologous proteins. SLC54 is a relatively new SLC family with three members, SLC54A1-3. In total, four related protein sequences were identified in D. melanogaster: Mpc1, CG9396, CG9399 and CG32832, Fig. 5D and Table 8. Mpc1 was found to have the highest amino acid sequence identity to SLC54A1, while CG9396, CG9399 and CG32832 were found to have the highest amino acid sequence identity to SLC54A2, Table 8 and Supplementary data 4. Both SLC54A1 and Mpc1 transport pyruvate across the mitochondrial membrane and could therefore be considered orthologues. Meanwhile, the other three identified fly homologues lack information regarding their function, Table 8. SLC58, a SLC of Thioredoxin-like type, and SLC65, a SLC family of Fz type, are families recently added to the SLC superfamily and no orthologues have been suggested in D. melanogaster. One homologue to the two human SLC58 members, Ostγ, was identified with approximately 50 % amino acid sequence identity, Supplementary data 4. The reported function of Ostγ does not indicate that it transports ions as the human homologues do, Table 9. Perhaps Ostγ has a transporter function that has not yet been fully established, or it could be that it has evolved to better suit the needs of D. melanogaster. For the SLC65 family, two protein sequences, Npc1a and Npc1b, were identified with the HMM, and both fulfill the amino acid sequence identity criterion, Fig. 5E and Supplementary data 4. Both Npc1a and Npc1b exhibit functions, cholesterol trafficking/transport, that are similar to those of the human SLC65 members, and hence, they could be considered orthologous proteins, Table 10.

Related protein sequences to the SLC47, SLC62, SLC64 and Pfam clan unclassified SLC families
The HMM search did not identify any related protein sequences to the SLC47, SLC62 and SLC64 families. The SLC47 subfamily is not listed in the SLC table of D. melanogaster, and the SLC62 and SLC64 subfamilies were recently added to the SLC table and, hence, the information about them is limited. PSI-BLASTs were performed using the blastp suite from NCBI/NIH to investigate the related protein sequences to these SLC subfamilies in D. melanogaster. The SLC47 and SLC62 subfamilies were found to have no related protein sequences in fly while one sequence, CG42542, was identified for the SLC64 family, Supplementary data 5.
Fig. 1. Tabulated representation of SLC proteins and putative ("atypical") SLCs in human and D. melanogaster. Summary of results provided by the HMMs and information obtained from slc.bioparadigms.org and flybase.org/reports/FBgg0000686.html. The Pfam clans MFS (Major Facilitator Superfamily, CL0015), APC (Amino Acid-Polyamine-Organocation, CL0062), ANL (Acyl-CoA synthetases, NRPS adenylation domains and Luciferase enzymes, CL0378), CPA/AT (Cation:Proton Antiporter/Anion transporter, CL0064), DMT (Drug/Metabolite transporter, CL0184), Fz (Frizzled cysteine-rich domain-related, CL0644), IT (Ion transporter, CL0182), LysE transporter (Lysine exporter, CL0292), MviN, MATE-like (CL0222), MtN3-like (CL0141), Thioredoxin-like (CL0172), Timbarrel (Triose phosphate IsoMerase barrel, CL0058) and unclassified SLCs (referred to as "No clan") as well as protein sequences belonging to the putative SLCs named Major Facilitator Superfamily containing domain (MFSD) and un-coordinated homolog protein (UNC-93) are noted in the left margin of the figure. Light orange boxes specify SLC families where protein PSI-BLASTs were performed. The SLC families are specified in the column furthest to the left. The white column provides information about the SLC family in humans and the green column specifies members identified in flies. Numbers within the pink column represent previously unidentified protein sequences that are not listed in the SLC table of D. melanogaster. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 2. Homologous proteins to human SLCs populating the Major Facilitator Superfamily (MFS) Pfam clan. 162 homologous proteins were identified in D. melanogaster using the HMM for the human SLCs with MFS motifs [SLC2 (Facilitative GLUT transporter family), SLC15 (Proton oligopeptide cotransporter family), SLC16 (Monocarboxylate transporter family), SLC17 (Vesicular glutamate transporter family), SLC18 (Vesicular amine transporter family), SLC19 (Folate/thiamine transporter family), SLC21 (Organic anion transporter family), SLC22 (Organic cation/anion/zwitterion transporter family), SLC29 (Facilitative nucleoside transporter family), SLC33 (Acetyl-CoA transporter family), SLC37 (Sugar-phosphate/phosphate exchanger family), SLC40 (Basolateral iron transporter family), SLC43 (Na+-independent, system-L-like amino acid transporter family), SLC45 (H+/sugar cotransporter family), SLC46 (Folate transporter family), SLC49 (FLVCR-related transporter family), SLC59 (Sodium-dependent lysophosphatidylcholine symporter family), SLC60 (Glucose transporters) and SLC61 (Molybdate transporter family)] and two predicted SLCs were manually added, CG18549 and CG17036. Most protein sequences were identified as homologous to the human SLC2, SLC16, SLC17 and SLC22 families (78 % of the identified proteins), while no homologues were identified for SLC40, SLC43, SLC59, SLC60 and SLC61. The pyramid shapes indicate SLC families, magenta texts indicate putative ("atypical") SLCs and their predicted fly homologues and dark blue texts show fly sequences identified with the HMM that do not cluster into any existing human SLC family. The proteins are presented in a polar tree layout with branch length 0.70 built using RAxML (40). Human SLC proteins (in bold) are called by their SLC nomenclature and D. melanogaster proteins are named by their protein name as defined on flybase.org. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In the SLC table database there are 18 families: SLC1, SLC8, SLC14, SLC20, SLC24, SLC25, SLC28, SLC30, SLC31, SLC34, SLC41, SLC42, SLC44, SLC48, SLC51, SLC52, SLC55 and SLC56, with a total of 125 protein sequences that are not classified into any existing Pfam clan. For these 125 proteins no HMMs were built. Instead, PSI-BLASTs using the human SLC protein sequences were used to search for similar protein sequences in the D. melanogaster proteome and thereby identify homologues, Supplementary data 5. The PSI-BLAST results and the available data in the D. melanogaster SLC table were combined and are presented in Table 11. No related protein sequences were identified for SLC14, SLC34, SLC48 and SLC51. One homologue each was identified for SLC20, SLC41, SLC42 and SLC52, while two homologues each were found for SLC1, SLC28, SLC44, SLC55 and SLC56. Furthermore, SLC8, SLC24, SLC30 and SLC31 were found to have three or more related protein sequences in D. melanogaster. The largest SLC family in humans, SLC25 with 53 members, was found to have 47 homologous proteins in D. melanogaster, making it also the largest SLC family in fly. All 47, except one (Mtch), were previously listed in the D. melanogaster SLC table (flybase.org/reports/FBgg0000686.html, released FB2021_02; Gene Group, 2021). However, it is important to note that PSI-BLAST is a less powerful method than the HMM for identifying homologues, and only 30 of the 47 homologues were found, Supplementary data 5.
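The comparison between a PSI-BLAST search and a reference family list, as reported for SLC25, can be sketched as a simple set comparison. This is a minimal illustration only: the function name `recovery` and the toy identifiers are hypothetical, and the only figures taken from the text are the 30 recovered out of 47 listed homologues.

```python
def recovery(reference, hits):
    """Compare search hits with a reference list of family members.

    Returns the recovered members, the missed members and the
    percentage of the reference list that the search recovered.
    """
    reference = set(reference)
    recovered = reference & set(hits)
    missed = reference - recovered
    percent = 100.0 * len(recovered) / len(reference) if reference else 0.0
    return recovered, missed, percent


# Toy identifiers (hypothetical, not real FlyBase names).
reference_family = {"fly_protein_1", "fly_protein_2", "fly_protein_3", "fly_protein_4"}
search_hits = {"fly_protein_1", "fly_protein_3", "unrelated_protein"}

recovered, missed, percent = recovery(reference_family, search_hits)
print(sorted(recovered), sorted(missed), f"{percent:.1f} %")

# The same arithmetic applied to the counts reported for SLC25:
# 30 of the 47 listed homologues were found by PSI-BLAST.
reported_percent = 100.0 * 30 / 47
print(f"{reported_percent:.1f} % of the SLC25 homologues recovered")
```

Hits outside the reference list (here `unrelated_protein`) are deliberately ignored by the percentage, since they are candidates for new members rather than missed ones.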

Discussion
Here we present a collection of 368 curated protein sequences in the fly that are related to the 430 SLCs in humans, which adds approximately 20 previously unidentified SLC protein sequences to the number established by Höglund et al. (Hoglund et al., 2011). Moreover, it is likely that this dataset contains all the SLC sequences present in the current assembly of the D. melanogaster proteome, as well as 13 curated atypical SLC sequences in D. melanogaster, Fig. 1.
In total, 56 of 65 SLC families (86 %) were conserved in D. melanogaster, whereas nine of 13 (69 %) of the atypical SLCs were conserved. The HMMs identified 55 protein sequences that have not previously been identified as SLCs in fly and, interestingly, 17 of these protein sequences were grouped into families that were not found in the D. melanogaster SLC table (flybase.org/reports/FBgg0000686.html, released FB2021_02; Gene Group, 2021).
Unfortunately, the HMMs also failed to identify 51 protein sequences that are listed in the D. melanogaster SLC table (flybase.org/reports/FBgg0000686.html, released FB2021_02; Gene Group, 2021), and therefore the completeness of the screen can be questioned. However, there are several possible explanations for the discrepancy. (I) It could be that the protein sequences did not pass the cutoff E = 10 when performing the HMM searches and, hence, the sequences were not included when constructing the phylogenetic trees. This occurred for 23 protein sequences that would otherwise have been annotated to the SLC5, SLC6, SLC26, SLC36, SLC39 and SLC50 families. For example, two protein sequences were identified by the HMM as homologous to SLC50A1 (slv (27.3 % aa identity) and CG7272 (22.4 % aa identity)), but both were excluded during the initial analysis since they did not pass the cutoff (E = 10). The slv sequence is annotated as the SLC50 orthologue (Limmer et al., 2014; Artero et al., 1998). However, the evolutionary relationship between the human SLC50A1 and slv has not yet been shown. The slv protein sequence was therefore added before constructing the phylogenetic tree seen in Fig. 5D. The model of SLCs of CPA/AT type did not identify any protein sequences related to the SLC10 family, but through PSI-BLAST two potential homologues, CG9903 and CG11655, were identified. So far, no orthologues have been described for this family, but according to preliminary data presented at flybase.org (FB2021_01, FBrf0174215; Larkin, 2020), these putative SLC10 proteins have similar functions as the human proteins. Therefore, the sequences were added before constructing the tree seen in Fig. 5B. A few protein sequences were not identified at all by the HMMs, which occurred for a total of five protein sequences belonging to the SLC7, SLC11, SLC29 and the atypical SLC MFSD11.
(II) In general, the amino acid sequence identities among the SLCs are low, and protein sequences have been sorted into the SLC superfamily based on their function rather than their structure and sequence identities. The current practice assigns a new sequence to an existing SLC family if the protein sequence has at least 20 % amino acid sequence identity to at least one other member of that SLC family (Hediger et al., 2013; Hediger et al., 2004). SLC families therefore have a great degree of variation in structure and primary sequence compared to other large superfamilies of membrane proteins, e.g. G protein-coupled receptors (Yu and The, 2004) and voltage-gated ion channels (Hellsten et al., 2017).
(III) Another possible explanation for the differences in the global alignment similarities could be the large variations in the N- and C-termini as well as the loops. It is also possible that the differences arise because the criterion of 20 % amino acid sequence identity among members has resulted in low amino acid identities between members; hence, an orthologue does not have to share similarities with all members of a subfamily, Supplementary data 4. This is observed for e.g. the human SLC38A10, which diverges in the global amino acid sequence from the other members of the SLC38 family and should, according to the criterion, not be included in the family, yet it is. SLC38A10 has a long C-terminus compared to the other members, which most likely results in the large difference in the observed global protein alignment (Pao et al., 1998). A local protein alignment between SLC38A10 and SLC38A1 showed an amino acid sequence identity above 20 %, illustrating that there are other parts of the protein sequence, such as the transmembrane regions, that are conserved. Hence, it is possible that if a rather time-consuming method had been used to study the sequences, with local protein alignment and secondary protein structure predictions for each sequence, more members to e.g. the SLC10, SLC16, SLC22, SLC35, SLC39, SLC38 and SLC46 families in humans would have been found.
(IV) Furthermore, the MFS superfamily, which is the largest group of classified transporters and transporter-related proteins across different phyla (Reddy et al., 2012; Saier et al., 1999; Schlessinger et al., 2010), contains protein sequences that are unique for certain phyla and are not evolutionarily conserved (Hoglund et al., 2011). This might result in a low alignment score of the multiple alignments and therefore also a skewed model of conserved regions that the HMM will use. Meanwhile, the Pfam clans contain several protein families with various numbers of members, where a family with many members could have a great impact on the HMM compared with a family with few members. It is therefore possible that some information could be lost by the model and, hence, the HMM fails to identify protein sequences that have been annotated as SLC orthologues through functional studies. We therefore believe that several of the 51 protein sequences were undiscovered due to large variations in primary sequence compared with the conserved motifs that the HMMs were based on. This could possibly have been avoided by building HMMs based on the sequences from each family. However, by constructing the models based on the conserved motifs, most likely more homologous proteins were identified. As seen in the phylogenetic tree of the SLCs of MFS type, Fig. 2, the HMM identified not only annotated SLC proteins in fly, but also new sequences belonging to the families and an additional eight protein sequences that do not cluster with any existing family. Moreover, these eight protein sequences indicate that there are SLC-related proteins that are not found in human and are only present in other species, which has been observed before (Hoglund et al., 2011). This shows the importance of not basing the searches on a single family, since that would most likely lead to more unidentified sequences. Taken together, we therefore believe that the present screen is constructed in the format best suited to identify homologous SLCs in D. melanogaster.

Two members of the SLC22 (SLC22A18, SLC22A32, Fig. 2) and the SLC38 (SLC38A7, SLC38A8, Fig. 3) families do not cluster with their respective SLC family when constructing a tree with both human and fly protein sequences. However, they do when excluding the fly protein sequences. SLC22A18 has earlier been indicated to diverge from the rest of the SLC22 family members (Schiöth et al., 2013), and SLC22A32 was recently added as a member of the SLC22 family. It is possible that SLC22A32 clusters together with the putative SLCs described by Perland and colleagues (Perland et al., 2017) because it has sequence similarities to the other major facilitator superfamily domain (MFSD) proteins. A reason why SLC22A18 clusters as it does could be that it has the highest amino acid sequence identity to SLC22A32 (Perland et al., 2017). SLC38A7 and SLC38A8 have previously been found to share a common ancestor and are believed to have related functions (Goberdhan, 2010). In Fig. 3, both these proteins are found to cluster with the SLC5 family. It is possible that these proteins are more similar to one of the related protein sequences identified within the SLC5 subfamily. An observation is that many of the identified D. melanogaster protein sequences that display low amino acid sequence identity to the human homologues are generally shorter. The length of the sequences can certainly be a reason why the global pairwise alignment score differs, but the amino acid composition might still be similar. It would also explain why HMMs identify orthologues with low global amino acid sequence identity.
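The effect of sequence length on global identity, and its interaction with the 20 % criterion, can be illustrated with a toy calculation. The sketch below is not the alignment procedure used in this study: the function names are hypothetical, the two sequences are invented (they are not SLC38A10 or any real protein), and the alignment is assumed to be given with "-" as the gap character. It only shows how a long unmatched terminus dilutes identity measured over the full alignment, while identity over the aligned columns stays high.

```python
def percent_identity(aligned_a, aligned_b, ignore_gaps=False):
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    ignore_gaps=False: identical residues / all alignment columns (global view).
    ignore_gaps=True:  identical residues / columns where both sequences have
                       a residue (a rough proxy for a local comparison).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # empty column, carries no information
        has_gap = x == "-" or y == "-"
        if has_gap and ignore_gaps:
            continue
        compared += 1
        if not has_gap and x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_family_criterion(identity_percent, threshold=20.0):
    """Apply the 20 % amino acid identity rule described in the text."""
    return identity_percent >= threshold


# Invented example: a conserved 10-residue core, one sequence with a long tail.
core_long = "MKTLLVAGGA"
core_short = "MKTALVSGGA"
tail = "QWERTYIPASDFGHKLCVNM" * 2  # 40 residues present in only one sequence

aligned_long = core_long + tail
aligned_short = core_short + "-" * len(tail)

global_id = percent_identity(aligned_long, aligned_short)
core_id = percent_identity(aligned_long, aligned_short, ignore_gaps=True)

print(f"global: {global_id:.1f} %, passes: {meets_family_criterion(global_id)}")
print(f"aligned region: {core_id:.1f} %, passes: {meets_family_criterion(core_id)}")
```

In this invented case the same pair of sequences fails the threshold when the unmatched tail is counted and passes it when only the shared region is compared, which mirrors the argument made for sequences with long termini.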
HMMs were not constructed for SLCs that lack a Pfam clan classification since these SLC families do not have an amino acid sequence identity to other families and their members. Several of these families contain too few members to perform a correct alignment of each family alone, which together with the low sequence identity would have resulted in incorrect alignments and difficulties in identifying SLCs in the D. melanogaster proteome. This problem could have been avoided by adding other species to the analysis, such as teleost fish, e.g. D. rerio (zebrafish), S. richardsonii (snow trout) or T. putitora (golden mahseer), but that was not the aim of this article. However, there are a few elegant publications that address the SLCs in teleost fish, where approximately 338 homologous protein sequences have been identified, and that discuss the evolutionary relationship to human SLCs (Verri et al., xxxx; Barat et al., 2019; Barat et al., 2016).

Metabolic and neurobiological pathways and mechanisms are well conserved between human and D. melanogaster (Hoglund et al., 2011; Bellen et al., 2010). It is known that the fruit fly and mammals use many of the same neurotransmitters (Martin and Krantz, 2014), which also explains the high conservation of the SLC6 family between humans and D. melanogaster. Furthermore, the diverse and large families of SLC22 and SLC25 in human contain transporters with very specific expression and function, and hence it is likely that these transporters are well conserved in D. melanogaster. The high conservation was also observed when performing the global pairwise alignment, where it was obvious that some subfamilies of the SLC superfamily were more conserved than others, Tables 1 to 10.
Besides the SLC6 family, the SLC18 and SLC36 families, belonging to the SLCs of APC type, all had members that performed the same or similar functions in fruit fly as in human. Moreover, members of the SLC36 family in both human and fruit fly have been found to be important in the vital mTOR pathway (Jewell and Guan, 2013; Cheatham et al., 1994). These similarities between the species are good examples of the possibilities to study the physiology and pathophysiology of SLCs in invertebrates.
In human, the SLC2 family transports various sugars in different tissues and with varying affinities (Mueckler and Thorens, 2013) to regulate glucose homeostasis by feedback mechanisms, e.g. through insulin release (Mattila and Hietakangas, 2017). Even though D. melanogaster is commonly used as a model to understand human metabolic pathways (Ceddia et al., 2003), there are differences in the sugar metabolic pathway (Hall et al., 2007; Becker et al., 1996; Reimer, 2013), which could explain why the fruit fly has more SLC2 transporters.
The SLC17 family has nine members in human and can be considered a functionally diverse family of organic anion transporters found in various tissues in human (Hunter et al., 2009). In D. melanogaster, almost three times more SLC17-related proteins, 26 in total, were identified. Some of the homologues are differently expressed during the developmental stages of D. melanogaster according to flybase.org (FB2021_02; Larkin et al., 2020), which could be an explanation for the vast number of SLC17-related proteins. However, this is speculation and needs to be confirmed by further studies. It is important to note that several of the identified protein sequences in D. melanogaster do not have a confirmed function. However, the suggested one-to-one orthologous relationship between genes (e.g. SLC17 family in human and fly) found using the phylogenetic analysis, as well as the predictions from the online software InterPro (Mi et al., 2013) and PANTHER (Thomas et al., 2003; Kendrick et al., 2017), suggest that the function is conserved among different phyla and indicate that these findings are reliable.
The SLC3 family is one of the SLC families that does not have a transporter function by itself; instead, the members act as subunits to the SLC7 family (Fredriksson et al., 2003). It has been argued that this family should not be classified into the SLC superfamily, mainly because there are several proteins that act as subunits to other SLC subfamilies that are not included in the SLC superfamily, for example CD147 (basigin) to the SLC16 family of monocarboxylate transporters (Halestrap and Meredith, 2004; Suda et al., 2018). However, due to the present classification of SLCs, the SLC3 family was included in the analysis. In 2009, the D. melanogaster protein CD98hc was suggested to be the orthologue to human SLC3A1 and SLC3A2, and it was found to share both function and protein structure with the human SLC3 proteins (Reynolds et al., 2009). CD98hc was also identified as a fly homologue to the SLC3 family in our phylogenetic analysis. Interestingly, however, the global amino acid sequence identity did not pass the criterion of 20 % amino acid sequence identity (Hediger et al., 2013; Hediger et al., 2004). This suggests that CD98hc should not be grouped into the SLC3 family. These findings also suggest that the present criterion used for classifying SLCs might need rephrasing and that it might not always be applicable when studying the evolutionary relationship between species.

Some of the SLC families without a clan affiliation, e.g. the high-affinity glutamate and neutral amino acid transporters (SLC1) and the mitochondrial transporters (SLC25) (Limmer et al., 2014; Cui et al., 2016; Featherstone, 2011; Carrisi et al., 2008), have an extensive and deep-rooted research history in several model organisms including D. melanogaster. The physiological systems that these SLCs operate in are of great importance for both vertebrates and invertebrates to maintain a healthy, normal internal environment. For instance, SLC1 is needed for neurotransmitter storage, release and recycling (Martin and Krantz, 2014; Carrisi et al., 2008); members of the SLC25 family are crucial to provide the cells with energy through the electron transport chain (Slabbaert et al., 2016); and both the SLC1 and SLC25 families are associated with neuronal survival and fundamental neurobiological processes (Limmer et al., 2014; Martin and Krantz, 2014). Despite this, our knowledge about the SLC superfamily and its subfamilies is limited and there are still transporter proteins that we know little to nothing about.
Conclusion
We have scanned the sequenced fly proteome for SLCs and found a total of 381 SLCs, and thereby shown that 84 % of all the SLC families in humans have an equivalent in the fruit fly. To our knowledge, this provides the first collection of all SLC sequences in D. melanogaster and we believe that our SLC dataset can aid further research about the SLCs in several species. Our results highlight the importance of SLCs and how vital it is to be able to use model organisms. We also believe that the results point to the possibility of using D. melanogaster to further investigate the function of the SLC superfamily and its role in conserved cellular processes as well as in health and disease.

Fig. 3. Protein sequences in D. melanogaster that are related to the human SLCs populating the Amino Acid-Polyamine-Organocation (APC) Pfam clan. 54 protein sequences were identified in D. melanogaster using the HMM for the human SLCs populating the APC Pfam clan [SLC4 (Bicarbonate transporter family), SLC5 (Sodium glucose cotransporter family), SLC6 (Sodium- and chloride-dependent neurotransmitter transporter family), SLC7 (Cationic amino acid transporter/glycoprotein-associated family), SLC11 (Proton-coupled metal ion transporter family), SLC12 (Electroneutral cation-coupled Cl cotransporter family), SLC23 (Na+-dependent ascorbic acid transporter family), SLC26 (Multifunctional anion exchanger family), SLC32 (Vesicular inhibitory amino acid transporter family), SLC36 (Proton-coupled amino acid transporter family) and SLC38 (System A and System N sodium-coupled neutral amino acid transporter family)]. No sequences were identified for the SLC11 and SLC23 subfamilies. The proteins are presented in a polar tree layout with branch length 0.80 built using RAxML (40) and the colored shapes indicate different SLC families. Human SLC proteins (in bold) are called by their SLC nomenclature and D. melanogaster proteins are named by their protein name as defined on flybase.org.

Fig. 4. Protein sequences identified via the HMMs for SLCs in the Drug/Metabolite Transporter (DMT) Pfam clan and Ion Transporter (IT) Pfam clan. In D. melanogaster, 23 protein sequences related to the human SLCs [SLC35 (Nucleoside-sugar transporter family), SLC39 (Metal ion transporter family) and SLC57 (NiPA-like magnesium transporter family)] populating the DMT Pfam clan, and 8 protein sequences related to the human SLCs [SLC13 (Human Na+-sulfate/carboxylate cotransporter family) and SLC53 (Phosphate carriers)] in the IT Pfam clan were identified. (A) The DMT Pfam clan families are specified by different colors: purple indicates the SLC35 family with 13 fly protein sequences, green the SLC39 family with 9 fly protein sequences and blue the SLC57 family with one identified protein sequence in the fly. (B) There are only two families populating the IT Pfam clan so far, SLC13, in dark pink, and SLC53, in light pink. Four proteins from the fly proteome were identified as related protein sequences to SLC13 and SLC53, respectively. The phylogenetic relationships are presented in radial trees with branch length (A) 0.80 and (B) 0.40 built using mrBayes 3.2.7 (39). Human SLC proteins are called by their SLC nomenclature and D. melanogaster proteins are named by their protein name as defined on flybase.org. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. Phylogenetic representation of relationships for the remaining clan-classified SLCs. Phylogenetic trees for protein sequences in the Acyl-CoA synthetases, NRPS adenylation domains and Luciferase enzymes (ANL) clan [SLC27 (Fatty acid transporter family)], Cation:Proton Antiporter/Anion transporter (CPA/AT) clan [SLC9 (Na+/H+ exchanger family), SLC10 (Sodium bile salt cotransport family)], Frizzled cysteine-rich domain-related (Fz) clan [SLC65 (NPC-type cholesterol transporters)], MtN3-like clan [SLC50 (Sugar efflux transporters), SLC54 (Mitochondrial pyruvate carriers)] and Triose phosphate IsoMerase (TIM)_barrel clan [SLC3 (Heavy subunits of the heteromeric amino acid transporters)] were generated using mrBayes 3.2.7 (39). Human SLC proteins are called by their SLC nomenclature and D. melanogaster proteins are named as defined at flybase.org. (A) The phylogenetic tree illustrates the relationship between the human protein sequences belonging to the SLC27 family, the fly Fatp1, Fatp2 and Fatp3 proteins (green) and 21 additional protein sequences that were mined with the HMM. (B) The phylogenetic tree of CPA/AT displays the phylogenetic analysis between the fly and human SLC9 and SLC10 families. (C) In D. melanogaster, one protein sequence, CD98hc, was found to cluster together with SLC3A1 and SLC3A2, while an additional 10 protein sequences were identified with the HMM. (D) Four homologous proteins were identified for the SLC54 family (light orange), and the fly slv was found to be a homologue to human SLC50A1. (E) Npc1a and Npc1b were identified as homologs in D. melanogaster using the HMM for the SLC65 family. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The table also presents function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: Ions within brackets indicate the transport coupled-ion, F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 2

Table 3 Summary of human and D. melanogaster SLCs belonging to the Drug/Metabolite Transporter clan.
Summary of the human SLC of DMT type divided into subfamilies, members, substrate profile, transporter type as well as the identified homologue/homologues in flies that were most similar (through amino acid identity) to the human SLC. One human SLC can be presented to have more than one homologue as the most similar, meaning that the D. melanogaster proteins presented next to a human SLC had the highest amino acid identity to that particular SLC in the subfamily. The table also presents function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: Ions within brackets indicate the transport coupled-ion, F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 4 Summary of human and D. melanogaster SLCs belonging to the Ion Transporter clan.
The human SLC of IT type divided into subfamilies, members, substrate profile, transporter type and the identified fly homologue/homologues that were most similar (through amino acid identity) to the human SLC. One human SLC can be presented to have more than one homologue as the most similar. The table also presents function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: Ions within brackets indicate the transport coupled-ion, F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 5 Summary of human and D. melanogaster SLCs belonging to the Acyl-CoA synthetases, NRPS adenylation domains and Luciferase enzymes (ANL) clan.
Summary of the human SLC populating the ANL clan, with information regarding subfamilies, members, substrate profile, transporter type as well as the identified fly homologue/homologues that were most similar (through amino acid identity) to the human SLC. One human SLC can be presented to have more than one homologue as the most similar. The table also displays function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 6 Summary of human and D. melanogaster SLCs belonging to Cation:Proton Antiporter/Anion transporter clan.
SLCs of CPA/AT type are divided into subfamilies, members, substrate profile, transporter type, and identified homologous protein sequences in flies that were most similar (through amino acid identity) to the human SLC. One human SLC can have more than one homologue as the most similar. The table also states the function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: Ions within brackets indicate the transport coupled-ion, F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 7 Summary of human and D. melanogaster SLCs belonging to the Triose phosphate IsoMerase (TIM) barrel clan.
Summary of the human SLC3 subfamily, with information about members, substrate profile, transporter type as well as the most similar homologue/homologues in flies. One human SLC can be presented to have more than one homologue as the most similar, meaning that the D. melanogaster proteins presented next to a human SLC had the highest score to more than one member. The table also presents function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 8 Summary of human and D. melanogaster SLCs belonging to the Frizzled cysteine-rich domain-related clan.
Summary of the SLC65 subfamily, its members, substrate profile, transporter type as well as the identified fly homologue/homologues that were most similar (through amino acid identity) to the human SLC. The table also presents the function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: Ions within brackets indicate the transport coupled-ion, F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).
D. melanogaster exhibit similar functions as the human proteins. Therefore, a potential one-to-one orthologue relationship between humans and D. melanogaster could be proposed, Table 2, e.g. within the SLC5, SLC6, SLC7 and SLC36 families. The SLC23 family is not included in the D. melanogaster SLC table; however, the PSI-BLAST search using the human protein sequences and aligning them against the fly proteome revealed one possible homologue, CG6293, a protein of unknown function sharing approximately 44 % amino acid identity, Supplementary data 5.
also present function of the D. melanogaster proteins if found on flybase.org; predictions were not included. (Transporter type: F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 10 Summary of human and D. melanogaster SLCs belonging to the Thioredoxin-like clan.
The human SLC58 subfamily divided into members, substrate profile, transporter type as well as the identified fly homologue/homologues that were most similar (through amino acid identity) to the human SLC and its function (if available). Predicted functions were not included. (Transporter type: Ions within brackets indicate the transport coupled-ion, F = Facilitator, C = Cotransporter, E = Exchanger, Ch = Channel, O = Orphan).

Table 11 Summary of possible fly members of the SLC subfamilies that are not classified into a Pfam clan.
Summary of potential D. melanogaster homologous protein sequences that were found when aligning the human no-clan SLC subfamilies to the D. melanogaster proteome. Protein sequences were identified using the NCBI/NIH PSI-BLAST suite (blast.ncbi.nlm.nih.gov) and the searches were carried out against the human SLC protein sequences. The table describes SLC family abbreviation, number of members in humans and the name of the identified sequences in D. melanogaster. For SLC14, eight proteins are listed in the human SLC table at slc.bioparadigms.org, but the eight proteins share only two protein sequences; therefore, the number of members in that family is calculated as two and the number within the brackets indicates the number listed in the online database. No symbol = protein sequence identified by PSI-BLAST that was already listed in the D. melanogaster SLC table found on Flybase.org, * = found only in the D. melanogaster SLC table on Flybase.org, † = protein homologue only found through PSI-BLAST.
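The symbol legend of Table 11 amounts to a three-way comparison between the PSI-BLAST hits and the FlyBase SLC table. A minimal sketch of that bookkeeping is given below; the function name `annotate` and the gene names are hypothetical placeholders, not entries from the table.

```python
from __future__ import annotations


def annotate(psiblast_hits: set[str], flybase_table: set[str]) -> dict[str, str]:
    """Assign the Table 11 symbol to every sequence in either source.

    ""  : identified by PSI-BLAST and already listed in the FlyBase SLC table
    "*" : listed only in the FlyBase SLC table
    "†" : found only through PSI-BLAST
    """
    symbols = {}
    for name in sorted(psiblast_hits | flybase_table):
        if name in psiblast_hits and name in flybase_table:
            symbols[name] = ""
        elif name in flybase_table:
            symbols[name] = "*"
        else:
            symbols[name] = "†"
    return symbols


# Placeholder names for illustration only.
symbols = annotate(
    psiblast_hits={"geneA", "geneC"},
    flybase_table={"geneA", "geneB"},
)
for name, symbol in symbols.items():
    print(f"{name}{symbol}")
```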