In Silico Characterization of MATE Genes in Wheat


Background: Multidrug and toxic compound extrusion (MATE) genes encode a group of multidrug efflux transporters that exist widely in all living organisms and play a major role in detoxification by exporting heavy metals, metalloids, exogenous xenobiotics and endogenous secondary metabolites out of the cells. However, in silico analysis of the MATE gene family in plant species is very limited, and such analysis is needed in wheat.
Results: We identified 44 MATE genes in wheat and categorized them into seven families based on phylogenetic analysis. Further, 43 genes were found to interact at the protein level using STRING software. The maximum number of exons, 14, was found in genes TraesCS6A02G418800.1 and TraesCS6D02G407900.1. We employed MEME software to find protein motifs associated with the MATE genes, with the maximum number of motifs set to 22. The protein motifs of families 1, 2 and 3 were significantly different from the rest. Using the Wheat Expression Browser tool, we found that the majority of MATE genes were expressed under biotic stress caused by disease infestation; the highest level of expression was shown by gene TraesCS5B02G326600.1 of family 1, which was expressed during Fusarium head blight infestation by Fusarium graminearum 4 days after inoculation. A total of 39 ternary plots consisting of homoeologous genes for 39 MATE genes, showing different levels of expression during biotic and abiotic stress conditions, were composed; 44% of the triads tended to show non-balanced expression (extreme values) owing to their higher tissue specificity and greater intensity.
Conclusion: The results of this study indicate that a total of 44 MATE genes are directly involved in the metabolism of wheat and are expressed under different biotic and abiotic stress conditions. Such genes can be further evaluated for their interaction with toxic heavy metal elements and their sequestration from the cells.

Recent reports suggest that the study of the MATE gene family and its diverse roles extends to numerous other plant species such as rice and Arabidopsis, tomato, soybean (Liu et al., 2016), cotton (Lu et al., 2018), Sorghum bicolor (Sivaguru et al., 2013), blueberry (Chen et al., 2015) and Camellia sinensis (Chen et al., 2020), but in wheat the study of the MATE gene family needs to be elaborated.
Completion of wheat whole-genome sequencing in 2018 has opened up the area of gene annotation, especially in silico, making it possible to characterize the MATE gene family and to apply it in developing gene-based molecular markers for plant breeding purposes. In the current study, MATE genes in wheat (Triticum aestivum L.) were identified and the interaction among the genes at the protein level was analysed. In addition, the in silico expression of MATE genes and their homoeologous candidates in different genomes was studied for various biotic and abiotic stress conditions, and protein motifs were analysed.

Selection of MATE genes and their chromosomal location
The bread wheat genome comprises approximately 107,891 coding genes and 12,853 non-coding genes (Appels et al., 2018), out of which 44 MATE genes were identified with the help of the UniProt and UniParc databases, and the corresponding nucleotide and protein sequences were obtained (Additional files 1 & 2). The details of the 44 MATE genes, including genome and protein IDs, size of the gene and protein, number of exons in each gene sequence and their source, are presented in Table 2. Three different diploid wheat progenitors, viz. T. urartu, Ae. speltoides and Ae. tauschii (contributing the A, B and D genomes, respectively), contributed to the genomic evolution of modern hexaploid wheat. Out of the total MATE genes, 33 were displayed according to their chromosomal locations (Fig. 1) in a karyotype view retrieved from the Gramene server (https://ensembl.gramene.org/Triticum_aestivum/Location/Genome?time=1600500223).
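The gene identifiers used throughout this paper already encode the chromosome and subgenome. The following minimal sketch, assuming identifiers of the form seen in the text (for example TraesCS6A02G418800.1), extracts that information and tallies genes per subgenome. The function name and the short list of identifiers are illustrative only; identifiers on unassigned scaffolds are not handled.

```python
import re
from collections import Counter

# Pattern inferred from identifiers quoted in the text, e.g. TraesCS6A02G418800.1:
# "TraesCS" + chromosome group (1-7) + subgenome (A, B or D) + two digits + "G" + six digits.
# Suffixes such as ".1" or ".1.cds1" are ignored because only the prefix is matched.
GENE_ID = re.compile(r"^TraesCS([1-7])([ABD])\d{2}G\d{6}")

def chromosome_of(gene_id: str) -> str:
    """Return the chromosome label (e.g. '6A') encoded in a gene identifier."""
    match = GENE_ID.match(gene_id)
    if match is None:
        raise ValueError(f"unrecognised gene identifier: {gene_id!r}")
    return match.group(1) + match.group(2)

# A few identifiers quoted in this paper, used here only as an example input.
example_ids = [
    "TraesCS6A02G418800.1",
    "TraesCS6D02G407900.1",
    "TraesCS5B02G326600.1",
    "TraesCS2B02G247700",
    "TraesCS1A02G188100.1.cds1",
]

chromosomes = [chromosome_of(g) for g in example_ids]
per_subgenome = Counter(label[1] for label in chromosomes)
print(chromosomes)
print(dict(per_subgenome))
```

Applied to all 44 identifiers, the same tally would give the distribution of MATE genes across the A, B and D subgenomes.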

Phylogenetic analysis of MATE genes in wheat
Studies have revealed that phylogenetic analyses of membrane transporters are generally too inaccurate to predict specific substrates. However, the phylogeny of the MATE family has proved relatively useful for predicting affinities with potential molecule groups, such as organic acids (citrate), alkaloids (nicotine) and flavonoids (anthocyanin, proanthocyanidin etc.). Multiple sequence alignments of MATE protein sequences are generally carried out using ClustalX 2.1 software with its default settings (Peng et al., 2012). Following this approach, we employed the Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) and Phylogeny.fr TreeDyn (http://www.phylogeny.fr/one_task.cgi?task_type=treedyn) tools to construct a multiple sequence alignment of the 44 identified full-length protein sequences. The complete multiple alignment profiles of the protein and nucleotide sequences were used to establish a phylogenetic tree. The neighbour-joining (NJ) tree topology was employed to study the MATE gene family (Fig. 2).
Here, we estimated seven major groups or families of MATE genes by increasing the scale from 0.1 to 0.12, and the results are categorised in Table 1.
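Neighbour-joining starts from a matrix of pairwise distances between the aligned sequences. The sketch below computes the simplest such measure, the p-distance (proportion of differing sites), on toy aligned fragments. The text does not state which distance measure the tools used, so the p-distance is an assumption chosen for illustration, and the sequences are hypothetical rather than real MATE proteins.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in compared) / len(compared)

# Toy aligned fragments (hypothetical, for illustration only).
alignment = {
    "seqA": "MKT-LLVA",
    "seqB": "MKTALLVA",
    "seqC": "MRTALIVG",
}

# Upper triangle of the distance matrix that a tree-building method would consume.
distances = {
    (i, j): p_distance(alignment[i], alignment[j])
    for i, j in combinations(alignment, 2)
}
for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

Sequences that end up in the same family are those separated by small distances; the scale value mentioned above then determines where the tree is cut into groups.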
In the earlier study by Liu et al. (2016), genome-wide analysis of MATE genes in soybean confirmed that 117 MATE genes could be classified into four primary clades or families, with a maximum likelihood (ML) tree constructed using the MEGA 6.0 tool. In maize, a total of 49 ZmMATE genes were recognized and grouped into seven clusters in a neighbour-joining tree using MEGA 5.0 software (Zhu et al., 2016). Similarly, to study the comparative evolutionary history and relationships of the MATE gene family among two species of cotton and Arabidopsis, a total of 196 putative MATE genes were analysed, of which 68 were from G. arboreum (GaMATE), 70 from G. raimondii (GrMATE) and 58 from Arabidopsis. Based on phylogenetic classification, these MATE genes were categorised into three subfamilies using the NJ method of MEGA 6.0 software, where the M1 subfamily was the largest with 124 genes, followed by M2 with 48 genes and M3 with 24 genes (Lu et al., 2018). From this, it may be concluded that the classification implemented in this study is in line with prior findings.
Further, the multiple sequence alignment of the 44 genes was produced to show the variation at the nucleotide level (Additional file 3). The nucleotide sequence alignment was performed using the Multalin tool (http://multalin.toulouse.inra.fr/multalin/), where the 44 MATE genes are displayed in three colours, red, blue and black, signifying high consensus, low consensus and neutral, respectively. We noticed deletion regions at several places in the nucleotide sequence alignments, which are highlighted in Fig. 3. These evolutionary footprints are of immense importance for understanding the basic structure of wheat genes. Similarly, according to Ma et al. (2011), in rice and Arabidopsis the multiple peptide sequence alignment of the plastocyanin-like domains (PCLDs) determined the conserved amino acids involved in copper binding. In rice, all the MAPKKK (mitogen-activated protein kinase kinase kinase) genes, which are exclusively involved in signal transduction pathways, were classified under the Raf, ZIK and MEKK subfamilies. The rice MAPKKKs were therefore analysed together with those of Arabidopsis for all three subfamilies by creating multiple sequence alignments of kinase domains with the Multalin program in order to detect specific conserved signature sequences. It was revealed that in subfamily Raf, 43 MAPKKKs in rice and 48 in Arabidopsis were conserved; in subfamily ZIK, about 10 MAPKKKs in rice were identified; and in MEKK, 22 MAPKKKs from rice and 21 from Arabidopsis were found to be conserved (Rao et al., 2010). Also, in blueberry (Vaccinium corymbosum), out of a total of 33 MATE genes, the multiple sequence alignments of 8 novel VcMATE protein sequences along with selected MATE transporter orthologs were analysed using ClustalX software.
Here, it was found that VcMATE2 shared the highest level of identity with the known flavonoid transporters, while VcMATE1 and VcMATE4 exhibited the lowest similarities to the MATE-type flavonoid transporters (Chen et al., 2015).

PROTEIN-PROTEIN INTERACTION
It has been documented that cells communicate with each other through protein-protein interactions and perform all the physiological processes of life through interactions of various proteins (Szklarczyk et al., 2017). Using the STRING online program, we constructed the association network of protein interactions among 43 proteins belonging to Triticum aestivum out of the total of 44 protein sequences (the remaining protein sequence, A0A3B6AXC7, is assigned to Triticum urartu and did not participate in the interaction). The primary interaction unit in STRING is the 'functional association', a link between two proteins that both contribute to a specific biological function (Szklarczyk et al., 2017). Here, the network (Fig. 4) was expanded by an additional 20 proteins (through the MORE button in the STRING interface) to obtain a clearer picture of the interaction, and the confidence cut-off for screening interaction links was set to "medium confidence" at 0.4. Fig. 5 shows the results retrieved after entering the set of 43 protein sequences projected to be involved in the efflux of toxic multidrug compounds from the cells and tissues of the wheat plant. The network statistics of the set of proteins identified in functional subsystems revealed 53 nodes, with an average node degree of 1.55 and a PPI enrichment p-value of 3.9e-12. Further, we noticed that protein sequences from the same phylogenetic group interacted closely and clustered together in the association network.
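For an undirected network, the average node degree is twice the number of edges divided by the number of nodes. The short sketch below applies this relation to the reported statistics (53 nodes, average node degree 1.55), which imply roughly 41 edges. The edge count is derived arithmetic and is not stated in the text; the small edge set is a hypothetical example.

```python
def average_node_degree(n_nodes: int, edges: set) -> float:
    """Average degree of an undirected network: each edge contributes to two nodes."""
    return 2 * len(edges) / n_nodes

# Hypothetical toy network: 4 proteins, 2 undirected links.
toy_edges = {frozenset(("P1", "P2")), frozenset(("P2", "P3"))}
toy_degree = average_node_degree(4, toy_edges)

# Back-calculation from the statistics reported above.
reported_nodes = 53
reported_avg_degree = 1.55
implied_edges = reported_avg_degree * reported_nodes / 2

print(toy_degree)
print(round(implied_edges))
```

An average degree below 2 indicates a sparse network in which many proteins have only one partner or none at the chosen confidence cut-off.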
A similar approach was used in Brassica rapa, where proteins involved in the biosynthesis of camalexin (involved in resistance against Botrytis cinerea and Alternaria brassicae) were found to interact through functional protein association networks analysed using STRING database version 10.5. There, a phytoalexin deficient 3 (PAD3) gene was identified as a key functional node, along with the CYP71A12 gene as a potential functional partner with which all the other proteins were found to be associated (Gaur et al., 2018). In the in silico analysis of functional linkage among arsenic-induced MATE genes in rice by Seth et al. (2019), 37 MATE genes were found to interact at the protein level. Also, in rice (Oryza sativa ssp. japonica), around 30 genes were identified from the AbS (abiotic stress) responsive gene family, involved in stress-responsive signalling during various abiotic conditions such as drought, submergence, cold, salinity and metal toxicity. Out of these 30 seed proteins, 22 were found to be extensively involved in the protein-protein interaction network, along with 34 additional derived neighbours, using STRING software, showing closely related functional modules and the complexity of AbS (Muthuramalingam et al., 2017).

STRUCTURAL ANALYSIS OF MATE GENES IN WHEAT
The phases and structures of the exons and introns of the MATE genes were examined using Ensembl Plants (https://plants.ensembl.org/Triticum_aestivum/Transcript/Summary). This analysis gives more insight into the evolution of gene structures in wheat and provides detailed information on transcript and translation length, number of coding exons, amino acids and base pairs, along with the transcript diagram. A total of 44 exon-intron transcripts were identified (Additional file 4). The maximum number of exons, 14, was found in genes TraesCS6A02G418800.1 and TraesCS6D02G407900.1, while the minimum, a single exon, was found in the gene sequences TraesCS1A02G188100.1.cds1, TraesCS5B02G562500.1.cds1, TraesCS6A02G256400.1.cds1 and TraesCS6D02G384300.1.cds1. Fan et al. (2014) suggested, on the basis of phylogenetic tree analysis, that MATE genes were mainly grouped into three subfamilies and that the intron-exon structures were subfamily-specific, indicating that the cotton MATE genes were considerably conserved and functionally diversified.
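Exon counts like those reported above can also be tallied offline from a GFF3 annotation file instead of being read from each transcript summary page. The following is a minimal sketch, assuming Ensembl-style GFF3 records in which each exon feature carries a `Parent=transcript:<id>` attribute. The records shown are hypothetical toy data, not real wheat annotation.

```python
from collections import Counter

def exons_per_transcript(gff3_lines):
    """Count exon features per parent transcript in GFF3-formatted lines."""
    counts = Counter()
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # GFF3 has nine tab-separated columns; column 3 is the feature type
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        # An exon may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                # Drop an optional "transcript:" prefix (assumed Ensembl convention).
                counts[parent.split(":", 1)[-1]] += 1
    return counts

# Hypothetical toy records (coordinates and names are invented).
toy_gff3 = [
    "##gff-version 3",
    "1A\ttoy\tgene\t100\t900\t.\t+\t.\tID=gene:toyGene",
    "1A\ttoy\texon\t100\t250\t.\t+\t.\tParent=transcript:toyGene.1;Name=e1",
    "1A\ttoy\texon\t400\t900\t.\t+\t.\tParent=transcript:toyGene.1;Name=e2",
    "1A\ttoy\texon\t100\t900\t.\t+\t.\tParent=transcript:toyGene.2",
]

exon_counts = exons_per_transcript(toy_gff3)
most_exons = max(exon_counts, key=exon_counts.get)
print(dict(exon_counts), most_exons)
```

Running the same function over the annotation of the 44 MATE transcripts would reproduce the maximum and minimum exon numbers in a single pass.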

PROTEIN MOTIFS
Protein structures generally have conserved elements called motifs, which have a considerable influence on the function of proteins (Conklin, 1995). The function of a protein usually imposes tight constraints on the evolution of specific regions of its structure; residues directly or indirectly involved in a function are often clustered in a short sequence motif (signature, pattern, framework or fingerprint) that is conserved across the various proteins sharing that function (Manning et al., 1998). The online software Multiple EM for Motif Elicitation (MEME) was employed to analyse the motifs in MATE proteins (Bailey et al., 2015). The conserved protein motifs in the wheat MATE protein sequences were identified and predicted using the MEME tool, with the maximum number of motifs set at 22. The motifs of the wheat MATE proteins are shown as coloured boxes, each motif represented by a number in its box, and are listed according to families 1 to 7 of the phylogenetic tree (Additional file 5). Zhu et al. (2016) observed that most closely related members within the same family usually have common motif compositions, indicating their functional similarities. In this case, a maximum of 22 conserved motifs were identified and represented as differently coloured boxes, one colour per motif consensus. The types and sequences of the protein motifs of families 1, 2 and 3 were significantly different from the rest. Further, families 3 and 5 had 19 and 14 protein motifs, respectively, while the rest had 22. It may be concluded that the interacting MATE genes or protein sequences within a family also have similar protein motifs.
An earlier study defined motifs as short DNA or protein sequences that contribute to the biological functions of the sequences in which they reside, where they become one of the basic functional units of molecular evolution (Grant et al., 2011). Similar work was done by Liu et al. (2016) in a genome-wide analysis of MATE transporters in soybean using MEME software, where a maximum of 12 conserved motifs were analysed; identical types of motif sequences were present in the first three families, while the fourth family was significantly different and had far fewer motifs.

IN SILICO EXPRESSION ANALYSIS AND THEIR HOMOEOLOGOUS CANDIDATES
Like other plants, wheat has MATE transporter proteins that are responsible for controlling different expressions and functions during vegetative growth, reproductive development and senescence, as well as resistance to biotic and abiotic stresses. Heat maps of gene expression were obtained for the 7 families, with all genes displayed according to their phylogenetic associations (Additional file 6). Family 1 contained two genes with constitutive, high expression, TraesCS5B02G326600.1 and TraesCS4A02G245300.1, with tpm values of 6.99 and 5.4 out of 7 log(tpm); these were highly expressed in the fifth leaf blade, spikes, rachis and anthers during disease infestation by Fusarium head blight and stripe rust, respectively. The genes TraesCS3B02G563400.1 and TraesCS5B02G245500.1 had the lowest expression in the group, showing little or no expression under high as well as intermediate levels of biotic and abiotic stress. Similarly, family 2 contains five genes showing higher expression mainly at reproductive stages in spikes and rachis and during Fusarium infestation. These were TraesCS2B02G296000.1, TraesCS2D02G277400.1, TraesCS5D02G378200.1, TraesCS5D02G378300.1 and TraesCS2B02G296100.1, with tpm values of 6.26, 6.35, 5.35, 6.45 and 5.13, respectively. Negligible expression was displayed by the gene TraesCS5D02G413800.1. In family 4, two members out of three were constitutively expressed, although at mid to low levels, in most tissues such as grain, spike, leaves and other shoot parts, mainly at the reproductive stage and during disease infestation; the exception was the gene TraesCS1A02G188100.1, which showed no expression at all. In family 5, only one gene (TraesCS5B02G562500.1) is present, which shows its maximum degree of expression at 24 days after sowing, on the 10th day of phosphorus starvation, and under other abiotic stresses, particularly in roots.
In family 6, only TraesCS4B02G244400.1 of the two genes is highly expressed, during the reproductive stage in anthers and spikelets, with a maximum tpm value of 4.52. Unlike the other families, all the MATE gene members of families 3 and 7 were constitutively expressed with varying transcriptional intensities, with tpm values ranging between 3 and 4 at both vegetative and reproductive stages in various tissues of the plant during stress conditions. As the products of all these MATE genes are integral components of the plasma membrane, they are involved in molecular functions such as solute-solute antiporter activity, xenobiotic transmembrane transporter activity and transmembrane transporter activity (Table 5).
The results suggest that different MATE genes were expressed under various biotic and abiotic stress conditions, but the majority of the genes were expressed under biotic stress caused by disease infestation (Table 4). The overall highest level of expression was shown by gene TraesCS5B02G326600.1 of family 1, expressed during infestation of Fusarium head blight by Fusarium graminearum 4 days after inoculation.
Lu et al. (2018) analysed the expression profiles of GaMATE and GrMATE genes belonging to two species of cotton, G. arboreum and G. raimondii. The study was carried out in root tissues in order to examine the expression levels of the genes under the abiotic stress conditions of drought, salinity and Cd stress. Out of the total MATE genes, GrMATE54, GrMATE53 and GaMATE21 were found to be highly expressed under these three abiotic stress conditions and were also involved in vacuolar sequestration as toxin effluxers, while GrMATE34 and GaMATE54, which were significantly expressed under stress conditions, were involved in ABA transport. Similarly, in soybean, out of 117 MATE genes, expression profiles of 113 MATE genes were constructed as a heat map using MeV 4.9 software; these were differentially expressed in nine tissues, viz. leaf, stem, flower, pod, seed, root, root hair, nodule and shoot apical meristem. These GmMATE genes exhibited tissue-specific expression: GmMATE107 and GmMATE27 showed the highest expression in roots, root hairs and nodules and the lowest in above-ground tissues. Similarly, GmMATE44, GmMATE81 and GmMATE36 showed high levels of expression in pods and developing seeds, while GmMATE62 and GmMATE7 were expressed in leaf tissues (Liu et al., 2016). In Medicago truncatula, UGT78G1, MaT4, MaT5 and MATE2 were found to be expressed in various parts of the plant such as leaves, roots and flowers; the MATE2 gene showed its highest expression in flowers, followed by roots, vegetative buds, leaves and seeds, and was associated with the transport of glycosylated flavonoids. It has been reported that the gene was exclusively involved in anthocyanin pigmentation, and lack of this pigment therefore resulted in discolouration of leaves and flowers.
Tiwari et al. (2014) also reported that, in a genome-wide expression analysis of rice MATE genes, two arsenic-responsive genes, OsMATE1 and OsMATE2, were taken for functional study in transgenic lines of Arabidopsis, where most of the effects were seen in leaf, seed and flower morphology, the pattern of rosette arrangement and flowering time. These OsMATEs were found to regulate plant growth and development in the transgenic lines, but the lines expressing them were more susceptible to biotic and abiotic stresses than the wild types.
Wheat is an allopolyploid having two (tetraploid wheat, with two homoeoalleles) or three (hexaploid wheat, with three homoeoalleles) homoeologous subgenomes. The high similarity in DNA sequence and function among the homoeoalleles of a gene in polyploid wheat makes gene cloning and functional analysis a challenging task. Polyploidy, which arises from whole-genome duplication or interspecific hybridisation, is present ubiquitously across the plant and fungal kingdoms, and the existence of extremely closely related genes in polyploids, known as homoeologs, has promoted the domestication and adaptation of many major polyploid crops such as hexaploid bread wheat (Triticum aestivum; AABBDD subgenomes), cotton and coffee (González et al., 2018). Understanding how interactions among these homoeologous genes affect gene expression will ultimately help to build strategies for improving crops by targeting and manipulating individual or multiple homoeologs to quantitatively modulate trait responses (Borrill et al., 2015).
All the possible homologous genes for the 44 MATE genes were found with the help of the Ensembl Plants database, displayed in the form of ternary plots through the Wheat Expression Browser and listed in Table 3. Here, the ternary plot shows two homologous genes, from Azhumaya wheat (TraesCS1A02G188100.1) and Chinese Spring wheat (TraesCS1B02G195900.1), of the Triticum aestivum gene (TraesCS1D02G188200.1), as shown in Fig. 7. Their levels of expression indicate their role in transporting alkaloids in tissues such as leaves, roots, rachis, spikes and coleoptiles under different biotic and abiotic stress conditions such as stripe rust, powdery mildew, heat and cold stress.
Similarly, from Table 4 and Additional file 7, the homoeologous genes of the identified wheat MATE genes TraesCS1A02G305200.1, TraesCS2B02G247700, TraesCS2D02G277400.1, TraesCS3B02G298700.1, TraesCS4B02G244400.1, TraesCS5B02G326600.1 and TraesCS2B02G296000.1 exhibited low to medium levels of expression for abiotic stress in the case of high-level stress (disease), ranging from 20 to 50%, where the maximum expression was displayed by the homoeologous gene

Conclusions
In the present investigation, a total of 44 MATE genes of Triticum aestivum were analysed for phylogenetic classification, protein-protein interaction among the genes, structural and functional analysis of the genes, protein motifs and in silico expression. The 44 MATE genes were classified into seven families, and a representative phylogenetic tree was constructed using the Clustal Omega and Phylogeny.fr TreeDyn tools. Out of these 44 genes, 43 were found to interact at the protein level using STRING software at a medium confidence value, indicating that these genes interact moderately at the protein level. The maximum number of exons, 14, was found in genes TraesCS6A02G418800.1 and TraesCS6D02G407900.1. We employed MEME software to find protein motifs associated with the MATE genes, and a total of 22 conserved motifs were identified; the protein motifs of families 1, 2 and 3 were significantly different from the rest, and families 3 and 5 had 19 and 14 protein motifs, respectively, while the rest had 22. Using the Wheat Expression Browser tool, we found that the majority of MATE genes were expressed under biotic stress caused by disease infestation, and the highest level of expression was shown by gene TraesCS5B02G326600.1 of family 1, which was expressed during Fusarium head blight infestation by Fusarium graminearum 4 days after inoculation. A total of 39 ternary plots consisting of homoeologous genes for 39 MATE genes were constructed using the Wheat Expression Browser tool, showing different levels of expression during biotic and abiotic stress conditions. We further found that 44% of the triads tended to show non-balanced expression (extreme values) owing to their higher tissue specificity and greater intensity.

Identification of MATE genes and their chromosomal location in wheat genome
The MATE nucleotide and protein sequences of wheat were downloaded from the UniProt and UniParc databases (https://www.uniprot.org/). UniProt is a universal protein resource: a comprehensive, high-quality and freely accessible database of protein sequence and functional information. UniParc is a non-redundant database which stores each unique sequence only once and provides a stable and unique identifier (ID), thus making it possible to identify the same protein from different database sources (Apweiler et al., 2004). The peptide sequences were converted to their corresponding nucleotide sequences using the EMBOSS backtranseq tool (https://www.ebi.ac.uk/Tools/st/emboss_backtranseq/). The chromosomal positions of the 33 MATE genes of wheat were obtained, and the karyotype view with relative distances (Fig. 1) was displayed from the Gramene database (https://www.gramene.org/) through the genome browser of IWGSC (Triticum aestivum).

Phylogenetic analysis and classification of MATE genes
The full-length nucleotide and amino acid sequences of the MATE genes (Additional files 1 & 2) were used for phylogenetic analysis. The sequences were subjected to multiple sequence alignment with Clustal Omega software (https://www.ebi.ac.uk/Tools/msa/clustalo/) using the default parameters, in order to retrieve the tree data in Newick format. An unrooted neighbour-joining phylogenetic tree was constructed using the TreeDyn tool of Phylogeny.fr (http://www.phylogeny.fr/one_task.cgi?task_type=treedyn) at a scale value of 0.12 with the default parameters. The variation in nucleotide sequence was examined through the Multalin browser (http://multalin.toulouse.inra.fr/multalin/), where the conserved nucleotide residues are displayed in red, blue and black according to their degree of conservation (Additional file 3).

Protein-protein interaction
Protein-protein interactions are essential to every aspect of cellular function and are characterised as transient or stable. A comprehensive knowledge of protein interactions is therefore a crucial source of information for functionally interpreting proteins and for understanding and modelling cellular processes at a genome-wide level (Uhrig, 2006). Information on functional interactions between the expressed proteins of MATE genes in wheat was obtained using the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) database, which collects and integrates information by consolidating known as well as predicted protein-protein association data for a large number of organisms. These interactions can be direct (physical) or indirect (functional) (Szklarczyk et al., 2017). The network view of this database shows the network of predicted interactions for a specific cluster of proteins, where the nodes represent proteins and the edges signify predicted functional connections. The number of links at each node is called its "degree" (Uhrig, 2006). Moreover, in evidence mode, an edge can be drawn with up to seven differently coloured lines, which symbolize the seven types of evidence used in predicting the connections (Seth et al., 2019). The confidence mode shows the thickness of the line as the degree of confidence in the predicted interaction, while the action mode presents additional information about the prediction (Seth et al., 2019). In addition, the confidence score denotes the estimated probability that a predicted link exists between two enzymes in the same metabolic map in the KEGG database. A lower score cut-off yields more interactions but also more false positives.
Only interactions above the minimum required score are included in the predicted network. The maximum number of interactions can be chosen, although the output limit is set to the 10 best-scoring hits by default (Seth et al., 2019). The analysis component gives information on the inferred network, such as the number of nodes and edges; the average node degree gives the number of interactions (at the score threshold) that a protein has on average in the network. The clustering coefficient measures how connected the nodes in the network are, with highly connected networks showing higher values.
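The two steps described above, filtering links by a minimum score and summarising how tightly nodes are connected, can be illustrated on a small example. This is a minimal sketch assuming the clustering coefficient is the average of the local clustering coefficients of all nodes; the protein names and scores are hypothetical, and the functions are not part of any STRING interface.

```python
from itertools import combinations

def filter_edges(scored_edges, min_score=0.4):
    """Keep undirected links whose combined score reaches the cut-off."""
    return {frozenset((a, b)) for a, b, score in scored_edges if score >= min_score}

def local_clustering(node, edges):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = {n for edge in edges if node in edge for n in edge if n != node}
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to be linked
    linked = sum(frozenset(pair) in edges for pair in combinations(neighbours, 2))
    return 2 * linked / (k * (k - 1))

def average_clustering(nodes, edges):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(n, edges) for n in nodes) / len(nodes)

# Hypothetical scored links between four proteins.
scored = [
    ("P1", "P2", 0.90),
    ("P2", "P3", 0.70),
    ("P1", "P3", 0.45),
    ("P3", "P4", 0.50),
    ("P1", "P4", 0.20),  # below the 0.4 "medium confidence" cut-off, so dropped
]
nodes = ["P1", "P2", "P3", "P4"]

edges = filter_edges(scored, min_score=0.4)
avg_cc = average_clustering(nodes, edges)
print(len(edges), round(avg_cc, 3))
```

Lowering `min_score` admits the weak P1-P4 link, which shows why a lower cut-off increases both the number of interactions and the risk of false positives.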

Structural organisation and protein motifs
The exon-intron organisation of the wheat MATE family genes was retrieved based on their nucleotide transcript gene ID, and a diagram was obtained using the Ensembl Plants database (https://plants.ensembl.org/Triticum_aestivum/Transcript/Summary?db=core;g=TraesCS1A02G305200;r=1A:497861545-497869619;t=TraesCS1A02G305200.1). The transcript summary gives the number of exons, the number of domains and features, the associated variant alleles and the oligo probe maps.
The motifs of the MATE proteins were retrieved using Multiple Expectation Maximization for Motif Elicitation (MEME) (http://meme-suite.org/), and a representative diagram of the protein motifs of each MATE protein was produced with the default parameters, except that the maximum number of motifs was set at 22. The MEME Suite provides a large number of proteomic and genomic sequence databases for motif scanning and various motif databases for motif comparison (Bailey et al., 2015). It is a web-based integration of tools and databases for performing motif-based sequence analyses. Moreover, the Suite's unified web server interface enables users to perform four sorts of motif analysis, viz. motif discovery, motif-motif database searching, motif-sequence database searching and assignment of function.

In Silico Expression Analysis and their homoeologous candidates
Understanding expression patterns in specific tissues and organs provides molecular clues to the role of genes and their contribution to plant function (Bhati et al., 2015). The widely accessible wheat RNA-seq datasets used to produce gene expression levels were obtained from the Wheat Expression Browser (http://www.wheat-expression.com). It is an expression database for polyploid wheat which helps to visualize the expression profiles of the identified MATE genes in wheat. This tool is helpful for analysing and relating homoeolog-specific transcript profiles across a wide range of tissues from different developmental stages in polyploid wheat, which could be further queried by

Variation in homoeologs expressions across tissues (stable and dynamic triads)
The variation in homoeolog expression of each triad in the ternary plot across different tissues was calculated for each individual tissue in which the triad was expressed (González et al., 2018). The average of the distances was defined as the "triad mean distance". The triads were ranked by their triad mean distance, and a percentile was calculated for each triad relative to CMD, the vector containing all the triad mean distances. The first and last deciles were categorized as the stable 10% and dynamic 10% triads, respectively (González et al., 2018).
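The ranking step can be sketched as follows. The text does not define the per-tissue distance explicitly, so the code assumes it is the Euclidean distance between a triad's position in one tissue and its mean position across tissues; the percentile is taken as the fraction of triads with a mean distance no larger than the one considered. All numbers are hypothetical.

```python
import math

def triad_mean_distance(tissue_positions):
    """Mean Euclidean distance of per-tissue (A, B, D) positions from their centre.

    Assumption: each position is a triple of relative expression values summing to 1.
    """
    n = len(tissue_positions)
    centre = tuple(sum(p[i] for p in tissue_positions) / n for i in range(3))
    return sum(math.dist(p, centre) for p in tissue_positions) / n

def percentile_ranks(values):
    """Fraction of values less than or equal to each entry."""
    n = len(values)
    return [sum(v <= x for v in values) / n for x in values]

# One hypothetical triad measured in two tissues.
tmd = triad_mean_distance([(0.50, 0.25, 0.25), (0.25, 0.50, 0.25)])

# Hypothetical vector CMD holding the triad mean distance of ten triads.
cmd = [0.01, 0.03, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.25, 0.40]
ranks = percentile_ranks(cmd)

stable = [i for i, r in enumerate(ranks) if r <= 0.10]   # first decile
dynamic = [i for i, r in enumerate(ranks) if r > 0.90]   # last decile
print(round(tmd, 4), stable, dynamic)
```

Triads in `stable` keep nearly the same homoeolog balance in every tissue, whereas those in `dynamic` shift their balance most between tissues.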
Relative expression levels of the A, B, and D sub genome homoeologs across triads
The analysis focused exclusively on triads that have a 1:1:1 correspondence across the three homoeologous subgenomes. A triad was defined as expressed when the sum of the A, B and D subgenome homoeologs was > 0.5.
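The expression filter and the notion of balanced versus non-balanced triads can be made concrete with a short sketch. The relative expression of each homoeolog is its share of the triad total, which gives the position in the ternary plot. The seven reference positions used below (balanced, and one homoeolog dominant or suppressed) are an assumption modelled on a commonly used scheme for wheat triads and are not listed in this text; the input values are hypothetical.

```python
import math

# Assumed reference positions (A, B, D) in the ternary plot.
IDEAL = {
    "balanced": (1 / 3, 1 / 3, 1 / 3),
    "A dominant": (1.0, 0.0, 0.0),
    "B dominant": (0.0, 1.0, 0.0),
    "D dominant": (0.0, 0.0, 1.0),
    "A suppressed": (0.0, 0.5, 0.5),
    "B suppressed": (0.5, 0.0, 0.5),
    "D suppressed": (0.5, 0.5, 0.0),
}

def relative_expression(a, b, d):
    """Share of the triad total contributed by each homoeolog."""
    total = a + b + d
    return (a / total, b / total, d / total)

def classify_triad(a, b, d, min_total=0.5):
    """Return the nearest reference category, or None if the triad is not expressed."""
    if a + b + d <= min_total:
        return None  # summed expression must exceed the threshold
    rel = relative_expression(a, b, d)
    return min(IDEAL, key=lambda name: math.dist(rel, IDEAL[name]))

# Hypothetical expression values for four triads.
print(classify_triad(10, 10, 10))
print(classify_triad(30, 1, 1))
print(classify_triad(0, 5, 5))
print(classify_triad(0.1, 0.1, 0.1))
```

Under this scheme, the share of non-balanced triads is simply the fraction of expressed triads assigned to any category other than "balanced".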

Consent for publication
Not applicable.

Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.

Competing interests
The authors declare that they have no competing interests.

Funding
Not applicable.
Authors' contributions
SD contributed to the collection of data and helped in data analysis and conceptualisation of the research experiment. The author read and approved the final manuscript.

Representation of protein association network of 43 MATE genes using STRING software

Figure 5 Network and enrichment analysis of MATE genes through STRING software