Duplicated flavonoid 3’-hydroxylase and flavonoid 3’, 5’-hydroxylase genes in barley genome

Background Anthocyanin compounds, which perform multiple biological functions, can be synthesized in different parts of the barley (Hordeum vulgare L.) plant. The diversity of anthocyanin molecules arises from a branch point in the pathway at which dihydroflavonols may be modified by either flavonoid 3′-hydroxylase (F3′H) or flavonoid 3′,5′-hydroxylase (F3′5′H), both cytochrome P450-dependent monooxygenases. The F3′H and F3′5′H gene families are among the least studied anthocyanin biosynthesis structural genes in barley. The aim of this study was to identify and characterise duplicated copies of the F3′H and F3′5′H genes in the barley genome. Results Four copies of the F3′5′H gene (on chromosomes 4HL, 6HL, 6HS and 7HS) and two copies of the F3′H gene (on chromosomes 1HL and 6HS) were identified in the barley genome. These copies have either one or two introns. Amino acid sequence analysis demonstrated the presence of the conserved motifs characteristic of flavonoid hydroxylases in all copies of the F3′H and F3′5′H genes, with the exception of F3′5′H-3, which carries a loss-of-function mutation in a conserved cytochrome P450 domain. The divergence between the F3′H and F3′5′H genes was shown to have occurred 129 million years ago (MYA), before the emergence of monocot and dicot plant species. The F3′H copy arose approximately 80 MYA; the F3′5′H copies arose 8, 36 and 91 MYA. qRT-PCR analysis revealed tissue-specific activity for some copies of the studied genes. The F3′H-1 gene was transcribed in the aleurone layer, lemma and pericarp (with an increased level in the coloured pericarp), whereas the F3′H-2 gene was expressed in stems only. The F3′5′H-1 gene was expressed only in the aleurone layer, and in a coloured aleurone its expression was 30-fold higher. Transcriptional activity of F3′5′H-2 was detected in different tissues, with a significantly higher level in the uncoloured genotype than in the coloured ones. 
The F3′5′H-3 gene was expressed neither in stems nor in the aleurone layer, lemma or pericarp. The F3′5′H-4 gene copy was weakly expressed in all tissues analysed. Conclusion F3′H- and F3′5′H-coding genes involved in anthocyanin synthesis in H. vulgare were identified and characterised; of these, the copies designated F3′H-1, F3′H-2, F3′5′H-1 and F3′5′H-2 demonstrated tissue-specific expression patterns. Information on these modulators of the anthocyanin biosynthesis pathway can be used in future for manipulating the synthesis of diverse anthocyanin compounds in different parts of the barley plant. The finding of both copies with tissue-specific expression and a copy undergoing pseudogenization points to rapid evolutionary events closely linked to the functional specialization of the duplicated members of the cytochrome P450-dependent monooxygenase gene families.


INTRODUCTION
Flavonoids, a group of plant phenolic compounds, and their coloured derivatives, the anthocyanins, are secondary metabolites with important functions (Grotewold, 2006a; Grotewold, 2006b). Flavonoids are ubiquitously present in plant cells. They are involved in the regulation of developmental processes, in protection against biotic and abiotic stress and in the attraction of seed dispersers and pollinators (Khlestkina, 2013; Pourcel et al., 2007; Landi, Tattini & Gould, 2015). Owing to their antioxidant activity, these compounds also benefit the health of consumers of plant foods, both humans and animals (Khoo et al., 2017; Chaves-Silva et al., 2018).
Cytochrome P450 (also called CYP) proteins, named for their absorption band at 450 nm, form one of the largest protein superfamilies (Werck-Reichhart & Feyereisen, 2000). These proteins are found in all organisms from protists to mammals, but their number has exploded in plants. Flavonoid 3′-hydroxylase (F3′H, CYP75B, EC 1.14.13.21) and flavonoid 3′,5′-hydroxylase (F3′5′H, CYP75A, EC 1.14.13.88) are cytochrome P450-dependent monooxygenases that require NADPH as a co-factor (Tanaka & Brugliera, 2013). These enzymes are involved in the biosynthesis of anthocyanin compounds, the glycosylated forms of anthocyanidins produced by the flavonoid biosynthesis pathway (Fig. 1). F3′H and F3′5′H compete for substrate and hydroxylate the 3′ or the 3′ and 5′ positions of dihydroflavonols, leading to the parallel synthesis of cyanidin and delphinidin, the precursors of reddish-purple and blue pigments, respectively (Tanaka, Brugliera & Chandler, 2009; Tanaka & Brugliera, 2013). Barley (Hordeum vulgare L.) is an important agricultural crop. In addition to the photosynthetic pigments that give it a green colour, barley produces pigments that form diverse colouration patterns in different parts of the plant. Purple and blue anthocyanins accumulate in barley grains in the pericarp and aleurone layer, respectively (Adzhieva et al., 2015; Shoeva, Strygina & Khlestkina, 2018). Although the genes encoding the enzymes of the anthocyanin biosynthetic pathway are well understood at the genetic and molecular levels, F3′H and F3′5′H remain the least studied genes of this pathway. Because of the useful properties of anthocyanin compounds, the study of genes involved in anthocyanin biosynthesis is important. Previously, one F3′H gene copy (F3′H-1), expressed in a genotype with purple pericarp, was described, as well as one F3′5′H copy (F3′5′H-1) with aleurone-specific expression (Shoeva et al., 2016; Strygina, Börner & Khlestkina, 2017). 
Given the tissue-specific activity of these genes and the fact that these anthocyanin compounds can accumulate in other parts of the plant, it was concluded that there should be other copies of the F3′H

Primer design and qRT-PCR
Gene-specific primer pairs were designed using Oligo Primer Analysis Software v.7 (https://www.oligo.net/) based on sequences found in the IPK Barley BLAST Server (  (Himi & Noda, 2005). The raw data are in File S1. Each sample was run in three technical replicates. Differences among the lines were tested with the Mann-Whitney U-test (p ≤ 0.05).
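The rank-based comparison named above can be illustrated with a minimal sketch. It assumes hypothetical relative expression values, with four values per group chosen only for illustration, and computes the U statistic together with an exact two-sided permutation p-value. The function names are illustrative and are not taken from the original analysis.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """Return (U_x, U_y); U_x counts pairs where x exceeds y (ties count 0.5)."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    return u_x, len(x) * len(y) - u_x


def exact_two_sided_p(x, y):
    """Exact two-sided p-value: share of all regroupings of the pooled
    values whose smaller U is at most the observed smaller U."""
    pooled = list(x) + list(y)
    observed = min(mann_whitney_u(x, y))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(x)):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if min(mann_whitney_u(gx, gy)) <= observed:
            hits += 1
        total += 1
    return hits / total


# Hypothetical relative expression values (not data from File S1).
uncoloured = [0.8, 1.0, 1.1, 1.3]
coloured = [3.1, 3.4, 3.9, 4.2]
print(mann_whitney_u(uncoloured, coloured))     # (0.0, 16.0)
print(exact_two_sided_p(uncoloured, coloured))  # 2/70, about 0.029
```

Exact enumeration is practical only for small groups; for larger samples a normal approximation or a statistics library would normally be used instead.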

Study of the structural organisation of the F3′H and F3′5′H genes
All F3′H and F3′5′H genes identified in the current study in the H. vulgare genome consist of two exons, with the exception of F3′5′H-1, which has three exons. Analysis of the promoter elements of the annotated genes (∼600 bp upstream of the ATG start site) revealed many motifs responsible for light-dependent activation (especially in F3′H-1 and F3′5′H-1), as well as Myb-dependent and Myc-dependent elements required for genes involved in the biosynthesis of flavonoid compounds (Fig. 2A, File S3). Unlike the other copies, F3′5′H-2 and F3′5′H-3 have only one light-induced promoter element (GATA-box). An alignment of the amino acid sequences, with the functional domains framed, is shown in Fig. 2B. All the identified genes have a Cytochrome P450 domain (E-class, group I; IPR002401); however, the F3′5′H-3 gene copy carries a frameshift indel mutation, which truncates the functional Cytochrome P450 domain in the middle and affects the tertiary protein structure (Fig. 2B, File S4). These sequences also possess the conserved domains of flavonoid hydroxylases, including the proline-rich region, heme binding domain, oxygen binding motif, hydroxylation activity site (CR1), EXXR motif and substrate recognition sites (SRS) (Fig. 3). Six functional SRSs, which are important for determining substrate specificity in CYP75 proteins, were identified in the predicted amino acid sequences of the barley F3′Hs and F3′5′Hs. In F3′5′H-3, only three SRSs and the proline-rich and CR1 motifs are present (Fig. 3). All other barley CYP75s have retained their functional domains.
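How a frameshift indel can truncate a protein can be illustrated with a toy example. The sketch below uses a short hypothetical coding sequence, not the barley F3′5′H-3 sequence, and the standard genetic code. It shows that deleting a single base shifts the reading frame and brings a stop codon into frame early.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy sequence, for illustration only.
intact = "ATGGCATTGACCGGATAA"
# Delete one base (index 3) to mimic a frameshift indel.
frameshifted = intact[:3] + intact[4:]

print(translate(intact))        # MALTG
print(translate(frameshifted))  # MH
```

In the toy case the shifted frame reads ATG CAT TGA, so translation ends after two residues and everything downstream of the indel is lost.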

Evolutionary analysis of CYP75 genes
The number of non-synonymous substitutions per non-synonymous site (Ka), the number of synonymous substitutions per synonymous site (Ks) and the Ka/Ks ratio were calculated for the CYP75 genes of barley. Synonymous and non-synonymous substitution rates for the identified paralogs ranged between 0.541-0.685 and 0.269-0.461, respectively (File S5). Based on the Ka/Ks ratio, it was predicted that the F3′H and F3′5′H paralogs may be under stabilising selection (Ka/Ks is close to 0.5), with the exception of F3′5′H-3. This copy may be experiencing neutral selection, since the Ka/Ks ratio of F3′5′H-3 is close to one (File S5).
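The interpretation of the Ka/Ks ratio can be expressed as a short sketch. The tolerance used to decide that a ratio is "close to one" is an assumption made here for illustration, and the example pairs a Ka and a Ks value taken from the ends of the reported ranges. That pairing is illustrative and does not reproduce any specific row of File S5.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of non-synonymous to synonymous substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def selection_regime(ratio, tolerance=0.1):
    """Classify a Ka/Ks ratio; `tolerance` is an assumed cut-off
    for treating the ratio as close to one."""
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "positive" if ratio > 1.0 else "stabilising"


# Illustrative pairing of values from the reported ranges.
ratio = ka_ks_ratio(0.269, 0.541)
print(round(ratio, 3), selection_regime(ratio))  # 0.497 stabilising
```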
The phylogeny of the F3′H and F3′5′H genes was analysed using complete coding sequences of the identified genes from the genome of barley and other angiosperm species. The phylogenetic tree indicated that the F3′H and F3′5′H families form two separate clusters (Fig. 4, blue and purple clusters, respectively), each of which is clearly divided into two groups, monocot and dicot plant species. It was assumed that the F3′H and F3′5′H genes are the result of duplication and neofunctionalization of a single CYP75 gene in the genome of the common ancestor of monocot and dicot plant species. The analysis of genetic similarity and the divergence time calculation revealed that this event occurred about 129 million years ago (MYA) (Fig. 4), shortly before the divergence of monocots and dicots (estimated at 110-116 MYA).
In addition, we calculated the time of the segmental duplications in the H. vulgare genome that gave rise to paralogous gene copies (Fig. 4). The F3′H copy apparently occurred

Analysis of the F3′H and F3′5′H genes expression
Comparative analysis of relative gene expression levels was performed using RNAs isolated from the aleurone layer, pericarp, lemma and stems of Bowman near-isogenic lines (NILs) contrasting in anthocyanin pigmentation: 'BW' (Bowman, NGB22812), 'PLP' (purple lemma and pericarp, NGB22213) and 'BA' (intense blue aleurone, NGB20651) (File S6). The F3′H-1 gene was expressed in the aleurone layer, pericarp and lemma, with an increased expression level in the pigmented pericarp of 'PLP' (3.6 times higher than in the uncoloured one) (Fig. 5). Tissue-specific expression was detected for the F3′H-2 gene, which was expressed in stems only. Moreover, in the coloured stems of 'PLP' the relative expression level was three times higher than in the uncoloured stems of 'BW' (Fig. 5). Expression of the F3′5′H-1 gene in the aleurone layer only was confirmed (Fig. 5). In the pigmented aleurone of 'BA' this gene was expressed 30 times more actively than in the uncoloured aleurone of 'BW'. F3′5′H-2 was strongly expressed in the pericarp and aleurone layer of 'BW' in comparison with the coloured lines (9.3 and 12.7 times higher, respectively) (Fig. 5). Expression of the F3′5′H-3 gene was not detected in the analysed tissues. The F3′5′H-4 gene was weakly expressed in all studied tissues, with a slight increase in expression in the pigmented stems (Fig. 5).
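A fold difference of the kind reported above can be read as the ratio of mean relative expression in one line to that in another. The sketch below assumes this reading and uses hypothetical replicate values; the raw data are in File S1 and are not reproduced here.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def fold_change(sample, reference):
    """Ratio of mean relative expression in `sample` to that in `reference`."""
    reference_mean = mean(reference)
    if reference_mean == 0:
        raise ValueError("reference expression must be non-zero")
    return mean(sample) / reference_mean


# Hypothetical relative expression values from three technical replicates.
coloured_aleurone = [14.5, 15.0, 15.5]
uncoloured_aleurone = [0.4, 0.5, 0.6]
print(round(fold_change(coloured_aleurone, uncoloured_aleurone), 1))
```

When the reference tissue shows no detectable expression, as for a copy that is not transcribed, a ratio is undefined and the function raises an error instead of returning a value.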

DISCUSSION
Gene duplication is an important evolutionary mechanism that provides genetic material for the specialization of existing gene functions or the emergence of new ones through mutation and selection (Proulx, 2011; Magadum et al., 2013). Evolution by gene duplication has emerged as a general principle of biological evolution, as is apparent from the prevalence of duplicated genes in the genomes of all sequenced organisms (Ohno, 1970). Gene copies arise as a result of segmental duplications (duplication of individual genomic regions) or polyploidization (whole genome duplications) (Ohno, 1970; Lynch et al., 2001; Eichler & Sankoff, 2003). Gene duplicates can meet one of three possible fates: pseudogenization (PG), subfunctionalization (SF) or neofunctionalization (NF) (Ohno, 1970). In PG, one of the gene copies loses its function after acquiring a degenerative mutation, for example in the promoter region. In NF, one gene copy retains the ancestral function while the other acquires a novel function. SF is a major process of divergence in which the ancestral gene functions are divided between the copies (Ohno, 1970).
In plants, SF leading to tissue-specific expression is a frequent pattern. For instance, regulatory genes coding bHLH/Myc-type transcription factors controlling the  -Strid, 1993; Cockram et al., 2010; Strygina, Börner & Khlestkina, 2017). The flavanone 3-hydroxylase (F3H) genes in the Triticum aestivum genome can be considered an example of tissue specialization of structural anthocyanin biosynthesis genes: the copy designated TaF3H-B2 is transcribed specifically in the roots of bread wheat, while the TaF3H-B1 gene copy is not expressed in roots but is expressed in other parts of the plant (Khlestkina et al., 2013).
In the flavonoid biosynthesis pathway, F3′H and F3′5′H are important enzymes controlling hydroxylation at the 3′ and 5′ positions of reddish-purple and blue pigments, respectively (Tanaka, Brugliera & Chandler, 2009; Tanaka & Brugliera, 2013 , 2004; Herron et al., 2009; Cheng et al., 2012). The duplication of F3′H and F3′5′H in the barley genome took place several times: the F3′H copy arose approximately 80 MYA, while the F3′5′H copies arose 8, 36 and 91 MYA (Fig. 4). Thus, the first duplication events of both genes occurred before the origin of the family Poaceae (Gramineae) (Kellogg, 2001).
The ratio of non-synonymous (Ka) to synonymous (Ks) substitutions is used to determine the direction of natural selection after duplication: Ka/Ks > 1 implies positive selection, Ka/Ks < 1 means stabilising selection, and Ka/Ks = 1 indicates neutral selection (Kondrashov et al., 2002). Analysis of the duplicated F3′H and F3′5′H genes indicated that most of the identified gene copies are under stabilising selection. The exception is the F3′5′H-3 gene copy, which is presumed to be a pseudogene because of a mutation in the coding part of the gene that breaks the reading frame and changes the protein structure. In addition, we did not detect its transcriptional activity in the analysed tissues.
The genes encoding F3′H showed clear tissue-specific activity, like the TaF3H genes of bread wheat: F3′H-1 is expressed in the aleurone layer, pericarp and lemma, while F3′H-2 is transcriptionally active in stems only (Fig. 5). In addition, an increase in expression level was observed in tissues with reddish-purple pigmentation (pericarp and stems), which is apparently provided by cyanidin derivatives (these identifications are putative because no biochemical study of the gene products was carried out). No increase in the relative expression level of F3′H-1 in the aleurone layer or lemma was detected in the BA and PLP NILs (Fig. 5). These tissues contain few or no cyanidin derivatives, as is evident from the phenotype of these lines (File S6). An increase in the level of gene expression in anthocyanin-pigmented plant tissues is a common feature of genes of the anthocyanin biosynthesis pathway in cereals (Shoeva et al., 2015; Shoeva et al., 2016; Shoeva & Khlestkina, 2015). For example, in the pericarp of the purple-grained PLP line the expression level of the flavonoid biosynthesis structural genes (CHS, CHI, F3H, F3′H, DFR, ANS) was significantly higher than in the uncoloured Bowman, which led to an increase in total anthocyanin content in the PLP line, as identified by ultra-performance liquid chromatography (HPLC) (Shoeva et al., 2016).
Among the F3′5′H copies, only two have tissue-specific activity: F3′5′H-1 and F3′5′H-2. The F3′5′H-1 copy was expressed only in the aleurone layer, and its activity was much higher in the blue aleurone than in the uncoloured one (Fig. 5). Aleurone-specific expression of this gene was noted earlier, and F3′5′H-1 was shown to be one of the key regulators of aleurone layer pigmentation (Strygina, Börner & Khlestkina, 2017). The copy designated F3′5′H-2 was expressed only in the barley grain. Moreover, the expression of this gene is much higher in the aleurone layer and pericarp of the green BW line than in the coloured ones (Fig. 5). The F3′5′H-4 gene copy was expressed in all tissues analysed. Since there are almost no light-dependent elements in the promoters of F3′5′H-2 and F3′5′H-4, it can be assumed that these gene copies encode different isoenzymes specialised in the synthesis of flavonoid compounds such as catechin, which is present in barley at high levels (McMurrough, Loughrey & Hennigan, 1983; Madigan, McMurrough & Smyth, 1994). A similar specialization was demonstrated earlier for the tea plant and its relatives (Punyasiri et al., 2004; Jin et al., 2017). These results suggest SF and diversification of the F3′Hs and F3′5′Hs in the barley genome.