Systemically functional characterization of regiospecific flavonoid O-methyltransferases from Glycine max

Plants produce diverse flavonoids for defense and stress resistance, most of which have health benefits and are widely used as food additives and medicines. Methylation of the free hydroxyl groups of flavonoids, catalyzed by S-adenosyl-L-methionine-dependent O-methyltransferases (OMTs), significantly affects their physicochemical properties and bioactivities. Soybeans (Glycine max) contain a rich pool of O-methylated flavonoids. However, the OMTs responsible for flavonoid methylation in G. max remain largely unknown. We screened the G. max genome and obtained 22 putative OMT-encoding genes that share a broad spectrum of amino acid identities (25–96%); among them, 19 OMTs were successfully cloned and heterologously expressed in Escherichia coli. Flavonoids bearing free 3-, 5-, 7-, 8-, 3′-, or 4′-hydroxyl groups, including flavones (luteolin and 7,8-dihydroxyflavone), flavonols (kaempferol and quercetin), flavanones (naringenin and eriodictyol), and isoflavonoids (daidzein and glycitein), together with caffeic acid, were used as substrates, and 15 OMTs were shown to methylate at least one substrate. The methylation activities of these GmOMTs covered the 3-, 7-, 8-, 3′-, and 4′-hydroxyl groups of flavonoids and the 7- and 4′-hydroxyl groups of isoflavonoids. The systematic characterization of G. max flavonoid OMTs provides insights into the biosynthesis of methylated flavonoids in soybeans and OMT bioparts for the production of methylated flavonoids via synthetic biology.

Peer review under responsibility of KeAi Communications Co., Ltd.
O-Methyltransferases (OMTs) mediate the transfer of a methyl group from S-adenosyl-L-methionine (SAM) to a hydroxyl group of natural products, producing the methylated product and the by-product S-adenosyl-L-homocysteine (SAH) [20,21]. Plant OMTs are divided into three families (types I to III) based on protein sequence and structure [22]. Type I OMTs typically have a relative molecular mass of 38-45 kDa, and their activity is independent of Mg2+; most plant flavonoid OMTs belong to this family [23]. Type II OMTs have a smaller relative molecular mass of 22-27 kDa, and their activity depends on Mg2+; they are mainly involved in lignin biosynthesis, although recent studies have shown that they can also catalyze the methylation of flavonoids [23]. Plant OMTs form a large gene superfamily, and many candidates have been predicted from genomes: for example, 58 potential OMTs in citrus [24], 47 in Vitis vinifera [25], 26 in Populus trichocarpa [26], 82 in Gossypium hirsutum, 55 in Gossypium arboreum, and 55 in Gossypium raimondii [27]. However, most have not been functionally characterized. Thirty-eight OMTs from different plants have been shown to catalyze the O-methylation of flavonoids, typically at the 3-, 7-, 8-, 3′-, and 4′-hydroxyl groups of flavonoids and the 7- and 4′-hydroxyl groups of isoflavonoids. Although G. max produces many methylated flavonoids, only one OMT (SOMT2), which catalyzes the 4′-OH methylation of daidzein, genistein, and naringenin, has been reported [28].
The use of SAM as a common methyl donor and flavonoids as methyl acceptors suggests that conserved motifs exist for SAM binding or substrate recognition. After comparing five characterized MTs from different species, Bugos et al. suggested that five conserved regions contribute to SAM binding [29]. Kagan and Clarke also found three conserved domains in 84 MTs [30]. However, these early attempts employed only a limited number of MTs for motif prediction, and none of the MTs had been characterized as flavonoid OMTs (FOMTs). In 1998, Joshi and Chiang proposed three SAM-binding domains (motifs A, B, and C) and four putative substrate-recognition domains (motifs I, J, K, and L) based on an analysis of 56 plant SAM-dependent MT sequences [31]. The functional characterization of additional FOMTs would refine this conserved sequence information, facilitating the annotation of putative FOMTs in plant genomes, clarifying the relationships between protein sequence and substrate specificity, and thus improving the prediction of unknown OMTs.
In this study, we aimed to systematically characterize flavonoid O-methyltransferases from G. max at the genome scale. First, we attempted to discover all potential OMT candidates in G. max by screening the whole genome using the reported FOMT as a query. Candidates were cloned from G. max and heterologously expressed in Escherichia coli. A series of general flavonoid compounds containing free 3-, 5-, 7-, 8-, 3′-, or 4′-hydroxyl groups were used as substrates to test the enzymatic activities of the candidates (Fig. 1). On the basis of phylogenetic analysis and sequence alignment, we proposed seven novel conserved motifs with improved accuracy for FOMTs, which could facilitate the discovery and functional prediction of unknown plant OMTs. The systematic functional characterization of a series of FOMTs helps explain the biosynthesis and regulation of methylated flavonoids in G. max and provides OMT bioparts with diverse substrates and regiospecificity for the production of methylated flavonoids in synthetic biology.

Materials and reagents
Glycine max (soybean) was grown in the greenhouse of the Center for Excellence in Molecular Plant Science, Chinese Academy of Sciences. E. coli JM109 was used for gene cloning, and E. coli BL21 (DE3) was used for heterologous expression of GmOMTs. Flavonoid standards were purchased from Nantong Feiyu Biological Technology (Nantong, China).

Cloning of OMTs from Glycine max
Soybean leaves were frozen in liquid nitrogen and ground to a fine powder with a mortar and pestle. Total RNA was extracted using the RNAprep Pure Plant Kit (Polysaccharides & Polyphenolics-rich) (TIANGEN, China), and cDNA was prepared using the PrimeScript™ RT reagent Kit with gDNA Eraser (TAKARA, Japan). The soybean OMT candidates were identified by screening the entire genome of Glycine max downloaded from SoyCyc 8.0 (https://www.plantcyc.org/databases/soycyc/8.0) using SOMT2 (GenBank accession number: TC178411) as a query for BLAST searches. Candidate genes encoding more than 350 amino acids and showing protein sequence identity >40% to SOMT2 were selected for cloning. Primers were designed according to the candidates' sequences and ordered from Sangon-Biotech (Shanghai, China). Polymerase chain reactions (PCR) using soybean cDNA as the template were performed with I-5™ High-Fidelity DNA Polymerase (TSINGKE Biological Technology, China). The PCR products were then subcloned into the pMD18-T vector (TAKARA, Japan) and sequenced. The primers used for OMT cloning are listed in Table S1, and the sequences of GmOMT1-22 are provided in Table S2 (deposited at http://npbiosys.scbit.org/ under accession No. OENC18-OENC36).
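The two selection criteria stated above (encoded protein longer than 350 amino acids and identity to SOMT2 above 40%) can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in the study: the `BlastHit` record, the function name, and the example hits are hypothetical, and only the two thresholds come from the text.

```python
from dataclasses import dataclass

MIN_LENGTH_AA = 350       # "more than 350 amino acids" (from the text)
MIN_IDENTITY_PCT = 40.0   # "identity >40% to SOMT2" (from the text)


@dataclass
class BlastHit:
    """Hypothetical record for one BLAST hit against the SOMT2 query."""
    gene_id: str
    length_aa: int        # length of the encoded protein
    identity_pct: float   # percent identity to SOMT2


def select_candidates(hits):
    """Keep hits that pass both thresholds (strict inequalities)."""
    return [
        h for h in hits
        if h.length_aa > MIN_LENGTH_AA and h.identity_pct > MIN_IDENTITY_PCT
    ]


# Made-up hits for illustration only; these are not data from the study.
example_hits = [
    BlastHit("hit_A", 365, 72.4),
    BlastHit("hit_B", 340, 88.0),   # rejected: too short
    BlastHit("hit_C", 372, 31.5),   # rejected: identity too low
    BlastHit("hit_D", 351, 40.1),
]
selected = [h.gene_id for h in select_candidates(example_hits)]
print(selected)
```

Both comparisons are strict, following the wording "more than" and ">" in the text.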

Phylogenetic and motif analysis of OMTs
The alignment of protein sequences was performed using ClustalW in MEGA 11 with gap open penalty = 10 and gap extension penalty = 0.2 [32]. The phylogenetic tree was constructed in MEGA 11 using the neighbour-joining method with a 1000-replicate bootstrap search [32]. Motif analysis of plant flavonoid OMTs was performed using WebLogo [33].

Heterologous expression and function assay of G. max OMTs
The coding sequences of the 19 candidate GmOMTs were each inserted into the pGEX-4T-1 vector, so that the GmOMTs were expressed as N-terminal GST fusions. The recombinant pGEX-4T-1 vectors were transformed into E. coli BL21 (DE3). Overnight cultures of the recombinant E. coli were inoculated into 50 ml of LB medium containing ampicillin and grown at 37 °C and 150 rpm until OD600 reached 0.4-0.6. IPTG was added to a final concentration of 200 μM, and the cultures were incubated at 18 °C and 110 rpm for about 18 h to induce protein expression. The cells were collected by centrifugation at 4 °C and resuspended in 6 ml of 100 mM Tris-HCl (pH 8.0). The cells were lysed by ultrasonication, and the supernatant was collected and used as crude enzyme for the following assays. Each reaction was performed in a 200 μl volume containing 100 mM Tris-HCl buffer (pH 8.0), 500 μM SAM, 100 μM acceptor substrate, and 50 μl of crude enzyme, incubated in a 30 °C water bath overnight, and terminated by adding 200 μl of ethyl acetate. The product was extracted into the organic phase, which was then evaporated. The residue was dissolved in methanol for subsequent assays.
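As a worked example of the reaction setup, the pipetting volumes follow from the dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the 10 mM stock concentrations for SAM and the acceptor substrate are assumptions (the text does not state them), while the 200 μl reaction volume, the 50 μl of crude enzyme, and the final concentrations are taken from the text.

```python
def stock_volume_ul(final_conc_um, total_volume_ul, stock_conc_um):
    """Volume of stock (in µl) that gives final_conc_um in total_volume_ul."""
    if stock_conc_um <= 0 or final_conc_um > stock_conc_um:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc_um * total_volume_ul / stock_conc_um


TOTAL_UL = 200.0    # reaction volume (from the text)
ENZYME_UL = 50.0    # crude enzyme volume (from the text)

# ASSUMPTION: 10 mM (10,000 µM) stocks; the text does not give stock concentrations.
ASSUMED_STOCK_UM = 10_000.0

sam_ul = stock_volume_ul(500.0, TOTAL_UL, ASSUMED_STOCK_UM)        # 500 µM SAM
substrate_ul = stock_volume_ul(100.0, TOTAL_UL, ASSUMED_STOCK_UM)  # 100 µM acceptor
buffer_ul = TOTAL_UL - ENZYME_UL - sam_ul - substrate_ul           # remainder as Tris-HCl buffer

print(f"SAM: {sam_ul} µl, substrate: {substrate_ul} µl, buffer: {buffer_ul} µl")
```

With different stock concentrations only the `ASSUMED_STOCK_UM` value changes; the buffer volume always makes up the remainder of the 200 μl.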

High Performance Liquid Chromatography analysis
A Shimadzu LC-20A Prominence system was used for the High Performance Liquid Chromatography (HPLC) analysis. Chromatographic separations were carried out at 35 °C on a Shodex C18-120-5 4E column (5 μm, 4.6 mm × 250 mm). The gradient elution system consisted of water (A) and acetonitrile (B). The HPLC program for the compounds extracted from the in vitro reactions of GmOMTs with flavonoid substrates was as follows: 0 min (22.5% B), 0-55 min (62.5% B), and 55-60 min (22.5% B). The flow rate was kept at 0.8 ml/min.
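The gradient program can be read as a set of breakpoints. The sketch below computes the acetonitrile fraction at any time point under the assumption that %B changes linearly between the stated breakpoints (0 min, 55 min, 60 min); the text lists only the segment end values, so the linear ramps, including the return to 22.5% B over 55-60 min, are an interpretation rather than a stated instrument setting.

```python
# Breakpoints (time in min, %B) taken from the HPLC program in the text.
GRADIENT = [(0.0, 22.5), (55.0, 62.5), (60.0, 22.5)]


def percent_b(t_min, program=GRADIENT):
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0.0, 27.5, 55.0, 60.0):
    print(f"{t:5.1f} min -> {percent_b(t):.1f}% B")
```

Under this reading, the main ramp rises by 40 percentage points of B over 55 min, i.e. roughly 0.73% B per minute.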

Mass spectrometry and nuclear magnetic resonance spectrometry
Mass spectrometry analysis was conducted on a Q-TOF 6520A mass spectrometer (Agilent Technologies, Germany) equipped with an ESI interface. The mass scan range was set from m/z 100 to 3000 in positive mode. The ion source parameters were as follows: drying gas (N2) flow rate, 9 L/min; drying gas temperature, 345 °C; nebulizer pressure, 38 psig; capillary voltage, 3400 V; skimmer, 64 V; OCT RF Vpp, 750 V; and fragmentor, 160 V. The raw m/z data were processed with MassHunter Qualitative Analysis software (Agilent Technologies, version B.06). Nuclear magnetic resonance (NMR) experiments were performed in (CD3)2SO for flavonoids on a Bruker Avance III 400 (for 1H NMR) (Bruker, Billerica, MA, USA). All spectra were referenced to the residual protic resonance of the solvent at 2.51 ppm.
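Because the spectra were acquired in positive mode, a methylated product is expected to appear as an [M+H]+ ion shifted by one CH2 unit (about 14.016 Da) per added methyl group relative to its substrate. The sketch below illustrates this arithmetic for luteolin (C15H10O6); the helper names are hypothetical, the isotope masses are standard monoisotopic values, and the calculation is an illustration rather than part of the reported workflow.

```python
# Standard monoisotopic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON = 1.00727646688  # mass of H+ (hydrogen atom minus one electron)


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())


def methylate(formula, n_methyl=1):
    """O-methylation replaces H with CH3, i.e. adds C1H2 per methyl group."""
    out = dict(formula)
    out["C"] = out.get("C", 0) + n_methyl
    out["H"] = out.get("H", 0) + 2 * n_methyl
    return out


def mz_protonated(formula):
    """m/z of the singly charged [M+H]+ ion."""
    return mono_mass(formula) + PROTON


luteolin = {"C": 15, "H": 10, "O": 6}
mono_methyl = methylate(luteolin)  # e.g. chrysoeriol or diosmetin, C16H12O6

print(f"luteolin [M+H]+        : {mz_protonated(luteolin):.4f}")
print(f"mono-methylated [M+H]+ : {mz_protonated(mono_methyl):.4f}")
```

Regioisomers such as chrysoeriol and diosmetin share the same formula and m/z, which is why retention-time comparison with authentic standards or NMR is still needed to assign the methylation site.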

Annotation and cloning of GmOMT candidates from Glycine max
To discover potential OMTs in G. max at the genome-scale level, we used a previously reported G. max O-methyltransferase, SOMT2 [28], as a query to search for putative homologs in the G. max protein database. Two criteria, a full open reading frame of more than 350 amino acids in length and protein sequence identity to SOMT2 higher than 40%, were used to identify putative G. max OMT homologs. In total, 22 OMT candidates (designated GmOMT1-22) were obtained (Table S2). They were between 26.3% and 96.7% identical at the amino acid level and contained all three conserved motifs of plant SAM-dependent methyltransferases [30]. Primers were designed according to the candidate GmOMTs' sequences, and 19 of the 22 full-length GmOMTs were cloned and sequenced using G. max cDNA as a template. The 19 OMTs were GmOMT1-12 and 14-20.

Phylogenetic analysis of GmOMTs and characterized plant OMTs
For further analysis of these GmOMT candidates, a phylogenetic analysis including 22 other characterized plant FOMTs (Table S3) from different plants was conducted, which generated three main clades (Fig. 2). OMTs from different clades shared less than 40% amino acid sequence identity.
The regiospecificity of OMTs from clade B was not as consistent as that of clade A. For example, MpOMT2 and ObF8OMT-1 are flavonoid 8-OH OMTs, and CrOMT2 and CrOMT6 are 3′-OH/5′-OH and 4′-OH OMTs, respectively. Five GmOMTs (4, 7, 10, 12, and 15) classified in this clade

Fig. 3. Enzyme activity assays of GmOMT6, 11, 14, and 18 toward different flavonoid substrates. HPLC analysis of the reaction products produced by incubating crude enzymes of GmOMT6, 11, 14, and 18 with eriodictyol (A), luteolin (B), quercetin (C), and caffeic acid (D) as the substrate, respectively. Crude enzyme prepared from an E. coli strain carrying the empty pGEX-4T-1 vector was used as a negative control. Authentic standards of homoeriodictyol, chrysoeriol, isorhamnetin, and ferulic acid were used to verify the corresponding methylated products produced by each OMT.

Summarizing conserved motifs for plant flavonoids OMTs
Thirty-eight FOMTs from different plants have been characterized previously (Table S3). The substrate specificities of these FOMTs and the 15 GmOMTs characterized in this study are listed in Table S3. Most plant FOMTs belonged to the Leguminosae family, including Glycine max (16), Medicago truncatula (7), Glycyrrhiza echinata (2), Medicago sativa (1), and Lotus japonicus (1). The next most represented family was Lamiaceae, containing 12 FOMTs (Ocimum basilicum (7) and Mentha x piperita (5)), followed by Poaceae, containing three FOMTs (Triticum aestivum (1), Oryza sativa (1), and Hordeum vulgare (1)). The remaining species were Chrysosplenium americanum (three), Catharanthus roseus (two), Citrus depressa (two),

Fig. 5. Enzyme activity assays of GmOMT2, 8, 9, and 17 toward different flavonoid substrates. HPLC analysis of the reaction products produced by incubating crude enzymes of GmOMT2, 8, 9, and 17 with naringenin (A), eriodictyol (B), luteolin (C), kaempferol (D), and quercetin (E) as the substrate, respectively. Crude enzyme prepared from an E. coli strain carrying the empty pGEX-4T-1 vector was used as a negative control. Authentic standards of naringenin, sakuranetin, eriodictyol, 7-O-methyleriodictyol, hesperetin, luteolin, diosmetin, hydroxygenkwanin, kaempferol, kaempferide, quercetin, and rhamnetin were used for the detection and verification of the corresponding flavonoid products. The structures of the two novel products (compounds 2 and 3) produced by GmOMT8 and 17, respectively, were further determined by 1H NMR. Compound 2 was confirmed as isosakuranetin (Fig. S3). Compound 3 was confirmed as rhamnocitrin (Fig. S5).

instead of residues E/I in the fourth position. The consensus sequence for Motif 7 was G(K/R)ERX(E/K/Y)XE(W/F), which was present in all 53 characterized FOMTs with 0-1 mismatches (Fig. 7). Motif 7 had three fewer residues than Motif L; the N-terminal residue of Motif 7 was W or F, with W being dominant, whereas in Motif L this position was only residue F.
Collectively, we found that although the seven motifs were highly conserved among OMTs, the newly proposed motifs 1-7 of FOMTs had small modifications compared with those in the literature. The summary and proposal of conserved motifs specific to flavonoid OMTs might facilitate the discovery and functional prediction of OMTs in plants.
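To illustrate how a consensus such as Motif 7 could be used to scan a candidate protein, the sketch below encodes G(K/R)ERX(E/K/Y)XE(W/F) as a list of allowed residues per position (X meaning any residue) and reports windows with at most one mismatch, mirroring the "0-1 mismatches" criterion in the text. The function names and the toy sequence are hypothetical; this is not the WebLogo-based analysis used in the study.

```python
# Motif 7 consensus from the text; None stands for X (any residue).
MOTIF7 = ["G", "KR", "E", "R", None, "EKY", None, "E", "WF"]


def count_mismatches(window, motif=MOTIF7):
    """Number of constrained positions where the residue is not allowed."""
    return sum(
        1
        for residue, allowed in zip(window, motif)
        if allowed is not None and residue not in allowed
    )


def find_motif(sequence, motif=MOTIF7, max_mismatches=1):
    """Return (start index, window, mismatches) for every acceptable window."""
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = count_mismatches(window, motif)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Toy sequence made up for illustration; it is not a real OMT sequence.
toy_sequence = "MAAGKERTEAEWLLGRERAYSEF"
for hit in find_motif(toy_sequence):
    print(hit)
```

Setting `max_mismatches=0` restricts the scan to exact consensus matches; positions are 0-based.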

Conclusion
Glycine max (soybean) contains various flavonoids, most of which are in methylated form; for example, isorhamnetin is 3′-methylated quercetin, and afrormosin and formononetin are 4′-methylated glycitein and daidzein, respectively. Therefore, G. max may be ideal for studying flavonoid OMTs. In this study, using gene screening at the genome-scale level, we obtained 22 potential OMTs from G. max, 19 of which were cloned successfully, and 15 of which were able to catalyze the methylation of flavonoids. Regarding methylation sites, these OMTs could methylate isoflavone 7-/4′-OH and flavonoid 7-/8-/3′-/4′-OH, covering almost all the common hydroxyl sites of flavonoids. Notably, we are the first to observe that GmOMT2 and 9 can catalyze the methylation of luteolin 4′-OH to produce diosmetin. The systematic characterization of the GmOMTs expands the knowledge of the biosynthesis of methylated flavonoids in G. max and the enzyme library of flavonoid OMTs, facilitating the biosynthesis of methylated flavonoids via synthetic biology. In addition, we summarized seven novel motifs specific to FOMTs; they are generally as conserved as the OMT motifs in the literature but carry several modifications specific to FOMTs. These novel motifs will facilitate the discovery of novel flavonoid OMTs.

Fig. 2. Phylogenetic analysis of GmOMTs and previously characterized plant flavonoid OMTs. Nineteen GmOMTs cloned in this study (marked in bold) and 22 characterized plant OMTs were used to generate the phylogenetic tree. The clades and subclades of the phylogenetic tree are marked with different colors. GenBank accession numbers of the characterized plant OMT sequences are listed in Table S3.

Fig. 6. Enzyme activity assays of GmOMT1, 3, 5, and 16 toward different flavonoid substrates. HPLC analysis of the reaction products produced by incubating crude enzymes of GmOMT1, 3, 5, and 16 with daidzein (A) and glycitein (B) as the substrate, respectively. Crude enzyme prepared from an E. coli strain carrying the empty pGEX-4T-1 vector was used as a negative control. Authentic daidzein and glycitein standards were used for the detection of the flavonoid products. The structures of the four novel products (compounds 4, 5, 6, and 7) produced by GmOMTs were further determined by 1H NMR. Compounds 4 and 5 were characterized as isoformononetin and formononetin, respectively (Figs. S7 and S12). Compounds 6 and 7 were characterized as 7-O-methylglycitein and 4′-O-methylglycitein, respectively (Figs. S9 and S14).