Arbuscular mycorrhizal fungi induce the expression of specific retrotransposons in roots of sunflower (Helianthus annuus L.)

Retrotransposon expression during arbuscular mycorrhizal (AM) fungal colonisation of sunflower roots (Helianthus annuus) was analysed using Illumina RNA-Seq, in order to verify whether mycorrhizal symbiosis can activate retrotransposable elements. Illumina cDNA libraries were produced from RNAs isolated from the roots of sunflower plants at 4 and 16 days after inoculation with the AM fungus Rhizoglomus irregulare and from their respective control plants. Illumina reads were mapped to a library of reverse transcriptase-encoding sequences, putatively belonging to long terminal repeat retrotransposons of Gypsy and Copia superfamilies. Forty-six different reverse transcriptase sequences were transcribed, although at a low rate, in mycorrhizal or control roots and only four were significantly over-expressed at day 16, compared with control roots. Almost all expressed or over-expressed sequences belonged to low-copy elements, mostly, of the Copia superfamily. A meta-analysis, using publicly available Illumina cDNA libraries obtained from sunflower plants treated with different hormones and chemicals, mimicking stimuli produced by abiotic and biotic stresses, was also conducted. Such analyses indicated that the four reverse transcriptase sequences over-expressed in mycorrhizal roots were explicitly induced only by AM symbiosis, showing the specificity of AM stimuli compared to that of other fungal/plant interactions.


Introduction
Arbuscular mycorrhizal (AM) fungi are beneficial symbionts of roots of most plant species, including many cultivated plants. AM fungi (AMF) are beneficial obligate biotrophic microorganisms, represent essential elements of soil fertility, and are involved in plant nutrition and productivity. These fungi facilitate mineral nutrient uptake, particularly phosphorus and nitrogen from soil, producing large networks of fungal hyphae, which spread from colonised roots into the soil [1,2], receiving plant carbon compounds, in exchange [3]. AMF also provide some ecosystem services, improving plant tolerance to biotic and abiotic stresses, thereby reducing the need for chemical fertiliser and pesticide inputs in agriculture [4]. The symbiosis PLOS  induces physiological changes in the colonised plants, affecting the fundamental metabolism of the host cells [5,6], and modulates their secondary metabolism, for example, enhances the activity of the antioxidant enzymatic systems and the biosynthesis of diverse phytochemicals with health-promoting activities [7]. These physiological modifications are, in turn, linked to the activation/repression of several genes, as observed using mRNA sequencing (RNA-Seq) analyses, in a few plant species such as Medicago truncatula, Solanum lycopersicum, Solanum tuberosum, Glycine max, Oryza sativa, Lotus japonicus and Helianthus annuus [8][9][10][11][12][13][14][15]. In roots, where AM symbiosis is established, such changes in both plant and fungal transcriptomes were related to mycorrhizal establishment and development, involving post-translational regulation, signalling, transport, hormone metabolism, and biotic and abiotic stresses [15][16][17][18][19][20].
Compared with pathogenic fungi, many similarities occur in the molecular interaction of plant roots with beneficial microorganisms [21], including the activation of pathogen-related genes, radical oxygen species burst and callose deposition, as part of the AMF transiently induced early defence responses [22]. Moreover, AMF may protect host plants by directly inhibiting deleterious microbe propagation or through induced systemic resistance as observed in of Solanum lycopersicum, Nicotiana tabacum and Phaseolus vulgaris [23].
It is known that besides affecting gene expression, pathogenic fungi and their elicitors, as well as compounds related to plant defence, can activate the expression of transposons, especially retrotransposons (e.g Nicotiana tabacum and Solanum lycopersicum) [24][25][26][27][28][29][30][31]. In contrast, the influence of AM symbiosis on the activity of transposable elements (TEs) is not known.
Transposable elements (TEs) are DNA sequences, widespread in the nuclear genome of all eukaryotes [32] and potentially moving across the genome. In particular, class I elements (retrotransposons or retroelements, REs) represent the main fraction of the repetitive component of eukaryotic genomes because they transpose by producing RNA intermediates which are copied to cDNA and inserted into different chromosomal sites, leading to their accumulation in the genome. The most frequent REs in plants show two direct long terminal repeats (LTRs) flanking a coding portion. This portion encodes Pol, a polyprotein which includes four enzyme domains, protease, reverse transcriptase (RT), RNaseH and integrase, involved in the mechanism of replication and integration in the chromosomes of the host; and Gag, a protein resembling that of virus-like particles [33]. LTR-REs use these enzymes to transpose. LTR-REs that are missing one or more domains use enzymes produced by other LTR-REs to replicate and move [32]. LTR-REs are most abundant in plant genomes, especially those belonging to the Copia and Gypsy superfamilies [32], which differ in the position of the integrase domain within the open reading frames [33] and may be distinguished into a number of lineages [34][35][36][37].
Transposon dynamics contribute considerably to the evolution of genomes, having primary roles in different genome functions. Besides affecting the genome size [38][39][40], transposable elements are involved in genome restructuration [41], the generation of new genes by rearrangements of gene segments [42] and affect heterochromatin formation in the nucleus [43]. Finally, transposons can alter the host's regulatory network and gene expression, at both the structural and epigenetic levels, possibly leading to phenotypic variations [44]. The regulatory role of TEs might have important implications into the switch of the host expression leading to symbiosis.
The sunflower, Helianthus annuus (Asteraceae), is widely cultivated and one of the four most important sources of vegetable oil. It has a large genome (about 3.3 Gbp), which has recently been sequenced [45]. Repetitive elements, and especially LTR-REs, represent the vast majority (around 80%) of the genome [46][47][48][49]. Natali et al. [47] assembled and annotated the repetitive portion of the sunflower genome, finding that the abundance of Gypsy elements exceeds by four-fold that of Copia REs. In recent works, different RE lineages to which sunflower LTR-REs belong were identified [48][49][50]. The Copia lineages included AleI/Retrofit/ Hopscotch, AleII, Angela, Bianca, Ivana/Oryco, Maximus/SIRE and TAR/Tork [32]. Conversely, the most frequent Gypsy lineages were Ogre/Tat [51], Athila [52] and Chromovirus, a lineage that is especially abundant in the sunflower genome (it exceeds 30% of the genome [49]) and is generally associated with centromeric regions [51,53].
In the last 10 years, studies have shown that LTR-REs are transcriptionally active in sunflower [54][55][56]. The transcription of Copia and Gypsy REs commonly occurs in different organs (embryos, leaves, roots and flowers) and, reportedly, it is not induced by environmental factors or culture conditions [55]. In the same study, one of 64 individuals of a progeny of a selfed line exhibited the integration of a new element into the genome. Similarly, multiple lineages of Gypsy and Copia REs were active in natural populations of H. annuus and H. petiolaris, and in their natural interspecific hybrids [56].
Concerning the establishment of AMF symbiosis in sunflower, recent investigations performed on sunflower wild accessions, cultivars and inbred lines colonised by Rhizoglomus irregulare, demonstrated a relationship between the susceptibility to AMF and the degree of domestication, with wild accessions more susceptible to colonisation than cultivars [57][58].
One of the most studied and widespread arbuscular fungi is Rhizoglomus irregulare (formerly known as Rhizophagus irregulare), because of its ability to colonise most of plant species [59]. The genome of R. irregulare is about 153 Mbp and reportedly contains around 36% repeated elements, including LTR-, non-LTR-retrotransposons (also called long interspersed nuclear elements, LINEs) and DNA transposons [60]. Rhizoglomus irregulare is said to share multiple identical nuclei in coenocytic hyphae (homokaryon hypothesis) although it has been shown that nuclei can undergo large genomic rearrangements [61], leading to high genome variability, which may affect both genes and repeated elements, in different strains of this fungus [62].
Despite the importance of TEs as regulatory elements, little is known about their involvement in the modulation of host physiology towards symbiosis. This plant-fungus interaction have a considerable importance in exchange of nutrients from the soil, especially phosphate and nitrogen, improving both growth and health of the host plant [3,4]. In addition, the recently sequenced sunflower (Helianthus annuus) genome revealed a high distribution of repeats in the genome with an elevate percentage of Copia and Gypsy retroelements [45]; for these reasons a detailed analysis of retrotransposons activation in mycorrhizal symbiosis between sunflower and Rhizoglomus irregulare could bring an important insight into the relationships between two meaningful biological processes such as the plant-fungus interaction and the modulation of transposition events. In this sense, the goal of the present study is to identify differentially expressed TEs which might be involved in sunflower response to AMF colonisation.
An RNA-Seq approach may be exclusively applied to the analysis of LTR-RE expression when the genome sequence or a complete reference library of LTR-REs is available. For example, RNA-Seq has been recently exploited to investigate the expression of TEs in poplar [63]. To analyse the activation and identify the differentially expressed lineages of LTR-REs in sunflower roots during AMF colonisation, we assessed the differential expression of the reverse transcriptase (RT)-encoding domain, by mapping Illumina RNA-Seq libraries onto a database of sunflower and R. irregulare RE sequences. Moreover, using sunflower genomic resources available in public repositories [45], including Illumina cDNA libraries obtained from roots of plants treated with different hormones and chemicals mimicking stimuli produced by biotic and abiotic stresses, we compared the RT-encoding sequence expression induced by AMF with that induced by these treatments, to infer the factors possibly inducing LTR-RE expression in mycorrhizal sunflower.

Plant material and treatments
Plant materials were the same as described in a previous work [15]. Overall, 40 seeds of H. annuus (inbred line HA412-HO, USDA Accession Number PI 642777) were germinated on moistened filter paper lied in Petri plates. Then, roots of 1-week-old plantlets were placed on a 90-mm diameter membrane (cellulose acetate-cellulose nitrate, 0.45 μm pore diameter size, MF-Millipore) and covered with a 100 cm 2 nylon net (41 μm mesh, Millipore). Rhizoglomus irregulare (IMA6 strain) spores, mycelium and colonised fine roots, previously obtained from a pot culture soil after wet sieving through a 100-μm sieve, were spread onto nylon net surfaces. Uninoculated control plants received 10 mL of a filtrate, obtained by sieving the mycorrhizal inoculum through a 50 μM pore diameter sieve and then through Whatman paper no. 1 (Whatman International Ltd, Maidstone, Kent, UK), to ensure a common AMF-associated microflora to all treatments. For each inoculated and non-inoculated plantlet, another 90 mm membrane was placed on the nylon net, and then the whole system was transferred to sterile 150-mm Petri plates containing steam-sterilised quartz grit. Plates were sealed with Parafilm M, and the lower half was wrapped in aluminium foil. Then, all plantlets were transferred to a growth chamber at 24˚C. Half-strength Hoagland's solution (6 ml) was provided weekly, and progression of fungal colonisation was assessed by succinate dehydrogenase activity and trypan blue staining [64,65], monitored every 48 h on three plantlets. The gridline intersect method [66] was used to determine the percentage of AM fungal root colonisation.

RNA isolation extraction, sequencing and mapping procedures
Whole roots of mycorrhizal and control plantlets harvested after culture for 4 and 16 days, respectively, were ground in liquid nitrogen, and total RNA was isolated using the Logemann procedure [67].
DNaseI (Roche, Basel, Switzerland) was used to purify the RNA from genomic DNA, followed by phenol-chloroform extraction, and precipitation, using standard procedures. The overall RNA quality was evaluated by Bioanalyser 2100 (Agilent Technologies, Santa Clara, CA).
Messenger RNA (mRNA) was converted to cDNA and 12 cDNA libraries (three replicates of mycorrhizal and control root systems at 4 and 16 days of culture) were constructed by using a TruSeq RNA sample preparation kit (Illumina, Inc., San Diego, CA, USA), according to the manufacturer's protocol.
In order to exclude any ribosomal RNA (rRNA) traces from the libraries, reads were aligned against sunflower rRNA sequences downloaded from the NCBI repository, with the following criteria: length fraction = 0.5, similarity fraction = 0.8.

Production of a database of reverse transcriptase (RT)-encoding sequences of Helianthus annuus and Rhizoglomus irregulare
Reverse transcriptase (RT)-encoding sequences from LTR-retrotransposons of H. annuus were identified in a sequence set, representing a whole-genome set of assembled sequences [47]. Briefly, Illumina reads were obtained from genomic DNA of the HA412-HO line (the same used for establishment of symbiosis) and assembled using various procedures [47].This sequence set was analysed by using the RepeatExplorer [69] protein domain search tool. Searches were performed against the Repbase database of RT protein domains derived from mobile elements in plant and included in the RepeatExplorer tool, and RT sequences were retained using the following parameters: 60% minimum similarity, 40% minimum identity; proportion of the hit length from the length of the database sequence = 0.3; maximum allowed frameshifts = 3.
cDNA sequences obtained from inoculated and control roots of sunflower (see above) were mapped to the database of RT sequences using CLCBio Genomics Workbench version 9.5.3 (CLCBio, Aarhus, Denmark), with the following parameters: length fraction = 0.7, similarity fraction = 0.9, mismatch penalties = 1, gap penalties = 1.
Rhizoglomus irregulare genomic Illumina reads were downloaded from the Sequence Read Archive (SRA) public repository (accession number SRR2727640) and trimmed by Trimmomatic (SLIDINGWINDOW:4:20, MINLEN:125 [68]. Then, a randomised sample of 2,000,000 paired-end reads was assembled with RepeatExplorer [67] and compared with the Repbase database of fungi repeated elements. Reverse Transcriptase sequences with a 30% minimum identity, 60% minimum similarity, 150 bp minimum length and a maximum of three frameshifts were retained. Such parameters were less stringent than those used for H. annuus, as suggested by RepeatExplorer instructions for non-higher plant species. Sequences obtained from the assembly were then retrieved on the genomes of six strains of R. irregulare [62], using the BLASTN suite (max_target_seqs = 1; with all default parameters).

Abundance estimation of reverse transcriptase (RT) sequences in the Helianthus annuus genome
The abundance of each RT-encoding sequence in the genome of H. annuus line HA412-HO was estimated by mapping an Illumina set of 9,871,724 Illumina reads (cleaned from organellar sequences), trimmed to 90 nt in length (coverage 0.24×) to the isolated RT sequences (Bioproject archive accession number PRJNA64989). Mapping was conducted using CLCBio Genomics Workbench version 9.5.3. This tool distributes multireads randomly; however, the number of multireads is generally very low, and the number of mapped reads to a single sequence is an indication of its redundancy. Mapping was performed using mismatch cost = 1, deletion cost = 1, insertion cost = 1, similarity fraction = 0.9, and length fraction = 0.7.

Differential expression analysis
The expression level of RT-encoding sequences was calculated both as reads per kilobase per million reads mapped (RPKM) [70] and as mapped reads per million. Mapping was performed separately on H. annuus and R. irregulare RT sequence sets, using the following parameters: length fraction = 0.7, similarity fraction = 0.9, mismatch penalties = 1, gap penalties = 1. Besides unique reads, reads that occurred up to ten times were also included in the mapping analysis, because this strategy correctly estimates the expression of paralogous RT sequences [70]. Reverse transcriptase (RT) expression was filtered to ensure at least one mapped read per million, in at least one library.
Raw counts derived from the CLC aligner were analysed using Baggerley's statistical test, i.e. a proportion based comparison on t-test weighted by beta-distribution [71]. A pairwise comparison test was performed between mycorrhizal (M) and control libraries (C) after culture with AMF for 4 and 16 days, respectively (C4 vs. M4 and C16 vs. M16). The resulting Pvalues were corrected for the false discovery rate (FDR) [72], and significant RT sequences were identified (P<0.05, FDR-corrected).

Analysis of reverse transcriptase (RT) sequence expression in hormonal and stress treatments
In order to compare the expression of RT-encoding sequences induced by AMF with that induced by other treatments, 10 Illumina cDNA libraries from sunflower roots (inbred line HanXRQ), each treated with different hormones and chemicals, mimicking stimuli produced during biotic and abiotic stresses [45], were downloaded from the NCBI SRA (accession number SRP092742). Libraries were built from roots of plants treated with auxin, ethylene, gibberellic acid, salicylic acid, kinetin, abscisic acid, strigolactones, brassinosteroid, polyethylene glycol and NaCl, respectively. One library per treatment and seven corresponding control libraries were available [45].
The quality of the reads was checked using FastQC (v. 0.11.3), and overall quality was improved by trimming the reads using Trimmomatic [68], and removing Illumina adapter (Illuminaclip, Headcrop:10, Crop:56). Ribosomal reads were removed, and mapping was performed versus RT sequences using the CLC Genomics Workbench (v. 9.5.3, same parameters described above). Reverse transcriptase (RT)-encoding sequences showing more than one mapped read per million for each treatment were considered as expressed. Statistical analyses of differential expression between treatment and control libraries were done using Baggerley's statistical test [71].

Collection of reverse transcriptase (RT)-encoding sequences
A genome-wide analysis was conducted to identify RT-encoding sequences from the wholegenome assembly of the HA412-HO inbred line [47]. This assembly was composed of 283,800 scaffolds or contigs. These were scored for the occurrence of RT-encoding sequences of LTR-REs, using the domain search tool of RepeatExplorer [67]. This analysis was limited to LTR-retrotransposons because the frequency of other autonomous retrotransposons (namely, LINEs) in the sunflower genome is negligible [47].
Sequences showing more than 90% similarity were discarded, and a total of 1,807 unique RT sequences were retained, belonging to seven lineages of the Copia superfamily and three lineages of the Gypsy superfamily. The composition of this RT sequence set is reported in Fig 1. The abundance of each RT was measured by mapping the Illumina reads onto the RT sequence set. The average coverage of aligned reads (the sum of the bases of the aligned part of all the reads divided by the length of the reference sequence) of each sequence is reported in Fig 2. As expected, the most abundant sequences belonged to the Chromovirus and Ogre/Tat lineages of the Gypsy superfamily [46][47][48][49] (Fig 2).
For R. irregulare, Illumina reads were assembled with RepeatExplorer and 31 clusters of contigs containing sequences of Copia, Gypsy and LINE retrotransposons, each representing at least 1% of the fungal genome, were collected, totalling 814 sequences.
All the sequences, except 14, matched to the R. irregulare genomes [62]. After filtering for RT domains, 33 sequences were retrieved. Most RT sequences belonged to LINEs (66%) and Gypsy LTR-REs (34%). No RT sequences belonging to Copia elements were identified. Nonetheless, it is to be considered that the large intraspecific variations in sequence and structure of R. irregulare genome could have made difficult matching assembled sequences to the genomes.

Expression of reverse transcriptase (RT)-encoding sequences in sunflower mycorrhizal roots
Rhizoglomus irregulare-inoculated roots showed the first signs of colonisation after culture for 4 days while at 16 days, the percentage of colonised root length was about 50%. Control roots did not show any fungal structure throughout the experiment. Based on these data, RNA-Seq analyses were performed on mycorrhizal and control roots harvested at 4 (M4 and C4, respectively) and 16 (M16 and C16) days of culture [15]. The total number of high-quality reads for each of the 12 libraries ranged from 17.9 to 62.8 million. After aligning high-quality reads to the sunflower RT-encoding sequence set, the percentage of mapped reads of each sample to the sequence set ranged from 0.02% to 0.08%.
The expression level (in mapped reads × million) is reported in Table 1. The mean number of reads matching RT sequences was very low compared to the number of reads mapping to genes (S1 Table). It ranged from 11.80 to 21.68 in the four samples, and the maximum number of mapped reads per million ranged from 188.94 to 320.05. These maximum values belonged to one element of the Copia lineage AleII.
Interestingly, the mean number of reads matching RT sequences was similar in C4, C16 and M16 samples while it increased in M4; the number of reads matching RT sequences dropped from 21.68 in M4, to 11.80 in M16, i.e., it decreased during the colonisation process (Table 1). it can be observed that most elements were expressed in both mycorrhizal and control samples (Fig 3). Similarly, comparing early with late treatments or early with late controls, the majority of RT sequences were expressed throughout the experiment (Fig 3). Overall, 22 out of 46 expressed RT sequences were present in each of the four samples (C4, C16, M4 and M16), suggesting that the corresponding elements were constitutively expressed (even if at a low rate). Fourteen out of 22 sequences belonged to the AleII lineage of the Copia superfamily.
The differential expression of the 46 expressed RT-encoding sequences was analysed ( Fig  4). No RT sequence was differentially expressed after AMF colonisation for 4 days, compared with the respective controls. After AMF colonisation for 16 days, only four RT-encoding sequences were significantly over-expressed in mycorrhizal roots (Fig 4). No RT sequence was under-expressed during the experiment. The four differentially expressed RT sequences showed a low average number of mapped reads per million values (from 0.166 to 3.182, compared to the mapped reads per millions of housekeeping or AMF symbiosis-induced genes, S1 Table) and all belonged to the Copia superfamily, one each for the AleI/Retrofit, AleII, Angela and Ivana/Oryco lineages (Fig 4).
Since, in certain cases, the occurrence of RE sequences in cDNA libraries can be related to genomic DNA contamination, we analysed the relationship between the RPKM value of expressed RT sequences in the M16 sample and their abundance in the genome, as indicated by the average coverage (S1 Fig). The correlation coefficient was very low (r = 0.1), so contamination by genomic DNA in the cDNA libraries could be largely ruled out. Moreover, the most abundant RT sequences were slightly expressed, and, correspondingly, the most expressed RTs were poorly represented in the genome of Ha412-HO inbred line, except one RT sequence with an average coverage of over 400 and about five mapped reads per million. Considering the four differentially-transcribed RT sequences, genomic DNA contamination of cDNA libraries could be excluded (S1 Fig).
In order to analyse the expression of fungal retrotransposon during colonisation, reads from mycorrhizal libraries at 4 and 16 days of colonisation were mapped to the 33 RT-encoding sequences of LINE and Gypsy from the R. irregulare genome, whereas, no read aligned to the fungal RT sequences. This finding indicated that either fungal retrotransposons are not expressed during symbiosis or that the coverage of fungal reads was too low to estimate the RT expression rate if any.

Comparative analysis of reverse transcriptase (RT) sequence expression
Reverse transcriptase (RT) sequence expression was estimated by mapping onto the same RT sequence library as mentioned above, 10 Illumina cDNA libraries publicly available [45], obtained from roots treated with abscisic acid, ethylene, brassinosteroids, gibberellic acid, indole-acetic acid, kinetin, NaCl, polyethylene glycol, salicylic acid and strigolactones, Table 1

. Number of mapped reads onto 46 reverse transcriptase (RT)-encoding sequences expressed in Helianthus annuus, at 4 or 16 days after arbuscular mycorrhizal fungal inoculation (M4 and M16) and/or in the respective controls (C4 and C16).
For each sample, the maximum and the minimum number of reads mapping to single RT sequences are also reported. Values are the means of mapping reads per million, using triplicates per sample.  respectively. Although the ten cDNA libraries were obtained from the HanXRQ genotype rather than Ha412-HO, all RT sequences occurred in both genotypes (data not shown), indicating that the corresponding LTR-RE families are shared between the two genotypes. The results of this comparative analysis are reported in Fig 5 and Fig 6, in which the differential expression in controls and M4 and M16 roots is illustrated. A large number of RT sequences were expressed in many treatments, confirming the widespread transcription of LTR-REs in sunflower [55,56]. Thirty RT sequences expressed in mycorrhizal roots were also expressed in at least one of the other treatments. Sixteen RT sequences were expressed only in the specimens used in this study (mycorrhizal and/or control treatments, Fig 5).

Sample Mean nr. of reads per million Minimum nr. of reads per million Maximum nr. of reads per million
Concerning differentially expressed RT sequences (Fig 6), all treatments determined overexpression of RT sequences, except indole-acetic acid, which induced under-expression of some RT sequences. The four significantly over-expressed RT sequences induced by A sequence was considered as expressed when matching more than one read per million in mapping cDNA libraries to the set of sunflower RT sequences. Libraries from cDNA of mycorrhizal symbiosis were not activated by any other treatment, suggesting that the mechanism of RE induction by AMF is rather specific and not amenable to any of the chemical or hormonal treatments tested here. Instead, two of the analysed RT sequences were induced by all treatments, except mycorrhization.

Discussion
McClintock [73] addressed the activation of TEs as a possibility for the genome to cope with changes in the environment in which a species lives, for example, exposure to biotic and abiotic stresses. Indeed, active mobile elements can produce new genetic variability, increasing the possibility to overcome changing environments, with a significant impact on the evolution of the species.
It is known that pathogenic fungi can induce the activation of REs of the host plant [24][25][26][27][28][29][30][31]. To our knowledge, the effect of AMF root colonisation on REs of the genome of the host species has not been described so far, using the newly available sequencing technologies.
To infer the potential impact of AMF colonisation on TE activation, we analysed the expression of RT sequences during mycorrhization in sunflower. Finding an expressed RT sequence does not ensure that the RT belongs to a functional retrotransposon and that the complete retrotransposon is transcribed. However, given the generally low level of retrotransposon transcription and the length of these elements, an analysis of the expression of a complete element would need cDNA sequence coverage largely exceeding those obtained in current RNA-Seq experiments.
The observed RT sequence expression rate in mycorrhizal roots was on average very low compared with gene sequences, and the vast majority of the RT sequences included in the sunflower dataset was not expressed. The highest total number of cDNA reads matching RT sequences was found in M4, suggesting that at the onset of AMF colonisation, the host reacts, thereby activating REs. Then, with the exception of a few elements, RT transcription values was even lower than those found in controls.
In total, only 46 of 1,807 sequences were transcribed in mycorrhizal or control roots. Among the transcribed RT sequences, a large number were also expressed in hormonal or chemical treatments, suggesting that these RTs belong to LTR-REs, which are regularly active in sunflower roots, even if at low rates. Consistent with this finding, constitutive expression of LTR-REs was previously demonstrated in sunflower and related species [54][55][56].
In comparison, the induction of specific RTs by AMF colonisation is even rarer. Statistical analysis showed that no RT sequence was differentially expressed at day 4, and only four RT sequences were over-expressed at day 16 after mycorrhizal inoculation. Moreover, although not supported by statistical treatment of the data, it could be observed that five and eleven RT sequences were transcribed in M4 and M16, respectively, without being expressed in their respective controls. In all these cases, it is possible to hypothesise the occurrence, in the related LTR-REs, of promoter sequences specifically activated during mycorrhization.
In general, activation of REs can be associated with a reshaping of the gene regulation network in response to both abiotic and biotic stress. Probably these four REs are specifically regulated as are the genes induced by symbiosis. In the case that such expression activation was followed by retrotranscription and insertion in the genome, it should be assumed that symbiosis can induce genomic changes in the host. Obviously, such possibility is limited by the low number-only four-of REs potentially activated after mycorrhization. mycorrhizal roots and their controls were prepared during this study; the other libraries [45] were collected from the public database (see the materials and methods section). Green cells refer to the average expression of control libraries. PI = Post Inoculation. https://doi.org/10.1371/journal.pone.0212371.g005

Fig 6. Differential expression of reverse transcriptase (RT)-encoding sequences after different treatments of Helianthus annuus plants.
Analysis was performed only on expressed sequences. Libraries from cDNA of mycorrhizal roots and their controls were prepared during this study; the other libraries [45] were collected from the public database (see the materials and methods section).
Transposon activity is generally repressed, mostly based on RNA interference, a process that is mediated by small RNAs originating from many different precursors. In general, double-stranded RNAs (dsRNAs) are produced by the host and trigger retrotransposon silencing by chromatin remodelling, RNA-directed DNA methylation and post-transcriptional degradation of RE RNAs [44,74,75]. For example, in the yeast Schizosaccharomyces pombe, small RNAs maintaining heterochromatin structure through histone methylation are produced following a basal level of transcription of centromeric DNA repeats [76].
Although Gypsy LTR-REs are by far more abundant than Copia ones in the sunflower genome [77], most expressed RT sequences during mycorrhizal colonisation belonged to LTR-REs of the Copia superfamily, and especially to the Copia lineage AleII. The four RT sequences differentially expressed at M16 also belonged to four different lineages of the Copia superfamily. The higher expression of Copia elements compared with Gypsy ones can be related to the low redundancy of LTR-REs of this superfamily in the sunflower genome. Indeed, it is presumable that low-copy elements escape RNA interference relatively more easily because their past activity is too low to provide the host with enough transcripts to produce specific dsRNAs, thereby reducing the efficiency of silencing mechanisms. Accordingly, in plants, new insertion events were shown only for a few elements, all belonging to the Copia superfamily, Tnt1, Tto1 and Tos17, which are present in a relatively low copy number (<1,000) per haploid genome [78].
Concerning the retrotransposons of R. irregulare, we confirmed the occurrence of LINE and Gypsy RT-encoding sequences in the genome, as already reported [60]. No RT expression was found during mycorrhizal colonisation. However, it was not possible to affirm whether fungal retrotransposon expression did not occur during symbiosis or the coverage of fungal reads in our mycorrhizal libraries was too low to detect transcription of sequences that are generally barely expressed.
Sunflower mycorrhizal over-expressed RT sequences were not induced by any hormonal or chemical treatments mimicking abiotic or biotic stresses. Moreover, compared with other treatments analysed in this study, AMF colonisation was the treatment that induced the lowest number of RT sequences: four sequences were over-expressed after AMF colonisation while the other treatments induced from 5 to 15 RT-encoding sequences. These data indicate that mechanisms inducing transcription of RT sequences (and, presumably, of LTR-RE families) by AMF colonisation are highly specific. However, as noted above, the transcription of a retrotransposon does not necessarily imply that the retrotransposon is active and able to transpose. Transcripts should be reverse-transcribed to cDNA, which should be then integrated into the genome. In addition, transcripts might be inactivated at the post-transcriptional level, as shown in other species, including humans [44].
In conclusion, the current work showed the activation of RT-encoding sequences in H. annuus roots during mycorrhizal colonisation and compared the expression of these elements with other abiotic and biotic stresses, including pathogenic fungi. It could be deduced that mechanisms of retrotransposon activation in the mycorrhizal symbiosis are specific and a few specific retrotransposons are uniquely activated during the mycorrhizal interaction. Further studies will focus on the activation of REs during the various stages of mycorrhizal establishment in different genotypes, to confirm such specificity and disclose its molecular bases, thereby clarifying the role of retrotransposons during the symbiotic interaction.