Introduction

Bioethanol can be produced from starch and sucrose (first-generation ethanol – 1GE) but also from lignocellulosic biomass (second-generation ethanol – 2GE). For the production of 2GE the sugars used in the fermentation are from the depolymerization of the carbohydrates present in the cell wall, cellulose and hemicellulose1,2. Depending on the plant species, tissue wall material can represent between 40% and 80% of the plant biomass3,4. Grasses with C4 metabolism, especially those belonging to the subfamily Panicoideae, such as sugarcane (Saccharum spp.), sorghum (Sorghum bicolor), species of Miscanthus, and Panicum virgatum, represent plants with the greatest potential for 2GE production due to their large capacity for carbon fixation and biomass accumulation5.

Lignocellulosic biomass used in 2GE production is composed of cellulose, hemicellulose, and lignin, which are arranged in a chemically ordered manner in the wall. Cellulose is organized into crystalline microfibrils that are embedded in a matrix of hemicellulose, which is covalently linked to the complex structure of lignin. In 2GE production the chemical bonds between wall polymers must be broken to release sugars for downstream fermentation processes. Usually, a chemical pretreatment is needed to allow the access of enzymes to the wall polysaccharides6. One of the main difficulties in accessibility to the polysaccharides is the presence of lignin, which is highly resistant to degradation due to a diversity of low reactivity linkages, making this phenolic the main polymer responsible for the cell wall recalcitrance7,8,9. In addition, pretreatments can release lignin residues that can inhibit the fermentation process.

Lignin is a complex heteropolymer formed by oxidative combinatorial coupling of three alcohols that are synthesized in the cytoplasm of plant cells: p-coumaryl, coniferyl, and sinapyl alcohol. These alcohols differ in their degree of methoxylation10 and are transported from the cytoplasm to the apoplast, where they are oxidized by peroxidases and/or laccases into radicals that are then incorporated by random radical reactions into the preformed polymer11. After incorporation, the monolignol residues are called p-hydroxyphenyl (H), guaiacyl (G), and syringyl (S), respectively, and their proportion in the lignin structure varies significantly among plant cell types, tissues, and species12,13. Lignin in gymnosperms consists of G units and small amounts of H units, whereas in angiosperms it is composed of G units, S units, and only trace amounts of H units. In monocotyledons, S units and G units are present at similar levels and the amount of H units is higher than in dicotyledons14.

The S/G ratio and the inter-monomeric linkages in the lignin polymer are important characteristics for predicting the degree and nature of condensation of the polymer and, consequently, plant biomass recalcitrance15. In addition, the complexity of the lignin structure and its recalcitrance can be affected by other phenylpropanoids that can be incorporated into the polymer structure at different levels16. For example, a recent structural characterization of cell walls of several monocotyledons showed that the flavonoid tricin is part of native lignin17,18, and this monomer may participate in forming a nucleation site for the start of lignin biosynthesis18,19,20.

Several plant species have been genetically modified to change the content and composition of lignin, and the degree of modification depends on the gene targeted and on the position of the encoded enzyme in the biosynthetic pathway21,22. In general, changes in the expression of C3H, HCT, or 4CL lead to quantitative changes in the levels of lignin, while the regulation of F5H and COMT leads to changes in the S/G ratio and, consequently, in the type of lignin7,23,24. The recent identification of another lignin biosynthesis enzyme, Caffeoyl Shikimate Esterase (CSE), adds another step in this metabolic pathway that can be manipulated25,26. Indeed, transgenic poplar plants silenced for CSE showed reduced lignin content, altered S and G composition, and increased saccharification yields27.

Many of the studies on the biosynthetic pathway of lignin monomers were conducted in dicotyledons (e.g. alfalfa and Populus) and model plants such as Arabidopsis thaliana and Nicotiana tabacum28, in which a high degree of conservation was observed. The information obtained with these plants has been applied in studies of monocotyledons used for 2GE production28,29,30, but studies with monocotyledons are still proportionally fewer. Lignin biosynthesis in sugarcane has recently been studied in a systematic manner13,31,32,33,34,35,36 and transgenic sugarcane plants silenced for COMT and CAD37,38,39 have been produced.

The genus Saccharum comprises more than 10 species40 and the term sugarcane is generally used to define complex hybrids originating from the species S. officinarum and S. spontaneum, which appear to have contributed 90% and 10%, respectively, to its genotype41. Sugarcane is a C4 grass, which is highly efficient in the production of photoassimilates and biomass accumulation42, in addition to storing up to 18% sucrose (wet basis) in its culms43. The sucrose-rich syrup obtained by crushing the culms is used in the alcoholic fermentation and production of 1GE2,44. The residual biomass called “bagasse” – composed primarily of cellulose (39%), hemicellulose (25%), and lignin (23%) – has huge potential for 2GE production45,46,47. However, the use of sugarcane bagasse to produce 2GE faces several technical hurdles, among them the recalcitrance of the lignocellulosic material, mainly due to the presence of lignin, which drastically decreases the saccharification yield for downstream fermentation33.

A new type of cane, called energy cane, with lower accumulation of sucrose in the stem and richer in fiber has been considered for 2GE production48. The term energy cane has been used generically for the species S. spontaneum as well as for its hybrids with commercial varieties of sugarcane. In addition to its application in biofuel production (first and second generation ethanol) energy cane can be burned to generate electricity42 because of its high lignin content and the greater heating value of this polymer49.

To date, a systematic study related to lignin biosynthesis and cell wall biochemistry has been conducted only on S. officinarum and not on any other Saccharum species33. Some species of the genus have different sucrose and fiber contents, such as S. spontaneum, S. officinarum, S. robustum, and S. barberi. S. officinarum and S. spontaneum differ in sucrose and fiber content: the former accumulates more sucrose but has a lower fiber content. S. officinarum is the only species within the genus Saccharum whose chromosome number is not variable between individuals50 and it is believed to originate from S. robustum. On the other hand, S. spontaneum is a complex, highly polymorphic species, and the most primitive of the species of the genus Saccharum. The high genetic variability of this species has been used in genetic breeding programs seeking to develop commercial varieties with potential for biomass production51. Abundant molecular evidence indicates that S. spontaneum is genetically very different from the other species of Saccharum52,53. Like S. spontaneum, plants of S. robustum have culms that are rich in fiber and poor in sucrose, and although the plants are vigorous, they are susceptible to abiotic and biotic stresses54. Although S. robustum has potential to be used in breeding programs because of its vigor, its use has been restricted to Hawaii51. Apparently, the species S. barberi originated from the natural hybridization of S. officinarum with S. spontaneum55. This species has been cultivated and has a moderate sucrose content, displaying resistance to stresses and a high fiber content relative to S. officinarum. Currently, there is little interest in using S. barberi in breeding programs, mainly due to the difficulty of flowering and flower sterility.

Because of the differing fiber content of these species and their potential for 2GE production, this study aims to investigate the cell wall components and the content and type of lignin, and to determine and evaluate the relative expression of genes related to lignin biosynthesis, in S. spontaneum, S. officinarum, S. robustum, and S. barberi. Such information may not only improve the understanding of lignin accumulation within the genus Saccharum but also inform the adoption of these species for 2GE production.

Results

Cell wall polysaccharides

Irrespective of culm age, the cellulose content was higher in S. spontaneum and S. robustum than in S. officinarum and S. barberi (Fig. 1A). In the first two species, the highest content was observed in culms of internode 8. Hemicellulose content was always higher in internodes 2 + 3 (Fig. 1B) and the species with the lowest content was S. officinarum. The other species showed similar values for the culms of different ages. Pectin content (Fig. 1C) was higher in the younger internode of S. barberi and S. officinarum compared with the mature internode and similar between the internodes in the other two species. The highest pectin content was found in internodes 2 + 3 of S. officinarum and the highest content in internode 8 was found in S. spontaneum.

Figure 1

Content of (A) cellulose (B) hemicellulose and (C) pectin in internodes of Saccharum species. Different capital letters denote significant differences (p < 0.05) between internodes of different stages of development within the same species. Different lowercase letters indicate differences (p < 0.05) between internodes of the same stage of development of the different species. The averages were compared by Tukey’s test. Vertical bars indicate the standard error of the means of five replicates.
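The error bars described in this caption can be derived from replicate values as sketched below. This is a minimal illustration, assuming the standard error is the sample standard deviation divided by the square root of the number of replicates; the function name `mean_and_sem` and the replicate values are hypothetical and are not data from this study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of replicate values."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate the SEM")
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical readings for five replicates; illustrative only.
replicates = [30.0, 32.0, 31.0, 29.0, 33.0]
mean, sem = mean_and_sem(replicates)
print(f"{mean:.2f} +/- {sem:.2f}")
```

The same helper would apply to the three-replicate measurements reported later, since it only requires two or more values.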

Non-structural carbohydrates and starch

The highest contents of total sugars (Fig. 2A) and sucrose (Fig. 2B) were found in mature culms of S. officinarum and S. barberi. These two species were also those that accumulated more reducing sugars in young culms (Fig. 2C). While sucrose contents were similar in new and mature culms of S. robustum and S. spontaneum, reducing sugar contents in these species were higher in new culms. Starch content in mature culms of S. spontaneum was more than eightfold higher than in the other species (Fig. 2D). Comparatively, new culms of S. officinarum accumulated more starch than mature culms.

Figure 2

Content of (A) total soluble sugars, (B) sucrose, (C) reducing sugars and (D) starch in internodes of Saccharum species. Different capital letters denote significant differences (p < 0.05) between internodes of different stages of development within the same species. Different lowercase letters indicate significant differences (p < 0.05) between internodes of the same stage of development of the different species. The means were compared by the Tukey test. Vertical bars indicate the standard error of the means of five replicates.

Total phenols

The highest phenol contents were found in newer internodes, and the highest values were those of the species S. robustum and S. spontaneum (Fig. 3). S. officinarum presented the lowest phenol content in new internodes. The four species did not differ as to the content in mature internodes.

Figure 3

Total phenol content in internodes of Saccharum species. Different capital letters denote significant differences (p < 0.05) between internodes of different stages of development within the same species. Different lowercase letters indicate significant differences (p < 0.05) between internodes of the same stage of development of the different species. The means were compared by the Tukey test. Vertical bars indicate the standard error of the means of five replicates.

Soluble and insoluble lignin

In the four species the highest soluble lignin contents were found in new internodes (Fig. 4A); S. officinarum had the highest content, while the other three species did not differ from one another. S. officinarum also showed the highest content in mature culms. On the other hand, insoluble lignin content was higher in mature internodes in the four species (Fig. 4B), and at both internode stages analyzed the highest contents were found in S. spontaneum and S. robustum.

Figure 4

Content of (A) soluble lignin and (B) insoluble lignin in internodes of Saccharum species. Different capital letters denote significant differences (p < 0.05) between internodes of different stages of development within the same species. Different lowercase letters indicate significant differences (p < 0.05) between internodes of the same stage of development of the different species. The means were compared by the Tukey test. Vertical bars indicate the standard error of the means of five replicates.

Saccharification yield

A similar pattern could be observed in saccharification yield of S. barberi and S. officinarum, and of S. robustum and S. spontaneum, constituting two distinct groups (Fig. 5A). While in the first two species the saccharification yields of the two internode stages were similar, they were quite different in the second group. Saccharification in mature internodes of S. spontaneum and S. robustum was nearly half that of young internodes. In general, the percentage of saccharification in young culms was similar among the four species, around 65%.

Figure 5

Saccharification yield (A) and S/G ratio (B) in internodes of Saccharum species. Different capital letters denote significant differences (p < 0.05) between internodes of different stages of development within the same species. Different lowercase letters indicate significant differences (p < 0.05) between internodes of the same stage of development of the different species. The means were compared by the Tukey test. Vertical bars indicate the standard error of the means of five replicates.

S/G ratio of lignin

S/G ratio was higher in new internodes than in mature internodes of S. robustum and S. barberi but did not differ in the other two species (Fig. 5B). When comparing only new internodes, the first two species also showed higher values than the other two. However, mature internodes of S. barberi, S. officinarum, and S. spontaneum showed higher values than that of S. robustum.

Profile of soluble lignin oligomers

Soluble lignin monomers and oligomers were identified through comparison with data from a library15, using retention times, m/z ratio, and MS/MS fragmentation pattern. We found linkage structures belonging to the groups β-aryl ether (8-O-4), phenylcoumaran (8-5), and resinol (8-8), and it was possible to identify the S aromatic unit involved in each of these linkage structures (Table 1). In the species investigated we identified 11 structures: one sinapyl aldehyde, one monolignol (S), four dimers, and five trimers. Two of the dimers and two of the trimers presented 8-5 linkages with G monomers, which are more recalcitrant linkages. The other linkages present in the trimers (8-O-4) are more easily cleaved. Two of the dimers (m/z = 357 [G(8-5)G] and m/z = 387 [S(8-5)G; S(8-8)G]) presented stereoisomerism (retention times = 3.39, 4.08 and 3.42, 3.39, respectively). This characteristic was also observed for two of the five trimers identified in this study (m/z = 583 [G(8-O-4)S(8-5)G] and m/z = 643 [S(8-O-4)S(8-8)S]; retention times = 3.57, 3.66 and 3.82, 4.2, respectively).
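The library comparison described above can be illustrated with a simple lookup on m/z and retention time. The sketch below is an assumption-laden illustration and not the procedure used in this study: the four library entries are the values quoted in the text, whereas the function name `annotate_peak` and both tolerance values are hypothetical, and the MS/MS fragmentation comparison is omitted.

```python
# m/z -> annotation and retention times, taken from the values quoted in the text.
LIBRARY = {
    357: {"annotation": "G(8-5)G", "rts": (3.39, 4.08)},
    387: {"annotation": "S(8-5)G; S(8-8)G", "rts": (3.42, 3.39)},
    583: {"annotation": "G(8-O-4)S(8-5)G", "rts": (3.57, 3.66)},
    643: {"annotation": "S(8-O-4)S(8-8)S", "rts": (3.82, 4.2)},
}


def annotate_peak(mz, rt, mz_tol=0.5, rt_tol=0.05):
    """Return the annotation of the first library entry matching both m/z and
    retention time within the given tolerances, or None if nothing matches.

    The default tolerances are hypothetical placeholders.
    """
    for lib_mz, entry in LIBRARY.items():
        if abs(mz - lib_mz) > mz_tol:
            continue
        if any(abs(rt - lib_rt) <= rt_tol for lib_rt in entry["rts"]):
            return entry["annotation"]
    return None


print(annotate_peak(357.1, 4.07))  # matches the second retention time listed for m/z 357
print(annotate_peak(357.1, 5.00))  # m/z matches but retention time does not
```

Requiring agreement on both m/z and retention time is what allows stereoisomers with the same m/z to be kept as separate peaks.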

Table 1 Oligomer precursors of lignins, retention time and their respective m/z obtained by UPLC-MS/MS in internodes of Saccharum species.

Figure 6 shows that, irrespective of internode age, S. spontaneum and S. robustum have the greatest diversity and frequency of oligomers compared with S. officinarum and S. barberi. Sinapyl aldehyde (m/z = 207) was found in both young and mature internodes of all species, whereas S monolignol (m/z = 209) was found only in young internodes of S. officinarum and S. barberi, more frequently in the latter species. In S. robustum and S. spontaneum, lignin dimers tended to be more frequent in mature internodes, contrary to what was found for S. officinarum and S. barberi, where dimers were more frequent in young internodes. Trimers were found preferentially in mature internodes of the four species, and with remarkable frequency in S. spontaneum and S. robustum. Comparing all the oligomers identified, the dimer m/z = 387 [S(8-5)G; S(8-8)G] and the trimer m/z = 583 [G(8-O-4)S(8-5)G] were the most frequent structures, whereas the dimer m/z = 405 [G(8-O-4)S] was the least frequent. The dimer m/z = 357 [G(8-5)G] was found only in S. robustum and S. spontaneum. The dimers m/z = 357 [G(8-5)G] and m/z = 405 [G(8-O-4)S] were identified in neither the young nor the mature internodes of S. officinarum and S. barberi.

Figure 6

Distribution of lignin precursor oligomers and their respective m/z in internodes of different ages of Saccharum species. The frequency of each structure is represented in the diagram by different intensities of green colour, ranging from not found (0, white) to found in all five samples analysed (x5, intense dark green).
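The frequency scale in this caption (0 to 5) amounts to counting, for each structure, the samples in which it was detected. A minimal sketch of that tally follows; the function name `detection_frequency` and the presence/absence calls are hypothetical and do not reproduce the data behind Figure 6.

```python
def detection_frequency(detections):
    """Count, for each structure, the number of samples in which it was detected.

    `detections` maps a structure label to a list of booleans, one per sample.
    """
    return {
        label: sum(1 for found in flags if found)
        for label, flags in detections.items()
    }


# Hypothetical presence/absence calls across five samples; illustrative only.
example = {
    "m/z 387": [True, True, True, True, True],
    "m/z 405": [False, True, False, False, False],
    "m/z 357": [False, False, False, False, False],
}
freq = detection_frequency(example)
print(freq)
```

Each count would then map to one colour intensity in a heat map of structures by species and internode age.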

Composition of monosaccharides, lignin, and acetyl substituent groups of cell wall xylan

Figure 7A,B show expansions of the 2D-HSQC NMR spectrum (1H (x-axis)/13C (y-axis)) of the lignin aromatic region and anomeric region, respectively, of a stem wall sample, taking one of the Saccharum species as an example. Prominent peaks corresponding to known polysaccharide linkages are tagged56,57. The compositions of p-hydroxycinnamates, O-acetyl substituent groups in xylan, and monosaccharides are shown in Fig. 7C–E. There was no significant difference among species in p-coumarate and ferulate (Fig. 7C). Regarding the relative abundance of O-acetyl substituent groups of xylans (Fig. 7D), S. officinarum had a significantly higher percentage of 3-O-Ac substituent groups than the other species. For the 2,3-O-Ac group, there were significant differences between the species, and the highest percentages were found in S. officinarum and S. spontaneum. There were no significant differences between the species in the total relative abundance of acetylated groups or of the 2-O-Ac substituent group. S. spontaneum and S. robustum showed significantly higher glucose content than S. officinarum and S. barberi. In contrast, xylose percentage was significantly higher in S. officinarum and S. barberi. S. officinarum showed a significantly greater abundance of mannose than the other species. S. barberi presented the highest percentage of arabinose, whereas S. spontaneum, S. robustum, and S. officinarum did not differ significantly in this monosaccharide (Fig. 7E). β-aryl ether and dibenzodioxocin were the main linkages detected in the four species, while resinol and phenylcoumaran were found in lower amounts (Fig. 7F). S. officinarum showed the highest percentage of β-aryl ether and the lowest of resinol and phenylcoumaran.

Figure 7

2D HSQC NMR (1H (x-axis)/13C (y-axis)) spectra of the anomeric region (A) and aromatic region of lignin (B) in shoots of Saccharum species. Percentage of p-hydroxycinnamates (C), O-acetyl substituents (D), monosaccharides (E) and lignin linkages (F) groups on stems of Saccharum species. The different lowercase letters indicate significant differences (p < 0.05) between the stems of Saccharum species for one type of p-hydroxycinnamate, O-acetyl substituent group or monosaccharide. The means were compared by the Tukey test. The vertical bars indicate the standard error of the means of three replicates.

Histochemical analyses

In all species analyzed the peripheral region adjacent to the epidermis presented a greater concentration of vascular bundles wrapped in several layers of fibers (Figs 8 and 9; left columns). In the central region of the culm, the species have vascular bundles scattered between fundamental parenchyma cells (Figs 8 and 9; right columns). In the second internode just below the stem apex, the tissues are still differentiating. In the xylem, only the conducting cells of the protoxylem are differentiated and have a lignified secondary cell wall (Figs 8 and 9; A-B, G-H, M-N, S-T). The epidermis and vascular bundles are differentiated in the fifth internode. The fibers around the bundles already show early deposition of secondary cell wall (Figs 8 and 9; C-D, I-J, O-P, U-V). In the seventh internode the tissues are differentiated and the parenchyma cells are completely expanded (Figs 8 and 9; E-F, K-L, Q-R, W-X). The phloroglucinol-HCl reagent reveals the presence of lignin with red coloration (Fig. 9) and the Mäule reagent reveals syringyl (S) lignin with red coloration and guaiacyl (G) lignin with yellow coloration (Fig. 8). Across the developmental stages analyzed, tissue lignification increased in the fifth and seventh internodes, with the second internode, still immature, showing little lignification. In the seventh internode of S. officinarum (Fig. 8E) and S. barberi (Fig. 8K), parenchyma cells and cells of the vascular bundles of the peripheral region of the culm show a predominance of red coloration with Mäule reagent, indicating S lignin. In the central region, parenchyma cells of S. officinarum are not lignified (Fig. 8F), whereas in S. barberi they are lignified and have yellowish coloration, indicating G lignin (Fig. 8L). In the seventh internode of S. spontaneum (Fig. 8Q) and S. robustum (Fig. 8W) there are fibers in the peripheral region of the vascular bundles, on which there is lignin deposition. The innermost fiber layers of the bundles show yellowish coloration, indicating G lignin, while the outermost layers have reddish coloration, indicating S lignin. In the two species, the fibers of the vascular bundles and parenchyma cells of the central region show the yellowish coloration of G lignin (Fig. 8R,X).

Figure 8

Cross sections of different regions and internodes of Saccharum species submitted to the Mäule reaction for detection of S and G lignin. 2nd = immature internode, 5th = intermediate internode, 7th = mature internode. Left columns = peripheral region (rind); right columns = central region (pith). e = epidermis; f = fibers; fp = fundamental parenchyma; mx = metaxylem; ph = phloem; px = protoxylem; vb = vascular bundle. Scale bars = 50 μm.

Figure 9

Cross sections of different regions and internodes in Saccharum species stained with phloroglucinol reagent for total lignin detection. 2nd = immature internode, 5th = intermediate internode, 7th = mature internode. Left columns = peripheral region (rind); right columns = central region (pith). e = epidermis; f = fibers; fp = fundamental parenchyma; mx = metaxylem; ph = phloem; px = protoxylem; vb = vascular bundle. Scale bars = 50 μm.

Starch grains, stained black, were observed in the chlorophyll parenchyma cells in the peripheral region of the culm of all species analyzed (Fig. 10). However, in the fundamental parenchyma cells the starch grains were only observed in abundance in S. spontaneum (Fig. 10C).

Figure 10

Cross sections of the stem peripheral region at the 7th internode of Saccharum species treated with Lugol's reagent (I2 + KI) for detection of starch grains. Cp = chlorophyll parenchyma; fp = fundamental parenchyma; vb = vascular bundle; arrow = starch grains. Scale bars = 50 μm.

The marked differences found between the species were the thickness of the cell walls of the vascular bundle fibers in the peripheral region and the lignification of the parenchyma cells in the central region. In S. officinarum (Fig. 9E,F) and S. barberi (Fig. 9K,L) the vascular bundles near the epidermis presented fibers with thinner cell walls compared with those present in S. spontaneum (Fig. 9Q,R) and S. robustum (Fig. 9W,X). In the peripheral region, parenchyma cells of all species are lignified in the seventh internode. However, in the central region of the S. officinarum culm the parenchyma cells remain non-lignified.

Identification and expression of monolignol biosynthesis genes

Bands taken from the gels and sequenced enabled the identification of 13 unigenes in the four Saccharum species: 1 C4H, 2 4CL, 1 HCT, 1 F5H, 1 C3H, 2 CCoAOMT, 1 CCR, 1 COMT, and 3 CAD. As two genes were isolated for CCoAOMT and 4CL, they were identified as A and B; and for CAD they were called A, B, and C. The SAS (Sugarcane Assembled Sequences) of the respective orthologs in sugarcane identified by Bottcher et al.33 and the abundances of reads observed for each one of the genes identified in this study are shown in Supplementary Table S3. The phylogenetic analyses of the sequences of the genes isolated from the Saccharum species of this study and other angiosperms are in Supplementary Figs S1–S9 and the translated sequences for proteins are in Supplementary Figs S10–S18.

Gene expression profile in S. spontaneum and S. officinarum

Expression of the identified genes was analyzed by qPCR (Fig. 11). In general, most genes were more highly expressed in S. spontaneum, namely C4H, 4CL A, C3H, CCoAOMT A and B, CCR, and F5H. S. officinarum had higher expression of the HCT, COMT, and CAD B genes. The CAD A gene had varied expression between the tissues, but its highest expression was in young and mature leaves (Fig. 11J). Internodes 3 and 5 showed differences between rind and pith. C4H was equally expressed in pith and rind of S. officinarum and decreased from rind to pith in S. spontaneum (Fig. 11A). 4CL A showed no difference between rind and pith in internode 3 in both species, but in internode 5 it decreased from rind to pith in S. officinarum and increased in S. spontaneum (Fig. 11B). HCT had higher expression in rind of internodes 3 and 5, but compared with pith the expression in this tissue was lower (Fig. 11C). C3H had higher expression in all tissues of S. spontaneum compared with S. officinarum. The expression of C3H was higher in rind than in pith of internode 3 (Fig. 11D). CCoAOMT A had higher expression in rind and pith in internode 5 than in internode 3 (Fig. 11E). However, this gene was more expressed in pith (internodes 3 and 5) than in rind in S. officinarum, and the opposite was observed in S. spontaneum. CCoAOMT B maintained its expression in rind of internodes 3 and 5 and increased slightly from pith 3 to pith 5 in S. officinarum; in S. spontaneum this gene was more expressed in tissues of internode 5 than of internode 3 (Fig. 11F). CCR had relatively higher expression in rind and pith in internode 5 in S. spontaneum (Fig. 11G). In S. officinarum the expression was lower in all tissues, and there was higher expression in rind of internode 5 than in rind of internode 3. Among the genes analyzed, CCR was one of the most expressed of the lignin biosynthetic pathway, followed by CCoAOMT B (Fig. 11F–G). For F5H, S. officinarum showed no difference in expression between rind and pith in internodes 3 and 5, whereas S. spontaneum showed higher expression in rind and pith of internode 5 (Fig. 11H). For COMT, higher expression in pith of internode 5 was notable in both species (Fig. 11I). CAD A presented a more specific pattern in young and mature leaves, low expression in roots for both species, and, in S. officinarum, higher expression in pith than in rind of the same internode for internodes 3 and 5 (Fig. 11J). No differences were observed in the expression of CAD A between rind and pith for S. spontaneum. CAD B in S. officinarum showed higher expression in rind of internode 5. In S. spontaneum there was higher expression in tissues of internode 5 compared with internode 3 (Fig. 11K). Interestingly, the genes display distinct patterns of expression in the two species, which indicates complex and distinct control of the lignin biosynthetic pathway. The CCR and CCoAOMT B genes were the most highly expressed; 4CL and F5H displayed higher expression in more developed tissues, i.e., internode 5; C3H and CCR were more highly expressed in S. spontaneum; and CAD B in S. officinarum.

Figure 11

Expression profile of the genes of the monolignol biosynthetic pathway analysed by qRT-PCR in Saccharum species. YL = young leaf, ML = mature leaf, R3 = rind of internode 3, R5 = rind of internode 5, P3 = pith of internode 3, P5 = pith of internode 5 and R = root. Different letters indicate significant differences (p < 0.05) in relative gene expression among tissues of the same genotype. The means were compared by Tukey’s post-hoc test. The vertical bars indicate the standard deviation of the means of three biological replicates.
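For readers unfamiliar with how relative expression values of this kind are commonly derived from qPCR threshold cycles, the sketch below applies the widely used 2^-ΔΔCt model. This is an assumption for illustration only: this section does not state which quantification model, reference gene, or calibrator sample was used, and the function name `relative_expression` and all Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change by the 2^-ddCt model.

    Assumes roughly 100% amplification efficiency for both genes.
    ct_target / ct_reference: Ct of the target and reference gene in the sample.
    ct_target_cal / ct_reference_cal: the same two Ct values in the calibrator.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values; illustrative only.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold)  # the target reaches threshold 2 cycles earlier than in the calibrator
```

A sample identical to the calibrator returns a fold change of 1, which makes the calibrator the baseline against which tissues are compared.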

Discussion

Sugarcane can store soluble, readily fermentable sugars (mostly sucrose) at up to 18% of the fresh mass of the stalk2,58. The large accumulation of sucrose occurs during maturation of the culms. Energy cane accumulates half or less of the sucrose of sugarcane, and much of the fixed carbon is shuttled to structural polysaccharides such as cellulose and hemicelluloses59. Comparing the mature internodes of the Saccharum species studied, the lowest values for cellulose, hemicellulose, and pectin were found in S. officinarum, and the highest values in S. spontaneum (Fig. 1). The opposite was observed for sucrose, the primary soluble sugar in mature culms (Fig. 2). With some variation, S. barberi had levels closer to those of S. officinarum, while S. robustum was closer to S. spontaneum. This inverse relationship appears to be reflected in the wall monosaccharide composition evaluated by 2D-HSQC NMR spectroscopy. S. officinarum and S. barberi biomass harbors a higher xylose content, while that of S. spontaneum and S. robustum harbors a higher glucose content (Fig. 7E), reflecting the competing sinks for these carbohydrates, hemicellulose and cellulose, respectively60,61.

Interestingly, while the cellulose content remained the same in new and mature culms of S. barberi and S. officinarum, it increased in the other two species. This behavior is opposite to that of sucrose: the disaccharide increases with maturation in the culms of S. barberi and S. officinarum, but remains practically the same in S. robustum and S. spontaneum. On the other hand, the comparison of reducing sugar contents in new and mature culms shows a much greater variation for S. barberi and S. officinarum, suggesting that reducing sugars in these species are directed towards sucrose synthesis, whereas in the other two species they are directed towards structural polysaccharides, in particular cellulose62. Similar to Panicum virgatum63,64, Brachypodium distachyon60,65, and Zea mays66,67, during the development of the internodes in S. spontaneum and S. robustum more carbon accumulated as insoluble polysaccharides (cellulose, hemicellulose and pectin) in the cell wall than as soluble sucrose in the parenchyma cells.

While the starch content was reduced during the maturation of the culms in S. officinarum, S. robustum, and S. barberi, it increased notably in S. spontaneum, as also observed visually in the histochemical analyses. Starch granules were detected in the fundamental parenchyma of mature internodes of S. spontaneum. The presence of starch in S. spontaneum had been reported previously59 in a study in which 215 clones related to the genera Saccharum, Erianthus, and Miscanthus were analyzed. S. robustum was the species with only traces of starch, whereas S. spontaneum harbored the highest content. It has been suggested that the accumulation of starch in mature internodes of this species could be due to its capacity for tillering and high metabolic activity and could serve as a strategy to cope with biotic and abiotic stresses68.

Lignin is the second largest biopolymer present in the cell walls of grasses69. Although it is essential for plant growth and development, lignin is the main factor responsible for the recalcitrance to processing of plant biomass in 2GE, including sugarcane33. Lignin content in the Saccharum species was determined using the Klason method, which distinguishes the soluble and insoluble fractions that together provide a total estimate of lignin70. Regarding internode age, a negative correlation was observed between these two types of Klason lignin, indicating a greater amount of soluble Klason lignin (monomer and oligomer precursors of insoluble lignin polymers) in young internodes, and of insoluble lignin in mature internodes. This is not surprising, as lignification of the wall is still underway in young internodes. However, most of the lignin biosynthetic genes analyzed had a lower expression in young culms, suggesting that the larger amount of soluble lignin in these tissues is correlated with the polymerization process and not with monolignol production.

In the culm, the rind contains a high percentage of densely packed vascular bundles and is a metabolically active region with high peroxidase activity, therefore polymerizing and thus accumulating lignin31,33. When comparing the insoluble lignin content in mature internodes of the four species, S. spontaneum (20%) and S. robustum (18%) contain higher values than S. barberi (16%) and S. officinarum (14.5%). This difference was also observed in the histochemical analyses with phloroglucinol-HCl. Compared with S. officinarum and S. barberi, the rinds of mature internodes of S. spontaneum and S. robustum have higher density of vascular bundles and the walls of cellular elements such as hypodermis, epidermis, sclerenchyma and vascular fibers seem thicker and more lignified, contributing significantly to the higher content of this polymer. A general analysis of the expression of lignin biosynthesis pathway genes in the tissues of the culms displays a higher expression in S. spontaneum compared to S. officinarum, and a higher expression in tissues (rind and pith) of internode 5 compared with internode 3, supporting the higher insoluble lignin content in S. spontaneum and in mature tissues of the stalk. These gene expression differences, however, varied slightly depending on the species and tissue, for example, C4H in S. spontaneum, C3H in pith of the two internodes, CAD A and CAD B in rind and pith of S. officinarum, CCoAOMT A in rind of S. officinarum, and HCT in pith of S. spontaneum.

The nature of inter-monomeric linkages between lignin oligomers and their modifications can be exploited for the production of more degradable lignins15,71,72, enabling greater efficiency in fermentation processes using cell wall sugars for 2GE production. The 8-O-4 linkages (β aryl ether) are the most common and are characterized as the easiest to cleave. Lignins rich in G units have more recalcitrant linkages, such as 8-5 (phenylcoumarans), 5-5 (resinols), and 5-O-4, while S lignins are less interlinked and less recalcitrant to hydrolysis15,73. Overall, the analyses of the profiles of oligomers obtained by UPLC/MS from the four species studied identified 11 structures, including aldehydes, monomers, dimers, and trimers (Table 1). The distribution of these structures allowed a clear distinction between the internodes of the Saccharum species, and there was a higher frequency of lignin oligomers in mature internodes than in young internodes. On the other hand, the highest amount of soluble phenols in all species was found in young culms, with markedly higher quantities in S. robustum and S. spontaneum compared with the other two species. Large quantities of free phenols, such as hydroxycinnamic acids and chlorogenic acids, are found in lignifying tissues10,16,25. In addition, the highest frequency and diversity of lignin oligomers (dimers and trimers) were found in mature internodes of S. robustum and S. spontaneum. Morreel et al.74 commented that the various lignin oligomers in tissues that undergo extensive lignification are derived from the availability of monolignols that are coupled under oxidative conditions for cell wall lignification, justifying the correlation between lignin content and frequency of oligomers.

The 8-O-4 linkage was the most common type of lignin linkage (Fig. 7F). According to Santos et al.75, this type of linkage is dominant in lignins of grasses, corresponding to 60% of the total. Other works, such as those presented by Bottcher et al.33 and Kiyota et al.15, also corroborate these results. It was also observed that G units were found more frequently than S units in oligomers of the four Saccharum species. We could not find H units, although the lignin of grasses is characterized by having more of these units than the lignin of dicotyledons76. The non-detection of H units in the Saccharum species could be explained by the fact that these units occur essentially as free terminal, inert phenolic groups, and their incorporation prevents the growth of the lignin polymer. Due to their high oxidative potential they are insoluble in ethyl acetate, which was the solvent used in the extraction of the oligomers76. Structures containing only G units in this study, such as the dimers m/z = 357 [G(8-5)G] and m/z = 375 [G(8-O-4)G] and the trimer m/z = 553 [G(8-O-4)G(8-5)G], were identified notably in mature internodes of S. spontaneum and S. robustum. This result might explain why mature internodes of these species showed a lower S/G ratio than mature internodes of S. officinarum and S. barberi. The structures corresponding to the dimers m/z = 387 [S(8-5)G] and [S(8-8)G] and the trimer m/z = 583 [G(8-O-4)S(8-5)G] were identified in all internodes, which suggests that these structures are conserved and fulfil an important role in growth and development. Sinapyl alcohol (S) (m/z = 209) was found more frequently in young internodes of the species S. officinarum and S. barberi. Young internodes of S. officinarum and S. barberi have a higher soluble Klason lignin content than the other Saccharum species, which could be correlated with an increased frequency of the structure m/z = 209, as soluble Klason lignins are primarily composed of S units77. In comparison to the studies of Bottcher et al.33 and Kiyota et al.15, only two structures, m/z = 357 [G(8-5)G] and m/z = 583 [G(8-O-4)S(8-5)G], were always present, indicating that they are conserved among the species of the genus Saccharum.

In sugarcane hybrids it has been observed that the S/G ratio increases with the development of the stalk33. The same was observed in other grasses such as Festuca arundinacea78, Zea mays66, and Panicum virgatum79. However, such an increase was not observed here: the S/G ratio was higher in young internodes of S. barberi and S. robustum and equal between stages for the other two species. Local growth conditions may have affected the S/G ratio, but in the case of S. barberi and S. robustum the low values might be due to the amount of one of the monomers being higher in the pith or rind. We did not separate rind and pith for S/G analysis, but based on the histochemical analysis the G amount (stained yellow, see Fig. 8) was elevated in the pith compared to the rind.

Lignin composition (S/G ratio) affects the yield of saccharification7 since tissues rich in S are more susceptible to hydrolysis than those rich in G80. We found no significant difference in saccharification in young internodes of the four species studied, which is not unexpected, since the lignification process has not been completed based on the content of soluble and insoluble lignin, oligomers, and phenols. It is interesting to note, however, that in young tissues there seems to be no relationship between saccharification yield and S/G ratio, since S. barberi and S. robustum have a higher S/G ratio but an equal saccharification yield. In contrast, mature internodes of S. spontaneum and S. robustum, with a lower S/G ratio, gave a lower saccharification yield. Therefore, a higher saccharification yield is related to the S/G ratio, but only in tissues that have reached maturity and, thus, in which secondary cell wall formation has been completed.

F5H and COMT are thought to be the determinant enzymes in defining S unit content in plants38,81. In P. radiata, the joint action of the two activities led to an increase of S units, with the increase being smaller when only F5H was overexpressed81. In sugarcane, the reduction in the expression of COMT and F5H using RNAi led to different situations38. While plants with partially silenced F5H did not show a reduction in lignin content, one of the lines had a reduced S/G ratio with a concomitant increased saccharification yield. One of the mutants of COMT displayed a reduction in lignin content and improvement in saccharification yield. One of the mutants of COMT exhibited a reduction in the S/G ratio. Our data do not indicate a direct relationship between the expression of COMT and F5H and the S/G ratio. Using S. spontaneum as an example, this species had a similar S/G ratio between young and mature internodes; however, the expression of COMT and F5H was a little higher in pith of mature internodes but equal to the rind of young and mature internodes. On the other hand, the expression of F5H was much higher in mature tissues. A similar situation was also observed in S. officinarum, but with lower expression values. It cannot be ruled out that other hitherto unidentified isoforms of COMT and F5H are involved in lignin biosynthesis in these two species, but it is noteworthy that Bottcher et al.33 isolated only one COMT and one F5H in sugarcane, and their sequences have a high homology with the sequences isolated in the four species studied.

Another factor that has been recognized as negatively affecting plant biomass processing into 2GE is the degree of O-acetylation of cell wall polymers, since acetate, when released during pretreatment, is a powerful inhibitor of fermenting microorganisms82. O-acetylation of hemicelluloses also reduces enzymatic hydrolysis due to steric hindrance by the acetate83. Therefore, reducing the content of O-acetyl groups in biomasses with bioenergetic potential is desirable84. The main hemicelluloses in grasses are xylans3 and their degree of O-acetylation may vary according to plant species, type of tissue and organ, and state of development85. Xylan acetylation occurs more frequently in position O-3 (up to 30%) and less frequently in O-2 (up to 25%), but acetylation in both positions has been reported85. In the Saccharum species studied here, it was found that the total percentage of acetylation (36.9–39.9%) was similar to values found in other grass biomasses86. Acetylation in position O-3 was predominant (21.8–24.7%) with respect to substitution at O-2 (11.8–13.0%) and at O-2/O-3 (1.47–3.47%). Analyses by 2D-HSQC NMR spectroscopy showed that S. robustum and S. spontaneum were the species that presented the lowest percentages of acetylation in position O-3 (21.8% and 22.9%, respectively) and of total acetylation (36.9% and 37.4%, respectively). However, the hypothesis that biomass with a reduced percentage of acetylesters results in higher saccharification yields87 could not be supported here. S. officinarum and S. barberi, with a higher degree of acetylation than S. spontaneum and S. robustum, exhibited a higher yield of saccharification. Since it is known that in secondary walls xylans are closely associated with cellulose88, a lower percentage of acetyl groups in S. spontaneum and S. robustum could lead to an even tighter association of xylan with cellulose, adding to recalcitrance in these species and limiting the yield of saccharification83.

The strategy used in this study to identify genes involved in the lignin biosynthetic pathway in the four Saccharum species involved the amplification of fragments produced in RT-PCR reactions using primers designed from conserved regions of gene sequences of sugarcane and of several other close species. Therefore, such primers are likely to amplify sequences of closely related genes encoding similar enzymatic activities. There is a possibility that not all genes of a gene family are amplified. However, the probability that the isolated genes represent the most highly expressed genes in the tissues is high. Taking into account that the four species studied presented distinct genetic characteristics, it was surprising to observe that the isolated sequences are highly similar among the species and very close in sequence to the ones identified by Bottcher et al.33 in sugarcane. Such similarities could be explained not only by the evolution of the lignin biosynthetic pathway in terrestrial plants but also by the origin of the genus Saccharum and of the commercial cultivars of sugarcane. The parental genomes of S. officinarum (80–90%) and S. spontaneum (10–20%) contributed to sugarcane hybrids including to some extent recombinant chromosomes89. Additionally, the lignin biosynthetic pathway is highly conserved between plants, and modifications in this pathway generate similar phenotypes in monocotyledons and dicotyledons. The approaches to manipulate lignin in alfalfa7 can be transferred to other species such as switchgrass and sugarcane29,37. Genes related to sugar accumulation in sugarcane culms arose through differential expression of other regulators suggesting a specific epigenetic control. PAL is highly conserved between plants and seems to precede the divergence of dicotyledons and monocotyledons90. Genes related to transcriptional activation are highly conserved in grasses91. An example is the gene SND1, which activates several transcription factors (SND3, MYB46, MYB83, MYB85, and MYB105) and has apparently been highly conserved during evolution91.

Conclusions

The set of data obtained here enabled the association of patterns to better understand the process of lignin deposition in four Saccharum species. The differences between the species studied became evident, whether in relation to structural and non-structural carbohydrates or in the quantity and type of lignin. The data enabled the coherent separation of the two species that have been identified as energy canes, S. spontaneum and S. robustum, which accumulate more fiber, from the other two, which accumulate more sucrose. Moreover, the first two species contain more insoluble lignin, the lowest S/G ratios, greater abundance of intermonomeric linkages (lignin oligomers), and lower percentages of saccharification. Gene expression analysis of the lignin biosynthesis pathway genes in S. officinarum and S. spontaneum showed that in general the latter species has higher expression in culm tissues, especially in mature culms. Surprisingly, the sequences of the identified genes showed high conservation in the four Saccharum species and the commercial hybrids. This feature is desirable for the genetic manipulation of energy cane, since knowledge has already been gained with low-lignin commercial varieties of sugarcane39,44,92,93. It has been shown in other grasses that lignin biosynthesis is under complex regulation by transcription factors, which can activate or repress the expression of several genes of the pathway93,94,95,96,97,98. However, to our knowledge, this is the first report describing that lignin genes are highly conserved among species of the same genus. Consequently, the differences among these species in polymer content and composition can only be fully understood after the regulatory regions of each gene, or at least of a set of genes, have been sequenced.

Methods

Plant material and growing conditions

Culms of the species S. spontaneum, S. officinarum, S. robustum, and S. barberi were obtained from the Center of Sugarcane of the Agronomy Institute of Campinas, at Ribeirão Preto, São Paulo State, Brazil. The culms were planted in plastic trays containing vermiculite and kept in a greenhouse and the resulting seedlings were transplanted to 50 L pots containing commercial organic substrate and kept in the greenhouse for approximately one year. For each species 5 replicates were planted (5 pots). After this period, the substrate of the pots was partially replaced, taking care not to damage the root system, and the pots were transferred out of the greenhouse, to the experimental area of our department, under natural sunlight. The pots remained in these conditions for a period of 4 months, with daily irrigation.

Only healthy stems, without any sign of physical injury or disease, were collected. For biochemical analyses internodes 2 + 3 (young stage) and internode 8 (mature stage) were separated from the apex. Histochemical analyses were performed on internodes 2, 5, and 7. Internodes 4 to 10 were used for cell wall characterization by 2D-HSQC NMR spectroscopy. To identify the genes of the lignin biosynthetic pathway we made a composite sample, containing a 1/1 (w/w) mixture of young internodes (2 + 3) and mature internodes (8), from five plants. For the expression analyses (quantitative RT-PCR, qPCR) 7 types of tissues were used: young and mature leaves, rinds of internodes 3 and 5, piths of internodes 3 and 5, and roots. A steel blade was used to separate the rind from the pith31. During sampling, the stems were washed in tap water, chopped into small pieces of 1 cm², frozen in liquid nitrogen, ground, and stored in a freezer at −80 °C. For biochemical analyses the ground tissues were dried in a freeze-dryer.

Histochemical analysis

Internodes 2, 5, and 7 of the stems of the four species were used in these analyses. Histochemical tests were made with hand-cut sections of ~0.5 mm thickness using a steel blade. The following reagents were used to identify the cell wall components: lignin - phloroglucinol-HCl99; syringyl and guaiacyl monomers - Mäule reagent100; and starch - Lugol’s iodine101. The staining results were recorded with an Olympus DP71 camera attached to an Olympus BX 51 microscope.

Cell wall polysaccharides

Pectin, hemicellulose, and cellulose fractions were determined following the protocol of Chen et al.78. Total sugar content in each fraction was determined with the phenol-sulfuric reagent, using glucose as standard102.

Non-structural sugars and starch

Samples were extracted three times with 70% ethanol at 60 °C and the supernatants were pooled after centrifugation. Total soluble sugars and sucrose were determined with the phenol-sulfuric assay102,103, with glucose and sucrose used as standards, respectively. Reducing sugar content was determined according to Nelson104 using glucose as standard. Starch content was determined according to Amaral et al.105. The dried, 70% ethanol-extracted samples were treated sequentially with α-amylase from Bacillus licheniformis (code E-ANAAM, MEGAZYME, Ireland) and amyloglucosidase from Aspergillus niger (code E-AMGPU, MEGAZYME, Ireland) and the resulting glucose was determined with the PAP Liquiform glucose kit (Labtest Diagnóstica S.A.), using an ELISA plate reader (model EL307C, Bio-Tek Instruments, Winooski, Vermont) at 490 nm. Glucose was used as standard.

Analysis of wall constituents by 2D-HSQC NMR spectroscopy

Ball-milled, de-starched, alcohol-insoluble material (25 mg) was dissolved in 0.75 mL of DMSO-d6 and 10 μL of [Emim]OAc-d14 as previously described56. The dissolved lignocellulosics were subjected to a 2D HSQC NMR experiment acquired on a Bruker AVANCE 600 MHz NMR spectrometer equipped with a 5-mm TXI 1H/13C/15N cryo-probe using the pulse sequence ‘hsqcetgpsisp.2’. The experiments were carried out at 25 °C with the following parameters: spectral width 12 ppm in F2 (1H) dimension with 4096 data points (TD1) and 160 ppm in F1 (13C) dimension with 256 data points (TD2); scan number (SN) of 200; inter-scan delay (D1) of 1 s. The chemical shifts were referenced to the DMSO solvent peak (δC 39.5 ppm, δH 2.5 ppm). The NMR data were quantified as described previously using Bruker’s Topspin 3.1 software56,57. The acetylation on xylan was quantified as follows. In brief, the signals in the anomeric region (H1-C1 signals of 2-O-Ac-Xyl, 3-O-Ac-Xyl, 2,3-O-Ac-Xyl, Xyl (xylan) and reducing ends of xylan (α/β-Xyl-R)) were summed to 100%, and the signals in the aliphatic region were integrated separately to calculate the relative content of each form of O-acetyl-xylan unit. The relative contents of 2-O-acetyl- and 2,3-O-acetyl-xylan units were calculated from the H2-C2 signal and that of 3-O-acetyl-xylan units from the H3-C3 signal. The monosaccharide composition [glucose (Glu), xylose (Xyl) and mannose (Man)] was quantified from the anomeric integrals as a fraction of 100%. The lignin composition, that is, S (syringyl), G (guaiacyl), H (p-hydroxyphenyl), FA (ferulate) and pCA (p-coumarate) units, was quantified from the aromatic lignin integrals as a fraction of 100%.
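The "fraction of 100%" quantification described above amounts to normalizing a set of peak integrals by their sum. A minimal Python sketch follows; the function name and the integral values are hypothetical and only illustrate the arithmetic, they are not data or software from this study.

```python
def relative_content(integrals):
    """Normalize HSQC peak integrals so that they sum to 100%.

    `integrals` maps a unit label (e.g. "S", "G") to its integrated volume.
    """
    total = sum(integrals.values())
    if total <= 0:
        raise ValueError("sum of integrals must be positive")
    return {unit: 100.0 * value / total for unit, value in integrals.items()}


# Hypothetical aromatic integrals in arbitrary units (illustration only).
lignin_units = {"S": 30.0, "G": 50.0, "H": 5.0, "FA": 5.0, "pCA": 10.0}
percentages = relative_content(lignin_units)

# The S/G ratio follows directly from the normalized (or raw) integrals.
s_to_g = percentages["S"] / percentages["G"]
```

The same helper could be applied to the anomeric integrals of Glu, Xyl, and Man, since the normalization is identical.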

Total soluble phenols

The samples were extracted twice with 80% ethanol and the phenols extracted were determined with the Folin-Ciocalteu reagent106. Chlorogenic acid was used as standard.

Lignin content, S/G ratio, and oligomers

Soluble and insoluble lignin were determined according to the TAPPI UM-250 Protocol107. Insoluble lignin content was expressed as a percentage of dry wall residue, obtained after sample extraction and hydrolysis. For the determination of soluble lignin, the absorbance of the filtrate of the hydrolysis product was measured at 205 nm and the content calculated using an extinction coefficient of 110 L g⁻¹ cm⁻¹. To determine the S/G ratio, the samples were treated with NaOH in a heating block at 95 °C for 24 h, neutralized with HCl and extracted with ethyl acetate. The residue was dried and then dissolved in MilliQ H2O and the hydrolysis products were analyzed by LC-MS using a UHPLC coupled to a triple quadrupole mass spectrometer with an ESI ionization source (model ACQUITY, Waters Corp., Manchester, UK), as described by Mokochinski et al.108. For the analysis of soluble lignin oligomers the samples were extracted twice in 80% ethanol under sonication and the extracts were dried in a concentrator (Concentrator plus-Eppendorf). The dried residue was solubilized in acetonitrile/water (1:2, v/v) just before the analyses. The samples were analyzed in an Acquity UPLC coupled to a TQD triple quadrupole mass spectrometer (Micromass-Waters, Manchester, UK), according to Kiyota et al.15.
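The soluble lignin calculation from the absorbance at 205 nm can be sketched with the Beer-Lambert law. Only the extinction coefficient (110 L g⁻¹ cm⁻¹) comes from the text; the function name, the filtrate volume, sample mass, dilution factor, and 1 cm path length below are assumptions introduced for illustration.

```python
EXTINCTION_COEFF = 110.0  # L g^-1 cm^-1 at 205 nm (value stated in the text)


def soluble_lignin_percent(absorbance, filtrate_volume_l, sample_mass_g,
                           dilution_factor=1.0, path_length_cm=1.0):
    """Estimate soluble lignin as a percentage of sample dry mass.

    Beer-Lambert: concentration (g/L) = A * dilution / (epsilon * path length).
    All arguments except the extinction coefficient are hypothetical inputs.
    """
    conc_g_per_l = absorbance * dilution_factor / (EXTINCTION_COEFF * path_length_cm)
    lignin_mass_g = conc_g_per_l * filtrate_volume_l
    return 100.0 * lignin_mass_g / sample_mass_g


# Hypothetical reading: A205 = 0.55 after a 10-fold dilution,
# 0.1 L of filtrate obtained from 0.3 g of dry wall residue.
percent = soluble_lignin_percent(0.55, filtrate_volume_l=0.1,
                                 sample_mass_g=0.3, dilution_factor=10.0)
```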

Saccharification

Saccharification was determined as described by Brown and Torget109 using lyophilized biomass equivalent to 10 mg of cellulose. After addition of sodium citrate buffer (0.1 M, pH 4.8), NaN3, and MilliQ H2O, the mixture was heated to 50 °C and cellulase (Trichoderma reesei) and cellobiohydrolase (Aspergillus niger) were added at a 1:4 v/v ratio (Sigma-Aldrich). The samples were incubated in a shaker at 160 rpm and 50 °C for 5 days, and then centrifuged at 12,000 rpm for 15 min. Glucose was quantified in the supernatant102.

In silico analysis of databases and synthesis of primers for identification of expressed genes

We studied the genes of the following lignin biosynthesis enzymes: 4-hydroxycinnamoyl-CoA ligase (4CL; EC 6.2.1.12), cinnamoyl-CoA reductase (CCR; EC 1.2.1.44), ferulate 5-hydroxylase (F5H; EC 1.14.13.-), caffeate O-methyltransferase (COMT; EC 2.1.1.68), cinnamyl alcohol dehydrogenase (CAD; EC 1.1.1.195), caffeoyl-CoA 3-O-methyltransferase (CCoAOMT; EC 2.1.1.104), p-coumaroylshikimate 3′-hydroxylase (C3′H; EC 1.14.13.36), cinnamate 4-hydroxylase (C4H; EC 1.14.13.11), and hydroxycinnamoyl-CoA:shikimate/quinate p-hydroxycinnamoyl-transferase (HCT; EC 2.3.1.133). The sequences of the genes characterized by Bottcher et al.33 were used as bait in the search for homologues in the NCBI and Phytozome databases. We selected sequences of sorghum (Sorghum bicolor), rice (Oryza sativa), corn (Zea mays), wheat (Triticum aestivum), Lolium perenne, and Arabidopsis thaliana. We used only full-CDS sequences with a low e-value (<10⁻⁶). These sequences were aligned in the BioEDIT program110 and conserved regions were used for the design of primers (Supplementary Table S1) using the Primer 3 program, with the following parameters: Tm of 57 °C–60 °C, a difference of at most 2 °C in Tm between the primers of a pair, and a GC content between 55% and 60%111. In some cases, degenerate primers were synthesized. The primers were placed in regions that enabled amplification of as many ORFs as possible.
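The primer selection criteria listed above (Tm of 57–60 °C, at most 2 °C difference within a pair, GC content of 55–60%) can be expressed as a simple filter. The sketch below is not the Primer 3 program; it assumes the Tm values have already been computed elsewhere and are passed in, it does not handle degenerate bases, and the function names and example sequences are hypothetical.

```python
def gc_content(seq):
    """Return the GC content of a primer as a percentage (A/C/G/T only)."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def pair_passes(fwd, rev, tm_fwd, tm_rev):
    """Check a primer pair against the criteria stated in the text.

    tm_fwd and tm_rev are melting temperatures in degrees Celsius supplied
    by the caller (for example, taken from a primer design tool).
    """
    tm_ok = all(57.0 <= tm <= 60.0 for tm in (tm_fwd, tm_rev))
    delta_ok = abs(tm_fwd - tm_rev) <= 2.0
    gc_ok = all(55.0 <= gc_content(p) <= 60.0 for p in (fwd, rev))
    return tm_ok and delta_ok and gc_ok


# Hypothetical 20-mers with 60% GC; the Tm values are made up for illustration.
forward = "GCTAGCCTGACGTGCATCGA"
reverse = "ATGCCGTGACCTGAGCTGCA"
accepted = pair_passes(forward, reverse, tm_fwd=58.0, tm_rev=59.0)
```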

Total RNA extraction, cDNA synthesis, amplification and sequencing

Total RNA extraction was performed in a 1:1 (w/w) mixture of tissues from young internodes (2 + 3) and mature internodes (8). Total RNA was extracted with Trizol (Tri-Phasis Reagent – BioAgency) and treated with Turbo DNAse-free (Ambion). First-strand cDNA synthesis was performed with SuperScript III (Invitrogen) following the manufacturers’ guidelines. RT-PCR reactions were carried out in a thermal cycler (Veriti 96-Well Thermal Cycler-AB Applied Biosystems) following the parameters of Llerena et al.112. The amplification products were separated by electrophoresis in a 1% agarose gel containing ethidium bromide and observed by a photo-documenter Gel Doc 2000 (Biorad). Bands with the expected number of bases were recovered from the gel with GeneJET Extraction (Thermo Scientific), inserted into the cloning vector pGEM-T easy (Promega), and cloned in thermocompetent Escherichia coli DH10β (Novagen). Some colonies were selected, and the presence of the insert (PureLink Quick Plasmid Miniprep Kit, Invitrogen) and its size were confirmed after digesting the plasmid with EcoRI. The inserts were sequenced using M13 primers. Sequencing reactions were performed using BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) and 3730xl DNA analyzer sequencer (Applied Biosystems). Several colonies were sequenced until 25 good quality sequences (forward and reverse orientation for each sequence) were obtained.

Phylogenetic analyses

The obtained nucleotide sequences were translated to amino acid sequences in silico, and homologous proteins obtained from the databases NCBI (http://www.ncbi.nlm.nih.gov/), SUCEST (http://sucest-fun.org), and Phytozome (http://www.phytozome.net/) were selected for phylogenetic analysis. Multiple alignment of amino acid sequences was performed with the ClustalW program113. Phylogenetic analyses were performed with the MEGA program version 4.02 and evolutionary relationships were inferred using the Neighbor-joining algorithm with 1,000 bootstrap replicates. Gap regions were excluded manually.

Gene expression analysis of the isolated genes

For the gene expression analysis, primers specific for the isolated gene sequences were designed (Supplementary Table S2). The efficiency curve of the primers was determined with the Step One Plus Software v2.3 (Life Technologies). Total RNA extraction and first-strand cDNA production were carried out as described above. cDNAs of 7 tissues (new leaf, old leaf, rinds of internodes 3 and 5, piths of internodes 3 and 5, and root) of the species S. officinarum and S. spontaneum were used in the analysis. The reactions were prepared with iTaq™ universal SYBR® Green supermix (Bio-Rad) and analyzed in a StepOnePlus™ Real-Time PCR System, following the program of 95 °C for 3 min and 40 cycles of 95 °C for 10 s and 60 °C for 30 s. The specificity of the amplified products was evaluated by dissociation curve analysis generated by the equipment. GAPDH (glyceraldehyde 3-phosphate dehydrogenase) was used as housekeeping gene33. The relative expression was calculated as 2^(−ΔCt) according to Livak and Schmittgen114.
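As a minimal sketch of the relative expression calculation described above, the following Python function computes 2^(−ΔCt), assuming ΔCt = Ct(target gene) − Ct(reference gene), with GAPDH as the reference. The function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^(-dCt), where dCt = Ct(target) - Ct(reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical mean Ct values: the target gene crosses the threshold
# two cycles later than the reference, so its relative expression is 1/4.
example = relative_expression(ct_target=24.0, ct_reference=22.0)
print(example)  # 0.25
```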

Statistical analyses

For biochemical analyses we conducted a factorial analysis of variance (ANOVA), where the first factor is the Saccharum species and the second factor is the maturation stage, i.e., young internodes (2 + 3) and mature internodes (8). Comparison between means was performed with the Tukey test (α = 0.05). For gene expression analysis we used ANOVA and, for comparison of means, the Tukey test (α = 0.05). For the biochemical analyses (soluble sugars, starch, cell wall polysaccharides, total phenols, Klason lignin, and saccharification) we analyzed 5 biological replicates with three technical replicates each. For the analysis of soluble lignin oligomers and S/G ratio, we analyzed 5 biological replicates with one technical replicate each. For the analyses of hydroxycinnamic acids, monosaccharides, and acetylated xylans we analyzed three biological replicates with one technical replicate each. Results of the biochemical analyses were expressed as mean ± standard error. For gene expression, the results were expressed as the mean of three biological replicates with three technical replicates each. To control error propagation in the calculation of gene expression we used a linear model of error accumulation \((\sigma_{\Delta Ct}^{2} = \sigma_{Ct,\mathrm{ref}}^{2} + \sigma_{Ct}^{2})\) in the calculation of the ΔCt value and a nonlinear model \((\sigma_{2^{-\Delta Ct}}^{2} = \left(\frac{d[2^{-\Delta Ct}]}{d[\Delta Ct]}\right)^{2} \sigma_{\Delta Ct}^{2})\) in the calculation of the 2^(−ΔCt) value115.
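The two error models above can be sketched in a few lines of Python. The derivative of 2^(−ΔCt) with respect to ΔCt is −ln(2)·2^(−ΔCt), so the propagated standard deviation is ln(2)·2^(−ΔCt)·σ(ΔCt). The function name and the numerical inputs are hypothetical, and the sketch assumes that the errors of the target and reference Ct values are independent.

```python
import math


def propagate_error(ct_target, sd_target, ct_ref, sd_ref):
    """Return (2^-dCt, standard deviation of 2^-dCt).

    Linear step:    var(dCt) = var(Ct_ref) + var(Ct_target)
    Nonlinear step: var(2^-dCt) = (d[2^-dCt]/d[dCt])^2 * var(dCt)
    """
    delta_ct = ct_target - ct_ref
    var_delta_ct = sd_ref ** 2 + sd_target ** 2
    rel_expr = 2.0 ** (-delta_ct)
    derivative = -math.log(2.0) * rel_expr  # d/dx of 2^-x
    var_rel_expr = derivative ** 2 * var_delta_ct
    return rel_expr, math.sqrt(var_rel_expr)


# Hypothetical Ct means and standard deviations (illustration only).
rel, sd = propagate_error(ct_target=24.0, sd_target=0.3,
                          ct_ref=22.0, sd_ref=0.4)
```

With these made-up inputs, σ(ΔCt) is 0.5 and the relative expression is 0.25, so the propagated standard deviation is ln(2) × 0.25 × 0.5, or about 0.087.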