Exploring tomato Solanum pennellii introgression lines for residual biomass and enzymatic digestibility traits

Residual biomass production for fuel conversion represents a unique opportunity to avoid concerns about compromising food supply by using dedicated feedstock crops. Developing tomato varieties suitable for both food consumption and fuel conversion requires the establishment of new selection methods. A tomato Solanum pennellii introgression population was assessed for fruit yield, biomass phenotypic diversity, and saccharification potential. Introgression lines 2–5, 2–6, 6–3, 7–2, 10–2 and 12–4 showed the best combination of fruit and residual biomass production. Lignin, cellulose and hemicellulose content and saccharification rate varied widely among the tested lines. Within hemicellulose, xylose content was high in IL 6–3, IL 7–2 and IL 6–2, whereas arabinose content was low in IL 10–2, IL 6–3 and IL 2–6. The latter line also showed the highest potential ethanol production. Alkali pretreatment resulted in the highest saccharification values in most of the lines tested, suggesting that chemical pretreatment is an important factor for improving biomass processability. Interestingly, genotypes that were extreme for more than one trait were found, allowing the identification of better genotypes. Cell wall related genes mapping to genomic regions involved in the variation of tomato biomass production and digestibility highlighted potential candidate genes. The molecular expression profile of a few of them provided useful information about the pathways involved. Screening the S. pennellii introgression population proved very useful for delving into complex traits such as biomass production and digestibility. The extreme genotypes identified could be fruitfully employed for both genetic studies and breeding.


Background
Over the last decades, rising concerns about the depletion of fossil fuels have resulted in an increased interest in fuels derived from bio-renewable sources including sugars, starch and lignocellulosic materials [1]. Lignocellulosic biomass materials constitute the most abundant renewable substrate for ethanol [2]. Currently, cellulosic feedstocks derived from dedicated biomass crops in the U.S., South America, Asia and Europe are from corn stover, sugarcane bagasse, or perennial low input-high yield crops such as miscanthus or switchgrass [3]. More interestingly, lignocellulosic biomass can be obtained on a large scale from agricultural residues, making their conversion into fuel more advantageous from the economic, environmental and strategic points of view [4]. In particular, the production of ethanol from biomass residuals represents a unique opportunity to avoid concerns about compromising food supply by using starch or sucrose based feedstocks [5][6][7]. So far, only a few attempts have been made to investigate the potential of producing fuel from residual biomass of tomato (Solanum lycopersicum L.), a major vegetable crop worldwide.
Biomass production depends on several traits related to morphological and physiological processes controlling the plant vegetative growth. Developing new varieties for both food and fuel production will require the establishment of new selection methods. Moreover, in order to elucidate genetic interactions between traits, it will be important to understand the correlations between traits and the extent to which they can be uncoupled, since positive and negative correlations can profoundly affect the traits themselves or compromise other aspects of crop production [8]. Genomic resources like wild introgression populations can facilitate the identification of tomato genotypes characterized by both high fruit and residual biomass production.
The tomato Solanum pennellii introgression population is a permanent mapping resource for Quantitative Trait Loci (QTL) analysis composed of a series of introgression lines, in which defined genomic segments of the S. pennellii genome replace homologous regions in the S. lycopersicum (cultivar M82) background. Such a population can be very effective for identifying QTL, because any phenotypic difference between an introgression line and the recurrent parental line is attributed solely to donor parent genes within the introgressed chromosomal segment [9]. Assessing the phenotypic and chemical traits of S. pennellii introgression lines can provide useful genetic information about tomato biomass production and potential fuel conversion.
Numerous structural and compositional features can affect lignocellulosic biomass processability. Cellulose is a polymer of β-1,4-linked glucose and forms crystalline fibrils within the cell walls. Hemicelluloses are complex polymers of hexoses (mannose, glucose, galactose) and pentoses (xylose and arabinose), arranged in a non-crystalline manner, that interact with cellulose fibers [10]. The cellulose crystallinity and the heterogeneity of the hemicellulosic fraction confer recalcitrance to the biomass and represent a barrier to the utilization of the sugars locked in the polymers [11]. Pretreatments can be useful for modifying the architecture of the cell walls that compose the biomass, making it more accessible to hydrolytic enzymes. This involves the modification of lignin, removal of matrix polysaccharides, and reduction of cellulose crystallinity [12].
A way to improve feedstock amenability to fuel conversion is to delve into genetic variability within one species for breeding towards the best traits. In this work, we approach the study of complex traits such as biomass production and fuel conversion making use of S. pennellii introgression lines. Firstly, we assessed the biomass phenotypic diversity of the tomato introgression lines to identify both traits contributing to high biomass yield and genotypes to employ in selection programs for increasing biomass production. Then we characterised the residual biomass for lignin components, correlating them with the saccharification potential. In addition, we performed, on selected genotypes, a comparison among different cell wall polysaccharide components in order to identify chemical traits associated with tomato residual biomass digestibility. Finally, as a proof of concept, we assessed genes involved in cell wall biosynthesis in extreme genotypes by in silico analysis and by molecular assay, in order to investigate their role in phenotypic diversity.

Characterization of morphological traits in S. pennellii introgression lines
Thirty-seven tomato introgression lines, covering the full S. pennellii genome, were assessed for fruit yield and biomass production. The frequency distribution of fruit yield and biomass parameters (Fig. 1) showed that the fruit yield of 50 % of the introgression lines fell in the range 369-754 g per plant, with a harvest index range of 45-61 and a residual biomass of 295-701 g. Overall, IL 12-4, 6-3 and 2-5 provided the highest fruit yields (1043, 978 and 958 g per plant, respectively), whereas 4-4 and 6-2 gave the lowest (117 and 25 g per plant, respectively); the recurrent parent M82 showed a fruit yield of 665 g per plant and a harvest index of 61.5 (Additional file 1: Table S1). Interestingly, IL 6-2 and IL 6-3 both have an indeterminate growth habit, but showed extremely different yield performance, in agreement with data reported by Eshed and Zamir [9].
For crop residual biomass and leaf expansion, IL 2-6 showed the highest values, 1136 g and 6.6 m² per plant, respectively. It was followed by IL 4-4 (950 g) in the biomass ranking and by IL 12-4 (4.55 m²) with regard to leaf area. The lowest values of residual biomass and leaf area were shown by IL 3-2 and IL 1-1 (128 g and 0.47 m² per plant, respectively). M82 showed a biomass yield of 420 g per plant and a leaf area of 1.62 m² per plant (Additional file 1: Table S2 and Additional file 2: Table S3). Extreme genotypes were identified for each measured trait. The most skewed traits were fruit weight per plant, residual biomass per plant and leaf area per plant. The introgression lines 2-5, 2-6, 6-3, 7-2, 10-2 and 12-4 showed the highest combination of fruit and residual biomass production and the introgression lines 2-1, 3-2, 4-3, 5-1 and 7-1 the lowest.

Cell wall composition and theoretical ethanol yield
The monosaccharide composition of the hemicellulosic fraction varied widely among the genotypes tested (Table 3). Glucose was the most abundant sugar, ranging from 26.9 g 100 g⁻¹ d.w. (IL 4-1) to 18.4 g 100 g⁻¹ d.w. (IL 4-3). Xylose followed glucose in the ranking, varying between 13.4 g 100 g⁻¹ d.w. in IL 3-1 and 26.5 g 100 g⁻¹ d.w. in IL 6-3. Fucose was the least abundant monosaccharide, with values ranging from 0.27 (IL 2-6) to 0.92 g 100 g⁻¹ d.w. (IL 10-3). Based on the cell wall content of cellulose and hemicellulose monosaccharides (on dry weight), a theoretical ethanol yield was calculated (Fig. 3). IL 2-6 showed the highest potential ethanol production (2681 L ha⁻¹), as it combined the highest biomass yield (6 t ha⁻¹) with a good conversion of biomass into ethanol (451 L t⁻¹ d.w.). IL 4-3 gave a very low theoretical ethanol yield (539 L ha⁻¹), in spite of the best biomass quality for ethanol production (486 L t⁻¹ d.w.). Finally, IL 4-1 showed the lowest potential conversion of biomass into ethanol (340 L t⁻¹ d.w.).
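The per-hectare figures above combine a conversion factor (L of ethanol per tonne of dry biomass) with a biomass yield (tonnes per hectare). The following minimal Python sketch reproduces that arithmetic with the values reported in this section; the function names are illustrative and are not part of the original analysis. Since the biomass yield of IL 2-6 is given rounded to 6 t ha⁻¹, the product differs slightly from the reported 2681 L ha⁻¹.

```python
def ethanol_per_hectare(conversion_l_per_t, biomass_t_per_ha):
    """Per-hectare ethanol (L/ha) = conversion (L/t d.w.) x biomass yield (t/ha)."""
    return conversion_l_per_t * biomass_t_per_ha


def implied_biomass_yield(ethanol_l_per_ha, conversion_l_per_t):
    """Biomass yield (t/ha) implied by a per-hectare and a per-tonne ethanol figure."""
    return ethanol_l_per_ha / conversion_l_per_t


# IL 2-6: 451 L/t and about 6 t/ha (rounded in the text).
print(round(ethanol_per_hectare(451, 6.0)))        # 2706

# Biomass yields implied by the reported per-hectare values.
print(round(implied_biomass_yield(2681, 451), 2))  # 5.94 (IL 2-6)
print(round(implied_biomass_yield(539, 486), 2))   # 1.11 (IL 4-3)
```

The second calculation makes the contrast explicit: IL 4-3 has the best conversion per tonne, but its low biomass yield limits the ethanol obtainable per hectare.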

Comparison among different saccharification pretreatments
Three biomass pretreatments were assessed in order to understand whether saccharification rate is affected by treatment in a genotype-specific manner. Alkali pretreatment gave the highest saccharification values in most lines, whereas water pretreatment was generally the least effective (Fig. 4). IL 4-3, IL 6-3 and IL 2-6 showed the highest saccharification rates (41, 38 and 35 mg glucose g⁻¹ biomass h⁻¹, respectively) using the alkali pretreatment. Interestingly, the saccharification rate was positively correlated with cellulose content upon alkali pretreatment, and with crystalline cellulose content and hemicellulose content upon all three pretreatments tested (Additional file 3). Finally, IL biomass saccharification rate was not significantly correlated with the hemicellulose arabinose/xylose ratio, regardless of the pretreatment type. Our results show that biomass enzymatic digestibility is affected both by genotype cell wall structure and by the type of pretreatment employed.

In silico identification of candidate QTL involved in biomass production
Based on information available in the cell wall literature and in database resources [13], we selected 23 protein families implicated in the construction and modification of plant cell walls. Using this dataset, the tomato genome was explored to identify the enzymes involved in the biosynthesis of key polysaccharides such as pectin, lignin, cellulose, hemicellulose and starch. In order to identify candidate QTL for biomass production by in silico analysis, we searched selected ILs for cell wall related genes (see Additional file 2). Several genes located in the introgression regions of contrasting genotypes 2-5, 2-6, 3-1, 4-1, 4-3, 6-2, 6-3, 7-2, 9-3, 10-2, 10-3, 11-3 and 12-4 were identified. Interestingly, introgression region 2-6 included a high number of Peroxidases, Laccases, Glycosyltransferases and MYB transcription factors in a region of 3.7 Mb. Moreover, two Cellulose synthase, two UDP-D-glucose dehydrogenase, a specific GDP-mannose dehydratase, a Sucrose synthase, a Galactosyltransferase, a GDP-mannose dehydratase and a GHMP kinase have been identified in this chromosome area (Table 4). Conversely, IL 4-3, in a much larger region, includes only 30 genes coding for enzymes involved in cell wall biosynthesis.
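The in silico step described above amounts to selecting annotated cell wall related genes whose coordinates fall inside an introgressed segment and counting them per protein family. The sketch below illustrates that kind of interval filter in Python. It is not the authors' pipeline: the `Gene` record, the gene identifiers and all coordinates are hypothetical placeholders chosen only to make the example runnable.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    gene_id: str      # hypothetical identifier
    family: str       # protein family, e.g. "Peroxidase"
    chromosome: int
    start: int        # bp, placeholder coordinate
    end: int          # bp, placeholder coordinate


def genes_in_introgression(genes, chromosome, region_start, region_end):
    """Return the genes lying entirely within the introgressed interval."""
    return [
        g for g in genes
        if g.chromosome == chromosome
        and g.start >= region_start
        and g.end <= region_end
    ]


def count_by_family(genes):
    """Count genes per protein family."""
    counts = {}
    for g in genes:
        counts[g.family] = counts.get(g.family, 0) + 1
    return counts


# Placeholder annotation: names and positions are invented for illustration.
annotation = [
    Gene("gene_A", "Peroxidase", 2, 1_000_000, 1_002_000),
    Gene("gene_B", "Laccase", 2, 2_500_000, 2_503_000),
    Gene("gene_C", "Peroxidase", 2, 3_900_000, 3_901_500),
    Gene("gene_D", "Cellulose synthase", 4, 1_200_000, 1_206_000),
]

# Hypothetical 3.7 Mb segment on chromosome 2.
selected = genes_in_introgression(annotation, 2, 500_000, 4_200_000)
print(count_by_family(selected))  # {'Peroxidase': 2, 'Laccase': 1}
```

Requiring a gene to lie entirely within the interval is one possible convention; genes spanning a segment boundary would need a separate decision.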

Molecular expression profile
RT-qPCR analysis was used to analyze the expression pattern of key genes in two contrasting ILs [...] (Fig. 5b). The expression of the cellulose synthase-like genes was up-regulated in IL 12-4 and down-regulated in IL 4-3, except for the Cslc1.1 gene (Fig. 5c). Transcription of the BRU gene, involved in xyloglucan synthesis, was high in both ILs, while transcription of FUCA, involved in the hydrolysis of L-fucose, was lower than in M82 in both ILs (Fig. 5d). Finally, the expression of ALDH was slightly up-regulated in IL 12-4 and down-regulated in IL 4-3 (Fig. 5e).

Discussion
Large variations in fruit yield and biomass production among lines were observed in a two-year field experiment conducted with the S. pennellii IL population. Our results support previous findings [14] reporting that most tomato introgression lines showed lower yield and fruit mean weight but higher vegetative biomass than the parent M82. Interestingly, in our study a number of genotypes with extreme phenotypes were identified. Analysis of extreme genotypes that exceed +/− 2.5 mean value of quantitative trait loci (QTL) provides nearly equivalent power to complete genotyping at a reduced cost [15,16]. The introgression lines tested also exhibited dramatic variation in cell wall lignin content, with the highest value being more than twice the lowest. A similar survey on a set of Arabidopsis thaliana mutants for lignin biosynthetic genes revealed that lignin content ranged from 6.0 to 15.9 g 100 g⁻¹ dry weight [17]. Lignin composition could affect saccharification efficiency, and plants with increased biomass and reduced lignin usually have improved fuel production [17]. However, in our work a subset of ILs with high dry biomass and lignin content that also showed a high saccharification rate was identified. Different Miscanthus genotypes also display different patterns of correlation between lignin content and saccharification efficiency [18]. Transgenic plants with modified lignin composition showed enhanced biomass saccharification [19]. In tobacco, up-regulation of sucrose metabolism genes appears to directly impact primary growth and therefore biomass production, also with a slight decrease in lignin content [20]. Determining sample composition, especially structural carbohydrates, and using this information to predict relative performance for fuel conversion could be very useful [21].
In our study a number of lines with high potential ethanol production were identified and, as outlined, ethanol production can depend on different variations in cell wall structure. Indeed, the cell wall architecture can be assembled from many different types of polysaccharides, phenylpropanoids and structural proteins. In our research, crystalline cellulose was highly correlated with hemicelluloses; in this respect, the amount and composition of branches attached to the hemicellulose backbone can affect cell wall plasticity and crystal structure [22]. Within the hemicelluloses, the amount of xylan was high in IL 2-6, IL 4-3, IL 6-2, IL 6-3, IL 7-2, IL 10-3 and IL 11-3 and low in IL 3-1 and IL 12-4. Cell wall recalcitrance varies among plant species and even among different genotypes of the same species [23]. The close relationship between recalcitrance and the chemical composition of the non-cellulosic matrix suggests that cell wall strength could be tuned by carefully controlling the matrix composition [17,24]. Knowledge of cell wall composition can be useful for better directing genetic approaches and pretreatment design to render biomass more amenable to processing. In this respect, the effect of lignin as well as of cellulose on biomass digestibility has been described in previous research [25]. As for the effect of pretreatment on biomass saccharification rate, in our research alkali pretreatment showed the best performance. Consistent with our results, Lima et al. [26] reported that one-step alkali pretreatment improves the enzymatic digestibility of Eucalyptus bark compared to two-step pretreatment with HCl 1 % followed by 4 % NaOH. Acid and alkali pretreatments have distinct mechanisms for biomass modification [27]. Acid pretreatment involves the hydrolysis of hemicelluloses by breaking the glycosidic linkages of polysaccharides [28].
Alkali pretreatment, in turn, breaks down the intermolecular ester bonds that cross-link lignin with hemicelluloses, thereby solubilizing lignin and hemicelluloses [29]. The removal of hemicelluloses increases the mean pore size of the substrate, which facilitates the hydrolysis of cellulose [30]. Correlation among parameters can be affected by composition. A significant positive correlation between hexose sugar production and hemicellulose arabinose content or arabinose/xylose ratio resulted in enhanced biomass digestibility with both acid and alkaline pretreatments [31]. The composition study conducted here allowed us to detect genotypes with wide differences in biomass production, as well as in cell wall composition. Such large differences in the composition of biomass have been previously observed by van Acker et al. [17] in both cellulose and hemicellulose glucose content among Arabidopsis thaliana mutants. A similarly large variation was also observed by Marriott et al. (2014) in Brachypodium mutants. The variation in biomass composition between lines observed in the present work, although surprisingly wide, can be explained by the fact that these ILs originated from a genetically divergent biparental population. The identification of genes that affect such traits is still challenging. For this purpose, a core panel of genes involved in metabolic pathways of cell wall biosynthesis and degradation was mapped in selected ILs with extreme phenotypes. These candidate genes encode proteins predicted to play a role in the synthesis, modification, assembly and disassembly of lignin, hemicellulose, cellulose and pectin [32].
A previous comparative approach showed that the differences in wall architecture between Arabidopsis and rice actually mirror the diversity of the individual gene families involved in the cell wall dynamics of the respective plant species [33]. Changes in wall composition or architecture could be due to mutations either in genes directly related to cell wall metabolism or in genes involved in regulation [34]. Morgan et al. [35] performed a biochemical analysis of a tomato introgression line with increased levels of fruit citrate, identifying a target gene that was successfully tested in transgenic plants. Mutations with major effect occur more frequently in domesticated or artificially disturbed populations [36]. This supports the use of inbreds obtained from crosses between wild and domesticated species for identifying traits with strong and repeatable phenotypes [37]. In our work, several specific candidate gene colocalizations could be highlighted. For example, the genomic region delimited by IL 2-6 contains 20 Peroxidase, 19 MYB factors, 6 Glycosyltransferase, 7 Laccase, 2 UDP-D-glucose dehydrogenase and 2 Cellulose synthase as well as other important polysaccharide biosynthesis enzymes. Different isoforms of the UDP-glucose-consuming pathway have a regulatory role in carbon partitioning between cell wall formation and sucrose synthesis [38]. A Glycosyltransferase mutant produced plants deficient in ferulic and coumaric acid, aromatic compounds known to be attached to arabinosyl residues in xylan substituted with xylosyl residues. The mutant plants exhibit increased extractability of xylan and increased saccharification, probably reflecting a lower degree of diferulic cross-links [39]. Genes and regulatory elements present in S. pennellii and S. lycopersicum could be involved in alterations of cell wall composition and biomass production. The molecular expression profile suggests that the partitioning between source and sink organs could be challenged in different lines.
Indeed, IL 12-4 (a high biomass producer line) showed a major cleavage of sucrose compared to IL 4-3, due to the up-regulation of a cell wall invertase (INV2) and a lower activity of a GDP-mannose transporter and of a glucose transporter 8. Cleavage of sucrose by invertase is generally correlated with growth and cell expansion, associated with sucrose partitioning [40]. Excess sucrose is broken down into fructose and UDP-glucose, which is employed in the synthesis of cell wall polymers [20]. Cellulose synthase-like (CslC2 and CslC6) proteins, involved in the synthesis of various β-glycan polymers [41] by using GDP-mannose as substrate, are both up-regulated in IL 12-4 and down-regulated in IL 4-3 (low biomass producer line), whilst CslC1 is up-regulated in both. A decrease in the mannose substrate available to synthesize the 1,4-β-glucan backbone of mannose could improve IL 12-4 digestibility. The expression of ALDH, an enzyme involved in the synthesis of lignin components such as ferulate and sinapate [42], is also up-regulated in IL 12-4. Both ILs have a similar amount of lignin but different saccharification rates; this difference may be related either to their lignin composition or to the fact that saccharification is a multigenic trait which is affected by a large number of factors in the cell wall [43]. Genes involved in carbon metabolism mapped in IL extreme genotypes are part of a "network QTL", where several elements of a metabolic network are affected by expression QTLs, enzyme activity QTLs, or metabolite QTLs [44,45] fine-tuned in any genotype.

Conclusions
The present work shows the potential for exploiting the residual biomass of tomato ILs for fuel conversion. The tomato introgression lines showed high variability in biomass production, cell wall composition and saccharification rate and, consequently, potential ethanol yield also spanned a wide range of values among the genotypes. The trait enhancement found in extreme genotypes could be compared with the behavior of the recurrent parent through combined genomic and chemical profiling. It is evident, also from the literature, that an intricate network of relations among components governs biomass production and digestibility. For this reason it is difficult to identify genotypes that combine both characteristics. On the other hand, the expression pattern of genes related to biomass production provided interesting clues that should be further investigated. Precise gene mapping is needed in order to predict biomass quality based on genomic information, while the interrogation of contrasting lines could permit the identification of the most eligible alleles linked to saccharification efficiency. Indeed, selected lines will be further explored to identify the candidate genes involved.

Phenotypic analysis
Research was carried out for two years in Naples, southern Italy (40°50' N, 14°15' E, 17 m a.s.l.), on the commercial tomato variety M82 (Solanum lycopersicum L.), Solanum pennellii LA716 and on 37 Solanum pennellii tomato introgression lines, kindly provided by Dr Dani Zamir (Hebrew University of Jerusalem) and reported in Table 1. Information on these lines can be found at the website http://zamir.sgn.cornell.edu. A randomized complete block design with three replicates was arranged and each plot had a 4.73 m² (1.75 × 2.70 m) surface area. Transplanted plants were arranged in single rows 0.90 m apart, with a spacing of 30 cm along the rows (3.2 pt · m⁻²). Experimental research on plants was conducted in accordance with local legislation. Plants were grown under standard tomato field procedures used for the area and fruits were harvested at full ripening. S. pennellii plants failed to grow properly in our climatic conditions and the few samples obtained were not included in the following analysis.

General analytical methods
Plant samples were randomly selected to assess the maximum leaf surface extension using a bench top LI-COR leaf area meter. At harvest, the following determinations were made: a) weight and number of ripe undamaged fruits, classified as marketable; fruit mean weight on 50 unit samples; b) residual biomass, including leaves, shoots, stems; and c) immature or damaged fruits. Harvest index was calculated as a ratio between marketable fruits and total plant weight. After harvest, residual biomass showed no fungal symptoms; therefore samples were randomly collected in each plot and immediately transferred to the laboratory, where they were dried in an oven at 70°C under vacuum until they reached constant weight. After assessing the dry residue, samples were carefully milled, avoiding mixing of materials belonging to different plant organs. The final material, composed of particles ≤ 1 mm diameter, was stored in air-tight bags at −20°C and further dried just before being processed.
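The harvest index defined above is a simple ratio, illustrated here with a short Python sketch. It assumes, as one reading of the text, that total plant weight is the sum of marketable fruits, residual biomass and immature or damaged fruits, and that the index is expressed as a percentage (consistent with the 45-61 range reported in the Results). The function and parameter names are illustrative.

```python
def harvest_index(marketable_fruit_g, residual_biomass_g, unmarketable_fruit_g=0.0):
    """Harvest index (%) = marketable fruit weight / total plant weight x 100.

    Total plant weight is assumed here to be the sum of the three
    components weighed at harvest.
    """
    total = marketable_fruit_g + residual_biomass_g + unmarketable_fruit_g
    if total <= 0:
        raise ValueError("total plant weight must be positive")
    return 100.0 * marketable_fruit_g / total


# M82 means reported in the Results: 665 g fruit and 420 g residual biomass per plant.
# The unmarketable fruit weight is not given in this section, so it is left at zero.
print(round(harvest_index(665, 420), 1))  # 61.3
```

The result is close to, but not identical with, the harvest index of 61.5 reported for M82, which is expected since the example uses rounded means and omits the unmarketable fruit fraction.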

Chemical analyses
After harvesting and weighing out, the residual plant biomass collected was oven-dried at 60°C. Lignin determination and the saccharification assay using water pretreatment were performed on all 38 genotypes tested for yield and biomass production. Other analyses (lignin, cellulose, hemicellulose, pectin, hemicellulose monosaccharides, crystalline cellulose and saccharification assay with acid or alkaline pretreatment) were performed on 13 selected genotypes (5 genotypes with biomass production < 300 g pt⁻¹ of fresh weight; 3 genotypes with biomass production from 301 to 800 g pt⁻¹ of fresh weight; 5 genotypes with biomass production > 800 g pt⁻¹ of fresh weight).

Lignin determination: acetyl bromide method
Biomass powder was weighed out (4 mg) into 2 mL tubes. The biomass was heated at 50°C for 3 h after adding 250 μL of acetyl bromide solution (250 μL of acetyl bromide and 750 μL of glacial acetic acid, by volume), vortexing every 15 min. After the samples were cooled to room temperature, the content was transferred into 5 mL volumetric flasks. A further 1 mL of NaOH (2 mol L⁻¹) was used to rinse the tubes, and the rinse was poured into the 5 mL flasks. Then 175 μL of hydroxylamine HCl (0.5 mol L⁻¹) was added to the volumetric flasks and, after vortexing, the flasks were filled up to 5 mL with glacial acetic acid and mixed several times. Finally, in order to measure the UV absorption at 280 nm by spectrophotometer, 100 μL of each sample was diluted in 900 μL of glacial acetic acid. The amount of lignin was calculated using the following formula:

\[ \text{lignin}\ (\%) = \frac{\text{absorbance}}{\text{coefficient} \times \text{pathlength}} \times \frac{\text{total volume} \times 100\%}{\text{biomass weight}} \]

where coefficient = 15.69, pathlength = 1, total volume = 5 and biomass weight = 4.
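As an illustration of the lignin calculation, the sketch below implements the standard acetyl bromide expression, in which the absorbance is divided by the product of coefficient and pathlength and then scaled by total volume over biomass weight. The defaults are the values given in the text; the units in the comments (L g⁻¹ cm⁻¹, cm, mL, mg) are assumptions inferred from the protocol, and the text does not state whether the 10-fold dilution before reading is already accounted for, so the function takes the absorbance value exactly as it enters the formula.

```python
def acetyl_bromide_lignin_percent(
    absorbance_280,
    coefficient=15.69,    # assumed units: L g^-1 cm^-1
    pathlength_cm=1.0,
    total_volume_ml=5.0,
    biomass_mg=4.0,
):
    """Acetyl bromide soluble lignin as a percentage of biomass weight."""
    # Beer-Lambert: concentration (g/L) = A / (coefficient * pathlength)
    concentration_g_per_l = absorbance_280 / (coefficient * pathlength_cm)
    # g/L multiplied by mL gives mg of lignin in the flask
    lignin_mg = concentration_g_per_l * total_volume_ml
    return 100.0 * lignin_mg / biomass_mg


# Hypothetical absorbance reading, for illustration only.
print(round(acetyl_bromide_lignin_percent(0.5), 2))  # 3.98
```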

Cellulose, hemicellulose and pectin determination

Holocellulose
A mixture of 240 mL of water, 0.75 mL of glacial acetic acid and 2.25 g of sodium chlorite was added to 7.5 g of extracted and dried sample and kept at 75°C for 3 h. At hourly intervals, a volume equivalent to the initial amounts of glacial acetic acid and sodium chlorite was added to the biomass. The sample obtained was filtered and washed first with cold water, then with warm water and finally with acetone. The residue was oven-dried at 105°C for 24 h and then weighed to calculate the content of holocellulose.

Pectin
A 1.3 g amount of the resulting holocellulose was treated with 26 mL of potassium acetate (0.6 mol L⁻¹) and incubated at 75°C for 3 h before adding 26 mL of ammonium oxalate (0.04 mol L⁻¹). The suspension was kept at 75°C for 3 h. Then, the samples were filtered and washed with excess water before the residue was oven-dried at 105°C for 24 h. The pectin content was calculated as the difference between the holocellulose fraction and the above residue.

Cellulose and hemicellulose
A sample of holocellulose (3.8 g) was treated with 100 mL of sodium hydroxide (4.4 mol L⁻¹) at room temperature for 30 min and filtered. Then, it was washed sequentially with warm water (200 mL), 5 mL of acetic acid (2 mol L⁻¹) and 500 mL of water. Next, the residue was oven-dried at 105°C for 24 h and weighed, providing the cellulose fraction. The hemicellulose content was calculated by subtracting the cellulose and pectin amounts from that of holocellulose. The filtration process and the subsequent residue drying reported for pectin, cellulose and hemicellulose were carefully performed; neither material loss nor differences in water content in the dried cell walls compared to the initial samples were detected.
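The gravimetric fractions described in this and the preceding subsections are obtained by difference. The following Python sketch shows one way to carry out that bookkeeping. It assumes that each holocellulose aliquot (1.3 g for pectin, 3.8 g for cellulose) is representative of the whole holocellulose fraction, so that aliquot results can be scaled back to the starting sample; the text does not spell out this scaling, and the residue weights in the example are invented.

```python
def cell_wall_fractions(
    sample_g,
    holocellulose_g,
    pectin_aliquot_g,
    pectin_residue_g,
    cellulose_aliquot_g,
    cellulose_residue_g,
):
    """Return holocellulose, pectin, cellulose and hemicellulose as % of sample dry weight."""
    holo_pct = 100.0 * holocellulose_g / sample_g
    # Pectin: weight lost from the holocellulose aliquot after oxalate extraction.
    pectin_pct = holo_pct * (pectin_aliquot_g - pectin_residue_g) / pectin_aliquot_g
    # Cellulose: residue remaining after the sodium hydroxide treatment.
    cellulose_pct = holo_pct * cellulose_residue_g / cellulose_aliquot_g
    # Hemicellulose: by difference, as described in the text.
    hemicellulose_pct = holo_pct - cellulose_pct - pectin_pct
    return {
        "holocellulose": holo_pct,
        "pectin": pectin_pct,
        "cellulose": cellulose_pct,
        "hemicellulose": hemicellulose_pct,
    }


# Invented weights (g), for illustration only.
fractions = cell_wall_fractions(7.5, 4.5, 1.3, 1.17, 3.8, 1.9)
print({name: round(value, 1) for name, value in fractions.items()})
# {'holocellulose': 60.0, 'pectin': 6.0, 'cellulose': 30.0, 'hemicellulose': 24.0}
```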

Non cellulosic monosaccharide determination
Biomass dry powder (4 mg) was partially hydrolyzed by adding 0.5 mL of trifluoroacetic acid (2 mol L⁻¹). Then, the vials were flushed with dry argon, mixed and heated at 100°C for 4 h, mixing periodically. The vials were then cooled to room temperature and dried in a centrifugal evaporator with fume extraction overnight. The pellets were washed twice with 500 μL of 2-propanol and vacuum dried. Finally, the samples were resuspended in 200 μL of deionised water, filtered with 0.45 μm PTFE filters, and analyzed by HPAEC. Monosaccharides were separated and quantified by HPAEC using a Dionex ICS-3000 with integrated amperometry detection. Chromatographic separation was performed on a CarboPac PA20 (3 x 150 [...]). The separated monosaccharides were quantified by using external calibration with a mixture of nine monosaccharide standards at 100 μM (arabinose, fucose, galactose, galacturonic acid, glucose, glucuronic acid, mannose, rhamnose, and xylose) that were subjected to acid hydrolysis in parallel with the samples.
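External calibration against a standard of known concentration can be sketched as follows. This is a simplified illustration, not the authors' quantification procedure: it assumes a single-point calibration with a linear detector response through the origin, and it introduces a `dilution_factor` parameter because any dilution applied before injection is not described in the text. The peak areas and the dilution factor in the example are invented.

```python
MOLAR_MASS_G_PER_MOL = {"xylose": 150.13, "arabinose": 150.13, "glucose": 180.16}


def concentration_um(sample_area, standard_area, standard_um=100.0):
    """Single-point external calibration, assuming a linear response through zero."""
    return standard_um * sample_area / standard_area


def content_g_per_100g(conc_um, sugar, volume_ul=200.0, biomass_mg=4.0, dilution_factor=1.0):
    """Convert an injected concentration (uM) into g of sugar per 100 g dry biomass."""
    # umol/L multiplied by L gives umol in the resuspended sample
    micromol = conc_um * dilution_factor * volume_ul * 1e-6
    # umol multiplied by g/mol gives ug; divide by 1000 for mg
    mass_mg = micromol * MOLAR_MASS_G_PER_MOL[sugar] / 1000.0
    return 100.0 * mass_mg / biomass_mg


# Invented peak areas and dilution, for illustration only.
conc = concentration_um(sample_area=2.0, standard_area=1.0)      # 200 uM
xylose = content_g_per_100g(conc, "xylose", dilution_factor=100.0)
print(round(xylose, 2))  # 15.01
```

Hydrolysing the standards in parallel with the samples, as described above, means that sugar losses during acid treatment affect standards and samples alike, which is why no separate recovery correction appears in this sketch.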

Crystalline cellulose
Biomass dry pellets after TFA hydrolysis were washed once with 1.5 mL of water and twice with 1.5 mL of acetone. The pellets were left to air dry overnight before complete hydrolysis by adding 90 μL of 72 % (w/v) sulfuric acid and incubating at room temperature for 4 h. Then 1.89 mL of water was added and the sample was heated for 4 h at 120°C. The glucose content of the supernatant was assessed using the colorimetric Anthrone assay, using a glucose standard curve.
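Reading glucose from a standard curve usually means fitting a line to the absorbances of known glucose standards and inverting it for the samples. The sketch below shows an ordinary least-squares fit in plain Python; it assumes a linear response over the range of the standards, and the standard amounts and absorbances are invented values for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def glucose_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate the glucose amount of a sample."""
    return (absorbance - intercept) / slope


# Invented glucose standards (ug) and their absorbances.
standards_ug = [0.0, 20.0, 40.0, 60.0, 80.0]
absorbances = [0.02, 0.22, 0.42, 0.62, 0.82]

slope, intercept = fit_line(standards_ug, absorbances)
sample_ug = glucose_from_absorbance(0.52, slope, intercept)
print(round(sample_ug, 1))  # 50.0
```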

Theoretical ethanol yield calculation
The theoretical ethanol yield was calculated considering the total cellulose conversion in the sample, according to the National Renewable Energy Laboratory standards [46,47]. The theoretical ethanol yield was expressed also taking into account agronomical traits such as the biomass yield per surface area unit.
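A theoretical ethanol yield of this kind is a stoichiometric upper bound: polysaccharide is converted to monomeric sugar with a hydration gain, sugar is converted to ethanol at the stoichiometric mass ratio, and mass is converted to volume through the ethanol density. The Python sketch below illustrates the calculation under stated assumptions: the conversion constants are the usual stoichiometric values and may differ in rounding from those used by the authors, the input is taken as polymeric (anhydro) sugar content, and the composition in the example is invented.

```python
# Stoichiometric constants (assumed; the source cites NREL standards without listing them).
HEXAN_TO_HEXOSE = 180.16 / 162.14      # about 1.111 kg hexose per kg C6 polymer
PENTAN_TO_PENTOSE = 150.13 / 132.12    # about 1.136 kg pentose per kg C5 polymer
ETHANOL_PER_SUGAR = 0.511              # kg ethanol per kg sugar fermented
ETHANOL_DENSITY_KG_PER_L = 0.789


def theoretical_ethanol_l_per_t(c6_polymer_pct, c5_polymer_pct=0.0):
    """Theoretical ethanol yield (L per tonne of dry biomass), assuming 100 % conversion."""
    c6_sugar_kg = c6_polymer_pct * 10.0 * HEXAN_TO_HEXOSE    # % of 1000 kg, then hydration gain
    c5_sugar_kg = c5_polymer_pct * 10.0 * PENTAN_TO_PENTOSE
    ethanol_kg = (c6_sugar_kg + c5_sugar_kg) * ETHANOL_PER_SUGAR
    return ethanol_kg / ETHANOL_DENSITY_KG_PER_L


def theoretical_ethanol_l_per_ha(yield_l_per_t, biomass_t_per_ha):
    """Scale the per-tonne yield by the biomass yield per unit surface area."""
    return yield_l_per_t * biomass_t_per_ha


# Invented composition: 40 % C6 polymers and 20 % C5 polymers on a dry-weight basis.
per_tonne = theoretical_ethanol_l_per_t(40.0, 20.0)
print(round(per_tonne))  # 435
```

Passing only the C6 argument restricts the estimate to cellulose-derived glucose, which is one reading of "total cellulose conversion"; including C5 sugars gives the broader estimate based on both cellulose and hemicellulose monosaccharides.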

Formatting of plant materials
Loading of plant powder into 96-well plates, using a custom-made robotic platform (Labman Automation, Stokesley, North Yorkshire, UK), and saccharification assays were performed according to Gomez et al. (2010) [48] after water, acid or alkali pretreatment. Enzymatic hydrolysis was carried out using an enzyme cocktail with a 4:1 ratio of Celluclast and Novozyme 188.

Gene expression analysis
Total RNA of IL 12-4, IL 4-3 and M82 genotypes was extracted from leaf tissues using the Spectrum Plant Total RNA Kit (Sigma) and treated with DNase I (Sigma). Total RNA was quality checked and cDNA synthesis was performed with oligo (dT) and SuperScript III Reverse Transcriptase (Invitrogen). Specific primers for candidate genes (Additional file 2: Table S5) were designed using Primer3 software. RT-qPCR was performed in a 12.5 μL reaction volume using the Sensi-FAST SYBR Hi-ROX Kit (Bioline) with 4.5 μL cDNA as a template. Each reaction was carried out in triplicate and run on the 7900HT Fast Real-Time PCR System (Applied Biosystems). Fold change of each transcript, normalized to EF (elongation factor), was calculated relative to expression in the M82 sample, using the 2^(−ΔΔCt) method.
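The 2^(−ΔΔCt) calculation can be written out in a few lines. In the sketch below, the target gene Ct is first normalized to the reference gene (EF in this study) within each genotype, and the normalized value of the introgression line is then compared with that of the calibrator (M82). Averaging technical triplicates before the calculation is an assumption of this sketch, and the Ct values are invented.

```python
from statistics import mean


def fold_change(ct_target_sample, ct_ref_sample, ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of technical replicate Ct values.
    """
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Invented Ct values: target gene and EF in an IL (sample) and in M82 (calibrator).
fc = fold_change(
    ct_target_sample=[24.0, 24.1, 23.9],
    ct_ref_sample=[18.0, 18.1, 17.9],
    ct_target_calibrator=[25.0, 25.1, 24.9],
    ct_ref_calibrator=[18.0, 18.1, 17.9],
)
print(round(fc, 2))  # 2.0
```

A fold change above 1 indicates higher expression than in M82 and a value below 1 indicates lower expression; the method assumes similar amplification efficiencies for the target and the reference gene.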

Statistical analysis
Data were processed by analysis of variance and mean separations were performed through the Duncan multiple range test, with reference to 0.05 and 0.01 probability levels, using SPSS software version 17. Data expressed as percentage were subjected to angular transformation before processing. Correlations were performed with all pairs of chemical parameters using the software mentioned above.
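Two of the steps mentioned above are easy to illustrate: the angular (arcsine square root) transformation applied to percentage data before analysis of variance, and a pairwise correlation between chemical parameters. The Python sketch below is a minimal stand-in for the SPSS procedures; it assumes that the correlation coefficient is Pearson's r, which the text does not specify, and the example data are invented.

```python
import math


def angular_transform(percent):
    """Arcsine square-root transformation of a percentage, returned in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


print(round(angular_transform(50.0), 1))   # 45.0
print(round(angular_transform(25.0), 1))   # 30.0

# Invented values for two chemical parameters measured on five genotypes.
cellulose = [20.0, 22.0, 25.0, 27.0, 30.0]
saccharification = [28.0, 30.0, 34.0, 35.0, 40.0]
r = pearson_r(cellulose, saccharification)
```

The transformation stretches values near 0 % and 100 %, which stabilizes the variance of proportion data before ANOVA.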

Availability of data and materials
Supporting data are included as additional files.

Additional files
Additional file 1: Table S1. Fruit yield and harvest index of 37 tomato introgression lines. Table S2. Residual biomass and leaf area of tomato introgression lines. (DOCX 113 kb)
Additional file 2: Table S3. ANOVA analysis on phenotypic and chemical traits. Table S4. Protein families involved in the construction and modification processes of cell wall polysaccharides. Table S5. Tomato cell wall related proteins retrieved in S. pennellii genome region of selected introgression lines. Table S6. List of primers used in qPCR assay. (XLSX 55 kb)
Additional file 3: Figure S1. Correlation between saccharification rate and crystalline cellulose and between saccharification rate and hemicellulose in 13 tomato introgression lines plus M82. (PDF 13 kb)