Main

The 65-megabase genome of Laccaria bicolor (Maire) P. D. Orton is the largest sequenced fungal genome published so far3,4,5,6,7 (Table 1). Although no evidence for large-scale duplications was observed within the L. bicolor genome, tandem duplication occurred within multigene families (Supplementary Fig. 4). Transposable elements comprised a higher proportion (21%) than that identified in the other sequenced fungal genomes and may therefore account for the relatively large genome of L. bicolor (Supplementary Table 3). Approximately 20,000 protein-coding genes were identified by combined gene predictions (Supplementary Information Section 2). Expression of nearly 80% (16,000) of the predicted genes was detected in free-living mycelium, ectomycorrhizal root tips or fruiting bodies (Supplementary Table 4) using NimbleGen custom-oligoarrays (Supplementary Information Section 9). Most genes are activated in almost all tissues, whereas other more specialized genes are only activated in some specific developmental stages, such as the free-living mycelium, ectomycorrhizae or the fruiting body (Supplementary Table 5).

Table 1 Genome characteristics of L. bicolor and other basidiomycetes

Only 14,464 L. bicolor proteins (70%) showed sequence similarity (BLASTX, cut-off e-value <0.001) to documented proteins. Most homologues were found in the sequenced basidiomycetes Phanerochaete chrysosporium4, Cryptococcus neoformans5, Ustilago maydis6 and Coprinopsis cinerea7 (Supplementary Table 6). The percentage of proteins found in multigene families was related to genome size and was the largest in L. bicolor (Fig. 2). This was mainly owing to the expansion of protein family size, but was also because of the larger number of protein families in L. bicolor compared with the other basidiomycetes (Supplementary Table 7). Expansion of protein family sizes in L. bicolor was prominent in the lineage-specific multigene families. Marked gene family expansions occurred in those genes predicted to have roles in protein–protein interactions (for example, WD40-domain-containing proteins) and in signal transduction mechanisms (Supplementary Table 7). Two new classes of GTPase α genes were found and may be candidates for the complex communication that must occur between the mycobiont and its host plant during mycorrhizae establishment (Supplementary Table 8). Several transcripts coding for expanded and lineage-specific gene families were upregulated in symbiotic and fruiting body tissues, suggesting a role in tissue differentiation (Supplementary Tables 5 and 9).
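The similarity criterion above can be illustrated with a minimal sketch. It assumes BLAST results in the standard 12-column tabular layout, in which the e-value is the eleventh field, and it assumes that a protein counts as having a documented homologue when at least one hit has an e-value at or below 0.001. The function name, the toy hit lines and the gene identifiers are hypothetical and are not taken from the authors' pipeline.

```python
def queries_with_hits(tabular_lines, max_evalue=1e-3):
    """Return the query IDs that have at least one hit at or below max_evalue.

    Assumes the 12-column BLAST tabular layout (e-value in field 11).
    """
    hits = set()
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits


# Hypothetical toy hits: geneA has two significant hits, geneB has none.
example = [
    "geneA\tprot1\t45.0\t200\t110\t2\t1\t200\t5\t204\t1e-30\t120",
    "geneB\tprot2\t30.0\t50\t35\t1\t1\t50\t10\t59\t0.5\t25",
    "geneA\tprot3\t40.0\t180\t108\t3\t1\t180\t2\t181\t2e-10\t80",
]
matched = queries_with_hits(example)

# Fraction reported in the text: 14,464 of the 20,614 predicted gene models.
fraction_with_similarity = 14464 / 20614
```

With the counts given in the text and in the Online Methods, the fraction evaluates to roughly 0.70, in line with the reported 70%.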

Figure 2: Expansion of protein families in L. bicolor.

a, Relationship between genome size and the number of protein families. b, Relationship between genome size and protein family sizes in five sequenced basidiomycetes. Protein sequences predicted from the genome sequences of L. bicolor, C. cinerea, P. chrysosporium, C. neoformans and U. maydis were clustered into families using the TRIBE-MCL algorithm (see Supplementary Information Section 5 for details).
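For readers unfamiliar with Markov clustering, the principle behind TRIBE-MCL can be sketched as follows. This is a minimal pure-Python illustration of the expansion and inflation steps on a toy similarity matrix, not the TRIBE-MCL implementation itself. The inflation value of 2.0, the self-loop weight, the convergence tolerance and the toy matrix are assumptions chosen for illustration only.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total else 0.0
    return out


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[v ** r for v in row] for row in m])


def mcl(similarity, inflation=2.0, iterations=50, tol=1e-9):
    """Cluster nodes of a symmetric similarity matrix by Markov clustering."""
    n = len(similarity)
    # Add self-loops (assumed weight 1.0) before normalizing.
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        new = inflate(expand(m), inflation)
        delta = max(abs(new[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = new
        if delta < tol:
            break
    # Each row with non-zero entries lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy example: proteins 0-2 are mutually similar, as are proteins 3-4.
toy = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
families = mcl(toy)
family_sizes = [len(f) for f in families]
```

In this toy case the two groups share no similarity, so the procedure returns one family of three proteins and one family of two. Counting families and averaging their sizes per genome gives the kind of quantities plotted in panels a and b.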

In our analysis of annotated genes, in particular that of paralogous gene families, we highlighted processes that may be related to the biotrophic and saprotrophic lifestyles of L. bicolor. Twelve predicted proteins showed similarity to known haustoria-expressed secreted proteins of the basidiomycetous rusts Uromyces fabae8 and Melampsora lini9, which are involved in pathogenesis (Supplementary Table 10). Out of the 2,931 proteins predicted to be secreted by L. bicolor, most (67%) cannot be ascribed a function, and 82% of these predicted proteins are specific to L. bicolor. Within this set, we found a large number of genes that encode cysteine-rich products that have a predicted size of <300 amino acids. Of these 278 small secreted proteins (SSPs), 69% belong to multigene families, but only nine groups comprising a total of 33 SSPs co-localized in the genome (Supplementary Fig. 5). The structure of two of these clusters is shown in Supplementary Fig. 6. Other SSPs are scattered all over the genome, and we found no correlation between SSP and transposable element genome localization (Supplementary Fig. 5). Transcript profiling revealed that the expression of several SSP genes is specifically induced in the symbiotic interaction (Table 2 and Supplementary Fig. 10). Five of the 20 most highly upregulated fungal transcripts in ectomycorrhizal root tips code for SSPs (Supplementary Table 5). These mycorrhiza-induced cysteine-rich SSPs (MISSPs) belong to L. bicolor-specific orphan gene families. Within the MISSPs, we found a family of secreted proteins with a CFEM domain (INTERPRO IPR014005) (Supplementary Figs 7 and 8), as previously identified in the plant pathogenic fungi M. lini9 and Magnaporthe grisea10 (Supplementary Table 10), and proteins with a gonadotropin (IPR0001545) or snake-toxin-like (SSF57302) domains related to the cysteine-knot domain. Expression of several SSPs was downregulated in ectomycorrhizal root tips (cluster E in Supplementary Fig. 10), suggesting a complex interplay between these secreted proteins in the symbiotic interaction.

Table 2 Changes in expression of transcripts coding for MISSPs

The rich assortment of MISSPs may therefore act as effector proteins to manipulate host cell signalling or to suppress defence pathways during infection, as suggested for pathogenic rusts8,9, smuts6 (U. maydis) and Phytophthora11 species. To have a role in symbiosis development, MISSPs should be expressed in L. bicolor hyphae colonizing the root tips. To test this assertion, we determined the tissue distribution of the mycorrhiza-induced cysteine-rich SSP of 7 kDa (MISSP7) (JGI identification number 298595) showing the highest induction in ectomycorrhizal tips (Table 2 and Supplementary Table 5). Two peptides, one of which is located in the amino-terminal and the other in the carboxy-terminal part of the mature protein, were selected as antigens for the production of anti-MISSP7 antibodies. The selected peptides were not found in the deduced protein sequences of other L. bicolor gene models, nor in the Populus trichocarpa genome12. MISSP7 localization in L. bicolor–P. trichocarpa ectomycorrhizal root tips by indirect immunofluorescence is illustrated in Fig. 1 and Supplementary Fig. 11. Control images, obtained by replacing the primary anti-MISSP7 antibodies with pre-immune immunoglobulin G (IgG) on ectomycorrhizal sections, are shown in Supplementary Fig. 12. When ectomycorrhizae were treated with anti-MISSP7 antibody followed by fluorescent-labelled secondary antibody, fluorescence was localized in the hyphae colonizing short roots (Fig. 1 and Supplementary Fig. 11) and was not detected in the free-living mycelium (Supplementary Fig. 12). Although MISSP7 was detected in the hyphal mantle layers ensheathing the root tips, the protein mainly accumulated in the finger-like, labyrinthine branch hyphal system (Hartig net), which provides a very large area of contact between cells of the two symbionts. It accumulated in the cytosol and cell wall of the fungal cells. The MISSP7 protein could therefore interact with the plant components after secretion. MISSP7 shares no sequence similarity or protein motif with other SSPs.

Comparison of the MISSP sequences did not reveal a specific conserved motif that could potentially contribute to their function or to targeting to the host cell, such as the RXLR motif11 of phytopathogenic Phytophthora or the malaria parasite. SSPs with upregulated expression in fruiting bodies (Supplementary Table 5 and Supplementary Fig. 10) may have a role in the differentiation of the sexual tissues and/or the aggregation of sporophore tissues. Interestingly, there is a large set of SSP genes showing significant changes in gene expression in both ectomycorrhizal root tips and fruiting bodies (cluster A in Supplementary Fig. 10), suggesting that both developmental processes recruit similar gene networks (for example, those involved in hyphal aggregation).

Host trees are able to harness the formidable web of mycorrhizal hyphae (which permeates the soil and decaying leaf litter) for their nutritional benefit. A process that is pivotal to the success of ectomycorrhizal interactions is therefore the equitable exchange of nutrients between the symbiont and its host plant1,2,13. A comparison with other basidiomycetes (Supplementary Table 12) revealed that the total number of predicted transporters is larger in L. bicolor compared with C. cinerea and P. chrysosporium. Interestingly, L. bicolor has multiple ammonia transporters, although it encodes a single nitrate permease. Ammonia is arguably the most important inorganic nitrogen source for ectomycorrhizal fungi14. One of the ammonia transporters (AMT2.2), for instance, is greatly upregulated in ectomycorrhizae (Supplementary Table 5). Therefore, L. bicolor shows an increased genetic potential in terms of nitrogen uptake compared with other basidiomycetes. These capabilities are consistent with L. bicolor being exposed to a range of nitrogen sources from the decay of organic matter15.

Although the L. bicolor genome contains numerous genes coding for key hydrolytic enzymes, such as proteases and lipases, we observed an extreme reduction in the number of enzymes involved in the degradation of plant cell wall (PCW) oligosaccharides and polysaccharides. Glycoside hydrolases, glycosyltransferases, polysaccharide lyases, carbohydrate esterases and their ancillary carbohydrate-binding modules were identified using the carbohydrate-active enzyme (CAZyme) classification (http://www.cazy.org/). A comparison of the L. bicolor candidate CAZymes with fungal phytopathogens confirms the adaptation of its enzyme repertoire to symbiosis and reveals the strategy used for the interaction with the host (Supplementary Tables 13 and 14). The reduction in PCW CAZymes affects almost all glycoside hydrolase families, culminating in the complete absence of several key families. For instance, there is only one candidate cellulase (glycoside hydrolase 5, GH5) appended to the sole fungal cellulose-binding module (CBM1) found in the genome, and no cellulases from families GH6 and GH7 (Supplementary Table 14). Similar reductions or loss of hemicellulose- and pectin-degrading enzymes were also noted. These observations suggest that the inventory of L. bicolor PCW-degrading enzymes underwent massive gene loss as a result of its adaptation to a symbiotic lifestyle, and that this species is now unable to use many PCW polysaccharides as a carbon source, including those found in soil and leaf litter. The remaining small set of secreted CAZymes with potential action on plant polysaccharides (for example, GH28 polygalacturonases) is probably required for cell wall remodelling during fungal tissue differentiation because their expression was upregulated in both fruiting bodies and ectomycorrhizae (Supplementary Table 15 and Supplementary Fig. 13). By contrast, transcripts coding for proteins with an expansin domain were only induced in ectomycorrhizae, suggesting they may be used by L. bicolor for penetrating into the root apoplastic space.

To survive before its mycorrhizal association with its host, L. bicolor seems to have developed a capacity to degrade non-plant (for example, animal and bacterial) oligosaccharides and polysaccharides; this is suggested by retention of CAZymes from families GH79, polysaccharide lyase 8 (PL8), PL14 and GH88 (Supplementary Table 14). Interestingly, there is no invertase gene in the L. bicolor genome, implying that this fungus is unable to use sucrose directly from the plant. This is consistent with earlier observations16 that L. bicolor depends on its host plant to provide glucose in exchange for nitrogen. We also noticed an expansion of CAZymes involved in the fungal cell wall biosynthesis and rearrangement, almost entirely owing to an increased number of putative chitin synthases and enzymes acting on β-glucans (Supplementary Table 14). Several of the corresponding genes are up- or down-regulated in developmental processes requiring cell wall alterations such as formation of fruiting bodies or mycorrhizae (Supplementary Table 15 and Supplementary Fig. 13).

Ectomycorrhizal fungi have an important role in mobilizing nitrogen from well-decomposed organic matter2,15. The hyphal network permeating the soil might therefore be expected to express a wide diversity of proteolytic enzymes. The total number of secreted proteases (116 members) identified (Supplementary Fig. 14) is relatively large compared with that in other sequenced saprotrophic basidiomycetes, such as C. cinerea and P. chrysosporium. Secreted aspartyl-, metallo- and serine-proteases may have a role in degradation of decomposing litter15, confirming that L. bicolor also has the ability to use nitrogen of animal origin, as suggested previously17. They may also have a role in developmental processes because the expression of several secreted proteases is up- or down-regulated in fruiting bodies and ectomycorrhizal root tips (Supplementary Table 16). Mycelial mats formed by L. bicolor hyphae colonizing organic matter therefore possess the ability to degrade proteins from decomposing leaf litter.

Our analysis of the gene space reveals a multi-faceted mutualistic biotroph equipped to take advantage of transient occurrences of high-nutrient niches (living host roots and decaying soil organic matter) within a heterogeneous, low-nutrient environment. The availability of genomes from mutualistic, saprotrophic4 and pathogenic6 fungi, as well as from the mycorrhizal tree P. trichocarpa12, now provides an unparalleled opportunity to develop a deeper understanding of the processes by which fungi colonize wood and soil litter, and also interact with living plants within their ecosystem, to perform vital functions in the carbon and nitrogen cycles2 that are fundamental to sustainable plant productivity.

Methods Summary

Genomic sequence

Scaffolds and assemblies for all genomic sequences generated by this project are also available from the Joint Genome Institute (JGI) portal (http://genome.jgi-psf.org/Lacbi1/Lacbi1.download.ftp.html). A genome browser is available from JGI (http://www.jgi.doe.gov/laccaria). BLAST search of the genome is available at JGI (http://www.jgi.doe.gov/laccaria) and INRA LaccariaDB (http://mycor.nancy.inra.fr/IMGC/LaccariaGenome/).

Predicted gene models

Consensus gene predictions, produced by combining several different gene predictors, are available from JGI (http://www.jgi.doe.gov/laccaria) as General Feature Format (GFF) files. These gene models can also be accessed from the Genome Browser in the JGI L. bicolor portal (http://www.jgi.doe.gov/laccaria).

Gene annotations

Tables compiling KEGG, PFAM, KOG and best BLAST hits for predicted gene models, transposable element and CAZyme data, as well as Tribe-MCL gene families, are available from INRA LaccariaDB (http://mycor.nancy.inra.fr/IMGC/LaccariaGenome/).

Online Methods

Genome sequencing

The haploid genome of the strain S238N-H82 from L. bicolor (Maire) P. D. Orton was sequenced with the use of a whole-genome shotgun strategy. All data were generated by paired-end sequencing of cloned inserts using Sanger technology on ABI3730xl sequencers. Supplementary Table 1 gives the number of reads obtained per library.

Genome assembly

The data were assembled using release 1.0.1b of JAZZ, a JGI whole-genome shotgun assembler. On the basis of the number of alignments per read, the main genome scaffolds were at a depth of 9.88. The amount of sequence in the unplaced reads was 6.5 Mb, which is sufficient to cover the main-genome gaps to a mean depth of 9.9. A total of 64.9 Mb are captured in the scaffold assembly (Supplementary Table 2).
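The statement about unplaced reads can be read as a simple ratio, sketched below. The sketch assumes that "cover the gaps to a mean depth of 9.9" means the unplaced sequence divided by the total gap length equals 9.9. The implied gap length is an inference from the two reported figures, not a number stated by the authors, and the function name is hypothetical.

```python
def implied_gap_length_mb(unplaced_mb, gap_depth):
    """Total gap length (Mb) that unplaced_mb of sequence covers at gap_depth."""
    return unplaced_mb / gap_depth


# Values reported in the text: 6.5 Mb of unplaced reads, mean depth of 9.9.
gap_mb = implied_gap_length_mb(6.5, 9.9)
```

Under this reading, the main-genome gaps would total roughly 0.66 Mb, a small amount relative to the 64.9 Mb scaffold assembly.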

Genome annotation

Gene models were predicted using FgenesH18, homology-based FgenesH+ (ref. 18) and Genewise19, as well as EuGène20 and TwinScan21, and alignments of several complementary DNA resources (Supplementary Information Section 3). The JGI pipeline selected a best representative gene model for each locus on the basis of expressed sequence tag support and similarity to known proteins from other organisms, and predicted 20,614 protein-coding gene models. All predicted genes were annotated using Gene Ontology22, eukaryotic clusters of orthologous groups23 and KEGG pathways24. Protein domains were predicted using InterProScan25. Signal peptides were predicted in 2,931 L. bicolor proteins by both the hidden Markov and the neural network algorithms of SignalP26. After eliminating predicted transmembrane proteins and removal of transposable element fragments, we selected 278 cysteine-rich secreted proteins with a size of <300 amino acids. Gene families were built from proteins in L. bicolor, C. cinerea, P. chrysosporium, C. neoformans and U. maydis using Tribe-MCL tools27 with default settings.
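The selection of cysteine-rich small secreted proteins described above can be sketched as a sequence of filters. This is a minimal illustration under stated assumptions: the record fields, the function names and the toy proteins are hypothetical, and the cysteine threshold (min_cys_fraction) is an invented placeholder because the text gives no numerical definition of "cysteine-rich". Only the requirement for both SignalP algorithms, the exclusion of transmembrane proteins and transposable element fragments, and the <300 amino acid limit come from the text.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    name: str
    sequence: str
    signalp_hmm: bool      # signal peptide predicted by the hidden Markov model
    signalp_nn: bool       # signal peptide predicted by the neural network
    transmembrane: bool    # predicted transmembrane protein
    te_fragment: bool      # overlaps a transposable element fragment


def cysteine_fraction(sequence):
    """Fraction of residues that are cysteine (C)."""
    return sequence.upper().count("C") / len(sequence) if sequence else 0.0


def is_candidate_ssp(p, max_length=300, min_cys_fraction=0.03):
    """Apply the filters in order; min_cys_fraction is a hypothetical cut-off."""
    return (
        p.signalp_hmm and p.signalp_nn          # both SignalP algorithms agree
        and not p.transmembrane
        and not p.te_fragment
        and len(p.sequence) < max_length        # size of <300 amino acids
        and cysteine_fraction(p.sequence) >= min_cys_fraction
    )


# Hypothetical toy records, for illustration only.
proteins = [
    Protein("toy_ssp", "MKCLLCAACPGCSC", True, True, False, False),
    Protein("too_long", "MA" * 200, True, True, False, False),
    Protein("membrane", "MKCLLCAACPGCSC", True, True, True, False),
    Protein("one_method_only", "MKCLLCAACPGCSC", True, False, False, False),
]
selected = [p.name for p in proteins if is_candidate_ssp(p)]
```

Requiring agreement between both signal-peptide predictors makes the secreted set conservative, which fits the reported reduction from 2,931 predicted secreted proteins to 278 SSPs once the remaining filters are applied.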

Indirect immunofluorescent localization of MISSP7

The peptides LRALGQASQGGDLHR and GPIPNAVFRRVPEPNF located in the N-terminal and C-terminal parts of the MISSP7 sequence (without the signal peptide) were synthesized and used as antigens for the generation of antibodies in rabbits according to the manufacturer’s procedures (Eurogentec). The anti-MISSP7 IgG fraction was purified using the MAbTrap kit (GE Healthcare) according to the manufacturer’s recommendations. Subsequently, the IgG-containing fraction was desalted using a HiTrap desalting column (GE Healthcare). The concentration of purified IgG from pre-immune serum was determined by Bradford assay using a Bio-Rad protein assay. The final concentration of anti-MISSP7 IgG was 0.16 mg ml⁻¹. Immunolocalization was performed essentially as described in refs 28 and 29, with slight modifications (Supplementary Information Section 10).

Gene expression

Average expression levels of genes in different tissues and conditions were analysed using CyberT statistical framework (http://www.igb.uci.edu/servers/cybert/) and hierarchical clustering with EPCLUST (http://ep.ebi.ac.uk/EP/EPCLUST/) (Supplementary Information Section 8).