Genome-Wide Exon-Capture Approach Identifies Genetic Variants of Norway Spruce Genes Associated With Susceptibility to Heterobasidion parviporum Infection

Root and butt rot caused by members of the Heterobasidion annosum species complex is the most economically important disease of conifer trees in boreal forests. Wood decay in the infected trees dramatically decreases their value and causes considerable losses to forest owners. Trees vary in their susceptibility to Heterobasidion infection, but the genetic determinants underlying the variation in the susceptibility are not well-understood. We performed the identification of Norway spruce genes associated with the resistance to Heterobasidion parviporum infection using genome-wide exon-capture approach. Sixty-four clonal Norway spruce lines were phenotyped, and their responses to H. parviporum inoculation were determined by lesion length measurements. Afterwards, the spruce lines were genotyped by targeted resequencing and identification of genetic variants (SNPs). Genome-wide association analysis identified 10 SNPs located within 8 genes as significantly associated with the larger necrotic lesions in response to H. parviporum inoculation. The genetic variants identified in our analysis are potential marker candidates for future screening programs aiming at the differentiation of disease-susceptible and resistant trees.


INTRODUCTION
Norway spruce [Picea abies (L.) H. Karst.] is one of the dominant species in European boreal forests. As a major source of timber, it is one of the most commercially important species in Nordic countries. However, a combination of biotic and abiotic factors pose a threat to sustainable timber supply from Norway spruce plantations (Lindroth and St. Clair, 2013;Oliva et al., 2013;Robert et al., 2013;Lind et al., 2014;Wang et al., 2015). Among microbial pathogens of Norway spruce, necrotrophic fungi of the Heterobasidion annosum species complex occupy an outstanding position due to their devastating effect on commercial plantations (Keriö et al., 2014). Members of this species complex are causative agents of root and butt rot of conifer trees throughout the boreal and temperate zones of Northern hemisphere (Abu et al., 2004;Asiegbu et al., 2005a;Garbelotto and Gonthier, 2013;Gunulf et al., 2013;Skrøppa et al., 2015). They affect growth rate, mortality, and timber quality of conifer trees as well as steadily increase the risk of windfall (Gori et al., 2013;Keriö et al., 2014). The annual losses due to Heterobasidion infections are estimated at e50 million for Finnish forest industry (Gori et al., 2013;Piri and Hamberg, 2015), and at e790 million for European forest owners (Asiegbu et al., 2005a). Based on morphology, ecology, and molecular characters, H. annosum species complex is classified into five species, three of which occur in Europe: H. annosum sensu stricto, H. parviporum, and H. abietinum (Asiegbu et al., 2005a;Garbelotto and Gonthier, 2013;Hansson et al., 2014). Their preferred hosts are Scots pine, Norway spruce, and Silver fir, respectively (Asiegbu et al., 2005a). H. parviporum occurs primarily in forest areas where its major host, Norway spruce, grows naturally (Asiegbu et al., 2005a,b;Lind et al., 2014). The tree infection might occur either through wounds and damaged area on stem or, alternatively, pathogen might move from infected to healthy trees via root contacts (Asiegbu et al., 2005a).
Norway spruce trees possess a repertoire of defenses against microbial pathogens and pests. These defense reactions are commonly classified into physical and chemical defenses. An example of physical defense mechanisms is provided by thick outer bark, which acts as an efficient barrier that most pathogens cannot penetrate. It protects both dead tissues (heartwood) and living tissues (phloem, cambium and sapwood) (Kovalchuk et al., 2013). Chemical defense of spruce trees is achieved via production of diverse secondary metabolites such as flavonoids, lignans, phenolics, stilbenes, and terpenes. Many of them possess antimicrobial activities against potential pathogens, and they could delay or prevent the establishment of pathogens within the tree (Danielsson et al., 2011). Upon recognition of potential pathogens or tissue damage caused by microorganisms or insects, trees activate induced defenses that include cell wall reinforcement, production of phenolics, terpenes and pathogenesis-related (PR) proteins, and formation of traumatic resin ducts (TD), and polyphenolic parenchyma (PP) cells (Arnerup et al., 2013).
Growth rate and wood properties (Cunningham et al., 2006;Zubizarreta-Gerendiain et al., 2009;Luoranen and Viiri, 2016;Levkoev et al., 2017) and impact of climate change (Kolár et al., 2017) have been the major economic traits selected for in Norway spruce breeding programs. Recent studies have also aimed at identifying genes implicated in Norway spruce resistance against H. parviporum infection (Lind et al., 2014). Their results led to the identification of the gene PaLAR3, encoding leucoanthocyanidin reductase, an enzyme of the flavonoid biosynthetic pathway. A low frequency allele of this gene is associated with higher resistance to H. parviporum and increased content of (+)-catechin, a compound showing a fungistatic effect on H. parviporum (Nemesio Gorriz et al., 2016). However, Norway spruce resistance to the infection is a quantitative character, and it is likely to be controlled by multiple loci (Lind et al., 2014). Thus, further studies are required to identify additional genetic determinants of resistance against H. parviporum infection in Norway spruce.
The genome-based discoveries open the way to better understanding the effects of host genes associated with resistance to microbial pathogens. Availability of the draft sequence of Norway spruce genome (Nystedt et al., 2013) provides opportunities for high-throughput identification of genetic variants associated with the resistance to H. parviporum infection. Sequence capture or exon capture is a cost-efficient approach to determine genetic variation in species with a large genome size or without a reference genome having a wide range of scale of the capture from several targeted loci to a million target regions (Grover et al., 2012;Müller et al., 2015). This approach eliminates the need to set up and optimize thousands of PCR reactions for candidate genes, allowing instead for the parallel enrichment and sequencing of thousands of targeted regions. Exon capture has been successfully applied for such species as poplar, eucalyptus, polyploid wheat, polyploid switchgrass, and loblolly pine (Pavy et al., 2016). In the present study, we used the exon capture approach for the identification of Norway spruce genetic loci associated with the resistance or susceptibility to H. parviporum infection. The objectives of this study were: 1) to assess the genetic polymorphism among selected clonal lines of Norway spruce and 2) to identify genetic variants associated with the control of lesion lengths caused by H. parviporum inoculation.

Plant and Fungal Material
The study material comprised 533 Norway spruce saplings. These represented 64 clonal lines which are a sample of the spruce breeding material for Central Finland (7 to 10 ramets per clonal line) ( Table 1). The ortet trees were selected in two 10-yearold Norway spruce progeny trials in Pieksämäki (N 62 • 23 ′ , E 27 • 17 ′ ) for their favorable phenotype (mainly for growth and late flushing). The ortets were propagated by rooted cuttings, which were then grown three seasons at the breeding nursery of the Natural Research Institute Finland. All the clones were untested and previously unexploited in research. Prior to the inoculation experiment, they were cultivated in a greenhouse (

Phenotyping for Susceptibility to H. parviporum Infection
After the 2-month acclimatization, 4 to 7 saplings of each clone were inoculated with H. parviporum, and three saplings of each clone were mock-inoculated to perform as a control. The inoculation method used (Figure 1) was similar to the previously  described one Stenlid, 1995, 1996;Sun et al., 2011;Keriö et al., 2014). The stem surface was sterilized with 70% ethanol before a hole was made through the bark with a 70% ethanol-sterilized puncher (3 mm diameter) to reach the xylem surface. The distance of the holes from the stem base was ∼5 cm. Plugs of H. parviporum-colonized sawdust-containing malt extract agar of the same diameter as the holes were placed in the holes and then covered with Parafilm (Pechiney Plastic Packaging, Chicago, IL, USA). For control saplings, holes were inoculated with uncolonized sawdust-containing malt extract agar plugs. Saplings were harvested 2 months after inoculation. Harvested material was stored at −20 • C until used. The necrotic lesions in phloem and xylem tissues were measured after removing the periderm tissues using a sterilized knife. The saplings dimensions (stem diameter, height, and volume) were also documented.

Statistical Analysis of Phenotyping Data
Data were analyzed using the SPSS version 25.0 (IBM Corporation, New York, USA) and Microsoft Excel 2016. A oneway analysis of variance (ANOVA) was performed to detect differences among the necrosis sizes of each clone and among the inoculation and mock treatments. The normality of the error variances was checked with Levene's test and by skewness and kurtosis values for the data distribution, the values for lesion size were converted to the logarithmic scale (Supplementary Files S1, S2). The effect of the clone on the lesion size was estimated with the following general linear model: where Y ij is the value of ijth observation, µ is the overall mean, c i is the fixed effect of the clone i, and e ij is the random residual term associated with the ith clone and jth ramet. Differences between clones were tested with the least significant difference (LSD) test between the means. The growth rates of H. parviporum were assessed by t-tests for two-sample assuming unequal variances. Pearson's correlation analysis was used to determine correlations of diameter, height, and volume of the tested clonal lines with the sizes of necrotic lesions caused by H. parviporum. Differences were considered as statistically significant if p-value was below the threshold of 0.05.

Estimation of Heritability
Broad-sense heritability was estimated both for the growth traits (height, diameter, and volume of the saplings) and the lengths of various lesions using the following linear model: where y ij is the ijth phenotypic observation of the trait y, µ is the general mean, c i is the random effect ith clone and w ij is the residual effect due to the jth sapling of the ith clone. The variance components of the model effects σ 2 c and σ 2 w were computed using procedure Mixed of the SAS package. Heritability was first calculated separately for both treatments (fungal vs. mock inoculation) as a ratio of the clonal variance to the phenotypic variance: Frontiers in Plant Science | www.frontiersin.org and its standard error as sqrt(Var (σ 2 c ))/(σ 2 c + σ 2 w ) −1 . Secondly, to analyze the magnitude of clone by treatment interaction in the lesion traits we used the mixed effect model: where T j is the fixed effect of the jth treatment (fungal inoculation or mock inoculation), and c i T j is the random interaction effect between ith clone and jth treatment. This enabled us to separate the clone and clone by environment (treatment) interaction components of variance in the estimation of broadsense heritability: The relative magnitude of the interaction variance vs. the total genetic variance (R cT ) was calculated as a simple ratio:

Library Preparation and Target Enrichment Sequencing
Total genomic DNA was extracted from the needles using a standard cetyl-trimethyl ammonium bromide (CTAB) method with modifications described in Terhonen et al. (2011). DNA concentrations were estimated with PicoGreen dsDNA quantification assay (ThermoFisher Scientific, USA) and DNA integrity was analyzed by visualizing the DNA on a 0.8% w/v agarose electrophoresis gel. A total of 500 ng was utilized for construction of libraries compatible with Illumina sequencing machines. In summary, the DNA was mechanically sheared to a mean fragment size of 300 bp, followed by repair of the ends of the molecules, phosphorylation and adenylation. Common adapters suited for Illumina sequencing were ligated on each side of the molecules containing 8 bp indexes (i5 and i7). The ligated libraries were amplified with 10 cycles of PCR with common primers to enrich the libraries for properly ligated molecules and the resulting libraries were quantified with PicoGreen. A set of 80,000 probes was utilized to capture genic regions of the Norway spruce genome (Vidalis et al., in review). In summary, these probes target 34761 genes with a median of two probes per gene, and were selected based on their uniqueness on the genome. Probe sequences were designed based on the P. abies v1.0 genome (Nystedt et al., 2013) and RNA-based, 120 nucleotide long probes were synthesized by Agilent Technologies. Sixteen libraries were equimolarly combined to a total of 750 ng for target enrichment, which was performed following SureSelect Target Enrichment System (Agilent Technologies). Enriched libraries were sequenced using Illumina HiSeq3000 instrument on a 2 × 100 bp sequencing mode. The library construction and enrichment sequencing were performed by RAPiD Genomics (USA). The generated sequence reads were submitted to the GenBank Short Reads Archive (SRA) and are available under the BioProject accession number PRJNA450911.

Variant Identification and Genome-Wide Association Analysis
The generated sequencing data was processed to identify the SNPs segregating in the population. Short reads were filtered for quality and aligned to the scaffolds that contain probes of the P. abies v1.0 genome using Mosaik version 2.1 as described before (Neves et al., 2013). The resulting alignment BAM files were used to identify SNPs with Freebayes v1.0.2 (Garrison and Marth, 2012) on a populational level (parameters: -maxcomplex-gap 1 -theta 0.01 -no-complex -no-mnps -no-indels). The resulting SNPs were filtered using VCF.filter v0.1.13 (Müller et al., 2017) to keep markers with a minimum quality of 10, an average sequencing depth between 15 and 150, a minimum allele frequency of 0.01, a missing data lower than 0.4 and assigning genotypes with sequencing depth lower than 3 as missing.
The VCF file containing the information on the identified SNPs was deposited to the Dryad repository (Mukrimin et al., 2018). The filtered markers for the 64 clonal lines were subsequently used for genome-wide association analysis (GWAS). The average xylem lesion size for each clone was used as a quantitative phenotype for the GWAS, which was carried out using EMMAX (Kang et al., 2010) including the BN kinship matrix in the model to control for relatedness in the population. To control for multiple testing, the Bonferroni and the Benjamini and Hochberg (1995) procedures were used (α = 0.05). The Manhattan and Quantile-Quantile plot (Q-Q plot) were calculated using the qqman package on R (Turner SD. Qqman: an R Package For Visualizing Gwas Results Using Q-Q And Manhattan Plots. 2014; https://doi.org/10.1101/005165). Since the spruce genome is not ordered and fragmented, the pvalues plotted on the Manhattan plot do not represent any physical or genetic distance, but rather each SNP was simply considered a coordinate on the x-axis for visualization purposes (Supplementary File S3).

Variation in Lesion Length in Response to Fungal Inoculation
In our experiment, the lesions formed in response to inoculation with H. parviporum were significantly larger than the lesions developing after mock treatment. The differences between mock and fungal inoculation treatments were observed both in xylem and in phloem, and they were statistically significant (p < 0.001) in both cases ( Table 2).
The lesions developed faster in vertical than in horizontal direction, as indicated by significant differences between Horizontal lesion length in phloem was significantly higher than in xylem (p < 0.001), but there was no significant differences between the tissues in the vertical lesion length. The analyzed spruce clones displayed significant differences in size of necrotic lesions both in phloem and in xylem [F (63, 277) = 2.84 and F (63, 277) = 2.77, respectively at p < 0.001]. The distribution of lesion length among the analyzed clones was similar to normal (Figure 2). The data for the individual clones are summarized in Figure 3. Clonal lines V31094 and V31406 had the largest lesion sizes and were classified as the most susceptible to the infection.
Based on Pearson correlation coefficients (r), there were significant negative correlations between the growth traits (stem diameter, seedling height, and volume) and the observed lesion sizes. The strongest negative correlation was found in inoculated xylem that differed significantly at p < 0.01 ( Table 3).
The heritability estimates, indicating the proportion of clonal variation of the total phenotypic variation in each treatment, were moderate to large, ranging from 0.18 to 0.80. The levels of heritability were consistently larger for the clones subjected to mock treatment than the respective estimates for inoculated trees. The genetic coefficients of variation displayed the similar tendency ( Table 4). The clonal variance used in calculating these heritability values was in fact composed of two components of variation: the true genetic (clonal) and genotype by treatment interaction variances. When these two sources of variation were distinguished by Equation (4), much smaller heritability estimates were obtained both for the lesion traits (0.15-0.23) and the growth traits (0.07-0.13) ( Table 5). The relative magnitude of the genotype by treatment interaction variance (Equation 5) is generally low, but markedly high for phloem horizontal necrosis.

Sequence Capture and Sequencing of Gene Space in Norway Spruce
We genotyped 64 clonal lines of Norway spruce by resequencing of predicted exons corresponding to 34 761 gene models. On average, 6.3 million reads were generated per clonal line. Reads were aligned to Norway spruce reference genome, with the median experiment-wide sequencing depth of 40.35×. The efficiency of the sequence enrichment strategy was high, capturing 79 170 out of 80 000 probes. Calling of genetic variants identified 373 384 high-quality biallelic SNPs.

Identification of Sequence Variants Associated With the Necrotic Lesion Size
Genome-wide association analysis identified 34 genes (36 SNPs) that were significantly associated (at 5% FDR significance level) with the size of necrotic lesions formed in response to fungal inoculation (Supplementary File S4). As we had a limited number of clonal lines available for GWAS analysis, we additionally selected only those variants, which occurred in at least 4 clonal lines. The resulting set included 10 SNPs located within 8 gene models ( Table 6). All of the identified variants were associated with larger necrotic lesions and belonged to low frequency alleles (frequency among studied clonal lines <5%). Three of the identified genes were of particular interest because of their potential involvement in the regulation of plant defense responses: the gene MA_940838g0010 with a similarity to Arabidospis thaliana ILITYHIA, which is required for the establishment of systemic acquired resistance (Monaghan and Li, 2010), the gene MA_4047g0010 encoding a predicted subtilase (subtilisin-like serine peptidase), and the gene MA_3905g0010 with a similarity to the plant-specific class of HD2-type histone deacetylases.

DISCUSSION
In this study, we characterized a set of Norway spruce clonal lines to explore the genetic control of susceptibility to H. parviporum infection. As an indicator of susceptibility, we used size of necrotic lesions developing in response to fungal inoculation. The artificial fungal inoculation of seedlings and saplings has been widely practiced under controlled greenhouse conditions to investigate tree susceptibility to Heterobasidion infection (Swedjemark and Stenlid, 1996;Swedjemark et al., 2001;Sun et al., 2011;Krokene et al., 2012). This approach is much faster than the field experiments (Swedjemark and Stenlid, 1996), but yet much more complex than a potential selection via genetic markers. Inoculation with Heterobasidion results in the pathogen spread in host tissues and in the development of necrotic lesions around the inoculation point.
In this study, the necrotic lesions significantly differed among the tested clones, which concurs with the results of previous studies (Swedjemark and Stenlid, 1996;Swedjemark et al., 2001;Keriö et al., 2014;Skrøppa et al., 2015). Our findings show that saplings with lower values of growth parameters (stem diameter, height, and volume) appeared considerably more sensitive to fungal inoculation and displayed larger necrotic lesions in response to H. parviporum infection. This is in accordance with previous results that reported a strong negative correlation between growth traits and H. parviporum resistance (Karlsson et al., 2008). The investigated clones represented a practically random sample of genotypes from a single breeding population aimed at a specific region (Central Finland). We could not detect any common variables characterizing these clones that would explain the differences observed in their susceptibility.
Heritability is often used in quantitative genetics, behavior genetics, selective breeding, among other fields to quantify the ratio of genetic to phenotypic variation in a specific population and a specific environment. In this study, all the traits showed moderate to high heritability, suggesting a relatively small role of environmental differences in the trait expression when the analyses were performed separately for the two treatments. Interestingly, heritability estimates for traits measured after the mock treatment were consistently higher than the respective estimates on clones subjected to inoculation, suggesting these traits could be indeed considered to be genetically different. In the second analysis for the combined data (including a fixed treatment effect and a random clone by treatment variance) Frontiers in Plant Science | www.frontiersin.org all the traits studied showed moderate heritability. The relative magnitude of the interaction was high for a single trait, phloem horizontal necrosis, but small to moderate for the rest of the Diameter measured 5 cm from the stem's base Significant correlation at *p < 0.05 and **p < 0.01. lesion traits and very small for all of the three growth traits. This kind of discrepancy is interesting, yet difficult to explain. The finding needs to be verified in further studies to make sure that it was not just a random occurrence. GWAS became an important tool in the identification of genetic loci controlling economically important traits in various plant species (Kump et al., 2011;Tian et al., 2011;Li et al., 2013;Wei et al., 2015). Our analysis identified 10 SNPs associated with the control of the size of necrotic lesions developed in response to H. parviporum inoculation. These SNPs correspond to 8 genes. None of the identified genes was previously reported as being implicated in control of resistance of Norway spruce to Heterobasidion infection. At the same time, none of the SNP markers associated with the Norway spruce resistance to H. parviporum in a previous screening program (Lind et al., 2014) was recovered in our analysis. Possible explanations for the lack of overlap of our results with the previous ones could be differences in the employed marker identification strategies; the populations used, whereas in the case of our work a multipedigree breeding population was used vs. a single bi-parental QTL cross used in the referenced study; and the relatively small number of samples employed in both studies.
Our analysis identified several genes, which might be involved in the regulation of plant defense responses in Norway spruce.  The identification of a gene showing similarity to A. thaliana ILITYHIA (ILA) in our study is particularly noteworthy, as the corresponding protein was demonstrated to play a key role in the induced defense reactions in Arabidopsis (Monaghan and Li, 2010). Furthermore, by controlling chloroplast development, it might be implicated in the regulation of the ROS accumulation and programmed cell death in response to a pathogen attack (Faus et al., 2018). However, nothing is known about the function of this protein in other plant species, including conifer trees. The role of this gene in defense reactions of Norway spruce against H. parviporum infection deserves further studies. The gene MA_4047g0010 encodes a predicted subtilase (subtilisin-like serine peptidase). This protein family is largely expanded in plant genomes, and their physiological functions include the control of growth and development, regulation of programmed cell death and senescence and plant responses to biotic and abiotic stressors (Schaller et al., 2018). Several plant subtilases are important modulators of plant immune responses (Tornero et al., 1996;Ramírez et al., 2013;Serrano et al., 2016). A specific group of subtilases, called phytaspases, are involved in the regulation of programmed cell death (hypersensitive response) triggered by the pathogen attack (Chichkova et al., 2004;Schaller et al., 2018). Our transcriptomic profiling of asymptomatic and symptomatic Norway spruce trees naturally infected by Heterobasidion sp. showed that the gene MA_4047g0010 had higher expression level in symptomatic trees, with the p value (p = 0.06) close to the significance threshold (Supplementary File S5). However, taking into account a large number of the predicted subtilase genes in spruce genome, it is difficult to predict the specific role of the identified gene, and its characterization would require further experimental work.
The MA_3905g0010, identified in our analysis, shows similarity to the plant-specific class of HD2 histone deacetylases. Members of this class play a role, among others, in ABA and abiotic stress responses (Sridha and Wu, 2006;Luo et al., 2012;Han et al., 2016). They might potentially contribute to the ABAmediated regulation of programmed cell death, which, in turn, might be one of the factors determining sensitivity or resistance to necrotrophic pathogens (Coll et al., 2011).
The connection of the remaining genes with the responses of spruce trees to fungal infection are less clear. No obvious link between their deduced function and the response of spruce trees against fungal infection could be established. However, it is possible that identified markers are physically linked with neighboring genes, located in the adjacent chromosomal loci. Unfortunately, the draft genome assembly of Norway spruce is unordered and highly fragmented (N50 = 4,869 bp; Nystedt et al., 2013), which makes it not always possible to find genes located adjacent to the identified SNP markers.
The identified genetic variants represent low frequency alleles (occurrence frequency among analyzed clonal lines <5%). This observation indicates that more extensive genotyping programs involving several hundreds of individuals might be required to identify the most relevant alleles controlling responses of Norway spruce to H. parviporum infection. In parallel, genomic selection approaches can be used to advance the selection of tolerant trees in the breeding program, as these methods do not rely on the dissection of quantitative traits to rank the most favorable trees.
This study is the first attempt to identify genes involved in the control of Norway spruce resistance to H. annosum infection using sequence capture enrichment strategy. Obtained results illustrate the suitability of this approach for genotyping of Norway spruce, a species with a very large genome and a high content of repetitive elements. Some of the identified genes are promising candidates, and further analysis should investigate their role in spruce defense reactions. However, our study also illustrates that even larger efforts might be required to identify genetic variants controlling economically important traits and occurring in low frequency in natural populations.

AUTHOR CONTRIBUTIONS
Phenotyping of Norway spruce saplings was performed by MM and EJ; total genomic DNA extraction was done by MM and EJ; genetic and genome-wide association analyses were done by LN; heritability analysis was carried out by MH; MM, AK, and LN analyzed the data; MM and AK prepared the manuscript draft; MK and FA conceived the study and contributed to the experiment design. All authors read and approved the final version of the manuscript.

FUNDING
This study was supported by funding from the Academy of Finland (Academy of Finland grant nr. 276862).