AvrPm2 encodes an RNase‐like avirulence effector which is conserved in the two different specialized forms of wheat and rye powdery mildew fungus

Summary There is a large diversity of genetically defined resistance genes in bread wheat against the powdery mildew pathogen Blumeria graminis (B. g.) f. sp. tritici. Many confer race‐specific resistance to this pathogen, but until now only the mildew avirulence gene AvrPm3 a2/f2 that is recognized by Pm3a/f was known molecularly. We performed map‐based cloning and genome‐wide association studies to isolate a candidate for the mildew avirulence gene AvrPm2. We then used transient expression assays in Nicotiana benthamiana to demonstrate specific and strong recognition of AvrPm2 by Pm2. The virulent AvrPm2 allele arose from a conserved 12 kb deletion, while there is no protein sequence diversity in the gene pool of avirulent B. g. tritici isolates. We found one polymorphic AvrPm2 allele in B. g. triticale and one orthologue in B. g. secalis and both are recognized by Pm2. AvrPm2 belongs to a small gene family encoding structurally conserved RNase‐like effectors, including Avr a13 from B. g. hordei, the cognate Avr of the barley resistance gene Mla13. These results demonstrate the conservation of functional avirulence genes in two cereal powdery mildews specialized on different hosts, thus providing a possible explanation for successful introgression of resistance genes from rye or other grass relatives to wheat.


Fig. S1
Primer efficiency curves for qRT-PCR primers for AvrPm2.               Table S9 BLAST hits of PM2 against wheat and barley databases.

Table S10
Results of the ANOVA for HR intensity in HR quantification assays.

Table S11
Results of TukeyHSD post-hoc test.

Table S12
Signal peptide and disulfide bond predictions in the AVRPM2 family.

Table S13
Structure prediction of the AVRPM2 in B.g. tritici and B.g. hordei.
Methods S1 BAC selection and sequencing.

Methods S3 RNA extraction and qRT-PCR assays
Methods S4 Genome-wide effector annotation and AvrPm2 family identification in different formae speciales.

Methods S5 In silico analysis of proteins encoded by Pm2 and AvrPm2
Notes S1 Sequences of the Blumeria graminis genes used for transient expression in   (Wicker et al., 2013).  The deleted sequence in JIW2 as identified by coverage analysis is indicated with a black frame (Fig 2). (c) PCR validation of the AvrPm2 deletion in three isolates. Primers were designed every 500 bp and spanning 50 kb were tested in the two parental isolates 96224 (avirulent), JIW2 (virulent) and in a third isolate (94202, avirulent). Black lines show the region where amplifications were obtained in the respective isolates. In the virulent isolate JIW2, a region of about 12 kb (508-520 kb) is deleted. A few unspecific amplifications were obtained inside the deleted segment from sequences corresponding to transposable elements.  construct harboring two HA tags (C-and N-terminal) was co-expressed with AvrPm2 but did not give HR. In the same assays, AvrPm3 a2/f2 was co-expressed with Pm3a as a positive control, and the hypersensitive cell death was weaker than the one resulting from co-infiltration of AvrPm2 and Pm2. AvrPm2 was also co-expressed with Pm3a which and did not result in HR. (b) The AvrPm2 construct encoding for the full native protein with the signal peptide was coexpressed with Pm2 and the observed hypersensitive response was weaker than in the infiltrations where the version without signal peptide was used. The AvrPm2 family members BgtE-5843 and BgtE-5846 were also co-expressed with Pm2 in N. benthamiana which did not result in HR. (c, d) Representative leaves from HR quantification assays (Fig. 3d) where the cell death response resulting from co-infiltrations of AvrPm2 and Pm2 constructs was compared to that resulting from co-infiltrations of AvrPm3 a2/f2 and Pm3a in 4:1 Avr:R ratio (c) and in 2:1 Avr:R ratio (d). that differs from AvrPm2 by only two amino acid polymorphisms, was co-expressed with Pm2 and a strong HR was observed. The two variants containing either one of the two SNPs present in BgsE-5845 (BgtE-5845_T62I and BgtE-5845_R118C) were also co-infiltrated with Pm2 and HR was observed. Results in (a-f) were consistent across independent replicates from at least three experiments where four to eight leaves were assayed.                 Indicates the alignment score falling between 0 and the sequence length. 3 Indicates the likelihood of a predicted model being worse than the best of a set of randomly-generated models for this protein. For mainly alpha helixes, P-value less than 10^-3 is a good indicator. For manly beta sheets, P-value less than 10^-4 is a good indicator 4 Indicates uGDT and GDT where uGDT is the unnormalized GDT (Global Distance Test) score defined as 1*N(1)+0.75*N(2)+0.5*N(4)+0.25*N(8), where N(x) is the number of residues with estimated modeling error (in Å) smaller than x. GDT is calculated as uGDT divided by the protein length and multiplied by a 100. For a protein with >100 residues, uGDT>50 is a good indicator. For a protein with <100 residues, GDT>50 is a good indicator. 5 Is the length of the protein in amino acids.

Methods S1 BAC selection and sequencing
The B.g. tritici BAC library was previously produced from isolate 96224 as described by Oberhaensli and colleagues (2011). BACs were selected based on their positions in the BAC fingerprint contig (FPC) and in the linear topological contig (LTC) assemblies (Frenkel et al., 2010). 3D-DNA pools of the library were screened by PCR using genetic and physical markers.
Candidate clones were confirmed by PCR and plasmids were extracted using the Qiagen Large- assembly 2: distance between paired reads = 100 to 350bp, word size = 19bp, minimum contig length = 1000 bp. Assembly 1 resulted in large contigs that were used to order the smaller contigs obtained from assembly 2. The AvrPm2 assembled locus was integrated into contig 51 of the reference genome (Wicker et al., 2013). The physical markers used to confirm the BAC assembly are listed in Table S7.

Methods S2 Illumina whole-genome resequencing
Genomic DNA of 22 Chinese isolates selected for resequencing was extracted from fresh conidiospores using a CTAB method (Ristaino et al., 1998) with minor modifications. Briefly, 240 mg conidia in 700 µl extraction buffer was vortexed with glass beads for 3 min. 40 µl 20% SDS, 65 µl 500 µg ml -1 proteinase K and 2 µl 500 µg ml -1 RNase was added and incubated at 65C for 1 h to digest proteins and RNA. Then 700 µl phenol-chloroform was added and mixed followed by centrifugation at 13,201 g for 10 min. DNA in the supernatant was precipitated with isopropanol overnight at -20C and then collected with centrifugation at 12,000 rpm for 10 min. Paired-end 90 (PE90) and paired-end 100 (PE100) sequencing strategies were employed by BGI and Novogene, respectively (Table S9).

Methods S3 RNA extraction and qRT-PCR assays
Samples of susceptible wheat leaf segments (cv 'Chinese Spring') infected with the isolates 96224 were collected every 24 h and frozen in liquid nitrogen. We performed three individual infections corresponding to the three biological replicates. The samples were grinded and total RNA was extracted using the Qiagen miRNeasy Mini Kit (50) according to manufacturer's instructions. RNA was quantified using a spectrophotometer (NanoDrop, Thermo Fisher Scientific). RNA quality was assessed by gel electrophoresis and based on the 260/280 ratios obtained with the spectrophotometer (Supporting Information Table S1). Samples with a ratio 2.0 -2.2 were used for cDNA synthesis. All RNA samples were stored at -80°C until cDNA synthesis, usually the day after extraction. RT-qPCR ready cDNA was synthesized from 1.5 µg total RNA using the iScript cDNA synthesis kit (Bio-Rad) according to the manufacturer's instructions.
We have used Glyceraldehyde 3-phosphate dehydrogenase (Gapdh) and Actin (Act1), two commonly known housekeeping genes in fungi. Both genes were previously validated as stable internal controls for gene expression studies in Blumeria graminis by Bourras et al. (2015)and Pennington et al. (2016). All our gene expression data is normalized to Gapdh, in consistence with previous work in wheat powdery mildew (Bourras et al., 2015). For AvrPm3 a2/f2 , we used the primers published in Bourras et al. (2015). Primers for AvrPm2 were designed after in silico testing by blast searches against the Blumeria graminis f.sp tritici. The length of the amplicon is 132 bp (see Supporting Information Table S2). Primer specificity was additionally tested by gel electrophoresis analysis of PCR amplicons from genomic DNA and cDNA. Primer efficiencies were tested with series of 4 dilutions (1x, 4x, 16x and 64x) using the Bio-Rad Real-Time PCR CFX Manager Software version 3.5 and the curve for AvrPm2 primer efficiency is available in Supporting information Fig. S1.
We used the SsoFast EvaGreen Supermix chemistry (BioRad) according the manufacturer's instructions. Each reaction was done in a 10 µl volume, with 4 µl of qRT-PCR ready cDNA, and a final primer concentration of 500 nM. We used 10 µl of final reaction volume. All assays were done in a BioRad C1000 Touch thermal cycler, equipped with the CFX96 Real-Time PCR System, with the following program:

Methods S4 Genome-wide effector annotation and AvrPm2 family identification in different formae speciales
To identify the AvrPm2 effector family we performed a de novo annotation of the B.g. tritici genome and redefined the predictions of effector families. We also assembled de novo the B.g.
Transcriptome assembly resulted in 16,440 contigs which were used subsequently. We used the B.g. tritici gene database (Wicker et al., 2013) as query in blast searches against the B.g. tritici genome and kept all hits with at least 50 amino acids in length and 20% identity at the protein level. If no genes were previously annotated in the aligned region we considered an additional window of ± 500 bp up-and downstream. In a second step we blasted every window against the transcriptome and kept hits with more that 95% identity at the nucleotide level. If the alignment had an open reading frame (ORF) we extended the sequence, using the transcriptome as template, until the closest start and stop codons. If in the 5' direction, we found a stop codon before finding a start codon we retained the original aligned sequence without extending it in the 5' direction. Results were further parsed to avoid redundant annotations. To exclude repetitive elements we used every new gene as query in blast search against the Blumeria and Triticeae transposable elements (TE) databases (Bg +PTREP12) (Wicker et al., 2013), http://wheat.pw.usda.gov/ITMI/Repeats/, accessed 01/03/2014), and hits with blastp e-value ≤ 10 -5 were excluded. This annotation produced a large number of gene fragments. We retained only ORFs longer than 50 amino acids and we then performed further manual curation excluding all the genes that could be aligned more than 10 times in the genome by gmap (version 2013-07-20) (Wu & Nacu, 2010). This process was iterated 9 times with the newly identified genes. To cluster proteins in families we performed "all against all" blast searches using proteins sequences form B.g. hordei, B.g. tritici, N. crassa and P. anserina. Hits with e-value greater than 0.001 and alignment length shorter than 70% of the query length were excluded. Protein families were generated with the markov cluster algorithm implemented in the MCL software (with option -I 6) (Enright et al., 2002). We then defined secreted genes as all genes which encode a protein with a predicted signal peptide but do not have a trans-membrane domain (predicted with signalp 4.1) (Petersen et al., 2011). Finally, we defined as candidate secreted effector proteins In this annotation we have also included genes lacking a predictable signal peptide but showing homology to actual CSEPs. We than re-clustered these 1,189 CSEPs with the same pipeline (with option -I 1.4) and found that 1,123 belonged to a family of at least 2 proteins. We then selected the family containing BgtE-5845. To identify haplotypes of AvrPm2 in the different B.g. tritici and B.g.triticale isolates, we ran a SNP call on the genomic mapping using vcftools (Danecek et al., 2011). We considered as high-confidence SNPs all positions with a minimum mapping score of 20, a minimum coverage of 8×, a minimum frequency of the alternative call of 0.9 and < 5 missing genotypes. To identify the family members in B.g. secalis we mapped the B.g. tritici family members against the B.g.

Methods S5 In silico analysis of proteins encoded by Pm2 and AvrPm2
In silico identification and annotation of conserved protein domains in the PM2 sequence was done with i) COILS (http://www.ch.embnet.org/software/COILS_form.html) (Lupas et al., 1991) for For the AVRPM2 effector family, secretion peptides were predicted using SignalP V4.1 (Dcutoff values 0.5, (Petersen et al., 2011) and disulfide bonds by DISULFIND (Ceroni et al., 2006) after manual reannotation of the genes. Conserved motifs were identified using motif scanning software from the MEME Suite 4.11.2 (Bailey et al., 2009). Sequences corresponding to the predicted mature protein after removal of the predicted signal peptide, and encoded by the AvrPm2 gene family in wheat and barley powdery mildews were submitted to the RaptorX server for tertiary structure prediction (http://raptorx.uchicago.edu/) (Källberg et al., 2012).
Three dimensional protein models were assessed for significant homology to ribonucleases based on the criteria described in Källberg et al. (2012).

Nicotiana benthamiana.
(See separate file.) Notes S2 CDS sequences of the AvrPm2 family members in B.g. tritici and B.g. hordei.
(See separate file.) Notes S3 Protein sequences of the AvrPm2 family members in B.g. tritici and B.g. hordei.

Notes S4 BAC selection, sequencing and assembly
A BAC (bacterial artificial chromosome) library of the reference isolate 96224 was previously produced  and assembled using the fingerprint contig (FPC) algorithm covering each contig with a predicted arrangement of overlapping BACs. The isolate 96224 was then sequenced using the 454 shotgun sequencing technology and the 454 scaffolds were integrated to the BAC assembly using BAC end sequences as linker sequences between 454 scaffolds and FP contigs (Wicker et al., 2013) S2). The LTC assembly showed different overlap compared to the FPC assembly: for example, BAC 24i20 (red in Fig S2a) was located before BACs 11j01 and 11o01 in the FPC assembly and after these same BACs in the LTC. The amplification obtained with the PCR markers confirmed that the LTC assembly is better than the FPC: marker 3b20f_F1 showed amplification on four BACs (3k11, 8e01 11o01 and 24i20) but not on 11j01 located between 11o01 and 24i20 (Fig. S2). In the LTC assembly the overlap between these three BACs was reorganized and therefore marker 3b20f_F1 amplified from four neighboring BACs.
The five BACs covering the region between the markers F2 and F12 were completely sequenced and assembled using two different sets of parameters. A permissive assembly was done with all reads obtained by sequencing (on average 2 million per BAC) giving an average of 5.2 contigs per BAC. A stringent assembly was done using 120,000 to 200,000 reads per BAC cleaned from bacterial vector backbone sequences. With the second assembly more and smaller contigs were obtained with an average of 12.2 contigs per BAC. The permissive assembly containing the bacterial backbone was useful to localize the beginning and the end of each BAC clone sequence. Thus, contigs obtained with the stringent assembly could be ordered by comparing them to the bigger contigs resulting from the permissive assembly (Fig. S3b). Finally, all the assemblies generated for all the BACs were compared and aligned to obtain a contiguous sequence. The model assembly was created using six contigs from four different BACs (Fig.   S3c). This assembly was validated by locating the genetic markers on the sequence. The size of the model assembly was 217 kb which is smaller than the originally predicted size (i.e. 250 kb between F2 and F12).

Notes S5 Genetic and physical mapping of AvrPm2 based on the B.g. tritici/B.g. hordei genome collinearity
It was previously observed that orthologous genes in B.g. tritici and B.g. hordei show a strong collinearity as their order and orientation are highly conserved Wicker et al., 2013). We exploited this collinearity to search for homologs of the genes located in the AvrPm2 region that are present in the B.g. hordei genome. In the candidate region of contig 51, nine genes were predicted. Those genes were used in BLAST searches against the B.g. hordei genome and homologs of these genes were found on contig 47 of B.g. hordei. The genes from B.g. hordei were then used in BLAST searches against the B.g. tritici genome. We found that three of the nine identified genes from B.g. hordei had homologs on contig 48 of B.g. tritici which, according to the genome-wide map, is located on the same linkage group as contig 51 (S1 Data). Five B.g. hordei genes and their homologs in B.g. tritici showed a strong collinearity: both gene order and transcriptional orientation are conserved. Bgt-2686 was predicted on a sequence island on the right of Bgt-1444 but its homolog on B.g. hordei was situated on the left of Bgh-3405 (Fig. S2). Markers were designed on this gene and tested by sequencing. It co-segregated with marker F2 at 9 recombinants to AvrPm2 revealing that the segment of sequence containing Bgt-2686 was misassembled and could be located in the sequence gap on the F2 side of AvrPm2 ( Fig. 12c). Three of the nine genes in contig 51 were CSEPs from the same gene family . BLAST searches using these effectors resulted in the identification of two homologous CSEPs in B.g. hordei (Fig. S2). When used again for BLAST searches against the B.g. tritici genome, a new homolog was found in B.g. tritici,  which was predicted on an unanchored scaffold (Fig. S2). Markers were designed on this gene and tested by sequencing. BgtE-5842 mapped genetically near the AvrPm2 locus at 2.0 cM from AvrPm2, anchoring the scaffold inside the locus and reducing the genetic interval by 1.9 cM (Fig.   2c).