Mining Grapevine Downy Mildew Susceptibility Genes: A Resource for Genomics-Based Breeding and Tailored Gene Editing

Several pathogens continuously threaten viticulture worldwide. Until now, the investigation of resistance loci has been the main approach to understanding the interaction between grapevine and the mildew causal agents. Dominantly inherited gene-based resistance has been shown to be race-specific in some cases, to confer partial immunity, and to be potentially overcome within a few years of its introgression. Recently, following research conducted in Arabidopsis, putative genes associated with downy mildew susceptibility have also been discovered in the grapevine genome. In this work, we deep-sequenced four putative susceptibility genes (VvDMR6.1, VvDMR6.2, VvDLO1, and VvDLO2) in 190 genetically diverse grapevine genotypes to discover new sources of broad-spectrum and recessively inherited resistance. Identified single nucleotide polymorphisms (SNPs) were screened in a bottleneck analysis, from the genetic sequence to their impact on protein structure. Fifty-five genotypes showed at least one impacting mutation in one or more of the scouted genes. Haplotypes were inferred for each gene, and two of them at the VvDMR6.2 gene were found to be significantly more represented in downy mildew resistant genotypes. The current results provide a resource for grapevine and plant genetics and could support genomics-assisted breeding programs as well as tailored gene editing approaches for resistance to biotic stresses.


Introduction
The development of disease-resistant varieties is a convenient alternative to chemical control methods to protect crops from diseases. When a pathogen recognizes and invades plant tissues and a plant-pathogen interaction is established, it is faced with the host response, which involves the activation of signals that translate into a rapid defense response. This immune response helps the host plant to avoid further infection by the pathogen [1]. To suppress this immunity, pathogens produce effector molecules to alter host responses and support compatibility. In turn, plants evolved the ability to recognize these effectors through resistance (R) genes. The majority of R genes encode nucleotide-binding leucine-rich-repeat (NBS-LRR) proteins. Since R genes are specifically directed towards highly polymorphic effector molecules or their derivatives, this kind of immunity is dominantly inherited, mostly race-specific, and rapidly overcome by the capacity of the pathogen to mutate [2]. Analyses of whole-genome sequences have provided and will continue to provide new insights into the dynamics of R gene evolution [3].
apple [44], walnut [45], sweet cherry [46], pear [47], coffee [48], and grapevine [49,50]. This phenomenon is due to the boost in the sequencing of cultivated plant genomes to provide high-density molecular markers for breeding programs aimed to crop improvement as well as to elucidate evolutionary mechanisms through comparative genomics [51,52]. In grapevine a great deal of progress has been made from the first SNP identification in the pre-genomic-era [53] to the sequencing of the whole genome of several Vitis vinifera cultivars [54][55][56][57][58][59], to the very recent report of the genome sequence of Vitis riparia [60] and the diploid chromosome-scale assembly of Muscadinia rotundifolia [61]. The last two studies represent a turning point on the scavenging of genomes that are donors of disease resistance traits. This issue in Vitis spp. is tackled by identifying R loci, underlying R genes, through quantitative trait loci (QTL) analysis in different genetic backgrounds. Nowadays, 13 R loci against powdery mildew and 31 to DM have been identified with different origins, mainly from American and Asian wild species [62,63].
A promising approach to cope with disease resistance is represented by the study of S loci. Based on a high-resolution map, Barba et al. (2014) [64] identified on chromosome 9 a locus (Sen1) for powdery mildew susceptibility from 'Chardonnay', finding evidence for quantitative variation. Moreover, following research conducted on model plants, genes associated with mildew susceptibility have also been discovered and dissected in the grapevine genome. Seventeen VvMLO genes, orthologues of the Arabidopsis MLOs, were identified, and a few members showed transcriptional induction upon fungal inoculation [65,66]. More recently, further insights into powdery mildew resistance have been gained by silencing four VvMLO genes through RNAi in grapevine [67].
In this research, we aim to investigate the diversity of the DMR6 and DLO genes in a wide set of Vitis spp. to broaden our knowledge about the genetic variation present and about the impact on the protein structure and function. This information will represent a resource to enhance our knowledge of possible alternative or integrative solutions, as compared to the use of R loci to be applied in plant molecular breeding strategies.

Genetic Material and Target Genes
In the current study, the four genes VvDMR6.1, VvDMR6.2, VvDLO1, and VvDLO2 were scouted in 190 grapevine genotypes (Table 1, Table S1). Of these, 139 (73%) are Vitis hybrids, 28 (15%) are V. vinifera varieties, 12 (6%) belong to wild Vitis species, and an additional 11 (6%) are classified as hybrids/wild species. Phenotypic data on the degree of DM resistance were retrieved from the literature, public databases, and unpublished information. Pairwise alignment [68] was performed to define the nucleotide identity between the investigated genes.

Amplicon Sequencing and Read Processing
Genomic DNA was extracted from young grapevine leaves using the DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany), according to the manufacturer's protocol, and then used to produce amplicons for deep sequencing. PCR on the templates was performed using Phusion High-Fidelity Polymerase (NEB, Ipswich, MA, USA) according to the manufacturer's protocol. Primers were specifically designed to amplify 250 bp of the coding regions of the target genes and were barcoded; sequencing was then carried out in-house on the Illumina MiSeq platform (Table 1). A total of 19 amplicons were sequenced, including six amplicons for VvDMR6.1, seven for VvDMR6.2, four for VvDLO1, and two for VvDLO2. The obtained amplicon sequences were then mapped on the PN40024 12X reference genome [54], considering the latest V2 gene prediction [69,70], through Burrows-Wheeler alignment (BWA) [71] with no filter on mapping quality.

Sanger Sequencing
Thirteen impacting mutations (six in VvDMR6.1, two in VvDMR6.2, two in VvDLO1, three in VvDLO2) in 17 genotypes (12 hybrids, one V. vinifera, two wild species, two hybrids/wild species) in 25 combinations (Table S2) were chosen according to their representativeness of the overall results and to the availability of plants in situ. Previously extracted DNA was used to produce 12 targeted Sanger amplicons (six in VvDMR6.1, two in VvDMR6.2, two in VvDLO1, two in VvDLO2) by PCR using Phusion High-Fidelity DNA Polymerase (Thermo Scientific) according to the manufacturer's protocol. Purification was performed enzymatically with ExoSAP-IT PCR Product Cleanup Reagent (Applied Biosystems Inc., Foster City, CA, USA) according to the manufacturer's instructions. Forward or reverse primer (3.2 µM) was then added to the sample and sequencing was performed using the BigDye Terminator Cycle Sequencing Ready Reaction Kit v3.1 (Applied Biosystems Inc.) in a 10 µL final volume. Sequencing reactions were performed using a 2 min initial denaturation step, followed by 25 cycles at 96 °C for 10 s, 50 °C for 5 s, and 60 °C for 4 min, and were then purified from unincorporated primer and BigDye excess through a Multiscreen384SEQ Sequencing Reaction Cleanup Plate (Millipore, Carrigtwohill, Co. Cork, Ireland). Capillary electrophoresis of the purified products was performed on a 3730xl DNA Analyzer (Applied Biosystems Inc.). Pregap4/Gap4 from the Staden Package [72] were used to align DNA sequence electropherograms and scan all polymorphic sites.

Data Mining and Protein Model
Variant calling was performed by BCFtools [73] using the following settings: minimum mapping quality 20; minimum genotype quality 20; minimum base quality 20; maximum per sample depth of coverage 1000; minimum depth of coverage per site 10; keep read pairs with unexpected insert sizes (for amplicon sequencing). Filtering of results was done with VCFtools [74] to exclude all genotypes with quality below 20 and include only genotypes with read depth ≥10.
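The genotype-level thresholds stated above (genotype quality of at least 20 and read depth of at least 10) can be written as a simple predicate. The following is only a minimal sketch of that logic, not the BCFtools/VCFtools commands used in the study; the `GenotypeCall` record, its field names, and the sample names are hypothetical.

```python
from dataclasses import dataclass

MIN_GQ = 20  # minimum genotype quality, as stated in the text
MIN_DP = 10  # minimum read depth per genotype, as stated in the text


@dataclass(frozen=True)
class GenotypeCall:
    """Hypothetical record for one genotype call at one site."""
    sample: str
    gq: int  # genotype quality
    dp: int  # read depth


def passes_filter(call: GenotypeCall) -> bool:
    """Keep calls with GQ >= 20 and DP >= 10."""
    return call.gq >= MIN_GQ and call.dp >= MIN_DP


# Made-up calls, for illustration only.
calls = [
    GenotypeCall("sample_A", gq=99, dp=250),
    GenotypeCall("sample_B", gq=15, dp=300),  # fails on GQ
    GenotypeCall("sample_C", gq=99, dp=8),    # fails on DP
]
kept = [c.sample for c in calls if passes_filter(c)]
```

Here only `sample_A` is retained; calls sitting exactly on a threshold (GQ = 20, DP = 10) are kept.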
The SnpEff toolbox was used to further discriminate variants according to their impact (MODIFIER, LOW, MODERATE, or HIGH, according to the user's manual) on the gene sequence [75]. The elected impacting variants were then subjected to SIFT (sorting intolerant from tolerant) [76] analysis to assess the tolerance of amino acid variants in the protein primary structure, based on alignment with sequences in the SWISS-PROT/TrEMBL database. Only non-tolerated mutations were considered for a final impact evaluation based on the chemical-physical properties of the variants according to Betts and Russell (2003) [77]. Both the SnpEff and SIFT algorithms were used with default parameter settings. Data obtained from mapping and variant calling were dissected to extract overall genetic information on the studied genotypes. Amplicons were classified according to their level of polymorphism. All the other parameters were calculated considering all genotypes and the various taxa. For each gene, the frequencies of the occurring mutation arrangements were calculated along with mutation frequency, occurrence of triallelic variants, and minor allele frequency (MAF). PHASE v2.1 software [78] was used for haplotype reconstruction and frequency calculation using PN40024 as the reference genome [54]. The genotypes belonging to specific classes (carried haplotypes) were linked in contingency tables to the phenotypic trait according to OIV 452(-1) [79]. Pearson's chi-squared tests for count data were performed on each locus separately.
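As an illustration of the MAF calculation mentioned above, the sketch below computes the minor allele frequency of a biallelic site from diploid genotypes and assigns it to one of the MAF classes used later in the text. It is an assumption-laden toy example: the function names and the input format (tuples of two alleles) are hypothetical and do not reproduce the study's actual scripts.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """MAF of a biallelic site from diploid genotypes such as ("A", "G")."""
    counts = Counter(allele for gt in genotypes for allele in gt)
    if len(counts) > 2:
        raise ValueError("site is not biallelic")
    if len(counts) < 2:
        return 0.0  # monomorphic site
    return min(counts.values()) / sum(counts.values())


def maf_class(maf):
    """Assign a MAF value to the classes reported in the text."""
    if 0.01 <= maf <= 0.05:
        return "0.01 <= MAF <= 0.05"
    if 0.05 < maf <= 0.1:
        return "0.05 < MAF <= 0.1"
    if 0.1 < maf <= 0.3:
        return "0.1 < MAF <= 0.3"
    if 0.3 < maf <= 0.5:
        return "0.3 < MAF <= 0.5"
    return "outside reported classes"


# Hypothetical site: nine A/A individuals and one A/G heterozygote,
# i.e. one minor allele out of 20 chromosomes.
site = [("A", "A")] * 9 + [("A", "G")]
maf = minor_allele_frequency(site)
label = maf_class(maf)
```

In this toy case the single G allele among 20 chromosomes gives a MAF of 0.05, which falls in the rarest class.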
Sequences of bona fide (*) and putative DMR6 and DLO orthologues were collected from the literature [11,13,14,80] and available databases (Plaza 3.0) [81] and aligned using ClustalW (https://www.ebi.ac.uk/Tools/msa/clustalo/). Genes carrying mutations confirmed by Sanger sequencing were subjected to homology detection and three-dimensional structure prediction using the HHpred tool of the MPI Bioinformatics Toolkit [82], available at https://toolkit.tuebingen.mpg.de/#/tools/hhpred. The algorithm identified a thebaine 6-O-demethylase [83] as the protein with an available three-dimensional structure (PDB ID: 5O9W) and the highest homology to VvDMR6 and VvDLO, and it produced a three-dimensional model carrying the mutations using the MODELLER software [82]. The three-dimensional structure was visualized to better understand the impact of the mutations on the wild-type protein structure.
Considering the 696 biallelic mutations in all genotypes, 75% were transitions (A↔G, C↔T) and 25% were transversions (A↔C, A↔T, C↔G, G↔T), giving a transition/transversion ratio of three. Both vinifera varieties and hybrids showed the same distribution, with 77% transitions and 23% transversions. In wild species the percentages were 73% and 27%, respectively, while 71% and 29% were the values observed in hybrid/wild species. SNP frequency was calculated both as an average across all genes and per gene for every taxon. Vinifera varieties showed the lowest average frequency (~15 SNPs per Kb), with large differences between the target genes: ~33 SNPs per Kb in VvDMR6.1. In the current work, minor allele frequency (MAF) was calculated for each biallelic mutation. MAF values 0.01 ≤ x ≤ 0.05 were shown by 29% of the mutations detected in all genotypes, in particular by 23%, 0%, 2%, and 3% in hybrids, wild species, vinifera varieties, and hybrid/wild species, respectively. MAF values 0.05 < x ≤ 0.1 were shown by 3% of the mutations in all genotypes as well as in wild species, and by 2% in hybrids, vinifera varieties, and hybrid/wild species. MAF values 0.1 < x ≤ 0.3 were shown by 5% of the mutations in all genotypes as well as in hybrids; in wild species and vinifera varieties they accounted for 4% of the mutations, and in hybrid/wild species for 2%. A very low percentage of mutations showed MAF 0.3 < x ≤ 0.5: 3% for all genotypes, hybrids, and vinifera; 2% for wild species and hybrid/wild species.
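The transition/transversion classification used above follows directly from the base classes: a transition exchanges two purines (A, G) or two pyrimidines (C, T), and any other substitution is a transversion. A minimal sketch is given below; the function names and the toy SNP list are hypothetical and are not taken from the study's data.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snps):
    """Transition/transversion ratio for an iterable of (ref, alt) pairs."""
    snps = list(snps)
    transitions = sum(is_transition(ref, alt) for ref, alt in snps)
    transversions = len(snps) - transitions
    if transversions == 0:
        raise ValueError("ratio undefined without transversions")
    return transitions / transversions


# Toy data mirroring the reported 75%/25% split:
# three transitions and one transversion.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
ratio = ti_tv_ratio(toy_snps)
```

With three transitions for every transversion, as in the 75%/25% split reported for all genotypes, the ratio is three.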

Mutation Impact Evaluation
In the current study, upon discrimination of the variants according to their impact on the codon sequence, 27% of the total mutations (in particular, 27% in VvDMR6.1, 25% in VvDMR6.2, 30% in VvDLO1, and 25% in VvDLO2) were classified as "MODIFIER", i.e., falling into intronic regions or upstream/downstream of the gene. "LOW" impact variants, responsible for synonymous mutations or falling into splice regions, represented 32% of the total mutations: 36% in VvDMR6.1, 32% in VvDMR6.2, 32% in VvDLO1, and 28% in VvDLO2. Of the total mutations, 38% (in particular, 35% in VvDMR6.1, 40% in VvDMR6.2, 35% in VvDLO1, and 43% in VvDLO2) were non-synonymous variants and were therefore classified as "MODERATE" impact. These percentages are partially confirmed in vinifera by Amrine et al. (2015) [84], with ~90% of MODIFIER and LOW mutations and ~8% non-synonymous variants in the gene sequence. The lowest number of variants (on average 3%: 2% in VvDMR6.1, 2% in VvDMR6.2, 3% in VvDLO1, and 4% in VvDLO2) was classified as "HIGH" impact, being responsible for sequence frameshifts or premature stop codons. The mutations classified as "MODERATE" and "HIGH" (41%) were filtered to discriminate amino acid variants according to their conservation; these variants were then further checked, and mutants carrying chemical/physical properties different from the reference were chosen. Finally, results from both analyses on the amino acid sequence were cross-referenced and 20 mutations were elected as potentially affecting the protein structure: 6 in VvDMR6.1, 4 in VvDMR6.2, 4 in VvDLO1, and 6 in VvDLO2 (Table S3, Figure 1).
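The successive narrowing described above (impact class, then conservation, then physicochemical change) can be pictured as a chain of filters. The sketch below is purely illustrative: the `Variant` record, its boolean fields, and the example entries are hypothetical stand-ins for the SnpEff, SIFT, and property-based evaluations, whose real outputs are not reproduced here.

```python
from dataclasses import dataclass

IMPACTING_CLASSES = {"MODERATE", "HIGH"}  # SnpEff classes retained in the text


@dataclass(frozen=True)
class Variant:
    """Hypothetical summary of one variant after annotation."""
    name: str
    impact: str             # MODIFIER, LOW, MODERATE, or HIGH
    sift_tolerated: bool    # True if predicted as tolerated
    property_change: bool   # True if chemical/physical properties differ


def elect(variants):
    """Apply the three filtering steps in order and return the survivors."""
    by_impact = [v for v in variants if v.impact in IMPACTING_CLASSES]
    not_tolerated = [v for v in by_impact if not v.sift_tolerated]
    return [v for v in not_tolerated if v.property_change]


# Made-up variants, one failing at each step and one passing all three.
variants = [
    Variant("var_1", "LOW", sift_tolerated=False, property_change=True),
    Variant("var_2", "MODERATE", sift_tolerated=True, property_change=True),
    Variant("var_3", "MODERATE", sift_tolerated=False, property_change=False),
    Variant("var_4", "HIGH", sift_tolerated=False, property_change=True),
]
elected = [v.name for v in elect(variants)]
```

Only `var_4` survives all three steps in this toy input.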
Twenty-five genotype-SNP combinations were selected for confirmation via Sanger sequencing. 44% of the mutations were confirmed by Sanger sequencing, while 56% were not, indicating a certain discrepancy from Illumina sequencing results. In VvDMR6.1, two mutations out of six polymorphisms were validated in one genotype each. The same variant in VvDMR6.2 was confirmed in three individuals. In VvDLO1 the confirmed variants were two, both in two different genotypes. Two individuals shared only one mutation in VvDLO2. Validated variants spanned among all the scouted genes, and the distribution of genotypes carrying confirmed mutations fairly represented the starting taxon assortment (six hybrids, one wild species, two hybrid/wild species individuals). For each gene, there were mutations that were both confirmed and unconfirmed depending on the genotype, and some individuals carried both confirmed and unconfirmed variants in the same gene. We classified Sanger-investigated variants according to their read coverage (DP) and to their genotype quality (GQ). Out of the total 25 variants taken into account, 15 showed DP < 100 and 10 mutations with DP > 100 of which only one with DP close to 1000. While within 15 mutations with 10 < DP < 73 only four NGS results (27%) were confirmed, 7 out of the 10 variants (70%) with DP > 100 could be confirmed via Sanger sequencing. Furthermore, seven variants out of 25 (28%) showed a GQ lower than 99, of these only two were confirmed by Sanger sequencing. The remaining 18 mutations (72%) had GQ = 99 and half (nine) of them were confirmed. Considering both DP and GQ values together, six out of the seven variants with GQ < 99 showed DP < 100 but still two of them were Sanger sequencing confirmed. While five out of the nine remaining confirmed mutations showed GQ > 99 and DP > 100, two variants were with 50 < DP < 100. 
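The confirmation rates quoted above follow from the reported counts. As a worked check, the short sketch below recomputes them; the counts are those stated in the text, while the dictionary layout and labels are only an illustrative choice.

```python
# (confirmed, total) per stratum, as reported in the text.
strata = {
    "all": (11, 25),
    "DP < 100": (4, 15),
    "DP > 100": (7, 10),
    "GQ < 99": (2, 7),
    "GQ = 99": (9, 18),
}

# Percentage of Sanger-confirmed variants in each stratum, rounded.
rates = {
    label: round(100 * confirmed / total)
    for label, (confirmed, total) in strata.items()
}
```

This reproduces the 44% overall confirmation rate, 27% versus 70% for the two depth classes, and 50% for variants with GQ = 99; for GQ < 99 the rate is 2 of 7, about 29%.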
Of all the 20 impacting mutations considered (Table S3), only five were located at less than 60 nucleotides from amplicon or contig edge, and only one at less than 10 nucleotides. All the variants located on boundaries showed DP < 100; 50% of these edge mutations showed GQ < 99 and the other half GQ > 99. All the Sanger-confirmed variants were located far from amplicon ends, while only one was located on a reverse primer.
In order to provide robust results, only the validated mutations, corresponding to 11 genotype-SNP combinations, were selected for haplotype reconstruction and subsequent analyses (Figure 1).

Mutated DMR and DLO Gene Combinations
Of the 190 studied genotypes, 55 showed at least one of the elected mutations: 37 hybrids, three vinifera varieties, six wild species, and nine hybrid/wild species. Of these individuals, 73% showed mutations in only one gene (13% in VvDMR6.1, 29% in VvDMR6.2, 7% in VvDLO1, and 24% in VvDLO2), while 26% were double mutants across six gene combinations and one genotype was mutant in three genes (Table S4). Haplotypes and their frequencies were determined for the VvDMR6.1, VvDMR6.2, VvDLO1, and VvDLO2 genes. Individuals carrying one impacting mutation per gene were selected, and the gene haplotypes were inferred taking into account all the flanking mutations showing at least MODERATE impact on the gene sequence (Table 2, Table S5). For VvDMR6.1, based on 14 SNPs, 17 haplotypes were calculated in 11 genotypes. The reference haplotype was the most frequent (18.2%); all the others were unique, except for two haplotypes, each shared by two individuals. No particular association between taxon and haplotype occurrence was observed. Regarding VvDMR6.2, 14 haplotypes were reconstructed based on 14 SNPs in 27 genotypes. The most shared haplotype (40.7%), showing two impacting mutations, was present in 12 individuals belonging to hybrids and, mainly in the homozygous state, to Vitis spp./hybrid individuals. The reference haplotype was the second most represented, and the third showed a frequency of 13%, being shared by six hybrid genotypes. VvDLO1 showed nine haplotypes based on 11 SNPs in 10 individuals. Besides the most recurrent reference haplotype (30%), the one with a frequency of 20% encompassed two impacting mutations in one hybrid and two wild species. Sixteen SNPs in 25 genotypes were taken into account for VvDLO2, resulting in 19 haplotypes. Most haplotypes were unique or shared by few individuals, except for the reference one (34%) and two other main haplotypes (12% each) shared, respectively, by hybrids only and by both hybrids and wild species (Table S5).
Integrating genotypic (haplotypic) data and available phenotypic OIV 452(-1) scores (Table 2), a chi-squared test was performed to check whether genotypes belonging to specific classes (carried haplotypes) were significantly associated with the DM resistance trait. Interestingly, in VvDMR6.2, significance levels of p = 0.0025 and p = 0.018 were observed for haplotypes number 10 and 8, respectively.
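To make the test concrete, the sketch below computes Pearson's chi-squared statistic for a 2 × 2 contingency table (haplotype carried or not, versus resistant or susceptible) and the corresponding p-value for one degree of freedom. It is a minimal illustration under stated assumptions: no continuity correction is applied, the counts are invented, and the actual contingency tables and software settings of the study are not reproduced.

```python
import math


def chi2_2x2(table):
    """Pearson's chi-squared test (no continuity correction) on a 2x2 table.

    Returns (statistic, p_value) for one degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented counts: rows = haplotype carried / not carried,
# columns = resistant / susceptible.
stat, p_value = chi2_2x2([[10, 20], [20, 10]])
```

With small counts, as is likely for rare haplotypes, the chi-squared approximation can be unreliable, so an exact test may be worth considering alongside it.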

Mutation Mapping on Amino Acid Sequences and Protein Structural Model
The amino acid variants corresponding to the mutations confirmed by Sanger sequencing were further investigated: (i) to estimate their conservation at the primary sequence level both within Vitis and in a larger group comprising other plant species (Figure 2A,B, Figure S1), and (ii) to evaluate their impact on the protein tertiary structure model (Figure 3). Due to the high sequence identity among them, the same three-dimensional protein model was used for mapping the mutations of all four proteins. Of the six amino acid substitutions, two each were found in VvDMR6.1 and VvDLO1, and one each in VvDMR6.2 and VvDLO2 (Figure 3). All these mutations were non-conservative and could therefore potentially cause deep structural changes that also affect protein function. As depicted in Figure 3, four mutations appeared to be more exposed to the solvent, while the other two were buried inside the hydrophobic core of the proteins. Changes in exposed amino acids are often less detrimental to protein structure/function, and this is the case for the V2D and H52L mutations. Although these mutations replaced a hydrophobic residue with a negatively charged one (V > D) and a polar, ionizable residue with a hydrophobic one (H > L), being solvent exposed they do not seem to have a high impact on the protein structure. The G302E and E53G mutations affect both the steric hindrance and the charge of the amino acid: glycine bears the smallest side chain, while glutamic acid bears a bulky and negatively charged side chain. Also for these two mutations, the location at the protein surface suggests that they may be tolerated and likely do not heavily affect protein function. The remaining mutations, Y89H and I253K, might instead have a much greater impact on the structure and function of VvDMR6.1, the sequence in which they were found. In this case, amino acids with hydrophobic character (Y and I), positioned within the hydrophobic core of the globular protein, are changed into positively charged amino acids (H and K).
In blue are residues located inside the protein while in red are those more exposed on the surface.
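The reasoning applied to the six substitutions can be summarized as a simple rule: a change from a hydrophobic to a charged residue is flagged when the position is buried in the core. The sketch below encodes that rule. The residue classes are a deliberately simplified assumption that follows the wording of the text (for example, tyrosine is treated as hydrophobic and histidine as positively charged), the exposed/buried labels are taken from the description above, and the function names are hypothetical.

```python
import re

# Simplified residue classes, following the descriptions in the text.
RESIDUE_CLASS = {
    "V": "hydrophobic", "L": "hydrophobic", "I": "hydrophobic",
    "Y": "hydrophobic", "G": "small",
    "D": "negative", "E": "negative",
    "H": "positive", "K": "positive",
}

# Location of each substituted position in the model, as described above.
LOCATION = {
    "V2D": "exposed", "H52L": "exposed", "G302E": "exposed",
    "E53G": "exposed", "Y89H": "buried", "I253K": "buried",
}


def parse_mutation(mutation):
    """Split a label such as 'Y89H' into (reference, position, variant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation}")
    return match.group(1), int(match.group(2)), match.group(3)


def likely_disruptive(mutation):
    """Flag buried hydrophobic-to-charged substitutions."""
    ref, _, alt = parse_mutation(mutation)
    buried = LOCATION[mutation] == "buried"
    hydrophobic_to_charged = (
        RESIDUE_CLASS[ref] == "hydrophobic"
        and RESIDUE_CLASS[alt] in {"positive", "negative"}
    )
    return buried and hydrophobic_to_charged


flagged = [m for m in LOCATION if likely_disruptive(m)]
```

Under these assumptions only Y89H and I253K are flagged, in line with the interpretation given for the structural model; V2D is also a hydrophobic-to-charged change but is not flagged because it is solvent exposed.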

Wealth of Genetic Variability
The current survey revealed a high representation of triallelic mutations within our genotype panel, due to the great genetic variability considered. The occurrence of triallelism is consistent with previous work in grapevine [86][87][88]. However, as reported by Bianco et al. (2016) [44] and Marrano et al. (2019) [45], triallelic variants are usually discarded in large-scale SNP-based analyses for cost reasons (i.e., they require multiple probes in SNP arrays) and not necessarily because they are less accurate. The obtained results in terms of transitions/transversions slightly diverge from the usual ratio found in grapevine (~1.  [86][87][88][89][90] as well as in beetroot [91], potato [92] and cotton [93], while they are much higher than in soybean [94] and almond [95]. Regarding the detected average of ~15 SNPs per Kb in vinifera genotypes, a comparable polymorphism rate (~14.5 SNPs per Kb in coding regions) was found in both cultivated (ssp. sativa) and non-cultivated (ssp. sylvestris) vinifera by Lijavetzky et al. (2007) [86]. In contrast, [87] estimated ~8.5 SNPs per Kb in cultivated vinifera and ~6 per Kb in the coding sequence of wild vinifera individuals. Moreover, studying different Vitis spp. genotypes, Salmaso et al. (2004) [89] observed an average of ~12 SNPs per Kb in the coding sequence of a set of genes encoding proteins related to sugar metabolism, cell signaling, anthocyanin metabolism, and defense. Based on the first Pinot noir consensus genome sequence, the average SNP frequency was estimated at four SNPs per Kb [55], compatible with the use of such molecular markers for the construction of genetic maps in grapevine [96].
Different polymorphism rates were found in other highly heterozygous tree species such as peach (fewer than two SNPs per Kb) [97], black cottonwood (~3 per Kb) [98], almond (~9 per Kb) [95], and Tasmanian blue gum tree (~22 per Kb) [99], but all these results have to be interpreted with care, since different SNP calling methods can distort the comparison.
SNP informativeness depends on reliability across individuals and species, and high transferability rates are probably not consistent with a direct impact on the genetic sequence (when SNPs lie in coding regions). In previous studies in grapevine, MAF values < 0.1 were more represented in non-vinifera genotypes and rootstocks, non-cultivated vinifera showed MAF values of 0.05 < x < 0.3, while MAF values > 0.1 were strongly represented in vinifera sativa [86,87,90,100]. As explained by Jones et al. (2007) [101] and Grattapaglia et al. (2011) [102], genotyping studies take advantage of different molecular markers, mostly relying on their informativeness. In this framework, SNPs are informative markers, and this property is measured as MAF. SNPs are considered interesting for many purposes when MAF values are > 0.05 [103,104], but their main usefulness is due to their transferability across genotypes (> 0.1) [86]. In the current study, the aim of focusing on impacting mutations was achieved, since MAF ≤ 0.05 is a distinguishing mark of rare SNPs, which affect the gene sequence and most likely the protein activity.

Relevance of Mutation Impact
In crops like tomato [105] and Cucurbita spp. [106], coding regions and the whole genome sequence were scouted to find impacting mutations. A non-synonymous/synonymous mutation ratio of ~1.5 was found in tomato cultivars. In Cucurbita spp., the ratio was ~0.8, but only 9% of genetic variants showed HIGH or MODERATE impact in the full genomic sequence, suggesting a large presence of intergenic mutations. In the walnut tree genomic sequence, Marrano et al. (2019) [45] identified 2.8% potentially impacting variants, while in the pear genome 55% of mutations were classified as missense and 1% as HIGH impact [107]. In grapevine, a significantly lower presence (0.7%) of HIGH-impact variants was observed in the Thompson Seedless cultivar [108] compared to the average percentages we observed in all taxa. The present aim of detecting potentially disrupting mutations finds support in the high frequency of HIGH- and MODERATE-impact variants compared to the aforementioned research works on grapevine. Of particular interest in the current results is the occurrence of elected impacting mutations in each of the four scouted genes. Given the predicted compensatory functional role of AtDMR6 and AtDLO in SA catabolism [10,12], the obtained data may allow the use of VvDMR6 and VvDLO genes in different combinations to enhance the impact of such homozygous mutations and likely avoid complementary effects.
Regarding the confirmation via Sanger sequencing, an approach borrowed from clinical studies was applied here to the overall grapevine Illumina sequencing results. In clinical research, reliability of variant calls is a fundamental precondition that requires the use of Sanger sequencing as the gold standard to confirm NGS results and avoid false positives [109][110][111]. In order to avoid expensive and time-consuming extra analysis, some studies have tried to set conditions under which NGS-based variant calls can be considered definitive [112,113]. Although, given the low number of tested samples, we cannot draw a definitive conclusion that there is a direct correlation between these conditions and the reliability of Illumina sequencing-based calls, we observed that most Sanger-confirmed variants (64%) showed DP > 100 and GQ = 99, while all of them were located away from the edges of the amplicons. The latter is in accordance with Satya & DiCarlo (2014) [114], who report that variant calling accuracy decreases when SNPs are next to amplicon boundaries.
At this point, it is important to highlight the genetic complexity (high heterozygosity) of the studied genotype panel, which can unpredictably affect the annealing of the Illumina probe as well as of the Sanger sequencing primer. Therefore, in order to provide reliable results, only validated mutations were selected for haplotype reconstruction and subsequent analyses.

The Value of Haplotype Consideration
The broad genetic survey reported here was extended to the haplotype level. In three scouted genes out of four, the most frequent haplotype belongs to the reference genotype (PN40024), which is a near-homozygous line [54] derived from the founder vinifera variety Pinot (noir) [115]. It is believed that the ancestral haplotype of a gene is the one showing the highest frequency, while the rarest ones are those showing the most recent mutations occurring on the most shared haplotype [116]; this hypothesis is supported by the fact that haplotype frequency is directly related to its age [117,118]. As advocated by Riahi et al. (2013) [119], domestication, hybridization with wild relatives, and somatic mutations induced by vegetative propagation are the main reasons for the onset of genetic diversity between and among grapevine taxa.
Considering haplotypic data and available phenotypic OIV 452(-1) scores, two VvDMR6.2 mutant haplotypes (number 10 and 8) were found to be more represented in DM resistant genotypes. It is relevant to highlight that none of the scouted target genes underlie known resistance QTLs and that no R loci discovered in grapevine so far were detected in the eight genotypes carrying these two haplotypes, except for the partial resistance locus Rpv3-3 in three genotypes (Vezzulli S., personal communication). These observations suggest a potential effect of the mutant haplotypes in the defense response to DM. In grapevine, in addition to pursuing association studies in large sample panels [120,121], some research works have lately focused on haplotype investigation to dissect the relation between genetic diversity and cis-regulated gene expression in disease-related genes [122,123].

Scouting of Amino Acid Changes
DMR6 was identified as a putative 2-oxoglutarate (2OG)-Fe(II) oxygenase [9] and was shown to share the WRD(F/Y)LR motif with DLO in flowering plant species [80]. Interestingly, Zeilmaker et al. (2015) [11] observed that non-conservative mutations in the catalytic sites (H212, H269, D214) of this protein were not able to restore susceptibility in an Atdmr6.1 mutant background in a complementation experiment. Unfortunately, no impacting mutation was observed in any of these positions, but others were identified that could potentially alter the structure of the protein. In particular, six mutations classified as impacting and confirmed by Sanger sequencing were further investigated by mapping them on a three-dimensional model of the proteins and by analyzing the degree of amino acid conservation in a sequence alignment.
Drawing conclusions on the actual disrupting impact of the detected mutations will only be possible upon enzymatic assays of wild-type and mutant proteins or by indirect functional assays, such as the confirmation of the response to DM of the genotypes carrying the different variants. Nevertheless, the in silico analysis on the three-dimensional model of DMR6 and DLO proteins can already provide some insights and guide further investigations. Of the six mutations, two (Y89H and I253K) appeared to have a larger impact than the other four on the protein structure and consequently on the enzymatic activity. These changes occurred in amino acids positioned in the hydrophobic core of the protein. They imply a switch from a hydrophobic to a hydrophilic character of the side chains, which carry a positive charge in the mutated amino acids. The use of a three-dimensional model to map the impacting mutations helped in inferring, with good approximation, the position of the amino acids within the structure, in particular whether they are on the protein surface or buried inside the core of the proteins, and whether they are part of beta-structures or alpha-helices. An additional hint of the importance of the Y89 and I253 residues came from the analysis of DMR6 and DLO sequence alignments, both within the Vitis species (results from this study) and on a larger set of species. Y89 corresponds to an extremely conserved phenylalanine in other DMR6 and DLO sequences, and this is an indication of the importance of an aromatic residue in that position. Interestingly, the amino acid following phenylalanine in several DLO sequences is a histidine. I253 is even more conserved in the sequence alignments and is only in a few cases substituted by a leucine or a valine, which bear the same chemical properties.
This suggests a structural and functional role of this amino acid in that specific position, which would be likely disturbed by the mutation into a lysine, as it was observed in one of the studied genotypes.

Ultimate Application of S Genes
The genetic and protein data, observed together with the phenotypic data (Table 2, Figure 2A,B, Figure 3), provide a well-rounded view of the role of the genes scouted here. The VvDMR6.2 gene is of particular interest. The broader genetic analysis allowed us to observe that this gene shows two haplotypes (number 10 and 8) which are more frequently represented in DM resistant genotypes. Through the more focused analysis of the impact of Sanger-confirmed mutations, both haplotypes were found to share the genetic mutation responsible for the amino acid variant E53G. This finding suggests a decisive role of VvDMR6.2 as an S gene for grapevine DM and supports the reliability of the bottleneck analysis carried out here (Figure 1).
Induction of plant defense signaling involves the recognition of specific pathogen effectors by the products of specialized host R genes. Numerous plant R genes have already been identified and characterized, and they are being used efficiently in crop improvement research programs [1]. However, especially in tree species, the selection of desirable resistant mutants comes at the cost of lengthy and laborious breeding programs. The effort required to produce resistant plants is often frustrated within a few years of selection because the pathogen evolves mechanisms to circumvent the R gene-mediated immunity [124,125]. Exploiting inactive alleles of susceptibility genes seems to be a promising path to introduce effective and durable disease resistance. Since the first discovery of S genes [6], converting susceptibility genes into resistance factors has become an increasingly common strategy, complementary to breeding for R loci [4], and the advent of new reliable genome editing tools has strengthened this trend. Genome editing technologies such as CRISPR-Cas9 allow susceptibility genes to be targeted specifically and rapidly, so that resistance is obtained indirectly in a chosen genetic background. This is highly desirable in crops like grapevine, where the genetic identity is economically important. Recently, the S gene MdDIPM4 was targeted in apple for a genome editing-driven knockout, resulting in edited plants showing reduced susceptibility to the bacterial pathogen Erwinia amylovora [126]. A similar approach was carried out by Low et al. (2020) [127] on the Hv2OGO gene in barley, conferring resistance to Fusarium graminearum. However, generating edited plants and testing their phenotype still requires years [128,129]. S genes may have different functions in the plant, so pleiotropic effects associated with their knockout may entail a certain fitness cost for the plant. Recently, quantitative regulation of gene expression has been achieved by genome editing of cis-regulatory elements [125,130,131], and this might be a strategy to limit the drawbacks associated with a reduced S gene function.

Conclusions
In this framework, the broad investigation of genetic diversity (down to the haplotype level) related to a disease resistance trait presented here has the potential to become a resource in different contexts of plant science, both as such and through the future integration of transcriptomics, proteomics and metabolomics data. The identification of specific homozygous variants in the natural pool can in fact guide genome editing projects in targeting mutations that occur 'naturally'. This "tailored gene editing" that mimics natural polymorphisms has recently been demonstrated by Bastet et al. (2017, 2019) [132,133]. Finally, breeding programs could benefit from information on selected homozygous and heterozygous S gene mutations by implementing a next-generation marker-assisted strategy.

Supplementary Materials: The following are available online at https://www.mdpi.com/2218-273X/11/2/181/s1: Figure S1. CLUSTALW alignment of bona fide and putative DMR6 and DLO proteins from different species; Table S1. List of studied grapevine genotypes; Table S2. Selected genotypes for Sanger sequencing of each gene, investigated variants with their physical position, and sequencing primers; Table S3. List of impacting mutations with positions and data in VCF (Variant Call Format); Table S4. List of genotypes showing impacting mutations, in heterozygous (He) or homozygous (Ho) status, in at least one gene; Table S5. Haplotype identification and frequencies determined for the VvDMR6.1, VvDMR6.2, VvDLO1 and VvDLO2 genes.