Using Genotyping by Sequencing to Map Two Novel Anthracnose Resistance Loci in Sorghum bicolor

Colletotrichum sublineola is an aggressive fungal pathogen that causes anthracnose in sorghum [Sorghum bicolor (L.) Moench]. The obvious symptoms of anthracnose are leaf blight and stem rot. Sorghum, the fifth most widely grown cereal crop in the world, can be highly susceptible to the disease, most notably in hot and humid environments. In the southeastern United States the acreage of sorghum has been increasing steadily in recent years, spurred by growing interest in producing biofuels, bio-based products, and animal feed. Resistance to anthracnose is, therefore, of paramount importance for successful sorghum production in this region. To identify anthracnose resistance loci present in the highly resistant cultivar ‘Bk7’, a biparental mapping population of F3:4 and F4:5 sorghum lines was generated by crossing ‘Bk7’ with the susceptible inbred ‘Early Hegari-Sart’. Lines were phenotyped in three environments and in two different years following natural infection. The population was genotyped by sequencing. Following a stringent custom filtering protocol, totals of 5186 and 2759 informative SNP markers were identified in the two populations. Segregation data and association analysis identified resistance loci on chromosomes 7 and 9, with the resistance alleles derived from ‘Bk7’. Both loci contain multiple classes of defense-related genes based on sequence similarity and gene ontologies. Genetic analysis following an independent selection experiment of lines derived from a cross between ‘Bk7’ and sweet sorghum ‘Mer81-4’ narrowed the resistance locus on chromosome 9 substantially, validating this QTL. As observed in other species, sorghum appears to have regions of clustered resistance genes. Further characterization of these regions will facilitate the development of novel germplasm with resistance to anthracnose and other diseases.

across replicates were given a score of 0 (resistant), and all other lines were scored as 1 (susceptible). For the 'resistant scale', lines with an average original score of ≤2.5 were scored as 0 (resistant), while the remaining lines were scored as 1 (susceptible). For the 'susceptible scale', lines with an average original score of ≥4 were scored as 1 (susceptible), and the remaining lines as 0 (resistant). For the 'polarized scale', lines with original scores of only 1 or 2 across all replicates by location were scored as 0 (clearly resistant), lines with original scores of only 4 or 5 across replicates were scored as 2 (clearly susceptible), and lines with scores that varied between clearly resistant and clearly susceptible across replicates were scored as 1 (ambiguous). Missing data are due to failed germination or lack of growth.
(Table S2), with heterozygous markers treated as missing. This dataset identifies the SNP marker number (Marker), the chromosome on which the SNP marker resides (Chrom), and the position (in base pairs) of the SNP on the chromosome (Genomic Position).
The -log10(p-value) for the marker at each location (Live Oak 2013 significance, Citra significance, Live Oak 2015 significance) is marked with an asterisk when the score is above the FDR threshold of 5%. The identity of each GBS-derived marker allele is also provided (Early Hegari-Sart, Bk7).
based on a significant association between SNP markers and phenotype (Locus Area), the number of markers within the locus (Number of Markers), the number of markers per Mb (Markers/area), the lowest p-value among the markers, given as -log10(p) (Highest Score), the identity of the marker with the lowest p-value (Marker Name), and its position (Marker position).
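The three fully described resistance scales above (resistant, susceptible, and polarized) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, the original scores are assumed to be integers from 1 to 5 with one score per replicate, and any line that is neither uniformly 1 or 2 nor uniformly 4 or 5 (including lines scored 3) is assumed to fall into the ambiguous class of the polarized scale. The first scale is omitted because its definition is incomplete in the caption.

```python
from statistics import mean
from typing import Optional, Sequence


def resistant_scale(scores: Sequence[int]) -> Optional[int]:
    """0 (resistant) if the average original score is <= 2.5, else 1."""
    if not scores:  # no data: failed germination or lack of growth
        return None
    return 0 if mean(scores) <= 2.5 else 1


def susceptible_scale(scores: Sequence[int]) -> Optional[int]:
    """1 (susceptible) if the average original score is >= 4, else 0."""
    if not scores:
        return None
    return 1 if mean(scores) >= 4 else 0


def polarized_scale(scores: Sequence[int]) -> Optional[int]:
    """0 = clearly resistant, 2 = clearly susceptible, 1 = ambiguous."""
    if not scores:
        return None
    if all(s in (1, 2) for s in scores):
        return 0
    if all(s in (4, 5) for s in scores):
        return 2
    # Assumption: everything else, including scores of 3, is ambiguous.
    return 1
```

For example, a line scored 2 and 4 in two replicates has an average of 3, so it is susceptible on the resistant scale, resistant on the susceptible scale, and ambiguous on the polarized scale.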

Table S5
Allele-specific PCR to identify 'Bk7'-derived alleles on chromosome 9 in four anthracnose-resistant cultivars. The name (Primer Name) and sequence (Primer Sequence, listed 5' to 3') of each primer are provided. The GBS-derived SNP markers that formed the basis for the primers are identified by their name (SNP Marker), position on chromosome 9 (Position, in base pairs), and the identity of the 'Bk7' allele of the SNP marker (SNP Nucleotide).
The presence of a 'Bk7' SNP allele in the cultivars (Cultivars) at the various SNP markers is indicated by a gray "X". The optimal annealing temperature (Tm, °C) for each primer pair and the PCR program are also provided.
Association between the markers and disease phenotype using the four different resistance scales (see Table S2)
mapping populations are provided with the marker name (marker), the sorghum chromosome on which the marker resides (chrom), the marker position on the chromosome (position), and the parental nucleotide for 'Early Hegari-Sart' (EH-S) and 'Bk7' (Bk7). Genotypic data of the lines are reported with a biparental classification: "a" and "b" indicate alleles derived from 'Early Hegari-Sart' and 'Bk7', respectively, "h" refers to a heterozygous marker, and "-" indicates markers removed due to their origin from the bmr12 donor line used to generate EH-S bmr12.

File S3
Final GBS markers from 2015 after all filtering steps were applied. Markers for the F4 mapping population are provided with the marker name (marker), the sorghum chromosome on which the marker resides (chrom), the marker position on the chromosome (position), and the parental nucleotides for 'Early Hegari-Sart' (EH-S) and 'Bk7' (Bk7). Genotypic data for the individual lines are reported with a biparental classification: "a" indicates alleles derived from 'Early Hegari-Sart', "b" indicates alleles derived from 'Bk7', and "h" indicates heterozygous calls.
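The biparental classification can be illustrated with a short Python sketch that converts a nucleotide call into "a", "b", or "h" given the two parental nucleotides. It is an illustration only, not the SAS filtering code used in the study. The function name `classify_call` is hypothetical, and the sketch assumes that genotype calls are single letters in which heterozygotes are written as IUPAC ambiguity codes (for example "R" for A/G).

```python
from typing import Optional

# IUPAC ambiguity codes for the six possible two-nucleotide heterozygotes.
IUPAC_HET = {
    "R": {"A", "G"},
    "Y": {"C", "T"},
    "S": {"C", "G"},
    "W": {"A", "T"},
    "K": {"G", "T"},
    "M": {"A", "C"},
}


def classify_call(call: str, ehs_allele: str, bk7_allele: str) -> Optional[str]:
    """Return 'a' (Early Hegari-Sart), 'b' (Bk7), 'h' (heterozygous), or None.

    None is returned for missing calls (e.g. 'N') and for calls that do not
    match either parental nucleotide (an assumption made for this sketch).
    """
    call = call.upper()
    if call == ehs_allele:
        return "a"
    if call == bk7_allele:
        return "b"
    if IUPAC_HET.get(call) == {ehs_allele, bk7_allele}:
        return "h"
    return None
```

With parental nucleotides A ('Early Hegari-Sart') and G ('Bk7'), the calls "A", "G", and "R" are classified as "a", "b", and "h", respectively.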

File S6
Original GBS data from 2013 in HapMap format. The HapMap format, written as .hmp in the file extension, provides information on the marker name (rs#), the variant nucleotides in the SNP marker (alleles), the chromosome or super contig to which the marker aligned (chrom), the position on the chromosome (pos), whether the sequence aligned to the forward or reverse strand (strand), and the sequencing results for the different lines, including the parents. The column headings for the sequencing data are in a coded format that is converted to the correct line number when the data are imported into the SAS filtering code (Supplemental File S1).

File S7
Original GBS data from 2015 in HapMap format. The HapMap format, written as .hmp in the file extension, provides information on the marker name (rs#), the variant nucleotides in the SNP marker (alleles), the chromosome or super contig to which the marker aligned (chrom), the position on the chromosome (pos), whether the sequence aligned to the forward or reverse strand (strand), and the sequencing results for the different lines, including the parents.
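The column layout described for Files S6 and S7 can be read with a few lines of Python. The sketch below is illustrative and is not part of the study's SAS pipeline. It assumes a tab-delimited file with the standard 11 fixed HapMap columns (rs#, alleles, chrom, pos, strand, assembly#, center, protLSID, assayLSID, panelLSID, QCcode) followed by one column per line. The marker name, sample names, and genotype calls in the example are invented.

```python
import csv
import io
from typing import Dict, Iterator, TextIO

N_FIXED = 11  # number of fixed metadata columns before the per-line calls


def read_hapmap(handle: TextIO) -> Iterator[Dict]:
    """Yield one dictionary per marker from a tab-delimited HapMap file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[N_FIXED:]
    for row in reader:
        if not row:  # skip blank lines
            continue
        meta = dict(zip(header[:N_FIXED], row[:N_FIXED]))
        yield {
            "marker": meta["rs#"],
            "alleles": meta["alleles"].split("/"),
            "chrom": meta["chrom"],
            "pos": int(meta["pos"]),
            "strand": meta["strand"],
            "calls": dict(zip(samples, row[N_FIXED:])),
        }


# Invented two-sample example; real files use coded column headings.
_header = ["rs#", "alleles", "chrom", "pos", "strand", "assembly#", "center",
           "protLSID", "assayLSID", "panelLSID", "QCcode", "line_001", "line_002"]
_row = ["S9_100", "A/G", "9", "100", "+", "NA", "NA",
        "NA", "NA", "NA", "NA", "A", "R"]
EXAMPLE = "\t".join(_header) + "\n" + "\t".join(_row) + "\n"

records = list(read_hapmap(io.StringIO(EXAMPLE)))
```

Each record keeps the marker metadata separate from the per-line calls, so the coded column headings can be mapped to line numbers in a later step.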