Data on genetic polymorphism of flax (Linum usitatissimum L.) pathogenic fungi of Fusarium, Colletotrichum, Aureobasidium, Septoria, and Melampsora genera

Flax (Linum usitatissimum L.) is a valuable agricultural plant used for oil and fiber production. However, cultivation of this crop is hampered by its susceptibility to fungal diseases. The most destructive ones are caused by representatives of the Fusarium, Colletotrichum, Aureobasidium, Septoria, and Melampsora genera, which significantly reduce flax yields. To combat such pathogens effectively, it is important to assess their genetic diversity, which can be used to develop molecular markers that distinguish fungal genera and species. Morphological analysis, traditionally carried out for fungal identification, is time-consuming and tends to be difficult. In the present work, we determined DNA sequences that are frequently used for phylogenetic studies in fungi, namely the internal transcribed spacer (ITS) and the beta-tubulin (tub2), translation elongation factor 1-alpha (tef1), RNA polymerase II largest subunit (RPB1), RNA polymerase II second largest subunit (RPB2), and minichromosome maintenance protein (MCM7) genes, for 203 flax fungal pathogens of the species Fusarium oxysporum, F. avenaceum, F. solani, F. sporotrichiella, F. moniliforme, F. culmorum, F. semitectum, F. gibbosum, Colletotrichum lini, Aureobasidium pullulans, Septoria linicola, and Melampsora lini. Sequencing was performed on the Illumina MiSeq platform with a 300+300 bp kit. On average, about 2350 reads per sample were obtained, which allows accurate identification of genetic polymorphism. Raw data are stored in the Sequence Read Archive under the accession number PRJNA596387. The obtained data can be used for fungal phylogenetic studies and for the development of a PCR-based test system for flax pathogen identification.

Value of the data
• The dataset can be used for the assessment of genetic diversity and for phylogenetic investigations of fungi belonging to the Fusarium, Colletotrichum, Aureobasidium, Septoria, and Melampsora genera.
• The data can support researchers working in the field of fungal molecular genetics and those who work with fungal plant pathogens.
• The kinship and evolution of fungi of the Fusarium, Colletotrichum, Aureobasidium, Septoria, and Melampsora genera could be estimated using the provided dataset.
• The generated data open up an opportunity to create a test system for the identification of flax pathogens based on information on genetic polymorphism.

Data Description
Flax, a source of oil and fiber [1], is affected by numerous fungal pathogens. The most serious diseases of this plant are caused by representatives of such species as Fusarium oxysporum f. sp. lini, F. avenaceum, F. culmorum, Melampsora lini, Colletotrichum lini, Septoria linicola, and Aureobasidium pullulans [2, 3]. Information on the genetic diversity of the listed species is lacking, yet it is important for determining their kinship and evolution, as well as for developing effective ways to combat these flax pathogens.

DNA extraction and quality control
DNA was extracted from fungal mycelium according to the standard CTAB protocol. Agarose gel electrophoresis (2% agarose) and the Qubit 2.0 fluorometer (Thermo Fisher Scientific, USA) were used to check DNA quality and to measure DNA quantity.

DNA library preparation and sequencing
We amplified DNA fragments comprising the ITS region and the tub2, tef1, RPB1, RPB2, and MCM7 genes from 203 fungal samples. Amplicon libraries were prepared according to the protocol [9] with two-stage PCR, as we described earlier [10]. In the first step, target sequences were amplified using primers that comprised target-specific sequences [8, 11] and overhang adapters (Supplementary Table S1). For each sample, the amplicons were then pooled equimolarly, and the second PCR was performed with primers consisting of dual-index barcodes and sequencing adapters. Next, all PCR products were pooled equimolarly. The quality of the library was evaluated with the 2100 Bioanalyzer (Agilent Technologies, USA) and its quantity with the Qubit 2.0 fluorometer (Thermo Fisher Scientific). The library was sequenced using the MiSeq platform (Illumina, USA) and the Illumina MiSeq Reagent Kit v3 (2 × 300 bp reads).

We thank the Center for Precision Genome Editing and Genetic Technologies for Biomedicine, EIMB RAS for providing the techniques for targeted deep sequencing. The sequencing was performed using the equipment of EIMB RAS "Genome" center (http://www.eimb.ru/ru1/ckp/ccu_genome_ce.php).
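
The equimolar pooling used during library preparation can be illustrated with a minimal sketch. It is not the procedure from the cited protocols: it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the function names, amplicon lengths, and concentrations below are hypothetical values chosen only for illustration.

```python
# Illustrative sketch of equimolar amplicon pooling (not the authors' script).
# Assumption: double-stranded DNA weighs about 660 g/mol per base pair.
AVG_BP_MASS = 660.0


def ng_per_ul_to_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def equimolar_volumes(amplicons, fmol_each):
    """Return the volume (uL) of each amplicon that contains fmol_each femtomoles.

    amplicons maps a name to (concentration in ng/uL, length in bp).
    """
    volumes = {}
    for name, (conc, length) in amplicons.items():
        molarity_nM = ng_per_ul_to_nM(conc, length)  # 1 nM equals 1 fmol/uL
        volumes[name] = fmol_each / molarity_nM
    return volumes


if __name__ == "__main__":
    # Hypothetical concentrations and lengths for one sample.
    sample = {
        "ITS": (12.0, 600),
        "tub2": (8.5, 450),
        "tef1": (15.2, 700),
    }
    for locus, vol in equimolar_volumes(sample, fmol_each=20.0).items():
        print(f"{locus}: {vol:.2f} uL")
```

Because longer fragments carry more mass per molecule, pooling by equal mass would under-represent them. Converting to molarity first gives each locus roughly the same number of template molecules in the pool.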