DNA-Based Molecular Markers and Antioxidant Properties to Study Genetic Diversity and Relationship Assessment in Blueberries

Abstract: Blueberries (Vaccinium L. spp.) are economically and medicinally important plants. Their antioxidant properties are well known for their medicinal value in negating the harmful effects of free radicals. It is very important to develop genotypes that are high in health-promoting factors and economic value to meet present world needs. Estimation of genetic diversity using molecular markers, antioxidant properties, and their association can reveal genotypes with important characteristics and help in berry improvement programs. Wild blueberries are a better source of antioxidant metabolites compared to cultivated ones. Extensive variations are present in molecular and biochemical contents among wild clones and cultivars. The current review provides detailed and updated information on the economic and medicinal importance of blueberries, the application of molecular markers, and biochemical estimation in berry improvement and conservation, filling the gap in the literature.


Introduction
Blueberries (Vaccinium L. spp.; family: Ericaceae; tribe: Vaccinieae; subfamily: Vaccinioideae) that include diploids (V. corymbosum L., V. tenellum Ait., V. myrtilloides Michx., V. darrowi Camp, V. pallidum Ait., V. elliottii Chapm., V. boreale Hall & Aald.), tetraploids (V. corymbosum, V. angustifolium Ait, V. hirsutum Buckley, V. myrsinites Lam., and V. simulatum Small), and hexaploids (V. constablaei Gray and V. ashei Reade) are native to North America and are known as "super fruits" due to their abundant polyphenolic and anthocyanin compounds possessing high antioxidant capacity [1]. Aneuploid and pentaploid blueberries from interploidal hybridisation are also available [2]. They are woody, perennial shrubs, grow in sandy, acidic, peaty, or organic soils, and bear fruits in clusters. The cultivars in the Vaccinium section Cyanococcus: V. angustifolium (lowbush (LB) blueberry), V. corymbosum (highbush blueberry), and V. ashei (rabbiteye (RE) blueberry) have the most commercial value [3] and constitute the primary blueberry gene pool. The noncultivated species in Cyanococcus represent the secondary gene pool of blueberries [4]. While highbush blueberries can be Northern (NHB) or Southern (SHB), half-high (HH) blueberries are the hybrids between V. angustifolium and V. corymbosum. The USA is the largest blueberry producer in the world, followed by Canada. Although all commercial types of blueberries are produced in the USA, only LB and NHB blueberries are mostly grown in Canada. LB blueberries are found in the wild in Maine of the USA and in the Atlantic Provinces (New Brunswick, Nova Scotia, Prince Edward Island, and Newfoundland and Labrador) and Quebec in

Lowbush (LB) Blueberry
LB blueberries consist mostly of V. angustifolium but also include V. myrtilloides and are known as wild or sweet blueberries [3,7]. They are woody, deciduous shrubs, 0.3-0.6 m in height, and generally grow in the wild. Leaves are elliptical, pale to dark green, and 5-20 mm × 16-40 mm in size, with uniformly serrated margins and predominantly glabrous (smooth) or hairy surfaces. LB blueberry shoots are erect and form wide, dense colonies from underground rhizomes that are around 4.5 mm in diameter and can grow to about 6 cm in depth. The colour of smooth LB blueberry stems varies from tan to red. The bell-shaped, self-incompatible flowers are generally white or pinkish-white and are borne in small, few-flowered terminal or axillary racemes. Fruits are orbicular or oval, intermediate in size, and blue to dark blue with or without a waxy coating [3]. The pedicel scar is medium, and the calyx end is closed. LB blueberry plants are cold-hardy bushes that are intolerant of harsh summer heat, with an average annual minimum temperature range from −17.8 to −12.3 °C [3].

Northern Highbush (NHB) Blueberry
Crown-forming highbush plants are usually 1.8-2.4 m tall. They are naturally found in Nova Scotia, Wisconsin, Georgia, and Alabama [3]. While highbush blueberry plants have angular to terete and glabrous to densely pubescent stems, they have oval to narrow elliptical leaves, 20-30 mm wide and 40-80 mm long. The pubescent or glabrous leaf blades have entire or sharply serrate margins. The flowers are pink, white-pink, or white and cylindrical with green or glaucous calyx [8]. Berries are black, blue, or dull black.

Southern Highbush (SHB) Blueberry
SHB blueberries were developed from crosses of V. corymbosum with V. darrowi Camp and V. ashei. They were hybridized specifically for improved fruit, adaptability to the soil, tolerance to heat, and a low winter chilling requirement [8]. They have bell-shaped white flowers and powder-blue to medium-blue berries with good flavour. Although some cultivars are self-fertile, better production is achieved when two or more varieties are planted together.

Medicinal Value of Blueberry
Fresh blueberries are a rich source of essential nutritional components and consist of 83% water, 0.7% protein, 0.5% fat, 1.5% fibre, 15.3% carbohydrates, and 3.5% cellulose [13]. Apart from these primary components, blueberries also contain high levels of anthocyanins, flavonols, catechins, and proanthocyanidins. Proanthocyanidins or condensed tannins show anti-adhesion activity against the uropathogenic bacterium Escherichia coli [14]. Supplementing the diet with blueberries significantly increases serum antioxidant activity and lowers blood pressure and blood cholesterol, which may be linked to the inhibition of cardiovascular diseases and atherosclerosis [15][16][17][18]. Blueberry phenolics and anthocyanins have anticancer properties that can influence many factors in the carcinogenesis process, including the induction of apoptosis, the inhibition of oxidation, anti-proliferation, and the decline of metastases and invasion [19,20]. Others also reported similar observations [21][22][23][24]. The contents of blueberries were shown to assist the brain in numerous ways, including protection from inflammatory reactions and oxidative stresses [25], direct alteration of cell signaling for neuronal communication and effects on memory [26], calcium buffering ability [27], and bone protection [28]. Blueberry extract reportedly reduced cataracts and significantly reduced lipid oxidation in blood in animal models [29]. The phenolic-rich extract and an anthocyanin-enriched fraction of blueberries have antidiabetic properties [30], and they scavenge free radicals to improve oxygen delivery to the eyes in humans [31]. Debnath-Canning et al. [32] reported that cell treatment with the extracts of fruits and leaves of wild LB blueberries repressed cell death and reduced neuroinflammation, suggesting that a diet rich in blueberry leaves and/or fruits can be useful for brain health and for protection against neurodegenerative disorders.

Antioxidant Activity of Blueberry Phenolics
Years of scientific research in the quest for a "superfood" [33] have established that fruits and vegetables contain metabolites with antioxidant activity, which is positively linked to their beneficial effects on health. Blueberries are most famous for their high antioxidant phytochemical contents, mainly phenolic metabolites, which play an important role in plant defense and contribute to human health beyond the primary nutrients, including proteins, carbohydrates, minerals, fats, and vitamins [7]. Phenolic compounds are the main group of phytochemicals that are widely distributed in plants. Plant phenolics are secondary metabolites that are produced constitutively as well as in response to various stresses, including metal toxicity, drought, chilling, wounding, and nutrient deficiency [34], via the shikimic acid pathway from the aromatic amino acids L-phenylalanine and/or L-tyrosine [35][36][37]. They perform various functions, from signaling in plant-microbe interactions, pigmentation, protection against UV light, pollination, and antioxidant activity to protection against pathogen attack [38]. They mainly include flavonoids, phenolic acids, tannins, and other phenolic compounds [39]. Most phytochemicals are divided into flavonoids or non-flavonoids [40]. Details on blueberry phytochemicals were reviewed elsewhere [7].
Blueberry is well-known for its ability to negate the harmful effects of free radicals and reactive oxygen species in the body. Blueberry possesses high levels of hydroxycinnamic acids (coumaric acid, caffeic acid derivatives, chlorogenic acid, and benzoic acids) [39,41] and flavonoids (flavonols-quercetin derivatives, proanthocyanidins, anthocyanidins, catechins, and glycosides of catechins) [42,43]. Anthocyanidins such as cyanidin, delphinidin, malvidin derivatives, peonidin, and petunidin are generally found in blueberries [44,45]. Anthocyanins are the bioactive flavonoids responsible for the vibrant colours of the fruits, leaves, and other parts of blueberry plants [33]. When ripe, blueberry fruits may contain up to 60% of the anthocyanin flavonoids out of the total polyphenolics, suggesting that anthocyanins may play the greatest role in health benefits due to blueberry consumption [1]. In blueberry fruits, proanthocyanidins are extensively dispersed, where they have strong bondage with proteins and carbohydrates and work as potent free radical scavengers [46,47].
The antioxidant activities of blueberries depend on their phytochemical components, structures, and redox potential [48]. Phenolic compounds donate an electron or a hydrogen atom to a free radical with one or more unpaired electrons; convert it into a neutralized, non-harmful molecule; and act as antioxidant molecules in vitro and in vivo [49]. Superoxide (O2·−), hydrogen peroxide (H2O2), and the hydroxyl radical (·OH) are called reactive oxygen species (ROS) and are naturally produced in organisms as a part of the regulatory process. However, ROS imbalance leads to oxidative stress, causing structural and functional changes to biomolecules [50]. Free radicals are a major cause of damage to DNA (deoxyribonucleic acid), proteins, and lipids and contribute to numerous diseases, including cancer, cardiovascular diseases, neurodegenerative diseases, diabetes, and arthritis [50]. A number of factors contribute, directly or indirectly, to an imbalance in ROS generation. These include abiotic factors such as chill [51] and heat stress; xenobiotic compounds such as pollutants [52,53], ozone [54], and phosgene [55]; organic compounds [56]; and certain heavy metals [57][58][59]. In blueberries, antioxidant activities are positively correlated with the total phenolic, flavonoid, anthocyanin, and proanthocyanidin contents [60][61][62]. The overall antioxidant activity may be a function of various phytochemicals working together or synergistically, and it also depends on antagonistic interactions among compounds and on environmental factors [63].
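The positive correlation between antioxidant activity and phenolic content mentioned above is typically quantified with a correlation coefficient. The following is a minimal sketch, assuming Pearson's r is the statistic of interest (the cited studies may have used other measures); the function name `pearson_r` and all numbers are hypothetical illustrations, not measured data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five genotypes (NOT data from any cited study).
total_phenolics = [2.1, 3.0, 3.8, 4.9, 6.0]             # mg/g, made up
antioxidant_activity = [11.0, 14.5, 18.0, 22.5, 27.0]   # arbitrary units, made up

r = pearson_r(total_phenolics, antioxidant_activity)
print(f"r = {r:.3f}")
```

A value of r close to +1 would indicate that genotypes with more phenolics also tend to show higher antioxidant activity; it does not by itself show which compounds are responsible.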
The phenolic compounds and their antioxidant activities in blueberries depend on the species and genotypes; maturity time; plant tissue types; growing conditions, including seasons and locations; harvest time; and storage conditions after harvest [7]. While RE blueberries have more polyphenolic contents than northern and SHB blueberries [64,65], the total phenolic and anthocyanin contents of LB blueberries are higher than those of highbush and RE blueberries [60,66]. Similar observations were also reported by Giovanelli and Buratti, [44], where wild blueberries had far more total phenolic content (2.99-6.00 mg/g) than highbush blueberries (1.81-3.90 mg/g). Tsuda et al. [67] reported more anthocyanin content in RE blueberries than in northern and SHB blueberries. The total phenolic content varied at a high level (2-10 times) among the highbush blueberry cultivars, while the variation was moderate in HH (2-5 times) and low in RE and LB blueberries [7]. The total anthocyanin content among LB and highbush blueberry cultivars and their hybrids ranged from 0.18-3.4 mg/g. The anthocyanidin levels are higher in the cultivars of HH blueberries than in those of highbush blueberries [68]. The anthocyanin content varied at a high level in the RE blueberry genotypes and at a low level in the HH blueberry cultivars [7].
A wide variation in the antioxidant properties exists between blueberry species/types and among different blueberry genotypes within the same species/type. The antioxidant activity in LB blueberries was higher than in highbush and RE blueberries [60,61]. The LB blueberry wild clone (individual genotype) had higher antioxidant activity than the LB cultivar [61]. Bhatt and Debnath [63] reported a wide diversity in the leaves of LB wild clones and HH and NHB blueberry cultivars for antioxidant properties, where the variation for DPPH radical scavenging activity was the highest (20-fold), followed by total flavonoid (16-fold) and phenolic contents (3.8-fold). They also reported the highest diversity in a hybrid group of 11 genotypes for antioxidant activity (15-fold) and, in wild clones collected from Quebec, Canada, for total flavonoid (6.9-fold) and phenolic contents (2.8-fold) [63]. Working with leaves, Wang et al. [64] observed higher antioxidant activity in most RE blueberry cultivars than in NHB and SHB blueberries. The leaves of blueberry wild clones and cultivars have higher antioxidant activity [69], polyphenolics, and proanthocyanidins than the fruits [61,62,70,71]. The results indicated the possibility of using blueberry leaves for tea production and food additives for health promotion [64].
In blueberries, the bioactive compounds within and between the ploidy levels and species and their associations with fruit quality traits were compared by Mengist et al. [72]. They evaluated 33 health-related phytochemicals belonging to four major groups of flavonoids and phenolic acids across 128 blueberry accessions over two years, together with fruit quality traits, including fruit weight, titratable acidity, total soluble acids, and pH. Highly significant variations between accessions, years, and accession-year interactions were identified for most of the traits. Uleberg et al. [73] reported that northern bilberry (V. myrtillus L.) clones showed significantly higher anthocyanins and phenolics than southern clones. Similarly, the anthocyanidin concentrations in the wild clones and cultivated bilberry of further north origins or from northern parents were higher [74].

Genetic Diversity
The characterization of germplasms and preservation of genetic diversity has become a common motivation for geneticists and breeders to set up new breeding programs aiming for the genetic improvement of plant species for specific purposes [75]. Genetic diversity is the key to a healthy plant population, as it maintains different genes that could lead to higher fruit yield and quality and better resistance to pests and diseases and enable individuals to adapt to various biotic and abiotic stresses [76]. Wild species, breeding stocks, and mutant lines are used to develop and improve crop varieties that possess climatic adaptation with all desirable traits, including resistance to various biotic and abiotic stresses [77]. Novel genes tolerant to biotic and abiotic stresses and genes for quality traits and aesthetic properties need to be conserved for future use in blueberry breeding programs. Before discovering DNA-based markers, earlier civilizations used plant phenotypes for selection and breeding. The recent improvements in DNA-based marker technologies for berry crops, especially for characters that do not allow visible characterization, would help breeders run crop improvement exercises with high accuracy and in a time-dependent manner [4]. It is of prime importance that genetic diversity is maintained for species to survive and adapt to the rapidly changing world environment. The improvement and propagation of cultivars with higher agronomic values have replaced wild clones with heterogeneous cultivars. This loss of genetic diversity is contributed to by selective propagation and breeding and agricultural practices, climate change, urbanization, natural disasters, and the movement of people on a large scale due to war.

Estimation of Genetic Diversity
Markers are essential for the accurate measurement of genetic diversity. They measure the relatedness and differences at different levels, depending on the type of marker/s employed, between the individuals of a population or, in some cases, among different populations [78]. There are mainly three types of marker systems: (1) morphological, (2) biochemical, and (3) DNA-based or molecular markers; the first two were traditionally used and are still being used for berry genetic diversity analysis in combination with the third class of marker systems. Biochemical markers started with the development of electrophoretic assays, especially for isozymes, to separate proteins on the gel that enabled the understanding of hereditary variation within organismal genomes [79]. To evaluate genetic variability, classical strategies, including morphology, comparative anatomy, physiology, and embryology, have been supplemented by molecular and biochemical approaches such as metabolomics and molecular markers. The isozyme (allozyme) markers are codominant but provide fewer markers than the DNA-based markers that identify variations in the nucleotide sequence at a specific genome location. Unlike morphology, comparative anatomy, physiology, and embryology, DNA markers create DNA "fingerprints" distinctive of DNA fragments [79].
The choice of marker depends on its specific strengths and limitations, e.g., the abundance of markers and the nature of inheritance. Before the 1980s, genetic variations were evaluated by anatomy, morphology, embryology, and physiology [4,79]. Chemical content analysis is essential for the profiling of the compounds (metabolites) present in plants in response to various growth conditions. These metabolites act as markers of the quality traits that help during breeding experiments. The selection of secondary metabolites, which are unique and differentiate between varieties, largely depends on the plant's ability to produce such metabolites. Metabolites can be used as markers when they are independent of external factors, such as the environment. Isozymes that differ in their amino acid sequence but have the same catalytic function are the other biochemical markers used in plant diversity analysis [80]. The limitation of isozyme markers is that fewer markers are available compared to DNA-based methods. However, the morphological traits of the leaf, flower, and fruit were the traditional choices for variation analysis. These markers have a limitation in distinguishing cultivars that are closely related, when morphological indices alone are not helpful. In addition, morphological characters are often affected by environmental effects. The constraints of morphological markers were overcome and often complemented with biochemical markers. Staining protocols are available for a limited number of enzyme loci, and toxic staining ingredients and the inability to distinguish the bands of the two subunits of an enzyme also make its application reasonably limited. The comparison of genetic material, independent of an environment, is the crucial feature of DNA-based markers, and it has a far better resolving capacity for genetic variability than isozymes [81]. 
Applying molecular markers as complementary techniques of morphological or biochemical markers provides a complete understanding of diversity, which can best be exploited to enhance agricultural production, sustainable food, and nutrition supply [78]. The ability to make billions of copies of short segments of desired DNA in no time using polymerase chain reaction (PCR) has enabled scientists to use many sophisticated techniques for population genetic studies and diversity analysis [82].

DNA-Based Markers
A molecular marker is a stretch of nuclear, chloroplast, or mitochondrial DNA that is highly heritable and carries conserved information [83]. The limitation of biochemical markers led to the development of DNA markers that detect polymorphism as nucleotide sequence variation at a particular location in the genome [4]. Many available commercial DNA markers allow DNA fingerprints to be compared [84]. DNA-based markers (Table 2) can be dominant (which cannot differentiate between the alleles of a gene) or codominant (which can distinguish between the alleles of a gene). The DNA-based markers for the assessment of genetic diversity provided the basis for the development of techniques that target more specific applications. The DNA markers extensively used in past decades are simple sequence repeats (SSR), sequence-tagged sites (STS), randomly amplified polymorphic DNA (RAPD), single strand conformation polymorphism (SSCP), cleaved amplified polymorphic sequence (CAPS), sequence characterized amplified region (SCAR), inter-simple sequence repeats (ISSR), randomly amplified microsatellite polymorphism (RAMP), amplified fragment length polymorphism (AFLP), microsatellite primed-PCR (MP-PCR), expressed sequence tags (EST)-PCR, single nucleotide polymorphism (SNP), sequence-related amplified polymorphism (SRAP), and target region amplification polymorphism (TRAP) [4]. The marker techniques can be categorized into three types: (1) techniques that use hybridization rather than PCR (e.g., restriction fragment length polymorphism, RFLP); (2) techniques that use PCR (e.g., AFLP, RAPD, ISSR, SSR, EST-PCR, CAPS, and SCAR); and (3) DNA microchip-based techniques. There are two subcategories of PCR-based techniques: (1) non-specific-sequence PCR-based techniques such as RAPD and AFLP and (2) sequence-specific PCR-based techniques such as SSRs [4,85]. The selection of a marker is determined by the purpose. Therefore, it is essential to define the criteria by which a marker system is shortlisted.
Ideally, a molecular marker should be polymorphic and codominant; moreover, it should be frequently present along the genome, readily available, and reproducible [83].
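The practical difference between dominant and codominant markers can be illustrated with a small sketch. It assumes a deliberately simplified case: a single locus in a diploid plant with a band-producing allele "A" and a null allele "a". The function names and genotypes are hypothetical, and polyploid blueberries are more complex than this.

```python
def score_dominant(genotype):
    """Dominant marker: only band presence (1) or absence (0) is visible."""
    return 1 if "A" in genotype else 0


def score_codominant(genotype):
    """Codominant marker: both alleles are visible, so heterozygotes are distinct."""
    return tuple(sorted(genotype))


# Hypothetical diploid genotypes at one locus.
genotypes = {
    "plant_1": ("A", "A"),  # homozygous, band present
    "plant_2": ("A", "a"),  # heterozygous
    "plant_3": ("a", "a"),  # homozygous, band absent
}

dominant = {name: score_dominant(g) for name, g in genotypes.items()}
codominant = {name: score_codominant(g) for name, g in genotypes.items()}

print(dominant)    # plant_1 and plant_2 receive the same score
print(codominant)  # all three plants receive different scores
```

With the dominant scoring, the homozygote and the heterozygote are indistinguishable, which is why codominant markers carry more information per locus.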
A number of DNA-based markers were used to study genetic diversity and relationships in different plant species. RFLP reveals variation in DNA sequences by the presence or absence of fragments produced by restriction endonucleases [86]. Most RFLP markers are codominant and, thus, able to detect both alleles in a heterozygous sample. RFLP markers are moderately polymorphic and highly locus-specific and reproducible [78]. However, this technique is laborious, time-consuming, and difficult to automate [86]. The RAPD technique amplifies genomic DNA and detects differences in the sequence, which may be caused by mutations. Because RAPD is faster and more efficient than RFLP, genetic maps were developed in many plant species using RAPD. However, because it produces false-positive bands, is poorly reproducible, and shows dominant inheritance, RAPD is a less preferred tool for genome-wide analysis. The AFLP technique was developed to overcome the drawback of reproducibility in the RAPD technique. It is also a PCR-based technique that combines elements of both RFLP and RAPD. The selective amplified microsatellite polymorphic locus is a variant of AFLP that uses an AFLP primer that is complementary to microsatellite sequences. The ISSRs are DNA fragments of roughly 100-3000 bp in size that are flanked by inversely placed microsatellite regions; primers designed from the microsatellites amplify the inter-simple sequence DNA between them. Primers with overlapping flanking regions avoid self-priming and smear formation. SSR or microsatellite markers are tandem repeats of short DNA motifs that occur in coding or noncoding regions and are present throughout the whole genome. The repeats are created during DNA replication because of strand slippage.
The frequency of slippage mutations depends on the length of the repeat unit (di-, tri-, tetra-, or pentanucleotide), the number of repeats, and the nucleotide composition of the microsatellite [87]. SSRs can be screened through various online sequence databases, and primers can be designed. Due to their high variability, SSRs are preferred in distinguishing closely related cultivars [80]. SSRs are highly informative, codominant, multi-allelic, and reproducible, but they are transferable mainly among closely related species, and their development is expensive and time-consuming [88]. The availability of many genomic sequences of various species makes it easier for researchers to look for more specific and transferable SSRs. A thorough search across species for gene-based SSRs rather than random SSRs is required to select transferable SSR markers. In this case, the EST-SSR databases are an excellent alternative for the SSRs of many species [89]. SNP denotes the change in a single base in a DNA sequence. SNPs are the most widely distributed molecular markers throughout genomes [90]. Their frequency of occurrence in the genome varies among species; for example, one SNP is present at every 60-120 bp in maize [91], every 268 bp in rice [92], and every 185-266 bp in soybean genomes [93]. SNPs can be present in the coding or noncoding regions of a genome. An SNP in a coding region changes the DNA sequence and, thus, may alter the amino acid sequence of the encoded protein. SNPs can be the primary choice for many genetic studies due to their advantages, such as flexibility, a reduced error rate, and being less time-consuming [94]. Although SNPs are highly abundant and codominantly inherited, SNP genotyping has been less favoured because it is expensive and requires specialized expertise. However, the recent development of fully automated high-throughput SNP genotyping platforms has dramatically reduced the costs and time of developing plant breeding schemes [94].
Moreover, SNP markers can be easily converted to universal genotype information by integrating different SNP platforms. Many approaches were employed in the discovery of SNPs, including hetero-duplex analysis [95] and next-generation sequencing (NGS; [96]). NGS technologies and the availability of a reference genome sequence for many plant species allowed the implementation of several methods for SNP discovery. The availability of NGS and EST libraries made it possible to evaluate genetic diversity at the DNA sequence level. A genetic association map of an interspecific diploid blueberry population was constructed with the help of SNP and other primer systems [97]. To make the right choice of marker-based technology for a given study material and application, one should consider factors such as availability, advantages and disadvantages, and cost. Table 2 [83,98] summarizes the comparison of different molecular markers.
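Screening a sequence for microsatellites (SSRs), as described above for database sequences, can be sketched with a regular expression. This is a minimal illustration, not the procedure of any cited study: the function name `find_ssrs`, the threshold of four repeats, and the test sequence are all assumptions, and dedicated SSR-mining tools apply more refined rules (for example, different minimum repeat numbers per motif length).

```python
import re


def find_ssrs(sequence, min_repeats=4):
    """Find perfect tandem repeats of 2-5 bp units.

    Returns a list of (start_index, repeat_unit, copy_number) tuples.
    min_repeats is an assumed threshold, not a standard value.
    """
    # A 2-5 bp unit (shortest unit tried first) followed by at least
    # (min_repeats - 1) further copies of the same unit.
    pattern = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip mononucleotide runs such as AAAA matched as (AA)n
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Made-up sequence containing an (AG)5 and an (ATG)4 repeat.
example = "TTGACAGAGAGAGAGTTCCATGATGATGATGCC"
for start, unit, copies in find_ssrs(example):
    print(f"position {start}: ({unit}){copies}")
```

Primers would then be designed in the unique sequence flanking each hit, so that length differences in the repeat show up as different amplicon sizes.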

Genetic Diversity in Blueberries
A number of DNA-based genetic markers were used to study the biodiversity and relationships in blueberries (Table 3). The studies before 2007 on the utilisation of DNA markers in genetic diversity analysis in blueberry species were reviewed by Debnath et al. [79]. RFLP was used to examine variations and organelle inheritance in NHB blueberry cultivars and V. ashei using chloroplast and mitochondrial DNA, where DNA was cleaved using 23 restriction enzymes [4,13]. Identical chloroplast DNA fragment patterns were displayed in all species and genotypes, but a high degree of polymorphism was observed in the mitochondrial genomes; "Bluecrop" and "Jersey" did not appear to have the "Rubel" cytoplasm as previously believed.
Table 3. Examples of different types of molecular markers used in blueberries to analyse their genetic diversity.
Many researchers used RAPD to assess the genetic relationship among blueberry populations, cultivars, clones, and wild accessions. RAPD was used to explore the extent of genetic variation or closeness among RE blueberry cultivars and wild selections, and a reduced genetic distance was found between improved cultivars compared to wild accessions [79]. In a recent genetic diversity study among 45 blueberry cultivars, 210 polymorphic bands were observed using RAPD markers [105]. The cluster analysis divided them into two main clusters with a similarity value of 0.65. Cluster I consisted of four RE blueberry cultivars (Alapaha, Pink Lemonade, Titan, and Vernon) and an NHB cultivar, Ashworth. In comparison, Cluster II comprised 31 NHB cultivars, 8 SHB blueberry cultivars, and the HH blueberry cultivar Northland. Carvalho et al. [104] evaluated the genetic similarity of highbush blueberry cultivars using RAPD and ISSR markers. They observed 78% polymorphism among the cultivars when using fruit DNA and 84% when using leaf DNA. Dendrogram and principal coordinate analysis clearly divided the two types of highbush blueberry cultivars (northern and southern) into distinct groups.
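Percent polymorphism figures such as those reported above are commonly derived from a binary band matrix, in which each genotype is scored for the presence (1) or absence (0) of each band. The sketch below assumes that convention; the function name and the small matrix are hypothetical and do not reproduce data from the cited studies.

```python
def percent_polymorphic(band_matrix):
    """Percentage of band positions that are polymorphic.

    band_matrix: list of rows, one per genotype; each row is a list of
    0/1 scores, one per band position. A band is counted as polymorphic
    when it is present in some genotypes and absent in others.
    """
    if not band_matrix or not band_matrix[0]:
        raise ValueError("band matrix must contain at least one genotype and one band")
    n_bands = len(band_matrix[0])
    if any(len(row) != n_bands for row in band_matrix):
        raise ValueError("all genotypes must be scored for the same bands")
    polymorphic = sum(
        1 for column in zip(*band_matrix) if 0 < sum(column) < len(column)
    )
    return 100.0 * polymorphic / n_bands


# Hypothetical scores: 3 genotypes x 5 band positions.
scores = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0],
]
print(percent_polymorphic(scores))  # bands 2, 3, and 5 vary among genotypes
```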
AFLP techniques have been used in Vaccinium spp. for several decades, as reviewed by Debnath et al. [79]. Yang et al. [123] used AFLP markers to examine the genetic relationship between the founder population and the recently developed (24 years) population of animal-dispersed Vaccinium membranaceum (black huckleberry) on volcanic deposits at Mount St. Helens (Washington, USA). They observed genetic variation within new and source populations and no strong founder effect in the new population. The genetic diversity in the newly founded population was higher than in several source regions. Podwyszynska et al. [106] selected five diverse V. myrtillus genotypes from 21 accessions collected at the National Institute of Horticultural Research in Skierniewice, Poland, using AFLP-PCR. These genotypes were obtained from the Polish locations of Bolimów Landscape Park, Budy Grabskie, and forest complex Zwierzyniec (Łódź Province) and from Norwegian habitats. The UPGMA analysis using Jaccard similarity indexes showed that the accessions were clustered into two major groups: one with Norwegian accessions and the other with accessions from Poland. The Polish accessions were further classified into two distinct sub-groups: one with accessions from Zwierzyniec and the other with accessions from Budy Grabskie, around 9 km away from Zwierzyniec.
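The combination of Jaccard similarity and UPGMA (average-linkage) clustering used in such analyses can be sketched as follows. This is a simplified illustration under stated assumptions: band profiles are binary presence/absence vectors, distance is taken as 1 minus the Jaccard similarity, and clustering stops when a chosen number of groups remains instead of building a full dendrogram. The accession names and profiles are invented, and published analyses normally rely on dedicated software.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Shared bands divided by bands present in at least one profile."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


def upgma_groups(profiles, n_groups=2):
    """Average-linkage (UPGMA) clustering on 1 - Jaccard similarity.

    profiles: dict mapping accession name -> list of 0/1 band scores.
    Returns a list of frozensets of accession names (the final groups).
    """
    def leaf_distance(a, b):
        return 1.0 - jaccard_similarity(profiles[a], profiles[b])

    def average_distance(c1, c2):
        # UPGMA: mean of all pairwise distances between members of two clusters.
        total = sum(leaf_distance(a, b) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [frozenset([name]) for name in profiles]
    while len(clusters) > n_groups:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical band profiles for five made-up accessions.
profiles = {
    "N1": [1, 1, 1, 0, 0, 0, 1, 0],
    "N2": [1, 1, 1, 0, 0, 0, 0, 0],
    "P1": [0, 0, 0, 1, 1, 1, 0, 1],
    "P2": [0, 0, 0, 1, 1, 1, 0, 0],
    "P3": [0, 1, 0, 1, 1, 0, 0, 1],
}
for group in upgma_groups(profiles, n_groups=2):
    print(sorted(group))
```

In this toy data set, the two "N" profiles share most of their bands with each other and almost none with the "P" profiles, so they end up in separate groups.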
ISSR markers were used to analyse genetic diversity among NHB, SHB, RE, and LB blueberry cultivars and wild clones. Six ISSR markers were used by Garriga et al. [107] in 10 highbush blueberry cultivars, which found 80% polymorphism among the cultivars. ISSR analysis exhibited higher polymorphism than the RAPD analysis among the cultivars and blueberry types [107]. Thirteen ISSR primers were found to identify a wide genetic variation in forty-three wild LB blueberry clones collected from four Canadian provinces and in a LB blueberry cultivar, "Fundy" [108].
SSR markers were used to differentiate blueberry accessions and assess genetic diversity in cultivated and wild highbush blueberries. Using 21 single-locus SSR markers, genetic relationships and the effects of hybridization on genetic diversity were studied in 68 genotypes of SHB blueberry by Brevis et al. [110]. They reported that SHB blueberry cultivars exhibited similar levels of molecular relatedness to NHB blueberry cultivars. Similar studies were conducted to assess genetic diversity and population structure [2]. Bassil et al. [111] identified 67 diploid individuals from V. elliottii, V. fuscatum, and V. darrowii accessions using 14 V. corymbosum cultivar "Bluecrop" SSRs. They also used a 5-SSR fingerprinting set of tri-nucleotide-containing SSRs, compared the fingerprints of blueberry cultivars, and extended the SSR set with another five SSRs to produce a 10-SSR fingerprinting set [112]. Vega-Polo et al. [113] used 16 species-specific SSR markers to characterize 100 Andean blueberry mortiño (V. floribundum Kunth) individuals collected from 27 sites in Ecuador and showed a high degree of genetic diversity (HE = 0.73) for the Ecuadorian mortiño. Population structure analysis indicated the presence of distinct genetic clusters among the southern, central, and northern highlands. Moreover, another cluster was also clearly differentiated, which included individuals from higher elevations in all locations.
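Expected heterozygosity (HE), the diversity statistic quoted above, is commonly computed per locus as 1 minus the sum of squared allele frequencies and then averaged over loci. The sketch below assumes that basic form without a small-sample correction; the exact estimator used in the cited study is not stated here, and the allele sizes are invented.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Gene diversity at one locus: 1 - sum(p_i ** 2) over allele frequencies."""
    if not alleles:
        raise ValueError("no alleles scored at this locus")
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def mean_expected_heterozygosity(loci):
    """Average the per-locus values over a dict of locus name -> allele list."""
    values = [expected_heterozygosity(alleles) for alleles in loci.values()]
    return sum(values) / len(values)


# Hypothetical SSR allele sizes in bp (NOT data from the cited study).
loci = {
    "ssr_1": [180, 180, 184, 184, 188, 188, 192, 192],  # four equally common alleles
    "ssr_2": [210, 210, 210, 210, 214, 214, 214, 214],  # two equally common alleles
}
he = mean_expected_heterozygosity(loci)
print(f"mean HE = {he:.3f}")
```

A locus with many alleles at similar frequencies gives a value approaching 1, while a locus fixed for a single allele gives 0.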
EST-PCR and EST-SSR markers were used for genetic fingerprinting and studying relationships among highbush, RE, and LB blueberry cultivars and selections. The EST-PCR primers designed from highbush blueberries were used in LB [115] and RE blueberries [116]. These studies observed genetic variation among the genotypes that was in agreement with the pedigree information. EST-PCR markers distinguished even closely spaced V. angustifolium clones within the same field. A similar study used 40 EST-SSR primers to differentiate 30 blueberry cultivars [117]. Using 249 EST-PCR markers, Rowland et al. [118] studied the phylogenetic relationships among 50 accessions of the Cyanococcus section of the genus Vaccinium. Beers et al. [119] evaluated the genetic diversity of LB blueberry across its geographic range in the eastern United States using EST-PCR markers. Most of the genetic diversity (75%) was found within populations, and each of the 17 populations was genetically unique except those of Jonesboro and Lubec in Maine. They also investigated the effects of management for commercial fruit harvesting on genetic diversity in four locations in Maine. Significant differences between the four paired managed/non-managed populations were reported. The authors proposed that commercial management for fruit production influenced the diversity of LB blueberries in the landscape, even though the plants grew naturally.
The use of more than one marker type is more effective than only one marker type in estimating genetic diversity and assessing the usefulness of DNA markers in germplasm conservation and plant breeding. A study involving 28 wild blueberry clones, 6 HH cultivars, and 2 selections of highbush blueberries used 10 EST-PCR and 2 EST-SSR primers to investigate the genetic structure and diversity [120]. Structured diversity and genetic relatedness were also investigated in 56 LB blueberry wild clones and 1 selection as well as in 6 HH, highbush, and LB blueberry cultivars, using 11 EST-PCR, 7 EST-SSR, and 2 genomic SSR markers [121]. A similar study was conducted by Bhatt and Debnath [63] to characterize a set of blueberry wild clones, cultivars, and hybrids at the molecular level by EST-PCR, EST-SSR, and genomic SSR markers.
Gailīte et al. [122] worked with EST-SSR and cpSSR (chloroplast SSR) markers to investigate the population structure and genetic diversity of V. myrtillus, V. uliginosum, and V. vitis-idaea. Wild Vaccinium populations were reasonably well differentiated genetically; some populations were strongly differentiated, but without any higher-order group clustering. These results indicated the absence of dispersal barriers for these Vaccinium species within the Baltic countries. The genetic diversity of populations growing in managed forests, protected areas, and public areas intensively used for recreation was similar. Using microsatellite loci, Rodríguez-Peña et al. [124] compared the reproductive biology and population genetic structure of V. ekmanii and V. racemosum in the Dominican Republic and evaluated the influence of anthropogenic activities on genetic diversity and pollinators. Pollinator exclusion experiments showed that both species were predominantly outcrossing. The authors also reported that anthropogenic activities were associated with smaller population size, lower genetic diversity, and lower native pollinator frequency. Populations of both species exhibited low to moderate degrees of genetic diversity and differentiation. The authors concluded that habitat degradation had a harmful effect on genetic diversity and pollinators.
Rowland et al. [118] used 249 EST-PCR markers to study phylogenetic relationships among 50 accessions of different blueberry species and observed that tetraploid V. corymbosum was most closely related to the two diploids, V. caesariense and V. fuscatum, followed by another diploid, V. elliottii. On the other hand, the LB blueberry tetraploid V. angustifolium was closely related to two diploids, V. myrtilloides and V. boreale, while the hexaploid V. virgatum had the most closeness with the diploid V. tenellum. These findings provided valuable information on the origins of these blueberry species.

Relationship between Biochemical and Molecular Analysis
Few reports compare genetic diversity based on molecular markers with that based on antioxidant properties. Wang et al. [64] estimated phenolic compounds in the leaves of 104 blueberry cultivars and grouped them into three distinct groups using hierarchical cluster analysis. The analysis was able to distinguish RE blueberry cultivars from most of the highbush blueberry cultivars, as all RE blueberry cultivars were grouped in Cluster III, while most NHB cultivars were gathered together in Cluster I. However, SHB blueberry cultivars were spread over the three groups. The authors identified three RE blueberry cultivars ("Vernon", "Britewell" and "T-172", "Festival") as good sources of antioxidants [64]. Phytochemicals were grouped by cluster analysis according to their functional structure (e.g., flavanols, flavonols, anthocyanins, and phenolic acids), and diploid, tetraploid, and hexaploid blueberry accessions were separated by multivariate analysis of the traits [72]. Giordani et al. [125] used the RAPD technique to evaluate the genetic diversity of 42 individual V. myrtillus plants obtained from various places in the Tuscan Apennines (Italy), which represent the southernmost producing latitude for this plant. The total anthocyanins, polyphenols, and radical scavenging activity were also estimated in two consecutive harvesting years on samples from the same plants that were characterised by molecular analysis. RAPD analysis showed a highly prevalent gamic propagation of V. myrtillus in the investigated area, whereas studies on populations of Central and Northern Europe reported a clonal-to-gamic gradient in the propagation strategy from North Europe to South Europe. For biochemical data, variations were occasionally observed among closely located individuals. The authors ascribed such variations mainly to genetics, since individuals growing in nearby areas displayed different biochemical characteristics. However, RAPD clustering by molecular analysis did not display any association with biochemical diversity. The authors stated that strong variations in climatic conditions were accountable for the significant variability in the biochemical content found between the two harvesting seasons [125]. Goyali et al. [61] reported that the total phenolic, flavonoid, anthocyanin, and proanthocyanidin contents and antioxidant activity in LB blueberries were significantly different between two growing years. They suggested that the variation in the synthesis of phenolic compounds in blueberries was attributable to variation in environmental factors such as light, temperature, humidity, and precipitation. Environmental stresses such as low-light conditions and nutrient concentration in growing media increased the activity of the phenylalanine ammonia lyase enzyme, which is a crucial regulatory factor of the phenol metabolic pathway. The total phenolics and anthocyanins could be elevated in field-grown ripe blueberries and red leaves by applying stress-inducing growth regulators [71,126]. The effects of the production year and location on the phytochemical content and antioxidant activity are dominant and genotype-specific.
Bhatt and Debnath [63] studied wild blueberry clones, cultivars, and hybrids to assess genetic diversity in antioxidant activity and total flavonoid and phenolic contents, using EST-SSR, genomic SSR, and EST-PCR markers. Their association study in 70 blueberry genotypes identified 17 EST-SSR, genomic SSR, and EST-PCR markers that were linked with antioxidant properties [63]. Although they reported a wide diversity among the genotypes for antioxidant properties and diversity indices (Shannon's index and expected heterozygosity), along with distinct grouping by STRUCTURE, principal coordinate, and neighbour-joining analyses, the groupings based on biochemical data and on molecular analysis did not coincide, indicating that the loci conferring the antioxidant properties are randomly distributed in the blueberry genome [63]. Similar results were reported by Debnath and Sion [75] in lingonberry, Debnath and Ricard [127] in strawberry, and Debnath and An [128] in cranberry. The lack of agreement between genetic and biochemical clustering likely reflects differences in genomic coverage in berry crops: molecular markers sample the whole genome, and most of them lie in regions that are not expressed. Because noncoding genomic regions are not expressed at the phenotypic level, this might be the reason for the discrepancy between the chemical and molecular diversities [63,79].
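One common way to check whether two groupings of the same genotypes agree is to correlate their pairwise distance matrices with a permutation (Mantel-type) test. The Python sketch below is offered only as an illustration of that idea; it is not stated in the text above that any cited study used this test, and both distance matrices (`d_mol`, `d_chem`) are hypothetical.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def mantel(d1, d2, permutations=999, seed=0):
    """Correlate two symmetric distance matrices over the same genotypes.

    Returns (r, p), where p is a one-sided permutation p-value obtained by
    shuffling the genotype labels of the second matrix.
    """
    n = len(d1)
    pairs = list(combinations(range(n), 2))
    x = [d1[i][j] for i, j in pairs]
    observed = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [d2[order[i]][order[j]] for i, j in pairs]
        if pearson(x, permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distances among four genotypes.
d_mol = [   # marker-based distances
    [0.0, 0.2, 0.8, 0.7],
    [0.2, 0.0, 0.9, 0.75],
    [0.8, 0.9, 0.0, 0.2],
    [0.7, 0.75, 0.2, 0.0],
]
d_chem = [  # distances based on antioxidant traits
    [0.0, 0.5, 0.3, 0.6],
    [0.5, 0.0, 0.6, 0.2],
    [0.3, 0.6, 0.0, 0.5],
    [0.6, 0.2, 0.5, 0.0],
]
r, p = mantel(d_mol, d_chem)
print(round(r, 3), p)
```

A correlation near zero, or a large p-value, would be consistent with molecular and biochemical groupings that do not coincide; with only four genotypes, as here, such a test has very little power and serves purely as a demonstration.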

Molecular Markers as Mapping Approaches in Blueberry
DNA markers located close to important genes of a trait can be used to select breeding progeny or related species for that trait through a marker-assisted selection program. The markers linked to traits and genetic maps could be used to track the introgression of beneficial genes, such as disease-resistance genes, into disease-susceptible blueberries. Gene isolation using the information available on mapping, known as map-based gene cloning, could direct the transfer of important genes into other blueberries lacking these characteristics through genetic engineering methods. Rowland et al. [97] constructed a genetic linkage map in blueberry using an interspecific diploid population ((V. darrowii Fla4B × V. corymbosum W85-20) F1 #10 × V. corymbosum W85-23) designed to segregate for chilling requirement and cold hardiness. The map comprised 12 linkage groups (corresponding to the haploid chromosome number of the diploid species), with a total distance of 1740 cM. The map included 265 markers of different types, such as SSR, EST-PCR, SNP, and RAPD. The map coverage was 89.9%, and the average distance between markers was 7.2 cM. The mapping population was evaluated for bud cold hardiness and chilling requirement in mid-winter under controlled conditions for 2 and 3 years, respectively. The broad-sense heritability of both chilling requirement and cold hardiness was relatively high under these conditions. One quantitative trait locus (QTL) for cold hardiness and two for chilling requirement were identified in the diploid blueberry population. Rowland et al. [129] evaluated a diploid blueberry mapping population for fruit development and fruit quality traits (weight, diameter, colour, scar, firmness, flavour, and soluble solids) and found that the traits were segregating and that most of them were normally distributed in the population. Many development traits, such as the timing of shoot expansion, early bloom, and full bloom, were correlated with each other. Among the fruit quality traits, weight, for example, was highly correlated with diameter, and the traits showed significant variation across genotypes and years with moderate to high heritability.

Quantitative Trait Loci (QTL) and Genome-Wide Association (GWA) Mapping for Blueberry Traits
GWA mapping assists in identifying the causal polymorphisms and molecular markers associated with fruit-related traits in several plant species. This analysis can accelerate cultivar development via conventional breeding in perennial polyploid species, such as blueberries. Ferrão et al. [130] applied GWA mapping analysis in highbush blueberries to compare the effects of diploid and tetraploid marker models on fruit-related traits. They reported that fewer SNPs were obtained under the diploid model than under the tetraploid model, with a high degree of overlap (95%) between them. Under the tetraploid model, they detected 15 SNPs significantly associated with five fruit-related traits, whereas 7 SNPs were significantly associated with two traits under the diploid model; two SNPs were common to both. The linkage disequilibrium decay presented a significant correlation between markers 73 kb apart for the diploid model and 80 kb apart for the tetraploid model. The heterozygosity estimated under the diploid model (0.34) was lower than that under the tetraploid model (0.42), which indicated that diploid standardisation might cause an underestimation of the heterozygosity rates. Cappai et al. [131] used high-resolution linkage mapping and QTL analyses to understand the genetic architecture of the traits related to the commercial harvestability of blueberries. They constructed a map for autotetraploid low-chill highbush blueberry containing 11,292 SNP markers and performed QTL analyses in 2-year field trials. They identified significant QTL peaks for fruit detachment force and firmness retention after cold storage. The QTLs explained low to moderate proportions of the phenotypic variance, which suggested the quantitative nature of these traits. Qi et al. [132] applied sequencing approaches for genotyping in diploid blueberries and constructed a map of 12 linkage groups comprising 17,486 SNP markers, spanning a total genetic distance of 1539.4 cM. Among the 18 horticultural traits phenotyped in that population, QTLs that were significant over at least two years were identified for chilling requirement, cold hardiness, and the fruit quality traits of colour, scar size, and firmness. Ferrão et al. [133] applied GWA mapping to volatile organic compounds in a blueberry population to elucidate their genetic architecture, and tested predictive models to determine whether volatile organic compounds could be accurately predicted using genomic information. By integrating genomic, metabolomic, and sensory panel data, they demonstrated that volatile organic compounds were controlled by a few major genomic regions, some of which harboured biosynthetic enzyme-coding genes, and could be accurately predicted using molecular markers.
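Linkage disequilibrium decay of the kind mentioned above is usually summarised by pairing an LD statistic with the physical distance between markers. The Python sketch below assumes that r² is computed as the squared Pearson correlation between allele dosages (0 to 2 for diploid calls, 0 to 4 for tetraploid calls), which is one common estimator and not necessarily the one used in the cited study; the positions and dosage vectors are hypothetical.

```python
from itertools import combinations


def r_squared(dos_a, dos_b):
    """Squared Pearson correlation between allele dosages at two SNPs."""
    n = len(dos_a)
    ma, mb = sum(dos_a) / n, sum(dos_b) / n
    cov = sum((a - ma) * (b - mb) for a, b in zip(dos_a, dos_b))
    va = sum((a - ma) ** 2 for a in dos_a)
    vb = sum((b - mb) ** 2 for b in dos_b)
    if va == 0 or vb == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (va * vb)


def ld_by_distance(snps):
    """snps: list of (position_bp, dosages). Returns sorted (distance, r2) pairs."""
    return sorted(
        (abs(p1 - p2), r_squared(d1, d2))
        for (p1, d1), (p2, d2) in combinations(snps, 2)
    )


# Hypothetical tetraploid dosages (0-4 copies of the alternate allele)
# for six individuals at three SNPs on the same chromosome.
snps = [
    (1_000, [0, 1, 2, 3, 4, 2]),
    (31_000, [0, 1, 2, 3, 4, 1]),
    (90_000, [2, 2, 0, 4, 1, 3]),
]
for distance, r2 in ld_by_distance(snps):
    print(distance, round(r2, 3))
```

Plotting such (distance, r²) pairs for many SNP pairs and fitting a smooth curve gives the decay profile from which a distance threshold can be read.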
Kulkarni et al. [134] performed admixture and genetic analysis for NHB and SHB genotypes and BNJ16-5 progenies (V. corymbosum × V. darrowii) using genotyping-by-sequencing (GBS) to examine genetic relatedness and parental lineage introgression in blueberries. They successfully mapped ~2.8 million reads and identified 2,244,039 SNPs from 334 million GBS reads (75 bp) by aligning those to the V. corymbosum cv. Draper v1.0 reference genome sequence. The blueberry genotypes formed three major clusters on principal component analysis: (a) NHB cultivars, (b) SHB cultivars, and (c) BNJ16-5 progenies. The overall fixation index and nucleotide diversity indicated that the NHB and SHB cultivars had wide genetic differentiation, and haplotype analysis revealed that the SHB cultivars were more genetically diverse than the NHB cultivars. Admixture analysis identified the introgression of several parental genomic lineages into the BNJ16-5 progenies [134]. Similarly, Manzanero et al. [135] investigated genomic and evolutionary relationships among 195 blueberry accessions from five species, V. corymbosum, V. boreale, V. darrowii, V. myrsinites, and V. tenellum, using GBS. GBS generated ~751 million raw reads, of which ~80% were mapped to the reference genome V. corymbosum cv. Draper v1.0. Principal component (PC) analysis separated blueberry accessions into three main clusters, in which the first two PCs accounted for ~29% of the total genetic variance. The nucleotide diversity was highest for V. tenellum and V. boreale and lowest for V. darrowii. The authors identified species boundaries in blueberry accessions and indicated that V. boreale was a genetically distant outgroup, while V. darrowii, V. myrsinites, and V. tenellum were closely related.
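Nucleotide diversity, one of the statistics reported above, is the average number of differences per site between all pairs of sequences in a sample. The Python sketch below is a simplified illustration on hypothetical aligned haploid sequences; GBS pipelines work from SNP calls and handle missing data and ploidy, which this example ignores.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site across aligned, equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical aligned sequences from four accessions.
seqs = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGAACGTAC",
]
pi = nucleotide_diversity(seqs)
print(pi)
```

Computing this value separately for each species or cultivar group allows the kind of between-group comparison of diversity described in the paragraph above.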
Whole-genome assembly is a very useful technology to develop genetic markers for diversity analysis in Vaccinium species. Cui et al. [136] used Oxford Nanopore (ONT) and high-throughput chromatin conformation capture (Hi-C) technologies to analyse the relationship between V. darrowii and other Vaccinium species. They assembled the V. darrowii genome into 12 pseudochromosomes and predicted 41,815 genes using RNA-sequencing evidence. Syntenic analysis across three Vaccinium species revealed a highly conserved genome structure, with the highest collinearity between V. darrowii and V. corymbosum. Yu et al. [137] also generated a chromosome-scale genome assembly of V. darrowii with a total length of 1.06 gigabases (Gb) by using a combination of PacBio sequencing and Hi-C scaffolding technologies. Over 98% of the genome sequences were scaffolded into 24 chromosomes representing the two haplotypes. The primary haplotype assembly of V. darrowii contains 34,809 protein-coding genes. Comparison to a V. corymbosum haplotype assembly revealed high collinearity between the two genomes, with small intrachromosomal rearrangements in eight chromosome pairs. Based on the high-quality reference genome and high compatibility in hybridisation with other species, V. darrowii was proposed as a significant new resource for assessing the evergreen blueberry species [136,137]. Wu et al. [138] used similar genomic analysis technology in an undomesticated wild bilberry species (V. myrtillus) to determine the relationship of wild bilberry with domesticated Vaccinium blueberries and cranberries (V. macrocarpon). Their genome assembly contained 12 pseudochromosomes, represented ~97% complete Benchmarking Universal Single-Copy Orthologs (BUSCO) genes, and showed a high conservation of synteny against the blueberry genome. K-mer analysis suggested that the sequenced samples might have come from two individuals. The authors identified the key regulatory MYB genes that determined anthocyanin production. The high conservation of synteny between bilberry and blueberry genomes confirmed that comparative genome mapping could be applied to transfer marker-trait association knowledge between these two species.
Miao et al. [139] applied high-throughput sequencing technology to sequence and assemble the whole chloroplast genome of the SHB blueberry variety Sharpblue. The complete chloroplast genome included two inverted repeat regions (31,076 bp and 3044 bp) and contained a total of 144 functional genes. Phylogenetic analysis clustered V. corymbosum and V. oldhamii into one group, which indicated that these two species had a close evolutionary relationship. Nagasaka et al. [140] performed a genome-wide association study (GWAS) to analyse genetic variation in SHB blueberry genotypes for their phenological traits, including chilling requirement, flowering date, ripening date, fruit development period, and continuous flowering. The detection of robust phenotype-genotype association peaks indicated high heritability of the phenology-related traits of blueberry. A comparison of the genotypes at the GWAS peaks between the NHB and SHB genotypes revealed the putative introgression of low-chill and late-flowering alleles into the highbush genetic pool.
Nishiyama et al. [141] genotyped 132 polyploid NHB, SHB, and RE blueberries, based on double-digest restriction site-associated DNA sequencing. The genome-wide SNP data indicated that RE blueberry cultivars were genetically distinct from NHB and SHB cultivars, whereas NHB and SHB blueberries were genetically indistinguishable. The genotype data implied that there are no or very few genomic segments that were commonly introgressed from low-chill Vaccinium species to the SHB genome. A few loci associated with a variable could partially differentiate the SHB genome from the NHB genome.

Conclusions
Increased awareness of blueberries' health benefits has increased demand, leading to commercial cultivation. However, only selected high-value species have been cultivated to meet farmers' demand, while other species have been ignored. Wild blueberries have been observed to have higher genetic diversity than improved cultivars. Heterogeneity is declining as a result of selective breeding in wild clones of SHB blueberries as well as cultivated highbush blueberries [110]. With the help of genetic diversity and mapping technology, the level of diversity between cultivated and wild blueberries can be determined, and efforts can also be made to introduce novel genetic diversity. Measuring antioxidant properties to explore biochemical and molecular diversity in blueberry germplasm, and identifying promising genotypes that contain high levels of bioactive components with wide diversity, are of prime significance for improving the functional quality of blueberries. Functional fruit quality, including the synthesis of bioactive compounds such as anthocyanins, flavanols, flavonols, and phenolic acids, is strongly controlled by genetic factors [72]. Crossing between selected genotypes is expected to develop new cultivars combining superior health-promoting bioactive components with diverse adaptability under changing environments. Antioxidant content is related to fruit weight and volume. Fruit size, for which weight or volume can serve as a proxy, was negatively associated with most of the phytochemical contents, such as the flavonoids and phenolic acids, in tetraploid blueberries. However, large-fruited cultivated accessions with high anthocyanin content were identified by Mengist et al. [72]. This result suggests that metabolite concentrations and fruit size can be simultaneously improved to a certain degree. In these blueberry accessions, other size-independent variations were significantly prominent, which may direct breeders to explore other fruit quality-related factors, such as the genes involved in the biosynthetic pathway controlling the anthocyanin content and profile (e.g., acylated vs. non-acylated anthocyanin). The combined study of the transcriptome and metabolome at the different developmental stages of blueberry can be used to determine the regulatory network between genes and metabolites, providing a more comprehensive and accurate molecular basis for the pathways of flavonoid and anthocyanin synthesis, which helps in determining the genetic diversity in blueberry species [142]. Diversity in antioxidant metabolite concentration and composition has important implications for plant breeders to develop cultivars with high antioxidant capacity.
Relatively few efforts have been made in this direction, so conservation requires more attention. The system of Agriculture and Agri-Food Canada represents Canada's commitment to the Canadian biodiversity strategy, in response to the Convention on Biological Diversity [4]. The Canadian Clonal Genebank, introduced in 1989 and located in Harrow, Ontario, is responsible for the conservation, characterisation, virus indexing, and distribution of trees and small fruit [4]. A berry improvement program was established in 1999 to develop blueberry cultivars at the St. John's Research and Development Centre (SJRDC) of Agriculture and Agri-Food Canada in Newfoundland and Labrador [4]. As a part of this program, wild blueberry clones collected from various parts of North America are maintained at SJRDC for further breeding programs to generate better-quality blueberry plants.
Author Contributions: Conceptualization, writing, reviewing, and editing by S.C.D.; writing and draft preparation by D.B.; writing, reviewing, and editing by J.C.G. The authors declare that the content of this paper has not been published or submitted for publication elsewhere. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Original data are the property of Agriculture and Agri-Food Canada.