Review

Pangenomes as a Resource to Accelerate Breeding of Under-Utilised Crop Species

by Cassandria Geraldine Tay Fernandez, Benjamin John Nestor, Monica Furaste Danilevicz, Mitchell Gill, Jakob Petereit, Philipp Emanuel Bayer, Patrick Michael Finnegan, Jacqueline Batley and David Edwards *
School of Biological Sciences, The University of Western Australia, Perth, WA 6009, Australia
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2022, 23(5), 2671; https://doi.org/10.3390/ijms23052671
Submission received: 1 February 2022 / Revised: 21 February 2022 / Accepted: 21 February 2022 / Published: 28 February 2022
(This article belongs to the Special Issue Functional Genomics for Plant Breeding 2.0)

Abstract

Pangenomes are a rich resource to examine the genomic variation observed within a species or genus, supporting population genetics studies, with applications for the improvement of crop traits. Major crop species such as maize (Zea mays), rice (Oryza sativa), Brassica (Brassica spp.), and soybean (Glycine max) have had pangenomes constructed and released, and this has led to the discovery of valuable genes associated with disease resistance and yield components. However, pangenome data are not available for many less prominent crop species that are currently under-utilised. Despite many under-utilised species being important food sources in regional populations, the scarcity of genomic data for these species hinders their improvement. Here, we assess several under-utilised crops and review the pangenome approaches that could be used to build resources for their improvement. Many of these under-utilised crops are cultivated in arid or semi-arid environments, suggesting that novel genes related to drought tolerance may be identified and used for introgression into related major crop species. In addition, we discuss how previously collected data could be used to enrich pangenome functional analysis in genome-wide association studies (GWAS) based on studies in major crops. Considering the technological advances in genome sequencing, pangenome references for under-utilised species are becoming more obtainable, offering the opportunity to identify novel genes related to agro-morphological traits in these species.

1. Introduction

Plant breeders have continually faced the challenge of increasing crop yield, nutrition, and disease resistance as the human population increases, and regions suitable for the production of crops shift with a changing global environment [1,2,3]. The construction of the first reference genome assembly for a crop species, rice (Oryza sativa) in 2002 [4], greatly improved the ability to associate traits with genomic regions, increasing the success of selection for specific traits that increase agronomically beneficial phenotypes. Improving genomic resources for crop species has predominantly focused on a limited number of high-yield, popular species such as wheat (Triticum aestivum) [5], rice (Oryza sativa) [6], maize (Zea mays) [7], barley (Hordeum vulgare) [8], soybean (Glycine max) [9,10], canola (Brassica napus) [11], and sorghum (Sorghum bicolor) [12]. These species are often referred to as major crops due to their extensive use in agriculture systems and high demand as food sources worldwide. The focus of genomic research and trait selection on major crops has led to many minor crops falling behind, limiting the opportunity to diversify the food bowl or discover the genetic basis for valuable traits within these species. Hence, under-utilised crops need investment to support their improvement and characterise traits that can potentially be transferred to major crops [13,14].
Reference genome sequences have recently been assembled for some under-utilised crop species such as yam bean (Pachyrhizus erosus) [15], kenaf (Hibiscus cannabinus) [16] and white fonio (Digitaria exilis) [17]. However, using a single reference leads to bias due to the significant structural variation (SV) observed within a species [18,19,20]. SVs can arise as a consequence of whole-genome duplication and subsequent fragmentation [21,22,23], or tandem and segmental duplication of genomic regions [24]. This duplication and fragmentation can lead to gene copy number variation (CNV) and gene presence/absence variation (PAV). CNV and PAV can also result from insertion of gene copies by transposable elements [25], de novo gene birth [22,26], introgression from closely related species, or horizontal gene transfer [27], which may affect heritable traits. Hence, single reference genomes do not reflect the gene content and diversity of a species, and improvements in the genomic resources over single reference genomes are needed in order to increase the success of genomics-based plant breeding for both major and under-utilised crop species.
Pangenomes are references that capture the genetic diversity of a species rather than a single individual and can reduce reference bias in genomic analysis, allowing more accurate prediction of traits [18,19]. A pangenome contains a core genome (shared among all individuals) and the variable or dispensable genome that is absent from one or more individuals [28]. The idea of a core and variable genome for a species represented by a pangenome was first described by Tettelin et al. in 2005 [28] and later proposed for use in plants by Morgante et al. in 2007 [29]. In 2014, the first plant pangenome was published, representing seven wild soybean (Glycine soja) individuals [30]. This was used to associate genes with the domestication traits of organ size, biomass, seed composition, flowering and maturity time, and disease resistance. Since then, other pangenomes have been constructed, including those representing 3 rice individuals [31], 10 Brassica oleracea individuals [32], 18 bread wheat individuals [20], 54 Brachypodium distachyon individuals [33], 53 canola individuals [34], 5 sesame (Sesamum indicum) individuals [35], 725 tomato (Solanum lycopersicum) individuals [36], 89 pigeon pea (Cajanus cajan) individuals [37], and 1961 cotton (Gossypium spp.) individuals [38] (Table 1). These provide valuable resources for understanding genetic variation in these species [39]. However, there are few pangenomic resources for under-utilised species, which limits the application of genomics to develop improved varieties of these crops. In this review, we examine several under-utilised crop species that lack pangenome resources and discuss the benefits the development of these resources may have for these species as well as the overall benefits to agriculture. The current methods for pangenome construction and trait analyses are also discussed.
This review aims to provide a foundation for further studies to construct pangenomes for under-utilised crop species and improve their traits through plant breeding based on pangenomic analyses.
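The distinction between the core and variable genome described above can be illustrated with a minimal sketch. It assumes that the gene content of each individual is already available as a set of gene identifiers; the individual names and gene names below are hypothetical and are not taken from any of the cited pangenomes.

```python
from typing import Dict, Set, Tuple


def split_core_variable(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, variable) gene sets for a collection of individuals."""
    if not gene_sets:
        return set(), set()
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)        # every gene seen in at least one individual
    core = set.intersection(*all_sets)  # genes present in every individual
    variable = pan - core               # genes absent from one or more individuals
    return core, variable


# Hypothetical gene content for three individuals (illustrative only).
individuals = {
    "line_A": {"g1", "g2", "g3", "g4"},
    "line_B": {"g1", "g2", "g4", "g5"},
    "line_C": {"g1", "g2", "g3"},
}

core, variable = split_core_variable(individuals)
print("core:", sorted(core))
print("variable:", sorted(variable))
```

In practice, gene presence is usually inferred from read coverage or assembly comparison rather than given directly, so the input sets here stand in for the result of that upstream step.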

2. Under-Utilised Species

Many minor crops have yet to benefit from genomics-based breeding methods, despite many being important food sources in specific regions (Table 2). Under-utilised crop species cover a broad range of crop types, including cereal grains, vegetables, tubers, fruits, and crops with industrial uses (Table 3). Here, we describe several promising under-utilised crop species for each crop type and discuss the currently available genomic resources.

2.1. Cereal Grains

Wheat, maize, and rice constitute the major cereal grain crops and are responsible for supplying the majority of the global food requirement. However, these species are sensitive to drought and heat stress, leading to reduced yield or even crop failure in some environments [40]. Several under-utilised cereal crops are adapted to harsh environments and are alternatives to these major crops [41].
Little millet (Panicum sumatrense) is a small millet species native to India (hence its alternative name ‘Indian Millet’) and is primarily grown in semi-arid regions of Asia and Africa. This species requires minimal water and is tolerant of drought and high-salinity soil. However, little millet is only grown in specific regions and few people consume it despite its nutritional benefits of high carbohydrate, dietary fibre, calcium, iron and Vitamin E content [42]. The genomic resources for little millet are limited to the chloroplast genome sequence [43] and a transcriptome assembly [44] that has been used to characterise genes responsible for abiotic stress tolerance. This species lacks both a sequenced genome and a genetic map, limiting further study and genomics-based selection of traits.
White fonio (Digitaria exilis) is a panicoid grass and an under-utilised cereal crop from West Africa that is valued for its grain that is high in dietary fibre and protein [45]. The crop grows in hot, dry and low-fertility environments and requires no fertiliser or irrigation on poor-quality soils. However, white fonio has a low yield and minimal research has been undertaken into breeding to improve traits of this crop [46]. A genome sequence of white fonio has recently been assembled and annotated [17,47] and has been used for sequence-based genotyping [48]. Combining these genetic resources with other panicoid grass resources such as Setaria italica (foxtail millet) [49], Cenchrus americanus (pearl millet) [50], and Panicum miliaceum (proso millet) [51] through pangenomic and comparative genomic strategies may support white fonio research to benefit breeders and consumers [52].

2.2. Vegetable/Pulse Crops

The Vigna genus of legumes has many genetic resources, but few specifically for the under-utilised crop, moth bean (Vigna aconitifolia) [53]. Moth bean is a multipurpose legume that provides hot-season pasture and hay for livestock and seed. This species is the most heat-tolerant crop of the Asian Vigna species and is able to withstand drought conditions. Seeds and young pods of moth bean are suitable for human consumption and have a high vitamin and mineral content. While moth bean domestication is well studied and documented, the genetics of the domestication process is largely unknown. Genetic resources are largely limited to genetic linkage maps that can identify domestication-related traits and QTLs not present in moth bean, but that are present in other Vigna species [54]. These data can be integrated into genomic resources such as a pangenome, enhancing the genetic improvement of moth bean and related Vigna species.
Lablab bean (Hyacinth bean, Lablab purpureus) is a leguminous crop that is commonly grown as a food source due to the seed having high protein content and a comparable nutritional profile to soybean [55]. In addition to being a source of nutrition, lablab bean is used to improve soil fertility as a cover crop and green manure [56]. Lablab bean has a higher drought tolerance compared to other commonly cultivated legumes and is able to grow across a wide range of climate and environmental conditions, withstanding temperatures from 18 °C to 50 °C and annual rainfalls from 200 to 2500 mm [57]. To enhance the production and benefits of lablab bean, new varieties need to be developed, especially those that are tailored to extended drought periods. Studies have largely focused on conventional breeding, but the improvement of polygenic traits such as drought tolerance would benefit from more genomic research, which has so far been limited [58,59]. A draft genome for lablab bean was assembled in 2019 [60] and a chloroplast genome assembly followed in 2021 [61]. Further development of genomic resources through pangenomics would provide tools to improve traits of this species and help it become an important safety net crop against the impact of climate change on legume production.

2.3. Tuberous Crops

The genus Pachyrhizus contains three yam bean species cultivated for their starchy tuberous root, P. erosus, P. ahipa and P. tuberosus. Yam bean is a regionally important crop in Mexico and Southeast Asia where it is eaten as part of many traditional dishes. Yam bean has a high yield and the crop can thrive in humid conditions [46,62]. The P. erosus tuber contains high vitamin C, iron, zinc and potassium [63]. Presently, there is a draft genome assembly of P. erosus [15] and a flow cytometry analysis [64], but P. erosus lacks the pangenome resources that would support studies of its abiotic stress traits for transfer to major legume crops [65].
African arrowroot (Canna edulis) is a tuber crop that originated in Central and South America and is distributed throughout Europe, North America and in tropical regions of the world. The tuber contains large amounts of highly viscous starch, which is often used in cakes, noodles, dye, and animal fodder [66]. African arrowroot is also known for its horticultural use in gardening and for the treatment of industrial wastewaters to remove pollutants such as nitrogenous and phosphorous compounds [67]. African arrowroot has a diverse germplasm with over 1000 hybrids, making genomic studies of the species difficult. Presently, the only genomic resource for African arrowroot is a complete chloroplast genome [68]. Pangenome resources could be used to explore the diversity in gene content and compare genomic structures with related species.

2.4. Industrial Crops

Kenaf (Hibiscus cannabinus) is an annual crop that is cultivated for the bast fibres that are produced on the stem bark of the plant. The species is the third most important source for fibre production after cotton and jute (Corchorus spp.) and it is often used in the production of paper, rope, building materials and as a livestock feed [69]. Kenaf has a high biomass yield and can acclimate to many different climates and soils [69], but little research has been undertaken on this species. A de novo transcriptome of kenaf was assembled in 2015 [69], and a mitochondrial genome sequence was assembled in 2018 [70]. These resources were supplemented in 2020 by a genome assembly, allowing for genes involved in the development of bast fibre and leaf shape to be identified [16]. Further study of the candidate genomic regions for bast fibre yield and quality using pangenomics could expedite the selection of elite traits.
Safflower (Carthamus tinctorius) is a thistle-like plant that is commercially cultivated for the vegetable oil extracted from its seeds. The species is found across Asia, Europe, Australia and the Americas [71], where it is popular due to the high content of linoleic acid and flavonoids, such as hydroxysafflor yellow A, in the oil [72]. Molecular studies have been undertaken in safflower primarily for fatty acid composition and flavonoid biosynthesis. Whole-genome sequencing efforts had been limited to short-read sequencing [71], but more recently, a chromosome-level reference genome assembly [73] was constructed that has allowed for evolutionary analysis of the divergence of safflower and the study of linoleic acid and flavonoid biosynthesis. While this whole-genome reference sequence has aided study into the genetic improvement of safflower, further improvement and understanding of how the Asteraceae family evolved and speciated can be achieved through the construction of pangenomic resources for safflower.

2.5. Fruits

Guava (Psidium guajava) is an important tropical and subtropical fruit of the Myrtaceae family, being the fourth most significant fruit crop in India [74]. The species is a highly sought-after export because it is a rich source of vitamin C, fibres and phytochemicals [75]. However, guava is vulnerable to the guava wilt pathogen Nalanthamala psidii and fruit flies, causing worldwide threats to the stability of guava production. Despite being economically valuable, there are few genomic resources for the species, especially resources that can be used to study the response of guava to biotic and abiotic stresses [76]. Most genomic resources for guava have only emerged in the early 2020s, including a genome assembly [76,77], high throughput and EST-based InDel/SNP markers [76] and a transcriptome assembly [78]. These resources lay the groundwork for improving the agronomic traits of guava by gene mapping and genomic selection that could be expedited through a pangenome.
Ethiopian banana (Ensete ventricosum) is a local crop that contributes to the food security of Ethiopia, providing a staple food source for approximately 20 million people [79]. The Ethiopian banana is an important dietary starch source, is used in the production of fibre, medicine and other industrial products, plays an important role in stabilising soils, and is of cultural importance in Ethiopia [79]. Unlike most under-utilised crop species, pangenomics has been applied to Ethiopian banana, with the species being included in a higher-level pangenome assembled for the Musaceae family [80]. This banana pangenome has allowed the identification of candidate regions for drought resistance, meristem initiation and stress resistance. The continued development of this pangenome will increase its value as a tool for trait improvement, broader diversity studies and evolutionary studies of banana species.

3. Developments in Pangenome Resources to Aid in the Breeding of Under-Utilised Crops

The three main approaches for pangenome construction used across genomic research are de novo assembly and comparison, iterative mapping and assembly, and graph-based assembly (Figure 1). The suitability of each approach depends on several factors, such as organism genome complexity, sequencing data quality and coverage, genetic similarity among individuals used for the pangenome construction and the intended final application of the pangenome. De novo assembly requires the individual genomes to be assembled separately, followed by whole genome comparison [29,30]. The iterative mapping and assembly approach involves mapping reads from different individuals to a starting reference genome, assembling the unmapped reads into novel contigs and then adding the novel contigs to the reference, forming a pangenome [32,34]. The iterative mapping approach and the de novo assembly approach are highly complementary, widely used and have been extensively discussed in other reviews [18,81,82].
Modelling suggests that as few as 10 representative individuals in a pangenome may capture the majority of gene diversity of a species. However, the advantage of increasing the number of individuals is that it permits an assessment of gene content variation across a population, and how this may change with breeding [9]. Recent pangenome studies of major crop species assess data from thousands or tens of thousands of individuals and include high quality chromosome-scale assemblies to further increase trait prediction accuracy [18,36].
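The idea that a modest number of representative individuals can capture most of the gene diversity can be pictured as a growth curve: as individuals are added, the pangenome size rises and then flattens, while the core genome shrinks. The following is a minimal sketch under the assumption that gene content per individual is already known; the gene sets are hypothetical, and the function name `pangenome_growth` is introduced here for illustration only.

```python
from typing import List, Set, Tuple


def pangenome_growth(gene_sets: List[Set[str]]) -> List[Tuple[int, int, int]]:
    """For each number of individuals added so far, return (n, pan_size, core_size)."""
    curve: List[Tuple[int, int, int]] = []
    pan: Set[str] = set()
    core: Set[str] = set()
    for n, genes in enumerate(gene_sets, start=1):
        pan |= genes                                    # pangenome can only grow
        core = set(genes) if n == 1 else core & genes   # core can only shrink
        curve.append((n, len(pan), len(core)))
    return curve


# Hypothetical gene content, listed in the order individuals are added.
ordered_individuals = [
    {"g1", "g2", "g3", "g4"},
    {"g1", "g2", "g4", "g5"},
    {"g1", "g2", "g3"},
    {"g1", "g2", "g5"},
]

growth = pangenome_growth(ordered_individuals)
for n, pan_size, core_size in growth:
    print(f"n={n} pan={pan_size} core={core_size}")
```

The curve depends on the order in which individuals are added, so a fuller analysis would typically average over many random orderings; that step is omitted here to keep the sketch short.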
Pangenome graphs are a relatively new pangenome construction method that combines the benefits of the iterative mapping and de novo assembly approaches. The method presents variation across multiple genomes as different paths along a graph of sequence or variant nodes. Pangenome graphs are constructed through whole-genome alignment, unassembled read alignment or de novo graph assembly [83,84]. Sequence graphs such as minigraph [85] represent nodes as short sequences, leading to highly complex networks that can present SVs in a manner where they can be compared among closely related species [85,86]. Variation graphs, on the other hand, are a compact form of sequence graph used to present genetic variation across a population [87]. In variation graphs such as vg [88] or MGR [89], SNPs and SVs are represented by nodes and are connected when shared among individuals, allowing representation of large-scale SVs such as inversions and duplications [85,90,91].
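To make the notion of "paths along a graph" concrete, the toy structure below stores short sequences in nodes and records each individual as an ordered walk through those nodes. It is a simplified sketch and not the data model of vg, minigraph or any other tool cited above; all node sequences, node identifiers and individual names are invented for illustration.

```python
from typing import Dict, List, Set

# Each node holds a stretch of sequence. Nodes 2 and 3 are alternative SNP
# alleles; node 5 is an inserted segment standing in for a small SV.
nodes: Dict[int, str] = {
    1: "ACGT",
    2: "A",
    3: "G",
    4: "TTC",
    5: "GGGCAT",
    6: "CA",
}

# Each individual is a path: the ordered list of node ids it traverses.
paths: Dict[str, List[int]] = {
    "ref": [1, 2, 4, 6],
    "line_B": [1, 3, 4, 6],        # carries the alternative SNP allele
    "line_C": [1, 2, 4, 5, 6],     # carries the insertion
}


def spell(path: List[int]) -> str:
    """Reconstruct the linear sequence of an individual from its path."""
    return "".join(nodes[node_id] for node_id in path)


def shared_nodes(all_paths: Dict[str, List[int]]) -> Set[int]:
    """Nodes traversed by every individual (a graph analogue of the core)."""
    return set.intersection(*(set(p) for p in all_paths.values()))


def variable_nodes(all_paths: Dict[str, List[int]]) -> Set[int]:
    """Nodes missing from at least one individual's path."""
    every_node = set().union(*(set(p) for p in all_paths.values()))
    return every_node - shared_nodes(all_paths)


for name, path in paths.items():
    print(name, spell(path))
print("shared:", sorted(shared_nodes(paths)))
print("variable:", sorted(variable_nodes(paths)))
```

Storing individuals as paths means that a SNP and an insertion are handled by the same mechanism, which is one reason graph representations are attractive for structural variation.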
Another type of pangenome graph is the practical haplotype graph (PHG) [92,93], which is a trellis graph representing genic and intergenic regions. PHGs avoid challenges in aligning repetitive and highly divergent regions through the use of a reference genome coordinate system that uses genes to anchor sequences [92,94], minimising errors due to reference bias, poor alignment and miscalled variants [95]. A common use of PHGs is to determine which haplotypes from parents sequenced at high coverage are present in progeny that have been sequenced at low coverage. These graphs have been used in sorghum [92], maize [96] and cassava (Manihot esculenta) [95] to impute SNPs from low-coverage DNA sequence data. PHGs can support plant breeding as they can accurately capture the position of genomic variations among individuals.
Advances in pangenomics are leading to the construction of higher-level pangenomes, often referred to as super-pangenomes, that represent genomic information at the genus level and above [80,97,98]. Super-pangenomes have the potential to aid introgression of traits from related species that can confer agronomic benefits. An example is alien introgression in Brassica breeding, where the Ogura fertility restorer gene system carried by the Rfo locus was introgressed into B. napus (which contains the Brassica A and C genomes) from closely related Raphanus sativus (radish) [99,100].
Super-pangenomes can support a more comprehensive view of gene PAV across species and provide a framework for evolutionary studies. The super-pangenome of banana identified gene differences between Musa and Ensete genera [80], as well as 12,310 new gene models in the species, forming distinct PAV clusters between the Ensete and Musa accessions. Variable genes related to flowering, meristem regulation and nutrient metabolism were enriched in the Musa accessions, reflecting the morphological diversity of Musa fruits [80]. Super-pangenomes at the genus level can also identify traits or genes lost during domestication or that have evolved in related species that can then be selected for in breeding. The latest soybean pangenome represented 1110 soybean individuals [10] and demonstrated that there had been a reduction in the number of protein-coding genes during domestication and subsequent breeding of elite varieties, with wild soybean having on average 620 more genes and a 21 Mbp larger genome than modern cultivars [10,101]. Studying how genes change in frequency between domesticated crops and their wild relatives using super-pangenomes can support the breeding of crops better adapted to diverse environments and more resilient to climate change.
Plant pangenome assemblies have shown that variable regions are often associated with biotic or abiotic stress [93], leading researchers to focus on the construction of pangenomes based on specific functional traits. These trait pangenomes aim to describe the landscape of genetic variation related to a trait. For example, resistance gene analogues (RGAs) have conserved domains and motifs that contribute to resistance to pathogens [102,103,104,105]. Thus, a pan-RGA can provide a platform to investigate the impact of genetic variation on plant resistance, as well as identify genetic markers for RGA profiling of species that may have limited genomic data [102]. A pan-RGA can be employed as a reference for resistance gene cloning [106,107]. In addition, trait pangenomes can be used to investigate the evolution and domestication of specific traits. For example, one study examined the differences in the nucleotide binding sites of leucine-rich repeat receptors (NLRs) during colonisation of new habitats by Solanum chilense, reinforcing that NLR evolution is constrained by their interaction with the products of other genes [108]. In the case of under-utilised species, trait pangenomes can help dissect the genetic variability associated with drought tolerance in the moth bean [109,110] and lablab bean [111,112], as well as potentially increase crop productivity by comparing yield-related genes with higher performing relatives. The functional analysis of the genetic diversity uncovered by pangenome studies is still largely unexplored but can be improved through the use of trait pangenomes, providing a foundation to accelerate breeding of under-utilised crop species that support food security globally.

4. The Breeding Potential of Under-Utilised Crop Species

Structural variation represented in pangenomes has been linked with pathogen resistance and tolerance to abiotic stress [32,113,114]. Identifying advantageous genes and alleles relies on associating pangenome SVs with phenotypic traits through genome-wide association studies (GWAS), quantitative trait loci (QTL) mapping or genomic selection [36,115,116]. As an example of pangenome-assisted GWAS analysis in major crops, a soybean graph-based pangenome with 29 assemblies identified a previously unknown PAV associated with seed luster [9]. Pangenome GWAS studies in other species detected 124 PAVs associated with yield and fibre quality in cotton [38], genes associated with seed traits and early leaf senescence in rice [6,117], PAVs associated with seed and flowering traits in canola [11], and 398 SNPs associated with agronomic traits in sorghum [12]. Pangenome GWAS and other functional comparisons support the linking of genomic variation with beneficial traits with an accuracy that linear single reference genomes are unable to provide. A functional pangenome analysis for under-utilised crops may uncover novel alleles related to agronomic traits in the variable genome that may be used for introgression into major crops or be used as genetic markers to improve traits of under-utilised crops.
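The simplest form of a PAV-trait association can be sketched as a test on a 2 x 2 table that cross-classifies individuals by gene presence and by a binary phenotype. The example below implements a two-sided Fisher's exact test using only the Python standard library. It is an illustrative assumption, not the method of the cited studies: published pangenome GWAS typically use models that also account for population structure and relatedness, and the counts shown are invented.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows: gene present / gene absent. Columns: trait present / trait absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x trait-positive individuals in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: of 10 lines carrying the gene, 8 show the trait;
# of 10 lines lacking the gene, 1 shows the trait.
p_value = fisher_exact_two_sided(8, 2, 1, 9)
print(f"p = {p_value:.4f}")
```

A genome-wide scan would repeat such a test for every variable gene and then correct for multiple testing, which is not shown here.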
Characterising the relationship between SVs and differences in plant function requires integrating other data types, such as phenotype, metabolite and gene expression data, with the pangenome [82,118]. For example, SVs identified in a cotton pangenome with 890 accessions were compared through meta-GWAS and gene expression analysis to identify candidate genes related to yield and fibre quality. Genes identified include the previously uncharacterised gene GhIDD7 that was subsequently shown to control fibre length by using gene knockout with CRISPR-Cas9 [38]. Meta-GWAS was also employed in a soybean study using 17,556 accessions and associated phenotypic data to identify candidate genes related to agronomic traits, reporting several new loci, some of which were associated with multiple traits, suggesting pleiotropic effects [119]. Leveraging previously published studies with biochemical analysis may help bridge the understanding of the effect of SVs on plant morphology. For example, although there are limited genomic resources for guava, a few studies have been conducted to investigate fruit and leaf metabolites [120,121] and fruit aroma volatiles of 27 guava accessions [122]. These datasets could be used to scan a guava pangenome for fruit-related traits. A super-pangenome of yam bean species (P. erosus, P. ahipa and P. tuberosus) would provide a basis for integrating associated phenotype data. Multiple studies using agro-morphological traits collected for the yam bean varieties grown in Brazil, West Africa and Costa Rica have found significant variation between the genotypes employed in each study [123,124,125]. Integrating rich phenotype data with a yam bean super-pangenome could be used to infer the effects of SVs on phenotype, including traits directly related to plant performance such as days to flowering and maturity, plant height, and root biomass [125].
Previously identified genomic markers can be mapped to a pangenome reference to support the discovery of novel alleles. A recent pangenome study in tomato mapped 359 QTLs associated with volatile organic compounds [36,126]. These QTL regions were compared across diverse tomato populations, allowing the identification of alleles that can be used to improve fruit aroma [126]. Another study examined a tomato super-pangenome with 166 accessions from the wild ancestor S. pimpinellifolium, the semi-domesticated species S. lycopersicum var cerasiforme, and early domesticated S. lycopersicum var lycopersicum. They identified functional polymorphisms in the LIN5, ALMT9, AAT1, CXE1, and LoxC genes associated with fruit flavour. Beneficial haplotypes were identified that could be introgressed through conventional breeding [127]. These studies demonstrate the use of pangenomes to build on previous studies.
Although there are limited genetic data for under-utilised crops, collating previous studies from closely related species may present encouraging results. For example, a study with finger millet (Eleusine coracana) used genotyping-by-sequencing data to identify 109 SNPs, with five of these located in genes involved in flowering, maturity and grain yield [128]. Another study on finger millet identified 418 SNPs related to mineral micronutrient density that could be employed to improve grain nutrient quality [129]. Mapping previously reported markers onto a millet pangenome could improve our understanding of the genes related to agro-morphological traits in this under-utilised crop, thus supporting millet performance in the field.
Advances in bioinformatics tools and data analysis will help accelerate under-utilised crop improvement using currently available genomic data. Machine Learning (ML) is a computational technology used to predict outcomes for specific problems based upon previous data. In bioinformatics, ML is becoming increasingly applied and optimised for crop-related advances in genomics and phenomics [118,130,131,132]. A recent study used random forest classification in conjunction with linkage disequilibrium mapping to identify pangenome PAV tags in domesticated barley with 83.6% accuracy, and in wild barley with 88.6% accuracy [133]. These barley PAV tags will help construct future barley pangenomes and can be applied to association analysis. Pangenomics ML has also been applied to understand gene loss mechanisms in Brassica [134]. It was demonstrated that gene loss was mainly associated with transposable elements in the diploid B. oleracea and B. rapa, while in the polyploid B. napus, the loss of genes was mostly associated with homoeologous recombination. ML can also be used for trait association in pangenomes, as seen in B. napus, where PAV associations were identified for disease resistance [135], and in pigeon pea for seed weight [37]. Here, using PAVs and SNPs from a pangenome rather than just SNPs derived from a single reference genome sequence as input when training ML models will increase the efficiency and reliability of prediction of traits in these crops. As the application of ML in crop science increases, these methods will become more common for the translation of pangenomic and crop trait data for under-utilised crop variety improvement.
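The linkage disequilibrium component of PAV tagging mentioned above can be illustrated by asking which SNP best predicts the presence or absence of a gene across a panel. The sketch below computes the squared correlation (r²) between 0/1 vectors and picks the SNP with the highest value. It deliberately does not reproduce the random forest classifier of the barley study [133]; it assumes fully inbred lines so that each genotype can be coded as 0 or 1, and all SNP names and genotype vectors are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def r_squared(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared Pearson correlation between two equal-length 0/1 vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic marker carries no information
    return cov * cov / (var_x * var_y)


def best_tag(pav: Sequence[int], snps: Dict[str, Sequence[int]]) -> Tuple[str, float]:
    """Return the (name, r2) of the SNP most strongly associated with the PAV."""
    scores = {name: r_squared(calls, pav) for name, calls in snps.items()}
    top = max(scores, key=scores.get)
    return top, scores[top]


# Hypothetical presence (1) / absence (0) of one gene in eight inbred lines.
gene_pav = [1, 1, 1, 0, 0, 0, 1, 0]

# Hypothetical SNP calls (0 = reference allele, 1 = alternative allele).
candidate_snps = {
    "snp_a": [1, 1, 0, 0, 0, 0, 1, 0],
    "snp_b": [1, 0, 1, 0, 1, 0, 1, 0],
    "snp_c": [1, 1, 1, 0, 0, 0, 1, 0],
}

tag_name, tag_r2 = best_tag(gene_pav, candidate_snps)
print(tag_name, tag_r2)
```

A SNP in strong LD with a PAV can act as an inexpensive proxy for it on a genotyping array, although in real panels population structure and heterozygosity complicate this picture.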

5. The Future of Pangenomics in Breeding Under-Utilised Crops

Many of the advances in genomics and pangenomics have been driven by improvements in DNA sequencing technology. More accurate non-fragmented assemblies can now be generated using long-read sequencing methods such as Pacific Biosciences (PacBio) single-molecule real-time (SMRT) sequencing [136] or Oxford Nanopore Technologies (ONT) sequencing [137]. Long-read sequencing can now generate data with low error rates (between <1% and <5%, depending on the sequencer used) and span repetitive sequences, leading to pangenomes that contain fewer gaps and the ability to resolve placements of homeologous scaffolds [138,139]. Long-read sequencing also allows base modifications in complex repetitive regions to be analysed and for large SVs (>500 bp) to be assessed [140]. Improved sequencing and assembly methods have also allowed the capture of repetitive elements and complex inversions and translocations, allowing detection of SVs that would be missed in fragmented low-quality assemblies [81,141].
The additional SV data produced by these technologies can be translated to high-throughput and flexible molecular genetic markers for under-utilised crops. These markers can be used in breeding projects to maximise the efficiency of genomic selection for agronomically valuable traits [142]. However, the relatively high cost of generating long-read sequence data means that these high-throughput markers are not feasible for many genotyping applications. Furthermore, long-read sequencing has a large computational requirement in the analysis stage [143]. While software packages that analyse pangenomes and identify core and variable SNPs do exist, such as PanSeq [144], database systems for interpreting complicated SVs are rare. This rarity makes the use of long-read sequencing a challenge [145]. Nevertheless, the benefits of long-read sequencing for the construction of high-quality pangenomes make it the approach of choice for future pangenomes, while the lower cost of short-read Illumina sequencing makes it more amenable to larger scale genotyping approaches.
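The core/variable distinction that such tools report can be illustrated with a minimal Python sketch over a gene presence/absence matrix. This is an illustration only and not the algorithm implemented in PanSeq; the function names, the toy matrix, and the strict definition of core as "present in every accession" are assumptions made for the example.

```python
def classify_genes(pav_matrix):
    """Split genes into core (present in all accessions) and variable (absent in at least one)."""
    core, variable = [], []
    for gene, calls in pav_matrix.items():
        if all(calls):
            core.append(gene)
        else:
            variable.append(gene)
    return core, variable


def presence_frequency(calls):
    """Fraction of accessions in which a gene is present."""
    return sum(calls) / len(calls)


# Hypothetical presence (1) / absence (0) calls for four accessions.
pav_matrix = {
    "gene_1": [1, 1, 1, 1],
    "gene_2": [1, 0, 1, 1],
    "gene_3": [0, 0, 1, 0],
    "gene_4": [1, 1, 1, 1],
}

core, variable = classify_genes(pav_matrix)
frequencies = {gene: presence_frequency(pav_matrix[gene]) for gene in variable}

print("core:", core)
print("variable:", variable)
print("presence frequency of variable genes:", frequencies)
```

Variable genes with an intermediate presence frequency are the ones that could serve as PAV markers for genotyping, since a gene present in every accession cannot distinguish between them.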
As larger and more accurate genome assemblies are being produced, tools are being developed to annotate them more quickly and accurately [146]. Genome annotation tools such as BRAKER2 [147] and MAKER [148] combine ab initio (statistical model) and evidence-based gene predictions to produce higher quality annotations while still being relatively easy to use. However, annotation remains a bottleneck for large-scale genome and pangenome analysis, because gene prediction and functional annotation still lag behind assembly approaches [149,150,151]. In general, current gene prediction is complex. Most functional annotation tools draw from functional annotation databases that are either relatively small and manually curated, and therefore accurate, such as Swiss-Prot [152], or large and non-curated, and therefore potentially containing errors, such as the National Center for Biotechnology Information (NCBI) non-redundant database [153]. More accurate annotation methods are required to study differences in genetic architecture, because the detection of complex traits can be confounded when SVs and PAVs are incorrectly positioned. Future high-quality functional annotation will likely combine transcriptomic, proteomic, phenomic, and metabolomic data with pangenomics, together with approaches such as ML, to increase accuracy. Currently, there are no universal ab initio methods or homology-based methods capable of aligning variations found in plant genomes with a reference pangenome [154]. To address this problem, research is underway to efficiently index, store and interrogate graphical representations of pangenomes that will lead to more accurate annotation [155] (Figure 2).
The genetic potential of many under-utilised crops has yet to be fully realised, primarily due to a lack of resources that can be used to aid identification and selection of agronomically valuable traits. With the decreasing cost of sequencing, pangenomes for many under-utilised crop species can be assembled. These pangenomes can be used to identify genomic variation that can be studied with trait mapping approaches such as GWAS and QTL mapping, allowing the prediction of desirable crop traits using molecular markers [9,36]. By developing resources for under-utilised crops, novel genes related to agro-morphological traits can be detected and used to inform breeding programs or used for introgression into related major crop species. Furthermore, advancements in sequencing technologies will likely see pangenomes constructed with long-read DNA sequencing methods and chromosome-scale assemblies overtake single reference genomes for use in plant breeding research. The implementation of these pangenome assemblies in graph-based pangenomes and improvements in the accuracy of assembly and annotation tools will allow for more detailed analyses of the genetic constitution of under-utilised crops, and more efficient improvement of traits [88,92,131,156]. With pangenomes, existing genomic data and ML tools informing genetic breeding and gene editing, some of these climate-resilient and nutritious under-utilised crops show the potential to become alternative food sources or safety nets to major crops, supporting future increased agricultural system diversity and food security.
Table 1. Plant pangenomes constructed to date and method of assembly.
| Species Name | Number of Individual Genomes | Assembly Method | References |
|---|---|---|---|
| Amborella trichopoda | 10 | Iterative mapping and assembly | [76,77] |
| Arabidopsis thaliana | 7 | De novo assembly | [157] |
| Brachypodium distachyon | 54 | De novo assembly | [33,158] |
| Brachypodium hybridum | 4 | De novo assembly | [158] |
| Brassica napus | 53 | Iterative mapping and assembly | [34] |
| Brassica napus | 8 | De novo assembly | [11] |
| Brassica oleracea | 10 | Iterative mapping and assembly | [32] |
| Cajanus cajan | 89 | Iterative mapping and assembly | [37] |
| Capsicum | 5 | Iterative mapping and assembly | [156] |
| Glycine max | 29 | Graph-based de novo assembly | [9] |
| Glycine max | 1110 | Iterative mapping and assembly | [10] |
| Gossypium | 1961 | De novo assembly | [38] |
| Hordeum vulgare | 20 | De novo assembly | [8] |
| Helianthus annuus | 287 | De novo assembly | [159] |
| Malus domestica | 91 | De novo assembly | [160] |
| Manihot esculenta | 57 | Practical haplotype graphs | [161] |
| Medicago truncatula | 15 | De novo assembly | [162] |
| Oryza sativa | 3 | De novo assembly | [31] |
| Oryza | 31 | De novo assembly | [6] |
| Poplar | 10 | De novo assembly | [163] |
| Sesamum indicum | 5 | De novo assembly | [35] |
| Solanum lycopersicum | 725 | De novo assembly | [36] |
| Sorghum bicolor | 398 | Practical haplotype graphs | [92] |
| Sorghum bicolor | 176 | Iterative mapping and assembly | [12] |
| Triticum aestivum | 18 | Iterative mapping and assembly | [20] |
| Zea mays | 4705 | Practical haplotype graphs | [96] |
Table 2. Research involving underutilised crops without genomic references.
| Scientific Names | Common Names | Type of Resource | References |
|---|---|---|---|
| Basella alba | Malabar spinach | Reports of viruses infecting Malabar spinach | [164,165] |
| | | Chromosome counts/Nuclear DNA quantification | [166] |
| Calathea allouia | Guinea arrowroot | Future prospects for underutilised medicinally valuable plants | [167] |
| Couma utilis | Milk tree | Identifying pollinators in edible Amazon fruit plants | [168] |
| Crambe cordifolia | Greater sea-kale | Ancestral chromosomal blocks in Brassiceae species | [169] |
| Leopoldia comosa | Tassel grape hyacinth | Identifying physiological responses | [170] |
| | | Mineral content and chemical analysis | [171] |
| Schinziophyton rautanenii | Mongongo tree | Sustainability review | [172] |
| | | Chemical composition of oil | [173] |
| Ullucus tuberosus | Ulluco | Viruses detected in ulluco | [174] |
| | | High throughput sequencing to detect novel viruses in ulluco | [175] |
Table 3. Underutilised crops with genetic resources.
| Scientific Names | Common Names | Type of Genomic Resources | References |
|---|---|---|---|
| Cereal grains | | | |
| Canna edulis | African arrowroot | Chloroplast genome sequence | [68] |
| Digitaria exilis | White fonio | Genome assembly and annotation | [17,47] |
| | | Genotyping-by-sequencing and SNP data | [48] |
| Panicum sumatrense | Little Millet | Chloroplast genome sequences | [43] |
| | | De novo transcriptome assembly | [44] |
| Vegetable/Pulse crops | | | |
| Lablab purpureus | Hyacinth bean/Lablab bean | Chloroplast genome assembly | [61] |
| | | Draft genome assembly | [60] |
| | | Upregulation of drought tolerant genes | [58] |
| | | RFLP markers | [176] |
| Solanum nigrum | Black nightshade plant | Transcriptome sequence | [177,178] |
| | | Chloroplast genome sequence | [179,180] |
| Vigna aconitifolia | Moth bean | Genetic linkage map | [54] |
| | | Novel Vigna genetic resources | [53] |
| Tuberous crops | | | |
| Pachyrhizus erosus | Yam bean | Draft genome assembly | [15] |
| Vigna vexillata | Zombi pea or Wild cowpea | Anti-inflammatory bioactivity | [181] |
| | | QTL analysis | [182] |
| | | Molecular linkage analysis | [183] |
| | | Hybridisation accession analysis | [184] |
| Industrial Crops | | | |
| Carthamus tinctorius | Safflowers | Transcriptome sequencing | [185,186] |
| | | Chromosome-scale reference genome | [73] |
| | | Chloroplast genome sequence | [187] |
| | | Genetic mapping of SNPs | [71] |
| Hibiscus cannabinus | Kenaf | Mitochondrial genome assembly | [70] |
| | | Genome assembly and annotation | [16] |
| | | De novo transcriptome assembly | [69] |
| Fruit/Nuts | | | |
| Bactris gasipaes | Peach palm | Chloroplast DNA for phylogenetic study | [188] |
| | | Macaúba palm transcriptome sequencing | [189] |
| | | RNA-seq of tropical palms | [190] |
| | | Plastome sequence | [191] |
| Citrullus colocynthis | Desert Watermelon or Wild watermelon | Gene markers | [192] |
| | | Transcriptome assembly | [193] |
| | | Genome Resequencing | [194] |
| Elaeagnus angustifolia | Russian olive or wild olive | Geographic study using machine learning | [195] |
| | | Hi-C assembly | [196] |
| | | Transcriptome profiling | [197] |
| | | Plant signalling regarding salt | [198] |
| Ensete ventricosum | Ethiopian Banana | Genome assembly | [199,200] |
| | | Pangenome assembly | [80] |
| | | Markers/Microsatellites | [201] |
| | | Metabolite data | [202] |
| Euterpe oleracea | Açaí | Chemical genomic profiling | [203] |
| | | Karyotype and genome size | [204] |
| Psidium guajava | Guava | Genome assembly | [76,77] |
| | | Genome Markers | [76] |
| | | RNA-seq/transcriptome assembly | [78] |
| Vaccinium meridionale | Agraz or Colombian Berry | Phylogenetic relationships within the Vaccinieae tribe | [205] |
| | | Chemical, antimicrobial and molecular characterisation | [206] |
| | | Characterisation of phenotypic plasticity | [207] |
| | | Antiproliferative potential of Agraz juice | [208] |

Author Contributions

C.G.T.F., B.J.N., M.F.D., M.G. and D.E. wrote and edited this manuscript. B.J.N. and M.F.D. constructed the figures. J.P., P.E.B., P.M.F. and J.B. reviewed and edited this manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work is funded by the Australian Research Council (Projects DP210100296, DP200100762, and DE210100398) and the Grains Research and Development Corporation (Projects 9177539 and 9177591). Cassandria G. Tay Fernandez, Benjamin J. Nestor and Monica F. Danilevicz are supported by Research Training Program scholarships. Benjamin J. Nestor is supported by a university postgraduate award at The University of Western Australia. Monica F. Danilevicz is further supported by the Forrest Research Foundation. This work was supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Anderson, R.; Bayer, P.E.; Edwards, D. Climate change and the need for agricultural adaptation. Curr. Opin. Plant Biol. 2020, 56, 197–202.
  2. World Health Organization. The State of Food Security and Nutrition in the World 2019: Safeguarding Against Economic Slowdowns and Downturns; Food and Agriculture Organization of the United Nations: Rome, Italy, 2019.
  3. Ray, D.K.; Mueller, N.D.; West, P.C.; Foley, J.A. Yield trends are insufficient to double global crop production by 2050. PLoS ONE 2013, 8, e66428.
  4. Yu, J.; Hu, S.; Wang, J.; Wong, G.K.S.; Li, S.; Liu, B.; Deng, Y.; Dai, L.; Zhou, Y.; Zhang, X.; et al. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 2002, 296, 92–100.
  5. International Wheat Genome Sequencing Consortium (IWGSC); Appels, R.; Eversole, K.; Stein, N.; Feuillet, C.; Keller, B.; Rogers, J.; Pozniak, C.J.; Choulet, F.; Distelfeld, A.; et al. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 2018, 361, eaar7191.
  6. Qin, P.; Lu, H.; Du, H.; Wang, H.; Chen, W.; Chen, Z.; He, Q.; Ou, S.; Zhang, H.; Li, X.; et al. Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 2021, 184, 3542–3558.e3516.
  7. Lu, F.; Romay, M.C.; Glaubitz, J.C.; Bradbury, P.J.; Elshire, R.J.; Wang, T.; Li, Y.; Li, Y.; Semagn, K.; Zhang, X.; et al. High-resolution genetic mapping of maize pan-genome sequence anchors. Nat. Commun. 2015, 6, 6914.
  8. Jayakodi, M.; Padmarasu, S.; Haberer, G.; Bonthala, V.S.; Gundlach, H.; Monat, C.; Lux, T.; Kamal, N.; Lang, D.; Himmelbach, A.; et al. The barley pan-genome reveals the hidden legacy of mutation breeding. Nature 2020, 588, 284–289.
  9. Liu, Y.; Du, H.; Li, P.; Shen, Y.; Peng, H.; Liu, S.; Zhou, G.A.; Zhang, H.; Liu, Z.; Shi, M.; et al. Pan-genome of wild and cultivated soybeans. Cell 2020, 182, 162–176.e113.
  10. Bayer, P.E.; Valliyodan, B.; Hu, H.; Marsh, J.I.; Yuan, Y.; Vuong, T.D.; Patil, G.; Song, Q.; Batley, J.; Varshney, R.K.; et al. Sequencing the USDA core soybean collection reveals gene loss during domestication and breeding. Plant Genome 2021, e20109.
  11. Song, J.-M.; Guan, Z.; Hu, J.; Guo, C.; Yang, Z.; Wang, S.; Liu, D.; Wang, B.; Lu, S.; Zhou, R.; et al. Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus. Nat. Plants 2020, 6, 34–45.
  12. Ruperao, P.; Thirunavukkarasu, N.; Gandham, P.; Selvanayagam, S.; Govindaraj, M.; Nebie, B.; Manyasa, E.; Gupta, R.; Das, R.R.; Odeny, D.A.; et al. Sorghum pan-genome explores the functional utility for genomic-assisted breeding to accelerate the genetic gain. Front. Plant Sci. 2021, 12, 666342.
  13. Pratap, A.; Prajapati, U.; Singh, C.M.; Gupta, S.; Rathore, M.; Malviya, N.; Tomar, R.; Gupta, A.K.; Tripathi, S.; Singh, N.P. Potential, constraints and applications of in vitro methods in improving grain legumes. Plant Breed. 2018, 137, 235–249.
  14. Abewoy, D. Review on genetics and breeding of tomato (Lycopersicon esculentum Mill). Adv. Crop Sci. Technol. 2017, 5, 306.
  15. Tay Fernandez, C.G.; Pati, K.; Severn-Ellis, A.A.; Batley, J.; Edwards, D. Studying the genetic diversity of yam bean using a new draft genome assembly. Agronomy 2021, 11, 953.
  16. Zhang, L.; Xu, Y.; Zhang, X.; Ma, X.; Zhang, L.; Liao, Z.; Zhang, Q.; Wan, X.; Cheng, Y.; Zhang, J.; et al. The genome of kenaf (Hibiscus cannabinus L.) provides insights into bast fibre and leaf shape biogenesis. Plant Biotechnol. J. 2020, 18, 1796–1809.
  17. Wang, X.; Chen, S.; Ma, X.; Yssel, A.E.J.; Chaluvadi, S.R.; Johnson, M.S.; Gangashetty, P.; Hamidou, F.; Sanogo, M.D.; Zwaenepoel, A.; et al. Genome sequence and genetic diversity analysis of an under-domesticated orphan crop, white fonio (Digitaria exilis). GigaScience 2021, 10, giab013.
  18. Bayer, P.E.; Golicz, A.A.; Scheben, A.; Batley, J.; Edwards, D. Plant pan-genomes are the new reference. Nat. Plants 2020, 6, 914–920.
  19. Gage, J.L.; Vaillancourt, B.; Hamilton, J.P.; Manrique-Carpintero, N.C.; Gustafson, T.J.; Barry, K.; Lipzen, A.; Tracy, W.F.; Mikel, M.A.; Kaeppler, S.M.; et al. Multiple maize reference genomes impact the identification of variants by genome-wide association study in a diverse inbred panel. Plant Genome 2019, 12, 180069.
  20. Montenegro, J.D.; Golicz, A.A.; Bayer, P.E.; Hurgobin, B.; Lee, H.; Chan, C.-K.K.; Visendi, P.; Lai, K.; Doležel, J.; Batley, J.; et al. The pangenome of hexaploid bread wheat. Plant J. 2017, 90, 1007–1013.
  21. The Rice Chromosomes 11 and 12 Sequencing Consortia. The sequence of rice chromosomes 11 and 12, rich in disease resistance genes and recent gene duplications. BMC Biol. 2005, 3, 20.
  22. Woodhouse, M.R.; Schnable, J.C.; Pedersen, B.S.; Lyons, E.; Lisch, D.; Subramaniam, S.; Freeling, M. Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of the two homeologs. PLoS Biol. 2010, 8, e1000409.
  23. Yu, J.; Tehrim, S.; Zhang, F.; Tong, C.; Huang, J.; Cheng, X.; Dong, C.; Zhou, Y.; Qin, R.; Hua, W.; et al. Genome-wide comparative analysis of NBS-encoding genes between Brassica species and Arabidopsis thaliana. BMC Genom. 2014, 15, 3.
  24. Chen, J.-Y.; Huang, J.-Q.; Li, N.-Y.; Ma, X.-F.; Wang, J.-L.; Liu, C.; Liu, Y.-F.; Liang, Y.; Bao, Y.-M.; Dai, X.-F. Genome-wide analysis of the gene families of resistance gene analogues in cotton and their response to Verticillium wilt. BMC Plant Biol. 2015, 15, 148.
  25. Bennetzen, J.L. Transposable element contributions to plant gene and genome evolution. Plant Mol. Biol. 2000, 42, 251–269.
  26. Zhang, L.; Ren, Y.; Yang, T.; Li, G.; Chen, J.; Gschwend, A.R.; Yu, Y.; Hou, G.; Zi, J.; Zhou, R.; et al. Rapid evolution of protein diversity by de novo origination in Oryza. Nat. Ecol. Evol. 2019, 3, 679–690.
  27. Dunning, L.T.; Olofsson, J.K.; Parisod, C.; Choudhury, R.R.; Moreno-Villena, J.J.; Yang, Y.; Dionora, J.; Quick, W.P.; Park, M.; Bennetzen, J.L.; et al. Lateral transfers of large DNA fragments spread functional genes among grasses. Proc. Natl. Acad. Sci. USA 2019, 116, 4416.
  28. Tettelin, H.; Masignani, V.; Cieslewicz, M.J.; Donati, C.; Medini, D.; Ward, N.L.; Angiuoli, S.V.; Crabtree, J.; Jones, A.L.; Durkin, A.S.; et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”. Proc. Natl. Acad. Sci. USA 2005, 102, 13950.
  29. Morgante, M.; De Paoli, E.; Radovic, S. Transposable elements and the plant pan-genomes. Curr. Opin. Plant Biol. 2007, 10, 149–155.
  30. Li, Y.H.; Zhou, G.; Ma, J.; Jiang, W.; Jin, L.G.; Zhang, Z.; Guo, Y.; Zhang, J.; Sui, Y.; Zheng, L.; et al. De novo assembly of soybean wild relatives for pan-genome analysis of diversity and agronomic traits. Nat. Biotechnol. 2014, 32, 1045–1052.
  31. Schatz, M.C.; Maron, L.G.; Stein, J.C.; Wences, A.H.; Gurtowski, J.; Biggers, E.; Lee, H.; Kramer, M.; Antoniou, E.; Ghiban, E.; et al. Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica. Genome Biol. 2014, 15, 506.
  32. Golicz, A.A.; Bayer, P.E.; Barker, G.C.; Edger, P.P.; Kim, H.; Martinez, P.A.; Chan, C.K.K.; Severn-Ellis, A.; McCombie, W.R.; Parkin, I.A.P.; et al. The pangenome of an agronomically important crop plant Brassica oleracea. Nat. Commun. 2016, 7, 13390.
  33. Gordon, S.P.; Contreras-Moreira, B.; Woods, D.P.; Des Marais, D.L.; Burgess, D.; Shu, S.; Stritt, C.; Roulin, A.C.; Schackwitz, W.; Tyler, L.; et al. Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure. Nat. Commun. 2017, 8, 2184.
  34. Hurgobin, B.; Golicz, A.A.; Bayer, P.E.; Chan, C.-K.K.; Tirnaz, S.; Dolatabadian, A.; Schiessl, S.V.; Samans, B.; Montenegro, J.D.; Parkin, I.A.P.; et al. Homoeologous exchange is a major cause of gene presence/absence variation in the amphidiploid Brassica napus. Plant Biotechnol. J. 2017, 16, 1265–1274.
  35. Yu, J.; Golicz, A.A.; Lu, K.; Dossa, K.; Zhang, Y.; Chen, J.; Wang, L.; You, J.; Fan, D.; Edwards, D.; et al. Insight into the evolution and functional characteristics of the pan-genome assembly from sesame landraces and modern cultivars. Plant Biotechnol. J. 2019, 17, 881–892.
  36. Gao, L.; Gonda, I.; Sun, H.; Ma, Q.; Bao, K.; Tieman, D.M.; Burzynski-Chang, E.A.; Fish, T.L.; Stromberg, K.A.; Sacks, G.L.; et al. The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor. Nat. Genet. 2019, 51, 1044–1051.
  37. Zhao, J.; Bayer, P.E.; Ruperao, P.; Saxena, R.K.; Khan, A.W.; Golicz, A.A.; Nguyen, H.T.; Batley, J.; Edwards, D.; Varshney, R.K. Trait associations in the pangenome of pigeon pea (Cajanus cajan). Plant Biotechnol. J. 2020, 18, 1946–1954.
  38. Li, J.; Yuan, D.; Wang, P.; Wang, Q.; Sun, M.; Liu, Z.; Si, H.; Xu, Z.; Ma, Y.; Zhang, B.; et al. Cotton pan-genome retrieves the lost sequences and genes during domestication and selection. Genome Biol. 2021, 22, 119.
  39. Golicz, A.A.; Bayer, P.E.; Bhalla, P.L.; Batley, J.; Edwards, D. Pangenomics comes of age: From bacteria to plant and animal applications. Trends Genet. 2020, 36, 132–145.
  40. Kamal, N.M.; Gorafi, Y.S.A.; Abdelrahman, M.; Abdellatef, E.; Tsujimoto, H. Stay-green trait: A prospective approach for yield potential, and drought and heat stress adaptation in globally important cereals. Int. J. Mol. Sci. 2019, 20, 5837.
  41. Muthamilarasan, M.; Prasad, M. Small millets for enduring food security amidst pandemics. Trends Plant Sci. 2021, 26, 33–40.
  42. Chauhan, M.; Sonawane, S.K.; Arya, S. Nutritional and nutraceutical properties of millets: A review. Clin. J. Nutr. Diet. 2018, 1, 1–10.
  43. Sebastin, R.; Lee, G.A.; Lee, K.J.; Shin, M.J.; Cho, G.T.; Lee, J.R.; Ma, K.H.; Chung, J.W. The complete chloroplast genome sequences of little millet (Panicum sumatrense Roth ex Roem. and Schult.) (Poaceae). Mitochondrial DNA Part B 2018, 3, 719–720.
  44. Das, R.R.; Pradhan, S.; Parida, A. De-novo transcriptome analysis unveils differentially expressed genes regulating drought and salt stress response in Panicum sumatrense. Sci. Rep. 2020, 10, 21251.
  45. Ballogou, V.; Soumanou, M.; Toukourou, F.; Hounhouigan, J. Structure and nutritional composition of fonio (Digitaria exilis) grains: A review. Int. Res. J. Biol. Sci. 2013, 2, 73–79.
  46. National Research Council. Lost Crops of Africa: Volume 1: Grains; National Academies Press: Washington, DC, USA, 1996.
  47. Abrouk, M.; Ahmed, H.I.; Cubry, P.; Šimoníková, D.; Cauet, S.; Pailles, Y.; Bettgenhaeuser, J.; Gapa, L.; Scarcelli, N.; Couderc, M.; et al. Fonio millet genome unlocks African orphan crop diversity for agriculture in a changing climate. Nat. Commun. 2020, 11, 4488.
  48. Ibrahim Bio Yerima, A.R.; Issoufou, K.A.; Adje, C.A.; Mamadou, A.; Oselebe, H.; Gueye, M.C.; Billot, C.; Achigan-Dako, E.G. Genome-wide scanning enabled SNP discovery, linkage disequilibrium patterns and population structure in a panel of fonio (Digitaria exilis [Kippist] Stapf) germplasm. Front. Sustain. Food Syst. 2021, 5, 699549.
  49. Bennetzen, J.L.; Schmutz, J.; Wang, H.; Percifield, R.; Hawkins, J.; Pontaroli, A.C.; Estep, M.; Feng, L.; Vaughn, J.N.; Grimwood, J. Reference genome sequence of the model plant Setaria. Nat. Biotechnol. 2012, 30, 555–561.
  50. Varshney, R.K.; Shi, C.; Thudi, M.; Mariac, C.; Wallace, J.; Qi, P.; Zhang, H.; Zhao, Y.; Wang, X.; Rathore, A.; et al. Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments. Nat. Biotechnol. 2017, 35, 969–976.
  51. Zou, C.; Li, L.; Miki, D.; Li, D.; Tang, Q.; Xiao, L.; Rajput, S.; Deng, P.; Peng, L.; Jia, W. The genome of broomcorn millet. Nat. Commun. 2019, 10, 1–11.
  52. Bennetzen, J.L.; Freeling, M. The unified grass genome: Synergy in synteny. Genome Res. 1997, 7, 301–306.
  53. Takahashi, Y.; Somta, P.; Muto, C.; Iseki, K.; Naito, K.; Pandiyan, M.; Natesan, S.; Tomooka, N. Novel genetic resources in the genus Vigna unveiled from gene bank accessions. PLoS ONE 2016, 11, e0147568.
  54. Yundaeng, C.; Somta, P.; Amkul, K.; Kongjaimun, A.; Kaga, A.; Tomooka, N. Construction of genetic linkage map and genome dissection of domestication-related traits of moth bean (Vigna aconitifolia), a legume crop of arid areas. Mol. Genet. Genom. 2019, 294, 621–635.
  55. Minde, J.J.; Venkataramana, P.B.; Matemu, A.O. Dolichos Lablab-an underutilized crop with future potentials for food and nutrition security: A review. Crit. Rev. Food Sci. Nutr. 2021, 61, 2249–2261.
  56. Chakoma, I.; Manyawu, G.J.; Gwiriri, L.; Moyo, S.; Dube, S. The Agronomy and Use of Lablab Purpureus in Smallholder Farming Systems of Southern Africa; International Livestock Research Institute: Nairobi, Kenya, 2016.
  57. Missanga, J.S.; Venkataramana, P.B.; Ndakidemi, P.A. Recent developments in Lablab purpureus genomics: A focus on drought stress tolerance and use of genomic resources to develop stress-resilient varieties. Legume Sci. 2021, 3, e99.
  58. Wang, B.; Zhao, M.; Yao, L.; Joao, V.M.d.S.; Babu, V.; Wu, T.; Nguyen, H.T. Identification of drought-inducible regulatory factors in Lablab purpureus by a comparative genomic approach. Crop Pasture Sci. 2018, 69, 632–641.
  59. Rai, K.K.; Rai, N.; Rai, S.P. Recent advancement in modern genomic tools for adaptation of Lablab purpureus L to biotic and abiotic stresses: Present mechanisms and future adaptations. Acta Physiol. Plant. 2018, 40, 164.
  60. Chang, Y.; Liu, H.; Liu, M.; Liao, X.; Sahu, S.K.; Fu, Y.; Song, B.; Cheng, S.; Kariba, R.; Muthemba, S.; et al. The draft genomes of five agriculturally important African orphan crops. GigaScience 2019, 8, giy152.
  61. Li, N.; Bai, J.-Q.; Gao, S.; Yang, L.; Li, J.; Du, S.-B.; Wang, X.-P. The complete molecular sequence of chloroplast genome of Lablab purpureus (L.) Sweet. Mitochondrial DNA Part B 2021, 6, 758–759.
  62. Sørensen, M. Observations on distribution, ecology and cultivation of the tuber-bearing legume genus Pachyrhizus Rich. ex DC. Agric. Univ. Wagening. Pap. 1990, 3, 38.
  63. Ade-Omowaye, B.; Tucker, G.; Smetanska, I. Nutritional potential of nine underexploited legumes in Southwest Nigeria. Int. Food Res. J. 2015, 22, 798.
  64. Pati, K.; Zhang, F.; Batley, J. First report of genome size and ploidy of the underutilized leguminous tuber crop Yam Bean (Pachyrhizus erosus and P. tuberosus) by flow cytometry. Plant Genet. Resour. Charact. Util. 2019, 17, 456–459.
  65. Sørensen, M. Yam Bean: Pachyrhizus DC.-Promoting the Conservation and Use of Underutilized and Neglected Crops. 2; Bioversity International: Roma, Italy, 1996; Volume 2.
  66. Zhang, W.E.; Wang, F.; Pan, X.J.; Tian, Z.G.; Zhao, X.M. Antioxidant enzymes and photosynthetic responses to drought stress of three Canna edulis Cultivars. Hortic. Sci. Technol. 2013, 31, 677–686.
  67. Sandoval, L.; Zamora-Castro, S.A.; Vidal-Álvarez, M.; Marín-Muñiz, J.L. Role of wetland plants and use of ornamental flowering plants in constructed wetlands for wastewater treatment: A review. Appl. Sci. 2019, 9, 685.
  68. Zhu, Q.; Cai, L.; Li, H.; Zhang, Y.; Su, W.; Zhou, Q. The complete chloroplast genome sequence of the Canna edulis Ker Gawl. (Cannaceae). Mitochondrial DNA Part B Resour. 2020, 5, 2427–2428.
  69. Zhang, L.; Wan, X.; Xu, J.; Lin, L.; Qi, J. De novo assembly of kenaf (Hibiscus cannabinus) transcriptome using Illumina sequencing for gene discovery and marker identification. Mol. Breed. 2015, 35, 192.
  70. Liao, X.; Zhao, Y.; Kong, X.; Khan, A.; Zhou, B.; Liu, D.; Kashif, M.H.; Chen, P.; Wang, H.; Zhou, R. Complete sequence of kenaf (Hibiscus cannabinus) mitochondrial genome and comparative analysis with the mitochondrial genomes of other plants. Sci. Rep. 2018, 8, 12714.
  71. Bowers, J.E.; Pearl, S.A.; Burke, J.M. Genetic mapping of millions of SNPs in safflower (Carthamus tinctorius L.) via whole-genome resequencing. G3 Genes|Genomes|Genet. 2016, 6, 2203–2211.
  72. Zhu, H.; Wang, Z.; Ma, C.; Tian, J.; Fu, F.; Li, C.; Guo, D.; Roeder, E.; Liu, K. Neuroprotective effects of hydroxysafflor yellow A: In vivo and in vitro studies. Planta Med. 2003, 69, 429–433.
  73. Wu, Z.; Liu, H.; Zhan, W.; Yu, Z.; Qin, E.; Liu, S.; Yang, T.; Xiang, N.; Kudrna, D.; Chen, Y.; et al. The chromosome-scale reference genome of safflower (Carthamus tinctorius) provides insights into linoleic acid and flavonoid biosynthesis. Plant Biotechnol. J. 2021, 19, 1725–1742.
  74. Ray, P.K. Breeding Tropical and Subtropical Fruits; Springer Science & Business Media: Berlin, Germany, 2002.
  75. Upadhyay, R.; Dass, J.F.P.; Chauhan, A.K.; Yadav, P.; Singh, M.; Singh, R.B. Chapter 21—Guava enriched functional foods: Therapeutic potentials and technological challenges. In The Role of Functional Food Security in Global Health; Singh, R.B., Watson, R.R., Takahashi, T., Eds.; Academic Press: Cambridge, MA, USA, 2019; pp. 365–378.
  76. Thakur, S.; Yadav, I.S.; Jindal, M.; Sharma, P.K.; Dhillon, G.S.; Boora, R.S.; Arora, N.K.; Gill, M.I.S.; Chhuneja, P.; Mittal, A. Development of genome-wide functional markers using draft genome assembly of guava (Psidium guajava L.) cv. Allahabad safeda to expedite molecular breeding. Front. Plant Sci. 2021, 12, 708332.
  77. Feng, C.; Feng, C.; Lin, X.; Liu, S.; Li, Y.; Kang, M. A chromosome-level genome assembly provides insights into ascorbic acid accumulation and fruit softening in guava (Psidium guajava). Plant Biotechnol. J. 2021, 19, 717–730.
  78. Mittal, A.; Yadav, I.S.; Arora, N.K.; Boora, R.S.; Mittal, M.; Kaur, P.; Erskine, W.; Chhuneja, P.; Gill, M.I.S.; Singh, K. RNA-sequencing based gene expression landscape of guava cv. Allahabad Safeda and comparative analysis to colored cultivars. BMC Genom. 2020, 21, 484.
  79. Borrell, J.S.; Biswas, M.K.; Goodwin, M.; Blomme, G.; Schwarzacher, T.; Heslop-Harrison, J.S.; Wendawek, A.M.; Berhanu, A.; Kallow, S.; Janssens, S.; et al. Enset in Ethiopia: A poorly characterized but resilient starch staple. Ann. Bot. 2019, 123, 747–766.
  80. Rijzaani, H.; Bayer, P.E.; Rouard, M.; Doležel, J.; Batley, J.; Edwards, D. The pangenome of banana highlights differences between genera and genomes. Plant Genome 2021, e20100.
  81. Golicz, A.A.; Batley, J.; Edwards, D. Towards plant pangenomics. Plant Biotechnol. J. 2016, 14, 1099–1105.
  82. Danilevicz, M.F.; Tay Fernandez, C.G.; Marsh, J.I.; Bayer, P.E.; Edwards, D. Plant pangenomics: Approaches, applications and advancements. Curr. Opin. Plant Biol. 2020, 54, 18–25.
  83. Garrison, E.; Sirén, J.; Novak, A.M.; Hickey, G.; Eizenga, J.M.; Dawson, E.T.; Jones, W.; Garg, S.; Markello, C.; Lin, M.F.; et al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat. Biotechnol. 2018, 36, 875–879.
  84. Marcus, S.; Lee, H.; Schatz, M.C. SplitMEM: A graphical algorithm for pan-genome analysis with suffix skips. Bioinformatics 2014, 30, 3476–3483.
  85. Li, H.; Feng, X.; Chu, C. The design and construction of reference pangenome graphs with minigraph. Genome Biol. 2020, 21, 265.
  86. Eizenga, J.M.; Novak, A.M.; Sibbesen, J.A.; Heumos, S.; Ghaffaari, A.; Hickey, G.; Chang, X.; Seaman, J.D.; Rounthwaite, R.; Ebler, J.; et al. Pangenome graphs. Annu. Rev. Genom. Hum. Genet. 2020, 21, 139–162.
  87. Paten, B.; Novak, A.M.; Eizenga, J.M.; Garrison, E. Genome graphs and the evolution of genome inference. Genome Res. 2017, 27, 665–676.
  88. Hickey, G.; Heller, D.; Monlong, J.; Sibbesen, J.A.; Sirén, J.; Eizenga, J.; Dawson, E.T.; Garrison, E.; Novak, A.M.; Paten, B. Genotyping structural variants in pangenome graphs using the vg toolkit. Genome Biol. 2020, 21, 35.
  89. Rabbani, L.; Müller, J.; Weigel, D. An algorithm to build a multi-genome reference. bioRxiv 2020.
  90. The Computational Pan-Genomics Consortium. Computational pan-genomics: Status, promises and challenges. Brief. Bioinform. 2018, 19, 118–135.
  91. Rakocevic, G.; Semenyuk, V.; Lee, W.-P.; Spencer, J.; Browning, J.; Johnson, I.J.; Arsenijevic, V.; Nadj, J.; Ghose, K.; Suciu, M.C.; et al. Fast and accurate genomic analyses using genome graphs. Nat. Genet. 2019, 51, 354–362.
  92. Jensen, S.E.; Charles, J.R.; Muleta, K.; Bradbury, P.J.; Casstevens, T.; Deshpande, S.P.; Gore, M.A.; Gupta, R.; Ilut, D.C.; Johnson, L.; et al. A sorghum Practical Haplotype Graph facilitates genome-wide imputation and cost-effective genomic prediction. Plant Genome 2020, 13, e20009.
  93. Zanini, S.F.; Bayer, P.E.; Wells, R.; Snowdon, R.J.; Batley, J.; Varshney, R.K.; Nguyen, H.T.; Edwards, D.; Golicz, A.A. Pangenomics in crop improvement—From coding structural variations to finding regulatory variants with pangenome graphs. Plant Genome 2021, 13, e20177.
  94. Bradbury, P.J.; Casstevens, T.; Jensen, S.E.; Johnson, L.C.; Miller, Z.R.; Monier, B.; Romay, M.C.; Song, B.; Buckler, E.S. The practical haplotype graph, a platform for storing and using pangenomes for imputation. bioRxiv 2021.
  95. Long, E.M.; Bradbury, P.J.; Romay, M.C.; Buckler, E.S.; Robbins, K.R. Genome-wide imputation using the practical haplotype graph in the heterozygous crop cassava. G3 Genes|Genomes|Genet. 2021, 12, jkab383.
  96. Franco, J.A.V.; Gage, J.L.; Bradbury, P.J.; Johnson, L.C.; Miller, Z.R.; Buckler, E.S.; Romay, M.C. A maize practical haplotype graph leverages diverse NAM assemblies. bioRxiv 2020. [Google Scholar] [CrossRef]
  97. Maistrenko, O.M.; Mende, D.R.; Luetge, M.; Hildebrand, F.; Schmidt, T.S.B.; Li, S.S.; Rodrigues, J.F.M.; von Mering, C.; Pedro Coelho, L.; Huerta-Cepas, J.; et al. Disentangling the impact of environmental and phylogenetic constraints on prokaryotic within-species diversity. ISME J. 2020, 14, 1247–1259. [Google Scholar] [CrossRef] [Green Version]
  98. Khan, A.W.; Garg, V.; Roorkiwal, M.; Golicz, A.A.; Edwards, D.; Varshney, R.K. Super-pangenome by integrating the wild side of a species for accelerated crop improvement. Trends Plant Sci. 2020, 25, 148–158. [Google Scholar] [CrossRef] [Green Version]
  99. He, Z.; Ji, R.; Havlickova, L.; Wang, L.; Li, Y.; Lee, H.T.; Song, J.; Koh, C.; Yang, J.; Zhang, M.; et al. Genome structural evolution in Brassica crops. Nat. Plants 2021, 7, 757–765. [Google Scholar] [CrossRef]
  100. Mohd Saad, N.S.; Severn-Ellis, A.A.; Pradhan, A.; Edwards, D.; Batley, J. Genomics armed with diversity leads the way in Brassica improvement in a changing global environment. Front Genet 2021, 12, 110. [Google Scholar] [CrossRef]
  101. Yuan, Y.; Bayer, P.E.; Batley, J.; Edwards, D. Current status of structural variation studies in plants. Plant Biotechnol. J. 2021, 19, 2153–2163. [Google Scholar] [CrossRef]
  102. Sekhwal, M.K.; Li, P.; Lam, I.; Wang, X.; Cloutier, S.; You, F.M. Disease resistance gene analogs (RGAs) in plants. Int. J. Mol. Sci. 2015, 16, 19248–19290. [Google Scholar] [CrossRef] [Green Version]
  103. Bayer, P.E.; Golicz, A.A.; Tirnaz, S.; Chan, C.-K.K.; Edwards, D.; Batley, J. Variation in abundance of predicted resistance genes in the Brassica oleracea pangenome. Plant Biotechnol. J. 2019, 17, 789–800. [Google Scholar] [CrossRef] [Green Version]
  104. Dolatabadian, A.; Bayer, P.E.; Tirnaz, S.; Hurgobin, B.; Edwards, D.; Batley, J. Characterization of disease resistance genes in the Brassica napus pangenome reveals significant structural variation. Plant Biotechnol. J. 2020, 18, 969–982. [Google Scholar] [CrossRef] [Green Version]
  105. Cantila, A.Y.; Saad, N.S.M.; Amas, J.C.; Edwards, D.; Batley, J. Recent findings unravel genes and genetic factors underlying leptosphaeria maculans resistance in Brassica napus and its relatives. Int. J. Mol. Sci. 2020, 22, 313. [Google Scholar] [CrossRef]
  106. Zhang, Y.; Thomas, W.; Bayer, P.E.; Edwards, D.; Batley, J. Frontiers in dissecting and managing Brassica diseases: From reference-based RGA candidate identification to building Pan-RGAomes. Int. J. Mol. Sci. 2020, 21, 8964. [Google Scholar] [CrossRef]
  107. Bakker, E.G.; Toomajian, C.; Kreitman, M.; Bergelson, J. A genome-wide survey of R gene polymorphisms in Arabidopsis. Plant Cell 2006, 18, 1803–1818. [Google Scholar] [CrossRef] [Green Version]
  108. Stam, R.; Silva-Arias, G.A.; Tellier, A. Subsets of NLR genes show differential signatures of adaptation during colonization of new habitats. New Phytol. 2019, 224, 367–379. [Google Scholar] [CrossRef]
  109. Garg, B.K.; Kathju, S.; Burman, U. Influence of Water Stress on Water Relations, Photosynthetic Parameters and Nitrogen Metabolism of Moth Bean Genotypes. Biol. Plant. 2001, 44, 289–292. [Google Scholar] [CrossRef]
  110. Garg, B.K.; Burman, U.; Kathju, S. The influence of phosphorus nutrition on the physiological response of moth bean genotypes to drought. J. Plant Nutr. Soil Sci. 2004, 167, 503–508. [Google Scholar] [CrossRef]
  111. Yao, L.-M.; Wang, B.; Cheng, L.-J.; Wu, T.-L. Identification of key drought stress-related genes in the hyacinth bean. PLoS ONE 2013, 8, e58108. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  112. Naeem, M.; Shabbir, A.; Ansari, A.A.; Aftab, T.; Khan, M.M.A.; Uddin, M. Hyacinth bean (Lablab purpureus L.)—An underutilised crop with future potential. Sci. Hortic. 2020, 272, 109551. [Google Scholar] [CrossRef]
  113. Hu, H.; Scheben, A.; Verpaalen, B.; Tirnaz, S.; Bayer, P.E.; Hodel, R.G.J.; Batley, J.; Soltis, D.E.; Soltis, P.S.; Edwards, D. Amborella gene presence/absence variation is associated with abiotic stress responses that may contribute to environmental adaptation. New Phytol. 2021, 233, 1548–1555. [Google Scholar] [CrossRef]
  114. Zhang, X.; Liu, T.; Wang, J.; Wang, P.; Qiu, Y.; Zhao, W.; Pang, S.; Li, X.; Wang, H.; Song, J.; et al. Pan-genome of Raphanus highlights genetic variation and introgression among domesticated, wild, and weedy radishes. Mol. Plant 2021, 14, 2032–2055. [Google Scholar] [CrossRef]
  115. Alonge, M.; Wang, X.; Benoit, M.; Soyk, S.; Pereira, L.; Zhang, L.; Suresh, H.; Ramakrishnan, S.; Maumus, F.; Ciren, D.; et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell 2020, 182, 145–161.e123. [Google Scholar] [CrossRef]
  116. Varshney, R.K.; Bohra, A.; Roorkiwal, M.; Barmukh, R.; Cowling, W.A.; Chitikineni, A.; Lam, H.-M.; Hickey, L.T.; Croser, J.S.; Bayer, P.E.; et al. Fast-forward breeding for a food-secure world. Trends Genet. 2021, 37, 1124–1136. [Google Scholar] [CrossRef]
  117. Tanaka, N.; Shenton, M.; Kawahara, Y.; Kumagai, M.; Sakai, H.; Kanamori, H.; Yonemaru, J.; Fukuoka, S.; Sugimoto, K.; Ishimoto, M.; et al. Whole-genome sequencing of the NARO world rice core collection (WRC) as the Basis for diversity and association studies. Plant Cell Physiol. 2020, 61, 922–932. [Google Scholar] [CrossRef] [Green Version]
  118. Marsh, J.I.; Hu, H.; Gill, M.; Batley, J.; Edwards, D. Crop breeding for a changing climate: Integrating phenomics and genomics with bioinformatics. Theor. Appl. Genet. 2021, 134, 1677–1690. [Google Scholar] [CrossRef]
  119. Shook, J.M.; Zhang, J.; Jones, S.E.; Singh, A.; Diers, B.W.; Singh, A.K. Meta-GWAS for quantitative trait loci identification in soybean. G3 Genes|Genomes|Genet. 2021, 11, kab117. [Google Scholar] [CrossRef] [PubMed]
  120. Zheng, B.; Zhao, Q.; Wu, H.; Wang, S.; Zou, M. A Comparative metabolomics analysis of guava (Psidium guajava L.) fruit with different colors. ACS Food Sci. Technol. 2021, 1, 96–106. [Google Scholar] [CrossRef]
  121. Lee, S.; Choi, H.-K.; Cho, S.K.; Kim, Y.-S. Metabolic analysis of guava (Psidium guajava L.) fruits at different ripening stages using different data-processing approaches. J. Chromatogr. B 2010, 878, 2983–2988. [Google Scholar] [CrossRef]
  122. Moon, P.; Fu, Y.; Bai, J.; Plotto, A.; Crane, J.; Chambers, A. Assessment of fruit aroma for twenty-seven guava (Psidium guajava) accessions through three fruit developmental stages. Sci. Hortic. 2018, 238, 375–383. [Google Scholar] [CrossRef]
  123. Zanklan, A.S.; Becker, H.C.; Sørensen, M.; Pawelzik, E.; Grüneberg, W.J. Genetic diversity in cultivated yam bean (Pachyrhizus spp.) evaluated through multivariate analysis of morphological and agronomic traits. Genet. Resour. Crop Evol. 2018, 65, 811–843. [Google Scholar] [CrossRef] [Green Version]
  124. Tapia, C.; Sørensen, M. Morphological characterization of the genetic variation existing in a Neotropical collection of yam bean, Pachyrhizus tuberosus (Lam.) Spreng. Genet. Resour. Crop Evol. 2003, 50, 681–692. [Google Scholar] [CrossRef]
  125. Silva, E.; Filho, D.; Ticona-Benavente, C. Diversity of yam bean (Pachyrhizus spp. Fabaceae) based on morphoagronomic traits in the Brazilian Amazon. Acta Amaz. 2016, 233, 233–240. [Google Scholar] [CrossRef] [Green Version]
  126. Martina, M.; Tikunov, Y.; Portis, E.; Bovy, A.G. The genetic basis of tomato aroma. Genes 2021, 12, 226. [Google Scholar] [CrossRef]
  127. Pereira, L.; Sapkota, M.; Alonge, M.; Zheng, Y.; Zhang, Y.; Razifard, H.; Taitano, N.K.; Schatz, M.C.; Fernie, A.R.; Wang, Y.; et al. Natural genetic diversity in tomato flavor genes. Front. Plant Sci. 2021, 12, 914. [Google Scholar] [CrossRef] [PubMed]
  128. Sharma, D.; Tiwari, A.; Sood, S.; Jamra, G.; Singh, N.K.; Meher, P.K.; Kumar, A. Genome wide association mapping of agro-morphological traits among a diverse collection of finger millet (Eleusine coracana L.) genotypes using SNP markers. PLoS ONE 2018, 13, e0199444. [Google Scholar] [CrossRef] [PubMed]
  129. Puranik, S.; Sahu, P.P.; Beynon, S.; Srivastava, R.K.; Sehgal, D.; Ojulong, H.; Yadav, R. Genome-wide association mapping and comparative genomics identifies genomic regions governing grain nutritional traits in finger millet (Eleusine coracana L. Gaertn.). Plants People Planet 2020, 2, 649–662. [Google Scholar] [CrossRef]
  130. Liu, Y.; Wang, D.; He, F.; Wang, J.; Joshi, T.; Xu, D. Phenotype prediction and genome-wide association study using deep convolutional neural network of soybean. Front. Genet. 2019, 10, 1091. [Google Scholar] [CrossRef]
  131. Bayer, P.E.; Petereit, J.; Danilevicz, M.F.; Anderson, R.; Batley, J.; Edwards, D. The application of pangenomics and machine learning in genomic selection in plants. Plant Genome 2021, 14, e20112. [Google Scholar] [CrossRef]
  132. Min, S.; Lee, B.; Yoon, S. Deep learning in bioinformatics. Brief. Bioinform. 2017, 18, 851–869. [Google Scholar] [CrossRef] [Green Version]
  133. Gao, S.; Wu, J.; Stiller, J.; Zheng, Z.; Zhou, M.; Wang, Y.-G.; Liu, C. Identifying barley pan-genome sequence anchors using genetic mapping and machine learning. Theor. Appl. Genet. 2020, 133, 2535–2544. [Google Scholar] [CrossRef]
  134. Bayer, P.E.; Scheben, A.; Golicz, A.A.; Yuan, Y.; Faure, S.; Lee, H.; Chawla, H.S.; Anderson, R.; Bancroft, I.; Raman, H.; et al. Modelling of gene loss propensity in the pangenomes of three Brassica species suggests different mechanisms between polyploids and diploids. Plant Biotechnol. J. 2021, 19, 2488–2500. [Google Scholar] [CrossRef]
  135. Gabur, I.; Chawla, H.S.; Lopisso, D.T.; von Tiedemann, A.; Snowdon, R.J.; Obermeier, C. Gene presence-absence variation associates with quantitative Verticillium longisporum disease resistance in Brassica napus. Sci. Rep. 2020, 10, 4131. [Google Scholar] [CrossRef] [Green Version]
  136. Merker, J.D.; Wenger, A.M.; Sneddon, T.; Grove, M.; Zappala, Z.; Fresard, L.; Waggott, D.; Utiramerur, S.; Hou, Y.; Smith, K.S.; et al. Long-read genome sequencing identifies causal structural variation in a Mendelian disease. Genet. Med. 2018, 20, 159–163. [Google Scholar] [CrossRef] [Green Version]
  137. Jain, M.; Olsen, H.E.; Paten, B.; Akeson, M. The oxford nanopore MinION: Delivery of nanopore sequencing to the genomics community. Genome Biol. 2016, 17, 239. [Google Scholar] [CrossRef] [Green Version]
  138. Jain, M.; Koren, S.; Miga, K.H.; Quick, J.; Rand, A.C.; Sasani, T.A.; Tyson, J.R.; Beggs, A.D.; Dilthey, A.T.; Fiddes, I.T.; et al. Nanopore sequencing and assembly of a human genome with ultra-long reads. Nat. Biotechnol. 2018, 36, 338–345. [Google Scholar] [CrossRef] [Green Version]
  139. Wenger, A.M.; Peluso, P.; Rowell, W.J.; Chang, P.-C.; Hall, R.J.; Concepcion, G.T.; Ebler, J.; Fungtammasan, A.; Kolesnikov, A.; Olson, N.D.; et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 2019, 37, 1155–1162. [Google Scholar] [CrossRef]
  140. Lan, T.; Renner, T.; Ibarra-Laclette, E.; Farr, K.M.; Chang, T.-H.; Cervantes-Pérez, S.A.; Zheng, C.; Sankoff, D.; Tang, H.; Purbojati, R.W.; et al. Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome. Proc. Natl. Acad. Sci. USA 2017, 114, E4435. [Google Scholar] [CrossRef] [Green Version]
  141. Mahmoud, M.; Gobet, N.; Cruz-Dávalos, D.I.; Mounier, N.; Dessimoz, C.; Sedlazeck, F.J. Structural variant calling: The long and the short of it. Genome Biol. 2019, 20, 246. [Google Scholar] [CrossRef]
  142. Bhat, J.A.; Ali, S.; Salgotra, R.K.; Mir, Z.A.; Dutta, S.; Jadon, V.; Tyagi, A.; Mushtaq, M.; Jain, N.; Singh, P.K.; et al. Genomic selection in the Era of next generation sequencing for complex traits in plant breeding. Front Genet 2016, 7, 221. [Google Scholar] [CrossRef] [Green Version]
  143. Midha, M.K.; Wu, M.; Chiu, K.P. Long-read sequencing in deciphering human genetics to a greater depth. Hum. Genet. 2019, 138, 1201–1215. [Google Scholar] [CrossRef]
  144. Laing, C.; Buchanan, C.; Taboada, E.N.; Zhang, Y.; Kropinski, A.; Villegas, A.; Thomas, J.E.; Gannon, V.P.J. Pan-genome sequence analysis using Panseq: An online tool for the rapid analysis of core and accessory genomic regions. BMC Bioinform. 2010, 11, 461. [Google Scholar] [CrossRef] [Green Version]
  145. Xiao, T.; Zhou, W. The third generation sequencing: The advanced approach to genetic diseases. Transl. Pediatr. 2020, 9, 163–173. [Google Scholar] [CrossRef]
  146. Amarasinghe, S.L.; Su, S.; Dong, X.; Zappia, L.; Ritchie, M.E.; Gouil, Q. Opportunities and challenges in long-read sequencing data analysis. Genome Biol. 2020, 21, 30. [Google Scholar] [CrossRef] [Green Version]
  147. Brůna, T.; Hoff, K.J.; Lomsadze, A.; Stanke, M.; Borodovsky, M. BRAKER2: Automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom. Bioinform. 2021, 3, lqaa108. [Google Scholar] [CrossRef] [PubMed]
  148. Campbell, M.S.; Holt, C.; Moore, B.; Yandell, M. Genome annotation and curation using MAKER and MAKER-P. Curr. Protoc. Bioinform. 2014, 48, 4.11.11–14.11.39. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  149. Salzberg, S.L. Next-generation genome annotation: We still struggle to get it right. Genome Biol. 2019, 20, 92. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  150. Golicz, A.A.; Bhalla, P.L.; Singh, M.B. MCRiceRepGP: A framework for the identification of genes associated with sexual reproduction in rice. Plant J. 2018, 96, 188–202. [Google Scholar] [CrossRef] [Green Version]
  151. Scheben, A.; Edwards, D. Bottlenecks for genome-edited crops on the road from lab to farm. Genome Biol. 2018, 19, 178. [Google Scholar] [CrossRef] [Green Version]
  152. The UniProt Consortium. UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Res. 2021, 49, D480–D489. [Google Scholar] [CrossRef]
  153. Coordinators, N.R. Database resources of the national center for biotechnology information. Nucleic Acids Res. 2018, 46, D8–D13. [Google Scholar] [CrossRef] [Green Version]
  154. Bayer, P.E.; Edwards, D.; Batley, J. Bias in resistance gene prediction due to repeat masking. Nat. Plants 2018, 4, 762–765. [Google Scholar] [CrossRef]
  155. Sherman, R.M.; Salzberg, S.L. Pan-genomics in the human genome era. Nat. Rev. Genet. 2020, 21, 243–254. [Google Scholar] [CrossRef]
  156. Sirén, J.; Garrison, E.; Novak, A.M.; Paten, B.; Durbin, R. Haplotype-aware graph indexes. Bioinformatics 2020, 36, 400–407. [Google Scholar] [CrossRef]
  157. Ou, L.; Li, D.; Lv, J.; Chen, W.; Zhang, Z.; Li, X.; Yang, B.; Zhou, S.; Yang, S.; Li, W.; et al. Pan-genome of cultivated pepper (Capsicum) and its use in gene presence–absence variation analyses. New Phytol. 2018, 220, 360–363. [Google Scholar] [CrossRef] [Green Version]
  158. Jiao, W.-B.; Schneeberger, K. Chromosome-level assemblies of multiple Arabidopsis genomes reveal hotspots of rearrangements with altered evolutionary dynamics. Nat. Commun. 2020, 11, 989. [Google Scholar] [CrossRef] [Green Version]
  159. Gordon, S.P.; Contreras-Moreira, B.; Levy, J.J.; Djamei, A.; Czedik-Eysenberg, A.; Tartaglio, V.S.; Session, A.; Martin, J.; Cartwright, A.; Katz, A.; et al. Gradual polyploid genome evolution revealed by pan-genomic analysis of Brachypodium hybridum and its diploid progenitors. Nat. Commun. 2020, 11, 3670. [Google Scholar] [CrossRef]
  160. Hübner, S.; Bercovich, N.; Todesco, M.; Mandel, J.R.; Odenheimer, J.; Ziegler, E.; Lee, J.S.; Baute, G.J.; Owens, G.L.; Grassa, C.J.; et al. Sunflower pan-genome analysis shows that hybridization altered gene content and disease resistance. Nat. Plants 2019, 5, 54–62. [Google Scholar] [CrossRef]
  161. Sun, X.; Jiao, C.; Schwaninger, H.; Chao, C.T.; Ma, Y.; Duan, N.; Khan, A.; Ban, S.; Xu, K.; Cheng, L.; et al. Phased diploid genome assemblies and pan-genomes provide insights into the genetic history of apple domestication. Nat. Genet. 2020, 52, 1423–1432. [Google Scholar] [CrossRef]
  162. Zhou, P.; Silverstein, K.A.T.; Ramaraj, T.; Guhlin, J.; Denny, R.; Liu, J.; Farmer, A.D.; Steele, K.P.; Stupar, R.M.; Miller, J.R.; et al. Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes. BMC Genom. 2017, 18, 261. [Google Scholar] [CrossRef] [Green Version]
  163. Zhang, B.; Zhu, W.; Diao, S.; Wu, X.; Lu, J.; Ding, C.; Su, X. The poplar pangenome provides insights into the evolutionary history of the genus. Commun. Biol. 2019, 2, 215. [Google Scholar] [CrossRef]
  164. Okada, R.; Kiyota, E.; Moriyama, H.; Toshiyuki, F.; Valverde, R.A. A new endornavirus species infecting Malabar spinach (Basella alba L.). Arch. Virol. 2014, 159, 807–809. [Google Scholar] [CrossRef]
  165. Wang, X.; Larrea-Sarmiento, A.; Borth, W.B.; Barone, R.; Olmedo-Velarde, A.; Melzer, M.J.; Suzuki, J.Y.; Wall, M.M.; Hu, J.S. First report of Basella alba naturally infected with basella rugose mosaic virus in Hawaii. Plant Dis. 2020, 104, 2296. [Google Scholar] [CrossRef] [Green Version]
  166. Silva, L.; Techio, V.; Resende, L.; TBraz, G.; Resende, K.; Samartini, C. Unconventional vegetables collected in Brazil: Chromosome number and description of nuclear DNA content ARTICLE. Crop Breed. Appl. Biotechnol. 2017, 17, 320–326. [Google Scholar] [CrossRef] [Green Version]
  167. Tripathi, P.; Shahin, L.; Sangra, A.; Bajaj, R.; Arun, A.; Berrios, J.A.N. Current Status and future prospects for select underutilized medicinally valuable plants of puerto rico: A case study. In Medicinal Plants: From Farm to Pharmacy; Joshee, N., Dhekney, S.A., Parajuli, P., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 81–110. [Google Scholar]
  168. Paz, F.S.; Pinto, C.E.; de Brito, R.M.; Imperatriz-Fonseca, V.L.; Giannini, T.C. Edible fruit plant species in the amazon forest rely mostly on bees and beetles as pollinators. J. Econ. Entomol. 2021, 114, 710–722. [Google Scholar] [CrossRef] [PubMed]
  169. Lysak, M.A.; Cheung, K.; Kitschke, M.; Bures, P. Ancestral chromosomal blocks are triplicated in Brassiceae species with varying chromosome number and genome size. Plant Physiol. 2007, 145, 402–410. [Google Scholar] [CrossRef] [PubMed]
  170. Grande, F.; Rizzuti, B.; Occhiuzzi, M.A.; Ioele, G.; Casacchia, T.; Gelmini, F.; Guzzi, R.; Garofalo, A.; Statti, G. Identification by molecular docking of homoisoflavones from Leopoldia comosa as ligands of estrogen receptors. Molecules 2018, 23, 894. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  171. Boulfia, M.; Lamchouri, F.; Senhaji, S.; Lachkar, N.; Bouabid, K.; Toufik, H. Mineral content, chemical analysis, in vitro antidiabetic and antioxidant activities, and antibacterial power of aqueous and organic extracts of moroccan Leopoldia comosa (L.) parl. bulbs. Evid.-Based Complementary Altern. Med. 2021, 2021, 9932291. [Google Scholar] [CrossRef]
  172. Maroyi, A. Contribution of Schinziophyton rautanenii to sustainable diets, livelihood needs and environmental sustainability in Southern Africa. Sustainability 2018, 10, 581. [Google Scholar] [CrossRef] [Green Version]
  173. Frankova, A.; Manourova, A.; Kotikova, Z.; Vejvodova, K.; Drabek, O.; Riljakova, B.; Famera, O.; Ngula, M.; Ndiyoi, M.; Polesny, Z.; et al. The chemical composition of oils and cakes of Ochna serrulata (Ochnaceae) and other underutilized traditional oil trees from Western Zambia. Molecules 2021, 26, 5210. [Google Scholar] [CrossRef]
  174. Brunt, A.; Phillips, S.U.E.; Jones, R.; Kenten, R. Viruses detected in Ullucus tuberosus (Basellaceae) from Peru and Bolivia. Ann. Appl. Biol. 2008, 101, 65–71. [Google Scholar] [CrossRef]
  175. Fox, A.; Fowkes, A.; Skelton, A.; Harju, V.; Buxton-Kirk, A.; Kelly, M.; Forde, S.; Pufal, H.; Conyers, C.; Ward, R.; et al. Using high throughput sequencing in support of a plant health outbreak reveals novel viruses in Ullucus tuberosus (Basellaceae). Plant Pathol. 2019, 68, 576–587. [Google Scholar]
  176. Dholakia, H.; Mehta, D.; Joshi, M.; Delvadiya, I. Molecular characterization of Indian bean (Lablab purpureus L.) genotypes. J. Pharmacogn. Phytochem. 2019, 8, 455–463. [Google Scholar]
  177. Wang, J.; Chen, X.; Chu, S.; You, Y.; Chi, Y.; Wang, R.; Yang, X.; Hayat, K.; Zhang, D.; Zhou, P. Comparative cytology combined with transcriptomic and metabolomic analyses of Solanum nigrum L. in response to Cd toxicity. J. Hazard. Mater. 2022, 423, 127168. [Google Scholar] [CrossRef]
  178. Xu, J.; Sun, J.; Du, L.; Liu, X. Comparative transcriptome analysis of cadmium responses in Solanum nigrum and Solanum torvum. New Phytol. 2012, 196, 110–124. [Google Scholar] [CrossRef]
  179. Khan, A.R.; Park, C.E.; Park, G.-S.; Seo, Y.-J.; So, J.-H.; Shin, J.-H. The whole chloroplast genome sequence of black nightshade plant (Solanum nigrum). Mitochondrial DNA Part A 2017, 28, 169–170. [Google Scholar] [CrossRef]
  180. Cho, K.-S.; Park, T.-H. Complete chloroplast genome sequence of Solanum nigrum and development of markers for the discrimination of S. nigrum. Hortic. Environ. Biotechnol. 2016, 57, 69–78. [Google Scholar] [CrossRef]
  181. Leu, Y.-L.; Hwang, T.-L.; Kuo, P.-C.; Liou, K.-P.; Huang, B.-S.; Chen, G.-F. Constituents from Vigna vexillata and their anti-inflammatory activity. Int. J. Mol. Sci. 2012, 13, 9754–9768. [Google Scholar] [CrossRef] [Green Version]
  182. Dachapak, S.; Tomooka, N.; Somta, P.; Naito, K.; Kaga, A.; Srinives, P. QTL analysis of domestication syndrome in zombi pea (Vigna vexillata), an underutilized legume crop. PLoS ONE 2018, 13, e0200116. [Google Scholar] [CrossRef] [Green Version]
  183. Marubodee, R.; Ogiso-Tanaka, E.; Isemura, T.; Chankaew, S.; Kaga, A.; Naito, K.; Ehara, H.; Tomooka, N. Construction of an SSR and RAD-marker based molecular linkage map of Vigna vexillata (L.) A. Rich. PLoS ONE 2015, 10, e0138942. [Google Scholar] [CrossRef]
  184. Damayanti, F.; Lawn, R.J.; Bielig, L.M. Genetic compatibility among domesticated and wild accessions of the tropical tuberous legume Vigna vexillata (L.) A. Rich. Crop Pasture Sci. 2010, 61, 785–797. [Google Scholar] [CrossRef]
  185. Lulin, H.; Xiao, Y.; Pei, S.; Wen, T.; Shangqin, H. The first illumina-based De Novo transcriptome sequencing and analysis of safflower flowers. PLoS ONE 2012, 7, e38653. [Google Scholar] [CrossRef]
  186. Chen, J.; Tang, X.; Ren, C.; Wei, B.; Wu, Y.; Wu, Q.; Pei, J. Full-length transcriptome sequences and the identification of putative genes for flavonoid biosynthesis in safflower. BMC Genom. 2018, 19, 548. [Google Scholar] [CrossRef] [Green Version]
  187. Wu, Z.H.; Liao, R.; Dong, X.; Qin, R.; Liu, H. Complete chloroplast genome sequence of Carthamus tinctorius L. from PacBio Sequel Platform. Mitochondrial DNA B Resour. 2019, 4, 2635–2636. [Google Scholar] [CrossRef] [Green Version]
  188. Couvreur, T.L.P.; Hahn, W.J.; Granville, J.J.d.; Pham, J.L.; Ludeña, B.; Pintaud, J.C. Phylogenetic relationships of the cultivated neotropical palm Bactris gasipaes (Arecaceae) with its wild relatives inferred from chloroplast and nuclear DNA polymorphisms. Syst. Bot. 2007, 32, 519–530. [Google Scholar] [CrossRef]
  189. Bazzo, B.R.; de Carvalho, L.M.; Carazzolle, M.F.; Pereira, G.A.G.; Colombo, C.A. Development of novel EST-SSR markers in the macaúba palm (Acrocomia aculeata) using transcriptome sequencing and cross-species transferability in Arecaceae species. BMC Plant Biol. 2018, 18, 276. [Google Scholar] [CrossRef]
  190. Xiao, Y. Efficient isolation of high quality RNA from tropical palms for RNA-seq analysis. Plant Omics 2012, 5, 584–589. [Google Scholar]
  191. Santos da Silva, R.; Roland Clement, C.; Balsanelli, E.; de Baura, V.A.; Maltempi de Souza, E.; Pacheco de Freitas Fraga, H.; do Nascimento Vieira, L. The plastome sequence of Bactris gasipaes and evolutionary analysis in tribe Cocoseae (Arecaceae). PLoS ONE 2021, 16, e0256373. [Google Scholar] [CrossRef] [PubMed]
  192. Levi, A.; Simmons, A.M.; Massey, L.; Coffey, J.; Wechter, W.P.; Jarret, R.L.; Tadmor, Y.; Nimmakayala, P.; Reddy, U.K. Genetic diversity in the desert watermelon Citrullus colocynthis and its relationship with Citrullus species as determined by high-frequency oligonucleotides-targeting active gene markers. J. Am. Soc. Hortic. Sci. 2017, 142, 47–56. [Google Scholar] [CrossRef]
  193. Wang, Z.; Hu, H.; Goertzen, L.R.; McElroy, J.S.; Dane, F. Analysis of the Citrullus colocynthis transcriptome during water deficit stress. PLoS ONE 2014, 9, e104657. [Google Scholar] [CrossRef] [PubMed]
  194. Guo, S.; Zhao, S.; Sun, H.; Wang, X.; Wu, S.; Lin, T.; Ren, Y.; Gao, L.; Deng, Y.; Zhang, J.; et al. Resequencing of 414 cultivated and wild watermelon accessions identifies selection for fruit quality traits. Nat. Genet. 2019, 51, 1616–1623. [Google Scholar] [CrossRef]
  195. Gao, P.; Xu, W.; Yan, T.; Zhang, C.; Lv, X.; He, Y. Application of near-infrared hyperspectral imaging with machine learning methods to identify geographical origins of dry narrow-leaved oleaster (Elaeagnus angustifolia) fruits. Foods 2019, 8, 620. [Google Scholar] [CrossRef] [Green Version]
  196. Liu, Z.; Zhu, J.; Yang, X.; Wu, H.; Wei, Q.; Wei, H.; Zhang, H. Growth performance, organ-level ionic relations and organic osmoregulation of Elaeagnus angustifolia in response to salt stress. PLoS ONE 2018, 13, e0191552. [Google Scholar] [CrossRef] [Green Version]
  197. Lin, J.; Li, J.P.; Yuan, F.; Yang, Z.; Wang, B.S.; Chen, M. Transcriptome profiling of genes involved in photosynthesis in Elaeagnus angustifolia L. under salt stress. Photosynthetica 2018, 56, 998–1009. [Google Scholar] [CrossRef]
  198. Liu, X.; Chen, C.; Liu, Y.; Liu, Y.; Zhao, Y.; Chen, M. The presence of moderate salt can increase tolerance of Elaeagnus angustifolia seedlings to waterlogging stress. Plant Signal. Behav. 2020, 15, 1743518. [Google Scholar] [CrossRef]
  199. Yemataw, Z.; Muzemil, S.; Ambachew, D.; Tripathi, L.; Tesfaye, K.; Chala, A.; Farbos, A.; O’Neill, P.; Moore, K.; Grant, M.; et al. Genome sequence data from 17 accessions of Ensete ventricosum, a staple food crop for millions in Ethiopia. Data Brief 2018, 18, 285–293. [Google Scholar] [CrossRef]
  200. Harrison, J.; Moore, K.A.; Paszkiewicz, K.; Jones, T.; Grant, M.R.; Ambacheew, D.; Muzemil, S.; Studholme, D.J. A draft genome sequence for Ensete ventricosum, the drought-tolerant “tree against hunger”. Agronomy 2014, 4, 13–33. [Google Scholar] [CrossRef] [Green Version]
  201. Biswas, M.K.; Darbar, J.N.; Borrell, J.S.; Bagchi, M.; Biswas, D.; Nuraga, G.W.; Demissew, S.; Wilkin, P.; Schwarzacher, T.; Heslop-Harrison, J.S. The landscape of microsatellites in the enset (Ensete ventricosum) genome and web-based marker resource development. Sci. Rep. 2020, 10, 15312. [Google Scholar] [CrossRef]
  202. Price, E.J.; Drapal, M.; Perez-Fons, L.; Amah, D.; Bhattacharjee, R.; Heider, B.; Rouard, M.; Swennen, R.; Becerra Lopez-Lavalle, L.A.; Fraser, P.D. Metabolite database for root, tuber, and banana crops to facilitate modern breeding in understudied crops. Plant J. 2020, 101, 1258–1268. [Google Scholar] [CrossRef]
  203. Ferreira, L.T.; Venancio, V.P.; Kawano, T.; Abrão, L.C.C.; Tavella, T.A.; Almeida, L.D.; Pires, G.S.; Bilsland, E.; Sunnerhagen, P.; Azevedo, L.; et al. Chemical genomic profiling unveils the in vitro and in vivo antiplasmodial mechanism of açaí (Euterpe oleracea Mart.) polyphenols. ACS Omega 2019, 4, 15628–15635. [Google Scholar] [CrossRef] [Green Version]
  204. Oliveira, L.C.; de Oliveira, M.d.S.P.; Davide, L.C.; Torres, G.A. Karyotype and genome size in Euterpe Mart. (Arecaceae) species. Comp Cytogenet. 2016, 10, 17–25. [Google Scholar]
  205. Kron, K.A.; Powell, E.A.; Luteyn, J.L. Phylogenetic relationships within the blueberry tribe (Vaccinieae, Ericaceae) based on sequence data from MATK and nuclear ribosomal ITS regions, with comments on the placement of Satyria. Am. J. Bot. 2002, 89, 327–336. [Google Scholar] [CrossRef]
  206. Llivisaca, S.; Manzano, P.; Ruales, J.; Flores, J.; Mendoza, J.; Peralta, E.; Cevallos-Cevallos, J.M. Chemical, antimicrobial, and molecular characterization of mortiño (Vaccinium floribundum Kunth) fruits and leaves. Food Sci. Nutr. 2018, 6, 934–942. [Google Scholar] [CrossRef]
  207. Ligarreto, G.A.; Patiño Mdel, P.; Magnitskiy, S.V. Phenotypic plasticity of Vaccinium meridionale (Ericaceae) in wild populations of mountain forests in Colombia. Rev. Biol. Trop. 2011, 59, 569–583. [Google Scholar]
  208. Arango-Varela, S.S.; Luzardo-Ocampo, I.; Reyes-Dieck, C.; Yahia, E.M.; Maldonado-Celis, M.E. Antiproliferative potential of Andean Berry (Vaccinium meridionale Swartz) juice in combination with Aspirin in human SW480 colon adenocarcinoma cells. J. Food Biochem. 2021, 45, e13760. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Scheme showing three pangenome assembly methods. In the de novo method, sequence reads from genomes A, B and C are assembled into three separate genomes, which are then compared to define the core and variable regions. In the iterative assembly, genome A is assembled de novo and used as a reference for assembling the remaining genomes B and C. Because genome A has different genes from genomes B and C, using it as the reference may change the gene order in genome B (highlighted in the blue box) or collapse copy number variation (CNV) in genome C (highlighted in the blue box). In the iterative assembly, genes not represented in the reference genome (genome A) have to be assembled de novo and may lose their location information, as shown by the green gene below the genome B assembly. The graph pangenome assembly of genomes A, B and C represents the genes as interconnected nodes, with each path representing a genome.
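The graph representation and the core/variable distinction described in the Figure 1 caption can be illustrated with a minimal Python sketch. It is a toy illustration, not the method of any tool cited in this review. The gene names (g1 to g5), the gene orders for genomes A, B and C, and the helper functions `build_gene_graph` and `classify_genes` are all hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical gene orders for three genomes (illustrative names only).
# Genome B carries a gene absent from A and C; genome C has a tandem copy of g2.
genomes = {
    "A": ["g1", "g2", "g3", "g4"],
    "B": ["g1", "g3", "g5", "g4"],
    "C": ["g1", "g2", "g2", "g3", "g4"],
}

def build_gene_graph(paths):
    """Map each directed edge (gene_from, gene_to) to the genomes whose path uses it."""
    edges = defaultdict(set)
    for name, order in paths.items():
        # Consecutive genes along a genome become one edge of that genome's path.
        for left, right in zip(order, order[1:]):
            edges[(left, right)].add(name)
    return dict(edges)

def classify_genes(paths):
    """Split genes into core (present in every genome) and variable (all others)."""
    gene_sets = [set(order) for order in paths.values()]
    core = set.intersection(*gene_sets)
    variable = set.union(*gene_sets) - core
    return core, variable

graph = build_gene_graph(genomes)
core, variable = classify_genes(genomes)

print("core genes:", sorted(core))
print("variable genes:", sorted(variable))
for edge, members in sorted(graph.items()):
    print(edge, "used by", sorted(members))
```

In this toy graph every genome remains recoverable as a path through shared nodes, and the tandem copy of g2 in genome C appears as a self-loop edge rather than being collapsed. Real pangenome graphs operate on sequence rather than gene labels, so this sketch only conveys the idea.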
Figure 2. Predicted benefits to plant breeding from future developments in pangenomics. Improvements in pangenome assembly and annotation, combined with machine learning (ML) technology, will increase the accuracy of analyses of gene presence/absence variation (PAV) and structural variation (SV) across different individuals of a crop species. These analyses will be available to plant breeders through new tools and browsers, allowing easier selection of traits and genetic diversity in crop plants.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Tay Fernandez, C.G.; Nestor, B.J.; Danilevicz, M.F.; Gill, M.; Petereit, J.; Bayer, P.E.; Finnegan, P.M.; Batley, J.; Edwards, D. Pangenomes as a Resource to Accelerate Breeding of Under-Utilised Crop Species. Int. J. Mol. Sci. 2022, 23, 2671. https://doi.org/10.3390/ijms23052671

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
