Dataset on rbcL-based intra-specific diversity and population structure of Parkia biglobosa (Jacq.) in Nigeria

African locust bean (Parkia biglobosa) is a multipurpose leguminous tree species of nutritional and pharmacological value. The plant is widely distributed in Africa and across Nigeria's major agroecological areas (AEAs). Amid declining cultivation and production, P. biglobosa is genetically threatened in its natural habitats by overexploitation, deforestation, wildfires and a lack of improved tree management practices. Consequently, concerted research efforts directed towards germplasm collection and assessment of genetic relationships are imperative for conserving its genetic resources, managing them sustainably and selecting promising landraces for breeding programmes. The dataset presents the rbcL-based intraspecific genetic diversity and population structure of 62 P. biglobosa landraces in Nigeria. A relatively high level of diversity and a low degree of nucleotide variability were observed among the landraces. Relatively high values were recorded for the sequence dataset: 642 total allele sites, 601 polymorphic sites, 504 parsimony-informative sites, 883 mutations in total, 9 haplotypes and a gene diversity of 0.55. Low values were also recorded for nucleotide diversity (0.35) and InDel events (5). The gene flow in this dataset indicated an extensive exchange of genes between the three populations of P. biglobosa, which influenced the level of genetic differentiation (Gst) between the populations. A significantly low Gst (−0.01) was recorded between the Guinea and Sudan savannah populations, a moderate value (0.03) between the Sudan savannah and Rainforest populations, and a higher value (0.05) between the Guinea and Rainforest populations. The dataset highlights potential evolutionary dynamics that might influence variations relevant to the breeding and conservation of P. biglobosa in Nigeria and across its range in West and Central Africa.

© 2024 The Author(s).

Nucleotide sequence statistics were computed using MEGAX and CodonW. Genetic diversity, population structure, gene flow and genetic differentiation among the different populations of P. biglobosa were analysed using DnaSP v6.12.03. The total number of sites, invariable sites, parsimony-informative sites, the total number of mutations (Eta), the number of haplotypes, gene diversity, the variance of haplotypes, nucleotide diversity, the total number of insertions and deletions (InDels) and the total number of InDel events were evaluated. Haplotype-based statistics, the average proportion of nucleotide differences between populations, the genetic differentiation index based on the frequency of haplotypes, the average number of nucleotide substitutions per site between populations, and the net nucleotide substitutions per site between populations were also estimated. Data were also recorded for codon usage and its indices per sequence accession of P. biglobosa.

Data source location
The collection areas and distribution map of Parkia biglobosa are summarised in Table 1 and Fig. 1.

Value of the Data
• Based on sequence data, the species range and the three major endemic agro-ecological areas are recognised for Nigeria and, by extension, the West-Central Africa region. These are the Guinea savannah, the Sudan savannah and the Rainforest.
• The areas of greater gene diversity for the species across the agro-ecological regions in Nigeria are highlighted; gene diversity increases in the order Guinea savannah (0.43), Sudan savannah (0.62) and Rainforest (0.73).
• The Sudan savannah population shows a greater degree of genetic variation, probably due to anthropogenic and climate change pressures, and displayed a higher number of mutations, a higher total number of segregating sites, a higher haplotype number and a higher average number of nucleotide differences between sequences.
• The genetic information provided by the dataset supports genome-based species recovery, conservation and genetic improvement strategies such as genome-wide association studies (GWAS), haplotype-assisted genomic selection and haplotype-based breeding to augment the natural hybridisation of the species.
• The sequence data on P. biglobosa are integral to re-cultivation, genetic characterisation, conservation and species improvement by breeders and scientists.

Objectives
The objectives of the dataset are to determine the intraspecific genetic diversity, population structure and gene flow among three agro-ecological populations of African locust bean (Parkia biglobosa) in Nigeria.

Data Description
The data present the intraspecific genetic diversity and population structure of Parkia biglobosa across various agro-ecological areas in Nigeria and West-Central Africa. The sixty-two landraces belong to three populations representing the major agro-ecological areas (AEAs): Rainforest, Guinea savannah and Sudan savannah. PCR amplification and Sanger sequencing of the ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit (rbcL) gene were performed on the samples. Table 1 describes the collection areas and AEAs of the P. biglobosa landraces; Fig. 1 shows the distribution of the landraces across the AEAs in Nigeria. Table 2 describes the nucleotide sequence statistics: sequence length (bp), the weight of single and double DNA strands, the frequencies of nucleotide bases (A, T, G, C; C + G and A + T) and the total number of codons per sequence accession. Table 3 presents the genetic diversity parameters: the total number of sites, invariable sites, parsimony-informative sites, the total number of mutations (Eta), the number of haplotypes, gene diversity, the variance of haplotypes, nucleotide diversity, the total number of insertions and deletions (InDels) and the total number of InDel events. Table 4 records the multidomain analysis and population structure of the three populations of the sixty-two landraces, whereas Table 5 presents the gene flow and genetic differentiation among the three P. biglobosa populations. Table 6 shows the codon usage and amino acid residues of the P. biglobosa sequences analysed for the 62 accessions.
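As an illustration of how per-accession base frequencies of the kind summarised in Table 2 can be derived, the following is a minimal Python sketch. The function name `base_composition` and the toy sequence are hypothetical and are not taken from the dataset; the published values were obtained with MEGAX, whose treatment of ambiguous bases and gaps may differ from the simple rule assumed here (only A, C, G and T are counted).

```python
from collections import Counter


def base_composition(seq):
    """Return frequencies of A, T, G, C plus C+G and A+T for one sequence.

    Assumption: characters other than A, C, G, T (gaps, N, IUPAC codes)
    are ignored rather than apportioned.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    freqs = {base: counts[base] / total for base in "ATGC"}
    freqs["C+G"] = freqs["C"] + freqs["G"]
    freqs["A+T"] = freqs["A"] + freqs["T"]
    return freqs


# Toy sequence (hypothetical, not an rbcL accession from the dataset).
toy_freqs = base_composition("ATGCGC")
print(toy_freqs)
```

The same function can be applied to each accession in turn to build a table with one row per sequence.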

Sample collection
Leaf samples of sixty-two (62) P. biglobosa landraces were collected across three agroecological areas (AEAs: Rainforest, Guinea savannah and Sudan savannah) in Nigeria, following the procedures described by Omonhinmin et al. [3]. The samples were silica gel-dried, cleaned, assigned accession numbers and stored in a −80 °C cooling facility at the Molecular Biology Research Laboratory, Department of Biological Sciences, Covenant University, Ota, Nigeria.

Genomic DNA extraction
Genomic DNA was extracted using a modified CTAB-based method [4], and its quality and quantity were verified using the Thermo Fisher® NanoDrop spectrophotometer ND-8000-GL.

Sequencing and data analysis
The PCR amplicons were sequenced at Inqaba Biotechnical Industries (Pty) Ltd, South Africa. Sequences were cleaned and aligned using default settings in BioEdit Sequence Alignment Editor [8,9]. Nucleotide sequence statistics of the 62 P. biglobosa accessions were computed using MEGAX following the procedure of Tajima et al. [10]. DnaSP v6.12.03 was employed to determine genetic diversity, multidomain analysis and population structure parameters. The number of sequences per population, the total number of polymorphic sites, the total number of mutations, the average number of nucleotide differences between sequences, the mutation rate per population/sequence, nucleotide diversity, haplotype number and diversity, haplotype diversity variance, gene flow and genetic differentiation among the different populations were estimated. Amino acid residues and codon usage parameters were assessed using CodonW [10-12].
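Two of the parameters listed above, haplotype (gene) diversity and nucleotide diversity, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the DnaSP implementation: it uses Nei's sample-size-corrected gene diversity, n/(n − 1) × (1 − Σp²), and the mean number of pairwise differences per site, and it assumes that all sequences are aligned, of equal length and free of gaps. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's gene diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq = sum((count / n) ** 2 for count in Counter(seqs).values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise nucleotide differences per site (pi).

    Assumption: sequences are aligned, equal in length and contain no gaps.
    """
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more aligned sequences of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


# Toy alignment (hypothetical): four sequences, three distinct haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACCTACGA"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(round(hd, 4), round(pi, 4))
```

Grouping the accessions by agro-ecological area before calling these functions would give per-population values of the kind reported for the three populations.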

Fig. 1. Distribution pattern and agro-ecological categories of the collection sites for Parkia biglobosa landraces (the pattern shows that P. biglobosa is distributed outside the very wet areas).

Table 1
Collection location, sample ID, agro-ecological area and geodetic coordinates for the Parkia biglobosa landraces studied.

§ All samples were identified by the botanists C. A. Omonhinmin and J. O. Popoola and assigned sample identity numbers (Original Sample ID).

Table 7 records the codon usage indices per sequence accession of the 62 P. biglobosa landraces.

Table 5
Gene flow and genetic differentiation among the 3 populations of the 62 Parkia biglobosa accessions studied. Ks = statistics based on nucleotide sequences; Kxy = average proportion of nucleotide difference between populations; Gst = genetic differentiation index based on the frequency of haplotypes; GammaSt = genetic differentiation coefficient; Fst = genetic differentiation; Dxy = average number of nucleotide substitutions per site between populations; Da = net nucleotide substitutions per site between populations.
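To make the haplotype-frequency-based index in this caption concrete, the sketch below computes Gst in its basic form, (Ht − Hs)/Ht, where Hs is the mean within-population gene diversity and Ht is the gene diversity of the averaged haplotype frequencies. It is an illustration only: equal population weights and no sample-size correction are assumed, which is not necessarily the estimator applied by DnaSP, and the function names and haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_freqs(haplotypes):
    """Relative frequency of each haplotype label in one population."""
    n = len(haplotypes)
    return {hap: count / n for hap, count in Counter(haplotypes).items()}


def gst(populations):
    """Basic Gst = (Ht - Hs) / Ht with equal population weights.

    Assumption: no sample-size correction is applied.
    """
    freqs = [haplotype_freqs(pop) for pop in populations]
    k = len(freqs)
    # Hs: mean of the within-population gene diversities.
    hs = sum(1.0 - sum(f * f for f in fr.values()) for fr in freqs) / k
    # Ht: gene diversity computed from the averaged haplotype frequencies.
    all_haps = set().union(*freqs)
    ht = 1.0 - sum(
        (sum(fr.get(hap, 0.0) for fr in freqs) / k) ** 2 for hap in all_haps
    )
    return (ht - hs) / ht if ht > 0 else 0.0


# Toy populations (hypothetical haplotype labels).
same = gst([["H1", "H2"], ["H1", "H2"]])      # identical frequencies
distinct = gst([["H1", "H1"], ["H2", "H2"]])  # no shared haplotypes
print(same, distinct)
```

In this uncorrected form the index cannot fall below zero. Slightly negative estimates can arise when sample-size corrections are applied and are commonly read as showing no detectable differentiation.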
Freq = frequency. All frequencies are averages over all taxa. RSCU = relative synonymous codon usage.
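For readers unfamiliar with RSCU, the index for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates the calculation in Python. It is not the CodonW implementation: it assumes the standard genetic code, an in-frame coding sequence and the toy input shown, and for simplicity it treats the stop codons as one synonymous family. All names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def rscu(cds):
    """Relative synonymous codon usage for an in-frame coding sequence.

    RSCU = observed codon count / (amino-acid total / number of synonyms).
    Codons containing characters other than A, C, G, T are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(
        cds[i:i + 3] for i in range(0, usable, 3) if cds[i:i + 3] in CODON_TABLE
    )
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Toy sequence (hypothetical): four alanine codons, three GCT and one GCC.
toy_rscu = rscu("GCTGCTGCTGCC")
print(toy_rscu)
```

A value above 1 indicates a codon used more often than expected under equal usage, and a value below 1 a codon used less often.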