Genotypic selection and trait variation in sweet orange (Citrus sinensis L. Osbeck) dataset of Bangladesh

The dataset focuses on selecting sweet orange genotypes based on their phenotypic performance. It revealed significant variation in the best linear unbiased predictions (BLUPs) of 20 out of 21 traits covering leaves, flowers, fruits, and seeds. Strong positive correlations (r = 0.73 to 0.95) were observed among the majority of morphological traits. The sweet orange genotypes showed considerable genetic variance, exceeding 65% for almost all traits, with a selection accuracy above 92%. Using the multi-trait genotype-ideotype distance index (MGIDI), CS Jain-001 emerged as the top-ranked genotype, followed by BAU Malta-3 and CS Jain-002 in order of desirability. The broad-sense heritability of the selected traits was at least 75.60%, and the selection gain reached a maximum of 12.60. These genotypes show promise as parent donors in breeding programs, where their strengths and weaknesses can be used to develop promising varieties in Bangladesh.



Value of the Data
• The dataset illustrates the connections among morphological traits, aiding in the identification of genotypes exhibiting substantial genetic variation. This offers breeders a broad genetic reservoir, allowing unique and resilient characteristics to be incorporated into upcoming sweet orange varieties, and contributes to the sustainable development of sweet orange cultivation in the long run.
• The findings of the dataset equip farmers with the information necessary for making well-informed decisions regarding the cultivation of sweet orange genotypes. By pinpointing top-performing varieties through the MGIDI index, farmers can anticipate reliable and consistent performance. The focus on traits with high selection accuracy (exceeding 92%) assists farmers in fine-tuning their cultivation methods to achieve improved yields and economic gains. This knowledge empowers farmers to choose varieties that match their specific requirements, promoting optimal resource utilization and enhancing overall farm profitability.
• The assessment of strengths and weaknesses in the chosen genotypes is pivotal for advancing sustainable agricultural methods. By capitalizing on the strengths of specific genotypes, breeders can create varieties that cater to farmers' preferences, in harmony with the increasing global emphasis on sustainable agriculture.

Background
Sweet orange genotypes exhibit considerable morphological variation, including the size and shape of the canopy; the color, size, type, and ripening season of the fruit; and the number of seeds per fruit [1]. Diversity in agricultural traits [2][3][4] offers an exceptional foundation for crop improvement. Skilled plant breeders often consider specific combinations of phenotypic characteristics from a diverse collection to develop new genotypes that exhibit exceptional performance. This concept, known as the ideotype, was introduced by Donald [5] in wheat breeding. Ideotype design aims to enhance crop performance by selecting genotypes that exhibit multiple desirable traits simultaneously. The multi-trait genotype-ideotype distance index (MGIDI), a recently developed method, aids genotype selection based on breeding values by incorporating information from several traits [6,7]. The dataset contains four tables and three figures.

Fig. 1. Genotypic variance (%), residual variance (%) and selection accuracy for 20 quantitative traits of 8 sweet orange genotypes obtained from the restricted maximum likelihood (REML) test. PH plant height, BG base girth, LLL length of leaf lamina, WLL width of leaf lamina, LP length of pedicel, LFB length of flower bud, LPT length of petal, WP width of petal, ST number of stamens, LA length of anther, TM thickness of mesocarp, WEE width of epicarp at equatorial area, SEG number of segments per fruit, DFA diameter of fruit axis, SL seed length, SW seed width, SWT single seed weight, BSN bold seed number per fruit, FD fruit diameter and FL fruit length.

Data Description
In the restricted maximum likelihood (REML) model, the likelihood ratio test (LRT) indicated statistically significant differences at the 5% significance level among the best linear unbiased prediction (BLUP) values for all assessed traits except AT across the genotypes under investigation (Table 1). The genotypic variance components obtained from REML indicate that genetic effects accounted for more than 65% of the variance, relative to residual variance, for all traits except LFB (Fig. 1). Moreover, the selection accuracy exceeded 92% for all traits except LFB (Fig. 1).
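The link between the variance shares and the selection accuracy reported above can be sketched as follows. This is a minimal illustration, assuming that accuracy is the square root of the mean-basis heritability (a common convention in mixed-model selection; the source does not spell out its formula). The function name and the variance components are hypothetical and are not values from the dataset.

```python
import math

def variance_summary(var_g, var_e, n_rep):
    """Summarise REML variance components for one trait.

    Assumed definitions (not confirmed by the source):
      mean-basis heritability = var_g / (var_g + var_e / n_rep)
      selection accuracy      = sqrt(mean-basis heritability)
    """
    total = var_g + var_e
    pct_g = 100.0 * var_g / total           # genotypic variance share (%)
    pct_e = 100.0 * var_e / total           # residual variance share (%)
    h2_mean = var_g / (var_g + var_e / n_rep)
    accuracy = math.sqrt(h2_mean)
    return pct_g, pct_e, h2_mean, accuracy

# Hypothetical components chosen so that the genotypic share is 65%,
# with three replications as in the experiment.
pct_g, pct_e, h2_mean, acc = variance_summary(var_g=6.5, var_e=3.5, n_rep=3)
print(f"genotypic {pct_g:.1f}%, residual {pct_e:.1f}%, "
      f"h2 (mean basis) {h2_mean:.3f}, accuracy {acc:.3f}")
```

Under these assumed definitions and three replications, a genotypic share of 65% corresponds to an accuracy of about 0.92, which is in line with the two thresholds reported for the traits other than LFB.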
Significant (p < 0.05) correlation coefficients (r = 0.73 to 0.95) were identified among the morphological traits of sweet orange, as illustrated in Fig. 2. Noteworthy positive correlations were found between LP and TM, FL and TM, FD and TM, WEE and TM, SL and LP, FL and LP, SW and BSN, SWT and BSN, SL and BSN, SWT and SW, SL and SW, FL and SW, SL and SWT, FL and SWT, FL and SL, FD and SL, FL and BG, ST and PH, WP and ST, LLL and ST, FL and ST, FD and ST, WEE and ST, WLL and ST, LLL and WP, FL and WP, FD and WP, WEE and WP, WLL and WP, FL and LLL, FD and LLL, WEE and LLL, FD and FL, WEE and FL, and WEE and FD. Additionally, a significant negative correlation (r = -0.73) was observed between SEG and LFB, as well as between LA and LFB.
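The significance threshold for correlations computed on eight genotype means can be illustrated with a short sketch. It assumes the usual two-sided t-test for Pearson's r with n - 2 degrees of freedom; the critical value 2.447 (6 degrees of freedom, 5% level) is taken from standard t tables, and the trait values below are hypothetical, not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r = 0 with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def min_significant_r(t_crit, n):
    """Smallest |r| that reaches the given critical t value."""
    return t_crit / math.sqrt(t_crit ** 2 + (n - 2))

T_CRIT_DF6 = 2.447   # two-sided 5% critical value, 6 df (from t tables)
N_GENOTYPES = 8      # number of genotype means entering each correlation

# Hypothetical genotype means for two traits (illustration only).
fruit_length = [6.1, 6.8, 5.9, 7.2, 6.5, 6.0, 7.0, 6.3]
fruit_diameter = [6.4, 7.1, 6.0, 7.6, 6.9, 6.2, 7.3, 6.5]

r = pearson_r(fruit_length, fruit_diameter)
r_min = min_significant_r(T_CRIT_DF6, N_GENOTYPES)
print(f"r = {r:.2f}, |r| needed for p < 0.05: {r_min:.3f}")
```

With eight genotypes, |r| must be roughly 0.71 or larger to be significant at the 5% level under this test, so the lower end of the reported range (0.73) sits just above that cut-off.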
Fig. 3a depicts the arrangement of the eight sweet orange genotypes according to the multi-trait genotype-ideotype distance index (MGIDI). The genotypes are organized in descending order of MGIDI values, with the highest value positioned at the center and the lowest at the outer circle. The red circle in Fig. 3a marks the cut-off for a selection intensity (SI) of 40%; the desired selection sense was an increase for all traits except LFB. Table 2 shows that this selection goal was attained. The red dots within the circle represent the sweet orange genotypes selected through the MGIDI index (Fig. 3a). The selection gains (SG) ranged from 0.07 to 12.6, and heritability (h2) ranged from 75.60% to 100% (Table 2). Notably, CS Jain-001 was identified as the most desirable genotype, followed by BAU Malta-3 and CS Jain-002, according to the MGIDI index (Fig. 3a).

Fig. 3b illustrates the strengths and weaknesses of these selected genotypes using four identified factors (FA) (Table 2) and their loadings (Table 4) derived from significant principal components (Table 3). The arrangement of factors for each genotype in Fig. 3b signifies their impact, with dotted lines indicating the average factor contribution. Higher factor contributions, plotted toward the center, indicate weaknesses, while lower contributions denote strengths (Fig. 3b). Within Fig. 3b, Factor 1 (FA1) incorporates the traits LLL, LP, WP, ST, TM, WEE, DFA, FD, and FL; Factor 2 (FA2) includes WLL, LFB, LPT, and SEG; Factor 3 (FA3) comprises BG, SL, SW, SWT, and BSN; and Factor 4 (FA4) encompasses PH and LA, as outlined in Table 2. The positions of the selected genotypes in Fig. 3b reflect their strengths and weaknesses in relation to these factors.
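The ranking and selection step described above can be sketched in a few lines. This is a simplified illustration, not the "metan" implementation: it assumes the index is the Euclidean distance between each genotype's factor scores and those of the ideotype, that a smaller distance is more desirable, and that the number of selected genotypes is the rounded product of the genotype count and the selection intensity. Genotype labels G1 to G8, the factor scores, and the ideotype scores are all hypothetical.

```python
import math

def mgidi(scores, ideotype):
    """Euclidean distance from each genotype's factor scores to the ideotype."""
    return {
        genotype: math.sqrt(sum((s - t) ** 2 for s, t in zip(values, ideotype)))
        for genotype, values in scores.items()
    }

def select(index, intensity_pct):
    """Return the genotypes closest to the ideotype at the given intensity."""
    n_selected = round(len(index) * intensity_pct / 100)
    ranked = sorted(index, key=index.get)   # ascending distance
    return ranked[:n_selected]

# Hypothetical scores on four factors (FA1..FA4) for eight genotypes.
scores = {
    "G1": [0.2, 0.1, 0.3, 0.1],
    "G2": [1.5, 0.8, 1.1, 0.9],
    "G3": [0.9, 1.2, 0.4, 0.7],
    "G4": [0.4, 0.3, 0.2, 0.5],
    "G5": [1.1, 0.6, 1.4, 0.2],
    "G6": [2.0, 1.0, 0.5, 1.3],
    "G7": [0.3, 0.6, 0.1, 0.2],
    "G8": [1.3, 1.5, 0.9, 1.0],
}
ideotype = [0.0, 0.0, 0.0, 0.0]   # hypothetical ideotype scores

index = mgidi(scores, ideotype)
selected = select(index, intensity_pct=40)
print(selected)
```

With eight genotypes and a 40% selection intensity, the rounded count is three, which matches the three genotypes reported as selected in the dataset.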

Description of Experimental Site
The investigation was conducted at the Citrus Research Station (CRS) in Jaintiapur, Sylhet (24.8949°N, 91.8687°E), from February 2020 to December 2021 (one growing season). The experimental site falls within AEZ-20 (Eastern Surma-Kusiyara flood plains), which encompasses a diverse range of geographical features, including alluvial fans, back swamps, flood plains, flat hills, solitary hills, piedmont plains, point bars, and ridges. The soil texture varies from clay loam to sandy loam, with certain areas predominantly composed of sand and a subsoil containing a higher proportion of clay [8]. Furthermore, in AEZ-20 the soil organic matter content is generally high, while the pH is neutral [9]. The sweet orange orchard lies in a subtropical climatic region characterized by high precipitation due to monsoon air arriving from the south-west.

Plant Materials
Eight sweet orange genotypes were used in this study: CS Jain-001, CS Jain-002, CS Jain-003, BAU Malta-3, BAU Malta-1, BARI Malta-1, CS Ram-001, and Variegated Malta. These genotypes were grown in the orchards of the experimental site in Sylhet, Bangladesh.

Design of Experiment and Data Collection
The experiment comprised three completely randomized replications, each consisting of ten trees per genotype. A total of 240 trees, aged between 7 and 8 years, were included in the characterization study. The spacing between trees and between rows was maintained at 4 meters and 3 meters, respectively. The sweet orange genotypes were grown on pummelo rootstock, and no insect or pest infestation was observed during the investigation. The morphological traits of sweet oranges were recorded following the guidelines of the International Plant Genetic Resources Institute (IPGRI) descriptor for citrus [10]. The traits were age of tree (AT), plant height (PH), base girth (BG), length of leaf lamina (LLL), width of leaf lamina (WLL), length of pedicel (LP), length of flower bud (LFB), length of petal (LPT), width of petal (WP), number of stamens (ST), length of anther (LA), thickness of mesocarp (TM), width of epicarp at equatorial area (WEE), number of segments per fruit (SEG), diameter of fruit axis (DFA), seed length (SL), seed width (SW), single seed weight (SWT), bold seed number per fruit (BSN), fruit diameter (FD) and fruit length (FL). The measurements were taken from three trees in each replication, and the results were averaged.

Table 5
The equations for calculating the genetic parameters and breeding values using MGIDI to select superior sweet orange genotypes.

Selection indices | Equation | No. | Reference
Multi-trait genotype-ideotype distance index | … | i | …
Ideotype design and rescaling of traits | rX_ij = … | ii | …
…
Mixed-effect model | y_ij = m + g_i + r_j + e_ij | ix | [13]

Statistical Analysis
We conducted Pearson's correlation analysis on genotypic mean performances, examining 21 morphological traits. The indices for calculating the genetic parameters and breeding values (Table 5) with the multi-trait genotype-ideotype distance index (MGIDI), used to rank the sweet orange genotypes, were analysed with the "metan" package of R software [11]. The MGIDI index was calculated to identify the top-ranked genotype (Eq. i). Prior to selection, ideotype design and trait rescaling were undertaken (Eq. ii). A linear mixed model (Eq. iii) was employed to assess the significance of each trait across genotypes. The observed mean (Xo) and the predicted mean (Xs) obtained from best linear unbiased prediction (BLUP) values of the genotypes were calculated to determine the selection gain (Eq. iv). Variance components obtained from the MGIDI analysis were used to compute broad-sense heritability (h2) based on the mean performance of genotypes (Eq. v). Factor analysis (Eq. vi) and factor loadings (Eq. vii) with the varimax rotation criterion [12] were computed with the aim of increasing all traits except LFB, at a 40% selection intensity. Additionally, the strengths and weaknesses of the selected genotypes (Eq. viii) were identified using factor loadings. The genetic parameters, including genotypic variance, residual variance, and selection accuracy, were calculated using the mixed-effect model (Eq. ix) within a randomized complete block design. In this model, replicates were treated as fixed, while genotypes were considered random.
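For orientation, several of the quantities named in this section can be written out as follows. Only the mixed-effect model (Eq. ix) is taken from the source; the other three expressions are the forms commonly given in the MGIDI literature and are shown here as assumptions, since the corresponding entries of Table 5 are not available in this text. The symbols gamma (factor scores), f (number of factors), sigma (variance components) and n_r (number of replications) are introduced for illustration only.

```latex
\begin{align}
% Assumed form: distance of genotype i from the ideotype over f factors
\mathrm{MGIDI}_i &= \left[ \sum_{j=1}^{f} \left( \gamma_{ij} - \gamma_{j} \right)^{2} \right]^{1/2} \\
% Assumed form: selection gain from observed mean X_o and predicted mean X_s
\mathrm{SG} &= \left( X_s - X_o \right) h^{2} \\
% Assumed form: broad-sense heritability on a genotype-mean basis
h^{2} &= \frac{\sigma_g^{2}}{\sigma_g^{2} + \sigma_e^{2} / n_r} \\
% From the source (Eq. ix): mean m, genotype effect g_i, replicate effect r_j, error e_ij
y_{ij} &= m + g_i + r_j + e_{ij}
\end{align}
```

Here gamma_ij is the score of genotype i on factor j and gamma_j is the corresponding score of the ideotype, so a smaller index value means a genotype lies closer to the ideotype.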

Limitations
This dataset, although instrumental for the selection of sweet orange genotypes based on phenotypic characteristics, is constrained by the absence of molecular or genomic information associated with these genotypes. This limitation impedes a comprehensive elucidation of the molecular mechanisms underpinning the observed phenotypic traits. The integration of genomic data has the potential to augment the dataset's analytical depth, facilitating a more exhaustive investigation into the genetic foundations of these traits and enhancing the precision of genotype selection methodologies within breeding programs.

Fig. 2. Coefficient of correlation matrix (Pearson's correlation) of 21 traits of eight sweet orange genotypes. AT age of tree, PH plant height, BG base girth, LLL length of leaf lamina, WLL width of leaf lamina, LP length of pedicel, LFB length of flower bud, LPT length of petal, WP width of petal, ST number of stamens, LA length of anther, TM thickness of mesocarp, WEE width of epicarp at equatorial area, SEG number of segments per fruit, DFA diameter of fruit axis, SL seed length, SW seed width, SWT single seed weight, BSN bold seed number per fruit, FD fruit diameter and FL fruit length.

Fig. 3. Selection of sweet orange genotypes through MGIDI index (a) and the strengths and weaknesses of selected genotypes (b).

Table 1
BLUP values for 20 quantitative traits of eight sweet orange genotypes obtained using REML. Significant at the 5% probability level in the likelihood ratio test (LRT).

Table 2
Factor contribution (FA), broad-sense heritability (h2) and selection gain (SG) obtained using the MGIDI selection index for 20 quantitative traits of three selected genotypes of sweet orange. Xo = observed mean, Xs = predicted mean obtained from BLUP values, PH plant height, BG base girth, LLL length of leaf lamina, WLL width of leaf lamina, LP length of pedicel, LFB length of flower bud, LPT length of petal, WP width of petal, ST number of stamens, LA length of anther, TM thickness of mesocarp, WEE width of epicarp at equatorial area, SEG number of segments per fruit, DFA diameter of fruit axis, SL seed length, SW seed width, SWT single seed weight, BSN bold seed number per fruit, FD fruit diameter and FL fruit length.

Table 4
Factor loadings (FA), communality and uniquenesses of 20 quantitative traits of sweet orange genotypes. PH plant height, BG base girth, LLL length of leaf lamina, WLL width of leaf lamina, LP length of pedicel, LFB length of flower bud, LPT length of petal, WP width of petal, ST number of stamens, LA length of anther, TM thickness of mesocarp, WEE width of epicarp at equatorial area, SEG number of segments per fruit, DFA diameter of fruit axis, SL seed length, SW seed width, SWT single seed weight, BSN bold seed number per fruit, FD fruit diameter and FL fruit length.