Towards the Development, Maintenance and Standardized Phenotypic Characterization of Single‐Seed‐Descent Genetic Resources for Chickpea

Here we present the approach used to develop the INCREASE “Intelligent Chickpea” Collections, from analysis of the information on the life history and population structure of chickpea germplasm, the availability of genomic and genetic resources, the identification of key phenotypic traits and methodologies to characterize chickpea. We present two phenotypic protocols within H2020 Project INCREASE to characterize, develop, and maintain chickpea single‐seed‐descent (SSD) line collections. Such protocols and related genetic resource data from the project will be available for the legume community to apply the standardized approaches to develop Chickpea Intelligent Collections further or for multiplication/seed‐increase purposes. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.


INTRODUCTION
In recent years, there has been clear recognition of the role of food legumes in the transition to a plant-based diet, along with the importance of their wider use in the fight against climate change and to promote food security and human health (Bellucci et al., 2021). A crucial step to increase the cultivation and consumption of food legumes is the analysis of their genetic resources. Indeed, characterization, maintenance, and use of food legume genetic resources are key steps in breeding new varieties with greater adaptation and improved quality traits and nutritional value. Among food legumes, chickpea (Cicer arietinum) is the second most important food legume (after the common bean) for direct human consumption worldwide (FAOStat 2019, see Internet Resources). Advances in genomics and metabolomics offer new opportunities to characterize the heritable diversity of genetic resources and identify the genetic basis of phenotypic traits for the development of genomic prediction tools and a deeper understanding of Genotype × Environment interactions. Under this scenario, the development of well-described collections based on pure single-seed-descent (SSD) lines where the phenotypic characteristics are linked to a specific genotype offers the potential to manage the genetic resources efficiently and create unique information for access by users. Here, we first outline the main aspects of our knowledge of chickpea's evolutionary history and population structure, along with a description of the germplasm collections conserved worldwide and the available genomic resources. In the second part, we present the concept and protocols for developing "Intelligent Collections" (ICs) for chickpea and their phenotypic characterization during the multiplication/seed-increase steps, as has also been proposed for common bean (Cortinovis et al., 2021), lentil (Guerra-García et al., 2021), and lupin (Kroc et al., 2021).

ORIGIN, DOMESTICATION, DIFFUSION, AND EVOLUTION OUT OF THE CENTERS OF ORIGIN
The genus Cicer comprises 46 species with 10 annuals and 36 perennials, including the recently discovered Cicer turcicum Toker, Berger & Gokturk (Ladizinsky & Adler, 1976; Toker et al., 2021; van der Maesen et al., 2007).
Following the gene-pool concept introduced by Harlan and De Wet (1971), the different Cicer species can be grouped into (i) the primary gene pool that includes the domesticated C. arietinum and its wild progenitor C. reticulatum; (ii) the secondary gene pool that is represented by C. echinospermum P.H. Davis; and (iii) the tertiary gene pool that comprises wild Cicer species, namely six annuals (C. pinnatifidum Jaub. & Spach.; C. bijugum Rech. f.; C. judaicum Boiss; C. yamashitae Kitam.; C. chorassanicum Popow; C. cuneatum Hochst. ex A. Rich.) and 36 perennials (Ahmad, Slinkard, & Scoles, 1988; Smýkal et al., 2015). Based on similarities of internal transcribed spacers (ITS), Toker et al. (2021) reported that the recently proposed C. turcicum species is closely related to C. echinospermum and suggested that it is likely to be a member of the secondary gene pool; however, crossing experiments need to be carried out to test hybridization with other chickpea species (Toker et al., 2021).
The Fertile Crescent is considered the center of chickpea domestication (Harlan, 1992). Cultivated chickpea spread further into the Levant and ultimately throughout the Mediterranean basin (southern Europe, northern Africa). Cultivation of chickpea began within the past century in North America and Australia (Berger & Turner, 2007). Figure 1 shows the distribution by origin of the chickpea genetic resources conserved in gene banks (Genesys, see Internet Resources).
Many domesticated species from the Fertile Crescent are cultivated after autumn germination, which is also the pattern observed among their wild progenitors. The situation with chickpea is less clear. Breeding has adapted the modern crop to diverse agroclimatic zones, with chickpea grown as a spring and summer crop in Turkey, North Africa, and North America, as a winter crop in Australia and India, and as a post-monsoon fall crop in Ethiopia. Abbo, Berger, and Turner (2003) suggested that wild Cicer has a fall germination pattern and that the practice of spring cropping was adopted by farmers attempting to avoid the devastating effects of Ascochyta blight. However, von Wettberg et al. (2018) document examples where wild populations exhibit a pattern of springtime (April-May) vegetative growth, with maturation occurring in early summer (June) as temperatures rise and humidity declines. Berger, Buck, Henzell, and Turner (2005) compared wild and domesticated chickpea genotypes and detected a loss of vernalization responses and other chilling-tolerance traits (i.e., delayed pod setting at temperatures <16°C) in the cultivated germplasm, showing that selection for different phenological mechanisms across the cultivation range has important adaptive implications (Berger & Turner, 2007; Berger et al., 2011).
Two market classes of chickpea are recognized: desi and kabuli. Desi chickpea is characterized by purple flowers and small dark seeds, and is predominantly grown in the Indian sub-continent and Ethiopia. Kabuli chickpea is characterized by white flowers and larger, light-colored seeds with a thin seed coat, and is produced and consumed in the Mediterranean Basin, the Middle East, and North America. As with other legumes, anthocyanins and pro-anthocyanidins appear to underlie differences in flower and seed color, as well as seed coat thickness (Penmetsa et al., 2016). Genetic analyses demonstrate that the desi-vs-kabuli dichotomy arose multiple times independently, correlated with loss of function in a key transcriptional regulator (Penmetsa et al., 2016). Population structure analysis of diverse chickpea collections has shown that the genetic groups identified correlate largely with geographic distribution rather than with the desi and kabuli differentiation (De Giovanni et al., 2017; Penmetsa et al., 2016; Sokolkova et al., 2020; Varshney et al., 2021).

WORLDWIDE GERMPLASM COLLECTIONS
The genetic diversity of chickpea germplasm maintained in gene banks is vital for discovering useful genes for chickpea breeding programs. At the worldwide level, there are about 59 gene banks and more than 65,000 conserved Cicer accessions (Genesys, 2021, accessed June 10, 2021; see Internet Resources). For all germplasm collections, overrepresentation of specific collection sites and duplication of accessions due to exchange between gene banks are common (see the example from Berger, Abbo, & Turner, 2003 for wild Cicer), which calls for care when developing sampling schemes. The International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) in India holds the largest Cicer collection, with 20,764 accessions, followed by the International Center for Agricultural Research in the Dry Areas (ICARDA) in Lebanon, with 15,368 accessions; the Australian Grains Genebank, Department of Economic Development Jobs Transport and Resources, with 9771 accessions; and the Western Regional Plant Introduction Station (USDA-ARS), with 6732 accessions. Cicer accessions are also maintained by the N.I. Vavilov Institute of Plant Genetic Resources (VIR), one of the oldest gene banks in the world, with 2767 chickpea accessions. The VIR chickpea collection is unique in its preponderance of historic landraces, sampled prior to the advent of intensive modern breeding. Thus, the VIR collection is likely to capture genetic adaptations and functional diversity that reflect historic local and regional practices (Plekhanova et al., 2017), providing opportunities to study post-domestication and diversification processes prior to the Green Revolution (Sokolkova et al., 2020).

GENOMICS OF THE GENETIC RESOURCES
The development of high-throughput sequencing technologies has provided new opportunities for genomic characterization of genetic resources and the application of population genetics approaches and genome-wide analysis to identify the genetic control of phenotypic variance for specific traits. In the last decade, such methods have been widely applied to chickpea genetic resources, creating a reference set of genomic data useful for the scientific community and plant breeders (Roorkiwal et al., 2020).
Three reference genomes are available for chickpea: one for the kabuli type (CDC Frontier genotype; Varshney et al., 2013), one for the desi type (ICC 4958 genotype; Parween et al., 2015), and one for a wild C. reticulatum accession (PI489777; Gupta et al., 2017). The three genomes, together with de novo whole-genome sequencing data (aligned to CDC Frontier) from 3171 cultivated and 28 C. reticulatum accessions, were used to guide the assembly of the chickpea pan-genome. Draft genome sequences of the wild C. echinospermum S2Drd 065 accession and the wild C. reticulatum Besevler 079 accession are also in the NCBI BioProject database (accession numbers PRJNA418060 and PRJNA418059, respectively), as developed by the D. Cook laboratory (University of California, Davis).
Whole-genome sequencing data of different and wide diversity panels are already available. The work of Thudi et al. (2016) shows the whole-genome sequencing analysis of a set of 129 commercial chickpea varieties that were released between 1948 and 2012 in 14 countries. Resequencing data are also available for a panel of a diverse set of 429 lines that were collected in 45 countries (Varshney et al., 2019). Other whole-genome sequencing data for chickpea genetic resources are in the NCBI BioProject database, including 229 wild Cicer accessions (accession number, PRJNA416007).
From the Australian chickpea breeding program, a small set of 69 chickpea lines that differed in their resistance to Ascochyta blight (Li et al., 2017) and a set of 132 commercial varieties and advanced breeding lines were resequenced (Li et al., 2018).
Several studies that have used chickpea core collections are also available, based on genotyping by sequencing (GBS) data (Roorkiwal et al., 2018). Sokolkova et al. (2020) used GBS to genotype a set of 407 landraces sampled from the chickpea VIR collections, which included the domestication and secondary diversity centers. Von Wettberg et al. (2018) collected a wide set of wild chickpea materials (about 1000 accessions) and characterized these using GBS. Ecological principles were used to guide the collection of these materials across the full range of habitats in which wild chickpea occurs. This research work also increased the supply of wild Cicer accessions compared to the material that had been available over the previous decade (Berger et al., 2003). It provided related GBS data and deep characterization of the accessions (e.g., evolutionary history, ancestral adaptations, the impact of environment on genetic structure, trait values), which has proven useful to improve cultivated chickpea germplasm.
Segregating populations also represent important plant genetic resources, such as biparental populations like recombinant inbred lines (RILs) and multiparent populations like nested association mapping (NAM) and multi-parent advanced generation intercross (MAGIC). In chickpea, a NAM population of 284 lineages was developed from the crosses of a single early flowering cultivated lineage (ICCV 96029) and 20 wild accessions (Shin et al., 2019). The 284 lineages were composed of 255 F3 families derived from C. arietinum (ICCV 96029) × 17 C. reticulatum parents, and 29 F3 families derived from C. arietinum (ICCV 96029) × 3 C. echinospermum wild parents. The combined 284 lineages were genotyped using GBS and phenotyped for agronomic traits (Shin et al., 2019).
Two MAGIC populations are available for chickpea: one that was developed at ICRISAT that includes 1136 F8 RILs derived from eight well-adapted and drought-tolerant desi chickpea cultivars (Samineni et al., 2021), and one that was developed at ICARDA that includes 3053 RILs derived from 12 diverse parents (unpublished data).

of 16
Current Protocols

DEVELOPMENT, MAINTENANCE, AND CHARACTERIZATION OF CHICKPEA "INCREASE INTELLIGENT" COLLECTIONS
The characterization and maintenance of legume genetic resources are crucial for their exploitation in breeding programs to develop novel varieties adapted to different environments and that show interesting agronomic and quality traits. Traditional methods identify plant genetic resources based on passport data and maintain seed collections at the accession level. Generally, each accession is composed of a mixture of individuals (population) with unknown genotypic information. This approach does not allow phenotypic and genotypic information to be linked. Moreover, this type of conservation based on small effective population sizes when seeds are regenerated can be affected by random genetic drift, adding new selection pressures in the gene-bank fields (Mascher et al., 2019).
Recently, the strategy of developing and using SSD purified accessions has been applied in several projects (e.g., AGILE, BEAN_ADAPT, BRESOV, BRIDGE); indeed, SSD lines offer the possibility to associate phenotypes to a unique genotype, thus promoting genetic resources conservation and their use in pre-breeding programs. With the purpose of conservation, management, and making the best use of food-legume genetic resources, the European Union Horizon 2020 has funded the Project INCREASE-Intelligent Collections of Food Legumes Genetic Resources for European Agrofood Systems (Bellucci et al., 2021) (https://www.pulsesincrease.eu/).
The main aim of INCREASE is to enhance the genotypic and phenotypic characterization of genetic resources of four important food legumes linked to the European food tradition: chickpea, common bean, lentil, and lupin. The INCREASE approach involves the development of nested-core collections of different sizes based on genetically purified accessions (Bellucci et al., 2021). The collections are designed to represent the worldwide diversity of each legume crop. Their genetic and phenotypic characterization at different levels is also planned based on the collection size. Linking genotypic and phenotypic data will allow identification of the genes and/or genomic regions responsible for important adaptive and agronomic traits and implement genomic selection models to predict phenotypes based on genetic data (Bellucci et al., 2021).
Currently, for chickpea, two INCREASE collections have been established: the larger Reference core (R-CORE) collection that comprises a total of 3276 accessions (mainly C. arietinum, but also including wild Cicer species of its gene pools); and the Training CORE (T-CORE) collection, defined by maximizing the geographic distribution and the phenotypic diversity based on characterizations already available. The T-CORE, which represents a subset of the R-CORE, comprises 450 C. arietinum genotypes that are mostly (392 lines) derived from the European and Mediterranean Chickpea Association Panel (EMCAP) collection (480 lines) developed at the Polytechnic University of Marche (Ancona, Italy) (Rocchetti et al., 2020).
The procedures applied by the INCREASE project to develop, maintain, and characterize SSD lines of each legume species included in INCREASE are summarized in Figure 2. Thus, the development and characterization of SSD lines of chickpea Intelligent Collections will follow the same workflow (i.e., selfing cycles under controlled conditions, assignment of Digital Object Identifiers, DNA extraction, phenotypic characterization, transfer of information to the INCREASE database).
Specific protocols to characterize the seeds before and after the selfing cycles (including images of the seeds) and the plants during cultivation under controlled conditions have been established and described for common bean (Cortinovis et al., 2021), lentil (Guerra-García et al., 2021), and lupin (Kroc et al., 2021); here, we describe the specific protocols to be applied for characterization of chickpea SSD lines during the process of development and maintenance of seeds (i.e., the so-called Primary Seed Increase within INCREASE; see Cortinovis et al., 2021). These protocols for chickpea were designed based on highly heritable seed and plant traits that can be used for phenotypic characterization and to monitor the SSD seed-multiplication process (i.e., check for potential human errors during the different seed-increase cycles). The protocols are based on the IBPGR/ICRISAT/ICARDA 1993 Descriptors for chickpea (C. arietinum L.), the Chickpea Crop Ontology (https://www.cropontology.org/), and other specific traits newly defined in INCREASE. The source of each descriptor is indicated in brackets after its name.
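The monitoring role of highly heritable traits can be illustrated with a minimal sketch, assuming that categorical seed descriptors are scored once per seed-increase cycle and stored per SSD line. The descriptor names, line identifiers, and scores below are hypothetical and are not taken from the INCREASE database; a flagged line is only a prompt to recheck labels and seed packets, not evidence of an error by itself.

```python
# Hypothetical descriptor labels used only for this sketch.
CATEGORICAL_DESCRIPTORS = ("seed_shape", "seed_surface", "seed_color", "dotted_seed_coat")


def flag_descriptor_changes(previous_cycle, current_cycle,
                            descriptors=CATEGORICAL_DESCRIPTORS):
    """Return {line_id: [descriptor, ...]} for lines whose scores differ
    between two successive seed-increase cycles."""
    flagged = {}
    for line_id, current in current_cycle.items():
        previous = previous_cycle.get(line_id)
        if previous is None:
            continue  # no earlier record to compare against
        # Compare only descriptors that were scored in both cycles.
        changed = [d for d in descriptors
                   if d in previous and d in current and previous[d] != current[d]]
        if changed:
            flagged[line_id] = changed
    return flagged


# Hypothetical scores for two SSD lines in two successive cycles.
cycle_1 = {"SSD_0001": {"seed_shape": 1, "seed_color": 3},
           "SSD_0002": {"seed_shape": 2, "seed_color": 5}}
cycle_2 = {"SSD_0001": {"seed_shape": 1, "seed_color": 3},
           "SSD_0002": {"seed_shape": 1, "seed_color": 5}}

print(flag_descriptor_changes(cycle_1, cycle_2))  # {'SSD_0002': ['seed_shape']}
```

Such a comparison is meaningful between successive cycles of the same SSD line; the original heterogeneous accession may legitimately differ from the purified line derived from it.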

CHARACTERIZATION OF CHICKPEA SEEDS FOR SEED-TRAIT DESCRIPTORS
This protocol was developed to characterize seeds before the start of any selfing multiplication cycles carried out under controlled conditions. The protocol allows the characterization of the original seed phenotype from heterogeneous accessions and the seeds obtained in subsequent cycles. The data obtained will also be used as a validation tool for the accuracy of the entire process (i.e., identification of potential human/technical errors).

Ruler
Analytical balance
Spreadsheet for data collection

1. Seed weight (g) (Crop Ontology Code ID CO_338:0000026). Count and weigh 100 well-developed seeds harvested when the plants are completely dry. If 100 seeds are not available, count and weigh a sample of 10 seeds for each line (if more than 10 seeds are available, weigh three samples of 10 randomly selected seeds and average the values). Exclude immature, broken, or infected seeds.
2. Seed shape (Crop Ontology Code ID CO_338:0000019). Take at least five seeds from each accession (heterogeneous materials) or line (previously developed SSD lines), and through visual observation, evaluate and classify the seed shape according to the categories shown in Figure 3.
3. Seed surface (Crop Ontology Code ID CO_338:0000021). Take at least five seeds (the same used to evaluate seed shape) from each accession (heterogeneous materials) or line (previously developed SSD lines), and through visual observation, evaluate and classify the seed surface according to the categories shown in Figure 4.
4. Seed color (INCREASE descriptor). Take at least five seeds (the same used to evaluate seed shape and surface) from each accession (heterogeneous materials) or line (previously developed SSD lines), and through visual observation, evaluate and classify the seed color.
5. Dotted seed coat (Crop Ontology Code ID CO_338:0000022). Visually assess the presence or absence of dots on the seed coat (Fig. 6):
   1 = Absent
   2 = Present
6. Seed imaging. Take images of chickpea seeds before and after each multiplication cycle following the SEED imaging protocol established within the INCREASE project for all four legume species. The protocol is reported in Basic Protocol 2 and Cortinovis et al. (2021). An example of images obtained for chickpea SSD lines is given in Figure 7.
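The seed-weight rule in step 1 (one 100-seed sample, or up to three 10-seed samples that are averaged) can be sketched as a small helper for the data-collection spreadsheet. This is an illustration under stated assumptions, not part of the INCREASE protocol: the function name, the record fields, and the line identifier are hypothetical, and rescaling to a 100-seed basis is an added assumption so that lines measured on 10-seed samples remain comparable with lines measured on 100 seeds.

```python
from statistics import mean

SEED_WEIGHT_ID = "CO_338:0000026"  # Crop Ontology ID given in step 1


def seed_weight_record(line_id, sample_weights_g, seeds_per_sample):
    """Summarise the seed-weight measurements of one accession or SSD line.

    sample_weights_g: weights (g) of one 100-seed sample, or of one to
    three 10-seed samples, as described in step 1.
    """
    if seeds_per_sample not in (10, 100):
        raise ValueError("step 1 describes samples of 100 or 10 seeds")
    if not sample_weights_g:
        raise ValueError("at least one weighed sample is required")
    if any(w <= 0 for w in sample_weights_g):
        raise ValueError("sample weights must be positive")
    mean_sample_g = mean(sample_weights_g)
    return {
        "line_id": line_id,
        "descriptor": SEED_WEIGHT_ID,
        "seeds_per_sample": seeds_per_sample,
        "n_samples": len(sample_weights_g),
        "mean_sample_weight_g": round(mean_sample_g, 3),
        # Assumption: express the value on a 100-seed basis for comparability.
        "weight_per_100_seeds_g": round(mean_sample_g * 100 / seeds_per_sample, 2),
    }


# Hypothetical line with three 10-seed samples.
record = seed_weight_record("SSD_0001", [2.41, 2.38, 2.47], seeds_per_sample=10)
print(record["mean_sample_weight_g"], record["weight_per_100_seeds_g"])
```

Storing the number of samples and the seeds per sample alongside the value keeps a trace of which variant of step 1 was applied to each line.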

CHARACTERIZATION OF CHICKPEA LINES FOR PLANT-TRAIT DESCRIPTORS SPECIFIC FOR PRIMARY SEED INCREASE
This protocol was developed to characterize plant and seed traits for each line during the seed multiplication cycle (Primary Seed Increase). Plants must be grown in a greenhouse or growth chamber to ensure insect-free conditions and avoid cross-pollination.
Data are differentiated according to two levels of priority: Priority 1 (Mandatory Traits) and Priority 2 (Non-Mandatory Traits).
This protocol assumes that the materials (i.e., plants, seeds)

Conclusions
Here we have provided the phenotypic protocols to be applied to chickpea seeds and plants during the development and maintenance of SSD lines. The protocols were developed within the H2020 Project INCREASE, and they are designed to improve the characterization of legume genetic resources and promote their exploitation for breeding. The protocols ensure a valuable phenotypic evaluation of chickpea materials based on highly heritable seed and plant traits that can also be used as a validation system to monitor the maintenance and seed increase of SSD purified accessions. Such SSD lines are, indeed, crucial to identify marker-trait associations and to predict phenotypes based on genotypic data.
Within INCREASE, the data will be integrated into a centralized data system. Each SSD line will be linked to genotypic and phenotypic data produced during the project and will be freely accessible by users. Also, after the end of INCREASE, these materials can be used by research institutes and gene banks, and they can apply these protocols to increase the information present in the database.
