Phenotypic characterization and validation of provitamin A functional genes in early maturing provitamin A‐quality protein maize (Zea mays) inbred lines

Abstract The number of drought and low‐N tolerant hybrids with elevated levels of provitamin A (PVA) in sub‐Saharan Africa could increase when PVA genes are optimized and validated for developed drought and low‐N tolerant inbred lines. This study aimed to (a) determine the levels of drought and low‐N tolerance, and PVA concentrations in early maturing PVA‐quality protein maize (QPM) inbred lines, and (b) identify lines harbouring the crtRB1 and LcyE genes as sources of favourable alleles of PVA. Seventy early maturing PVA‐QPM inbreds were evaluated under drought, low‐N and optimal environments in Nigeria for two years. The inbreds were assayed for PVA levels and the presence of PVA genes using allele‐specific PCR markers. A moderate range of PVA contents was observed for the inbreds. Nonetheless, TZEIORQ 55 combined high PVA concentration with drought and low‐N tolerance. The crtRB1‐3′TE primer and the KASP SNP (snpZM0015) consistently identified nine inbreds, including TZEIORQ 55, harbouring the favourable alleles of the crtRB1 gene. These inbreds could serve as donor parents of the favourable crtRB1‐3′TE allele for PVA breeding in maize.

Several studies (Li, Tayie, Young, Rocheford, & White, 2007; Mugode et al., 2014; Muzhingi et al., 2011) were conducted on the retention of PVA carotenoids and their bioavailability when consumed. This was necessary because traditional African household processing methods could reduce PVA by about 20%-30%. The conversion factor of yellow maize β-carotene to retinol (the form of vitamin A used by the body) by weight was 3.2 ± 1.5 to 1 (Muzhingi et al., 2011). A preliminary PVA target density of 15 µg/g DW (HarvestPlus, 2004) was, therefore, set to provide an estimated 50% dietary requirement of 275 µg of retinol in children and about 500 µg in women based on common quantities of PVA maize consumed (Simpungwe et al., 2017).
For example, with 15 µg/g DW PVA, which translates into about 5 µg of retinol per gram (using a conversion factor of 3:1), a child could meet the estimated 50% daily retinol requirement of 275 µg by consuming about 825 µg of PVA, obtainable from an average quantity of 55 g of PVA maize consumed. In view of this potential of PVA maize, there has been some progress with respect to the improvement of PVA carotenoids in developed maize varieties in SSA, with over 40 PVA varieties released (Andersson, Saltzman, Virk, & Pfeiffer, 2017; Listman et al., 2019). However, the PVA concentrations of these varieties range between 6 and 10 µg/g (Andersson et al., 2017). Thus, much remains to be done to increase both the number of hybrids released in SSA and their PVA concentrations. This calls for detection and validation of the candidate genes, especially phytoene synthase1 (PSY1), lycopene epsilon cyclase (LcyE) and β-carotene hydroxylase1 (crtRB1), which have been identified to regulate the key steps involved in the accumulation of PVA carotenoids in maize endosperm (Wurtzel, Cuttriss, & Vallabhaneni, 2012). Among these three genes, crtRB1 (with the alleles crtRB1-5′TE and crtRB1-3′TE) is the most favourable for increased β-carotene levels (Babu, Rojas, Gao, Yan, & Pixley, 2013). Also, two of the three significant polymorphic alleles of LcyE, that is LcyE-5′TE and LcyE-3′ indel, have been validated using 26 different tropical segregating populations.
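The arithmetic in the example above can be checked with a short sketch. The function names are illustrative only, and the 3:1 conversion ratio is the rounded value used in the text rather than the measured 3.2 ± 1.5 to 1.

```python
def retinol_from_maize(pva_ug_per_g: float, maize_g: float,
                       conversion_ratio: float = 3.0) -> float:
    """Retinol (µg) supplied by a portion of PVA maize.

    pva_ug_per_g     -- PVA concentration of the grain (µg/g DW)
    maize_g          -- quantity of maize consumed (g)
    conversion_ratio -- µg of PVA needed per µg of retinol (3:1 assumed)
    """
    if conversion_ratio <= 0:
        raise ValueError("conversion_ratio must be positive")
    total_pva_ug = pva_ug_per_g * maize_g  # e.g. 15 µg/g * 55 g = 825 µg
    return total_pva_ug / conversion_ratio


def maize_needed_g(retinol_target_ug: float, pva_ug_per_g: float,
                   conversion_ratio: float = 3.0) -> float:
    """Quantity of maize (g) needed to supply a given amount of retinol."""
    if pva_ug_per_g <= 0:
        raise ValueError("pva_ug_per_g must be positive")
    return retinol_target_ug * conversion_ratio / pva_ug_per_g


if __name__ == "__main__":
    # Values taken from the worked example in the text.
    print(retinol_from_maize(15.0, 55.0))  # retinol from 55 g of maize
    print(maize_needed_g(275.0, 15.0))     # maize needed for 275 µg retinol
```

Under the same assumptions, the second function also shows how the required quantity of maize rises for varieties in the 6 to 10 µg/g range reported for released varieties.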
Quality protein maize (QPM) has about twice the lysine and tryptophan contents of conventional maize. In whole grain samples, a maize genotype is classified as QPM when the tryptophan content exceeds 0.075% (Teklewold et al., 2015; Vivek, Krivanek, Palacios-Rojas, Twumasi-Afriyie, & Diallo, 2008). The development of QPM was successful due to (a) the presence of the recessive opaque 2 alleles (o2o2) which are lacking in the conventional maize counterparts, (b) enhancers of the endosperm containing the o2-gene for increased levels of tryptophan and lysine and (c) modifying genes responsible for the hardness of the o2-induced soft endosperm (Twumasi-Afriyie et al., 2016). In addition, the Kompetitive Allele-Specific PCR (KASP) SNP, snpZM0015, located inside the crtRB1 gene on chromosome 10, has been optimized and recommended for accelerating PVA improvement in maize (Intertek Group Plc., Sweden, unpublished).
The early maturing maize inbred lines with backgrounds of PVA, as well as essential amino acids, that is tryptophan and lysine that qualify maize genotypes as QPM (Atlin et al., 2011; Krivanek, Groote, Gunaratna, Diallo, & Friese, 2007), are a new set of inbred lines developed by the Maize Improvement Programme (MIP) at the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria (Badu-Apraku & Fakorede, 2017). The lines were selected based on the deep orange colour (colour scores > 8.0) of the kernels (Chandler et al., 2013), which made them distinct from the typical yellow cultivars. This implied that the inbred lines at least had elevated levels of PVA compared to the typical yellow maize, which usually has PVA levels ranging from 0.5 µg/g to 1.5 µg/g DW (Egesel, Wong, Lambert, & Rocheford, 2003). Studies by Sivaranjani, Prasanna, Hossain, and Santha (2013) have revealed a positive correlation between kernel colour and total carotenoid concentration. However, a genotype can have higher levels of total carotenoids with lower concentrations of PVA. Azmach, Gedil, Menkir, and Spillane (2013) found weak or zero correlations between kernel colour and provitamin A carotenoids.
Therefore, orange colour per se is not reliable in determining PVA levels of maize genotypes (Azmach et al., 2013; Menkir, Liu, White, Maziya-Dixon, & Rocheford, 2008; Menkir & Maziya-Dixon, 2004). Although kernel colour was initially used to select inbred lines purporting to harbour elevated levels of PVA, in the present study, chemical analysis was used to determine the levels of these carotenoids. In addition, a clear knowledge of the presence of the PVA favourable genes to accelerate PVA accumulation in the inbreds is needed to guide breeding strategies for developing superior early PVA-QPM hybrids. This study was, therefore, conducted to (a) determine the levels of drought and low-N tolerance, and PVA concentration in the early PVA-QPM inbred lines, and (b) identify lines harbouring the crtRB1 and LcyE genes as sources of favourable PVA alleles to serve as donor parents. At the end of the 2015 growing season, 150 S7 inbred lines with varying reactions to the multiple stresses were analysed for tryptophan content in the IITA chemical laboratory in Ibadan, Nigeria, and 73 (only those with >0.075%) were kept for further use (Badu-Apraku & Fakorede, 2017). From the 73 early PVA-QPM inbreds, 64 plus six checks were selected for the present study.

| Field trials
Evaluation of the 70 inbred lines was carried out under drought, low-N and optimal conditions in Nigeria. The drought experiments were conducted at Ikenne (6°50′N, 30°45′E, 62 m altitude, 1,200 mm mean rainfall annually) in the 2016/2017 and 2017/2018 dry seasons. Drought stress was achieved by supplying 17 mm of sprinkler irrigation water per week up to 25 days after planting (DAP), after which the irrigation was terminated and the maize plants depended on the available soil moisture to reach physiological maturity. The managed drought trials received NPK fertilizer at the rate of 60 kg/ha each of N, P and K (15-15-15) at planting. Additionally, top dressing was done with 60 kg/ha of N (supplied as urea) at 3 weeks after planting (WAP).
Evaluation of the inbreds under low-N (30 kg/ha) conditions was carried out at Ile-Ife (7°30′N, 5°31′E, 240 m altitude, 1,250 mm mean rainfall annually) and Mokwa (10°20′N, 5°6′E, 459 m above sea level, 1,050 mm mean rainfall annually) in the 2016 and 2017 major growing seasons. Low soil nitrogen conditions at both locations were accomplished by depleting the fields of N through continuous cultivation of densely populated maize without fertilizer application for three cropping seasons and complete removal of crop residues at the end of every harvest. Prior to field preparation, topsoil samples were collected at the depth of 0-15 cm for analysis of the contents of nitrogen (N), phosphorus (P) and potassium (K) using the Kjeldahl digestion and colorimetric procedure (Bremner & Mulvaney, 1982) at the IITA analytical services laboratory, Ibadan, Nigeria. The low-N experimental field at Mokwa had 0.085 g/kg N, 6.32 g/kg P and 0.20 g/kg K, whereas that of Ile-Ife contained 0.084 g/kg N, 2.05 g/kg P and 0.358 g/kg K. Based on the soil tests, NPK fertilizer was formulated using urea (N source), triple superphosphate (P2O5 source) and muriate of potash (K2O source), respectively, and it was applied immediately after thinning (2 WAP). The urea provided a basal available N of 15 kg/ha, and the P2O5 and K2O fertilizers supplied 60 kg/ha each of P and K. Additionally, top dressing of 15 kg/ha of N (supplied as urea) was done at 4 WAP to bring the total available N received on the low-N fields to 30 kg/ha.

| Agronomic data collection
Based on individual plots, data were taken for days to 50% anthesis (DA) and silking (DS) as well as plant and ear heights (PLHT and EHT). Plant and ear aspects (PASP and EASP) were rated on a scale of 1-9 (1 = excellent plants or ears and 9 = extremely poor plants or ears). The anthesis-silking interval (ASI) was calculated as the difference between DS and DA. The number of ears per plant (EPP) was obtained as the ratio of the number of harvested ears in a plot to the total number of plants in that plot. At 70 DAP, visual ratings for stay-green characteristic (STGR) were carried out for the trials under drought and low-N using a scale of 1-9 (1 = less than 10% dead leaf area and 9 = more than 80% dead leaf area).
The harvested ears from each plot under the two stress conditions were shelled to measure grain weight. Grain moisture content was determined using a Kett PM-450 moisture tester. Grain weight was adjusted to 15% moisture content, and grain yield (GY) in kg/ha was computed on a plot basis. For the optimal trials, on the other hand, a shelling percentage of 80% was assumed for each plot to compute GY from ear weight adjusted to 15% moisture content.
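The derived traits described above can be sketched as follows. This is a minimal illustration, not the authors' script: the plot area is a hypothetical parameter (it is not stated in this section), the moisture adjustment uses the usual dry-matter ratio, and all input values in the usage lines are made up.

```python
def anthesis_silking_interval(days_to_silking: float,
                              days_to_anthesis: float) -> float:
    """ASI in days: days to 50% silking minus days to 50% anthesis."""
    return days_to_silking - days_to_anthesis


def ears_per_plant(ears_harvested: int, plants_in_plot: int) -> float:
    """EPP: harvested ears divided by the number of plants in the plot."""
    if plants_in_plot <= 0:
        raise ValueError("plants_in_plot must be positive")
    return ears_harvested / plants_in_plot


def moisture_adjusted_weight(weight_kg: float, moisture_pct: float,
                             target_moisture_pct: float = 15.0) -> float:
    """Rescale a fresh weight to the weight it would have at the target
    moisture content, keeping dry matter constant."""
    return weight_kg * (100.0 - moisture_pct) / (100.0 - target_moisture_pct)


def grain_yield_kg_ha(weight_kg: float, moisture_pct: float,
                      plot_area_m2: float,
                      shelling_fraction: float = 1.0) -> float:
    """Grain yield (kg/ha) at 15% moisture.

    Stress trials: pass shelled grain weight and leave shelling_fraction
    at 1.0. Optimal trials: pass ear weight with shelling_fraction=0.80.
    plot_area_m2 is an assumed input, not a value from the paper.
    """
    if plot_area_m2 <= 0:
        raise ValueError("plot_area_m2 must be positive")
    adjusted = moisture_adjusted_weight(weight_kg, moisture_pct)
    return adjusted * shelling_fraction * 10_000.0 / plot_area_m2


if __name__ == "__main__":
    # Made-up plot record, for illustration only.
    print(anthesis_silking_interval(58, 55))
    print(ears_per_plant(14, 16))
    print(grain_yield_kg_ha(1.2, 18.0, plot_area_m2=3.0))
    print(grain_yield_kg_ha(1.5, 18.0, plot_area_m2=3.0,
                            shelling_fraction=0.80))
```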

| Production of kernel samples for carotenoid analysis
The inbred lines were planted under well-watered growing con-

| Analysis of provitamin A carotenoids
Carotenoids were extracted and quantified by HPLC at the IITA nutritional laboratory, Ibadan, Nigeria. The protocol for extraction and carotenoid analysis was based on the procedure described in Howe and Tanumihardjo (2006). Total carotenoids were computed as the sum of concentrations of α-carotene, β-carotene, lutein, zeaxanthin and β-cryptoxanthin. PVA was computed as the sum of β-carotene and half of each of β-cryptoxanthin and α-carotene contents, since β-cryptoxanthin and α-carotene contribute half (50%) of the value of β-carotene as PVA (US Institute of Medicine, 2001). The selected early maturing PVA-QPM inbred lines were analysed only for tryptophan content, not lysine, in whole grain flour. This is because lysine content of the maize endosperm is highly correlated with that of tryptophan (greater than 0.9; Nurit, Tiessen, Pixley, & Palacios-Rojas, 2009; Villegas, Vasal, & Bjarnason, 1992). Moreover, analysis for tryptophan is far cheaper than for lysine, and so it is economically prudent for the breeder to use tryptophan content to determine the nutritional potential of QPM genotypes at early breeding stages (Nurit et al., 2009; Villegas et al., 1992). Tryptophan was quantified by the colorimetric method (Hernandez & Bates, 1969). Values of all carotenoids and tryptophan for each sample were obtained from two technical replications from each field replication to increase accuracy in the carotenoid quantification.
diversityarrays.com/files/DArTDNAisolation.pdf). The DNA concentration was obtained by spectrometry measurement using a NanoDrop 8000 machine (Thermo Scientific), and DNA quality was confirmed by running DNA samples on 0.8% agarose gel. Short or degraded DNA was eliminated, and DNA concentrations of 30 ng/μl were used.
Standard Buffer, 0.5 µl of each primer and ultra-pure water making up to 10 µl total reaction volume.
PCR thermal cycling profile was 1 cycle of initial denaturation at 94°C for 3 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 58°C for 1 min and extension at 68°C for 1 min. This was followed by 1 cycle of final extension at 68°C for 5 min and hold at 4°C. Fragments in the PCR products were resolved on 2% agarose gel. The polymorphic sites of the crtRB1-3′TE gene used have a 325/1,250 bp indel, with 595 bp amplicon being the favourable allele, while 920 and 1,845 bp are the unfavourable alleles. Also, the polymorphic sites of the LcyE-5′TE gene used have a 401/1,567 bp indel, with 595 bp amplicon being the favourable allele and 1,845 bp being the unfavourable allele (Azmach et al., 2013).

| Kompetitive allele-specific PCR (KASP) genotyping
Genomic DNA isolated from leaf tissue of the 70 maize inbred lines was used as template for the KASP genotyping reaction. Sample DNA was diluted to a working concentration of 30 ng/µl for use in the KASP genotyping reaction. The KASP assay, snpZM0015, was used to investigate the presence and/or absence of the favourable alleles for the crtRB1 gene. The KASP reaction was performed in a 96-well plate in a reaction volume of 10 µl consisting of 5 µl template DNA and 5 µl of the prepared genotyping mix (2× KASP master mix and primer mix).
Protocols for the preparation and running of KASP reactions are presented in the KASP manual (http://www.kbioscience.co.uk, accessed on 2nd July 2018). The KASP assay kit was purchased from LGC Genomics (LGC Group). All amplification reactions were performed using the Roche LightCycler 480 II (LC480 II) System (Roche Life Science) at the Bioscience Centre of IITA Ibadan, Nigeria. Amplification conditions were as follows: 1 cycle of KASP special Taq activation at 94°C for 15 min, followed by 36 cycles of denaturation at 94°C for 20 s, and annealing and elongation at 60°C (dropping 0.6°C per cycle) for 1 min. Endpoint detection of the fluorescence signal was acquired for 1 min at 30°C using the same instrument. Genotyping results were analysed using KlusterCaller software (LGC Group), and genotyping data were visualized as cluster plots and downloaded using SNPviewer software (LGC Group). Allele calls for the SNP were made based on the validation result provided by Intertek (Intertek Group Plc.), as homozygous AA for the favourable allele, homozygous GG for the unfavourable allele or heterozygous AG for both alleles.
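The allele-call rule stated above (AA favourable, GG unfavourable, AG heterozygous) can be expressed as a small lookup. This sketch is independent of the KlusterCaller and SNPviewer software; the colon-separated call strings and the treatment of anything unrecognized as "no call" are assumptions made for illustration, and the example calls are invented.

```python
from collections import Counter

# Interpretation of snpZM0015 calls as described in the text.
# The "A:A" style string format is an assumption for this sketch.
CALL_MEANING = {
    "A:A": "favourable (homozygous)",
    "A:G": "heterozygous",
    "G:A": "heterozygous",
    "G:G": "unfavourable (homozygous)",
}


def classify_call(call: str) -> str:
    """Map a raw genotype call to its interpretation.

    Any string not in CALL_MEANING (for example "?" for a sample that
    failed to amplify) is reported as "no call".
    """
    return CALL_MEANING.get(call.strip().upper(), "no call")


def tally_calls(calls):
    """Count how many samples fall into each interpretation class."""
    return Counter(classify_call(c) for c in calls)


if __name__ == "__main__":
    # Invented calls for six hypothetical samples.
    example = ["A:A", "G:G", "G:A", "?", "G:G", "a:a"]
    for label, count in tally_calls(example).items():
        print(f"{label}: {count}")
```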

| Statistical analysis
Agronomic data recorded for the inbreds were subjected to analysis of variance (ANOVA) under each and across environments using PROC GLM in Statistical Analysis Software (SAS) version 9.4 with a random statement and a test option (SAS Institute, 2012), and means were separated using standard error of difference (SED). In the ANOVA, each location by year combination was considered as an environment. Environments, replications within environments and incomplete blocks within replications × environment interactions were treated as random factors, while inbred was regarded as a fixed factor. PVA carotenoid data were transformed using natural logarithm as the ratios were not expected to follow a normal distribution curve. ANOVA was also performed for PVA carotenoids for each and across the two locations (Mokwa and Ibadan). Mean concentrations of PVA and the component carotenoids measured for Ibadan and Mokwa were compared by performing a two-tailed independent samples t test with equal pooled variance using SAS. Furthermore, effect size of the significant t-values was determined by the estimated Cohen's d (Cohen, 1988) as follows:
$$d = t\sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$
where t = t-value, and $N_1$ and $N_2$ = number of observations in samples one and two, respectively. A measure of effect size (Cohen's d estimates) up to 0.20, 0.50 and 0.80 was classified as small, medium and large, respectively. Repeatability of the traits was calculated on a mean basis using the following formula:
$$R = \frac{\sigma^2_G}{\sigma^2_G + \dfrac{\sigma^2_{GE}}{e} + \dfrac{\sigma^2}{er}}$$
where $\sigma^2_G$ is the genotypic variance, $\sigma^2_{GE}$ is the genotype × environment variance, $\sigma^2$ is the error variance, e is the number of environments, and r is the number of replications. Variances were estimated using the REML method in the SAS MIXED procedure.
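The two derived statistics described above can be sketched in a few lines. The sketch assumes the standard forms of both expressions: Cohen's d recovered from an independent-samples t-value as d = t·sqrt((N1 + N2)/(N1·N2)), and repeatability on an entry-mean basis as the genotypic variance divided by the phenotypic variance of a mean over e environments and r replications. The variance components would come from a mixed-model fit (the study used SAS); the numbers in the usage lines are invented.

```python
import math


def cohens_d_from_t(t_value: float, n1: int, n2: int) -> float:
    """Cohen's d from an independent-samples t-value.

    Assumes d = t * sqrt((n1 + n2) / (n1 * n2)), with n1 and n2 the
    numbers of observations in the two samples.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    return t_value * math.sqrt((n1 + n2) / (n1 * n2))


def repeatability(var_g: float, var_ge: float, var_error: float,
                  n_env: int, n_rep: int) -> float:
    """Repeatability on a mean basis.

    var_g     -- genotypic variance
    var_ge    -- genotype x environment variance
    var_error -- error variance
    n_env     -- number of environments (e)
    n_rep     -- number of replications (r)
    """
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    phenotypic = var_g + var_ge / n_env + var_error / (n_env * n_rep)
    return var_g / phenotypic


if __name__ == "__main__":
    # Invented inputs, for illustration only.
    print(round(cohens_d_from_t(2.5, 20, 20), 3))
    print(round(repeatability(0.40, 0.30, 0.90, n_env=3, n_rep=2), 3))
```

One consequence visible in the second function is that, for fixed variance components, adding environments or replications shrinks the non-genetic terms and so raises repeatability.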

| Analysis of variance for agronomic traits and provitamin A carotenoids
ANOVA under two drought environments revealed significant (p < .01) variation among environments (E) and inbreds (G) except ASI, PASP, EASP and EPP for environments (Table 1). However, inbred × environment interactions (GEI) were significant (p < .05) for only GY and ASI. Across the three low-N environments, significant (p < .01) variations were observed among E, G and GEI mean squares for the traits measured except GEI mean squares for EHT. High repeatability (R) values were estimated for PASP, EASP, EPP and STGR compared with that for GY (56%) under drought. Under low-N, GY had a relatively low R estimate compared with the moderately high values estimated for DA, DS, PLHT and EHT. Under optimal environments, significant (p < .05) differences were found among E, G and GEI mean squares for all traits. The few exceptions were GEI mean squares for DA, PLHT and EASP (Table 2). Across the eight test environments, there were significant (p < .01) differences among E, G,

| Mean responses of inbreds for agronomic traits and provitamin A carotenoids
Ranking of the lines was based on the multiple trait base index across environments (Table 4). The inbreds recorded greater yield reductions under drought relative to those under low-N and across the two stresses. The analysis of data across stress environments generally revealed that the inbreds that had increased ASI, higher STGR and high percentage GY reduction also had lower GY and negative selection indices. Thirty-three inbred lines showed tolerance across stress environments based on the multiple trait base index, with performance under optimal conditions serving as a check.
Although kernels were sampled under well-watered conditions for both locations, mean concentrations of PVA and the component carotenoids were consistently higher for Ibadan than for the Mokwa site (Table 5). However, the two-tailed independent samples t test revealed that only the PVA and β-carotene concentrations were significantly higher. A large effect size of 1.21 was estimated for the significant difference observed for β-carotene concentrations measured for the two locations. Conversely, a very small effect size of 0.09 was found for the significant difference in PVA concentrations between the two locations. Estimated PVA contents of selected inbred lines across the two locations varied from 4.83 μg/g for TZEIORQ 48 to 14.62 μg/g for TZEIORQ 55, with a mean of 7.38 μg/g (Table S1). Zeaxanthin (44%) and lutein (27%) were the most predominant carotenoids vis-à-vis β-cryptoxanthin (8%) and α-carotene (6%), which had lower values. Relative to the contents of the other carotenoids, lower levels of α-carotene were measured for most of the inbred lines under each and across locations.
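The PVA values reported here follow the weighting given in the carotenoid analysis: β-carotene counts in full, while β-cryptoxanthin and α-carotene each count at half. A minimal sketch of that computation and of the total carotenoid sum is shown below; the function names and the concentrations in the usage lines are illustrative and are not data from the study.

```python
def provitamin_a(beta_carotene: float, beta_cryptoxanthin: float,
                 alpha_carotene: float) -> float:
    """PVA (µg/g) = β-carotene + 0.5 * (β-cryptoxanthin + α-carotene)."""
    return beta_carotene + 0.5 * (beta_cryptoxanthin + alpha_carotene)


def total_carotenoids(alpha_carotene: float, beta_carotene: float,
                      lutein: float, zeaxanthin: float,
                      beta_cryptoxanthin: float) -> float:
    """Total carotenoids (µg/g): sum of the five measured fractions."""
    return (alpha_carotene + beta_carotene + lutein
            + zeaxanthin + beta_cryptoxanthin)


if __name__ == "__main__":
    # Made-up HPLC concentrations (µg/g) for a single sample.
    sample = {
        "alpha_carotene": 1.0,
        "beta_carotene": 4.0,
        "lutein": 6.0,
        "zeaxanthin": 10.0,
        "beta_cryptoxanthin": 2.0,
    }
    pva = provitamin_a(sample["beta_carotene"],
                       sample["beta_cryptoxanthin"],
                       sample["alpha_carotene"])
    total = total_carotenoids(**sample)
    print(f"PVA = {pva} µg/g, total carotenoids = {total} µg/g")
```

Because lutein and zeaxanthin enter the total but not PVA, a sample can rank high on total carotenoids while ranking low on PVA, which is consistent with the weak link between kernel colour and PVA noted in the introduction.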

| Identification of inbred lines carrying favourable alleles of LcyE and crtRB1 genes
Among the 3 PCR-based functional markers used, the crtRB1_3′TE_T_
The inbreds identified to harbour the favourable alleles of crtRB1 generally had moderate levels of PVA ranging from 6.01 μg/g for TZEIORQ 10 to 14.62 μg/g for TZEIORQ 55 (Table 6).
Correspondingly, snpZM0015 identified TZEIORQ 55 to harbour the heterozygous alleles (G:A [green] = heterozygous) as revealed by the PCR-gel-based crtRB1-3′TE marker. Fifty-nine inbreds had the unfavourable allele (G:G [red] = inbred lines with the unfavourable allele), while no amplification was observed for two of the inbred lines (? [pink] = inbreds that did not amplify). The 2 non-template controls [NTC (black) = no template controls] effectively checked the amplification and efficiency of the KASP SNP (snpZM0015) by clustering together away from the inbred samples (Figure 2).

| DISCUSSION
The significant variability among G for grain yield and most of the traits measured under each and across test environments implied the existence of genetic variability in the early maturing PVA-QPM inbreds. The significance of E and GEI observed for GY and several other traits under each and across environments suggested inconsistent rankings of the inbreds for the traits measured in varying environments and that inbred evaluations in more environments were necessary to identify outstanding cultivars under drought (Badu-Apraku et al., 2011; Edmeades, 2013) and under low-N (Meseka, Menkir, Ibrahim, & Ajala, 2006). The high repeatability (R) values estimated for PASP, EASP, EPP and STGR compared with that for GY under drought indicated that selecting for the yield-related traits would be effective to complement GY in the identification of drought tolerant inbred lines (Bänziger, Edmeades, Beck, & Bellon, 2000). The moderately high R values estimated for DA, DS, PLHT and EHT under low-N indicated that early generation testing of the inbred lines using these traits under low-N conditions would be successful (Badu-Apraku et al., 2013). The wide range of PVA values obtained across the locations indicated the existence of significant variation for the PVA carotenoids in the set of inbred lines used (Harjes et al., 2008; Mishra & Singh, 2010) and that each location revealed different genetic variations among the inbred lines. This suggested that G × E was significant for the measured carotenoids, as evident in the significant mean squares of inbred × location interaction observed.
Larger location mean squares compared to the corresponding error
TABLE 5 Mean comparison of provitamin A and the component carotenoids of selected early maturing PVA-QPM inbred lines under two well-watered conditions in Ibadan and Mokwa in Nigeria, 2018
(Egesel et al., 2003; Menkir et al., 2008; Menkir & Maziya-Dixon, 2004) and in temperate maize (Quakenbush, Firch, Brunson, & House, 1966
Forty-seven per cent of the inbreds possessing drought and low-N tolerance signalled the common adaptive mechanisms involved in the tolerance to the two stresses and that selection under drought could result in improved low-N tolerance (Badu-Apraku, Akinwale, Franco, & Oyekunle, 2012; Bänziger et al., 1999; Meseka et al., 2006). However, the greater yield reduction found under drought relative to that under low-N and across the two stresses suggested that the drought conditions were more severe than low-N and that, with limited resources, selection for drought tolerance should be prioritized over low-N. The inbreds identified would serve as an invaluable source of drought and low-N tolerant genes for the development of superior hybrids and synthetics (Ifie, Badu-Apraku, Gracen, & Danquah, 2015).
Babu et al. (2013) and Azmach et al. (2013) who validated the crtRB1-3′TE functional marker in a PVA maize germplasm. However, with the exception of TZEIORQ 55, the PVA contents of the nine inbreds were moderate, suggesting a situation of reduced gene expression (Hood, 2004; Mocellin & Provenzano, 2004) which could be due to gene silencing occurring during transcriptional or translational processes (Redberry, 2006

ACKNOWLEDGEMENTS
The authors are grateful for the financial support of the USAID through the West Africa Centre for Crop Improvement (WACCI) and the support from the Bill and Melinda Gates Foundation through the DTMA/STMA Projects, as well as IITA for the execution of this study.
We are also grateful to the staff of the IITA Maize Improvement Program and the Food and Nutrition Laboratory of IITA in Ibadan, Nigeria, for the technical assistance.

CONFLICT OF INTEREST
The authors hereby declare that the study was carried out without any financial and/or commercial commitments that could result in a potential conflict of interest.