Map-based cloning and characterization of BoCCD4, a gene responsible for white/yellow petal color in B. oleracea

Background: Brassica oleracea exhibits extensive phenotypic diversity. As an important trait, petal color varies among different B. oleracea cultivars, enabling the study of the genetic basis of this trait. In a previous study, the gene responsible for petal color in B. oleracea was mapped to a 503-kb region on chromosome 3, but the candidate gene has not yet been identified.
Results: In the present study, we report that the candidate gene was further delineated to a 207-kb fragment. BoCCD4, a homolog of the Arabidopsis carotenoid cleavage dioxygenase 4 (CCD4) gene, was selected for evaluation as the candidate gene. Sequence analysis of the YL-1 inbred line revealed three insertions/deletions and 34 single-nucleotide polymorphisms in the coding region of BoCCD4. Functional complementation showed that BoCCD4 from the white-petal inbred line 11-192 can rescue the yellow-petal trait of YL-1. Expression analysis revealed that BoCCD4 is exclusively expressed in petal tissue of white-petal plants, and phylogenetic analysis indicated that CCD4 homologs may share evolutionarily conserved roles in carotenoid metabolism. These findings demonstrate that BoCCD4 is responsible for white/yellow petal color variation in B. oleracea.
Conclusions: This study demonstrated that loss of function of BoCCD4, a homolog of Arabidopsis CCD4, is responsible for yellow petal color in B. oleracea.
Electronic supplementary material: The online version of this article (10.1186/s12864-019-5596-2) contains supplementary material, which is available to authorized users.


Background
The flower is the reproductive structure of angiosperms, and petals exhibit extensive color variation, mainly due to the accumulation of flavonoids, carotenoids and/or betalain pigments. Flower color serves as a visual signal to attract pollinators and is thus very important for plant reproduction [1][2][3]. Several reports indicate that, by affecting gene exchange, changes in flower color contribute to species differentiation [4,5]. Thus, flowers can serve as a model for studying the relationship between phenotype and genotype during evolution [6]. Additionally, flower color protects plants against disease and UV radiation and helps to maintain the normal physiological function of floral organs [7,8].
Carotenoids are mostly C40 isoprenoid compounds, comprising over 750 members widely distributed in fungi, cyanobacteria, algae and plants [9]. In plants, carotenoid biosynthesis takes place in plastids. Carotenoids are present in photosynthetic tissues, where they function in light harvesting and photoprotection during photosynthesis [10]. In non-photosynthetic tissues, carotenoids impart colors ranging from yellow to red to fruits and flowers as well as other organs [11]. Carotenoids also provide precursors for the biosynthesis of plant hormones, including abscisic acid (ABA) and strigolactones [12,13]. The pathway of carotenoid biosynthesis has been well characterized, and nearly all the enzymes involved in carotenoid biosynthesis in plants have been identified (see reviews by Howitt and Pogson; Ruiz-Sola and Rodríguez-Concepción) [14,15].
Brassica oleracea comprises multiple subspecies showing extreme phenotypic diversity. As an important trait, flower color varies among different B. oleracea cultivars: yellow petals of varying intensity predominate, whereas white petals are relatively rare and occur only in some cultivars of Chinese kale and cauliflower. Biochemical studies have revealed that variation in flower color in Brassica species is due to differences in the presence, amount, or type of carotenoid pigments [17,22,28]. Although previous studies have demonstrated that this yellow/white petal trait in B. oleracea is controlled by a single locus on C03 [29][30][31], the candidate gene has not yet been identified, and the molecular mechanism underlying petal color variation in B. oleracea has not been elucidated.
Previously, we mapped the gene cpc-1 responsible for petal color in B. oleracea to a 503-kb region [31], though the candidate gene was not found. In the present study, we report further mapping results for cpc-1. The coding region of the candidate gene was cloned and compared between white-petal line 11-192 (a Chinese kale inbred line) and yellow-petal line YL-1 (a cabbage inbred line). Agrobacterium-mediated transformation of B. oleracea was conducted to validate the function of the candidate gene.

Results
Fine mapping of the petal color gene cpc-1
In a previous study, the candidate gene for petal color was mapped to a 503-kb region on C03 [31]. A larger F2 population was then developed, with 1251 recessive (yellow petal) individuals. By genotyping all 1251 recessive individuals using two flanking markers, we obtained 36 recombinants for M4064 and 22 recombinants for M4139. To genotype all recombinant individuals, additional InDel markers were developed in this interval. However, we detected a possible error in the 02-12 assembly (http://www.ocri-genomics.org/bolbase/index.html), as several markers showed different orders in the genetic map compared with their physical positions.
Thus, another physical map was constructed based on the TO1000 reference genome (http://plants.ensembl.org/Brassica_oleracea/Info/Index). By comparing the mapping region in the 02-12 and TO1000 reference genomes, we found that the region spans two scaffolds, Scaffold000063 and Scaffold000205, in the 02-12 reference genome. The physical and genetic maps indicated that Scaffold000063 was assembled in the reverse orientation. In addition, cpc-1 was re-mapped to a 207-kb genomic region (C03:48,444,077..48,651,173) flanked by markers M4089 and M4085, with genetic distances of 0.16 cM and 0.88 cM, respectively. The genetic and physical maps are shown in Fig. 1.
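As a rough illustration of how recombinant counts relate to map distance, the following Python sketch converts the recombinant counts reported for M4064 and M4139 into recombination percentages and computes the physical size of the re-mapped interval. It assumes (this is not stated in the text) that each recombinant recessive F2 individual carries one recombinant gamete out of two, and that for small values the recombination percentage approximates the distance in cM without a mapping function. The function name is hypothetical, and the calculation actually used to build the genetic map may differ.

```python
def recombination_percent(n_recombinants: int, n_individuals: int) -> float:
    """Percent recombinant gametes, assuming one recombinant gamete per recombinant plant."""
    total_gametes = 2 * n_individuals  # each diploid F2 plant derives from two gametes
    return 100.0 * n_recombinants / total_gametes


N_RECESSIVE = 1251  # yellow-petal F2 individuals genotyped (from the text)
RECOMBINANTS = {"M4064": 36, "M4139": 22}  # recombinant counts (from the text)

for marker, count in RECOMBINANTS.items():
    pct = recombination_percent(count, N_RECESSIVE)
    print(f"{marker}: {pct:.2f}% recombination")

# Physical size of the re-mapped interval C03:48,444,077..48,651,173
interval_bp = 48_651_173 - 48_444_077
print(f"interval: {interval_bp} bp (~{interval_bp / 1000:.0f} kb)")
```

Counting gametes rather than plants is the design choice here: in a recessive-class F2 screen, both gametes of every plant are informative, so the denominator is twice the number of individuals.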

Bol029878 is the candidate gene for cpc-1
Analysis of the B. oleracea database (http://brassicadb.org/brad/) revealed 14 predicted genes (Table 1) in the 207-kb region. Bol029878 is a homolog of Arabidopsis CCD4. Due to its important role in oxidative cleavage pathways of carotenoids, Bol029878 was chosen as a candidate gene and named BoCCD4.
The full-length sequence of the BoCCD4 gene was downloaded from two reference genomes, TO1000 and 02-12. Wild-type BoCCD4 has one exon predicted to encode a putative 596-amino acid protein, with 87.1% sequence identity with Arabidopsis CCD4. To detect any nucleotide variation in BoCCD4 between white- and yellow-petal plants, the region encompassing the gene body and −2-kb promoter of BoCCD4 was amplified and sequenced using genomic DNA from YL-1 and 11-192. The gene sequence from 11-192 (GenBank no. MK599257) was identical to that of TO1000, consistent with TO1000 being a Chinese kale-like plant with white flowers. Sequencing of this gene revealed multiple mutations in YL-1 (GenBank no. MK599258): a 1-bp insertion at the +312 nucleotide position, a 7-bp deletion at the +771 nucleotide position, a 1-bp deletion at the +1094 nucleotide position, and 34 single-nucleotide polymorphisms (SNPs) (Fig. 2). Each InDel results in a frameshift and a premature stop codon.
As the first important mutation in the coding region, the 1-bp insertion alters the open reading frame from the 105th amino acid position and causes a premature stop codon, resulting in a predicted truncated 114-amino acid protein lacking the important carotenoid-oxygenase domain (amino acids 74-590) (Additional file 1). Using first-strand cDNA as a template, amplification with the primer Bocpc-CDS generated a full-length coding sequence of BoCCD4 from 11-192 but no product from YL-1. Together with the RT-PCR results (see below), this finding indicates that BoCCD4 is not expressed in yellow-petal line YL-1, possibly due to altered transcript stability [32] or mutations in the promoter region. Indeed, we detected mutations in the −2-kb promoter region, though it remains to be determined whether these mutations are crucial for suppressing expression of the transcript.
A genic marker, Bol035718D771, was designed based on the 7-bp deletion and used for genotyping the parental lines and the recombinants of M4089 and M4085. All 13 recombinants showed the same band size as that of YL-1 (Additional file 2), indicating that this gene co-segregates with the petal color phenotype.
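The effect of a 1-bp insertion on the reading frame can be sketched in a few lines of Python. The toy sequence below is hypothetical and is not the BoCCD4 sequence; the helper names are likewise illustrative. The sketch assumes the inserted base follows nucleotide +312, in which case the first affected codon is 312 // 3 + 1 = 105, in line with the position reported above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def first_affected_codon(n_unchanged_nt: int) -> int:
    """1-based index of the first codon altered by an insertion placed
    after n_unchanged_nt unchanged nucleotides (assumed convention)."""
    return n_unchanged_nt // 3 + 1


# Hypothetical toy CDS (NOT BoCCD4): ATG GCT GAA TCG ACT GGA TAA
wild_type = "ATGGCTGAATCGACTGGATAA"
# Insert one base after nucleotide 6; the shifted frame now reads ATG GCT TGA
mutant = wild_type[:6] + "T" + wild_type[6:]

print(translate(wild_type))       # full-length toy peptide
print(translate(mutant))          # truncated by the premature stop codon
print(first_affected_codon(312))  # codon index for an insertion after +312
```

In the toy case the stop codon appears immediately at the shifted codon; in a real gene the shifted frame can run for several codons before a stop arises, which is how a frameshift at position 105 can still yield a 114-amino acid product.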

Expression pattern of BoCCD4
To analyze the expression pattern of BoCCD4, semiquantitative RT-PCR was performed using different tissues: root, stem, leaf, silique, young bud, anther, pistil and open-flower petal. No expression was detected in any of the YL-1 tissues. However, BoCCD4 was preferentially expressed in petals of the white-petal line (Fig. 3). These results revealed that BoCCD4 is a tissue-specific gene that may cleave carotenoids in floral tissues, which is very different from its homolog in Arabidopsis.

Overexpression of BoCCD4 in YL-1 results in a transition of petal color from yellow to white or pale yellow
We introduced wild-type BoCCD4 driven by the CaMV35S promoter into the yellow-petal parent YL-1 using Agrobacterium-mediated B. oleracea transformation and obtained three independent overexpressing transgenic lines, OEX1, OEX2 and OEX3. OEX1 and OEX3 showed intermediate phenotypes, whereas OEX2 displayed a completely white petal similar to that of the white-petal line 11-192 (Fig. 4a).
We next examined the expression levels of the BoCCD4 gene introduced into these transgenic lines by semiquantitative RT-PCR. OEX2 showed the highest expression level, followed by OEX1 and OEX3 (Fig. 4b), indicating high correlation between the expression level of BoCCD4 and the white-petal phenotype. These results suggest that BoCCD4 disruption is responsible for yellow petal color in B. oleracea.

Phylogenetic analysis
To analyze the phylogenetic relationship between the BoCCD4 protein and its close homologs, we conducted BLASTP searches against the protein databases of NCBI and Ensembl Plants (http://plants.ensembl.org) using the full-length amino acid sequence of BoCCD4. We generated a neighbor-joining tree comprising BoCCD4 and 55 homologs from 38 species. These homologs were grouped into three main clades. BoCCD4 shows 87.1% sequence identity with Arabidopsis CCD4 and is located in the same clade as Arabidopsis CCD4, along with homologs from other cruciferous plants, B. rapa and B. napus (Fig. 5). All CCD4s from Cruciferae species evolved from a common ancestor. Arabidopsis has one CCD4, whereas Brassica species retained two CCD4 homologs. Brassica CCD4s were assigned to subclades in accordance with their locations on the A, B or C genome, indicating that Brassica CCD4s rapidly evolved after the whole-genome duplication event and the Brassiceae-lineage-specific whole-genome triplication event. We also conducted sequence alignment and analyses using BoCCD4 and functionally characterized CCD4s in other species, including Arabidopsis thaliana, Osmanthus fragrans, Chrysanthemum x morifolium, and Prunus persica (Fig. 6). All of these CCD4s contain a chloroplast transit peptide, four highly conserved histidine residues as an iron-ligating cofactor, and a

Discussion
In B. oleracea, the white-petal trait segregates as a single locus, and white is dominant over yellow, as confirmed by different crosses [29][30][31][33]. Recently, this locus was mapped to a region on C03 [29,31]. In this study, we narrowed the gene to a 207-kb region, and we identified an incorrectly assembled scaffold, Scaffold000063, in the 02-12 genome, which made positional mapping difficult. Two B. oleracea draft genome sequences are currently available: TO1000 (Chinese kale-like) [34] and 02-12 (cabbage) [35]. These draft genomes facilitate basic genetics and genomics research but still need to be improved. The B. oleracea genome is estimated to be over 600 Mb, though the published pseudo-chromosome size is 388.8 Mb for 02-12 and 488.6 Mb for TO1000 [34,35]. Previous studies suggest that B. oleracea genome assembly errors are not rare [36][37][38]. In particular, Lee et al. used a genotyping-by-sequencing-based high-resolution genetic map to identify 37 misanchored scaffolds for 02-12 and 2 misanchored scaffolds for TO1000 [38]. We predicted a carotenoid cleavage dioxygenase gene, BoCCD4, homologous to the Arabidopsis CCD4 gene, as the candidate gene. Sequence analysis, functional complementation, and expression pattern analysis demonstrated that loss of BoCCD4 function underlies the widespread yellow-petal phenotype among B. oleracea accessions. A similar CCD4-based mechanism has been found in other plants.
In chrysanthemum (Chrysanthemum morifolium Ramat.), CmCCD4a degrades carotenoids into colorless compounds, resulting in a white petal color, as confirmed by expression and RNA interference (RNAi) analyses [25]. In azalea (Rhododendron japonicum f. flavum), high expression of a CCD4 gene was identified in a white-flowered accession and its progeny and is considered the key factor controlling flower color [23]. In peach (P. persica), evidence from cultivars, somatic revertants and ancestral relatives supports the conclusion that PpCCD4 is responsible for white/yellow flesh color and that yellow peach alleles have arisen from three independent mutations [32]. In B. napus, Zhang et al. reported that a transposable element insertion (TE1) disrupts BnaC3.CCD4, resulting in a yellow flower [17]. TE1 was also identified in some accessions of B. oleracea, for example, in cabbage line 02-12 (draft genome) and some yellow-petal Chinese kale lines, indicating that flower petal color variation in B. oleracea follows a similar CCD4-disruption mechanism, as confirmed in the present study. Additionally, it is possible that yellow petals originally appeared in ancestors of B. oleracea and that one mutant type, i.e. TE1, was passed to B. napus.
Fig. 4b Expression level of BoCCD4 in parental lines and three overexpressing transgenic lines. Among the overexpressing transgenic lines, OEX2 showed the highest expression level, followed by OEX1 and OEX3, with a high correlation with phenotype
BoCCD4 is a floral tissue-specific gene, in contrast to Arabidopsis CCD4, which is expressed in various vegetative and floral tissues. B. oleracea has experienced a whole-genome duplication (WGD) event [39][40][41] and a subsequent whole-genome triplication (WGT) [41]. According to previous studies, duplicated gene copies may undergo divergence in expression patterns or functions [41,42]. It is interesting that one of the duplicated copies, BoCCD4 on C03, evolved a tissue-specific expression pattern and underwent loss-of-function events, converting flower color from white to yellow without influencing carotenoid metabolism in vegetative tissues. This phenomenon has also been found in other plants, whereby duplicated CCD4 genes evolved different expression patterns in tomato [43], C. morifolium [25], and mandarin orange [44]. In addition, Rodrigo et al. reported that one CCD4 copy evolved novel carotenoid cleavage activity [44].
Parallel evolution is a common evolutionary phenomenon in which different populations independently evolve the same trait [45,46]. For example, three dwarf populations of the forest tree Eucalyptus globulus have evolved in parallel from local tall ecotypes [47]. In the Mina lineage of Ipomoea, parallelism was observed at different levels during the transition of flower color, primarily caused by cis-regulation of the F3'H gene [48]. In addition, parallel evolution at the FLC locus has conferred flowering time variation in the cruciferous plant Capsella rubella. In B. oleracea, using different accessions, we confirmed the presence of at least five pervasive key mutations in the coding region of BoCCD4: two transposons (TE1 and TE2) and three InDels (+312 insertion, +771 deletion, and +1094 deletion). Different yellow-petal haplotypes (nonfunctional alleles) harbor one independent key mutation or a combination of two or more mutations. The presence of these independent mutations indicates that parallel evolution of BoCCD4 possibly occurred in populations of the B. oleracea ancestors. Parallel phenotypic changes may be caused by different genetic changes, different changes at the same locus, and in some cases changes in the same nucleotide at the same locus [48][49][50], which may explain why some nonfunctional alleles of BoCCD4 harbor combinations of different mutations. For example, the allele of the YL-1 inbred line harbors all three InDels, whereas the allele of the 02-12 inbred line harbors TE1 and the +1094 deletion.

Conclusions
In this study, the gene responsible for petal color in B. oleracea was mapped to a 207-kb fragment. A carotenoid cleavage dioxygenase 4 (CCD4) gene, BoCCD4, was identified as a candidate. Sequence analysis revealed multiple mutations in the coding region of BoCCD4 alleles of yellow-petal accessions. Overexpression of the wild-type BoCCD4 allele from 11-192 rescued the yellow-petal trait in YL-1, demonstrating that functional loss of BoCCD4 resulted in the widespread yellow-petal B. oleracea accessions. This study provides insight into the formation of white/yellow petal color in B. oleracea.
Fig. 6 Sequence alignment of the BoCCD4 amino acid sequence and four functionally characterized CCD4s from Arabidopsis thaliana, Osmanthus fragrans, Chrysanthemum x morifolium, Prunus persica. Green asterisks indicate the four highly conserved histidine residues as an iron-ligating cofactor; red asterisks indicate the conserved glutamates or aspartate for fixing the iron-ligating histidine residues

Plant materials
Brassica oleracea lines YL-1 (yellow petal) and 11-192 (white petal) were described in a previous study [31]. These lines were used as parents to construct F2 and backcross (BC) populations for mapping cpc-1 [31]. In this study, a larger F2 population comprising 1251 recessive (yellow petal) individuals was produced for map-based cloning of cpc-1. YL-1 was also used as the acceptor plant for Agrobacterium-mediated transformation.

Map-based cloning
Genomic DNA was extracted from fresh leaves of parents and F2 individuals using a modified CTAB (cetyl trimethylammonium bromide) protocol [31]. A set of insertion/deletion (InDel) markers (Additional file 3) around the previously reported mapping region was developed. Polymorphic markers between YL-1 and 11-192 were used to genotype all yellow-petal individuals of the F2 population. Polymerase chain reaction and polyacrylamide gel electrophoresis were performed following a previously described procedure [31]. Genetic and physical maps were constructed using MapDraw [51].
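The logic of screening a recessive-class population for recombinants can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the genotype codes ("A" for homozygous 11-192 allele, "B" for homozygous YL-1 allele, "H" for heterozygous), the marker names, the plant identifiers and the function name are all hypothetical. Because yellow-petal individuals are expected to be homozygous for the YL-1 allele at the causal locus, any non-"B" call at a linked marker implies at least one crossover between that marker and the gene.

```python
def find_recombinants(calls, marker):
    """Return yellow-petal plants whose genotype at `marker` is not
    homozygous for the recessive-parent (YL-1) allele, coded "B"."""
    return sorted(plant for plant, genotype in calls.items() if genotype[marker] != "B")


# Hypothetical genotype calls for four yellow-petal F2 plants at two flanking markers
toy_calls = {
    "plant_01": {"left": "B", "right": "B"},
    "plant_02": {"left": "H", "right": "B"},  # crossover between left marker and gene
    "plant_03": {"left": "B", "right": "H"},  # crossover between right marker and gene
    "plant_04": {"left": "B", "right": "B"},
}

for marker in ("left", "right"):
    print(marker, find_recombinants(toy_calls, marker))
```

Genotyping only the recombinants with additional markers inside the interval then narrows the region: the gene must lie on the side of each crossover where the plant is still homozygous "B".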

Plasmid construction and functional complementation
For functional complementation, the coding sequence of cpc-1 was amplified from white-petal parent 11-192 using the primer Bocpc-CDS. The fragment was subcloned into a modified binary vector pBWA(V) BS (reconstructed from pCAMBIA1301) driven by the CaMV35S promoter, and the hygromycin resistance gene was replaced with an herbicide resistance marker (Bar) to generate the construct Pro35S::BoCCD4. This construct was introduced into Agrobacterium tumefaciens strain GV3101 and transformed into yellow-petal parent YL-1 using the Agrobacterium-mediated transformation procedure for B. oleracea described by Yi et al. [52].

Expression analysis of BoCCD4
Total RNA was extracted from plant tissues, including roots, stems, leaves, siliques, young buds, sepals, petals, pistils and anthers of YL-1 and 11-192 and petals of overexpressing lines, using an RNAprep pure Plant Kit (TIANGEN, Beijing, China). Removal of genomic DNA from the extracted RNA, first-strand cDNA synthesis and semi-quantitative reverse transcription-polymerase chain reaction (RT-PCR) were performed as previously described [53]. The primers for RT-PCR are listed in Additional file 3.

Phylogenetic analysis
BLASTP searches were conducted using the amino acid sequence of BoCCD4 to search for homologs in the protein databases of the National Center for Biotechnology Information (NCBI) and Ensembl Plants (http://plants.ensembl.org). Protein sequence alignment was performed with MAFFT (v7.037) [54]. FastTree (LG + JTT model) was used to construct phylogenetic trees [55].

Availability of data and materials
Data of this study have been included in the article or as Additional files.

Authors' contributions
YZ, JS and ZF conceived and designed the work. FH, HC and BZ performed the experiments and analyzed the data. FH and YZ wrote and revised the manuscript. XL, LY, MZ, HL, YW, and ZL analyzed the data and revised the manuscript. All authors have read and approved the final manuscript.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.