Genomic and gene expression associations to morphology of a sexual ornament in the chicken

Abstract How sexual selection affects the genome ultimately relies on the strength and type of selection, and the genetic architecture of the involved traits. While associating genotype with phenotype often utilizes standard trait morphology, trait representations in morphospace using geometric morphometric approaches receive less focus in this regard. Here, we identify genetic associations to a sexual ornament, the comb, in the chicken system (Gallus gallus). Our approach combined genome-wide genotype and gene expression data (>30k genes) with different aspects of comb morphology in an advanced intercross line (F8) generated by crossing a wild-type Red Junglefowl with a domestic breed of chicken (White Leghorn). In total, 10 quantitative trait loci were found associated to various aspects of comb shape and size, while 1,184 expression QTL were found associated to gene expression patterns, among which 98 had overlapping confidence intervals with those of quantitative trait loci. Our results highlight both known genomic regions confirming previous records of a large effect quantitative trait loci associated to comb size, and novel quantitative trait loci associated to comb shape. Genes were considered candidates affecting comb morphology if they were found within both confidence intervals of the underlying quantitative trait loci and eQTL. Overlaps between quantitative trait loci and genome-wide selective sweeps identified in a previous study revealed that only loci associated to comb size may be experiencing on-going selection under domestication.


Introduction
Chickens have been domesticated for over 8,000 years, and in that time, a wide variety of different traits have been subject to selection, be it intentional or not. Large changes in body size, reproduction, behavior, DNA methylation, coloration, and bone production have occurred, among others (Price 1984;Jensen and Wright 2014;Johnsson et al. 2015aJohnsson et al. , 2015bJohnsson et al. , 2016Fogelholm et al. 2019Fogelholm et al. , 2020Hö glund et al. 2020). In addition to these, the comb of the chicken has been under strong selection, both for size and a variety of different large effect morphological changes derived from single-gene mutations. These mutations include such phenotypes as the pea comb (Wright et al. 2009), rose comb (Imsland et al. 2012), and duplex comb (Dorshorst et al. 2015) and were used by Bateson as the first examples of Mendelian inheritance in animals and of 2gene epistatic interactions (Bateson and Mendel 1913). Despite the interest in the genetics of the comb, more complex morphological evaluations of the chicken comb and the genes underpinning such morphology have yet to be performed. The classic single comb consists of several different morphological features including number of finger-like protrusions (comb fingers) and posterior paddle-like structures, which together form the general comb shape.
In chickens, the comb serves as a sexual ornament, which plays a role not only in males for social rank, mate choice, and heat regulation but is also linked to female egg production, fecundity, bone mass, and tarsus length Birkhead 2007a, 2007b;Wright et al. 2008;Johnsson et al. 2014). Therefore, larger combs result in higher egg production and are the target of anthropogenic selection in captivity as well as sexual selection in nature. In this regard, it has been shown that females prefer males with larger combs that will increase viability (Parker 2003) and assure males with increased bone reserves (Wright et al. 2008;Johnsson et al. 2014), whereas males prefer females with larger combs, as this signals superior maternal investment (Pizzari et al. 2003;Birkhead 2007a, 2007b) and increased reproductive output (Wright et al. 2008). The high heritability rates of comb variation in chickens highlight a strong genetic component for this trait (Johnson et al. 1993). Indeed, several studies have reported that quantitative trait loci (QTL) found to be associated to different aspects of comb morphology, size, and mass (Wright et al. 2009;Johnsson et al. 2012Johnsson et al. , 2014Sun et al. 2015;Shen et al. 2016). Furthermore, gene expression analysis from bone marrow tissue provided a few examples of bone allocation genes that colocalize in these comb-related regions, supporting a pleiotropic effect of higher egg production associated to larger combs (Johnsson et al. 2012(Johnsson et al. , 2014. With regard to the phenotype, most of these studies concentrated on comb length, comb height, and comb weight with the exception of Moro et al. (2015), who analyzed the impact of DNA copy number variants on geometric morphometric land-mark data.
The use of classic morphological measurements in QTL analyses has proven to be fruitful thus far, however, different components of shape, captured using geometric morphometric approaches, are less utilized and often reveal novel genomic regions being associated to the studied trait (e.g. Takahara and Takahashi 2015). In the present study, we add chicken comb shape outline and novel gene expression data to previously published DNA sequence and gene expression datasets (Johnsson et al. 2012(Johnsson et al. , 2014, to find genomic and transcriptomic associations to various aspects of comb morphology and size. In doing so, we confirmed the presence of a large effect QTL associated to a sizerelated comb trait (surface area) but also identified novel QTL associated to comb shape (outline) and comb finger number. We show that digitized photographs of highly variable chicken combs can be used to extract estimate trait measures for subsequent QTL analyses. These results add on to an ever-growing understanding of the comb serving as a sexual ornament in the chicken system.

Advanced intercross lines and genetic markers
The chicken combs used in this study were obtained from an F8 population generated by crossing a line of selected White Leghorn maintained from the 1960s and a population of Red Junglefowl obtained from Thailand. A genetic map, consisting of 651 informative SNP markers, was used in conjunction with 326 F8 individuals (201 males and 125 females) for the interval mapping. These markers resulted in a map of length $9,200 cM, with an average marker spacing of $16 cM. Of these markers, 551 were fully informative (differentially fixed between the parental populations), with the markers selected so as to be evenly spread over the genome (as reflected in the average marker spacing) (Wright et al. 2010(Wright et al. , 2012. This marker density is greater than the recommended 20-30 cM marker intensity for QTL analyses (Darvasi and Soller 1994). A more detailed description of the intercross can be found in Johnsson et al. (2012).

Gene expression
RNA was isolated from tissue at the base of the comb for 39 male individuals. Adult comb base tissue was used, as previously when assessing quantitative differences in comb size adult comb base tissue showed strong correlations with comb size (Johnsson et al. 2012). Double-stranded cDNA was labeled and hybridized to NimbleGen 12 Â 135k custom gene expression arrays (Roche NimbleGen). Gene expression data used for the interval mapping can be found in Supplementary Table 1. A detailed description of how the gene expression data was generated and processed can be found in Johnsson et al. (2014), who investigated a subset of probe sets associated with the studied comb mass QTL.

Comb phenotypes
Chicken combs were dissected and stored under À20 C prior to analysis. To extract comb phenotype data, a Nikon digital camera (D3100) with an 85-mm lens was used to generate digital images. Combs were laid down flat and pictures were taken under the same conditions for each sample within a white photo-tent and along-side a white color checker that was used as a length scale. Comb length, height, area, and the number and size of fingers were measured using the measure and selection tools in IMAGEJ v1.51 (Schneider et al. 2012) (Fig. 1). To extract comb outline measures, images were first converted to black and white and then inverted so as to have the combs as black objects on a white background prior to being imported into the R package MOMOCS, which was then used to extract comb outlines as x and y coordinates (Bonhomme et al. 2014). Phenotype measurements and genotypes used in this study can be found in the QTL cross file used for the QTL interval mapping (Supplementary Table 2).
Comb outlines were centered, scaled, and aligned before performing a final generalized Procrustes alignment. Procrustes alignment and Fourier analysis require equal numbers of coordinates for each outline; hence, all outlines were interpolated to select 16,570 coordinates in each (derived from the comb with the highest number of coordinates). Fourier coefficients were derived by running an Elliptical Fourier Analysis (EFA) on aligned outline data in MOMOCS, which did not require further normalization. In EFA, shape is defined with Elliptic Fourier Descriptors, i.e. number of cosine waves (harmonics) required to describe variation of the comb outline (Campana and Casselman 1993;Crampton 1995). In this case, 26 harmonics were sufficient to accurately describe comb outlines (>99% harmonic power as estimated using MOMOCS) and produced 200 coefficients, which were further analyzed by principal component analysis (PCA). The first principal component (PC1) explained $35%, while the second principal component (PC2) explained $14% of the variation. The highly variable comb shapes resulted in PC axis-specific descriptions of different aspects to comb shape but to utilize axes with the highest proportion of variation explained, we used only PC1 and PC2 ( Supplementary Fig. 1).

QTL and eQTL analyses
The R package, R/qtl, was used to estimate LOD scores of individual markers to phenotypic measurements based on additive and dominance models with sex, batch, population structure, and weight as fixed differences (Broman et al. 2003;R Development Core Team 2013). These were run additionally using sex as an interacting covariate to account for QTL varying between the sexes. LOD score significance thresholds were estimated via permutation tests for each phenotype, by shuffling individual labels and calculating LOD scores 1,000 times to produce null distributions. Scores were considered significant if they lay above the 0.95th quantile of the randomized distributions and suggestive if they laid above the 0.80th quantile (corresponding P values of 0.05 and 0.2, respectively). A suggestive threshold was used as per (Lander and Kruglyak 1995), though we used a 20% genomewide threshold, which is slightly more stringent than the 1 false positive per scan method utilized by Lander and Kruglyak. Gene expression data for over 35 thousand probe sets were obtained using a 135K-Affimetrix micro array and were used as phenotypes in an interval mapping analysis. Here, batch was used as a fixed factor in the model. LOD thresholds were estimated as above, using permutations. As the genomic positions of the probe sets are known, cis (or local) eQTL were assigned if the probe set lay within 50 cM of the marker exhibiting the highest LOD score. All others were assigned as trans eQTL. Confidence intervals for QTL and eQTL were estimated using a 1.8 LOD drop method (i.e. where LOD scores decrease by 1.8 in both directions of the LOD peak) and is comparable to a 95% confidence interval for intercross populations (Manichaikul et al. 2006).

Overlapping confidence intervals between QTL and eQTL
To establish which QTL and eQTL contained overlapping confidence intervals, we used the GENOMICS RANGES package in R (Lawrence et al. 2013). Overlaps were conducted using the genomic positions on the galgal6 version of the chicken genome. Genomic positions were extracted by using the closest marker to the start and end of each QTL and eQTL confidence interval. Furthermore, gene expression of identified eQTL was correlated with overlapping QTL phenotype measurements using to cor.test function with Pearson's method in R. To test if overlaps between QTL and eQTL were enriched (i.e. significantly higher than expected by chance), we conducted a permutation test by selecting 10 random regions (representing QTLs with average QTL CI size) and 1,184 random regions (representing eQTLs with average eQTL CI size) from a range the size of the chicken genome and counting the number of overlapping regions between these 2 sets. This was repeated 10,000 times to produce a random distribution against which the observed value was compared.

Overlapping QTL confidence intervals with selective sweeps
To additionally validate the biological significance of our QTL, we tested whether any of our findings overlapped with previous detections of selective sweeps (Qanbari et al. 2019), which were identified by comparing Red Junglefowl genomes to those of several domestic modern chicken breeds. As in the original study an older version of the chicken genome assembly was used (Galgal4), we converted this data to match that of the newest genome assembly (Galgal6) with the NCBI REMAP tool and used the GENOMIC RANGES R package to identify overlapping regions between detected QTL confidence intervals and selective sweep regions (Supplementary Table 9 from Qanbari et al. 2019). A permutation test as detailed above was conducted to test for enrichment in overlaps between selective sweeps and QTL. Here, 645 random regions, representing selective sweeps with average selective sweep length, were selected from a hypothetical genome the size of the chicken genome, and were overlapped with random QTL regions (see above) 10,000 times.

Phenotypic measurements of the chicken comb
Phenotypic measurements of finger number, finger size, surface area, height, width, and PCA of outline coordinate data of all 326 F8 individuals can be found within the cross file used in R/qtl for interval mapping (Supplementary Table 2). PCA of comb outline data revealed the presence of 2 main shape components along PC axes 1 and 2, with PC1 targeting the anterior comb shape and PC2 targeting the dorsal "roughness" caused by comb fingers (Fig. 2). For a representation of explained variation by PC axes 1-10, see Supplementary Figs. 1 and 2, where mean shapes with standard deviations for each PC axis, as well as variation explained by each are presented.

Comb QTL
QTL interval mapping results are presented in Table 1. In total, we identified 10 significant and/or suggestive QTL associated to the various trait measurements (3 to comb area, 1 to comb outline, 1 to finger number, 1 to finger length, 2 to comb length and 2 to comb height max). The QTL with largest effect identified in this study is located on chromosome 3 and is associated to comb area and height (i.e. size-related traits; Table 1). Additional information on QTL results, including genomic positions and genes present within confidence intervals, can be found in Supplementary Table 3.

Gene expression eQTL
We identified 1,184 gene expression eQTL in comb base tissue (660 trans and 524 cis). All significant and suggestive cis and trans eQTL can be found in Supplementary Table 4.
found in Supplementary Table 5. To prove significance, a higher number of overlapping regions were found than by chance alone, showing overlap enrichment between eQTL and QTL identified in this study (permutation test; 10,000 iterations; 1-tailed P-value <0.007). Only significant correlations between gene expression and QTL phenotype are presented in Table 2 and only include correlations with comb surface area. All correlations between gene expression and QTL phenotypes can be found in Supplementary Table 5.

Overlaps between QTL and selective sweeps
In total, 99 [out of 645 selective sweep regions identified in Qanbari et al. (2019)] were found within confidence intervals of detected QTLs from this study. These included 23 within comb length and 78 within finger number QTL (Supplementary Table  6). A comparison of the observed number of overlaps between sweep regions and QTL with the number of overlaps between random regions showed a high degree of overlap enrichment (permutation test; 10,000 iterations; 1-tailed P-value <0.0001).

Discussion
We identified several QTL associated with comb size, comb finger number, and comb outline coordinate data. The confidence intervals of these QTL overlapped with those of several eQTL, among which a gene related to bone regeneration was identified. However, significant correlations between gene expression and trait measurements were only found in the case of comb area. These confirm and are a continuation to previous associations found between comb, reproduction, and several bone allocation related genes (Johnsson et al. 2014), further highlighting how this trait may function as a sexual ornament in the chicken system.  These bone-related candidate genes are of particular relevance, given that cartilage is a precursor to bone formation (Johnsson et al. 2012(Johnsson et al. , 2014. Similarly, strong phenotypic and genetic correlations have been found in this intercross between comb mass, bone allocation, and fecundity. A novel QTL on chromosome 6 was found associated to PC1 of our comb outline coordinate data and 2 novel QTLs on chromosome 2 were found associated to finger number and finger size. To the best of our knowledge, no genomic regions were found associated to these comb traits before. The comb shape outline QTL overlapped with 3 annotated eQTL (DCDC2, TNFRSF1, and RXFP1). The encoded protein of RXFP1, relaxin receptor 1, has strong effects on sperm motility in males, and parturition in females (Park et al. 2005;Ferlin et al. 2012). Some biological processes associated to DCDC2 are cilium assembly, neuron morphogenesis, and sensory perception of sound (Grati et al. 2015;Girard et al. 2016). Mutations in TNFRSF1 underlie tumor necrosis factor receptor-associated periodic syndrome (TRAPS) and with multiple sclerosis in human patients (Ottoboni et al. 2013). eQTL-QTL overlaps were largely not biased between cis and trans eQTL, except in the case of comb shape where only trans eQTL were present. This adds on to the validity of this finding as they were found associated to gene expression patterns within the same genomic region. Further, several eQTL had overlapping confidence intervals with finger number and size QTL on chromosome 2. Of notable interest, WIPF3 codes for the WAS/WASL protein family, which are corticosteroids and may potentially be involved in spermatogenesis [estimated by gene similarity on Uniprot.org and indicated in Roy et al. (2017)]. Interestingly, the gene CP450CC24, within the overlap region with finger number, has been associated to bone regeneration specifically in studies conducted on White Leghorn chickens (Seo and Norman 1997). In addition, the CI of finger number and comb length QTL overlapped with several selective sweep regions identified in Qanbari et al. (2019), indicating that comb, bone, and reproductive investment traits along with these genomic regions are likely experiencing enhanced selection in domesticated chicken breeds. One caveat with all the above gene candidates is that these are found by correlational analyses, and full functional assays are required before one can be certain that these do indeed modify comb morphology.
In a previous study, a large effect QTL was found associated to comb weight (Johnsson et al. 2012). Moreover, this QTL was found in close vicinity to 2 genes, HAO1 and BMP2, which affect bone strength and egg laying capacities. Here, we find this same region associated to comb surface area with the expression of HAO1 being significantly correlated to comb surface area measurements. This result is not surprising as comb weight was positively associated to HAO1 expression in a previous study (Johnsson et al. 2014) and comb surface area is expected to be correlated with comb weight.
To summarize, we present novel putative QTL associated to different aspects of comb shape, namely, finger number and size and shape outline coordinate data. We demonstrate a significant enrichment in the overlap between gene expression probe sets and QTL with the implication of several genes being associated to QTL phenotypes, including RXFP1 and CP450CC24. Furthermore, overlapping regions between selective sweeps tied to domestication and comb length and finger number QTL would suggest that the entailed genes are putatively under selection. This would imply that these phenotypes and associated genomic regions have undergone or are currently under anthropogenic selection. Whether these same genomic regions are under selection in wild populations warrants further investigation by conducting genome-wide association studies and selective sweep scans in feral chicken populations.

Acknowledgments
Computational resources were supplied by the project "e-Infrastruktura CZ" (e-INFRA CZ LM2018140) supported by the Ministry of Education, Youth and Sports of the Czech Republic and the National Genomics Infrastructure in Genomics Production Stockholm funded by Science for Life Laboratory, and the SNIC/ Uppsala Multidisciplinary Centre for Advanced Computational Science for assistance with massively parallel sequencing and access to the UPPMAX computational infrastructure. Sequencing was performed by the Stockholm and Uppsala Sequencing Centers.

Author contributions
DW and RH conceived the study. This study was funded by funding allocated to DW. VB led the data analysis and manuscript writing, with all authors contributing to the writing. AH and MLMC assisted in data analysis techniques with R scripts. All authors contributed to the interpretation of results and reviews of the manuscript.