Genome-Wide Expression Patterns and the Genetic Architecture of a Fundamental Social Trait

Explaining how interactions between genes and the environment influence social behavior is a fundamental research goal, yet there is limited relevant information for species exhibiting natural variation in social organization. The fire ant Solenopsis invicta is characterized by a remarkable form of social polymorphism, with the presence of one or several queens per colony and the expression of other phenotypic and behavioral differences being completely associated with allelic variation at a single Mendelian factor marked by the gene Gp-9. Microarray analyses of adult workers revealed that differences in the Gp-9 genotype are associated with the differential expression of an unexpectedly small number of genes, many of which have predicted functions, implying a role in chemical communication relevant to the regulation of colony queen number. Even more surprisingly, worker gene expression profiles are more strongly influenced by indirect effects associated with the Gp-9 genotypic composition within their colony than by the direct effect of their own Gp-9 genotype. This constitutes an unusual example of an “extended phenotype” and suggests a complex genetic architecture with a single Mendelian factor, directly and indirectly influencing the individual behaviors that, in aggregate, produce an emergent colony-level phenotype.


Introduction
Considerable interest surrounds the genetic architectures underlying fundamental adaptive traits in wild populations [1][2][3][4][5]. In social organisms, such interest centers on the numbers and types of genes directly regulating expression of the individual behaviors that, in aggregate, create social organization, as well as genes in interactants that indirectly influence expression of socially relevant behaviors by altering the social environment [6][7][8][9][10][11][12]. This indirect influence is mediated by interactions of the genotype of a given individual with those of other group members who collectively comprise the social environment. Information on the genetic architecture of social organization is essential to constructing realistic models of social evolution that can answer questions about the numbers and types of genetic changes necessary to change a solitary to a social animal or to convert a simple society to a large and highly complex one [13].
A remarkable case of a fundamental social polymorphism that appears to be under simple genetic control (single Mendelian factor of large effect) is variation in colony social organization in the fire ant Solenopsis invicta. In this species a single genomic element marked by the protein-encoding gene Gp-9 is implicated in the production of two distinct types of queens that differ in physiology, fecundity and behavior [14][15][16][17][18][19]. This genetic factor also determines whether workers tolerate a single fertile queen (monogyne social form) or multiple queens (polygyne social form) in their colony. Colonies containing only homozygous Gp-9 BB workers accept only a single queen, whereas colonies containing both Gp-9 BB and Gp-9 Bb workers invariably accept multiple queens, but only those bearing a Gp-9 b haplotype [20][21][22][23]. The near complete absence of adult workers and queens with a bb genotype stems from the deleterious effects associated with the genomic region marked by the b allele, inducing homozygous females to die shortly after they eclose from the pupa [22,24]. The monogyne and polygyne social forms also differ in a number of important reproductive, behavioral, and life history traits besides colony queen number [25,26], differences that are also completely associated with differences at the genomic region marked by Gp-9. In contrast, there is a complete lack of differentiation at genes not tightly linked to Gp-9, presumably because frequent matings between sexuals from sympatric monogyne and polygyne colonies result in extensive gene flow between the forms [22,[27][28][29].
Colony queen number in S. invicta is regulated by the workers, which collectively decide which and how many queens from within or outside the colony are recruited as new egg-layers [21], largely on the basis of chemical signals emanating from the queens [20]. Workers in monogyne colonies (all of which possess the BB genotype) accept only a single replacement queen that must also bear genotype BB, whereas workers in polygyne colonies accept multiple queens, each of which must possess the b haplotype. Significantly, the presence of as few as 5-10% of workers with the b haplotype induces the entire colony worker force, including BB workers, to become tolerant of multiple Bb queens and thus display the polygyne social phenotype [23]. Thus the genomic region marked by Gp-9 exerts indirect genetic effects [9], in that the presence of the b variant in a colony induces changes in the social behavior of all colony members, even those lacking the b haplotype. Although the identity of the product of Gp-9 as an odorant binding protein (OBP) and other lines of evidence suggest that the gene may play a direct part in regulating social organization via a role in chemical communication, it remains an open question whether other genes tightly linked to Gp-9 (possibly locked up in an inversion with it) are also involved [22]. The first aim of this study was to investigate whether variation in the genomic region marked by Gp-9 is associated with differences in patterns of expression of genes other than Gp-9 in workers. The second aim was to study how the social environment (i.e., presence or absence of nestmate workers with the b allele) can alter individual gene expression patterns. To answer these questions and begin to address the issue of how variation at a single Mendelian factor can directly and indirectly affect gene expression to produce a complex colony-level phenotype, we employed a fire ant microarray platform representing some 10,000 genes [30].

Results/Discussion
To determine the effect of Gp-9 genotype on gene expression in focal individuals, we compared expression profiles between BB and Bb adult workers from 20 polygyne S. invicta colonies. This comparison revealed 39 genes consistently differentially expressed between workers of the two genotypes, of which about two-thirds were more highly expressed in Bb than BB workers (Figure 1; see also confirmation of microarray data by quantitative RT-PCR [qRT-PCR] in Text S1 and Table S1). Sixteen of these genes did not significantly match any sequence in the public databases (BLASTX, threshold E-value = 1e-5), ten matched predicted proteins of unknown function, and the remaining 13 matched genes with a known or inferred function (Table 1).
Three gene categories were overrepresented among the genes differently expressed between workers of alternate genotypes (Table 1  and Table S2; all P,0.05). The first two are the allergen and odorant binding protein categories, which collectively include five genes likely to contribute to chemical signaling and response, the essential components of regulation of colony queen number and social organization. In ants, venom allergens are proteins released from the venom sac, an organ that, in queens, appears to store and release chemical signals allowing recognition by workers [25]. Similarly, the two odorant binding genes that encode members of the insect OBP protein family (as does Gp-9) and the antennal chemosensory protein may be involved in pheromone transduction, thereby potentially influencing the abilities of workers of the two Gp-9 genotypes to recognize and discriminate among queens. Experiments from other systems are suggestive that changes in the expression levels of OBPs could influence discriminatory behavior by modulating the threshold for a particular response; differential regulation of OBPs has been observed in Drosophila following mating [31], exposure to starvation stress [32], and alcohol tolerance development after exposure to alcohol [33]. Additionally, genetic and biochemical evidence suggests that OBPs may interact combinatorially in odor discrimination [34,35].
The third overrepresented category comprises two transposons, which are of special interest with respect to properties that may be shared between the genomic region including Gp-9 and regions containing the sex-determining genes in species with sex chromosomes [36]. The b haplotype is found only in the polygyne social form, just as the Y chromosome is found only in males in species with male heterogamety. By analogy with the Y chromosome, theoretical predictions and empirical observations suggest that the Gp-9 b region should (i) accumulate genes beneficial in the polygyne social environment (as the Y chromosome accumulates genes beneficial to male function [37]), (ii) evolve reduced recombination to preserve associations of genes advantageous for polygyny (as occurs for genes advantageous to males on the Y chromosome [38,39]), and (iii) accumulate deleterious alleles and transposable elements (because of reduced recombination [38,39]). Consistent with these expectations, the genomic region marked by Gp-9 is characterized by low recombination [40,41], the b haplotype is a homozygous lethal [15,24,40], and the piggyBac-like transposon, which is differentially expressed between workers of alternate genotypes, appears to occur almost exclusively in individuals possessing haplotype b (data not shown). Thus the strong expression of at least this transposon in b-bearing workers, which constitutes the most extreme expression difference among the 13 genes with annotated matches (Table 1), likely reflects its unique insertion in b haplotypes. While this distribution could signify that the piggyBac-like transposon directly affects the differential expression of other candidate genes in BB and Bb workers, we note that, consistent with earlier protein electrophoresis data [40], no significant difference in the expression levels of Gp-9 was detected between workers of the two genotypes; therefore, whatever elements control the differential expression of genes in parallel with Gp-9 genotype appear not to regulate the expression of Gp-9 itself.
To determine the indirect effects of colony Gp-9 genotype composition as well as other aspects of the social environment on worker gene expression, while controlling for individual Gp-9 genotype, we compared profiles of adult workers bearing genotype BB between 20 polygyne and 20 monogyne colonies. This comparison revealed 91 genes consistently differentially expressed between workers of the alternate forms, of which over threequarters were more highly expressed in polygyne than monogyne workers ( Figure 2; see also confirmation of microarray data by qRT-PCR in Text S1 and Table S1). Forty-five of these genes did not significantly match any sequence in the public databases (BLASTX, threshold E-value = 1e-5), 13 matched predicted proteins of unknown function, and the remaining 33 matched previously annotated genes ( Table 2).
Three gene categories (mitochondrial, prefoldin complex, and viral genes) were overrepresented among the genes that were differentially expressed between BB workers from monogyne and

Author Summary
Fundamental research goals for scientists interested in social evolution are to determine the numbers and types of genes that directly regulate individual social behaviors as well as to understand how the social environment indirectly influences the expression of socially relevant traits. The fire ant Solenopsis invicta features a remarkable form of social variation in which the occurrence of two distinct social types that differ in colony queen number is associated with genetic differences at a genomic region marked by the gene Gp-9. Our analyses of gene expression profiles in fire ant workers revealed that differences in Gp-9 genotype are associated with the differential expression of an unexpectedly small number of genes, many of which are predicted to function in chemical communication relevant to the regulation of colony queen number. Surprisingly, worker gene expression profiles are more strongly influenced by indirect effects associated with the social environment within their colony than by the direct effect of their own Gp-9 genotype. These results suggest a complex genetic architecture underlying the control of colony queen number in fire ants, with a single Mendelian factor directly regulating, and the social environment indirectly influencing, the expression of the individual behaviors that, in aggregate, yield an emergent colony social organization.
polygyne colonies ( Table 2 and Table S2; all P,0.05). The 11 genes encoding mitochondrial and prefoldin complex (molecular chaperone) proteins were all up-regulated in polygyne compared to monogyne workers. Increased mitochondrial gene expression may reflect increased oxidative metabolism, while increased prefoldin expression may indicate higher protein synthesis rates, possibly in relation to the relatively smaller size and higher metabolic rates of polygyne workers [42].
The pattern of expression of the six genes in the viral gene category is consistent with the expectation that differences in social organization affect susceptibility to pathogens and parasites. In the monogyne form, there is intense selection against susceptible infected individuals during independent colony founding, a stage that colonies of the polygyne form never display [26]. Accordingly, we found that workers in the polygyne form express more sequences corresponding to viral genes than their counterparts in the monogyne form, presumably because of relaxed selection and generally greater susceptibility in the former (see also [43]). Based on sequence similarity and correlated expression across our experiments, we identified six gene products that likely represent three different viruses, a ssRNA negative-strand (2) virus and two ssRNA positive-strand (+) viruses, one of which is the SINV-2 virus [44]. All 20 polygyne study colonies showed evidence of infection with at least one virus (mean number of viral types per colony, 2.560.67), whereas only three of the 20 monogyne colonies showed evidence of infection, in all cases by a single viral type. Finally, the pattern of expression of another socially-regulated gene, this one encoding a defensin (a class of small protein antibiotics active against viruses, bacteria, and fungi [45]), also is consistent with greater selection for resistance in the monogyne form, as this gene was more highly expressed in monogyne than polygyne workers.
The numbers of genes differentially expressed in the genotype (39) and social form (91) comparisons are relatively low compared to the numbers expected based on other published microarray experiments. There are several possible explanations for this. First, the use of whole worker bodies as the source of RNA may decrease the probability of detecting genes whose level of expression varies among cells or tissues. Second, our comparisons were performed on groups of workers originating from different colonies, thus adding a colonylevel effect to, and thus increasing the total variance in, gene expression. Finally, workers of alternate genotype or social form apparently exhibit fewer phenotypic differences than queens [20,21,46,47], possibly reflecting the involvement of fewer differentially expressed genes in the former caste.
Remarkably, there was almost no overlap between genes whose level of expression was influenced by the focal workers' Gp-9 genotypes and genes whose expression was influenced by the social environment, with only one of the 129 differentially expressed genes appearing in both categories. This demonstrates an almost complete decoupling of the direct effects of the genomic region marked by Gp-9 and the indirect effects mediated by social interactions within colonies. Moreover, there is little indication of an interaction between these direct and indirect effects; genes  Table S1 for confirmation of selected gene expression results with quantitative RT-PCR (qRT-PCR). doi:10.1371/journal.pgen.1000127.g001 whose levels of expression depend on the Gp-9 genotype of focal individuals generally are expressed at similar levels in the two social forms when genotype is held constant (Figure 1), whereas genes whose levels of expression depend on the indirect influence of the social environment almost always are expressed at similar levels in polygyne workers of different Gp-9 genotypes (Figure 2; see also Figure S1).
This study reveals that variation at the S. invicta genomic region marked by Gp-9 is associated with the differential expression in workers of a relatively small number of genes that, with the exception of the piggyBac-like transposon, presumably are unlinked to Gp-9. A high proportion of these differentially expressed genes have putative functions implying a role in chemical signaling and behavior relevant to the regulation of colony queen number and, therefore, these genes may have a primary function in determining social organization. The number of such genes is unexpectedly low given the profound behavioral, physiological, and life-history differences between the two social forms and the fact that widespread changes in gene expression patterns can be observed after just a few generations of selection [48,49]. A perhaps more surprising finding is that worker gene expression profiles are significantly more strongly influenced by indirect effects associated with the Gp-9 genotypic composition within their colony than by the direct effect of their own Gp-9 genotype (chi-squared test, P,0.001), with the indirect-effect genes largely implicated in the secondary differences in colony social characteristics expected between the forms. While several studies have demonstrated that the social environment can modulate gene expression [50][51][52][53], and others have revealed indirect genetic effects on phenotypes or levels of gene expression [10,[54][55][56][57][58][59][60], this is the first example of a naturally occurring polymorphic Mendelian element that affects gene expression in other group members. The finding of a complex genetic architecture directly and indirectly influencing the individual behaviors that, in aggregate, generate a fundamen-tal colony-level social phenotype represents an unusual example of an ''extended phenotype'' [61].

Sample Collection
Colonies of S. invicta were collected near Athens, Georgia (eight polygyne and eight monogyne colonies in 2004; eight polygyne and eight monogyne colonies in 2006) and near Hammond, Louisiana (four polygyne and four monogyne colonies in 2006), USA. All colonies were returned to the laboratory and reared for one month under standard conditions [62]. We determined the social form of each study colony using several lines of evidence. Nest density, worker size distribution, and nest brood composition were used to make initial identifications of social form in the field (see [63]). Subsequently, polygyny was confirmed for each suspected polygyne colony by discovering two or more wingless inseminated (reproductive) queens, while monogyny was confirmed in each suspected monogyne colony by discovering a single, highly physogastric, wingless inseminated queen. The social form of each colony was further substantiated by electrophoretically detecting the b allele of Gp-9 in pooled samples of 20 female inhabitants of each polygyne colony and failing to detect the allele in such samples from each monogyne colony (the b allele is completely diagnostic for polygyny in S. invicta in the USA [14,40,63]).  ants were homogenized in Trizol (Invitrogen) using a Fastprep bead shaker, and DNA and RNA were extracted using the manufacturer's recommended protocol. For the 2006 samples, individual ants were homogenized in RLT+ buffer, and DNA and RNA were extracted using the AllPrep RNA/DNA Mini Kit (Qiagen). An RFLP analysis was used to determine the Gp-9 genotype of each individual in polygyne colonies [14]. We pooled the RNA from 7-10 BB workers and 7-10 Bb workers from each polygyne colony. Although bb workers generally are rare due to deleterious effects associated with the genotype [24,40], we found 13 such workers from eight colonies. Two pooled bb samples were created, one for 2004 (nine workers from four colonies) and one for 2006 (four workers from four colonies). These samples were hybridized (after amplification) but not included in the statistical analysis due to the small number and pooling of individuals across colonies. RNA from ten workers from each monogyne colony (all with genotype BB) was pooled by colony. Experimental samples were labeled with Cy3 and were hybridized against Cy5-labeled ''common reference'' RNA on our custom-made spotted cDNA microarrays. We employed a common reference design because not all samples provided  Table S1 for confirmation of selected gene expression results with qRT-PCR. doi:10.1371/journal.pgen.1000127.g002 enough amplified RNA for multiple hybridizations (e.g., for loop designs) and because this allowed within-form comparisons of polygyne genotypes and between-form comparisons of BB workers. All experimental samples were labeled in Cy3, allowing for unbiased comparisons. We used two different batches of reference RNA. For the 2004 samples, we pooled 25% of the amplified RNA from each experimental sample. For the 2006 samples, we amplified total RNA isolated en masse from hundreds of adult workers collected from the foraging area of 30 colonies (eight and seven of each social form from Georgia and Louisiana, respectively). The microarrays were made from 22,560 independent cDNAs generated from a fire ant expressed sequence tag project and are estimated to represent 11,864 different genes [30]. Two different batches of microarrays were used, one set printed in 2004 and the other in 2006. For both batches, only the 18,438 spots yielding a single PCR product (representing 9,722 putative genes) were considered in the analyses. Images of the competitive hybridization were obtained with an Agilent Technologies Scanner. The signal intensities for each spot Table 2. Genes differentially expressed between S. invicta workers from monogyne and polygyne colonies bearing the BB genotype at Gp-9 that significantly match annotated genes in public databases a .

Fire ant gene
Putative gene product of best match b E-value Gene category were extracted from the images using GenePix software. After scanning, bad spots were flagged and the background-subtracted median foreground values were used as the intensity levels in the subsequent analysis. All spots with a positive intensity were considered for the subsequent analyses (i.e., no threshold filtering was used). Raw intensity data were converted to normalized log2 ratios using ''print-tip specific'' loess normalization (within arrays; marray Bioconductor package, R [66]). Selected gene expression results were confirmed using qRT-PCR (see Text S1 and Table S1). Primers used for qRT-PCR are listed in Table S5.

Data Analysis
For the genotypic analysis, we tested for differential expression of each gene between samples of BB and Bb workers in the 20 polygyne colonies using a 2-factor mixed-model ANOVA of the form: where Y, representing the reference/sample log-transformed ratio for a spot, is the sum of effects. The symbol m represents the overall average log-transformed ratio for a given spot over all experiments. BATCH is a random effect (denoted by ,) with two levels, batch_2004 and batch_2006, that accounts for the variation between hybridizations performed in the two different years (this ''year'' effect also encapsulates the effects of two different batches of microarrays and of distinct reference RNAs). The term GENOTYPE captures the gene expression changes that are attributable to the BB and Bb genotypes. Finally, e represents the measurement error. We did not include data for the bb workers in the statistical analysis due to the small number of samples. However, these samples yielded expression profiles that appeared similar to those of Bb workers (but with even more marked differences from the profiles of BB workers, data not shown).
For the social form analysis, we tested for differential expression of each gene between BB samples from 20 monogyne and 20 polygyne colonies by using the same 2-factor mixed-model ANOVA, but with the variable SOCIAL FORM (monogyne or polygyne) replacing the variable GENOTYPE.
Analysis of variance calculations were performed in R. For the genotype comparison, 4,005 clones were removed from the ANOVA analysis because there were not enough data points for the F-statistic calculations (for example, for a given clone all the batch_2004 samples for GENOTYPE BB had negative intensities and/or were flagged). However, because many genes are represented by multiple independent clones, 95.5% (9,288/9,722) of the putative genes on the microarray were present in the 14,433 clones used in the analyses. Similarly, for the social form analysis 3,791 clones were removed from the ANOVA analysis, with the remaining 14,647 clones representing 96.9% (9,419/9,722) of the putative genes.
We restricted our analyses to the 73 and 139 cDNA clones that satisfied a false discovery rate (FDR) of 10% for the genotype and social form comparisons, respectively [67]. Duplicated clones on the microarray, independent cDNA clones representing the same gene, and sequences with greater than 95% sequence identity were merged and averaged, resulting in 39 genes with significantly different expression in the genotype comparison and 91 in the social form comparison. Expression levels presented in the figures are modified from the loess-normalized log2 expression ratios. For each gene, the batch effect (derived from the ANOVA calculations) was first subtracted from the loess-normalized log2 expression ratio. Then, the batch-adjusted expression ratios were normalized to the average across all experiments (including the two bb hybridizations).
Statistical significance of the expression differences detected by the ANOVA calculations was additionally evaluated by means of non-parametric Mann-Whitney tests conducted on the normalized, batch-adjusted data ( Figure S1). Expression differences between polygyne workers of different Gp-9 genotype (BB, Bb) were confirmed to be highly significant for all 39 genes identified by the ANOVA (all P,0.002), as were expression differences between BB workers of different social form for all 91 genes identified by the ANOVA (all P,0.002). In contrast, among the 39 genes influenced by Gp-9 genotype, only seven (18%) showed significantly different expression between monogyne and polygyne workers with the BB genotype (0.001,P,0.041), while among the 91 genes influenced by social form, only three (3.3%) showed significantly different expression between BB and Bb workers of the polygyne form (0.0001,P,0.047) (see Figure S1). Given the large number of these tests performed, some 5% of the significant results are presumed to represent Type I errors.
Expression data were hierarchically clustered and examined using Cluster and Treeview [68]. We also performed SOM (self-organizing map) clustering of the experimental samples (by array) for both the genotype and social form comparisons (data not shown). For the genotype comparisons, the samples clustered into two distinct groups according to genotype (BB and Bb) and no additional group was uncovered. Similarly, for the social form comparison, all the monogyne samples clustered together while the polygyne samples separated into two groups, those with high and those with low levels of viral gene expression. Because this analysis did not reveal any striking new patterns, the results are not presented in detail.

Annotation of Differentially Regulated Genes
Because previous annotations of the genes represented on the fire ant microarray [30] may be outdated, we performed new similarity searches against the non-redundant protein sequence database using the BLASTX algorithm [69,70]. All comparisons were performed on the Blast Network Service provided by the Swiss Institute for Bioinformatics (release July 17, 2007). The default settings were used with an E-value threshold of 1e-5, except where otherwise indicated. The accession number of the best match for each gene is reported in Tables 1 and 2, except when it was an Apis mellifera gene derived from the Ensembl automatic annotation. In this case, we chose the next best hit, because little is known about gene function in A. mellifera, and the genome of this species has been removed from the current Ensembl releases. All BLASTX matches with E#1 (but limited to the top 20) are listed in Tables S3 and S4. Each fire ant gene was also manually assigned to a descriptive category (Tables 1 and 2 and Text S1). The category putatively encapsulates the general function of each gene and is subjectively derived from examining the SwissProt or Ensembl database entries of the five best hits (all E,1e-5), with an emphasis on Gene Ontology, Interpro, and PANTHER annotations. To determine which categories were overrepresented in each set of differentially expressed genes, we used a one-tailed hypergeometric test implemented in R [71,72].
Gene expression data meet Minimum Information About a Microarray Experiment (MIAME) standards and have been deposited at Gene Expression Omnibus (http://www.ncbi.nlm. nih.gov/geo/) with accession number GSE11694. Figure S1 Results of Mann-Whitney statistical tests for gene expression differences. Found at: doi:10.1371/journal.pgen.1000127.s001 (0.03 MB PDF)