Introduction

In the social honey bee Apis mellifera, two alternative female phenotypes, long-lived fertile queens and short-lived sterile workers are produced via differential feeding with a diet known as royal jelly (RJ)1,2,3. This complex, still poorly understood nutrition contains various ingredients4,5 including carbohydrates, vitamins, unusual lipids, antimicrobial agents, epigenomic modifiers such as histone deacetylases inhibitors (HDACs)6, as well as many other less characterised compounds4,5. The bulk of RJ is formed by several Major Royal Jelly Proteins (MRJPs) that appear to be unique to Hymenoptera2. MRJPs evolved from the insect Yellow protein family that has its origins in bacteria2,7. No relatives of MRJPs or Yellow proteins have been found in modern vertebrates, but a Yellow-like protein is encoded by the genome of a chordate Branchiostoma floridae (GenBank XP_002607604). The remarkable developmental potency of RJ has been attributed to a synergistic effect of many if not all of its components acting as activators of signalling pathways via threshold based changes in metabolic flux and epigenomic modifications8,9,10,11. This view has been somewhat obscured by the 2011 study claiming that one of the MRJPs, labelled Royalactin, is capable on its own to drive all the changes needed to make a queen bee12 effectively reducing the entire process of queen development to the vagaries of one protein. Furthermore, treating Drosophila with Royalactin appears to increase body size and ovary development in female flies with the Canton S genetic background12. In order to reconcile those findings with prior evidence implicating DNA methylation in queen development13,14, Kamakura conducted DNA methylation analysis in the genomic region encoding AmEGFR. In one of the Figures (S34)12 he shows that “the overall level of methylation of amegfr in larvae reared with RJ (5%), which develop into queens, was decreased as compared with that in larvae reared with 40-30d RJ (57%), which emerge as workers. Similar results were observed in queen larvae and worker larvae reared in hive”12. Since this result stands in stark contrast to the absence of amegfr methylation in published honey bee methylomes9,15,16, including larvae of the same age as in the Kamakura paper12, we have conducted detailed analyses to re-examine the methylation status of egfr in Apis mellifera. We show not only that amergf is never methylated but has a high GC content that is consistent with non-methylated genes and consider the origins of this unfeasible result in the Kamakura study.

Results

Our desire to meticulously examine the levels of egfr methylation in Apis was inspired by the absence of any methylated CpGs in this locus in the published methylomes from both queens and workers Table 1). In particular, the larval methylomes9 representing queen and workers of the same 96 hrs stage of development as in the Kamakura’s paper12 were indicative of a potential problem. In addition, no methylation of amegfr has been found in several brain methylomes generated by two independent labs15,16. For comparison, we have contrasted sequencing coverage and methylation levels for egfr with a consistently methylated gene nadrin using 11 methylomes representing different tissues, cell types and developmental stages (Table 1). We also have noted that the 736 bp region of amegfr deemed to be methylated in the Kamakura study12 is rich in CpG dinucleotides, which is untypical for methylated genes in Apis that are CpG-depleted because of the known tendency of 5-methyl cytosines to be converted into thymines8,15,17,18. The ratio of observed to expected (o/e) CpG dinucleotides is 1.104 in the 736 bp fragment and 1.324 for the whole gene. Such values are associated with non-methylated genes in Apis8,15 (Fig. 1) that can be easily separated from methylated genes in a bimodal distribution of the o/e frequency of CpGs in all annotated genes. However, given the low sequencing depth of genome-wide methylomes and the possibility that amegfr may be methylated in a restricted number of cell types that could have been picked by chance in the 2011 study, we have decided to reproduce the original low-depth plasmid sequencing experiment using ultra-deep sequencing of the same genomic region12. Unfortunately, we have failed to amplify the 736 bp DNA fragment following the conditions described in the 2011 paper. This is not entirely surprising as it is nearly impossible to amplify DNA fragments of such length from bisulfite treated DNAs. The DNA fragmentation during the treatment19 leads to a practical upper size limit of the PCR amplicon ~400–500 bp and the need for nested primers and a second round of PCR is often essential20. Instead, we used two sets of outer and two sets of nested primers to amplify two overlapping amplicons (A1 = 440 bp and A2 = 441 bp) covering the supposedly methylated region of amegfr. To properly design our amplicons we had to re-annotate the old erroneous EGFR gene model used in the 2011 study (see Figure S1 for details). These amplicons have been processed and barcoded for ultra-deep sequencing with the NEB kit for Illumina libraries. Both libraries A1 and A2 have been sequenced together with additional libraries representing confirmed methylated genes on Illumina MiSeq using the Reagent Kit v3 that generates up to 25 million reads (15Gb) of 2 × 300 bp paired end reads.

Table 1 Sequencing coverage and egfr/nadrin methylation in the honey bee whole-genome bisulfite sequencing projects.
Figure 1
figure 1

CpG (observed/expected) bias of protein-coding regions in the honey bee genome.

The o/e value of 1.104 for the analysed egfr region is indicated with the arrow. The o/e value for the whole gene is even higher (1.324).

As shown in Table 2 even with a depth of over 330,000 reads per amplicon virtually no methylation of EGFR in larval samples has been detected. The methylation pattern counts were normalised with MPFE21 in order to eliminate most of the spurious patterns caused by sequencing errors and incomplete bisulfite conversion. As a result, for both amegfr amplicons A1 and A2, over 99.99% of reads were classified as not methylated. The very few methylated patterns observed in egfr are most likely technical noise and represent less than 0.01% of the total counts. Failure to detect methylation in the egfr region is unlikely to be a PCR artefact because methylated DNA tends to be over-represented in bisulfite sequencing data22. With this depth of sequencing it is likely the methylation patterns in essentially every cell type can be visualised as illustrated by a genuinely methylated gene, nadrin (Fig. 2). In larval samples, 157 distinct methylation patterns were observed in this amplicon after removing the spurious patterns. Whether or not these patterns represent unique epigenetic signatures of 157 cell types in growing larvae remains to be established.

Table 2 Sequencing coverage and egfr methylation level in MiSeq high-throughput amplicon experiment.
Figure 2
figure 2

Methylation patterns in egfr and nadrin revealed by deep amplicon sequencing.

Each row represents a methylation pattern (black: methylated CpGs, white: not methylated CpGs), the height of each pattern is proportional to the pattern’s abundance. Two EGFR amplicons (A1 and A2) were amplified from 96 hrs old queen and worker larvae. One nadrin amplicon was amplified from 96 hrs worker larvae that have been shown in the Kamakura paper to have elevated methylation levels. After normalising pattern frequencies using MPFE21, no methylation was detected in egfr, whereas several distinct and highly abundant methylation patterns are detected in the nadrin amplicon. The pattern proportions are sorted from the most abundant at the top to the least abundant at the bottom.

Discussion

Our findings showing that the honey bee egfr gene belongs to the non-methylated category make the result shown in the Kamakura paper12 impossible to reproduce. There are three lines of evidence questioning the correctness of the original claim. First, on the basis of its high CG content, egfr is bioinformatically predicted to be non-methylated. Second, genome-wide methylation profiles in larvae, brains and embryos from both queens and workers show no evidence of methylation. Third, our new data using ultra-deep amplicon sequencing failed to detect any sign of methylated cytosines in larval samples. At this stage the source of this questionable result is unclear. One possibility is that the original DNA samples were not properly converted with bisulfite thus leading to a false impression that some cytosines in egfr were protected by methyl tag. This problem could have been exaggerated by omitting nested primers that typically are used to improve the recovery of AT-rich methylated amplicons from BS-treated DNA. However, the increase of amegfr methylation only in worker larvae in two situations shown in the Kamakura paper (Figure S34) suggests that such an experimental clarification is unlikely. Whatever the reason for this doubtful result is, our findings have important consequences for understanding how conditional phenotypes are implemented in various lineages by engaging cell surface signalling via EGFR and its growth factor ligands.

While it is not surprising that this important cell surface receptor has been implicated in the queen crafting process in honey bees and growth regulation in Drosophila, it is evident that its regulation is not contingent on DNA methylation. Although a queen bee can be experimentally induced by silencing de novo DNA methylation in newly hatched larvae by means of RNAi approach13, such treatment is pleiotropic and affects hundreds of genes and pathways relevant to nutritional sensing, such as for example, the TOR/insulin network and importantly, Anaplastic Lymphoma Kinase that has the capacity to directly induce downstream events by bypassing insulin signalling9,23. A queen phenotype can also be induced by interfering with the expression of genes belonging to the TOR/insulin network24 and possibly other manipulations affecting nutritional signalling. For example, royal jelly is exceedingly rich in both methionine and sources of methyl groups (choline) and some of its MRJPs are unusually rich in this essential amino acid, providing substrates for methylation activities. In this context, the finding by Grandison et al.25 that in Drosophila, methionine alone was necessary and sufficient to increase fecundity as much as did full feeding, but without reducing lifespan, is striking. Although the effects of RJ on honey bee larval growth are still not fully appreciated, there is little doubt that methylation changes and diet are clearly linked. It may be prudent to more closely examine the relationships between caloric restriction, methylation and foods rich in methionine, acetylcholine with longevity and fecundity in both Apis and Drosophila. Royalactin is simply one of very many components that contribute to network flux. Obviously, it has a defined role in this process, but until all components of RJ are better understood its exclusive role in vivo should be considered with caution.

It is most interesting that egfr methylation has recently been implicated in generating quantitative variation in size of the carpenter ant26 allowing for comparative analyses of a highly conserved EGFR in three insect species in the epigenetic context. Drosophila lost its DNA methyltransferases and has to use other epigenetic mechanisms for EGFR regulation, whereas in two Hymenopterans with the complete DNA methylation toolkit, ants and honey bees, only one recruits methylation to regulate egfr. One possible explanation for such a contrast between ants and honey bees is that methylation in the carpenter ant is utilised as an indirect modulator of egfr for continuous size variations, whereas in Apis such a flexible epigenomic modification of egfr would not be desirable for the proper development of the focal individual that is critical for the colony. These comparative analyses underscore the inherent pitfalls in data transferability between different species at particular levels. In spite of a high level of EGFR conservation, both structural and functional, its context-dependent regulation appears to be driven by distinct mechanisms in insect species. An organism can utilise many different cellular systems to accomplish the same end result as long as it is the desired phenotypic outcome. Indeed, to solve the same problem many different designs can be constructed even from heteromorphic, but functionally similar elements because of the high level of degeneracy in biology27.

In conclusion, we argue that methylation data claimed to be relevant to specific biological processes need to be supported by in-depth analyses using newer, high resolution methods. However, whatever platform one utilizes for methylomic insights, all of them have to be properly scrutinised to answer the critical question: how relevant is the observed differential methylation to the phenomenon under investigation?

Materials and Methods

DNA bisulfite conversion and amplicon preparation

1.5 μg of larval genomic DNA28 was bisulfite converted using the QIAGEN Epitect® Bisulfite Kit, as per the manufacturer’s protocol29. We routinely perform two consecutive treatments to avoid incomplete conversion especially in GC-rich regions. The converted DNA was amplified via a nested PCR reaction with amegfr specific primers (see Table S2 for BS-seq primers). The PCR products were purified utilising Agencourt® AMPure® XP PCR Purification system (Beckman Coulter).

NGS library preparation

Libraries were prepared from 500–600 ng of each amplicon utilising the NEBNext® DNA Library Prep Master Mix for Illumina® and NEBNext® Multiplex Oligos for Illumina® Index Primers Set 1 and Set2 (New England Biolabs). Size selection of adaptor ligated DNA was performed using Agencourt AMPure XP beads (Beckman Coulter), with the bead:DNA ratio of the first bead selection 0.9X, followed by a second bead selection with bead:DNA ratio at 0.2X (not sure about the ratios, I do not think this is necessary). Each library was eluted in 30 μL of 0.1X TE, library size confirmed via agarose gel electrophoresis (we used Caliper LabChip GXII and HT DNA High Sensitivity Assay) and diluted to a final concentration of 4 nM.

NGS MiSeq sequencing

Next generation sequencing was performed on Illumina MiSeq instrument using MiSeq Reagent Kit v3 (Illumina) and 600 cycles. PhiX spike was added at 5% concentration as recommended by Illumina for low-diversity libraries.

Analysis of bs-seq results

For each individual analysed the frequency at which a mCpG occurred was calculated across all reads using custom Python scripts and open-source software. The process comprised of two steps. In the first, pairs of reads with the 30 nucleotide sequence starting at position 4 matching exactly the last 30 nucleotides of the primers used for nested amplicon PCR were extracted from FASTQ files, aligned with in silico bisulfite-converted genomic template using MUSCLE29, overlapping regions (if any) were proportionally truncated and, after removing all aligner-introduced gaps, both reads were combined into one continuous sequence and appended to a to a separate file (“extract file”) for each amplicon and each library/sample. In addition, a quality filter was applied, rejecting all sequences shorter than 90% of the length of the template or containing in excess of 5% gaps. In the second step, batches of sequences from the “extract” files were re-aligned with the template using MUSCLE (to eliminate any potential positional errors introduced by read indels), the aligned template sequence was used to calculate positional information of all the expected CpGs and SNPs and the positional data were used to score methylation status [ie. 0 for T and 1 for C occurring at a CpG position] for each combined read pair. The data were next appended to a separate table for each amplicon and each library/sample. The final tables were used to calculate and graph amplicon methylation data using MPFE21.

Additional Information

How to cite this article: Kucharski, R. et al. EGFR gene methylation is not involved in Royalactin controlled phenotypic polymorphism in honey bees. Sci. Rep. 5, 14070; doi: 10.1038/srep14070 (2015).