Introduction

All life on Earth uses nucleic acids (DNA and RNA) as genetic molecules. The prebiotic synthesis of these polymeric macromolecules has been a longstanding problem in origin of life studies. The structure and composition of DNA are highly complex, which makes it unlikely to have formed spontaneously on the early Earth, and a “chicken-and-egg” problem arises from the fact that proteins are required for DNA synthesis and DNA is required for protein synthesis. RNA has been proposed as an evolutionary precursor to DNA (Gilbert 1986) because of its dual ability to catalyze reactions and store genetic information (Cech et al. 1981; Cech 1987), in addition to the perceived advantage that only one polymer needs to be synthesized (rather than two for a nucleic acid-protein system). However, RNA has a sugar-phosphate backbone as complex as that of DNA, which is difficult to synthesize under plausibly prebiotic conditions (Schwartz 2013). In addition, the instability of ribose and 2-deoxyribose (Larralde et al. 1995), the possible scarcity of polyphosphates on the early Earth (Keefe and Miller 1995), and thermodynamically unfavorable nucleotide oligomerization make RNA and DNA unlikely to have been produced prebiotically. As a consequence, a pre-RNA genetic polymer that could form spontaneously from likely available precursors and that preceded both RNA and DNA has also been considered (Joyce et al. 1987; Cleaves and Bada 2012; Cafferty et al. 2013).

Hydroxymethylated pyrimidines may have been present on the early Earth because these compounds can be easily synthesized from pyrimidine (Oró 1961; Callahan et al. 2011) and formaldehyde (Cleaves 2008) under plausibly prebiotic conditions (Schwartz and Bakker 1989; Robertson and Miller 1995). Furthermore, the formation of hydroxymethylated pyrimidines is so favorable that if there were even moderate concentrations of formaldehyde in the prebiotic environment, these compounds may have been much more abundant than their parent pyrimidines such as uracil and cytosine (Robertson and Miller 1995). Previous studies have shown that the hydroxymethyl site of 5-hydroxymethyluracil (HMU) is highly reactive with various nucleophiles (e.g., ammonia, cyanide, and glycine) (Robertson and Miller 1995). For this reason, an attempt was made to incorporate HMU into a peptide backbone to produce peptide nucleic acids (PNA) (Cleaves 2001). Although this attempt was unsuccessful, an “HMU polymer” was identified as a byproduct of the reaction but was never fully characterized. Here, we investigate the oligomerization of HMU and 5-hydroxymethylcytosine (HMC) in aqueous solutions and characterize the products using liquid chromatography coupled to high-resolution Orbitrap mass spectrometry. We report that HMU can produce two classes of oligomeric structures. We also found that HMC can oligomerize in a manner analogous to HMU, and that simple aqueous reactions containing HMU and HMC result in oligomers containing both uracil and cytosine, which might make the storage of sequence information possible, in a manner similar to that of DNA and RNA.

Experimental

Sample Preparation

HMU and HMC were purchased from Alfa Aesar and MP Biomedicals, respectively. Ultrapure water (18.2 MΩ·cm, <5 parts-per-billion [ppb] total organic carbon) from a Millipore Integral 10 system was used exclusively for this study. Oligomers were produced by heating 0.01 M HMU in water at 100 °C for 24 h in flame-sealed glass ampoules. The same conditions were used for HMC and for equimolar mixtures of HMU and HMC in order to produce mixed nucleobase oligomers. Additionally, nucleobase oligomers were produced by heating 0.1 M HMU and 0.1 M uracil in water at 100 °C for 24 h in a flame-sealed glass ampoule. After the sample solutions were removed from the oven, a white precipitate eventually formed.

Data Collection

High resolution (100,000 at m/z 400) and accurate mass (< 3 ppm mass error) mass spectra were acquired by direct infusion to a Thermo Scientific LTQ Orbitrap XL hybrid mass spectrometer equipped with an electrospray ionization source, operated in positive ion mode. Additionally, HPLC-UV-MS and HPLC-UV-MS/MS data were acquired using a Thermo Scientific Accela HPLC coupled to a Thermo Scientific Accela photodiode array detector (200–400 nm) and a Thermo Scientific LTQ Orbitrap XL hybrid mass spectrometer. Chromatographic separation utilized a Phenomenex Luna phenyl-hexyl HPLC column and the mass spectrometer’s resolution was set to 30,000 in order to maintain good chromatographic peak shape. Product ion spectra (MS/MS) were collected in data dependent mode using Orbitrap detection, an HCD setting of 40 %, and an isolation width of 0.8 Da. Structures of daughter ions in the product ion spectra were generated using Thermo Scientific Mass Frontier 6.0 spectral interpretation software.

Results and Discussion

Figure 1 shows UV chromatograms (at 260 nm) acquired from reactions of HMU, HMC, and HMU + HMC. Monomeric compounds were identified based on the comparison of UV retention time, accurate mass measurements, and/or product ion spectra with reference standards. Distinctive groups of peaks (corresponding to monomers, dimers, trimers, etc.) were also noticeable (especially in the HMU reaction) and their peak intensity typically decreased as their HPLC retention time increased, which suggests higher molecular weight structures are produced. Accurate mass measurements acquired simultaneously supported the notion that oligomers were produced, and that the oligomerization proceeded via monomer addition at the hydroxymethyl site along with the loss of a water molecule. These results are not entirely surprising because the hydroxymethyl group of HMU is known to be highly reactive based on previous studies (Robertson and Miller 1995). For HMU, two types of oligomers were observed: (1) uracil-CH2-(uracil-CH2)n-uracil, n = 0–2 and (2) uracil-CH2-(uracil-CH2)n-HMU, n = 0–2, the latter type facilitating continued oligomerization because of the reactive hydroxymethyl group. Analogous types of oligomers were also observed for HMC reactions (we refer to them as HMU oligomers or HMC oligomers throughout this paper).
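The elemental bookkeeping behind the two HMU oligomer series can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original analysis: the function names and the formula representation are assumptions made here, and the only chemistry used is what the text states, namely that each monomer addition at the hydroxymethyl site is accompanied by the loss of one water molecule.

```python
from collections import Counter

# Elemental compositions of the building blocks (standard molecular formulas).
HMU = Counter({"C": 5, "H": 6, "N": 2, "O": 3})     # 5-hydroxymethyluracil
URACIL = Counter({"C": 4, "H": 4, "N": 2, "O": 2})
WATER = Counter({"H": 2, "O": 1})


def scaled(formula, k):
    """Multiply every element count in a formula by k."""
    return Counter({el: n * k for el, n in formula.items()})


def oligomer_formula(n, terminus):
    """Composition of uracil-CH2-(uracil-CH2)n-X, where X is 'uracil' or 'HMU'.

    The chain has n + 1 methylene links. Each link is counted as one HMU
    unit condensing with the next heterocycle and releasing one water.
    """
    links = n + 1
    end = URACIL if terminus == "uracil" else HMU
    total = scaled(HMU, links) + end
    total.subtract(scaled(WATER, links))
    return total


def hill(formula):
    """Format a C/H/N/O composition as a compact formula string."""
    return "".join(
        f"{el}{formula[el] if formula[el] > 1 else ''}"
        for el in ("C", "H", "N", "O")
        if formula[el]
    )


for n in range(3):
    rings = n + 2
    print(rings, hill(oligomer_formula(n, "uracil")), hill(oligomer_formula(n, "HMU")))
```

For n = 1 the uracil-terminated trimer comes out as C14H12N6O6, and within each size class the two series differ by CH2O, which is the composition of the terminal hydroxymethyl group.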

Fig. 1

UV chromatograms (260 nm) acquired from reaction solutions of HMU (bottom trace), HMC (middle trace), and HMU + HMC (top trace). Boxed inset shows a close-up region for HMC reaction solution, which shows a complex distribution of lower intensity peaks. Structures as large as tetramers were observed in these data

Product ion spectra support an oligomeric structure for the products. For example, Fig. 2 shows the product ion spectrum of an HMU trimer (U-CH2-U-CH2-U form). At 0 % collision energy, only the protonated trimer mass (m/z 361.0887) was observed. As the collision energy was increased, daughter ions started to appear as a result of fragmentation of the protonated trimer parent molecule. Accurate mass measurements allowed unambiguous assignment of molecular formulae to parent and daughter ions. Two particularly characteristic daughter ions at m/z 249.0617 and 125.0343 were observed in the product ion spectrum, which indicate cleavages between the methylene group and the heterocycle (i.e. U-CH2-U-CH2+ and U-CH2+, respectively). From these results, it is clear that the oligomers synthesized are methylene linked. Oligomerization likely favors linkage between the C5 methylene carbon of HMU and the N1 nitrogen (rather than the N3 nitrogen) of a second HMU or uracil molecule based on our energy calculations at the B3LYP/6-31G** level for HMU dimers as well as a previous study using UV spectroscopy of HMU oligomers (Cleaves 2001). However, both linkages (C5-N1 and C5-N3) may be present because multiple peaks were observed in the extracted ion chromatograms for the dimer and higher order oligomers, which suggests that a number of structural isomers were produced.
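As a worked check of the accurate mass assignments above, the following Python sketch computes theoretical m/z values for the protonated trimer and the two characteristic daughter ions and compares them with the measured values quoted in the text. The elemental formulas are derived here from the structures named in the text (they are not copied from Fig. 2), the helper names are hypothetical, and the monoisotopic masses are standard values rounded to eight decimals.

```python
import re

# Monoisotopic atomic masses (u) and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858


def neutral_mass(formula):
    """Sum monoisotopic masses for a formula string such as 'C14H13N6O6'."""
    return sum(
        MONO[el] * int(count or 1)
        for el, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    )


def cation_mz(formula):
    """m/z of a singly charged cation; `formula` lists every atom in the ion."""
    return neutral_mass(formula) - ELECTRON


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# Ion label -> (assumed ion formula, measured m/z quoted in the text).
ions = {
    "[U-CH2-U-CH2-U + H]+": ("C14H13N6O6", 361.0887),
    "U-CH2-U-CH2+": ("C10H9N4O4", 249.0617),
    "U-CH2+": ("C5H5N2O2", 125.0343),
}

for label, (formula, measured) in ions.items():
    theoretical = cation_mz(formula)
    error = ppm_error(measured, theoretical)
    print(f"{label:22s} calc {theoretical:.4f}  meas {measured:.4f}  {error:+.1f} ppm")
```

Under these assumptions, all three ions fall within the < 3 ppm mass error stated in the Experimental section.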

Fig. 2

Product ion spectrum at 15 % collision energy of a peak corresponding to the mass of a uracil-CH2-uracil-CH2-uracil trimer produced from a heated HMU solution. Accurate mass measurements and molecular formulae are shown for all ions. Suggested structures were generated in Mass Frontier 6.0

Possible formation routes for HMU oligomers are shown in Fig. 3. Oligomerization appears to proceed until it is “terminated” by uracil. Termination may result from the dehydroxymethylation of a uracil-CH2–5-hydroxymethyl oligomer and/or the addition of free uracil (produced when the reaction was heated), though the latter scenario is less likely because we did not detect uracil in the UV and mass chromatograms of HMU reactions.

Fig. 3

Possible mechanisms of formation of uracil-CH2-(uracil-CH2)n-uracil oligomers (red boxes) and uracil-CH2-(uracil-CH2)n-HMU oligomers (blue boxes), the latter of which can continue to oligomerize. The formation of uracil-CH2-(uracil-CH2)n-uracil oligomers may result from the dehydroxymethylation of uracil-CH2-(uracil-CH2)n-uracil-HMU oligomers and/or the addition of uracil (produced during the reaction). Analogous mechanisms can account for the products observed from HMC and mixed HMU/HMC reactions

We observed similar oligomerization with HMC. We measured two types of cytosine oligomers (i.e. cytosine with a bridging methylene group to cytosine or HMC) of different sizes (up to the tetramer) in the LC-MS chromatogram. Cytosine oligomers were identified by accurate mass measurements, and these assignments were further supported by observation of their sodium and potassium adducts. For the protonated cytosine dimer (cytosine-CH2-cytosine), two major peaks in a ~3:1 ratio were observed, which may correspond to the N1 and N3 substituted molecules. Product ion spectra further supported this notion because the daughter ions were identical, yet a distinct ratio was observed for each, which suggested closely related structural isomers.

Intriguingly, in HMC solutions we also measured masses that could reasonably be assigned to dimers and trimers incorporating both cytosine and uracil. These observations suggested that deamination of HMC to HMU occurred readily under our reaction conditions (100 °C for 24 h), which raises the possibility of “informational-like” oligomers (because they now carry sequence information) as well as their potential application in dynamic combinatorial chemistry. In addition, we measured free cytosine and uracil, which suggested that dehydroxymethylation of HMC to cytosine and deamination of cytosine to uracil also occurred. We speculate that HMC could serve as a source of both cytosine and uracil on the early Earth, and any regenerated pyrimidines would likely react with formaldehyde again (assuming sufficient concentration) to form more HMC and HMU. In this case, HMC/HMU additions to moderate-sized oligomers could potentially produce much larger oligomers with mixed sequences.

It was further confirmed that oligomers prepared from mixed HMC/HMU reactions contained both C and U moieties. For example, Fig. 4 shows product ion spectra of two isomeric dimers (elution times 17.3 and 25.9 min) synthesized from mixed HMU/HMC reactions. The fragmentation patterns strongly suggest that both sequences were synthesized (U-CH2-C and C-CH2-U). Furthermore, LC-MS and accurate mass measurements suggest the presence of higher order oligomers containing both uracil and cytosine. For example, in the HMC reaction solution, we observed multiple peaks in the mass chromatogram at m/z 397.0770 ([C14H14N8O4 + K]+), which likely correspond to the uracil-CH2-cytosine-CH2-cytosine or cytosine-CH2-cytosine-CH2-uracil trimer. An even greater diversity of mixed trimers (e.g., uracil-CH2-cytosine-CH2-uracil, m/z 397.0531, [C14H12N7O5 + K]+ and cytosine-CH2-uracil-CH2-cytosine, m/z 396.0691, [C14H13N8O4 + K]+) was measured in reactions of equimolar HMU and HMC solutions. Thus, these results show that oligomers of hydroxymethylated pyrimidines readily form in solution. These nucleobase oligomers may be part of a realistic inventory of potentially functional compounds for pre-RNA world models (Lazcano and Miller 1996) and serve as the first step to understanding chemical evolution from the pre-RNA world to the RNA world (Dworkin et al. 2003).
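The reason product ion spectra are needed to tell U-CH2-C from C-CH2-U can be illustrated with a short Python sketch: sequence isomers share one elemental composition and therefore one accurate mass. The sketch is illustrative only. The function names are hypothetical, the compositions are the standard formulas of uracil and cytosine, and each methylene bridge is counted as a net gain of one carbon atom (a CH2 group replacing two hydrogens), consistent with the oligomer structures described above.

```python
from collections import Counter
from itertools import product

# Monoisotopic atomic masses (u); the K+ mass is the 39K atom minus one electron.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
K_ION = 38.96370649 - 0.00054858

# Keys are nucleobase letters (U = uracil, C = cytosine); values are element counts.
BASES = {
    "U": Counter({"C": 4, "H": 4, "N": 2, "O": 2}),
    "C": Counter({"C": 4, "H": 5, "N": 3, "O": 1}),
}


def oligomer_composition(sequence):
    """Element counts for a methylene-linked oligomer written as e.g. 'UCC'."""
    total = Counter()
    for base in sequence:
        total.update(BASES[base])
    # Each CH2 bridge replaces one H on each of two rings: net change is +1 carbon.
    total["C"] += len(sequence) - 1
    return total


def potassium_adduct_mz(sequence):
    """m/z of the singly charged [M + K]+ ion for the given sequence."""
    composition = oligomer_composition(sequence)
    return sum(MONO[el] * n for el, n in composition.items()) + K_ION


# The two dimer sequences are isomers: identical composition, identical mass.
for seq in ("UC", "CU"):
    print(seq, dict(oligomer_composition(seq)))

# Number of distinct U/C sequences grows as 2**n with oligomer length n.
print("trimer sequences:", len(list(product("UC", repeat=3))))
print("UCC [M+K]+:", round(potassium_adduct_mz("UCC"), 4))
```

With these inputs, the [M + K]+ value calculated for a uracil-cytosine-cytosine trimer agrees with the m/z 397.0770 peaks reported for the HMC reaction solution. Accurate mass alone fixes only the composition, so the order of the bases has to come from fragmentation or chromatographic behavior.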

Fig. 4

Product ion spectra at 40 % collision energy of chromatographic peaks corresponding to (a) a uracil-CH2-cytosine dimer (LC retention time 17.3 min.) and (b) a cytosine-CH2-uracil dimer (LC retention time 25.9 min.) produced from heating HMU + HMC solutions. Accurate mass measurements and molecular formulae are shown for all ions. Suggested structures, generated in Mass Frontier 6.0, are also shown

Finally, we analyzed an aqueous solution of HMU and uracil (0.1 M each, 100 °C for 24 h) by direct infusion to the high resolution Orbitrap mass spectrometer. Figure 5 shows the electrospray mass spectrum with several distinct mass envelopes. We detected much larger oligomers by direct infusion than by LC-MS. Both uracil-CH2-(uracil-CH2)n-uracil oligomers and uracil-CH2-(uracil-CH2)n-uracil-5-hydroxymethyl oligomers up to eight nitrogen heterocycles long were identified based on accurate mass measurements, dual verification of the parent ion (based on the observation of both sodium and potassium adducts), and predicted oligomerization chemistry (for a complete list, refer to Table 1). We note that the peak intensity decreased as the size of the oligomer increased and that higher molecular weight oligomers were also hydrated (up to 2 water molecules). Tetramers and larger oligomers may be either linear or branched, although further verification is needed and will be the subject of future studies.
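The dual verification by sodium and potassium adducts amounts to looking for pairs of peaks separated by the mass difference between K+ and Na+. A minimal Python sketch of that check follows. The function name, the tolerance default, and the peak list are assumptions made for illustration: the two adduct values were calculated here for a uracil-CH2-uracil dimer (C9H8N4O4) and the third value is an arbitrary decoy, so none of them are measured data from this study.

```python
# Ion masses (u): monoisotopic atom mass minus one electron.
ELECTRON = 0.00054858
NA_ION = 22.98976928 - ELECTRON
K_ION = 38.96370649 - ELECTRON
DELTA_K_NA = K_ION - NA_ION  # spacing expected between [M+Na]+ and [M+K]+


def find_adduct_pairs(peaks, tol_ppm=3.0):
    """Return (mz_na, mz_k, neutral_mass) for peaks spaced by DELTA_K_NA.

    `peaks` is a list of m/z values for singly charged ions. A pair is
    accepted when the candidate [M+K]+ peak lies within `tol_ppm` of the
    position predicted from the candidate [M+Na]+ peak.
    """
    pairs = []
    for mz_na in sorted(peaks):
        target = mz_na + DELTA_K_NA
        tolerance = target * tol_ppm * 1e-6
        for mz_k in peaks:
            if abs(mz_k - target) <= tolerance:
                pairs.append((mz_na, mz_k, mz_na - NA_ION))
    return pairs


# Synthetic example: calculated dimer adducts plus one unrelated decoy peak.
example_peaks = [259.0438, 275.0177, 300.1234]
for mz_na, mz_k, neutral in find_adduct_pairs(example_peaks):
    print(f"[M+Na]+ {mz_na:.4f}  [M+K]+ {mz_k:.4f}  neutral M {neutral:.4f}")
```

The K/Na spacing (about 15.974 u) lies roughly 0.021 u below the mass of an oxygen atom (about 15.995 u), so a tolerance of a few ppm is tight enough to keep an adduct pair from being confused with two species that differ by one oxygen.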

Fig. 5

Positive ion electrospray mass spectrum of uracil-CH2-(uracil-CH2)n-uracil oligomers and uracil-CH2-(uracil-CH2)n-uracil-5-hydroxymethyl oligomers produced from a heated HMU + uracil aqueous solution. The inset displays a close-up of likely higher molecular weight oligomers (i.e. the heptamer and octamer). Oligomers were detected as sodium and potassium adducts, and higher molecular weight oligomers were also hydrated (up to 2 water molecules), which led to the complex mass spectrum acquired

Table 1 Summary of accurate mass measurements for compounds detected in a heated HMU + uracil aqueous solution

Studies have shown that a ribozyme merely five nucleotides long can perform aminoacyl group transfer (Turk et al. 2010), which is an essential chemical group transfer in biological peptide synthesis. Interestingly, we detected nucleobase oligomers of comparable length (up to the octamer), although any functional or catalytic capability of these structures is currently unknown.

Oligomerization of hydroxymethylated pyrimidines might be useful in the development of simple informational polymers, especially if more than one type of nucleobase can be incorporated into the same polymer, which may increase their potential genetic and catalytic capabilities. These prebiotically plausible oligomerization reactions are very simple, readily occur in aqueous solutions, proceed nonenzymatically, and may serve as an alternative to nucleotide oligomerization, which is thermodynamically unfavorable and typically requires activating groups (Ferris et al. 1996). Furthermore, the formation of these oligomers is so robust that HMU oligomers still form in the presence of a complex distribution of organic compounds synthesized from electric discharge experiments (40 % N2, 10 % CO2, 25 % CH4, and 25 % H2 sparked for 48 h) where abundant competing nucleophiles could inhibit oligomerization, and HMU oligomers with added functional groups were also tentatively identified (data not shown). We have no evidence that these oligomers adopt a structure similar to that of DNA or RNA that would support base-pairing. Some open questions include: (1) what is the chemical stability of these nucleobase oligomers in solution and how does it compare with that of RNA oligomers of comparable length, (2) is it possible to extend the oligomer length beyond eight nucleobases, (3) are these nucleobase oligomers capable of base pairing with other oligomers or small molecules, and (4) do hydroxymethylated purines behave in an analogous manner?