Chemo-enzymatic production of base-modified ATP analogues for polyadenylation of RNA

Base-modified adenosine-5′-triphosphate (ATP) analogues are highly sought after as building blocks for mRNAs and non-coding RNAs, for genetic code expansion or as inhibitors. Current synthetic strategies lack efficient and robust 5′-triphosphorylation of adenosine derivatives or rely on costly phosphorylation reagents. Here, we combine the efficient organic synthesis of base-modified AMP analogues with enzymatic phosphorylation by a promiscuous polyphosphate kinase 2 class III from an unclassified Erysipelotrichaceae bacterium (EbPPK2) to generate a panel of C2-, N6-, or C8-modified ATP analogues. These can be incorporated into RNA using template-independent poly(A) polymerase. C2-halogenated ATP analogues were incorporated best, with incorporations of 300 to >1000 nucleotides forming hypermodified poly(A) tails.


Materials and methods
All chemicals and reagents were purchased from Sigma-Aldrich, Acros Organics, VWR and Jena Biosciences and were used without further purification unless otherwise stated. Graham's salt (Sigma-Aldrich 305553) was used as the polyphosphate source.

Extinction coefficients
Nucleotide concentrations were measured via UV absorbance using extinction coefficients indicated by Jena Biosciences, as follows:

NMR
NMR spectra were measured at 299 K on a Bruker Neo 400 spectrometer. The chemical shifts (δ) were reported in ppm relative to the deuterated solvent as internal standard (D2O = 4.79 ppm).

Synthesis of AMP analogues 2a-5a
General procedure: The respective adenosine derivative (1 eq.) was dissolved in 20 mL dry trimethyl phosphate (TMP) under an argon atmosphere. The solution was cooled to 0 °C and freshly distilled POCl3 (1.6 eq.) was added dropwise. After stirring for 3 hours, the

Molecular Modelling
Structural models were generated with AlphaFold 2.3 in multimer mode running on a local high-performance computing cluster. The full tetrameric model of EbPPK2 was aligned to the reported tetrameric MrPPK2 structure (PDB 5LD1) using the MatchMaker tool in UCSF ChimeraX 1.7. [1] After alignment, the coordinates of the nucleotides were transferred from the MrPPK2 structure to the predicted EbPPK2 model. The composite structure was energy minimized with the YASARA force field minimization server [2] and the nucleotide-binding pocket was analysed in UCSF ChimeraX 1.7.

To obtain the molecular mass, the gel filtration standard #1511901 (BioRad) was applied under the same conditions. The EbPPK2 sample eluted as a single peak at 12.27 mL, corresponding to 144 kDa.

Thermal shift assay
The melting temperature of EbPPK2 in different buffers and ionic strength was

temperature gradient from 10 °C to 95 °C with a ramp rate of 1 °C/min was used. Data collection was performed at a wavelength of λ = 610 nm. Melting points were calculated using the 1st derivative of the melting curves. [3]

PPK2 analytical scale reactions

EbPPK2 preparative scale reactions
For preparative reactions, EbPPK2 (4 µM) in 20 mM Tris (pH 8) with 20 mM MgCl2 and 6.1 g/L polyphosphate was preincubated at 30 °C, and the reactions were started by the addition of AMP or analogues (1a-5a) to 5 mM. Reactions were run in a total volume of 2 mL for 2 min at 30 °C. The enzyme was then denatured at 85 °C for 2 min, and the reactions were centrifuged (15 minutes, 21,000 × g, 4 °C) to remove precipitated protein.
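As a plausibility check on these reaction conditions, the polyphosphate mass concentration can be converted into an approximate molar concentration of phosphate units and compared with the amount of substrate. The sketch below is only an estimate: it assumes the polyphosphate can be treated as (NaPO3)n with a repeat-unit mass of about 101.96 g/mol and ignores end groups and water content. Neither assumption comes from the source, and the function names are illustrative.

```python
# Rough stoichiometry estimate for the preparative reaction.
# Assumption (not from the source): polyphosphate is treated as (NaPO3)n,
# repeat-unit molar mass ~101.96 g/mol; end groups and hydration are ignored.
NAPO3_UNIT_G_PER_MOL = 101.96


def phosphate_units_mM(polyp_g_per_L: float) -> float:
    """Approximate concentration of phosphate units in mM."""
    return polyp_g_per_L / NAPO3_UNIT_G_PER_MOL * 1000.0


def substrate_umol(conc_mM: float, volume_mL: float) -> float:
    """Amount of substrate in µmol (mM x mL = µmol)."""
    return conc_mM * volume_mL


polyp_mM = phosphate_units_mM(6.1)       # 6.1 g/L polyphosphate
amp_umol = substrate_umol(5.0, 2.0)      # 5 mM substrate in 2 mL
units_per_substrate = polyp_mM / 5.0     # phosphate units per AMP molecule

print(f"Phosphate units: {polyp_mM:.1f} mM")
print(f"Substrate: {amp_umol:.1f} µmol")
print(f"Phosphate units per substrate molecule: {units_per_substrate:.1f}")
```

Since converting AMP to ATP requires two phosphoryl transfers, this estimate suggests that phosphate units are present in excess, although not every unit in a chain is necessarily available for transfer.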

Purification of ATP analogues
For the purification of the ATP analogues (1c-5c), anion-exchange chromatography was performed on an ÄKTA Purifier system (GE Healthcare) using a HiPrep Q FF 16/10 column (Cytiva GE28936543). The system was equilibrated in ddH2O with a flow rate of 5.0 mL/min. A preparative EbPPK2 reaction (2 mL) was mixed with 8 mL ddH2O and loaded into a 10 mL Superloop™ for injection. The absorbance at λ = 254 nm and the conductivity were continuously monitored. After each purification, the column was washed with 30 mL of 500 mM NaClO4 (pH = 4.2) and equilibrated with 40 mL ddH2O.
Fractions containing the desired ATP analogue (1c-5c) were pooled and the volume of the solution was reduced to 1-2 mL through lyophilization. Note that drying of the ATP analogue solution to completion will lead to hydrolysis of the ATP derivative to the corresponding AMP derivative. To precipitate the ATP analogues and remove residual NaClO4, the ATP solution was mixed with 50 volumes of ice-cold acetone and then incubated at -20 °C for 2 hours. After centrifugation (30 minutes, 3220 × g, 4 °C) the supernatant was decanted, and the solid was washed with acetone (cooled to -20 °C).
After centrifugation (10 minutes, 3220 × g, 4 °C) and decanting of the supernatant, the product was dried at room temperature. Finally, each ATP analogue was dissolved in ddH2O, and the concentration was determined by absorbance at 260 nm (280 nm for 4c).
The fractions corresponding to the peaks marked with * were pooled.
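The concentration determination by absorbance follows the Beer-Lambert law, c = A / (ε · l). A minimal sketch is given below. The extinction coefficient used in the example (15.4 mM⁻¹ cm⁻¹, a value commonly quoted for unmodified ATP near 259 nm) and the absorbance reading are illustrative assumptions; the coefficients actually used for the analogues are not reproduced here, and the function name is hypothetical.

```python
def concentration_mM(absorbance, epsilon_mM_cm, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law: c = A * dilution / (epsilon * l), returned in mM."""
    if epsilon_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance * dilution / (epsilon_mM_cm * path_cm)


# Illustrative values only (assumptions, not data from the source):
EPSILON_EXAMPLE = 15.4   # mM^-1 cm^-1, commonly quoted for unmodified ATP
A_MEASURED = 0.462       # hypothetical reading of a 1:100 diluted stock

c = concentration_mM(A_MEASURED, EPSILON_EXAMPLE, path_cm=1.0, dilution=100.0)
print(f"Stock concentration: {c:.2f} mM")
```

For a base-modified analogue, the appropriate extinction coefficient and wavelength (for example 280 nm for 4c) would replace the example values.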

Purification of EbPPK2
Figure S19: UV chromatogram at 280 nm of the purification of EbPPK2 using an ÄKTA purifier equipped with a HisTrap FF (5 mL, GE Healthcare) column.

31P-NMR chain length determination of sodium polyphosphate
Figure S24: 31P-NMR spectrum of sodium hexametaphosphate crystals, +200 mesh 96 % / Graham's salt (Sigma-Aldrich 305553). Given the heterogeneity observed in the NMR spectrum, we used the method of Lindner et al. to determine the average chain length. [5] Terminal phosphate groups give rise to signals around -7 ppm, while internal phosphates are seen around -21 ppm. Starred peaks represent small amounts of orthophosphate (left) and cyclic phosphates (right). Setting the integral at -7 ppm to 2 (for the two termini per chain) gives an integration of 9 for the internal phosphates, indicating an average chain length of 11 overall. For further information on the assignment of 31P-NMR spectra of polyphosphates, please see Christ et al. [6] Note that many products sold as Graham's salt or sodium hexametaphosphate show different polyphosphate chain lengths. For this product (Sigma-Aldrich 305553), the supplier states that the average chain length traditionally ranges from 9.5 to 14.5 phosphate units, which is in accordance with the 31P-NMR spectrum above.
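The integration argument above can be written as a small calculation: with two terminal phosphates per linear chain, the average chain length is n = 2 · (I_internal / I_terminal) + 2. The sketch below reproduces the numbers quoted in the caption. The function name is illustrative, and the formula assumes linear chains only, with the orthophosphate and cyclic phosphate signals excluded from the integrals.

```python
def average_chain_length(terminal_integral, internal_integral):
    """Average polyphosphate chain length from 31P-NMR integrals.

    Assumes linear chains with exactly two terminal phosphates each, so the
    terminal integral is scaled to 2 and the internal integral scales with it.
    """
    if terminal_integral <= 0:
        raise ValueError("terminal integral must be positive")
    internal_per_chain = 2.0 * internal_integral / terminal_integral
    return internal_per_chain + 2.0


# Integrals as quoted in the caption: terminal set to 2, internal found as 9.
n = average_chain_length(terminal_integral=2.0, internal_integral=9.0)
print(f"Average chain length: {n:.0f} phosphate units")
```

Because only the ratio of the two integrals matters, the same result is obtained for any common scaling of the raw integrals.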
determined using the RUBIC Buffer Screen (Molecular Dimensions), following the manufacturer's protocol. In short, 21 µL of each buffer of the RUBIC Buffer Screen were transferred to a PCR microplate. On ice, 2 µL of a 40 µM solution of EbPPK2 (in 25 mM Tris-HCl pH 7.5, 150 mM NaCl) were added to each well. 2 µL of a freshly prepared SYPRO Orange solution (Invitrogen, S6651, pre-diluted to 62X) were dispensed into each well, and the PCR microplate was sealed with a clear adhesive tape. The microplate was placed in an RT-PCR machine pre-equilibrated at 10 °C. A

For analytical reactions, 1 mM AMP or analogues (1a-10a) were incubated in 20 mM Tris (pH 8) with 20 mM MgCl2, 6.1 g/L polyphosphate and 4 µM EbPPK2 in a total volume of 20 µL at 30 °C. Samples were taken after 2 min and 60 min, the enzyme was denatured at 85 °C for 2 min, and the samples were centrifuged (15 minutes, 21,000 × g, 4 °C) to remove precipitated protein. The supernatant (1 µL) was analysed via LC-DAD-Q-MS.

Figure S18: 31P NMR spectrum of 5a. Note that the peak at a chemical shift of δ = 2.97 ppm corresponds to residual traces of trimethyl phosphate.

Figure S20: SDS-PAGE gel of EbPPK2 expression and purification. A) Electrophoretic separation of proteins from the pET28(a)+ EbPPK2-transformed BL21(DE3) cells. The gel shows the cell suspension before (pre-induction) and after (post-induction) induction of protein production using IPTG, as well as the insoluble (pellet) and soluble (supernatant) cell fractions after sonication and centrifugation. B) Different concentrations of bovine serum albumin (BSA) and dilutions of the purified EbPPK2. M: PageRuler Prestained Protein Ladder (Thermo Fisher Scientific). Gels (12 % polyacrylamide) were stained with Coomassie Blue G-250.

Figure S21: UV trace of EbPPK2 on an ENrich SEC 650 10/300 column. The elution volume was compared to the gel filtration standard (BioRad catalog # 1511901) and converted to the indicated mass.

Figure S22: Melting curves of EbPPK2 using the RUBIC Buffer Screen. The three curves with the highest Tm values are shown. A)-C) refer to the buffer conditions F8, A10, and C10, respectively, of the RUBIC Buffer Screen. Each of these buffers contains 100 mM phosphate, suggesting that phosphate ions have a stabilizing effect on EbPPK2.
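Melting points in the thermal shift assay were calculated from the first derivative of the melting curves. The idea can be sketched as follows: the Tm is taken as the temperature at which dF/dT is largest. This is a minimal illustration on a synthetic, noise-free two-state curve and is not the analysis tool used in the study. The function name, the curve shape and its parameters are assumptions, and real fluorescence data would usually need smoothing before differentiation.

```python
import math


def melting_temperature(temps, signal):
    """Return the temperature at which dF/dT (central differences) is largest."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


# Synthetic two-state unfolding curve (illustrative only, not measured data).
TRUE_TM = 55.0   # assumed midpoint in °C
WIDTH = 2.0      # assumed transition width in °C
temps = [10.0 + 0.5 * i for i in range(171)]   # 10 °C to 95 °C in 0.5 °C steps
signal = [1.0 / (1.0 + math.exp((TRUE_TM - t) / WIDTH)) for t in temps]

tm = melting_temperature(temps, signal)
print(f"Estimated Tm: {tm:.1f} °C")
```

On measured curves, the fluorescence often decreases again after unfolding because of aggregation, so restricting the search to the rising part of the curve can be helpful.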

Figure S36: LC-DAD-Q-MS analysis of the EbPPK2 reaction starting from 2a. EIC for 2a, 2b, 2c, 2d and the UV trace at 260 nm. Note that the UV signal at 2.8 min corresponds to 2-chloroadenine.
LC-DAD-Q-MS analyses were performed on an Agilent 1260 Infinity II Prime equipped with a 1260 high sensitivity diode array detector (DAD) with a high-sensitivity 60 mm flow cell and an ESI-MSD iQ single quadrupole.
Every sample was followed by a 2 min equilibration with 100 % A.