Are there double knots in proteins? Prediction and in vitro verification based on TrmD-Tm1570 fusion from C. nitroreducens

Knotted proteins have been known for over 30 years, but it remains hard to predict the most complicated knot that can form in a protein. Here, we report new knotted topologies that are the most complex recorded to date: double trefoil knots (3₁ # 3₁). We found five domain arrangements (architectures) that result in a doubly knotted structure in almost a thousand proteins. The double knot topology is found in knotted membrane proteins from the CaCA family, which function as ion transporters; in the group of carbonic anhydrases, which catalyze the hydration of carbon dioxide; and in proteins from the SPOUT superfamily, which gathers 3₁ knotted methyltransferases with an active site-forming knot. For each family, we predict the presence of a double knot using AlphaFold and RoseTTaFold structure prediction. In the case of the TrmD-Tm1570 protein, a member of the SPOUT superfamily, we show that it folds in vitro and is biologically active. Our results show that this protein forms a homodimeric structure and retains the ability to modify tRNA, which is the function of the single-domain TrmD protein. However, how the protein folds and is degraded remains unknown.


Preparation of the tRNA substrates by in vitro transcription
The double-stranded DNA template coding for E. coli tRNA Leu (CAG), provided with a functional promoter, was obtained by PCR using bracketing oligonucleotides. The exact sequences of the oligonucleotides as well as the final tRNA products are listed in Figure S10. The DNA template was PCR-amplified with Q5 DNA Polymerase (NEB). The concentration of the F1 and R4 oligonucleotides in the reaction mix was 100 µM, whereas lower amounts (1 µM) were used for R2 and F3. After purification with the PCR Cleanup Kit (New England Biolabs), the DNA product was used as a template for in vitro transcription by T7 RNA Polymerase (Thermo Scientific). At the end of the reaction, the tRNA transcript was purified with the Monarch RNA Cleanup Kit (New England Biolabs) and its concentration was measured with a DeNovix microvolume UV-VIS spectrophotometer. The C. nitroreducens tRNA Leu (CAG) sequence was retrieved from the RefSeq database (Pruitt et al., 2005) and annotated using tRNAscan-SE (Chan et al., 2021). Appropriate DNA oligonucleotides were designed (see Figure S10) and the same protocol as for the E. coli tRNA was repeated. In order to produce a C. nitroreducens tRNA Leu (CAG) mutant that would not be modified by the TrmD enzyme, a G-to-T mutation was introduced at position 36 of the template DNA. This resulted in a G-to-U mutation in the final tRNA product, as shown in Figure S10.
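The effect of the template mutation on the transcript can be illustrated with a minimal Python sketch. The sequence below is a made-up placeholder, not the C. nitroreducens tRNA Leu (CAG) template (the real sequences are in Figure S10). The function names are hypothetical, and the sketch assumes that position 36 is counted 1-based along the sense strand of the tRNA-coding region, so that the transcript equals the DNA with T replaced by U.

```python
def point_mutation(seq, position, expected, replacement):
    """Return seq with the base at a 1-based position replaced.

    Raises ValueError if the base found at that position is not the
    expected one, which guards against off-by-one numbering errors.
    """
    index = position - 1
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


def transcribe(sense_dna):
    """Transcript of a sense-strand DNA sequence (T becomes U)."""
    return sense_dna.replace("T", "U")


# Placeholder sequence only (assumption): it has a G at position 36.
placeholder_template = "ACGT" * 8 + "ACTG" + "ACGT" * 4

mutant_template = point_mutation(placeholder_template, 36, "G", "T")

wild_type_rna = transcribe(placeholder_template)
mutant_rna = transcribe(mutant_template)

print(wild_type_rna[35], "->", mutant_rna[35])
```

With the placeholder sequence, the only difference between the two transcripts is the base at position 36, mirroring the G-to-U change described above.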

Methyltransferase activity assays
In order to assess the activity of TrmD-Tm1570 and its individual domains, we used the commercially available MTase-Glo Assay Kit (Promega). Luminescence measurements were done using a Synergy H1 (Biotek) plate reader in black half-area 96-well plates (Greiner, article number 784904). The methyltransferase reaction was carried out at 37 °C, but luminescence measurements were done at ambient temperature. The reaction buffer contained 0.1 M Tris, 50 mM KCl, 1 mM EDTA, 4 mM DTT, 5 mM MgCl₂, pH 7.25, and was first used to prepare a standard curve of SAH, as illustrated in Figure S11A. The standard curve as well as all measurements of enzyme activity were performed according to the assay kit manual. First, we determined the linear range of the initial velocities by plotting this parameter against enzyme concentration (see Figure S11B, C). Since Tm1570 demonstrated almost no activity under these conditions, it was omitted from further optimization of the experiment. For both full-length TrmD-Tm1570 and the isolated TrmD, we decided to use 50 nM as the enzyme concentration for further reactions. In order to find optimal substrate concentrations for an experiment comparing the activities of different TrmD-Tm1570 fragments, we varied the tRNA concentration while keeping the enzyme and SAM concentrations constant at 50 nM and 30 µM, respectively. As a result, even though a detailed analysis of the kinetic parameters is beyond the scope of this article, we obtained Michaelis-Menten plots for the TrmD-Tm1570 fusion protein and for the TrmD domain (Figure S12). Overall, both curves are very similar to each other, and the most pronounced differences are visible in the low to medium range of tRNA concentrations. Therefore, for comparing the activities of different protein fragments towards a range of substrates, we decided to use 8 µM tRNA, 30 µM SAM and 50 nM enzyme.
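The conversion from luminescence to SAH and the form of the Michaelis-Menten relation can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis used in this work: the standard-curve values are invented, the function names are hypothetical, and a simple least-squares line is assumed to describe the standard curve within its linear range.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rlu_to_sah(rlu, slope, intercept):
    """Invert the standard curve: luminescence (RLU) to SAH concentration."""
    return (rlu - intercept) / slope


def michaelis_menten(s, vmax, km):
    """Initial velocity v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


# Invented standard curve (assumption): SAH in µM against luminescence in RLU.
sah_uM = [0.0, 0.25, 0.5, 1.0, 2.0]
rlu = [100.0, 600.0, 1100.0, 2100.0, 4100.0]

slope, intercept = fit_line(sah_uM, rlu)

# Convert a hypothetical sample reading into a SAH concentration.
sample_sah = rlu_to_sah(1600.0, slope, intercept)
print(f"slope={slope:.1f} RLU/µM, intercept={intercept:.1f} RLU")
print(f"sample SAH = {sample_sah:.2f} µM")
```

Dividing the SAH produced by the reaction time gives an initial velocity, and velocities measured over a range of tRNA concentrations can then be compared with `michaelis_menten`. In that model the velocity equals half of Vmax when the substrate concentration equals Km.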

Table S1. Topology of proteins with composite knot architectures. The structures are predicted by AlphaFold (either already deposited in the AlphaFold database or modeled with our locally installed version) or RoseTTaFold modeling. We show the quality of the models for both methods using their scores (pLDDT for AlphaFold and confidence score for RoseTTaFold).

Figure S1: Superposition of the example fusion proteins from the PF03587-PF03587 and PF00588-PF00588 architectures with their single-domain counterparts. A. Nep1-Nep1 protein (blue; PF03587-PF03587 architecture; UniProtKB ID: A0A498KD62) with the homodimeric structure of Nep1 (green and cyan; PDB ID: 3o7b). B. Protein with the PF00588-PF00588 architecture (UniProtKB ID: Q4DMW6) superposed with a homodimeric structure from the PF00588 family (PDB ID: 3kty). Both fusion proteins are predicted to have their domains arranged in the same fashion as the single-domain proteins in their homodimeric complexes.

Figure S3: Superposition of SAM binding sites between the crystal structure and the fusion protein. The crystal TrmD dimer (PDB ID: 5WYQ) is in yellow and the crystal Tm1570 dimer (PDB ID: 3DCM) is in magenta.

Figure S6: (A) Standard curve allowing for conversion from relative luminescence units to the amount of SAH. The linear range of the MTase-Glo detection with respect to initial velocity and concentration of TrmD-Tm1570 (B) and TrmD (C). The data points represent mean ± SD (n = 3).

Table S2. Domain annotations for doubly knotted proteins from the AlphaFold database. Results are based on an HHpred search of the sequence of a single protein against Pfam v. 35.

Table S3. PDB entries that are the most structurally similar to the Tm1570 crystal structure (PDB ID: 3dcm). Based on a DALI search against the whole PDB.

Table S4. The dataset of TrmD dimers.

Table S5. The residues involved in the dimer contacts of the TrmD dimers as well as the Tm1570 dimer.

Table S6. The residue conservation at dimer interfaces.