Methods

Volume 89, 1 November 2015, Pages 138-148
Analysis of protein–RNA interactions in CRISPR proteins and effector complexes by UV-induced cross-linking and mass spectrometry

https://doi.org/10.1016/j.ymeth.2015.06.005

Highlights

  • Investigation of protein–RNA interaction sites by UV cross-linking and mass spectrometry.

  • Identification of RNA binding sites in RNA binding proteins.

  • Identification of amino acids cross-linked to RNA.

  • RNA interaction sites in Cas7 family proteins of the CRISPR-Cas system.

  • RNA binding interfaces in homology models of Cas7 proteins.

Abstract

Ribonucleoprotein (RNP) complexes play important roles in the cell by mediating basic cellular processes, including gene expression and its regulation. Understanding the molecular details of these processes requires the identification and characterization of protein–RNA interactions. Over the years, various approaches have been used to investigate these interactions, including computational analyses to look for RNA-binding domains, gel mobility-shift assays on recombinant and mutant proteins, and co-crystallization and NMR studies for structure elucidation. Here we report a more specialized and direct approach using UV-induced cross-linking coupled with mass spectrometry. This approach permits the identification of cross-linked peptides and RNA moieties and can also pinpoint exact RNA contact sites within the protein. The power of this method is illustrated by its application to different single- and multi-subunit RNP complexes belonging to the prokaryotic adaptive immune system, CRISPR-Cas (CRISPR: clustered regularly interspaced short palindromic repeats; Cas: CRISPR associated). In particular, we identified the RNA-binding sites within three Cas7 protein homologs and mapped the cross-linking results to reveal structurally conserved Cas7–RNA binding interfaces. These results demonstrate the strong potential of UV-induced cross-linking coupled with mass spectrometry analysis to identify RNA interaction sites on RNA-binding proteins.

Introduction

In a cell, RNA molecules almost invariably function in association with proteins. Since RNA molecules can have enzymatic activity and are structurally more versatile than double-stranded DNA, the variety and number of proteins binding to RNA are significantly greater than those of proteins associated with classical double-stranded DNA. Accordingly, a multitude of RNA-binding proteins (RBPs) have been described in prokaryotes and eukaryotes [1], [2]. RNA binding by these proteins is versatile and is mediated by many different RNA-binding domains (RBDs), which can occur in various combinations within one RBP. In contrast, DNA-binding proteins such as transcription factors show only moderate variation in their DNA-binding motifs.

Proteins that bind to RNA can modulate or stabilize RNA structures, thereby making RNA catalytically active, and can also mediate interactions between RNA and other macromolecules [3]. Conversely, RNA molecules can guide catalytically active proteins to their destinations. Furthermore – like the vast majority of proteins in higher eukaryotes, which are organized in protein complexes – RBPs with their cognate RNAs also serve as assembly platforms for proteins, while also being able to prevent proteins from interacting with the RNA. Thus RBPs are often, if not always, organized in ribonucleoprotein complexes (RNPs) [1]. These play essential roles in the major cellular steps of gene expression and its regulation. Hence, there is major interest in the molecular characterization of RNA-binding proteins, with a clear emphasis on identifying putative RNA-binding sites, as these regions are often essential for a functional RNP.

The “gold standard” for characterizing molecular interactions of RBDs with their cognate RNA molecules by structure determination is co-crystallization [4], [5]; alternatives include NMR of the complex [6] and high-resolution EM of entire RNPs, as performed for the ribosome [7]. Although the number of co-structures of RBPs has been steadily increasing, with more than 200 co-structures of protein–RNA complexes available in the PDB, most RBPs are still crystallized without RNA. Consequently, the molecular characterization of the RBD requires mutation studies combined with definition of the surface charge of the protein to allow localization of the RBD. Similarly, perturbations in the chemical shift of amino acid residues in NMR that are caused by interaction with RNA can allow the localization of the RBDs [8].

In recent years, chemical protein–protein cross-linking and UV-induced protein–nucleic acid cross-linking, in combination with mass spectrometry (MS), have emerged as complementary methods for obtaining information about the spatial arrangement of proteins in complexes and in RNPs [9], [10]. In the case of UV-induced protein–RNA cross-linking, MS has been applied to identify the cross-linked proteins by standard quantitative MS-based proteomic approaches [11], [12], [13]. Subsequent database searching has led to the identification of conserved structural motifs in these proteins [2], such as RNA-recognition motifs (RRMs) [14], K homology (KH) domains [15], zinc-finger domains [16], tudor domains [17], double-stranded RNA binding domains (dsRBDs) [18], G-patch domains [19], Sm motifs [20] etc. However, such proteomic approaches yield little or no information about (i) whether the protein cross-links to the RNA through its canonical RBD or through other domains within the protein; (ii) which RBD is involved in interaction with RNA when the protein contains several potential RBDs; (iii) how proteins that do not harbor any known RBD (as identified by sequence) interact with RNA.

The latter situation occurs very often when prokaryotic RNA-binding proteins are investigated. These do not show primary RNA-binding sequence motifs that resemble those of eukaryotic proteins. Nonetheless, three-dimensional structures of bacterial RBPs are similar to structures of eukaryotic RBDs, for example, the bacterial Hfq protein with the characteristic Sm fold [21], [22] and the prokaryotic Cas7 protein family with their RRMs [23], [24].

We have now developed a straightforward approach that utilizes UV-induced cross-linking and mass spectrometry, not only to identify proteins that cross-link to RNA but also to identify unambiguously the cross-linked amino acid and the cross-linked nucleotide(s) [25]. The approach is easily applicable to single (e.g., recombinant) proteins that interact with RNA but whose structure cannot be determined in complex with RNA. In contrast to other approaches, it can also be applied to assembled RNPs of any complexity, obtained either by reconstitution or by purification from extracts. Importantly, it can even be applied at the level of entire UV-cross-linked cells.
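To illustrate the kind of mass arithmetic that underlies the identification of a peptide cross-linked to a nucleotide, the following minimal Python sketch computes the expected m/z of such a conjugate. It is an illustration only and not the data analysis workflow used in this study. It assumes that the conjugate mass is simply the sum of the peptide mass and the mass of one uridine monophosphate (C9H13N2O9P), with no neutral losses; the function names and the example peptide "GK" are hypothetical and are not taken from the article.

```python
# Illustrative sketch (assumptions: additive masses, no neutral losses).

# Monoisotopic atomic masses in Da.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "P": 30.97376200}

# Monoisotopic amino acid residue masses in Da.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

WATER = 2 * ATOM["H"] + ATOM["O"]   # added once per linear peptide
PROTON = 1.00727646                 # mass of the charge carrier

# Assumed adduct: uridine monophosphate, C9H13N2O9P.
UMP = {"C": 9, "H": 13, "N": 2, "O": 9, "P": 1}


def formula_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(ATOM[element] * count for element, count in formula.items())


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER


def crosslink_mz(sequence, adduct, charge):
    """Expected m/z of the peptide carrying one adduct at the given positive charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral = peptide_mass(sequence) + formula_mass(adduct)
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2):
        print(f"GK + U, z={z}: m/z = {crosslink_mz('GK', UMP, z):.4f}")
```

In practice, the mass of the RNA moiety that remains attached after nuclease digestion can differ from this simple case (for example, through losses of water or phosphate, or through longer oligonucleotides), so a real search would consider a set of candidate adduct masses and not a single value.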

Here we describe in detail how to apply this approach to single recombinant proteins bound to RNA. The proteins described here belong to the recently discovered prokaryotic adaptive immune defense system CRISPR-Cas [26]. In this system, Cas proteins are guided by a CRISPR RNA (crRNA) to target and degrade complementary foreign nucleic acids in a manner that is functionally reminiscent of the eukaryotic RNA interference mechanism [27]. Type I, II and III CRISPR-Cas systems are defined by their signature cas genes (cas3, cas9 and cas10, respectively) and are further divided into subtypes based on the presence of other cas genes [28]. Type I systems and subtypes III-A and III-B form multiprotein RNPs that contain different Cas proteins in addition to Cas3 or Cas10. Type II contains mainly one Cas protein, Cas9, and generates an RNP with two different RNA molecules (crRNA and tracrRNA). Some Cas proteins comprise nuclease domains, distinct helicase domains and also RRM domains that are typical for RNA-binding proteins [29]. The Cas7 family proteins, which form the backbone of the surveillance and effector complexes in Type I and Type III systems, consist of RRMs and belong to the RAMP (repeat-associated mysterious proteins) superfamily [28]. Interestingly, most Cas proteins lack conserved amino acid residues that account for RNA interaction. The diverse peripheral domains of the Cas protein family thus mediate RNA binding.

The Cas proteins that we use to demonstrate our approach are: Type I-A Cas7 from Thermoproteus tenax; Type I-D Cas7 from Thermofilum pendens; and Type III-A Cas7 (Csm3) from Thermus thermophilus. These homologs belonging to the Cas7 protein family were not co-crystallized with their cognate crRNAs. The investigations shown here in detail for Csm3 from T. thermophilus derive from a recent study of the fully assembled CRISPR-Cas Type III-A Csm complex, in which we mapped protein–RNA cross-linking sites on all the proteins within this complex [30].

Section snippets

Experimental procedures

Below we give a detailed protocol for the investigation of the molecular interaction of recombinant RNA-binding proteins with their (cognate) RNA oligonucleotides and of endogenous protein–RNA complexes isolated from prokaryotic cells using UV-induced cross-linking. The protocol allows the mapping of UV cross-linking sites between proteins and RNA at single amino acid and nucleotide resolution. The principle of this approach is that after UV-induced cross-linking of amino acid side chains

Mapping the RNA binding interface in Cas7 proteins

We applied the biochemical, mass spectrometric and computational workflow to map the RNA-binding sites within homologous Cas7 family proteins – T. tenax Cas7, T. pendens Cas7 and T. thermophilus Csm3 – bound to polyU and to crRNA. In vivo, several copies of Cas7 proteins are wrapped around crRNA in a sequence-unspecific helical fashion [5], [30], [50], [51]. Crystal structures from single and complex-bound Cas7 proteins show two composite RNA-binding surfaces: a central cleft and a structurally

Conclusions

We have established a general workflow of UV-induced cross-linking and mass spectrometry for the identification of proteins with their respective peptides and amino acids in contact with RNA. The workflow outlined here proves especially useful when crystal structures or structural models of RNA-binding proteins are available without their cognate RNA. In this case, the cross-linking sites help map the RNA on to the structure of its binding proteins. The given examples of the Cas7 protein

Author contributions

K.S. carried out the protein–RNA cross-linking experiments and data analysis in the lab of H.U. A.H. performed the expression and purification of T. pendens and T. tenax Cas7 proteins in the lab of E.C., using the plasmid constructs provided by A.M. and L.R., respectively. A.H. performed the modeling and superposition for Fig. 3. R.S. purified the endogenous T. thermophilus Type III-A Csm complex in the lab of J.v.d.O. K.K., T.S. and O.K. established the data analysis workflow and provided useful

Acknowledgements

The authors thank M. Raabe and U. Pleßmann for technical assistance, and all members of the Urlaub laboratory and of Forschergruppe 1680 for helpful discussions. This work was supported by the Deutsche Forschungsgemeinschaft [DFG, FOR 1680].

References (60)

  • H. Nishimasu et al., Cell (2014)
  • A. Castello et al., Cell (2012)
  • A.G. Baltz et al., Mol. Cell (2012)
  • C.P. Ponting, Trends Biochem. Sci. (1997)
  • L. Aravind et al., Trends Biochem. Sci. (1999)
  • J. van der Oost et al., Trends Biochem. Sci. (2009)
  • R.H. Staals et al., Mol. Cell (2014)
  • X. Luo et al., Mol. Cell (2008)
  • H. Urlaub et al., J. Biol. Chem. (2000)
  • K. Kramer et al., Int. J. Mass Spectrom. (2011)
  • M.R. Larsen et al., Mol. Cell. Proteomics (2005)
  • C. Rouillon et al., Mol. Cell (2013)
  • M. Spilman et al., Mol. Cell (2013)
  • N.G. Lintner et al., J. Biol. Chem. (2011)
  • M. Hafner et al., Cell (2010)
  • B.M. Lunde et al., Nat. Rev. Mol. Cell Biol. (2007)
  • S. Gerstberger et al., Nat. Rev. Genet. (2014)
  • C.G. Burd et al., Science (1994)
  • R.N. Jackson et al., Science (2014)
  • O. Duss et al., Nature (2014)
  • A.M. Anger et al., Nature (2013)
  • J.R. Stagno et al., Nucleic Acids Res. (2011)
  • H. Christian et al., Nucleic Acids Res. (2014)
  • H. Urlaub et al., Methods Mol. Biol. (2008)
  • M. Scheibe et al., Nucleic Acids Res. (2012)
  • C. Maris et al., FEBS J. (2005)
  • R. Valverde et al., FEBS J. (2008)
  • S.M. Quintal et al., Metall. Integr. Biomet. Sci. (2011)
  • S.D. Jayasena et al., Proc. Natl. Acad. Sci. U.S.A. (1992)
  • H. Hermann et al., EMBO J. (1995)