Hfq Globally Binds and Destabilizes sRNAs and mRNAs in Yersinia pestis

Discovered in 1968 as an Escherichia coli host factor that was essential for replication of the bacteriophage Qβ, the Hfq protein is a ubiquitous and highly abundant RNA-binding protein in many bacteria. With the assistance of Hfq, small RNAs in bacteria play important roles in regulating the stability and translation of mRNAs by base pairing. In this study, we want to elucidate the Hfq-assisted sRNA-mRNA regulation in Yersinia pestis. A global map of Hfq interaction sites in Y. pestis was obtained by sequencing cDNAs converted from the Hfq-bound RNA fragments using UV cross-linking coupled immunoprecipitation technology. We demonstrate that Hfq could bind to hundreds of sRNAs and the majority of mRNAs in Y. pestis. The enriched binding motifs in sRNAs and mRNAs are complementary to each other, suggesting a general base-pairing mechanism for sRNA-mRNA interaction. The Hfq-bound sRNA and mRNA regions were both destabilized. The results suggest that Hfq binding facilitates sRNA-mRNA base pairing and coordinates their degradation, which might enable Hfq to surveil the homeostasis of most mRNAs in bacteria.

FLAG epitope. The experimental Hfq strain expressed Hfq-FLAG from a plasmid in an hfq deletion (Δhfq) genetic background (Hfq-FLAG). A wild-type (WT) strain carrying the FLAG epitope (WT-FLAG) was used as a control. Another control strain (Hfq) was obtained by transforming pHfq into the Δhfq strain. A previous study has shown that the exogenously expressed Hfq-FLAG was functionally competent (38). Using native polyacrylamide gel electrophoresis (PAGE), we found that Hfq-FLAG stably existed as trimers and hexamers in bacterial cells (see Fig. S1A in the supplemental material). Transcriptome profiling of the constructed strains demonstrated a highly correlated expression pattern (R > 0.98) among these three strains (Fig. S1B).
The FLAG-tagged Hfq-RNA complexes were cross-linked by UV irradiation of cultured cells, followed by coimmunoprecipitation with anti-FLAG and partial digestion of unprotected RNA segments by RNase T1 (Fig. S1C). The Hfq-bound RNA segments were purified and ligated with adaptors for sequencing (Fig. S1D). Equal amounts of bacterial cultures from ΔHfq_Hfq-FLAG and WT-FLAG strains were lysed and subjected to strictly parallel immunoprecipitation experiments to obtain CLIP-seq data. Nevertheless, we obtained far fewer cDNAs from the WT-FLAG strain than from the Hfq-FLAG strain, suggesting that the FLAG tag itself contributed little RNA-binding noise and that the coimmunoprecipitation (co-IP) system was successful (Fig. S1D). We obtained 18.1 million Hfq-FLAG-bound RNA tags and 2.1 million FLAG-bound control tags and then mapped them to the Y. pestis 91001 genome (see Table S1 in the supplemental material). The free FLAG control predominantly bound rRNA (86.03%) and tRNA (2.38%), indicating nonspecific binding. The fraction of CLIP-seq reads mapped to the annotated sRNA regions from the Hfq-FLAG strain was 10-fold higher than that of the WT-FLAG control, consistent with the specific sRNA binding activity of Hfq. The increased Hfq binding in sRNA, mRNA, and intergenic regions appeared to be genome-wide rather than restricted to specific genes (Fig. 1A). These results suggest that Hfq selectively binds a large population of sRNAs and mRNAs in bacterial cells.
To identify the Hfq-bound genes from our CLIP-seq data, we normalized the CLIP-seq reads in each gene to those of the nonspecifically bound 23S rRNA gene YP_r2. This rRNA was the most abundant of all identified RNAs in both strains. With twofold enrichment and at least 10 bound reads as thresholds, we obtained a total of 3,331 Hfq-bound RNAs and 864 Hfq-unbound RNAs in Y. pestis (Table S2A and B). All of the 22 rRNA genes and 65 out of 68 tRNA genes were not bound to Hfq. Among the seven annotated sRNAs, six were detected with the CLIP-seq reads, four were identified as Hfq bound, and two were not bound to Hfq (Table S2C). The Hfq-bound sRNAs include the well-studied Spf, CsrB, and SsrS. In previous studies (40,41), Spf sRNA was Hfq bound in both E. coli and Salmonella. Ffs, SsrA, RnpB, and SsrS are not bound by Hfq in E. coli, and CsrB and SsrS are not Hfq bound in Salmonella. The Hfq-unbound sRNAs in Y. pestis included Ffs, SsrA, and RnpB (Table S2C).
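The thresholding described above can be illustrated with a minimal sketch. It assumes that enrichment is the ratio of reference-normalized read counts between the Hfq-FLAG sample and the WT-FLAG control, and it adds a pseudocount of 1 to the control to avoid division by zero; the function name, the pseudocount, and the toy counts are all hypothetical and are not taken from the study.

```python
def classify_genes(ip_counts, ctrl_counts, reference_gene,
                   min_fold=2.0, min_reads=10, pseudocount=1):
    """Label genes 'bound' or 'unbound' from raw read counts.

    Counts are scaled to a reference gene (here the 23S rRNA gene) in
    each library. The pseudocount is an assumption of this sketch.
    """
    ref_ip = ip_counts[reference_gene]
    ref_ctrl = ctrl_counts[reference_gene]
    labels = {}
    for gene, ip_reads in ip_counts.items():
        if gene == reference_gene:
            continue
        ctrl_reads = ctrl_counts.get(gene, 0)
        ip_norm = ip_reads / ref_ip
        ctrl_norm = (ctrl_reads + pseudocount) / ref_ctrl
        enrichment = ip_norm / ctrl_norm
        bound = ip_reads >= min_reads and enrichment >= min_fold
        labels[gene] = "bound" if bound else "unbound"
    return labels


# Toy counts (hypothetical): geneA is enriched, geneB has too few reads,
# geneC shows no enrichment over the control.
ip = {"YP_r2": 1000, "geneA": 50, "geneB": 5, "geneC": 20}
ctrl = {"YP_r2": 1000, "geneA": 9, "geneB": 0, "geneC": 19}
labels = classify_genes(ip, ctrl, reference_gene="YP_r2")
```

Scaling to a nonspecifically bound reference makes the two libraries comparable even though the control yielded far fewer reads overall.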
We showed that 80.5% (3,323 out of 4,128) of all mRNA genes were enriched in the Hfq-FLAG strain. Transcriptome sequencing data from the two experimental Y. pestis strains cultured under the same condition were obtained as another set of controls (Table S1). The CLIP-seq method revealed that Hfq-bound and -unbound genes were well expressed, and it seemed that Hfq-bound genes tend to be clustered in the higher-expressed gene population (Fig. 1B). Hfq-bound genes were enriched in a large array of metabolic pathways, while Hfq-unbound genes were enriched in flagellar assembly, bacterial secretion system, and chemotaxis (Fig. S1D). These results collectively suggested that Hfq binds to most genes important for the exponential growth of Y. pestis, which supports its global and extensive regulatory role.
Comparison between Hfq binding profiles and the corresponding transcriptional profiles by counteracting common depth from each other indicated the binding specificity (Fig. 1C). For example, the CLIP-seq and transcriptome sequencing (RNA-seq) reads peaked at different locations for the previously known Hfq-bound cpxP mRNA. The RNA-seq reads were spread throughout the coding region, while CLIP-seq reads peaked at the 3′ untranslated region (3′UTR) corresponding to the CpxQ sRNA (Fig. 1D, top). The Hfq-bound CpxQ sRNA has been recently reported to play a role in protecting bacteria against inner membrane damage (42). The 5′ leader of rpoS mRNA is located at the 3′ part of nlpD mRNA and is known to be bound by Hfq and DsrA sRNA (17,43). We showed that Hfq has a strong binding peak in the 5′ leader region of rpoS mRNA. Moderate binding peaks in the gene body region and 3′ downstream region were also evident (Fig. 1D, middle). The CLIP-seq and RNA-seq profiles of one Hfq-unbound mRNA, rpoZ, are also shown (Fig. 1D, bottom).
Potential regulatory RNAs predicted from CLIP-seq and RNA-seq. To more globally validate the CLIP-seq results, we performed two independent sets of Hfq RIP-seq experiments with the same strains and growth conditions. A total of 9,186,327 and 26,971,805 clean reads were obtained from the Hfq-FLAG strains. The mapping features of RIP-seq data were similar to those of the CLIP-seq data (Table S1). Moreover, distribution of RIP-seq reads in all genes was more similar among the two repeated experiments and the CLIP-seq data compared to their RNA-seq and WT-FLAG controls ( Fig. 2A and Fig. S2A). When Hfq-bound and -unbound genes were similarly identified from the two sets of RIP-seq data, the results showed that 3,263 (86.85%) Hfq-bound mRNAs and sRNAs overlapped among different immunoprecipitation experiments (Fig. 2B).
In order to better understand the length features of Hfq-bound intergenic and antisense RNAs revealed by CLIP-seq data, longer Hfq-bound RNA segments (with mean insertion size of 150 nucleotides [nt]) were selected for sequencing in RIP-seq experiments shown in this study. Without RNase T1 digestion, RIP-seq methodology preferred longer transcripts and selected against short sRNA transcripts. We found that 67.5% intergenic regions and 70.3% antisense regions showed Hfq-bound evidence from CLIP-seq data, while 49.1% and 30.4% corresponding regions obtained Hfq-bound evidence from one set of RIP-seq data (Fig. S2B). We also found that two out of three Hfq-bound sRNAs from CLIP-seq lost Hfq-bound signals from RIP-seq (Fig. S2C). Compared with the CLIP-seq binding profile, the specifically reduced binding capacity in the intergenic and antisense RNAs, but not mRNAs from RIP-seq libraries, suggests that intergenic and antisense RNAs are generally short transcripts. Their higher Hfq-bound efficiency indicates an unexpected global function in gene regulation.
The 5′ leaders are well known to regulate bacterial gene expression (9). The regulatory role of the 3′ region of mRNA genes has recently been identified (44). We explored the Hfq binding profiles in these two classes of noncoding regions in Y. pestis. Compared to the 5′ leader regions (Fig. 2C, left), we showed that Hfq binding is strongly enriched at the 3′ regions of downstream mRNA genes, which was evident both by CLIP-seq and RIP-seq density (Fig. 2C, right), suggesting a global regulatory role of the 3′ region in Y. pestis.
The mRNA 3′ regions have been reported to encode two major types of sRNAs. Type I is independently transcribed from the 3′ end of an mRNA, and type II is processed by an endonuclease at the 3′ region of the mRNA from a primary transcript (44,45). The four reported type I and II sRNAs from E. coli and Salmonella were examined for their transcripts and Hfq binding profiles in Y. pestis. The host mRNAs of three sRNAs were well expressed in Y. pestis, including the type I MicL in cutC mRNA and the type II SroC and CpxQ sRNAs located at the 3′ ends of gltl and cpxP mRNA. All three of these 3′ sRNAs were bound by Hfq (Fig. 2D and Fig. 1D).
Hfq binding sites and motifs in the coding and noncoding regions. We used a window-based algorithm, calculating read density in adjacent windows and comparing the difference between IP and control, to detect the Hfq-binding and transcriptional peaks from CLIP-seq and RNA-seq, respectively (see Materials and Methods for detailed information). A total of 2,511 and 1,518 peaks were recovered from CLIP-seq and RNA-seq data, respectively (Table S3). The dominant length of Hfq-bound peaks was around 150 nt, and peaks could be as long as 500 nt (Fig. 3A). Such long Hfq-binding regions could result in part from the partial RNase T1 digestion during the procedure. In contrast, the transcript peaks were generally longer than Hfq-bound peaks (Fig. 3A).
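The general idea of a window-based peak caller can be sketched as follows. This is not the algorithm used in the study, whose details are given in Materials and Methods; it is a simplified illustration that assumes per-nucleotide coverage arrays for the IP and control samples on a single strand, fixed nonoverlapping windows, and a fold-change cutoff. The function name, window size, and thresholds are hypothetical.

```python
def call_peaks(ip_cov, ctrl_cov, window=5, min_fold=2.0, min_mean=1.0):
    """Return half-open (start, end) intervals of enriched windows.

    A window is enriched when its mean IP coverage reaches min_mean and
    is at least min_fold times the mean control coverage. Adjacent
    enriched windows are merged into one peak. A trailing partial
    window is ignored.
    """
    n = min(len(ip_cov), len(ctrl_cov))
    peaks = []
    peak_start = None
    peak_end = None
    for start in range(0, n - window + 1, window):
        end = start + window
        ip_mean = sum(ip_cov[start:end]) / window
        ctrl_mean = sum(ctrl_cov[start:end]) / window
        enriched = ip_mean >= min_mean and ip_mean >= min_fold * ctrl_mean
        if enriched:
            if peak_start is None:
                peak_start = start
            peak_end = end
        elif peak_start is not None:
            peaks.append((peak_start, peak_end))
            peak_start = None
    if peak_start is not None:
        peaks.append((peak_start, peak_end))
    return peaks


# Toy coverage (hypothetical): a 10-nt block of IP signal on a flat control.
ip_cov = [0] * 5 + [10] * 10 + [0] * 5
ctrl_cov = [1] * 20
peaks = call_peaks(ip_cov, ctrl_cov, window=5)
```

With real data the window size would be chosen relative to the expected fragment length, and both strands would be processed separately.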
Theoretically, CLIP-seq peaks indicate the Hfq-bound regions, while RNA-seq peaks indicate the steady-state transcripts. The latter are expected to cover the former. We then selected Hfq-bound peaks containing at least fourfold more CLIP-seq reads than RNA-seq reads, resulting in 1,499 qualified peaks. These peaks were defined as strong peaks, and the other peaks from CLIP-seq were defined as weak peaks. Among the strong peaks, 1,168 overlapped the known genes, 131 overlapped the antisense strands, and 200 fell in the intergenic regions (Fig. 3B), showing that Hfq has a larger tendency to associate with the noncoding regions, including both the intergenic and antisense regions. The selection criteria are quite strict, as reflected by the loss of four of the five Hfq-bound sRNAs and all three Hfq-bound tRNAs identified above (Fig. 1). This strict selection should allow us to explore the reliable binding features, particularly the binding motifs, of Hfq in the Y. pestis transcriptome.
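The strong/weak split can be written as a short filter. The sketch below assumes that the CLIP-seq and RNA-seq read counts per peak are already on a comparable scale, and it treats the fourfold criterion as an inclusive threshold; the peak names, the counts, and the function name are hypothetical.

```python
def split_peaks(peaks, min_ratio=4.0):
    """Split CLIP peaks into strong and weak by CLIP/RNA-seq read ratio.

    `peaks` is an iterable of (name, clip_reads, rna_reads) tuples.
    A peak is strong when clip_reads >= min_ratio * rna_reads, which
    also handles peaks with no RNA-seq reads.
    """
    strong, weak = [], []
    for name, clip_reads, rna_reads in peaks:
        if clip_reads >= min_ratio * rna_reads:
            strong.append(name)
        else:
            weak.append(name)
    return strong, weak


# Toy peaks (hypothetical read counts).
example = [("peak1", 400, 50), ("peak2", 120, 100), ("peak3", 30, 0)]
strong, weak = split_peaks(example)
```

Writing the test as a product rather than a quotient avoids a division by zero for peaks that lack RNA-seq coverage.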
We used Homer software (46), well suited for finding motifs in large-scale genomics data, to recover highly represented Hfq binding motifs from these three different classes of peaks. These cellular motifs harbor all three known types of motif sequences, including poly(U), A-rich, and UA-rich bound on the proximal, distal, and rim surfaces of Hfq, respectively. Hfq-bound RNA motifs in Y. pestis were conserved in short motif sequence composition but quite flexible in motif organizations. For example, the top motif AAUAA was highly represented in mRNAs, intergenic RNAs, and antisense RNAs ( Table 1). The two conserved nucleotides preceding this motif were AG(C) in mRNAs, AG in intergenic sRNAs, and UA in antisense sRNAs. The resulting motif composition contained a combination of ARN and UAA motifs in mRNAs and intergenic sRNAs, and two UAA motifs in antisense sRNAs. Moreover, our results revealed a previously unrecognized G-rich motif. The GGGGAUU motif was highly represented in Hfq-bound mRNAs and intergenic sRNAs, but not in antisense sRNAs. The G-rich motif might contact Hfq at the distal face as the ARN motif does (20).
As a conserved sequence component of the rho-independent terminator, poly(U) is a hallmark of the Hfq-bound sRNAs. We found that the U6 stretch motif was present in 57.7% of the intergenic peaks (Table 1). As expected, the U6 stretch was preferentially located at the 3′ ends of strong Hfq-bound peaks (Fig. 3C, left). No such enrichment was observed for RNA-seq peaks (Fig. 3C, left). A population of Hfq-bound mRNA peaks also contained the U6 motif at the 3′ end (Fig. 3C, right). Such a U6 stretch enrichment at the 3′ end was not as evident for the antisense RNAs (Fig. S3A). It is noteworthy that the U5 motif occurred at a much higher frequency with a pattern similar to the U6 stretch (Fig. S3B to D). The presence of the poly(U) motif at the 3′ end of Hfq-bound mRNA suggests that Hfq could use its proximal surface to contact mRNA as well, consistent with the recently identified class of sRNAs located in the 3′ regions of mRNAs.
In addition, we analyzed the distribution of the top three motifs on mRNAs harboring strong Hfq-bound sites. The AG/CAAUAA motif was found most often at the 5′ and 3′ ends of the target mRNAs, while the other two highly represented motifs, CUUGGG and GGGAUU, were present in the body regions of mRNAs (Fig. 3D, left panels). We wondered whether the motif selection was caused by Hfq selection. Analysis of the sequence composition of all mRNAs from both Y. pestis and E. coli showed that the above motif patterns were true for all mRNAs (Fig. 3D, middle and right panels). Therefore, the location specificity of these Hfq-bound mRNA motifs should not be caused by selection of Hfq binding; instead, it is an intrinsic feature of bacterial mRNA structure. Nevertheless, we noticed that all classes of motifs located at the 5′ end were more selected than those located at other regions (Fig. 3D, left and middle panels). We found that 53.64% and 70.83% of mRNAs from Y. pestis and E. coli, respectively, contain an AGAAUAA, CUUGGG, or GGGAUU motif. Of Hfq-bound mRNAs, 71.82% contain at least one of these top motifs. AGAAUAA motifs at the 5′ and 3′ termini were present at similarly high frequencies in both Y. pestis and E. coli.
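The motif-containing fractions reported above correspond to a simple count over sequences. The following sketch assumes exact string matching of the three motifs named in the text and accepts either DNA or RNA input; the function name and the toy sequences are hypothetical, and exact matching is a simplification of the degenerate consensus motifs reported by a motif finder.

```python
TOP_MOTIFS = ("AGAAUAA", "CUUGGG", "GGGAUU")


def fraction_with_motif(sequences, motifs=TOP_MOTIFS):
    """Fraction of sequences containing at least one motif (exact match)."""
    if not sequences:
        return 0.0
    # Normalize to uppercase RNA so DNA input (T) matches RNA motifs (U).
    normalized = [s.upper().replace("T", "U") for s in sequences]
    hits = sum(any(m in s for m in motifs) for s in normalized)
    return hits / len(normalized)


# Toy sequences (hypothetical): two of four contain a top motif.
toy = ["AAAGAAUAAC", "CCCC", "gggatt", "ACGU"]
fraction = fraction_with_motif(toy)
```

Applying the same function to all annotated mRNAs and to the Hfq-bound subset gives the background and target fractions that the comparison in the text relies on.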
The distinct sRNA profile between CLIP-seq and RNA-seq. We wanted to identify Y. pestis sRNAs to further understand the binding features of Hfq-sRNA by using the RNA-seq data obtained from the same bacterial strains and similar culture conditions as for generating CLIP-seq data. Although previous studies have identified hundreds of

TABLE 1
Top three consensus motifs generated from three kinds of peaks bound by Hfq^a
^a The percentages of target or background represent the detection ratio (as a percentage) of the motifs in Hfq-bound peaks or simulated background peaks from randomly selected genomic sequences, respectively.
We predicted sRNAs from RNA-seq data in intergenic and antisense regions by using the peak calling algorithm described above. We identified 250, 315, and 238 transcriptional peaks in the Hfq-FLAG strain, WT-FLAG strain, and Hfq strain, respectively (Table S3). These peaks strongly overlapped each other; 178 of them shared more than 80% of their sequence and were considered the same sRNAs (Fig. 4A). After merging, 373 sRNA transcripts were identified from all three strains. Among the intergenic sRNAs, only 40 harbored a canonical terminator within 150 nt of their 3′ ends, while 85 harbored a canonical promoter within 150 nt of their 5′ ends (Fig. S3F). Among them, 12 harbored both the terminator and promoter. More than 40% of the previously identified Yersinia sRNAs from different studies were found among these 373 sRNAs (Table S4).
We also predicted 456 qualified intergenic and antisense peaks bound by Hfq from CLIP-seq data (Table S3), with a shorter length distribution than that from RNA-seq (Fig. 4B, P value = 2.42e−9 by t test). Compared to the RNA-seq sRNA peaks, Hfq-bound sRNA peaks were closer to canonical transcription terminators, and most of them were located downstream of the predicted terminators (Fig. 4C). When we analyzed whether Hfq-bound sRNA peaks and RNA-seq sRNA peaks overlapped by setting 1-nucleotide overlap as a criterion, i.e., genomic overlap of ≥1 nt, about two-thirds of Hfq-bound sRNA peaks did not overlap with RNA-seq sRNA peaks (Fig. 4D). These results revealed inconsistent features of the peaks predicted from CLIP-seq and RNA-seq data, which led to the hypothesis that Hfq binding may induce destabilization of sRNAs and mRNAs, rendering Hfq-bound sRNA regions less detectable than the unbound regions by the RNA-seq approach.
RNA segments downstream of Hfq-bound sites in both sRNAs and mRNAs were destabilized. To further explore the above hypothesis, we separated sRNA peaks into three different classes. Stable non-Hfq-bound sRNA peaks (type I) have RNA-seq peaks only and have 156 sRNA members. Unstable Hfq-bound sRNA peaks (type II) have Hfq-bound peaks only and have 361 members. Stable Hfq-bound sRNA peaks (type III) have both Hfq-bound and RNA-seq peaks that overlapped by at least one nucleotide and have 93 members (Table S5). Note that we describe sRNA peaks rather than whole sRNA transcripts here, and that we were detecting the peaks from the same sRNA by the RNA-seq and CLIP-seq approaches. We found that these three classes of sRNA peaks have very distinct transcript abundances recovered from RNA-seq data (Fig. 5A). We plotted the length distribution of these three classes of peaks, showing that the overlapped peaks were generally longer than the nonoverlapped peaks (Fig. S4A, P value < 0.001 by t test).
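The three-way classification can be expressed as an interval-overlap test. The sketch below assumes half-open (start, end) coordinates on a single strand of one replicon and uses the ≥1-nt overlap criterion stated in the text; the function names and the toy coordinates are hypothetical.

```python
def overlaps(a, b, min_overlap=1):
    """True if half-open intervals a and b share at least min_overlap nt."""
    return min(a[1], b[1]) - max(a[0], b[0]) >= min_overlap


def classify_srna_peaks(rna_peaks, clip_peaks):
    """Group peaks into the three classes described in the text.

    type I:   RNA-seq peak with no overlapping CLIP peak
    type II:  CLIP peak with no overlapping RNA-seq peak
    type III: (RNA-seq peak, CLIP peak) pairs that overlap
    """
    type1 = [r for r in rna_peaks
             if not any(overlaps(r, c) for c in clip_peaks)]
    type2 = [c for c in clip_peaks
             if not any(overlaps(c, r) for r in rna_peaks)]
    type3 = [(r, c) for r in rna_peaks for c in clip_peaks
             if overlaps(r, c)]
    return type1, type2, type3


# Toy peaks (hypothetical coordinates).
rna_peaks = [(0, 100), (200, 300)]
clip_peaks = [(90, 150), (400, 450)]
type1, type2, type3 = classify_srna_peaks(rna_peaks, clip_peaks)
```

The all-against-all comparison is quadratic in the number of peaks, which is acceptable for a few hundred peaks; sorted intervals or an interval tree would be preferable for larger sets.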
We then plotted CLIP-seq and RNA-seq reads around the center of strong Hfq-bound sRNA peaks to study the RNA abundance around Hfq-bound sites. Interestingly, the distribution of RNA-seq reads inside and downstream of Hfq-bound sites strongly declined compared with that of the upstream region (Fig. 5B), which supported the hypothesis of Hfq-induced destabilization of sRNA segments downstream of the Hfq-bound sites. When we plotted CLIP-seq and RNA-seq reads around the center of nonstrong intergenic CLIP peaks, a similar decline in the abundance of RNA-seq reads was observed downstream of the Hfq-bound sites (Fig. 5C). Interestingly, a highly abundant transcript peak upstream of the Hfq-binding center was observed for nonstrong intergenic CLIP peaks, at a distance of about 120 nt (Fig. 5C).
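A read-density profile centered on binding sites can be computed by averaging coverage in a fixed window around each peak center. This sketch assumes a single per-nucleotide coverage list on one strand and skips centers whose window would run off either end; the function name, the flank size, and the toy coverage are hypothetical.

```python
def meta_profile(coverage, centers, flank=2):
    """Mean coverage at offsets -flank..+flank around each center.

    Centers too close to either end of `coverage` are skipped so that
    every retained window has the full 2 * flank + 1 positions.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for center in centers:
        if center - flank < 0 or center + flank >= len(coverage):
            continue
        for offset in range(-flank, flank + 1):
            totals[offset + flank] += coverage[center + offset]
        used += 1
    if used == 0:
        return totals
    return [value / used for value in totals]


# Toy coverage (hypothetical): signal drops just downstream of each center.
coverage = [5, 5, 5, 5, 1, 1, 1, 5, 5, 5, 5, 1, 1, 1]
profile = meta_profile(coverage, centers=[3, 10], flank=2)
```

For peaks on the reverse strand the window would need to be flipped before averaging so that "downstream" has the same meaning for every site.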
The distributions of Hfq-bound cDNA reads and transcript cDNA reads in individual sRNA peaks were plotted, showing examples of the three classes (Fig. S4B). We also analyzed the stability of these sRNAs in response to Hfq deletion. Northern blot analysis of sRNAs in the WT and ΔHfq strains (Fig. S4B) showed that the knockout of Hfq decreased the stability of almost all Hfq-bound sRNAs, regardless of their differential stability in the Hfq+ strain. In contrast, the abundance of all non-Hfq-bound sRNAs was not affected by Hfq deletion (Fig. S4B). These results are consistent with a model where the destabilization of Hfq-bound sRNA segments depends on their base pairing with target mRNAs facilitated by Hfq binding (32). The relationship between Hfq binding and sRNA destabilization suggested that the stable non-Hfq-bound sRNAs may lack Hfq-binding motifs. Analysis of the overrepresented motifs in all 373 sRNA peaks identified from RNA-seq revealed the lack of typical Hfq-binding motifs in sRNAs (Fig. S4C).
Destabilization of Hfq-bound sRNAs could have resulted from coupled degradation of a sRNA and its mRNA targets (32). The transcript abundance of the three classes of mRNA peaks were similar to those of sRNAs (Fig. 5D). We then analyzed the distribution of transcript reads around the center of Hfq-bound sites from strong and weak CLIP peaks recovered from mRNA regions. Strong Hfq binding correlated with the destabilization of the downstream mRNA segments, highly similar to that of sRNAs (Fig. 5E). The RNA-seq read distribution upstream of weak Hfq binding sites was almost the same as that of sRNAs, and the downstream destabilization was also evident (Fig. 5F).
In light of the proposed mechanism of Hfq-facilitated sRNA-mRNA degradation, we explored the relationship between Hfq binding of mRNAs and their stability. The cumulative abundance of mRNAs displaying strong or weak Hfq peaks was plotted, showing that mRNAs with weak peaks were more abundant than those with strong peaks (P value = 0.01 by the Kolmogorov-Smirnov [K-S] test; Fig. 5G). Genes displaying non-Hfq-bound peaks were generally expressed at lower levels than those showing Hfq-bound peaks (P value < 2.2e−16 by K-S test; Fig. 5G).
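The two-sample K-S comparison of abundance distributions rests on the maximum distance between two empirical cumulative distribution functions. The sketch below computes only that statistic, not the P value, using the standard library; the function name and the toy abundance values are hypothetical. In practice a statistics package would be used to obtain the P value as well.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in sorted(set(a) | set(b)):
        ecdf_a = bisect_right(a, x) / len(a)
        ecdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(ecdf_a - ecdf_b))
    return largest_gap


# Toy abundances (hypothetical): fully separated versus identical samples.
separated = ks_statistic([1, 2, 3], [4, 5, 6])
identical = ks_statistic([1, 2, 3], [1, 2, 3])
```

A statistic near 1 means the two abundance distributions barely overlap, and a statistic near 0 means they are nearly indistinguishable.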

Hfq regulates the processing and/or stability of Hfq-bound sRNAs. Several Hfq-binding sRNAs, including GlmZ and ArcZ, are reported to undergo RNase E processing (36), while GcvB contains two transcriptional termination sites to produce two sRNA isoforms (51–55). The proposed secondary structure of Y. pestis GlmZ is almost identical to that of E. coli (Fig. 6A), while that of ArcZ was strikingly different (Fig. 6B) (55). To provide processing evidence of Hfq-bound sRNAs, we recovered the 5′-end position and density of transcripts from Hfq-FLAG strains using a high-throughput sequencing method (see Materials and Methods for detailed information). The RNase E processing sites on GlmZ and ArcZ were readily recognized from the Hfq binding density map: the CLIP-seq read density cleft for GlmZ (Fig. 6C) and the site indicating a sharp read density switch for ArcZ (Fig. 6D). The two sRNA isoforms indicative of cleaved product and uncleaved precursor RNA for both GlmZ and ArcZ were detected by Northern blot analysis (Fig. 6C and D, insets).
Knocking out Hfq changed the ratio between the cleaved products and precursors (Fig. 6C and D), suggesting that Hfq may regulate the processing of these two sRNAs. However, we could not exclude the possibility that Hfq may affect the stability of the cleaved products and precursors differentially. In both the GlmZ and ArcZ cases, processing resulted in a U-rich 5′ end (Fig. 6A and B), which was consistent with the cleavage feature of RNase E (36). Interestingly, the GcvB sRNA did not display two transcript isoforms, although its sequence is highly similar to that of E. coli and we detected three dominant 5′ sites (Fig. 6E). Ffs also had only one transcript isoform (Fig. 6F). Hfq regulation of sRNA stability was also found for three other sRNAs, sR128, sR142, and sR132, whose isoform ratios changed upon Hfq deletion (Fig. S4B).

DISCUSSION
Sm proteins are a family of small proteins that assemble the core components of the U1, U2, U4, and U5 snRNPs, and therefore are central for eukaryotic pre-mRNA splicing (56). Lsm proteins containing the "Sm motif" often function in eukaryotic mRNA decapping and decay (57,58). Hfq has been known for more than a half century and is a typical LSm protein (3). By cooperation with diverse sRNAs, Hfq has been shown to play a key role in degrading bacterial mRNAs. The process involves the recruitment of RNase E, a key member of RNA degradosome (7,30). Decay of mRNA can be either coupled with sRNA or not (4,5,59). There are several fundamental questions waiting to be addressed in bacteria, including the following. (i) How many sRNAs and mRNAs are contacted by Hfq? (ii) How do the different Hfq surfaces contact sRNA and mRNA in bacteria? (iii) How does the Hfq binding contribute to sRNA and mRNA base pairing and their decay in bacteria? In this study, we obtained Hfq-bound RNAs by using both CLIP-seq and RIP-seq techniques. By setting RNA-seq data as controls and developing proper algorithm to analyze the genome-wide sequencing data, we were able to address these three questions to a good depth and propose a model for Hfq binding and facilitation of sRNA-mRNA-coupled degradation in bacteria.
Hfq extensively binds mRNAs. Hfq is known to play a role in mRNA degradation in E. coli. It interacts with poly(A) polymerase I and is used for substrate recognition by binding to rho-independent terminators (60). It is believed to destabilize that structure and allow polyadenylation to occur. Although several coimmunoprecipitation studies have revealed that Hfq binds hundreds of mRNAs and tens of sRNAs in both E. coli and Salmonella (28,29,40,41,61,62), more information on Hfq-bound RNAs in other bacteria is needed to better understand Hfq actions. We expressed Hfq-FLAG protein in an hfq knockout background in Y. pestis and obtained high-quality CLIP-seq and RIP-seq data. By using non-Hfq-bound 23S rRNA as a control, we found that thousands of expressed mRNA genes (~80%) showed Hfq binding density above background. These results lead to a hypothesis that Hfq might control the stability of most mRNAs with its sRNA partners.
Hfq flexibly contacts sRNA and mRNAs with multiple surfaces: formation of Hfq-sRNA-mRNA complex. In vitro studies have revealed that Hfq uses its proximal face to bind poly(U) sequence in sRNAs and its distal and rim surfaces to contact A-rich and UA-rich sequences in sRNA and mRNAs (11). Hfq-bound RNA motifs from our CLIP-seq data revealed a comprehensive Hfq binding strategy in cells. In addition to A-rich and UA-rich motifs, Hfq-bound G-rich and UG-rich motifs have been identified in mRNAs of Y. pestis. These motifs mirror A-rich and UA-rich sequences and may be contacted by the distal and rim surfaces of Hfq. For sRNAs, the top three motifs were characterized either by the canonical terminator sequence containing the U6 stretch motif preceded by the GC-rich sequence (Table 1) or by two other motifs that are G rich or A rich.
The in vivo Hfq motifs are comprised of different known short motifs, enabling an Hfq hexamer to use different surfaces to recognize and effectively contact a RNA sequence. The combinatory organization of different motif blocks could allow a specific RNA sequence to contact multiple faces of an Hfq hexamer. This organization could have additional advantages in the assembly of the Hfq-RNA complex. Increasing numbers of sRNAs have been proved to simultaneously act on multiple mRNAs. Likewise, many mRNA transcripts are emerging as shared targets of multiple cognate sRNAs. Since Hfq levels are assumed insufficient relative to RNA species, RNA is shown to actively cycle by competition for the access to Hfq (10,63).
The findings presented here expand our understanding of the dynamics and efficiency of Hfq binding in mRNAs. The results presented in this study suggest that the Hfq-sRNA complexes could select their targets in a relatively flexible way. For example, the ubiquitous U6 stretch of many sRNAs can base pair with many mRNAs containing A-rich or G-rich motifs at the terminal parts or the body regions. The Hfq-sRNA complex interacts with the translationally inactive and/or repressed mRNAs, which enables the formation of intermolecular base pairing between sRNAs and mRNAs and an increase in the local concentration of RNase E for cleavage of the sRNA-mRNA duplex (Fig. 7).
Hfq-bound sRNAs were generally unstable: a comprehensive list of Y. pestis sRNAs. We have demonstrated that Hfq binding of sRNAs is more complex than expected. First, the sRNAs predicted from RNA-seq data represent only a subpopulation of sRNAs, not all of the sRNAs. Second, a large population of sRNAs are unbalanced in their stability, with the 5′ portion being more stable than the 3′ portion, largely due to Hfq binding. Therefore, some sRNAs predicted from transcriptome reads may be shorter than the full transcripts. Finally, Hfq binding is associated with sRNA degradation under normal growth conditions where transcription is active.
In summary, sRNAs are highly dynamic in their transcription and degradation. Identification of full-length sRNA genes is challenging. The challenge is further complicated by the lack of canonical terminators in many sRNAs and the presence of sRNAs overlapping the 3′ ends of mRNA genes (42,44,64). In this study, we generated a comprehensive sRNA list comprising about 700 members encoded by avirulent Y. pestis strains; only a small fraction of these have been identified before. This list does not include those that overlapped the 3′ region of mRNA genes. About 363 Hfq-bound sRNAs that are difficult to identify by transcriptome sequencing were identified. We therefore propose that Hfq binds hundreds of sRNAs, which could be involved in controlling the stability and translation of most, if not all, mRNAs in Y. pestis.
Hfq-bound sites define two positions for the coupled degradation of the base-paired sRNA-mRNA: a general mechanism for cellular mRNA surveillance. The mechanism of the coupled degradation of the sRNA-mRNA complex appears feasible for a quick response to environmental change by bacterial cells (7) and was proposed in 2003 (32), but there is little direct evidence supporting it (4,7,30). In this study, genome-wide analysis of multiple classes of sRNAs and mRNAs with respect to their Hfq-binding capability allows us to comprehensively revisit this issue. We have demonstrated that sRNA segments at Hfq-bound sites and downstream of these sites are globally unstable. The mRNAs displaying strong Hfq-bound peaks showed a pattern similar to that of Hfq-bound sRNAs. However, mRNAs displaying nonstrong Hfq-bound peaks showed that the Hfq-binding sites are protected, while the downstream segments are destabilized. In light of the RNase E function in Hfq-sRNA-mediated RNA degradation (30), we propose that Hfq-bound sites render two positions for RNase E entry, which will result in degradation of the mRNA segment downstream of Hfq binding sites via a 5′-P-dependent RNase E degradation pathway (Fig. 7). This hypothesis is in line with the current knowledge of Hfq in regulating RNA stability via interaction with poly(A) polymerase I (65). The coupled degradation of both mRNA and sRNA in the Hfq-bound sites leads to either the direct degradation of the 5′-P-containing sRNAs or recycling of cleaved sRNAs containing either 5′-P or 5′-PPP by RNase E cleavage. Recycling of cleaved sRNAs explains the lack of protected Hfq-bound sites in sRNAs. The U-rich binding sites of Hfq also suggest that degradation can occur when Hfq binds to rho-independent terminators to trigger polyadenylation (60).
Interestingly, we showed that Hfq-bound AGAAUAA motifs are located at both the 5= and 3= termini of Y. pestis and E. coli mRNAs, while CUUGGG and GGGAUU are located at the body regions of mRNAs. All these sequence motifs are partially complementary to U-rich sequence in sRNAs. Given the large diversity of sRNA sequences, it is not surprising that the Hfq-sRNA complex has a chance to bind most of cellular mRNAs and to mediate their degradation when they are not effectively translated. Although there are Hfq-bound mRNA motifs, we find that Hfq binding of mRNAs lacks selectivity because the motifs were intrinsic features of mRNAs. The lack of binding site selection supports the RNA chaperone Hfq surveilling the RNA homeostasis of the whole bacterial transcripts via the cooperation with its partner sRNAs.

MATERIALS AND METHODS
Construction of FLAG-tagged plasmids. By using a fusion PCR protocol, oligonucleotides encoding the 3×FLAG affinity tag (DYKDHDGDYKDHDIDYKDDDDK) were added before the TAA termination codon of the hfq gene to construct the C-terminally FLAG-tagged plasmids. A fragment covering a region of 298 nucleotides (nt) upstream, the entire hfq gene followed by 3×flag, and 176 nt downstream was cloned into the multiple cloning site of plasmid pACYC184, designated pHfq-FLAG. The other fragment spanning a region of 298 nt upstream, the first 21 nt of the hfq gene followed by 3×flag, and 176 nt downstream was also introduced into pACYC184, designated FLAG. The complementary plasmid was constructed by inserting a PCR fragment covering a region from the 300-bp fragment upstream to 200 bp downstream of the hfq gene into pACYC184, designated pHfq. The inserts mentioned above were cloned into pACYC184 via BamHI and XbaI/EcoRV restriction sites. The list of oligonucleotide primers is shown in Table S6 in the supplemental material.
Bacterial strains and growth conditions. Y. pestis wild-type strain 201 belongs to a newly established Y. pestis biovar, microtus, which is avirulent in humans but highly lethal in mice. The hfq deletion strain (Δhfq) was generated by λ-Red homologous recombination as previously described (38). The Y. pestis Hfq-FLAG, Hfq, and WT-FLAG strains were constructed by transforming pHfq-FLAG and pHfq into the Δhfq strain and FLAG into the WT strain, respectively. Bacteria were grown in brain heart infusion (BHI) broth (Difco) supplemented with appropriate antibiotics overnight at 26°C with shaking at 200 rpm until exponential growth phase (optical density at 620 nm [OD620] of 0.8). Bacterial growth was stopped by centrifugation for 6 min at 5,000 rpm at 4°C. The pellets were frozen in liquid nitrogen and stored at −80°C until the cells were lysed. Western blotting was performed by using a monoclonal anti-FLAG antibody (Sigma) to detect the FLAG-tagged proteins.
RNA-seq, CLIP-seq, and RIP-seq. For RNA-seq, total RNAs were extracted from the Y. pestis Hfq-FLAG, WT-FLAG, and Hfq strains mentioned above by using TRIzol reagent (Invitrogen). For CLIP-seq, the two FLAG-carrying strains (Hfq-FLAG and WT-FLAG), grown under the same conditions, were collected and resuspended in 10 mM Tris-HCl (pH 8.0). The pellets were dispersed on a petri dish and irradiated uncovered with 400 mJ/cm² of UV light at 254 nm to form cross-linked RNA-protein complexes. Bacterial cells were collected, lysed in RIP lysis buffer (1× phosphate-buffered saline [PBS], 0.1% SDS, 0.5% NP-40, and 0.5% sodium deoxycholate), and subjected to coimmunoprecipitation (co-IP). Co-IP was conducted to isolate the FLAG-bound and Hfq-FLAG-bound RNA by using an anti-FLAG antibody according to the manufacturer's instructions for the RNA-binding protein immunoprecipitation kit (Millipore). Briefly, the lysate was centrifuged at 12,000 × g at 4°C for 10 min. The cleared lysate was mixed with 1.0 ml of bead-antibody complex in RIP immunoprecipitation buffer and incubated at 4°C for 3 h on a rotator. The immunoprecipitation tubes were centrifuged briefly and placed on the magnetic separator, and the supernatant was discarded. The anti-FLAG beads were then washed a total of six times with 0.5 ml of RIP wash buffer and digested with RNase T1. The immunoprecipitated RNA fragments were radiolabeled using polynucleotide kinase (PNK) and separated by SDS-PAGE. The bands corresponding to the size of the Hfq protein were cut out and purified. The cross-linked RNA-protein complexes were digested with proteinase K at 55°C for 30 min. RNA was extracted using TRIzol and phenol-chloroform, followed by isopropanol precipitation. The purified RNA was treated with DNase I (Promega) and sequenced using the Illumina/Solexa RNA-sequencing protocol.
For RIP-seq, 500 μl of lysate was incubated with 10 μg of anti-FLAG antibody or control IgG antibody overnight at 4°C. The immunoprecipitates were further incubated with protein A Dynabeads for 1 h at 4°C. After the tubes were placed on a magnet and the supernatants were removed, the beads were sequentially washed twice each with lysis buffer, high-salt buffer (250 mM Tris [pH 7.4], 750 mM NaCl, 10 mM EDTA, 0.1% SDS, 0.5% NP-40, and 0.5% deoxycholate), and PNK buffer (50 mM Tris, 20 mM EGTA, and 0.5% NP-40). The immunoprecipitates were eluted from the beads with elution buffer (50 mM Tris [pH 8.0], 10 mM EDTA, and 1% SDS), and the RNA was purified with TRIzol reagent (Life Technologies).
Purified RNAs were ion fragmented at 95°C, followed by end repair and 5′ adaptor ligation. Reverse transcription was then performed with a reverse transcription (RT) primer harboring the 3′ adaptor sequence and a randomized hexamer. The cDNAs were purified and amplified, and PCR products corresponding to 200 to 500 bp were purified, quantified, and stored at −80°C until used for sequencing.
For high-throughput sequencing, the libraries were prepared following the manufacturer's instructions and applied to an Illumina GAIIx system for 80-nt single-end sequencing by ABLife Inc., Wuhan, China.
given gene set, and all genes were regarded as background. R software was used for all statistical significance analyses and hypothesis tests reported in this article.
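The enrichment test itself is not named in the text above. Assuming a one-sided hypergeometric (over-representation) test with all genes as the background, which is a common choice for gene set enrichment, the calculation can be sketched as follows. The analysis in this study was done in R; this Python function, its name, and its parameter names are hypothetical and serve only as an illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_category: int,
                           n_selected: int, n_overlap: int) -> float:
    """Upper-tail probability P(X >= n_overlap).

    X is the number of category genes found when n_selected genes are drawn
    without replacement from n_background genes, of which n_category belong
    to the functional category being tested.
    """
    upper = min(n_category, n_selected)
    total = comb(n_background, n_selected)
    # Sum the hypergeometric probabilities for every overlap at least as
    # large as the observed one; comb() returns 0 for impossible draws.
    tail = sum(
        comb(n_category, k) * comb(n_background - n_category, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (not data from this study): 10 background genes, 4 of them in
# the category, 3 genes in the set of interest, 2 of which are in the category.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

With these toy numbers, the tail probability is 40/120, or 1/3. In a real analysis, the resulting P values would normally also be corrected for multiple testing across categories.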
Data availability. The RNA-seq, CLIP-seq, and RIP-seq data reported in this paper have been deposited in NCBI GEO under accession number GSE77555.

ACKNOWLEDGMENTS
We thank ABLife Group for technical support in generating and graphing the sequencing data in this study.
This study was funded in part by the National Natural Science Foundation of China