Tombusvirus p19 captures RNase III-cleaved double-stranded RNAs formed by overlapping sense and antisense transcripts in E. coli

Antisense transcription is widespread in bacteria. By base pairing with overlapping sense RNAs, antisense RNAs (asRNA) can form long double-stranded RNAs (dsRNA), which are cleaved by RNase III, a dsRNA endoribonuclease. Ectopic expression of plant tombusvirus p19 in E. coli stabilizes ~21 bp dsRNA RNase III decay intermediates, which enabled us to characterize otherwise highly unstable asRNA by deep sequencing of p19-captured dsRNA and total RNA. dsRNA formed at most bacterial genes in the bacterial chromosome and in a plasmid. The most abundant dsRNA clusters were mostly formed by divergent transcription of sense and antisense transcripts overlapping at their 5’-ends. The most abundant clusters included small RNAs, such as ryeA/ryeB, 4 toxin-antitoxin genes, and 3 tRNAs, and some longer coding genes, including rsd and cspD. The sense and antisense transcripts in abundant dsRNA clusters were more plentiful and had longer half-lives in RNase III mutant strains, suggesting that formation of dsRNAs promoted RNA decay at these loci. However, widespread changes in protein levels did not occur in RNase III mutant bacteria. Nonetheless, some proteins involved in antioxidant responses and glycolysis changed reproducibly. dsRNAs accumulated in bacterial cells lacking RNase III, increasing in stationary phase, and correlated with increased cell death in RNase III mutant bacteria in late stationary phase. The physiological importance of widespread antisense transcription in bacteria remains unclear but it may become important during environmental stress. Ectopic expression of p19 is a sensitive method for identifying antisense transcripts and RNase III cleavage sites in bacteria.

Introduction
Endogenous antisense RNAs (asRNA) are products of DNA-dependent RNA polymerase initiated from antisense promoters that overlap at least partially with any coding or functional RNA (sense RNA). The overlapping regions of sense and antisense RNAs are fully complementary, so they have the potential to form perfectly matched double-stranded RNAs (dsRNA). Usually asRNA are much less abundant than the corresponding sense RNA. Next generation sequencing has identified many new species of asRNA (1, 2), but their biological significance is not well understood.

Examples of RNA regulation of gene expression in bacteria have been described that involve a variety of small non-coding RNAs and asRNA (3-6). asRNA and RNase III regulate gene expression of plasmids and toxins. The E. coli ColE1 plasmid replication origin encodes two non-coding RNAs: RNA II, which serves as a DNA replication primer, and RNA I, which is shorter and fully complementary to the 5' portion of RNA II (7-10). RNA I inhibits plasmid replication by binding to the RNA II plasmid replication primer. Other well-known asRNA-regulated systems are the Type I Toxin-Antitoxin (TA) genes (11). In Type I TA systems, a small RNA gene lies opposite to, but overlapping with, a gene encoding a toxic peptide. The small asRNA inhibits the expression of the toxin by at least partially base pairing with the toxin RNA. Examples of asRNA-mediated toxin regulation systems include hok/sok in the R1 plasmid (12,13) and ldrD/rdlD in the E. coli genome (14). RNase III, an endonuclease that cleaves dsRNAs (15) to generate 5'-phosphate and 3'-hydroxyl termini, leaving a characteristic 3' 2 nucleotide (nt) overhang (16,17), regulates both the plasmid replication system (10) and the Type I TA systems (18,19). Exhaustive digestion of dsRNAs by RNase III produces small dsRNAs of about 14 bp.

Bacterial genomes produce many asRNAs from protein coding genes. Using a whole-genome tiling microarray, the Church group discovered that a large percentage of the E. coli genome is transcribed in both directions (20). Multiple groups subsequently used deep sequencing to study the transcriptome of bacterial genomes (6). Lasa [...] in Staphylococcus aureus (2). Their findings suggested that asRNA are widely transcribed across the genome of Gram-positive bacteria but are degraded with sense RNAs into small [...]

The tombusvirus p19 protein captures siRNAs (~21 nucleotide small dsRNAs) to defend against the antiviral effects of RNA interference in plants (26,27). We previously found (28) that ectopic expression of p19 in E. coli captures ~21 nucleotide small dsRNAs generated from overlapping exogenous sense and antisense transcripts. These small RNA duplexes, which are apparently intermediary degradation products of RNase III, were termed pro-siRNAs (prokaryotic short interfering RNAs). Precipitation of p19 in bacterial cells co-expressing p19 together with ~500 nt sense and antisense sequences or a similarly sized sense-antisense stem-loop of an exogenous gene enabled us to isolate and purify pro-siRNAs that specifically and efficiently knocked down the exogenous gene when transfected into mammalian cells (28-30). pro-siRNAs mapped to multiple sequences in the target gene. In bacteria expressing p19, but no exogenous sequences, ~21 nucleotide dsRNAs were also captured (referred to as 'p19-captured dsRNAs'). These short dsRNAs were greatly reduced in the absence of p19 or in RNase III-deficient bacteria expressing p19. We hypothesized that these short dsRNAs represent p19-stabilized RNase III decay intermediates of overlapping endogenous sense and antisense transcripts, and that capturing them might provide a useful method for characterizing labile endogenous dsRNAs.

Results

Plasmid-directed synthesis of p19 uncovered plasmid encoded dsRNAs
Two methods for expressing p19 proteins in bacteria were designed (Fig. 1a). The first method uses a pcDNA3.1 plasmid (pcDNA3.1-p19-FLAG), previously engineered by us (28) to express p19, driven by the CMV promoter. To characterize the dsRNAs captured by p19 pull-down, we compared RNAs isolated after overnight culture from cell lysates of WT (MG1693) and RNase III-deficient rnc-38 in the MG1693 background (SK7622) (31), transformed with pcDNA3.1-p19-FLAG. In rnc-38, insertion of a kanamycin resistance gene within a 40-bp fragment in the rnc gene abrogates RNase III activity (31). dsRNAs bound to p19 were isolated using affinity chromatography, cloned and deep sequenced. Sequencing reads were mainly 21-22 nt long from WT E. coli and were reduced ~10-fold in the RNase III mutant strain (Fig. 1b, c), consistent with our previous finding that pro-siRNAs are produced by RNase III (28). The aligned reads in WT E. coli mapped to both the E. coli genome and plasmid, but most of the aligned reads (78%) mapped to the plasmid (Fig. 1c). The plasmid reads were not evenly distributed across the plasmid, but were concentrated in 'hot spots' (Fig. 1d), as previously found in cells expressing exogenous hairpin RNAs (28). The hot spots contained unequal levels of sense and antisense reads, as previously found for exogenous sequences, where the differences in abundance of sense and antisense reads were shown to likely be due to cloning bias (28).

The pcDNA3.1-p19-FLAG plasmid is composed of a pUC bacterial plasmid backbone, but includes additional sequences supporting functions in eukaryotic cells since pcDNA3.1 was designed for use in mammalian cells (Fig. 1d). p19-captured dsRNA hot spots were most abundant in the sequences of bacterial plasmid origin and distributed along it, suggesting that the bacterial plasmid produces multiple overlapping sense and antisense RNAs, consistent with a recent study (32). By contrast, non-bacterial sequences were largely devoid of dsRNAs, except for dsRNAs observed within the CMV promoter region, which drives p19 expression. We speculate that the other non-bacterial sequences might not be adapted to initiate transcription in bacteria and thus produce fewer overlapping transcripts and fewer dsRNAs. The origin of replication of pcDNA3.1 is derived from the pUC plasmid, which is known to produce a sense RNA II transcript, which primes replication, and an antisense RNA I transcript, which inhibits replication (33). The overlapping region of RNA I and RNA II contained a dsRNA hot spot, consistent with the idea that dsRNAs originate from overlapping transcripts (Fig. 1d).

[...] clusters of similar size (for divergent operons, median RPKM of random dataset was 11.0; for convergent operons, median RPKM of random dataset was 12.2). These results suggest that RNase III-produced short dsRNAs captured by p19 are more often generated from divergent transcripts.
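For readers less familiar with the normalized read counts quoted in this section (RPKM here, RPM for individual clusters below), the following is a minimal sketch of the standard definitions. The function names `rpkm` and `rpm` and the example numbers are illustrative assumptions and are not taken from the analysis pipeline used in this study.

```python
def rpm(read_count: int, total_mapped_reads: int) -> float:
    """Reads per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return read_count * 1e6 / total_mapped_reads


def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0:
        raise ValueError("region_length_bp must be positive")
    # Dividing RPM by the region length in kilobases gives RPKM.
    return rpm(read_count, total_mapped_reads) / (region_length_bp / 1000.0)


if __name__ == "__main__":
    # Hypothetical cluster: 220 reads over a 1,000-bp region in a
    # library of 20 million mapped reads.
    print(rpkm(220, 1000, 20_000_000))
```

Normalizing by both region length and library size is what allows clusters of different sizes, or from different libraries, to be compared on one scale.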

Characterization of top dsRNA clusters
To focus on dsRNAs that are more likely to be biologically relevant, we analyzed the more abundant clusters, because most of these were also identified in other studies (Fig. 2e, f) [...] ryeA (249 nt) in a growth and RNase III-dependent manner (42). The dsRNA sequencing data (in S1) showed one dominant short dsRNA peak of ~21 nt in the overlapping region of ryeA and ryeB (Fig. 3b, left). Within this hot spot, a zoomed-in look at the sequence showed that this 21-nt dsRNA has 3' 2-nt overhangs at both ends, suggesting that these dsRNAs are bona fide RNase III products. Full length and shorter transcripts of both ryeA and ryeB were more abundant and ryeA had a ~3-fold longer half-life in rnc-14 than in WT bacteria (Fig. 3b, [...] (Fig. 3c, left). A zoomed-in view of the dsRNA hot spot showed that it is a 22-nt dsRNA with 3' 2-nt overhangs at both ends (Fig. 4c, middle), consistent with production by RNase III. An spf asRNA, of the same size as Spot 42 RNA (109 nt), was only detected in the rnc mutant (Fig. 4c, right), consistent with results of Lybecker et al. (22). Moreover, spf expression and half-life were greater in the rnc mutant strain (Fig. 4c, right). We also identified an abundant p19-captured dsRNA cluster (9,452 RPM) for another stress-related small RNA, micA (72 nt), which binds to Hfq and is known to be processed by RNase III and to inhibit mRNA translation by an antisense mechanism (44). Although the abundance of micA was not substantially changed in the rnc-38 mutant strain, its stability increased (Supplementary Fig. 4a). The antisense transcripts of micA were more abundant and several additional short transcripts were detected only in the rnc mutant. Thus, RNase III cleaves Spot 42 and micA dsRNAs, which regulates their stability.

Abundant [...] ldrD encodes a small toxic peptide and the antisense rdlD is an antitoxin small RNA; together they form a Type I TA system (19). This locus (5,553 RPM) was not identified as forming dsRNA by Lybecker et al. (22). However, a ~21 nt dsRNA peak is present within the overlapping region of rdlD and ldrD (Fig. 3d). The expression level and half-life of the full-length transcript of ldrD (ldrD long) and rdlD were both slightly increased in rnc mutant strains (Fig. 3d, middle). A stable smaller fragment of ldrD transcript (ldrD short), which accumulated during bacterial growth, was detectable only in the rnc mutant (Fig. 3d, right). In two other E. coli Type I TA loci with overlapping sense and antisense RNAs, mokC/sokC and ibsD/sibD (45), the stability of the toxin transcripts increased in rnc mutants (Supplementary Fig. 6). By contrast, the steady-state level and stability of opposite strand antitoxin small RNAs did not change much in rnc mutant strains. In all three Type I TA loci, stable smaller sense RNA fragments were detected only in the rnc mutants, which could either be alternative transcripts or degradation products of the full-length transcript (Fig. 3d and Supplementary Fig. 6). Our data suggest that RNase III regulates the expression of toxins by cleaving dsRNAs formed with the toxin mRNA.

Together our data suggest that RNase III regulates the level and/or stability of some mature small RNA transcripts, including both cis-acting (e.g. ryeA/ryeB and Type I TA loci) and trans-acting (e.g. Spot 42 and micA) small RNAs.

Top p19-captured dsRNA clusters in coding genes
The p19-captured dsRNA seq and RNA-seq reads of protein coding gene loci were also mapped onto the annotated genome for E. coli MG1655 in the UCSC genome browser, together with the published dsRNA (22) and TSS (38) predictions (Fig. 4). The Conway dataset was used to mark full length transcripts, when available. Based on all the available data, all top p19-captured dsRNA clusters were classified as in previous publications (2,22,38) according to whether the sense and antisense transcripts were divergent (5' overlap), convergent (3' overlap) or the coding sequence (CDS) overlapped entirely or almost entirely with the predicted antisense transcript (Fig. 4, Table 2, and Supplementary Table 3). dsRNAs were assigned to this third class if >50% of the CDS was contained in p19-captured reads. A fourth category was defined by abundant dsRNA clusters that did not overlap with previously annotated antisense transcripts. Some clusters contained more than one predicted dsRNA. [...]

The last type of dsRNA involves dsRNAs arising from unannotated asRNA. Some of the p19-captured dsRNA loci could not be assigned to divergent gene transcripts or known asRNA, suggesting they arise from uncharacterized asRNA. One example is the C2-43 locus that contains both yajO and dxs genes on one strand (Fig. 4). RNA-seq reads, corroborated by an asTSS and Conway operon, suggest that there is an antisense transcript (opposite to yajO and [...] predicts an antisense transcript in secY (Supplementary Fig. 7b).
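The >50% CDS criterion mentioned above can be illustrated with a small sketch. It assumes that "contained in p19-captured reads" means the fraction of CDS positions covered by at least one read, and it uses half-open coordinates; both are assumptions made for illustration, as are the names `covered_fraction` and `is_cds_overlap_class`. The actual bioinformatic procedure is described in SI Materials and Methods.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def covered_fraction(cds: Interval, reads: Iterable[Interval]) -> float:
    """Fraction of CDS positions covered by at least one read interval."""
    cds_start, cds_end = cds
    if cds_end <= cds_start:
        raise ValueError("CDS interval must have positive length")
    # Clip each read to the CDS and drop reads that do not overlap it.
    clipped = sorted(
        (max(s, cds_start), min(e, cds_end))
        for s, e in reads
        if min(e, cds_end) > max(s, cds_start)
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Close the previous merged block and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent read: extend the current block.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (cds_end - cds_start)


def is_cds_overlap_class(cds: Interval, reads: Iterable[Interval],
                         threshold: float = 0.5) -> bool:
    """True if strictly more than `threshold` of the CDS is covered."""
    return covered_fraction(cds, reads) > threshold
```

Merging overlapping reads before summing avoids counting the same CDS position twice, which matters in hot spots where many reads stack on the same ~21-nt window.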

Confirmation and characterization of the rsd antisense transcript
To verify the sequencing results, the rsd gene, which was amongst the 3 most abundant coding gene p19-captured dsRNA clusters in both S1 and S2, was analyzed by Northern blotting (Supplementary Fig. 8). Antisense reads, which overlapped with the 5'-end of the rsd transcript by RNA-seq, may have originated from divergently oriented antisense transcripts that could be the transcript of an adjacent gene, nudC. dsRNA could have been formed between the 5' UTR of a nudC transcript and the 5'-end of an rsd transcript. This locus resembles the "excludon" in Listeria, where two operons on opposite strands overlap at 5' ends (25). A faint and smeary ~500 nt signal for rsd sense RNA (coding sequence is 477 nt) was detected in both WT and rnc mutant at similar levels (Supplementary Fig. 8a). Two more abundant shorter rsd sense transcripts and similarly sized antisense transcripts between 150 and 300 nt in length were detected only in the rnc mutant, suggesting that the sense and antisense RNAs formed dsRNAs that were degraded by RNase III. The rsd asRNA was less abundant in bacteria deficient in both RNase III and rpoS, which encodes a general stress response sigma factor that induces gene expression in stationary phase, suggesting that transcription of the asRNA may have been induced by RpoS.

Next, 5' RACE was used to identify the TSS of the antisense transcript using rnc mutant bacteria. asRNAs with two potential asRNA TSS were cloned that were located opposite to the 5' region of rsd (Supplementary Fig. 8b). A putative promoter sequence associated with the more upstream of the two potential asRNA TSS (corresponding to 3 of 4 clones) was identified and assessed in a lacZ reporter system, together with a construct with predicted inactivating mutations. As expected, the WT antisense promoter drove lacZ expression, and the promoter activity was reduced 2.5-fold in the promoter mutant (p-value<0.0001). To confirm our identification of the asRNA promoter, HA-tagged rsd reporter plasmids containing the WT or mutated antisense promoter (synonymous mutation for rsd) were introduced into WT and rnc mutant E. coli and expression of rsd sense and antisense transcripts and Rsd-HA protein were assessed by Northern blot and immunoblot, respectively (Supplementary Fig. 8c). Mutation of the antisense promoter reduced at least one of the short sense rsd transcripts and the two antisense transcripts. These findings confirm the identification of the antisense promoter of rsd asRNA. There was also a suggestion that the full-length rsd transcript and Rsd protein levels might be slightly reduced. However, we cannot exclude the possibility that the antisense promoter mutation might affect the stability of the rsd sense transcript. These data suggest that Rsd might be subtly regulated by its asRNA, but we were unable to show that this difference has any biological significance for cell growth or survival.

[...] (~300 nt) cspD RNA increased in the rnc mutant, suggesting that cspD is a direct target of asRNA and that regulation depends on RNase III (Fig. 5a, b). A slightly shortened cspD sense RNA of the same size as the cspD asRNA was detected only in the rnc mutant. The length of the short sense RNA and asRNA were roughly equal to the length of the region covered by dsRNAs. These data suggest that dsRNAs containing the overlapping region of the sense and antisense RNAs accumulated in the rnc mutant. A cspD dsRNA was also identified by dsRNA antibody pulldown at approximately the same location (22). The RNA-seq reads of cspD RNA increased ~2-3-fold in rnc-14 and rnc-38 mutants (Fig. 5c and Supplementary Table 2), consistent with the Northern blots (Fig. 5a). Quantitative proteomics also found increased CspD in the rnc-14 mutant (Fig. 5d and Supplementary Table 4). These data suggest that cspD mRNA and protein expression are reduced by asRNA in an RNase III-dependent manner.

[...] Table 4). When the relative changes in protein levels in each rnc mutant were compared to WT levels, changes in protein abundance in the mutants and the abundance of p19-captured dsRNAs in WT bacteria were not correlated. The levels of proteins encoded by genes producing abundant dsRNAs did not consistently increase in rnc mutants, although CspD increased (marked in red in Supplementary Fig. 9a). These findings suggest that RNase III cleavage of sense-asRNA duplexes does not globally impact protein abundance under homeostatic conditions.

To identify individual proteins that might be affected by RNase III deficiency, the ratio of protein abundance in both rnc mutant strains relative to WT in exponential phase was plotted (Fig. 6c, Supplementary Fig. 9b). Several proteins were consistently upregulated (YjhC, GabD, AceA, AceB) or downregulated (SodA) in both rnc mutants in exponential phase samples. These proteins are involved in glycolysis and antioxidant responses.

RNase III mutants have increased cell death in late stationary phase
To begin to determine whether RNase III cleavage of dsRNA has biological significance, we used J2 antibody to assess the abundance of dsRNA during bacterial growth by immunoblotting equal amounts of electrophoresed RNA (Fig. 6a, Supplementary Fig. 1b). dsRNAs >100 bp in length accumulated in rnc-14 and rnc-38, but not WT cells, as bacteria entered stationary phase. The relative amount of dsRNAs in the rnc mutant strains increased with cell density, suggesting that more dsRNAs were formed during stationary phase that were degraded by RNase III in WT cells. LIVE/DEAD staining of WT and rnc mutant strains showed no significant difference in death in early culture (18 hr [...]

Here we developed a method to capture endogenous small dsRNAs (~21 bp) by ectopic expression of tombusvirus p19 in E. coli. Deep sequencing of p19-captured dsRNAs and total rRNA-depleted RNA suggested that clusters of short dsRNAs arise from long duplexes formed by overlapping sense and antisense transcripts that are processed into short dsRNAs by RNase III. p19 capture did not appear to introduce any substantial sequence bias, but stabilized labile dsRNA products to enable us to detect dsRNA with high sensitivity. asRNAs were transcribed from most genes, as previously noted (2,22,49), but with a wide range of abundance. The abundance of captured dsRNAs correlated with asRNA reads. Although some of the less abundant asRNA and dsRNA may represent transcriptional noise, the most abundant p19-captured dsRNA clusters we identified agreed well with asRNA identified in other studies by deep sequencing, assignment of antisense transcription start sites (38) and operons (39), and dsRNA captured with anti-dsRNA antibody (22) and are likely the result of intended transcription. Our method confirmed hundreds of previously identified asRNAs and identified potentially hundreds of new such loci.
One advantage of p19 capture is that it was performed in WT bacteria, potentially avoiding secondary effects caused by RNase III deficiency in RNase III mutant cells used in some studies (21, 22). Our data should provide a valuable resource for studying asRNA in E. coli and the method could be readily adapted to study asRNA in other species. The p19-captured dsRNA and RNA deep sequencing data have been formatted for convenient viewing in the UCSC genome browser (in Supplementary data files 1-4 in bedGraph format).

[...] pairing of sense and antisense transcripts, but can also cleave structured RNAs that contain perfectly or imperfectly paired double-stranded regions (e.g. rRNA precursor (31) and R1.1 RNA of T7 phage (50)). Therefore, some changes in RNA level and half-life in rnc mutant bacteria are due to RNase III degradation of dsRNAs generated from pairing of sense and antisense transcripts, but others may be due to RNase III cleavage of structured RNAs. There is no simple way to separate the antisense-dependent effects of RNase III. However, p19 only binds perfectly paired dsRNAs, such as would arise from antisense transcripts pairing with sense transcripts, but not imperfect duplexes that would arise in structured regions of RNA that might also be substrates of RNase III, providing a specific way to capture antisense transcripts. [...]

The most abundant p19-captured dsRNA clusters, which were mostly found in other studies and least likely to be caused by transcriptional noise, were confirmed by Northern blotting and studied in more detail. Most of these clusters arose from divergent transcripts from overlapping 5' regions of operons on opposite strands. RNase III deficiency for most of the abundant clusters increased the quantity and/or half-life not only of the antisense transcript, but also of the corresponding sense transcript, suggesting that asRNA transcription and RNase III degradation of dsRNAs promote more efficient sense RNA decay. Moreover, shorter asRNAs were generally detected only in RNase III-deficient bacteria. dsRNAs accumulated as RNase III-deficient bacteria exited exponential growth and entered stationary phase, when they were significantly more prone to undergo cell death. However, despite widespread antisense transcription, unbiased quantitative proteomics did not indicate global changes in protein levels that correlated with the abundance of dsRNAs at a locus. This finding suggested that antisense transcription might not play a large role in regulating protein levels and bacterial physiology under most conditions. However, a few proteins associated with antioxidant responses and glycolysis were reproducibly altered in two RNase III mutant strains and the mRNA and protein level of the cold shock protein (CSP) family member CspD increased in the mutant strains. These proteins might be more important in stressed conditions. Moreover, many of the most abundant p19-captured dsRNA clusters, including ryeA/ryeB, the toxin-antitoxin genes, and tRNAs, might also be particularly important during stress. These findings, when considered with the increased death in late stationary phase of RNase III mutant bacteria, suggest that asRNAs and degradation of dsRNAs, especially those regulating noncoding RNAs, might only become important when nutrients are limiting or during other forms of environmental stress. However, we cannot exclude the possibility that RNase III activity on structured RNAs could be responsible for the increased death of rnc mutants in late stationary phase. Additional work is required to determine whether antisense transcription has any physiological role and under what conditions.

A few well studied small regulatory RNAs generate abundant dsRNAs (Table 1). Our data demonstrate that RNase III cleaves all three families of cis-acting Type I TA class small RNAs including ldrD/rdlD, mokC/sokC, and ibsD/sibD (Fig. 3d, Supplementary Fig. 6), suggesting that one way RNase III may regulate bacterial survival and physiology is by controlling toxic proteins. The abundance and stability of the spf small RNA were also shown to be regulated by RNase III. Its maturation might require RNase III processing ( [...] shortened RNAs as decay intermediates (model in Supplementary Fig. 10). RNase E is a 5'-end-dependent (54) single-stranded endoribonuclease that acts on most E. coli mRNAs. After the mRNA is cut, the newly formed 3'-end can then be degraded by PNPase, which also acts only on single-stranded and nonstructured RNAs. PNPase would stall at 5' ends of overlapping sense and antisense transcripts or at structured regions. We propose that RNase III clears these stalled dsRNAs. The shortened RNA products that accumulate in rnc mutants in small RNA loci like ryeA/ryeB (Fig. 3b) or in coding genes like rsd (Supplementary Fig. 8a) might be stalled products of RNase E and PNPase that are further degraded by RNase III. RNase III prefers to cleave dsRNAs into ~14-bp or shorter fragments, which can then be completely degraded into mononucleotides by the concerted actions of RNA helicases, PNPase and oligoribonuclease (55, 56). Lack of clearance of dsRNA intermediates, which accumulate in stationary phase in rnc mutants, could be toxic and lead to increased death of rnc mutant bacteria in this phase. dsRNAs are also sensed by innate immune receptors in eukaryotes, which might be costly for bacteria during infection. However, further work is required to explore these conjectures.
cspD appears to be a rare example of RNase III-regulated protein production through a cis-acting asRNA (Fig. 5). RNase III might be essential for degrading cspD sense mRNA. cspD asRNA covers a substantial region of the sense RNA and the dsRNA might mask cleavage sites of other RNases (e.g. RNase E) and stabilize the cspD sense RNA. A similar mechanism in which asRNA stabilizes sense RNA and impedes RNase degradation has been described for the gadY small RNA, which stabilizes overlapping gadX mRNA (57 [...]

[...] Fig. 2, (28)), strongly suggest that E. coli RNase III has some intrinsic sequence bias. We previously demonstrated that p19-captured dsRNA hot spot patterns were not due to cloning bias (28). Surprisingly, cleavage bias also seemed likely for human Dicer in our in vitro digestion assay (E4 in Supplementary Fig. 2), although current models suggest that Dicer cuts dsRNAs from the 3'-end and in a phased manner without bias (60). However, a few recent studies have shown sequence preferences for RNase III class enzymes, including Mini-III in Bacillus subtilis (61), yeast Rnt1p (62), and dicer-like enzymes in Paramecium (63). A GC bias was also found in plant viral-derived siRNAs (64). Further analysis of dsRNAs in E. coli and other bacterial species may help to unravel the mechanisms underlying sequence bias of RNase III class enzymes. Of note, RNase III is required for making guide RNAs for the bacterial CRISPR system and any sequence bias of RNase III could potentially influence the selection of genes targeted by CRISPR.

In summary, our study presents a new method for identifying and studying asRNA in bacteria that could also be adapted to eukaryotic studies. p19-captured dsRNA clusters mark [...]

[...] Table 6. Ectopic expression of p19 and dsRNA isolation were based on our previous methods (28, 29) with modifications described in SI Materials and Methods.
Detailed protocols for bacterial culture, total RNA extraction, small RNA and total RNA deep sequencing, bioinformatic analysis, RNA immunoblot, Northern blot, Western blot, 5' RACE, lacZ reporter assay, proteomics, and statistics are included in SI Materials and Methods.

[Figure panels a, b: p19-captured dsRNA seq (S1); p19-captured dsRNA seq (S1 vs S2); Pearson's r = 0.059; Pearson's r = 0.705]

p19-captured dsRNA sequencing reads (both S1 and S2) and RNA sequencing reads from a WT E. coli are plotted in UCSC genome browser. dsRNA identified by Lybecker et al. is marked by a red bar. We have arbitrarily predicted potential overlapping transcripts that could give rise to p19-captured dsRNAs and classified all p19-captured dsRNA clusters (Supplementary Table 3 [...]

[...] Supplementary Fig. 9b). n.s.: not significant, which means protein fold change did not pass the cutoff for significance.

[...] at 37°C with shaking at 250 rpm and antibiotics when required were used at the following concentrations: carbenicillin (100 µg/ml), kanamycin (10 or 25 µg/ml), and tetracycline (12.5 µg/ml).

E. coli total RNA extraction
For each 5 ml of E. coli culture, 5 ml of cold methanol was added to the sample immediately after harvesting in order to stabilize RNA, and the sample was kept on ice for processing. [...]

[...] after IPTG was added at 0.5 mM for 1 h (final OD600 of the culture was 2.0). Total RNAs were extracted as described above and p19 magnetic beads (from p19 miRNA Detection Kit, E3312, NEB) were used to pull down small RNAs from total RNAs (isolated from 20 ml of bacterial culture) as previously described (2).

Northern blotting
Northern blotting was performed using two methods. Method 1 used a 5% TBE-Urea polyacrylamide RNA gel cast using the Bio-Rad Mini-PROTEAN Tetra Cell system (2). RNA samples (15 µg total RNA) were heated to 95°C for 5 min in Gel Loading Buffer II (AM8546G, Ambion) and immediately placed on ice until gel loading. Electrophoresis was performed at room temperature and the gel was run at 150 V for about 1 h. Gels were stained with SYBR-Gold (S11494, Invitrogen) and then transferred to a Hybond-N+ Membrane (RPN303B, Amersham) by [...]

RNA half-lives were calculated using the slope of a linear trendline fitted from the normalized intensity of hybridization bands.
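The half-life calculation can be sketched as follows. This is a minimal illustration that assumes first-order decay, so the linear trendline is fitted to the natural log of the normalized band intensity against time; the text above does not state whether intensities were log-transformed, so that step is an assumption. The function name `half_life_from_intensities` and the example time points are hypothetical.

```python
import math
from typing import Sequence


def half_life_from_intensities(times: Sequence[float],
                               intensities: Sequence[float]) -> float:
    """Estimate half-life from a least-squares line through ln(intensity) vs time.

    Assumes first-order decay, I(t) = I0 * exp(-k * t), so the fitted slope
    is -k and the half-life is ln(2) / k. Result is in the units of `times`.
    """
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two matched time/intensity points")
    if any(v <= 0 for v in intensities):
        raise ValueError("intensities must be positive to take logarithms")
    ys = [math.log(v) for v in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("no decay detected (slope is not negative)")
    return -math.log(2) / slope


if __name__ == "__main__":
    # Hypothetical band intensities normalized to the 0 min time point.
    print(half_life_from_intensities([0, 2, 4, 8], [1.0, 0.5, 0.25, 0.0625]))
```

Comparing the value returned for WT and rnc mutant time courses gives the fold change in half-life of the kind reported for ryeA.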

E. coli total RNA deep sequencing
Total RNAs were extracted as described above and ribosomal RNAs were removed using bacterial Ribo-Zero rRNA Removal Kit (MRZMB126, Epicentre) following the manufacturer's protocol.

RNA immunoblot
Total RNA (10 μg) was separated by native electrophoresis using mini-sized homemade 5% polyacrylamide TBE gels and a Bio-Rad Mini-PROTEAN Tetra Cell system. RNA samples were prepared in Gel Loading Buffer II (AM8546G, Ambion) and electrophoresed at room temperature.
RNA was blotted onto a Hybond-N+ Membrane (RPN303B, Amersham) by capillary transfer overnight, and then UV-crosslinked. The membrane was first incubated with anti-dsRNA J2 antibody (used at 1:1,000, Scicons) in PBS buffer containing 5% BSA overnight at 4°C. HRP-conjugated anti-mouse secondary antibody was used at 1:10,000 and the signal was visualized using SuperSignal West Pico Chemiluminescent Substrate (34580, Thermo Scientific).

[...] Table 6) were cultured in quintuplicate in 200 μl of LB in a 96 well plate at 900 rpm, 80% humidity, 37°C, in a Multitron shaker. At an OD600 of 0.6, cells were lysed and processed for the β-galactosidase assays using microtiter plates and a microtiter plate reader, and Miller units were calculated as described (3).
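For orientation, the classic Miller unit formula is sketched below. The microtiter-plate procedure of reference (3) may use a modified calculation, so this should be read as the textbook definition under stated assumptions rather than the exact computation used here. The function name `miller_units` and the example readings are hypothetical.

```python
def miller_units(od420: float, od550: float, od600: float,
                 reaction_min: float, culture_ml: float) -> float:
    """Classic Miller units: 1000 * (OD420 - 1.75 * OD550) / (t * v * OD600).

    od420        absorbance of the o-nitrophenol product
    od550        light scattering by cell debris (correction term)
    od600        cell density of the culture at harvest
    reaction_min reaction time in minutes
    culture_ml   volume of culture assayed, in ml
    """
    if od600 <= 0 or reaction_min <= 0 or culture_ml <= 0:
        raise ValueError("od600, reaction_min and culture_ml must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (reaction_min * culture_ml * od600)


if __name__ == "__main__":
    # Hypothetical readings for a WT and a mutant promoter construct.
    wt = miller_units(od420=0.9, od550=0.0, od600=0.6, reaction_min=10, culture_ml=0.1)
    mutant = miller_units(od420=0.36, od550=0.0, od600=0.6, reaction_min=10, culture_ml=0.1)
    print(wt, mutant, wt / mutant)
```

Dividing the WT value by the mutant value gives the fold reduction in promoter activity, the quantity reported for the rsd antisense promoter mutant.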

Statistics
Significance of differences between two samples was calculated using Student's t-test.
Significance of the difference between two correlation coefficients, based on Fisher r-to-z transformation, was calculated using an online tool: http://vassarstats.net/rdiff.html.
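The Fisher r-to-z comparison of two independent correlation coefficients can be reproduced offline with a few lines of code. The sketch below uses the standard formula with a normal approximation for the two-tailed p-value; it is not the code behind the online tool, and the function name `compare_correlations` and the sample sizes in the example are hypothetical.

```python
import math
from typing import Tuple


def compare_correlations(r1: float, n1: int, r2: float, n2: int) -> Tuple[float, float]:
    """Fisher r-to-z test for two independent Pearson correlations.

    Returns (z, two_tailed_p). Each r is transformed with atanh, and the
    difference is divided by its standard error sqrt(1/(n1-3) + 1/(n2-3)).
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    if not (-1.0 < r1 < 1.0 and -1.0 < r2 < 1.0):
        raise ValueError("correlation coefficients must lie strictly between -1 and 1")
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-tailed p-value under the standard normal distribution.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed


if __name__ == "__main__":
    # Illustrative r values with hypothetical sample sizes of 100 each.
    z, p = compare_correlations(0.705, 100, 0.059, 100)
    print(f"z = {z:.2f}, two-tailed p = {p:.2e}")
```

The atanh transform makes the sampling distribution of r approximately normal, which is why the difference can be tested with a z statistic.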