CRISPR screens in the era of microbiomes

Recent advances in genomics have uncovered the tremendous diversity and richness of microbial ecosystems. New functional genomics methods are now needed to probe gene function in high-throughput and provide mechanistic insights. Here, we review how the CRISPR toolbox can be used to inactivate, repress or overexpress genes in a sequence-specific manner and how this offers diverse attractive solutions to identify gene function in high-throughput. Developed both in eukaryotes and prokaryotes, CRISPR screening technologies have already provided meaningful insights in microbiology and host-pathogen interactions. In the era of microbiomes, the versatility and functional diversity of CRISPR-derived tools have the potential to significantly improve our understanding of microbial communities and their interaction with the host.
Figure 1. CRISPR-based tools amenable to high-throughput screening. The Cas9 nuclease can be used to introduce mutations in a target sequence through error-prone DNA repair by the NHEJ pathway or through homologous recombination with a custom DNA template. The dCas9 protein can be used to inhibit gene expression by binding DNA and blocking the RNA polymerase. Transcription can be activated by fusing dCas9 with a transcription activator domain. New base-editing technologies modify nucleotides without introducing DNA breaks by exploiting the fusion of dCas9 or Cas9 nickase (nCas9) to an adenine or cytosine deaminase. Another method uses a retron system to produce multi-copy single-stranded DNA (msDNA) used as an editing template. Finally, prime editing uses an engineered guide RNA and a fusion between nCas9 and a reverse transcriptase to introduce mutations encoded in the guide sequence.
This study describes a universal CRISPRi system transferable by conjugation into a broad range of bacterial species. This valuable toolbox expands the use of CRISPRi to members of the microbiota.
This preprint describes the first CRISPRi screen employed in vivo in bacteria, investigating the population bottlenecks during pneumococcal infection in mice and identifying pneumococcal genes required for pathogenicity.


Introduction
The next-generation sequencing (NGS) revolution opened the genomics era and is now delivering massive amounts of DNA sequences into databases, enabling the characterization of microbial communities across Earth's ecosystems. In particular, the human microbiome was estimated to comprise 500-1000 species per individual, representing a gene repertoire much more diverse than the human genome itself (reviewed in Ref. [1]). The investigation of gene function is now critical to make sense of these data. Because the vast majority of genes have no experimental characterization, unravelling new biological functions and identifying essential genes is key for our understanding of microbial ecosystems as well as for biotechnology and drug discovery. Despite significant advances in comparative genomics, novel experimental methods are still required to characterize gene function in a high-throughput manner.
The discovery of CRISPR systems as bacterial and archaeal adaptive immune systems [2] was a true paradigm shift in microbiology and genomics. CRISPR systems provide acquired resistance to invading genetic elements by cleaving their target (DNA or RNA) in a sequence-specific manner. CRISPR-Cas systems show a remarkable diversity in their mode of action and genetic components [3]. The type-II effector Cas9 was the first characterized single-component nuclease that can specifically cleave DNA upon base pairing between a guide RNA and the target sequence [2]. Thanks to its portability and specificity, the Cas9 nuclease from Streptococcus pyogenes was quickly repurposed into a variety of programmable tools to inactivate genes [4,5] or change their expression [6-8] (Figure 1). Taking advantage of the NGS revolution, powerful genetic screens have recently been developed both in eukaryotes and in prokaryotes. A range of insightful results have broadened our understanding of gene function and host-pathogen interactions while providing improved design rules for robust screening. In this review, we define the principles of CRISPR-based screening techniques and show how CRISPR screens can provide meaningful insights in the study of microbial communities.

CRISPR-mediated mutational screening
Shortly after the discovery of CRISPR-Cas, the idea of using it as a gene-editing tool emerged, both in eukaryotes [9,10] and in prokaryotes [5]. Upon dsDNA cleavage, eukaryotic cells are able to repair breaks by the non-homologous end-joining (NHEJ) pathway, a highly error-prone process that frequently introduces indels leading to frameshifts and gene knockout. Using computational models [11,12], optimized single-guide RNA (sgRNA) libraries can be designed to target tens of thousands of genomic locations in order to perform loss-of-function screens. While the number of available target positions in a genome would in principle enable the design of much larger libraries, experimental constraints such as the capacity of on-chip oligonucleotide synthesis, the number of cells and the sequencing depth required to maintain a good coverage have so far prevented their use. Transfection of such libraries results in a population where each cell has a different mutation [13-15]. The sgRNA serves both as an editing tool and as a DNA barcode to monitor the abundance of each mutant by deep sequencing. The variation in sgRNA distribution over a selection step is used as a proxy for the fitness of each mutant in the tested condition. CRISPR-Cas9 screens were used to decipher how pathogenic bacteria interact with their host through the identification of host genes required for bacterial invasion [16,17] and susceptibility to toxins and secretion systems from various pathogens [15,18-23]. As an example, a pool of CRISPR-edited epithelial cells was infected with enterohemorrhagic Escherichia coli (EHEC), a Shiga-toxin-producing strain expressing a type-III secretion system [16]. While most wild-type epithelial cells are killed, resistant mutants are enriched after several rounds of infection. Mutations in genes involved in the synthesis of the Shiga-toxin receptor globotriaosylceramide (Gb3) provided strong resistance. This screen also identified two proteins of unknown function that were demonstrated to be required for Gb3 biosynthesis.
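The use of sgRNA read counts as a fitness proxy, described above, can be sketched as a log2 fold change computed per guide and summarized per gene. This is a minimal illustration, not the pipeline of any cited study: the function names, the pseudocount, the reads-per-million normalization, the median summary and the toy counts are all assumptions made for the example.

```python
import math
from statistics import median

def log2_fold_changes(before, after, pseudocount=1.0):
    """Per-guide log2 fold change of normalized read counts across a selection step."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    scores = {}
    for guide, n_before in before.items():
        # Normalize to reads per million so samples of different depth are comparable.
        rpm_before = n_before / total_before * 1e6
        rpm_after = after.get(guide, 0) / total_after * 1e6
        # The pseudocount (an assumption) avoids log(0) for guides that drop out.
        scores[guide] = math.log2((rpm_after + pseudocount) / (rpm_before + pseudocount))
    return scores

def gene_level_scores(scores, guide_to_gene):
    """Summarize guides targeting the same gene by their median log2 fold change."""
    per_gene = {}
    for guide, score in scores.items():
        per_gene.setdefault(guide_to_gene[guide], []).append(score)
    return {gene: median(values) for gene, values in per_gene.items()}

# Hypothetical toy counts: g3 is depleted and g4 is enriched after selection.
before = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
after = {"g1": 200, "g2": 200, "g3": 10, "g4": 390}
guide_to_gene = {"g1": "geneA", "g2": "geneA", "g3": "geneB", "g4": "geneC"}

scores = log2_fold_changes(before, after)
print(gene_level_scores(scores, guide_to_gene))
```

Negative scores indicate guides that were depleted during selection and positive scores indicate enrichment. Published analyses typically add replicate handling and statistical testing, which this sketch omits.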
In contrast with eukaryotic cells, the main outcome of DNA cleavage in prokaryotes is cell death rather than NHEJ-mediated gene knockout [5,24]. Editing by homologous recombination requires a template DNA harboring the desired mutation, which limits the ease of use of this approach in high-throughput. This was nevertheless achieved in E. coli with a method called CRISPR-enabled trackable genome engineering (CREATE), based on the parallel automated design of tens of thousands of recombination cassettes [25]. This method was used to generate saturated mutant libraries of genes of interest in order to identify mutations that confer increased resistance to antibiotics or chemicals.

CRISPRi screening in bacteria
Beyond its potential for genome editing, the CRISPR-Cas9 system was repurposed as a DNA-binding nucleoprotein complex by abrogating the nuclease catalytic functions of Cas9, yielding dead-Cas9 (dCas9) [6-8]. As a result, dCas9 can bind its target and block either the initiation of transcription or the passage of the RNA polymerase, resulting in a sequence-specific gene silencing tool called CRISPR-interference (CRISPRi) [26]. Note that the protospacer adjacent motif (PAM) of CRISPR effectors varies in terms of GC content and complexity. Since the PAM dictates the target frequency in a given genome, different Cas effectors may be used to increase the possible target space in organisms with different GC contents. As a consequence of the polycistronic structure of bacterial mRNAs, dCas9-mediated repression of a gene in an operon is polar towards downstream genes. While this can be seen as a limitation of CRISPRi in bacteria, genes from an operon often belong to the same metabolic pathway or protein complex, making CRISPRi a valuable tool for the study of multi-gene pathways. Note that the degradation of prematurely terminated transcripts can in some cases lead to a reverse polar effect where the repression of a target gene also reduces the expression level of upstream genes. The strength of this effect seems to depend on the organism but appears to be typically small to non-detectable [26].
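To illustrate how the PAM dictates the available target space, the sketch below enumerates candidate protospacers on both strands of a sequence. It assumes a 3' PAM written as a regular expression (NGG, the S. pyogenes Cas9 PAM, by default) and a 20-nt guide; the function names and the toy sequences are hypothetical, and effectors with a 5' PAM would need a different layout.

```python
import re

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_targets(seq, pam_regex="[ACGT]GG", guide_len=20):
    """List (strand, protospacer, pam) for every PAM preceded by a full-length protospacer."""
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead is used so that overlapping PAMs are all reported.
        for match in re.finditer(f"(?=({pam_regex}))", s):
            pam_start = match.start()
            if pam_start >= guide_len:
                protospacer = s[pam_start - guide_len:pam_start]
                hits.append((strand, protospacer, match.group(1)))
    return hits

# Toy sequences: one NGG site in the first, none in an AT-only sequence.
print(find_targets("A" * 20 + "TGG"))
print(len(find_targets("AT" * 50)))
```

Passing a different `pam_regex` gives a rough way to compare how many sites alternative 3'-PAM effectors would offer in a genome of a given GC content.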
Libraries of sgRNAs can be custom-designed to target an entire genome or a subset of genes of interest (Figure 2a). The analysis of the first genome-wide screens enabled the investigation of the properties of CRISPRi and provided design rules to optimize CRISPRi screens in bacteria. In particular, a surprising sequence-specific toxicity of S. pyogenes dCas9, termed the 'bad-seed' effect, was identified in E. coli [27]. Guides sharing specific 5-nucleotide motifs in their seed sequence are toxic regardless of the rest of the guide, for a reason that remains to be elucidated. Another study observed that a high level of dCas9 expression induces morphological defects [28]. This toxicity is however alleviated by tuning down dCas9 concentration [27,29] while maintaining a strong on-target activity. Genome-wide screens also showed that some sgRNAs can direct dCas9 to bind other genomic positions sharing partial homology to the sgRNA sequence, a phenomenon called off-target activity. While off-targets are a major caveat in mammalian cells, the probability of such events is much lower in bacteria due to a ~1000-fold difference in genome size. This makes it easy in bacteria to design sgRNAs that lack any extensive complementarity to an off-target position. However, while extensive complementarity is required for Cas9 cleavage, dCas9 is able to block the expression of an off-target gene with as little as 9 nucleotides of identity to its promoter [27]. Care should thus still be taken when designing sgRNAs for dCas9 in bacteria, and it is preferable to rely on multiple guides per gene. Apart from dCas9, a CRISPRi screen was also recently performed to identify the determinants of dCas12a binding at off-target positions [30].
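A simple way to flag guides at risk of the short-identity off-target effect described above is to scan the genome for additional sites that match the end of the guide next to a PAM. The sketch below is an illustration under stated assumptions: the 9 nt of identity are taken to be the PAM-proximal end of the guide, the PAM is NGG, and the function names, guide and toy genome are hypothetical.

```python
import re

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def seed_matches(guide, genome, seed_len=9):
    """List (strand, seed_start, is_on_target) for every seed + NGG site on both strands."""
    seed = guide[-seed_len:]  # PAM-proximal end of the guide (assumption)
    pattern = re.compile(f"(?={re.escape(seed)}[ACGT]GG)")
    hits = []
    for strand, s in (("+", genome), ("-", reverse_complement(genome))):
        for match in pattern.finditer(s):
            seed_start = match.start()
            guide_start = seed_start - (len(guide) - seed_len)
            # A site is the intended target only if the full guide sequence matches.
            on_target = guide_start >= 0 and s[guide_start:guide_start + len(guide)] == guide
            hits.append((strand, seed_start, on_target))
    return hits

# Toy genome with the intended target and one seed-only site (hypothetical sequences).
guide = "TTGACAGCTAGCTCAGTCCT"
genome = "AAAA" + guide + "AGG" + "A" * 10 + "G" * 11 + guide[-9:] + "TGG" + "AAAA"
print(seed_matches(guide, genome))
```

Any hit reported with `is_on_target` set to `False` marks a position worth checking, for example if it falls in the promoter of another gene. Whether such a site actually causes repression depends on factors that this sketch does not model.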
CRISPRi screens have also revealed an important variability in the repression efficiency mediated by different guides and can be used to identify determinants of sgRNA activity [31]. These design rules enable a reduction in the library size that is required to obtain robust screening results by ensuring the selection of guides with high activity and specificity. Using libraries of 10^4 to 10^5 sgRNAs, CRISPRi screens can perform comparably to transposon sequencing-based approaches (Tn-seq), a gold standard in high-throughput functional genomics, which typically require libraries larger by an order of magnitude to obtain a good genome coverage [32,33]. Smaller libraries make it easier to run multiple experiments in parallel and to avoid population bottlenecks, while decreasing sequencing costs.
An interesting feature of CRISPRi is that the level of repression can be modulated through the introduction of mismatches between the guide and the target in order to explore the entire range of protein expression [7,34,35]. In a recent study in E. coli and Bacillus subtilis, a model was built to predict how specific mismatches affect the repression efficiency of a guide [35]. A library of mismatched sgRNAs targeting essential genes was then used to investigate the relationship between their expression level and bacterial fitness.
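As an illustration of how a mismatched sgRNA library can be enumerated, the sketch below generates every single-nucleotide variant of a guide. It only lists candidate sequences; predicting the repression level of each variant requires a model such as the one described in [35], which is not reproduced here. The function name and the example guide are hypothetical.

```python
def single_mismatch_variants(guide, alphabet="ACGT"):
    """List (position, original_base, new_base, variant) for all single substitutions."""
    variants = []
    for position, original in enumerate(guide):
        for new_base in alphabet:
            if new_base != original:
                variant = guide[:position] + new_base + guide[position + 1:]
                variants.append((position, original, new_base, variant))
    return variants

# A 20-nt guide yields 20 positions x 3 alternative bases = 60 variants.
guide = "TTGACAGCTAGCTCAGTCCT"
variants = single_mismatch_variants(guide)
print(len(variants))
print(variants[0])
```

In practice a subset of these variants would be selected to span the desired range of predicted repression levels for each target gene.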
CRISPRi screens have now provided insightful results in microbiology and have been performed in a few bacterial species. The first CRISPRi screens in bacteria [36,37] were performed in an arrayed format, that is, by measuring the phenotype of each knockdown individually (Figure 2b). In this case, a large set of readouts is accessible, such as growth characteristics or morphology, and knockdowns of interest can easily be retrieved for further study. CRISPRi components are typically expressed in an inducible manner so that the library can be maintained in a non-induced state, enabling the study of essential genes. Reports in B. subtilis [36], in Streptococcus pneumoniae [37] and more recently in Mycobacterium smegmatis [38] combined growth measurements with high-throughput microscopy to study the morphological defects associated with the depletion of essential proteins. In addition, the target of a compound inducing cell wall damage in B. subtilis could be identified [36]. Another study in Vibrio cholerae analyzed the phenotypes associated with lipoprotein transport depletion [39]. Finally, an arrayed CRISPRi screen was also used in E. coli to identify phosphatases whose repression increases terpenoid production [40].
Bacterial CRISPR screens. Rousset and Bikard.
Current Opinion in Microbiology
Figure 2. Overview of CRISPRi screening methods in bacteria. An sgRNA library is computationally designed to target a whole genome or a subset of genes of interest (a). A number of design rules can be taken into account such as the predicted repression efficiency and off-target activity. Each sgRNA can be synthesized and cloned individually, resulting in an arrayed library of CRISPRi knockdowns. Alternatively, large libraries are synthesized on-chip and cloned at once, yielding a pooled library of CRISPRi knockdowns. In the case of arrayed libraries, each knockdown is assayed individually (b). Growth capacity can be measured with growth curves or colony size measurement in various conditions and the cell morphology can be obtained by high-throughput microscopy. In the case of pooled screens, cells compete in various conditions and next-generation sequencing of the sgRNA library is used as a readout (c). The bacterial population can be grown in various media or selected with different stresses. The fitness of each sgRNA is obtained by comparing the read counts before and after the experiment or with and without dCas9 induction. Cells can also be sorted by the expression level of a fluorescent reporter. The impact of sgRNAs on reporter expression can be estimated from their distribution in each bin. Finally, cells can be imaged inside a microfluidic chamber before in situ genotyping by FISH.
The individual cloning and phenotyping of each library member naturally limits the scale of arrayed CRISPRi screens. In contrast, pooled screens are typically conducted on a larger scale since the use of NGS as a readout enables a significant increase in throughput. All cells compete against each other in a defined growth condition or stress. Alternatively, one can screen phenotypes linked to the expression of a fluorescent reporter through cell sorting followed by sequencing (Figure 2c). In the most straightforward assay, a pooled library can be grown in rich medium to identify essential genes. This approach was recently used in E. coli, both in the lab strain K-12 [31,32,33,41] and in a collection of natural isolates [42], as well as in Synechocystis [43], in mycobacteria [44] and in Vibrio natriegens [45], where Tn-seq was not applicable because of a low transposon insertion rate. In pooled genetic screening approaches, the identification of enriched genotypes (positive selection) is typically easier than that of depleted genotypes (negative selection). Indeed, under negative selection, the effect size is limited by the initial coverage, which can be quite low when using large libraries, while under positive selection there is no such limitation and the effect size can become very large when a strong selection is applied. To improve the sensitivity of negative selection screens, a method using a TIMER fluorescent protein was developed to enrich and isolate slow-growing bacteria through cell sorting [46].
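The dependence of negative selection on initial coverage can be made concrete with a small calculation. The sketch below assumes equal sequencing depth before and after selection and a pseudocount of one read (both assumptions of this example, as is the function name), and returns the most negative log2 fold change that can be observed when a guide drops to zero reads.

```python
import math

def depletion_floor(initial_reads, pseudocount=1.0):
    """Lowest observable log2 fold change for a guide that drops to zero reads."""
    return math.log2(pseudocount / (initial_reads + pseudocount))

# Hypothetical per-guide coverages: the floor deepens as coverage increases.
for coverage in (10, 100, 1000):
    print(coverage, round(depletion_floor(coverage), 2))
```

With ten reads per guide a complete dropout cannot register below roughly -3.5, whereas with a thousand reads per guide the floor is close to -10, which is why low-coverage libraries compress the measurable range of depletion.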
Other phenotypes can also be investigated. Screens performed in E. coli identified host factors facilitating infection by bacteriophages, that is, host genes providing phage resistance once inactivated [33]. Pooled CRISPRi screens were also used to identify genes influencing the tolerance or production of chemicals of biotechnological interest [32,43], or decoupling bacterial growth from protein production [47]. Current limitations of pooled approaches include the difficulty of retrieving individual knockdowns and of studying cooperative phenotypes. This caveat has recently been tackled using droplet microfluidics to identify knockdowns that increase L-lactate production yield in cyanobacteria [43]. Another study combined microfluidics with in situ genotyping to assess complex phenotypes of individual knockdowns in high-throughput [48]. The authors performed time-lapse microscopy on a pooled library of 235 knockdowns to determine their division size and track the replication fork in each cell. After phenotyping, the genotype of each cell can be determined by sequential FISH to an RNA barcode.
All pooled screening approaches rely on on-chip oligo synthesis, a process that limits the available library size for technical and economic reasons. A new method was recently developed to generate crRNA libraries directly inside cells by repurposing the CRISPR adaptation machinery from S. pyogenes [49]. Comprehensive and extremely large libraries can easily be produced at a low cost and were employed to explore aminoglycoside resistance in Staphylococcus aureus.
A significant challenge now lies in making CRISPRi screens available to most bacterial species, and in particular in transferring large libraries to bacteria of the microbiome without introducing bottlenecks. Significant progress was made with the development of Mobile-CRISPRi [50], a toolbox of modular CRISPRi components that can be transferred by conjugation with a broad host range and integrated at conserved genomic sites in the chromosome of diverse bacteria. This system paves the way for the study of non-model gut bacteria and was recently used in vivo [51]. A strategy currently used to link genes to colonization or virulence phenotypes in high-throughput relies on the gavage of transposon mutant libraries in mice, followed by the analysis of depleted insertions from bacteria recovered in feces. This systematic investigation of bacterial gene function in vivo has already yielded important insights (recently reviewed in Ref. [52]), showing that many genes dispensable for growth in vitro are necessary in the animal gut, and that the genes necessary for fitness in vivo strongly depend on the other bacteria present in the environment. However, the engraftment and maintenance of a dense transposon library with sufficient coverage remains challenging. CRISPRi screens should provide an interesting alternative thanks to the decrease in library size. In this context, a recent study reports the first in vivo CRISPRi screen in S. pneumoniae, which investigated bottlenecks occurring during infection in mice and identified genes whose essentiality differs between in vivo and in vitro conditions [53].
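The benefit of a smaller library when a population passes through a bottleneck can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a uniformly distributed library and independent sampling of founder cells, so that a given member is lost with probability (1 - 1/L)^N for a library of L members and a bottleneck of N cells. The function name, the library sizes and the bottleneck size are hypothetical and not taken from any cited study.

```python
def expected_fraction_retained(library_size, bottleneck_cells):
    """Expected fraction of library members represented by at least one founder cell."""
    probability_lost = (1.0 - 1.0 / library_size) ** bottleneck_cells
    return 1.0 - probability_lost

# Same hypothetical bottleneck of 100,000 cells, two library sizes an order of magnitude apart.
for library_size in (10_000, 100_000):
    retained = expected_fraction_retained(library_size, 100_000)
    print(library_size, round(retained, 4))
```

Under these assumptions the smaller library passes the bottleneck almost intact, while the library ten times larger loses more than a third of its members. Real libraries are not uniform and real bottlenecks are not simple random draws, so actual losses may differ.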

Conclusions and perspectives
CRISPR-based screening tools are currently gaining popularity in bacterial genomics. In particular, CRISPRi screens are becoming widely used for loss-of-function screening owing to several key advantages over previous techniques: CRISPRi is inducible, tunable to intermediate levels and can be multiplexed to repress multiple loci simultaneously [7,34,36,54]. In addition, libraries are custom-designed and smaller than with previous techniques, resulting in a substantial decrease in sequencing cost. A diversity of phenotypes are becoming available for screening, and recent progress in microfluidics-based phenotyping and in situ genotyping should address the current caveats of arrayed approaches.
In eukaryotes, the CRISPR-Cas system has also been used to perform gain-of-function screens through the activation of transcription by a dCas9 variant fused with an activator domain, in a method known as CRISPRa (CRISPR-activation) [6,18,55]. CRISPRa was also developed in bacteria [7] but this system has been limited by its average performance and by the very narrow sequence window in which dCas9 needs to bind to activate a downstream promoter. Recent improvements [56-59] could soon make CRISPRa available for genome-wide gain-of-function screening of bacterial operons.
While Cas9-mediated DNA breaks are toxic in bacteria, new CRISPR-derived tools introduce mutations without DNA cleavage and have great potential for high-throughput screening. In particular, base-editing can introduce C→T or A→G mutations through the fusion of dCas9 or a Cas9 nickase (nCas9) to a cytosine or adenine deaminase, respectively [60,61] (Figure 1). The system has already been adapted to several bacterial species [62-68]. Another approach was recently proposed using a retron system [69,70] to generate single-stranded DNA templates that can recombine and modify the target. More recently, prime editing was shown to exploit the fusion of nCas9 to a reverse transcriptase to directly introduce a mutation programmed in the guide sequence [71]. Finally, two studies recently described the natural association of CRISPR systems with a transposition machinery. These systems catalyze the RNA-guided transposition of DNA in the target region [72,73]. We envision that these systems will soon be repurposed to specifically insert DNA in high-throughput, enabling not only the generation of targeted gene knockouts but also the insertion of functional elements such as promoters or reporters.
The development of high-throughput genomics for non-model bacteria is just starting to explore the biodiversity of the microbial world and its incredible potential for biotechnological and medical applications. The ease-of-use and cost-efficiency of CRISPR screens in bacteria should encourage microbiologists to take part in this exciting exploration.

32. Wang T, Guan C, Guo J, Liu B, Wu Y, Xie Z, Zhang C, Xing X-H: Pooled CRISPR interference screening enables genome-scale functional genomics study in bacteria with superior performance. Nat Commun 2018, 9:2475. This work is one of the first examples of genome-wide CRISPRi screening to identify essential genes in rich and minimal media. The authors also exploit the biotechnological potential of CRISPRi to identify genes involved in the tolerance to isobutanol and furfural.

33. Rousset F, Cui L, Siouve E, Becavin C, Depardieu F, Bikard D: Genome-wide CRISPR-dCas9 screens in E. coli identify essential genes and phage host factors. PLoS Genet 2018, 14:e1007749. Together with [32], this study shows how CRISPRi screens can be exploited to identify essential genes. The method is further employed to identify host factors for phage infection.
In this preprint, the authors build a computational model to predict the effect of mismatches on the repression efficiency by dCas9. They further exploit it to show that the relationship between cellular fitness and the expression level of essential genes is conserved between E. coli and B. subtilis, two bacterial species separated by two billion years of evolution.