Cre-mediated, loxP-independent sequential recombination of a tripartite transcriptional stop cassette allows for partial read-through transcription.

One of the widely used applications of the popular Cre-loxP method for targeted recombination is the permanent activation of marker genes, such as reporter genes or antibiotic resistance genes, by excision of a preceding transcriptional stop signal. The STOP cassette consists of three identical SV40-derived poly(A) signal repeats and is flanked by two loxP sites. We found that in addition to complete loxP-mediated recombination, limiting levels of the Cre recombinase also cause incomplete recombination of the STOP cassette. Partial recombination leads to the loss of only one or two of the three identical poly(A) repeats with recombination breakpoints always precisely matching the end/start of each poly(A) signal repeat without any relevant similarity to the canonical or known cryptic loxP sequences, suggesting that this type of Cre-mediated recombination is loxP-independent. Incomplete deletion of the STOP cassette results in partial read-through transcription, explaining at least some of the variability often observed in marker gene expression from an otherwise identical locus.


Introduction
The Cre-loxP recombination system derived from bacteriophage P1 is one of the most popular site-specific recombination tools. This system serves bacteriophage P1 to properly segregate P1 plasmids to daughter cells at cell division [1][2][3]. When a pair of directly repeated 34 bp loxP recombination sites is located on the same piece of DNA, P1 Cre recombinase excises the intervening DNA, releasing a circular DNA containing one loxP site and leaving the second loxP site behind [4]. Since its discovery, a vast number of applications of this recombination system beyond bacteria have been developed in various species, including in mouse cell lines [5] and transgenic mice [6,7].
Transgenically modified Cre-responder mouse strains which express fluorescent reporter proteins are now widely used to label specific cell populations, for instance to localize tagged cells within specific organs or to trace them during development or disease (fate mapping) [8]. A popular variant of this approach is Cre expression profiling using Cre recombinase-mediated excision of a transcriptional STOP cassette, flanked by directly repeated loxP sites, which is placed between a strong promoter and a marker gene. While this technique was originally developed for targeted oncogene activation [7], it can also be used to permanently activate easily detectable reporter genes. Knock-in strategies allowed for the insertion of such conditional reporter genes into the permissive Rosa26 locus [9], providing a highly versatile tool to visualize the cell-type specificity of virtually any kind of promoter/Cre combination, following crossing of a Cre mouse strain with the reporter mouse strain. Using this approach, Madisen et al. [10] developed a set of Cre reporter mice, including the Ai14 strain, that employed a previously reported STOP cassette containing three identical 254 bp poly(A) signal repeats derived from the SV40 virus [11].
We recently generated a novel BAC transgenic mouse model in which over 200 kb of the mouse Epo locus were used to drive the expression of the conditional (i.e. tamoxifen-inducible) Cre ERT2 protein [12]. These animals were crossed with Ai14 reporter mice, containing the tandem dimeric Tomato (tdT) fluorescence reporter gene preceded by the loxP-flanked STOP cassette. Following exposure to tamoxifen and oxygen-deprived (hypoxic) conditions, this mouse model allowed for the permanent fluorescent tagging of renal Epo-producing (REP) cells [12]. During the isolation and cloning of REP-derived cell lines, we noticed lower tdT fluorescence in some of the cells. This finding was unexpected since tdT is expressed from the same endogenous single-copy gene locus in all cells. We hence sought an explanation for these results and found a novel Cre activity, repeatedly resulting in partial recombination of the STOP cassette.

Generation of renal cell lines
The generation of fibroblastoid atypical interstitial kidney (FAIK) cell lines derived from kidneys of 2- to 3-month-old female Epo-Cre ERT2 mice crossed with Ai14 reporter mice has been described in detail previously [12]. Using the same protocol, we generated new REP-derived (REPD) cell lines from kidneys excised from homozygous Epo-Cre ERT2 mice containing one Ai14 and one Terminator allele in the Rosa26 locus. AB-REPD cells were obtained from a 2-month-old male mouse 3 days after tamoxifen/0.1% CO induction of Cre in vivo. TK-REPD cells were obtained from non-treated 2.5-month-old female mice, kept for 1 week in culture medium containing 10 μM tamoxifen and B27 supplement (Invitrogen, Carlsbad, CA, USA) and exposed twice for 16 h to 0.2% O2 at day 1 and 7 before treatment with 100 ng/ml diphtheria toxin until only a few tdT positive viable cells remained. Clonal FAIK1-10 and polyclonal TK-REPD4 cell lines were immortalized by lentiviral transduction with SV40 large T antigen as described [12]. Clonal AB-REPD2-22 cells were immortalized by gammaretroviral transduction with a temperature sensitive SV40 large T (pLPCX SV40 tsA58; kind gift from Parmjit Jat, London, UK) and cultured at 33°C for expansion and at 37°C for 3 to 7 days before harvest to prevent temperature-dependent artifacts.

Cell analysis
Cellular tdT expression was analyzed by fluorescence microscopy or FACS. For fluorescence microscopy, live cells were analyzed on an Eclipse Ts2R inverted microscope (Nikon, Tokyo, Japan) and images were acquired using NIS-Elements imaging software (Nikon). Exposure time and gain were kept constant for all acquired images within an experimental series. For confocal images, OCT-embedded tissue sections were analyzed on a Leica SP5 Mid UV-VIS microscope (Leica Microsystems, Wetzlar, Germany) and images were acquired using LAS X software (Leica Microsystems). Images were processed and analyzed using the Fiji distribution of ImageJ [15,16]. For FACS analysis, cells were detached using 2 mM EDTA (pH 8.0) in PBS, pelleted and resuspended in PBS. Cells were analyzed on a LSRII Fortessa FACScanner (Becton Dickinson, Franklin Lakes, NJ, USA). tdT fluorescence was excited using a 561 nm laser and detected using a 586/15 nm filter. Dead cells were excluded by 4′,6-diamidino-2-phenylindole (DAPI; Sigma-Aldrich) nuclear staining, using a 405 nm laser and a 450/50 nm filter. Data was analyzed using FlowJo software (FlowJo, Ashland, OR, USA).

RNA analysis
RNA was extracted and quantified by reverse-transcription (RT) real-time quantitative (q) PCR as described previously [12]. In brief, RT was performed with 2 μg total RNA and AffinityScript reverse transcriptase (Agilent), and the cDNA quantified using SYBR Green qPCR reagent kit (Kapa Biosystems, London, UK) in a MX3000P light cycler (Agilent). Transcript levels were calculated by comparison with calibrated standard curves and normalized to mouse ribosomal protein L28 mRNA. Primers used for RT-qPCR are listed in Supplementary Table 1 and were purchased from Microsynth (Balgach, Switzerland).
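The standard-curve quantification described above can be illustrated with a minimal Python sketch. It assumes a linear relation between Ct and log10 of the template quantity and uses invented calibration values; the function names and all numbers are hypothetical and do not reproduce the analysis software or data used in this study.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the template quantity."""
    return 10 ** ((ct - intercept) / slope)


def normalized_level(ct_target, curve_target, ct_reference, curve_reference):
    """Target quantity divided by reference (e.g. L28) quantity."""
    target = quantity_from_ct(ct_target, *curve_target)
    reference = quantity_from_ct(ct_reference, *curve_reference)
    return target / reference


# Hypothetical calibration series: 10^2 to 10^6 copies, invented Ct values.
curve = fit_standard_curve([2, 3, 4, 5, 6], [30.0, 26.5, 23.0, 19.5, 16.0])
# Hypothetical sample: target Ct 23.0, reference Ct 19.5 (same curve for brevity).
relative_level = normalized_level(23.0, curve, 19.5, curve)
```

In practice each transcript would have its own calibrated curve; a single curve is reused here only to keep the sketch short.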

DNA analysis
DNA was isolated as described previously [12] and 100 ng per reaction were amplified by PCR (30-35 cycles of 95°C/10 s, 64°C/30 s, 72°C/30 s) using KAPA2G Fast DNA polymerase (Kapa Biosystems) and the primers listed in Supplementary Table 1. PCR products were analyzed by agarose gel electrophoresis, gel-purification and DNA sequencing (Microsynth), either directly or following sub-cloning into pBluescript vector (Agilent) using XhoI and EcoRI restriction enzymes (Thermo Fisher Scientific, Waltham, MA, USA).

Clonal variation in Cre-activated reporter fluorescence intensities
During isolation of primary tdT-positive REP cells from the kidneys of Epo-Cre ERT2 × Ai14 mice [12], we repeatedly observed single cells with lower tdT fluorescence intensity. This variation in tdT fluorescence was rather unexpected because tdT was expressed from the same constitutively active single-copy Rosa26 locus (Fig. 1A) in all cells analyzed. Following immortalization and expansion, pellets of TK-REPD4 cells appeared red whereas pellets of AB-REPD2-22 remained pale (Fig. 1B). Quantification of tdT mRNA demonstrated that tdT fluorescence intensity correlates with tdT transcript levels (Fig. 1C). In order to control for proper Cre ERT2-mediated recombination of the loxP-flanked STOP cassette, genomic DNA was analyzed by PCR using the primers indicated in Fig. 1A (see Supplementary Table 1 for sequences). Surprisingly, the PCR product of 201 bp (i.e. the predicted length after loxP-mediated recombination) was only detected in the REPD cell line TK-REPD4 (tdT high) whereas AB-REPD2-22 (tdT low) displayed a longer PCR product (Fig. 1D). The previously reported FAIK1-10 cell line [12] contains a non-recombined Ai14 locus (tdT null) which resulted in the expected 1072 bp PCR product (Fig. 1D). Direct sequencing of these PCR products revealed the deletion of all three poly(A) signal repeats in TK-REPD4 cells, but one remaining poly(A) signal repeat as well as both loxP sites and linker regions (the sequences between the loxP sites and the trimeric poly(A) signal repeats) in AB-REPD2-22 cells (Supplementary Fig. 1). In total, we obtained ten partially recombined AB-REPD cell clones, derived from two independent primary cell isolations, with one remaining poly(A) signal repeat (data not shown).
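As a rough plausibility check of the band pattern, the amplicon lengths expected after stepwise repeat loss can be estimated from the numbers given in the text. The sketch below assumes that each lost poly(A) repeat shortens the 1072 bp non-recombined product by exactly 254 bp (the repeat length stated in the Introduction) and that nothing else changes; the resulting sizes are estimates under that assumption, not measured values, and the function name is hypothetical.

```python
REPEAT_BP = 254            # poly(A) repeat length stated in the Introduction
UNRECOMBINED_BP = 1072     # PCR product of the non-recombined locus
LOXP_RECOMBINED_BP = 201   # PCR product after complete loxP-mediated excision


def expected_amplicon_bp(repeats_remaining):
    """Estimated amplicon length with 1, 2 or 3 poly(A) repeats left.

    Assumes precise loss of whole repeats with both loxP sites and the
    linker regions retained.
    """
    if repeats_remaining not in (1, 2, 3):
        raise ValueError("repeats_remaining must be 1, 2 or 3")
    lost = 3 - repeats_remaining
    return UNRECOMBINED_BP - lost * REPEAT_BP


for repeats in (3, 2, 1):
    print(repeats, "repeat(s):", expected_amplicon_bp(repeats), "bp")
print("loxP-mediated excision:", LOXP_RECOMBINED_BP, "bp")
```

Under this assumption, a clone retaining a single repeat, such as AB-REPD2-22, would be expected to yield a product of about 564 bp, clearly separable from both the 201 bp and the 1072 bp products.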

Cre dose-dependent sequential recombination of an endogenous STOP cassette
AB-REPD2-22 cells were obtained from primary REP cells isolated from mice treated with tamoxifen and a brief hypoxia exposure in vivo three days prior to isolation. In contrast, TK-REPD4 cells were obtained from primary REP cells of untreated mice and were stimulated with tamoxifen and repeated hypoxia in vitro. Stronger transcriptional Cre induction and better tamoxifen accessibility in vitro than in vivo suggest that more Cre ERT2 was transcriptionally induced and translocated into the nucleus during TK-REPD4 than AB-REPD2-22 generation, leading to the hypothesis that a nuclear Cre ERT2 dose-dependent effect is responsible for the incomplete STOP cassette recombination. To analyze whether Cre dose-dependent sequential deletion of one, two or all three poly(A) signal repeats of the STOP cassette leads to a progressive increase in tdT expression (Fig. 2A), tdT null FAIK1-10 cells were transiently transfected with increasing amounts of an improved Cre (iCre) expression vector, or constant amounts of a tamoxifen-inducible Cre (ERT2-Cre-ERT2) expression vector, followed by treatment with increasing concentrations of tamoxifen. In both cases, cells with variable tdT fluorescence intensities were obtained, demonstrating that this effect is independent of a particular Cre protein modification (Fig. 2B). While approx. 75% of all tdT-positive red cells were tdT low (blue in Fig. 2B, lower panel) after low amount iCre transfection, only 7% remained tdT low after high iCre transfection. Similarly, 26% were tdT low in the absence of tamoxifen (owing to the strong ERT2-Cre-ERT2 overexpression) but only 4% remained tdT low in the presence of 1 μM tamoxifen. It should be noted that for over 40 passages of in vitro cultivation we never observed any tdT-positive FAIK1-10 cells, neither under hypoxic conditions nor in the presence of tamoxifen or hydroxytamoxifen (data not shown), demonstrating that in the absence of exogenous Cre no STOP cassette recombination occurs.
Incomplete STOP cassette recombination was confirmed by FACS analysis, clearly showing that, besides non-recombined (corresponding to the naïve FAIK1-10 population) and fully recombined (corresponding to the TK-REPD4 population) cells, there was also a substantial number of cells with intermediate tdT fluorescence intensity, corresponding to the AB-REPD2-22 population (Fig. 2C). PCR analysis of the STOP cassette again resulted in multiple bands, consistent with incomplete recombination (Fig. 2D). These PCR products were isolated from the gel, sub-cloned and sequenced, confirming the iCre-dose-dependent sequential recombination of the three poly(A) signal repeats (Supplementary Fig. 2). To confirm that Cre-mediated stepwise STOP cassette deletion is not specific for non-recombined FAIK1-10 cells, partially recombined AB-REPD2-22 cells were transfected with an iCre expression vector. Exogenous iCre expression resulted in a clear increase in tdT fluorescence intensity (Fig. 2E). PCR amplification using primers flanking the STOP cassette indicated that a large portion of the transfected cells underwent complete recombination (Fig. 2F), resulting in an increase in tdT mRNA levels (Fig. 2G).

Cre dose-dependent sequential recombination of an exogenous STOP cassette
To exclude that this incomplete STOP cassette recombination might occur only with the endogenous Ai14 reporter in the Rosa26 locus of our genetically modified mouse strains, we repeated these experiments using CHO cells transiently co-transfected with the Ai9 [10] reporter plasmid (Fig. 3A) together with various Cre expression vectors as shown in Fig. 3B. Varying degrees of tdT fluorescence intensities again suggested partial recombination also of exogenous bacterial reporter plasmids (Fig. 3C). PCR amplification using the primers indicated in Fig. 3A, subcloning and sequencing confirmed sequential recombination of the tripartite STOP cassette (Supplementary Fig. 3). Of note, the recombination breakpoints inside the STOP cassette always precisely corresponded to the end/start of each poly(A) signal repeat without any similarity to the loxP sequence, suggesting that this type of Cre-mediated recombination is loxP-independent.
single cells with a coincidentally low Cre-mediated recombination event in vivo, we reasoned that a milder Epo-inducing and hence Cre-inducing stimulus would increase the proportion of tdT low REP cells in Epo-Cre ERT2 × Ai14 mice. Therefore, we treated these mice with roxadustat (FG-4592), a compound that has recently been approved in China for renal anemia therapy [17]. Roxadustat is a hypoxia-inducible factor prolyl-4-hydroxylase inhibitor that was designed for a relatively mild Epo induction to correct anemia in chronic kidney disease patients [18]. Combined tamoxifen/roxadustat treatment induced tdT positive REP cells in the mouse kidney (Fig. 4A). Automated analysis of full kidney slices demonstrated that many REP cells showed a weaker tdT fluorescence than others (Fig. 4B). To confirm that differences in fluorescence intensities between different cells are not due to varying z-planes, sequential images across the full 10 μm section were taken every

Discussion
The artificial SV40-derived tripartite STOP cassette [11] is widely used for many different applications in molecular biology. Our observations demonstrate that the Cre recombinase can cause, or is at least involved in, partial recombination of this cassette by targeting the end-start junctions of the three identical poly(A) signal repeats, leaving one or two poly(A) sequences intact. Without the presence of Cre we never observed any STOP cassette recombination, neither in vivo nor in vitro. Importantly, the poly(A) sequence shows no significant similarity to the loxP sequence and we did not find any abnormality in the two loxP sites outside the STOP cassette: while only one loxP site remained following complete (loxP-dependent) recombination, both loxP sites were still present following (loxP-independent) excision of one or two of the three poly(A) signal repeats. Overexpression of Cre in tdT null FAIK1-10 as well as in tdT low AB-REPD2-22 was sufficient to completely remove the STOP cassette and fully restore tdT expression, further confirming that the Cre-loxP system was not generally corrupted in our experiments and that tdT mRNA expression is directly related to the extent to which the STOP cassette is truncated.
What could be the mechanism(s) underlying the partial Cre-mediated recombination of the STOP cassette? One possibility may be alternative loxP-like sequences within the STOP cassette's SV40 poly(A) repeats, which only partially match the canonical loxP sequence, such as loxB, loxL and loxR sites [19] or other cryptic loxP sites [20][21][22][23]. However, when we aligned 36 different reported cryptic loxP sites to the STOP cassette, the best matching sequence was only 59% identical and covered only half of the 34 bp (cryptic) loxP site (#36 in Supplementary Fig. 4). Ignoring the spacer region and considering only the Cre recognition sequences thought to interact with Cre [23], one half of the palindromic loxP site almost perfectly matched the STOP cassette (#37 in Supplementary Fig. 5). However, one mismatch at a crucial position remained and the other half of the palindrome did not match at all. In conclusion, while this highly similar cryptic recognition sequence might be involved in Cre recruitment to the poly(A) repeats, the absence of any match with the second half of the palindromic sequence suggests that mechanisms other than cryptic loxP sites may be involved.
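The kind of search described above (ungapped matching of a candidate recognition site against a target sequence with a limited number of mismatches) can be sketched as follows. This is not LALIGN and does not reproduce our analysis: it scans the forward strand only, the function name is hypothetical, and the target sequence is an invented toy string rather than the STOP cassette. The site used is the commonly cited canonical 34 bp loxP sequence.

```python
# Canonical loxP: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def best_ungapped_match(site, target, max_mismatches=2, min_length=4):
    """Longest ungapped alignment of `site` to `target` (forward strand only).

    Returns (length, identity, site_start, target_start) or None.
    The window may include mismatches at its ends, so the reported
    identity is a lower bound on what a trimmed alignment would give.
    """
    best = None
    best_key = None
    # A diagonal d pairs site position i with target position i + d.
    for d in range(-(len(site) - 1), len(target)):
        i0 = max(0, -d)                       # first site index on this diagonal
        n = min(len(site) - i0, len(target) - (i0 + d))
        left = 0
        mismatches = 0
        for right in range(n):                # sliding window along the diagonal
            if site[i0 + right] != target[i0 + d + right]:
                mismatches += 1
            while mismatches > max_mismatches:
                if site[i0 + left] != target[i0 + d + left]:
                    mismatches -= 1
                left += 1
            length = right - left + 1
            key = (length, -mismatches)       # prefer longer, then cleaner
            if length >= min_length and (best_key is None or key > best_key):
                best_key = key
                identity = (length - mismatches) / length
                best = (length, identity, i0 + left, i0 + d + left)
    return best


# Toy target (hypothetical): a loxP site embedded in flanking bases.
toy_target = "GGGG" + LOXP + "CCCC"
hit = best_ungapped_match(LOXP, toy_target)
```

A complete search would also scan the reverse complement of the target and would be repeated for each reported cryptic site.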
Alternatively, the Cre recombinase may be recruited to the STOP cassette via the flanking loxP sites and create DNA breaks which are then repaired. However, we sequenced 23 partially recombined STOP cassettes (Supplementary Figs. 2 and 3) and never observed any mutation, insertion or deletion at the end-start junctions of the three identical poly(A) signal repeats. Therefore, non-homologous end joining seems rather unlikely as a DNA repair mechanism. Regarding the tripartite perfect repeats, homologous recombination DNA repair proteins, possibly even recruited by the Cre recombinase, appear a more likely mechanism causing partial recombination of the STOP cassette.
Incompletely deleted tripartite poly(A) signal repeats led to partial transcription through the STOP cassette, suggesting a Cre dose-dependent step-wise increase in marker/reporter expression. To our knowledge, such a Cre-mediated but loxP-independent recombination has not been reported before. A brief literature survey of this popular method immediately revealed several examples with apparently intermediary reporter or marker gene expression, in addition to cells with maximal expression [24][25][26][27]
might always be a minor cell population with incomplete inactivation of the STOP cassette. Especially for clonal experiments, it may be worthwhile to analyze the recombination of the STOP cassette, using the aforementioned method. Incomplete STOP cassette deletion should also be considered when analyzing in vivo expression patterns because weaker reporter signals may reflect lower Cre levels, despite being expressed from the same ubiquitous Rosa26 locus, which will be relevant for e.g. positive/negative threshold decisions.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement
The authors wish to thank Parmjit Jat, Inder Verma, Connie Cepko and Hongkui Zeng for the kind gifts of plasmids, and Patrick Spielmann for expert technical assistance.

Funding
This work was supported by the National Centre of Competence in Research "Kidney.CH" and the Swiss National Science Foundation (310030_184813).

Alignment between the STOP cassette (entire sequence between the two flanking loxP sites) and the canonical Cre recognition site (#1, bold) as well as 35 cryptic Cre recognition sites (#2 to #36) previously reported in the indicated publications. The canonical Cre recognition sequence with all bases irrelevant for Cre binding replaced by the ambiguous base N, as well as the Cre recognition sequence with all bases except for the TATA motif replaced by N, are included as #37 and #38, respectively. Bases required for Cre binding [23] are highlighted in blue. Lowercase letters denote mismatches between the cryptic and the canonical loxP sites. LALIGN software was used with a minimal length of 4 nt, allowing for maximally 2 mismatches and 0 gaps. Nucleotides of the reported cryptic loxP sites matching the STOP cassette are underlined, and the longest possible alignments (Length) as well as the corresponding % sequence identity (Similarity) are indicated.