Engineering the AAVS1 locus for consistent and scalable transgene expression in human iPSCs and their differentiated derivatives

The potential use of induced pluripotent stem cells (iPSCs) in personalized regenerative medicine applications may be augmented by transgenics, including the expression of constitutive cell labels, differentiation reporters, or modulators of disease phenotypes. Thus, there is a need for reproducible transgene expression amongst iPSC sub-clones with isogenic or diverse genetic backgrounds. Using virus or transposon vectors, transgene integration sites and copy numbers are difficult to control, and nearly impossible to reproduce across multiple cell lines. Moreover, randomly integrated transgenes are often subject to pleiotropic position effects as a consequence of epigenetic changes inherent in differentiation, undermining applications in iPSCs. To address this, we have adapted popular TALEN and CRISPR/Cas9 nuclease technologies in order to introduce transgenes into pre-defined loci and overcome random position effects. AAVS1 is an exemplary locus within the PPP1R12C gene that permits robust expression of CAG promoter-driven transgenes. Gene targeting controls transgene copy number such that reporter expression patterns are reproducible and scalable by 2-fold. Furthermore, gene expression is maintained during long-term human iPSC culture and in vitro differentiation along multiple lineages. Here, we outline our AAVS1 targeting protocol using standardized donor vectors and construction methods, as well as provide practical considerations for iPSC culture, drug selection, and genotyping. © 2015 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Induced pluripotent stem cell (iPSC) technology [1] continues to drive regenerative medicine to new frontiers. The potential of disease-specific iPSCs to differentiate into affected tissues enables potent in vitro disease modeling and the potential for cell therapeutics. Such applications certainly benefit from reproducible transgenic systems, where appropriate expression of a therapeutic protein [2] or reporter for tracking of transplanted cells in an animal model [3] is key. Moreover, comparison of transgene-augmented phenotypes across multiple iPSC clones would veritably suffer from technical variations. Viral or transposon systems provide ease of delivery [4], yet random integration into the genome may impact both endogenous and transgenic gene expression, such that screening for multiple comparable clones can become a time-consuming endeavor.
Selective integration of transgenic elements into defined loci would address undue screening efforts. Conventional gene targeting flanks transgene elements with genomic homology regions in a 'donor' plasmid DNA template [5], which exploits endogenous cellular homology directed repair (HDR) pathways for selective integration. As conventional gene targeting events are rare, recombinant proteins such as zinc finger nucleases (ZFNs) [6], TALE nucleases (TALENs) [7], and CRISPR/Cas9 [8] can be used to induce double-strand breaks (DSBs) at target sites in the genome, stimulating endogenous cellular DNA damage response and repair pathways [9]. Although non-homologous end-joining (NHEJ) of DSBs can lead to mutagenic insertions and deletions (collectively known as 'indels'), donor-mediated HDR faithfully reconstitutes genomic sequence, greatly increasing the frequency of gene targeting events. Thus, by stimulating DNA damage, nuclease systems can be used to enhance transgene integration in a sequence- and locus-specific manner. Originally described as a major hotspot for adeno-associated virus (AAV) integration, intron 1 of the protein phosphatase 1, regulatory subunit 12C (PPP1R12C) gene on human chromosome 19 is referred to as the AAVS1 locus [10]. This locus allows stable, long-term transgene expression in many cell types, including embryonic stem cells [11]. Insulating properties have been attributed directly to AAV; however, potent insulators have also been shown to exist at the AAVS1 locus itself [12]. Gene targeting using homology to the AAVS1 integration site in combination with ZFNs [13,14] and later TALENs [15] enhanced targeted transgene delivery. As disruption of PPP1R12C is not associated with any known disease, the AAVS1 locus is often considered a safe-harbor for transgene targeting.
To rival transgenic approaches based on simple viral transduction or transposition, targeted transgene integration with nucleases must be efficient and standardized. Here, we define a systematic AAVS1 targeting method based on updated AAVS1 targeting materials, amenable to use with either TALEN or CRISPR/Cas9 nucleases. Donor plasmids contain puromycin or optimized neomycin selection markers, and are suitable for transgene cloning strategies based on Gateway or In-Fusion technology. Maintaining the original gene-trap selection approach [13], AAVS1 targeting is achieved with high fidelity and low false-positive rates. We provide universal genotyping strategies including details on genomic PCR primers and Southern blot probe design, and identify common aberrant targeting events. We outline conditions to select correctly targeted clones and exclude heterozygotes with NHEJ events on the non-targeted allele. Constitutive reporter expression from the AAVS1 locus is found to be scalable and stable in both pluripotent and differentiated culture conditions. Finally, we provide a resource of targeted fluorescent or luminescent iPSCs in commonly used genetic backgrounds.

Overall strategy
This method outlines the specific delivery of transgenic elements to the AAVS1 safe-harbor locus in human iPSCs, stimulated by either TALEN or CRISPR/Cas9 nuclease systems. Donor plasmids maintain a common design with previously reported targeting vectors [15], with adaptations that permit easy assembly of complex transgene cargo. AAVS1 lies within the first intron of the constitutively expressed PPP1R12C gene, and as such, promoterless selection markers are employed for high-fidelity targeting. Universal genotyping aids in the selection of correctly targeted clones. Avoiding heterozygous clones with secondary NHEJ events opens up the possibility of targeting the second AAVS1 allele in order to generate compound heterozygotes.
Here, we provide a walkthrough of the genotyping and reporter evaluation steps using TALEN- or CRISPR/Cas9-stimulated delivery of reporters (GFP, mCherry or a GFP-luc2 fusion) constitutively expressed from the CAG promoter (CMV early enhancer, chicken β-actin promoter, and rabbit β-globin intron [16]) into the AAVS1 locus of human iPSCs. Targeting with CAG-driven reporters exemplifies the reproducibility, stability, and scalability of transgene expression from AAVS1, in both iPSCs and their differentiated derivatives. Either TALEN [15] or CRISPR/Cas9 [17] nuclease systems may be used to generate DSBs and enhance targeting at the AAVS1 locus.
We used Golden Gate [18,19] or oligo cloning [17,20] to generate TALEN and CRISPR nuclease expression systems that target the AAVS1 locus at the same position cleaved by ZFNs (Fig. 1A, B and Table 1). Alternative AAVS1-targeting nucleases are available through resources such as Addgene. Note that nuclease target sequences may vary, and should be confirmed to be compatible with the standard donor vector design described in [13] and this protocol (see below).
When compared for NHEJ activity using a T7 endonuclease I assay [21], both native and improved dNC-TALENs [18] are outperformed by the CRISPR/Cas9 system (Fig. 1C). This difference is reflected in the average colony numbers obtained in gene targeting experiments, such that the CRISPR system has now superseded TALENs for our routine targeting applications.
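As an illustration of how percent cleavage is typically derived from a T7E1 gel, the sketch below applies a widely used estimate: the cleaved fraction f_cut = (b + c) / (a + b + c), where a is the intensity of the undigested PCR product and b, c are the cleavage products, and the estimated indel frequency is 100 × (1 - sqrt(1 - f_cut)). Whether exactly this formula was applied for Fig. 1C is an assumption; the function names and band-intensity inputs are hypothetical.

```python
import math


def fraction_cleaved(uncut, cut_bands):
    """Fraction of total band intensity found in the T7E1 cleavage products."""
    total = uncut + sum(cut_bands)
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return sum(cut_bands) / total


def estimated_indel_percent(uncut, cut_bands):
    """Estimate indel frequency (%) from densitometry of a T7E1 digest.

    Assumes heteroduplexes form at random after re-annealing, so the
    cleaved fraction under-reports the mutant fraction; the square-root
    term corrects for this.
    """
    f_cut = fraction_cleaved(uncut, cut_bands)
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units, for illustration only.
    print(round(estimated_indel_percent(25.0, [50.0, 25.0]), 1))
```

The band intensities must come from the same lane and be background-subtracted; the numbers in the usage line are placeholders, not measurements from this study.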

Donor vector construction (estimated time: ~1-2 weeks)
Nucleases increase HDR rates such that only 1-2 kb of genomic homology is required, obviating the need for long (~10 kbp) regions demanded by conventional gene targeting. Thus, simplified donor plasmid construction avoids technically challenging cloning strategies or even streamlined recombination-based strategies beginning with genomic libraries [22] or BAC constructs [23].
Our basic AAVS1 donor plasmid designs are derived from Addgene #22075 [13], and are outlined in Fig. 2A and Table 1. All donors use AAVS1 homology arms of ~800 bp each (HA-L and HA-R), which target the selection cassette to intron 1 of PPP1R12C, replacing just the 'C' nucleotide (Fig. 1A), or the entire downstream sgRNA-T2 recognition sequence and PAM 'CCACTAGGGACAGGATTGG'. The promoter-less splice-acceptor (SA), T2A peptide-linked 'gene trap' selection cassette design (Fig. 2A, C) effectively eliminates false-positive background arising from random integration. The majority of published AAVS1 donor vectors share this basic design, but the homology arms of alternate donor vectors should be scrutinized before application. Our own vectors accommodate various transgene cloning strategies and selection markers, including:
Multiple Cloning Site (MCS): composed of the unique restriction sites SpeI-PacI-XbaI-SalI (pAAVS1-P-MCS and -Nst-MCS).
Gateway destination cassettes (Invitrogen): enable the introduction of transgenes and cDNAs from common Gateway Entry vector libraries for constitutive expression (pAAVS1-P-CAG-DEST and -Nst-DEST).
In-Fusion (Takara Bio) design: simplifies the cloning of large transgenes or genomic regions, such as tissue-specific promoters (Fig. 2B). We recommend cloning large transgenes as restriction digestion fragments, since such elements can be difficult to amplify by PCR or may be prone to error due to repetitive regions or high GC content. Here, the donor plasmid itself is PCR amplified using defined primers bearing 12-15 nt of homology to the ends of the transgene fragment. PCR amplified regions, including homology arms and In-Fusion junctions, are sequenced with a universal primer set.
Selection marker choices: puromycin-N-acetyltransferase (puro), or neomycin phosphotransferase (neo). All donor vectors described herein use neo^E182, which confers superior resistance to G418 (Fig. 2C).

Fig. 1 (partial caption). ... [19] in a custom Gateway ENTRY vector before transfer to a CAG expression vector. The AAVS1 sgRNA T2 [17] was cloned by Golden Gate into the pX330 vector [20], for expression from the U6 promoter along with Cas9. (C) T7E1 assay comparing the cleavage activity of TALEN and CRISPR/Cas9 systems in HEK293T cells. PCR products amplified from genomic DNA with dna182 and dna183 were treated with T7 Endonuclease I (NEB cat. No. M0302S) for 15 min followed by electrophoresis, and percent cleavage calculated [21]. dNC-TALEN architecture is based on truncated PthXo1 (Xanthomonas oryzae) [18]. M, marker (bp); n.c., negative control.

In-Fusion reaction
Mix 100 ng of PCR product with 50-200 ng of restriction fragment (adjusting for fragment size, 0.5-10 kb) in 8 µL of deionized water. Add 2 µL In-Fusion HD enzyme, mix, and incubate at 50°C for 15 min. Transform 2-5 µL into chemically competent bacteria, and proceed with standard plasmid selection and extraction.

Fig. 2 (partial caption). ... Sequence validation of the donor vector regions uses universal primers described in the text (red arrows). (C) iPSCs electroporated with the AAVS1 CRISPR/Cas9 nuclease system (pXAT2) and one of three MCS donor vectors, encoding resistance to puro (negative control), neo^D182, or neo^E182 (Nst) were selected for 10 days in G418 at the indicated concentration. The resulting colonies were stained with crystal violet. Images are representative of colony yield from 1 × 10^5 cells in a 6-well plate format.
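The pipetting volumes for the In-Fusion reaction set-up in this section follow from the DNA stock concentrations. The sketch below works them out for 100 ng of PCR-amplified donor backbone and a chosen fragment mass within the stated 50-200 ng range, made up to 8 µL before adding 2 µL of enzyme. The function name, parameter names, and the example concentrations are hypothetical; how the fragment mass should scale with fragment size is not specified here, so it is left as an input.

```python
def in_fusion_mix(vector_ng_per_ul, fragment_ng_per_ul, fragment_ng=100.0,
                  vector_ng=100.0, dna_water_ul=8.0, enzyme_ul=2.0):
    """Return pipetting volumes (in µL) for one In-Fusion reaction."""
    if not 50.0 <= fragment_ng <= 200.0:
        raise ValueError("fragment mass should be within 50-200 ng")
    if vector_ng_per_ul <= 0 or fragment_ng_per_ul <= 0:
        raise ValueError("concentrations must be positive")

    vector_ul = vector_ng / vector_ng_per_ul
    fragment_ul = fragment_ng / fragment_ng_per_ul
    water_ul = dna_water_ul - vector_ul - fragment_ul
    if water_ul < 0:
        # The DNA alone would exceed the 8 µL available before enzyme addition.
        raise ValueError("DNA stocks are too dilute for an 8 µL mix")

    return {
        "vector_ul": vector_ul,
        "fragment_ul": fragment_ul,
        "water_ul": water_ul,
        "enzyme_ul": enzyme_ul,
        "total_ul": dna_water_ul + enzyme_ul,
    }


if __name__ == "__main__":
    # Hypothetical stock concentrations (ng/µL), for illustration only.
    for name, volume in in_fusion_mix(50.0, 100.0, fragment_ng=200.0).items():
        print(f"{name}: {volume:.1f}")
```

Raising an error when the DNA does not fit into 8 µL is a design choice of this sketch; in practice the stocks would be concentrated or the PCR product re-purified.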

Sequence verification
* Note that primers dna1130 and dna157 bind to the bovine growth hormone poly-A (bGHpA) sequence of the selection marker. For transgene elements containing bGHpA, alternate sequencing primers should be designed.

iPS cell culture
AAVS1 targeting with nucleases has been performed both on feeders and using feeder-free (Ff) culture. Ff is preferred for drug selection and cloning steps, while feeder-based culture is acceptable for human iPSC maintenance. In our experience using feeders, increased labor and lower overall cell yield is the trade-off for decreased material costs. We previously published a detailed protocol for the routine maintenance of human iPSC lines in Ff conditions, using dishes coated with Laminin-511 E8 fragment [4]. Other options for Ff culture have been described (e.g. BD Falcon Matrigel, Invitrogen Geltrex) and, generally, the protocols outlined here should be applicable to iPSCs maintained under such conditions. However, empirical optimization for new cell lines and culture conditions is strongly recommended. Modifications to our published protocol are highlighted in the relevant sections below.
Briefly, plasmids are prepared using a Plasmid MaxiPrep kit (Qiagen, cat. No. 12663), and stored at 0.5-1.0 mg/mL in sterile TE, pH 8.0. Specific endotoxin-free kits may also be used (Qiagen, cat. No. 12362), although standard Maxi preparations are normally low in endotoxin and are suitable for application in human iPSCs. Plasmid DNA should be verified by restriction digest and gel electrophoresis for identity and integrity, and quantity determined by NanoDrop or Qubit (Invitrogen).
Target iPSCs should be grown to sub-confluency on the day of transfection, and harvested as a single-cell suspension. One million (1 × 10^6) cells are required for each electroporation, and should be prepared in bulk. Count viable iPSCs using dye exclusion (e.g. 0.4% trypan blue, Gibco, cat. No. 15250-061) for accuracy and consistency.
To deliver plasmid DNA into iPSCs, we routinely use the NEPA21 (Nepa Gene Co. Ltd) achieving ~30-50% transfection efficiency with >70% cell viability. Electroporation provides the most predictable and reproducible transfection rates and cell survival, with a low consumables cost. The optimal electrical pulse conditions provided below were determined using transient transfection of a fixed amount of CAG-GFP reporter plasmid (3 µg), and quantification by FACS. With similar optimization, other electroporation-based DNA delivery devices have also been used successfully (e.g. BioRad GenePulser, Invitrogen Neon, and Lonza Nucleofector).

DNA preparation
In a sterile 1.5 mL tube, mix the following plasmid vectors, and bring the total volume to 10 µL with sterile TE pH 8.0:
The control condition will reveal the frequency of random integration of the donor vector (by design, false positives from random integration of the promoter-less selection marker are negligible).

iPSC preparation
1. Harvest iPSCs as a single-cell suspension using Accutase. Wash and resuspend in culture media containing 10 µM ROCK inhibitor.
2. Count viable cells using trypan blue exclusion on a hemocytometer or automated counting device (e.g. BioRad TC-10 or Invitrogen Countess). One well of a 6-well dish typically yields 3-4 × 10^6 cells.
3. Collect the appropriate cell number (1 × 10^6 cells/electroporation) by centrifugation and resuspend them at a concentration of 1 × 10^6 cells/100 µL Opti-MEM I reduced-serum medium (Invitrogen, cat. No. 31985-062).

Electroporation and plating
Perform each electroporation one-at-a-time to avoid inconsistent times for incubation of DNA and iPSCs between cuvettes, as extended incubation can lead to DNA degradation.
3. Resuspend cells in 900 µL of culture media containing 10 µM ROCK inhibitor.
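The cell numbers used in the iPSC preparation steps reduce to simple arithmetic, sketched below: given a viable cell count and the number of planned electroporations, compute the volume of suspension to collect and the Opti-MEM volume needed for 1 × 10^6 cells per 100 µL. The function and parameter names are hypothetical, and the example count is a placeholder rather than a measured value.

```python
def electroporation_prep(viable_cells_per_ml, suspension_ml, n_electroporations,
                         cells_per_electroporation=1_000_000,
                         optimem_ul_per_electroporation=100.0):
    """Plan bulk cell preparation for a set of electroporations."""
    if viable_cells_per_ml <= 0 or suspension_ml <= 0 or n_electroporations < 1:
        raise ValueError("inputs must be positive")

    cells_available = viable_cells_per_ml * suspension_ml
    cells_needed = n_electroporations * cells_per_electroporation
    if cells_needed > cells_available:
        raise ValueError("not enough viable cells for the planned electroporations")

    return {
        "cells_needed": cells_needed,
        # Volume of the counted suspension to pellet by centrifugation.
        "collect_ml": cells_needed / viable_cells_per_ml,
        # Opti-MEM volume giving 1e6 cells per 100 µL after resuspension.
        "optimem_ul": n_electroporations * optimem_ul_per_electroporation,
    }


if __name__ == "__main__":
    # Hypothetical count: 2e6 viable cells/mL in 2 mL, three electroporations.
    print(electroporation_prep(2_000_000, 2.0, 3))
```

Preparing one extra electroporation's worth of cells is a common way to absorb pipetting losses, although that margin is not part of the protocol as written.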

Drug selection (estimated time: ~10 days)
Observe the cells following plating, and remove ROCK inhibitor from the culture media once small colonies have formed (24-48 h after plating). 48 h after electroporation, begin selection with puromycin (0.5 µg/mL, Sigma-Aldrich, cat. No. P7255) or G418 (175 µg/mL, Calbiochem, cat. No. 345810). Note that drug concentrations may need to be determined empirically for each new iPSC line. Feed every day with fresh drug-containing culture medium for up to 10-12 days, or until colonies form. Using CRISPR/Cas9, the expected yield is typically ~30-60 colonies from 1-3 × 10^5 plated cells (Fig. 2C, bottom right).

Targeted iPSC isolation and expansion (estimated time: ~2 weeks)
Choose average-sized colonies that are well-isolated, with smooth edges and uniformly round shape (~500-1000 µm in diameter). When visible reporters such as GFP under control of the constitutively expressed CAG promoter are targeted to AAVS1, fluorescence uniformity and intensity can be used reliably to verify clonality, and possibly even genotype (see Section 3). Full well imaging using a BioStation CT (Nikon) or similar scanning instrument can help to identify desirable clones. Typically, 24 clones are sufficient to cover the range of resulting genotypes.
Under Ff conditions, iPSC clones are easily maintained in 96-well plates. Pick clones with a 10 µL pipetman, and transfer to wells pre-coated with laminin, in media containing 10 µM ROCK inhibitor. Dissociate to smaller clumps by gentle pipetting. One or two days later, change to media without ROCK inhibitor, supplemented with puromycin (0.5 µg/mL) or G418 (175 µg/mL). Maintain drug selection for the first passage. Using a multichannel pipette, passage the 96-well plate 1:2, allowing the cells to become confluent. Split one plate 1:4 for further expansion, freezing, screening and maintenance (following ~4-6 days growth). The second plate will be used immediately for DNA extraction and genotyping (Section 2.6).

Identification of correctly targeted clones
A combination of PCR and Southern blot genotyping is required to identify all possible targeting outcomes:
a. Homozygously targeted.
b. Heterozygously targeted.
c. Heterozygously targeted with an NHEJ event in the non-targeted allele.
d. Category a, b, or c with additional donor vector integrations.
A schematic of normal and targeted AAVS1 alleles, along with the positions of PCR primers and Southern blot probes is depicted in Fig. 3A. Initial genotyping by PCR (Fig. 3B) combined with amplicon sequencing (Fig. 3C) helps to classify clones into categories (a-c). In most cases, NHEJ indel mutations in category (c) clones are subtle, and only detected by direct sequencing of PCR products. In rare cases, these indels are large enough to be identified as up- or down-shifted PCR products by gel electrophoresis (e.g. clone 317-8, Fig. 3B, C). If the intention is to allow sequential targeting of the second AAVS1 allele, it is imperative to choose clones without an NHEJ indel that disrupts the nuclease recognition sequence.
Category (a) and (b) clones are desirable, but not definitively identified by PCR alone. Southern blotting is recommended to verify homo- and heterozygous targeting of the AAVS1 locus. More importantly, only Southern blotting with an internal transgenic or genomic probe (GFP or 5′ probe, Fig. 3A, D, and E) can reveal category (d) clones, which arise from genomic integration of the donor plasmid [24], and are the major source of background we observe during AAVS1 targeting. Interestingly, random integration of the plasmid is rarely observed. Rather, category (d) clones present a predictable banding pattern based on the integration of the entire plasmid, suggesting that it occurs at the AAVS1 locus (Fig. 3D-F, and Fig. 4C).
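The decision logic behind categories (a-d) can be written down compactly. The sketch below combines PCR, amplicon sequencing, and Southern blot observations into one category label; the record fields and the function name are hypothetical, and the inputs are assumed to be calls already made by the experimenter from gels, sequence traces, and blots.

```python
from dataclasses import dataclass


@dataclass
class CloneResult:
    """Genotyping observations for one clone (field names are illustrative)."""
    targeted_allele_detected: bool      # junction PCR / Southern shows a targeted allele
    untargeted_allele_detected: bool    # non-targeted AAVS1 allele still present
    untargeted_allele_has_indel: bool   # indel found by sequencing that allele
    extra_donor_integration: bool       # extra bands with an internal Southern probe


def classify(clone: CloneResult) -> str:
    """Map observations onto targeting categories a-d."""
    if not clone.targeted_allele_detected:
        return "not targeted"

    if not clone.untargeted_allele_detected:
        base = "a"   # homozygously targeted
    elif clone.untargeted_allele_has_indel:
        base = "c"   # heterozygous, NHEJ event on the non-targeted allele
    else:
        base = "b"   # heterozygously targeted

    if clone.extra_donor_integration:
        return f"d ({base} with additional donor integration)"
    return base


if __name__ == "__main__":
    print(classify(CloneResult(True, True, False, False)))
```

Because categories (a) and (b) are not definitively identified by PCR alone, the label is only as reliable as the Southern blot data supplied for the last two fields.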

Genomic DNA preparation
For large numbers of clones, genomic DNA prepared directly from 96-well culture plates is amenable to PCR screening. Allow cells to reach confluency, then wash out media with PBS before adding lysis buffer (10 mM Tris-HCl, pH 7.5; 10 mM EDTA; 10 mM NaCl; 0.5% sarcosyl; 1 mg/mL Proteinase K). Incubate the plate at 55°C overnight. The next day, precipitate DNA directly in the plate by adding an ice-cold slurry of 75 mM NaCl in 100% ethanol. Invert to discard the liquid, wash twice with 70% ethanol, and air-dry before resuspending in 100 µL of TE pH 8.0 buffer. For more consistent DNA yield, cells may be dissociated with Accutase, and collected by centrifugation in 96-well PCR plates. All subsequent DNA precipitation and wash steps may be carried out via centrifugation.
For smaller numbers of clones, column purification (DNeasy Kit, Qiagen, cat. No. 69506) provides high purity, homogenous DNA suspensions that can be accurately quantified prior to PCR genotyping. For each clone, perform DNA extraction from ~1 × 10^6 cells, or one confluent well of a 24-well plate. Follow the procedure described in the kit. Elute twice in 50 µL Buffer AE for a total volume of 100 µL, and measure the concentration by NanoDrop. The expected DNA yield is 100-200 ng/µL.

PCR genotyping and sequencing (estimated time: ~1-2 days)
Initial genotyping can be performed while cells are maintained in culture in a replica plate. In the event of a delay, temporary cryopreservation can be used to avoid accumulating additional passages. iPSCs grown in 96-well plates can be dissociated with Accutase and resuspended directly in 100 µL per well of Stem-CellBanker (Takara Bio, cat. No. CB043). Wrap plates in a layer of paper towel and foil, for storage at -80°C for up to 2 months. To defrost clones, place the plate on a warm surface, and gently transfer the contents of target wells to 1 mL culture media containing 10 µM ROCK inhibitor. Centrifuge and remove the supernatant. Resuspend the cell pellet and culture in a 24-well dish according to standard passage. Note that the entire plate must be defrosted to recover individual clones, and cryopreservation in vials is preferred.

PCR reaction
All PCR reactions ( (Figs. 3D, E and 4). An internal transgene-derived probe helps to verify unwanted donor plasmid integrations (Figs. 3F, 4C). Suitable transgenic probes include GFP (subfragment) and neo (full length). The puro gene fragment is unsuitable for Southern blotting, as its GC-rich sequence results in low signal-to-noise ratios. Routine screening with the 5′ internal genomic probe verifies heterozygosity of the AAVS1 alleles (Figs. 3A, E, and 4).
Digestion with SphI is suggested for the donor series presented here, although other enzymes may be more effective when considering custom selection markers or transgene cargo. Probe labeling with DIG provides a safe and simple alternative to radioactive 32P-labeled probes. Technical details and material lists for Southern blotting using DIG-labeled probes are available through the manufacturer's online resource (https://lifescience.roche.com/dig) and technical manual (https://lifescience.roche.com/wcsstore/RASCatalogAssetStore/Articles/05353149001_08.08.pdf).

Genomic DNA preparation
Perform genomic DNA preparation in 96-well format as described in Section 2.6, but resuspend DNA pellets directly in a restriction master-mix containing: restriction enzyme (SphI,
Compare the PCR products from DIG dNTP and standard dNTP reactions using gel electrophoresis. DIG-containing probes display retarded gel migration and reduced staining by intercalating agents such as ethidium bromide. Use 25-50 ng of probe per mL of hybridization buffer. Process and image the membranes as described in the Roche technical manual.

Cell stock preparation and additional quality control tests
Correctly targeted clones are expanded for permanent cryopreservation in liquid nitrogen. We recommend preparing a master stock (3-5 vials) and a working stock (>12 vials) following 1-2 passages of expansion. Quality control checks should be performed on the working stock. Routine screening of iPSC subclones should include verification of a stable karyotype (G-banding, FISH, or CNV analyses), maintenance of pluripotency (immunostaining or gene expression for markers such as OCT3/4, NANOG, TRA-1-60, TRA-1-81, or SSEA-4), and capacity for differentiation (embryoid bodies, directed differentiation, or teratomas, followed by gene expression analysis, immunostaining, or histology). Include the parental iPSC line as a control for all tests performed.

Results
Transgene expression from the AAVS1 locus satisfies various characteristics required for reliable application in human iPSCs. First, individual clones derived from AAVS1-targeting of a CAG-GFP cassette reproducibly display relative GFP intensities that correlate predictably with PCR and Southern blot genotyping data (Figs. 3B, D, E, and 5A). Interestingly, clones with aberrant transgene insertions (Fig. 3D-F) display an intermediate fluorescence level (Fig. 5A, gray bars). Fluorescence in correctly targeted 201B7 homozygotes was nearly twice as intense as in heterozygotes. This phenomenon was recapitulated with CAG-GFP targeting in a related iPSC line, 409B2 [25] (Figs. 4A and 5B) and with a CAG-mCherry reporter in 201B7 (Figs. 4B and 5C). Fusion of GFP to luciferase (GFP-luc2) and targeting in the Ff-derived male iPSC cell line 1383D6 (Fig. 4C) resulted in 10-fold lower GFP fluorescence, but showed a similar ~2-fold increase in expression in homozygotes, as confirmed by luciferase enzyme assay (Fig. 5D). Second, gene expression is uniform across all undifferentiated iPSCs in clonal populations, such that FACS analysis shows a narrow range of expression, supporting observations by fluorescence microscopy (Fig. 5E, F). Finally, GFP expression is maintained at a similar level in >98% of iPSCs for at least 35 passages, or nearly 6 months, without selective pressure (Fig. 5G). These data highlight the reproducibility, scalability (2-fold) and stability of CAG-driven transgene expression, and suggest that FACS sorting approaches could be used to isolate populations of iPSCs with similar genotypes.
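The 2-fold scaling described above suggests a simple way to pre-sort clones by reporter intensity before genotyping. The sketch below expresses each clone's mean fluorescence as a fold over a verified heterozygous reference and flags values near 1-fold, near 2-fold, or in between. The function names and the tolerance of 0.25 are assumptions made for illustration, not values from this study, and the output is a hint for prioritizing clones rather than a substitute for Southern blotting.

```python
def fold_over_reference(intensity, het_reference):
    """Reporter intensity as a fold over a heterozygous reference clone."""
    if het_reference <= 0:
        raise ValueError("reference intensity must be positive")
    return intensity / het_reference


def suggest_genotype(intensity, het_reference, tolerance=0.25):
    """Suggest a genotype class from relative reporter intensity.

    Assumes one targeted copy gives ~1-fold and two copies ~2-fold signal;
    anything else is flagged for closer inspection.
    """
    ratio = fold_over_reference(intensity, het_reference)
    if abs(ratio - 1.0) <= tolerance:
        return "heterozygous-like"
    if abs(ratio - 2.0) <= tolerance:
        return "homozygous-like"
    return "intermediate or aberrant: check by Southern blot"


if __name__ == "__main__":
    # Placeholder intensities in arbitrary units, not measured data.
    for value in (100.0, 150.0, 190.0):
        print(value, suggest_genotype(value, het_reference=100.0))
```

Intensities should be compared within one cell line and one reporter, since absolute fluorescence differs between reporters and genetic backgrounds.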
Relative CAG-GFP expression levels in homozygous and heterozygous iPSCs (Fig. 6A) are maintained upon differentiation. Intermediate stages of neural differentiation [26] were assayed for GFP expression (Fig. 6B). Embryoid bodies in suspension (d8), and outgrown neural precursors (d16) retained distinguishable GFP expression levels. PSA-NCAM+ cells were measured by FACS to reveal uniform expression throughout the population (Fig. 6B, bottom). Following maturation, TUJ1+ neurons also retained GFP expression (Fig. 6C). Cardiomyocytes induced from heterozygous iPSC clone 317-12 via ActivinA and BMP4 treatment [27] and identified by cell surface markers (SIRPA+, LIN-), were found to be 98% positive for GFP (Fig. 6D, middle). Non-cardiomyocyte populations (not SIRPA+, LIN-) also maintained GFP fluorescence, although the distribution was broad, possibly reflecting cell type heterogeneity (Fig. 6D, right). CD43+ blood cells derived from clone 317-12 also retained GFP fluorescence (data not shown). In vitro differentiation of iPSCs into neural crest and mesenchymal stromal cell (MSC) populations [28] showed a faithful retention of uniform GFP levels (Fig. 6E), although absolute fluorescence levels were cell-type specific (Fig. 6B, D, and E).
Commonly used human iPSC lines with stable reporter gene expression are a valuable resource for in vitro differentiation and in vivo transplantation assays in animal models. The nuclease and donor plasmids (Table 1) and human iPSC lines (Table 2) described herein will be available through the RIKEN Bio Resource Center DNA Bank (dna.brc.riken.jp) or Cell Bank (http://www.brc.riken.jp/lab/cell/english/), respectively, and may be accessed using the ID numbers indicated. Plasmids will also be made available through Addgene (www.addgene.org).
Gene targeting and validation data for the clones in Table 2 (such as genotyping, pluripotency marker expression, reporter expression, and karyotype) are available from the RIKEN BRC's Cell Bank datasheets.

Considerations and alternative approaches
The reporter gene expression patterns presented here are achieved using the CAG promoter, which is known for its robust expression in multiple cell types. Recent reports examined other common promoters, including SFFV, PGK, CMV7, and Ef1ɑ [29,30] for their transgene induction levels and pleiotropic effects on local chromatin and neighboring gene expression. In human iPSCs and differentiated tissues, CAG was found to be superior for transgene maintenance [29]. Inducible transgene expression using the doxycycline-inducible tetO promoter [31,32], as well as expression of reporters from tissue-specific promoters cloned by In-Fusion is also achievable [33] (Yoshida and Woltjen, unpublished results). Empirical testing is recommended for other transgene systems and differentiated cell types.
Although endogenous PPP1R12C gene expression is constitutive, it is modest. Since expression of the SA-T2A-selection marker is directly linked to the PPP1R12C transcript, drug selection can be subject to dosage effects. Selection for puro is mostly unaffected by expression level; however, with mutant neo (E > D182 [34]), G418 selection can be severely impaired. When targeting with neo^D182, even homozygous targeted cells become sensitive to G418 concentrations >50 µg/mL, a dosage that insufficiently ablates non-targeted cells (Fig. 2C). All neo donor vectors described here (Nst) use wildtype neo^E182 (Table 1); alternative donor vectors should be individually sequence-verified.
We noted that integration of the plasmid backbone [24] is the major source of false-positive background (Fig. 3F). Negative selection markers such as diphtheria toxin-A (DT-A) or herpes simplex virus thymidine kinase (HSV-TK) are typically included in conventional gene targeting vectors in order to select against random integration of linearized plasmid [5]. Presumably, higher frequencies of on-target recombination stimulated by nuclease treatment have led to the increasing trend of excluding negative markers from donor vector designs. Our own experience with negative selection markers indicates that their reliability is tightly linked to expression level, and thus we also have not included them as a donor design standard. Since PCR genotyping will not unambiguously detect backbone integration, and is sensitive to false-positive results from persisting extra-chromosomal vectors, we stress the importance of Southern blotting with an internal probe (5′ or transgene-derived) to determine the structure of the targeted locus (Figs. 3D-F and 4C).
Subsequent targeting of the non-targeted AAVS1 allele is possible, so long as indels are not introduced during the first round of nuclease treatment. Compound heterozygotes can be produced in a single step using simultaneous targeting of both AAVS1 alleles with donor vectors expressing two different selection markers (e.g. neo and puro). Although no overt phenotype has been reported for loss of PPP1R12C gene function, it has been suggested that homozygous AAV integration is rarely observed [35]. In accordance with many previously published results, we noted no differences in pluripotency or differentiation capacity between heterozygous and homozygous targeted iPSC lines (Fig. 6A-C).
Off-target cleavage followed by mutagenic NHEJ is an underlying concern in the field of nuclease-mediated gene targeting. Based on the nucleases used (Fig. 1 and Table 1), off-target sites may be predicted using target sequence similarity. Major off-target sites have been described for the T2 gRNA [36], although the frequencies of cleavage are far below that of on-target activity. In any case, there may be grounds to further improve the current CRISPR/Cas9 nuclease system through the use of Cas9-D10A dual nickases [21] or dCas9-FokI fusion proteins [37], as a precautionary measure against potential off-target cleavage. In the production of new donor plasmids for use with CRISPR/Cas9 nucleases, care should be taken to avoid inclusion or re-construction of the sgRNA-T2 recognition site.
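One practical way to check a new donor plasmid for the sgRNA-T2 recognition site is to scan its sequence on both strands. The sketch below does this with exact string matching, using the recognition sequence and PAM given in the donor vector description with the line-break hyphen removed (an assumption about the intended sequence). The function names are hypothetical, and exact matching will not find near-matches; similarity-based off-target prediction requires dedicated tools.

```python
# Recognition sequence plus PAM as written in the donor vector description,
# with the hyphen treated as a line-break artifact (assumption).
SGRNA_T2_SITE = "CCACTAGGGACAGGATTGG"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_site(sequence, site=SGRNA_T2_SITE):
    """List (position, strand) for every exact occurrence of the site.

    Positions are 0-based indices on the given (forward) strand.
    """
    seq = sequence.upper()
    hits = []
    for strand, query in (("+", site.upper()), ("-", reverse_complement(site))):
        start = seq.find(query)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(query, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence for illustration; a real check would load the donor plasmid.
    toy_donor = "AAAA" + SGRNA_T2_SITE + "TTTT"
    print(find_site(toy_donor))
```

Because plasmids are circular, a site spanning the start and end of a linear sequence file would be missed; appending the first few bases of the sequence to its end before scanning avoids this.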
Mosaicism, or colony heterogeneity, is a concern for any clonal isolation protocol. Although the plating and selection conditions described here are designed to generate colonies from single-cell events, nuclease expression can persist after the first cell division, resulting in subtle genetic differences between daughters [38]. Despite their rarity, if ambiguous results are observed during PCR, sequencing, or Southern blot genotyping (usually manifesting as weak secondary bands or overlapping sequence traces), subcloning from the population is recommended.
Finally, other cultured human cell lines including embryonic stem cells, human dermal fibroblasts, HEK293T, or HeLa may be targeted at the AAVS1 locus using the materials described herein. Note that adaptation of growth conditions, transfection methods, DNA amounts, and selection strategies will require suitable customization.

Conclusions
Over the last two decades, the AAVS1 locus has been a reliable workhorse in human cell line transgenesis. Like the mouse ROSA26 locus [39], whose orthologous human locus is unfortunately ineffective [40], the AAVS1 locus provides a safe haven for reliable transgene expression with no overt phenotype. This is in contrast to the X-linked HPRT1 locus, which, although mostly permissive for constitutive expression [41], is associated with a loss-of-function disease, Lesch-Nyhan syndrome, in human males. Pseudogene loci such as the human L-gulono-γ-lactone oxidase (GULOP) locus [42] may avoid phenotypic effects; however, transgene expression in pluripotent and differentiated lineages is less well described. Another recently reported locus, citrate lyase beta-like (CLYBL), lies in a gene-deficient region of human chromosome 13 and is reported to confer 10-times greater transgene expression than AAVS1 in human iPSCs, with less severe effects on local gene expression [43]. Learning from loci with properties similar to AAVS1, or high-throughput transgene expression screens [44], it may be possible to ultimately predict an epigenetic signature associated with a permissive locus. TALEN and CRISPR/Cas9 nucleases have overcome many efficiency and specificity barriers, making them an attractive option to achieve robust targeted transgenesis. Related materials, protocols, and reference cell lines are now easily available through public bio-repositories. As a result, gene targeting in general has become an accessible technology to most labs with minimal molecular biology experience. Still, even nuclease-based methods work best when adhering to basic principles and practices established over the last three decades of conventional gene targeting. With these lessons in mind, transgenics, nucleases, and reprogramming technologies will make for a powerful combination, driving a new era of regenerative medicine research.