Robust CRISPR/Cas9 Genome Editing of the HUDEP-2 Erythroid Precursor Line Using Plasmids and Single-Stranded Oligonucleotide Donors

The study of cellular processes and gene regulation in terminal erythroid development has been greatly facilitated by the generation of an immortalised erythroid cell line derived from Human Umbilical Derived Erythroid Precursors, termed HUDEP-2 cells. The ability to efficiently genome edit HUDEP-2 cells and make clonal lines hugely expands their utility, as the insertion of clinically relevant mutations allows study of potentially every genetic disease affecting red blood cell development. Additionally, insertion of sequences encoding short protein tags such as Strep, FLAG and Myc permits study of protein behaviour in the normal and disease state. This approach is useful to augment the analysis of patient cells, as large cell numbers are obtainable, with the additional benefit that the need for specific antibodies may be circumvented. This approach is likely to lead to insights into disease mechanisms and provide reagents to allow drug discovery. HUDEP-2 cells provide a favourable alternative to the existing immortalised erythroleukemia lines as their karyotype is much less abnormal. These cells also provide sufficient material for a broad range of analyses, as it is possible to generate in vitro-differentiated erythroblasts in numbers 4–7 fold higher than starting cell numbers within 9–12 days of culture. Here we describe an efficient, robust and reproducible plasmid-based methodology to introduce short (<20 bp) DNA sequences into the genome of HUDEP-2 cells using the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system combined with single-stranded oligodeoxynucleotide (ssODN) donors. This protocol produces genetically modified lines in ~30 days and could also be used to generate knock-out and knock-in mutations.


Introduction
Studying disease pathogenesis affecting terminal erythroid differentiation requires model cellular systems capable of generating sufficient material to allow genomic, transcriptomic, proteomic and cell biology approaches. Analysis of erythroblasts from circulating progenitors derived from the peripheral blood of healthy individuals and patients can provide invaluable insights into normal and diseased erythropoiesis and is often considered to be the gold standard. However, limited patient availability, a lack of specific antibodies and polygenic background effects that obscure core phenotypic abnormalities are all limitations of this approach, and analysis of genetically modified cell lines may help to circumvent these issues. Additionally, the use of patient cells often provides insufficient material for many approaches.
Methods and Protoc. 2018, 1, 28; doi:10.3390/mps1030028; www.mdpi.com/journal/mps
To deal with these problems, a number of cell lines have been used to study terminal erythroid differentiation in the laboratory. These include the human erythroleukaemia cell lines termed K562 and HEL [1,2] and the mouse equivalent, termed MEL [3]. Study of these cell lines in conjunction with small interfering RNA (siRNA)-mediated knockdown of specific genes has provided insight into normal and aberrant terminal erythroid differentiation; however, these lines are extremely aneuploid and harbour chromosome rearrangements that hinder genomic manipulation and may adversely affect terminal differentiation [4]. Recently it has become possible to generate and genetically manipulate induced pluripotent stem cells (iPSCs) from healthy individuals and patients; however, maintenance of these cells is time-consuming and costly and, when transdifferentiated into erythroblasts, iPSCs express mainly embryonic and fetal globins [5].
Study of erythroid differentiation using primary cells obtained from mice continues to be extremely informative; however, there is considerable effort and cost associated with generating and breeding mouse lines harbouring specific mutations and expressing tagged proteins, and it is imperative to replace animal models with alternative systems where possible [6]. Additionally, although animal models of human disease can be very informative, it is not uncommon to find that human diseases are not accurately recreated in mouse models. One example of this is the Sec23b mutant mouse, which does not recapitulate a type of anaemia termed congenital dyserythropoietic anaemia type II (CDA-II), known to be caused by biallelic mutation of SEC23B in humans [7].
The recently published immortalised line derived from Human Umbilical Derived Erythroid Precursors and termed HUDEP-2 cells [8] offers an alternative system in which to study erythroid differentiation. HUDEP-2 cells express predominantly adult globins and are capable of robust in vitro erythroid differentiation and therefore are an excellent resource for modelling human terminal erythropoiesis. Analysis of wild-type and genetically modified HUDEP-2 cells offers insight into a range of erythroid disorders including the haemoglobinopathies and congenital forms of anaemia. In the case of α- and β-thalassemia and sickle-cell anaemia, genome editing strategies and their effects may be tested in HUDEP-2 cells, and valuable information about the genomic changes associated with these diseases, such as chromatin organisation and effects on gene expression, may be gained (for a striking example see Wienert et al. [9]). In the case of the congenital anaemias, edited cells may be used to offer insight into the pathogenesis of these disorders and as a reagent to screen for potential novel therapies. In this work we use the rare anaemia termed congenital dyserythropoietic anaemia type I (CDA-I), which is caused by loss of function mutations in either of the genes CDAN1 or C15ORF41 [10,11], as an example with which to demonstrate the efficacy and utility of genome editing. To date, study of the pathogenesis of this disease has been hindered by variable phenotypic abnormalities, difficulty in obtaining patient samples and a lack of specific antibodies.
There are some advantages to using ribonucleoprotein (RNP) delivery of Cas9 for inducing DNA breaks in terms of cutting efficiency and reduced off-target effects, as the Cas9 protein is cleared from the cell more quickly than when a plasmid-based expression system is employed [12]. If the user prefers to use RNP-based editing, we would recommend using the protocol described in [13] combined with the sorting and clonal expansion protocol described here. It should be noted, however, that there may be inter-batch and inter-supplier variability in Cas9 protein efficacy. Additionally, a current limitation of the RNP approach is a lack of flexibility to utilise the increasing array of Cas9-based approaches such as editing of the epigenome and base editing (for examples see Wang et al. [14]). Therefore, here we report a robust and efficient protocol for generating cellular model systems that delivers Cas9 via plasmids and the template for homology directed repair (HDR) of cleaved DNA via single-stranded oligodeoxynucleotide (ssODN) donors (see Figure 1 for an overview). This protocol is flexible as it allows novel Cas9 variants of choice to be used, as long as the relevant Cas9 expressing plasmid can be generated or obtained. The protocol reported here allows generation of stable clones of genetically modified HUDEP-2 cells in ~30 days, analysis of which will greatly accelerate research into cellular processes, gene regulation and drug discovery in normal and abnormal erythroid differentiation.

Experimental Design
The desired targeting site should be identified and small guide RNAs (sgRNAs) designed. Care should be taken to select a region in which the copy number is normal; otherwise this should be accounted for during the analysis and repeated rounds of targeting may be required. If homology directed repair with a donor template is required, the Cas9 cut site should be within 20 bp of the required integration site. One limitation of the approach detailed here is that the Streptococcus pyogenes Cas9 protein used in this protocol requires an NGG or an NAG motif at the 3′ end of its recognition sequence in order to cut. If there is no NGG or NAG within 20 bp of the required targeting site, then Cas9 proteins from alternative bacterial species (such as Neisseria meningitidis or Staphylococcus aureus) should be considered [14]. This protocol uses a single plasmid that contains sequence encoding the RNA polymerase III U6 promoter driving expression of the sgRNA and a chicken β-actin hybrid (Cbh) promoter driving expression of Cas9 and enhanced green fluorescent protein (EGFP) separated by a T2A sequence.
The efficiency of sgRNA cutting should be tested in human embryonic kidney (HEK) 293T cells or any other easily transfectable cell line available to the user and the plasmid inducing the most efficient cutting of the target locus selected (see Section 4 Expected Results for further details). HUDEP-2 cells may also be used for this step if the user wishes to establish the cell-line specific cutting efficiency of each sgRNA. Once the exact cut site has been validated a ssODN should be designed with the homology arms centered on the cut site. One limitation of using ssODNs is that relatively short sequences may be introduced. This may be overcome by generating a donor plasmid, or an adeno-associated virus (AAV) vector, which may include a selectable marker to improve efficiency (see Bak et al. [13]). The validated Cas9-T2A-EGFP plasmid and ssODN should be nucleofected into the HUDEP-2 cells and after 48 h EGFP positive cells are single-cell sorted into microplates. Clones should be expanded and screened and positive clones taken forward for analyses. It should be noted that HUDEP-2 cells express variable levels of Kusabira Orange fluorescent protein and as such Cas9 plasmids using mCherry or other red-fluorescent protein should be avoided. If the gene targeted is essential for HUDEP-2 cell survival there may be no positive clones recovered and the experimental design should be reconsidered.

Targeting Design and Human Umbilical Derived Erythroid Precursor 2 Cytogenetics
An important step in considering the design of the experiment is to check the copy number of the intended target region. As there may be a degree of clonal variation in the karyotype of this cell line it is advisable to ascertain the chromosome number prior to starting experiments on a particular line.
Chromosome counts to characterise the karyotype of unmodified HUDEP-2 cells reveal a modal chromosome number of 51, XY (6/10 metaphases) with a range of 49–53 chromosomes. Supernumerary chromosomes comprised a larger marker chromosome and four smaller chromosomes. Array comparative genomic hybridization (aCGH), using an Illumina InfiniumOmniExpress-24v1-2_A1 Beadchip (see Appendix A for details), reveals trisomies of chromosomes 6, 17 (partial, ~30–80 Mb), 18 (partial, ~18–78 Mb), 19 and 21 (partial, ~14–48 Mb). There is also variable gain of material along the length of chromosome 8, suggesting that the large marker may consist of rearranged chromosome 8 material. There are also smaller regions of loss of heterozygosity (Table 1).
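The copy-number check described above can be sketched as a simple lookup. The snippet below is a minimal illustration only: the function name `expected_copy_number` is hypothetical, the coordinates are the approximate ranges quoted in the text, and the value returned for sex chromosomes is an inference from the reported XY karyotype. It does not replace karyotyping of the user's own HUDEP-2 stock or consultation of Table 1.

```python
# Gains reported in the text for unmodified HUDEP-2 cells (approximate, in Mb).
# None means the whole chromosome is reported as trisomic.
REPORTED_TRISOMIES = {
    "6": None,
    "17": (30, 80),
    "18": (18, 78),
    "19": None,
    "21": (14, 48),
}


def expected_copy_number(chromosome, position_mb):
    """Return the copy number suggested by the reported gains.

    Returns None for chromosome 8, where the gain is described as variable.
    """
    chrom = chromosome[3:] if chromosome.startswith("chr") else chromosome
    if chrom == "8":
        return None  # variable gain along chromosome 8: check experimentally
    if chrom in ("X", "Y"):
        return 1  # assumption: karyotype is reported as XY
    if chrom in REPORTED_TRISOMIES:
        region = REPORTED_TRISOMIES[chrom]
        if region is None or region[0] <= position_mb <= region[1]:
            return 3
    return 2


# Hypothetical target loci, for illustration only
print(expected_copy_number("chr17", 45.0))
print(expected_copy_number("chr15", 40.0))
```

A result of 3 or None indicates a region where additional alleles may need to be accounted for during analysis, or where repeated rounds of targeting may be required.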
Once the locations of each desired modification have been identified, suitable sequences for sgRNAs should be identified. In this protocol we recommend use of short guides of 18 nucleotides to minimise the homology available for off-target binding (see Discussion). To design each guide, select 18 base pairs immediately 5′ of the closest located protospacer adjacent motif (PAM) site. This is NGG for S. pyogenes Cas9, although NAG may also be used with reduced efficiency (see reference [14]). In our experience integration of a donor is most successful when Cas9 cut sites are within 20 bp of the desired integration site. Note that S. pyogenes Cas9 cuts between the 3rd and 4th bases 5′ of the PAM. To allow redundancy and subsequent selection of the most efficient site, select the two PAMs closest to the desired cut site and test cutting efficiency at both sites by Surveyor assay. We recommend sgRNA sequences be selected to minimise the possibility of off-target hybridisation by checking for other regions of homology using the BLAST-like alignment tool (BLAT) [15]; where alternatives are available, the sequence with the least homology to off-target sites should be selected.
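The guide selection rules above (18 nt immediately 5′ of an NGG PAM, cut between the 3rd and 4th bases 5′ of the PAM, cut site within 20 bp of the desired integration site) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, positions are 0-based indices on the supplied strand, only NGG PAMs are searched (NAG is omitted), and no off-target check is performed, so BLAT screening is still required.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_guides(seq, target, guide_len=18, max_dist=20):
    """List candidate guides whose cut site lies within max_dist of target.

    'cut' is the index (on the supplied strand) of the first base to the
    right of the predicted double-strand break.
    """
    seq = seq.upper()
    hits = []
    # Top strand: PAM (NGG) starts at i; the guide is the guide_len bases 5' of it.
    for m in re.finditer(r"(?=([ACGT]GG))", seq):
        i = m.start()
        if i >= guide_len:
            hits.append({"strand": "+", "guide": seq[i - guide_len:i],
                         "pam": m.group(1), "cut": i - 3})
    # Bottom strand: its NGG PAM reads as CCN on the top strand, starting at j.
    for m in re.finditer(r"(?=(CC[ACGT]))", seq):
        j = m.start()
        end = j + 3 + guide_len
        if end <= len(seq):
            hits.append({"strand": "-", "guide": revcomp(seq[j + 3:end]),
                         "pam": revcomp(m.group(1)), "cut": j + 6})
    near = [dict(h, distance=abs(h["cut"] - target)) for h in hits
            if abs(h["cut"] - target) <= max_dist]
    return sorted(near, key=lambda h: h["distance"])


# Toy sequence for illustration only (not a real locus)
toy = "A" * 30 + "GATTACAGATTACAGATT" + "TGG" + "A" * 30
for hit in find_guides(toy, target=45):
    print(hit)
```

The first two entries of the sorted list correspond to the two PAMs closest to the desired site, which the protocol recommends testing side by side by Surveyor assay.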

Cloning Oligonucleotides for Small Guide RNAs. Time for Completion: 3 Days, Approx. 1 h per Day
Oligonucleotides for use as sgRNAs should be obtained as 25 nmol scale desalted single-stranded oligonucleotides. Design sgRNA oligonucleotides to include 4-nucleotide 5′ overhangs (forward oligo 5′-CACC-3′; reverse oligo 5′-AAAC-3′) compatible with the BbsI restriction enzyme site used to clone them into the pX458 plasmid [16] (Figure 2). Additionally, because the human U6 RNA polymerase III promoter in the pX458 plasmid preferentially transcribes sequences beginning with a guanine [17], sgRNA sequences that do not naturally begin with a 5′ "G" should have this nucleotide added (Figure 2).
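The oligonucleotide design rule above can be expressed as a short helper. This is a sketch only: the function name `design_sgrna_oligos` is hypothetical, and it assumes the guide is supplied 5′ to 3′ without the PAM. It adds the CACC and AAAC overhangs and prepends a "G" (with the matching "C" at the 3′ end of the reverse oligo) when the guide does not already begin with a guanine.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_sgrna_oligos(guide):
    """Return (forward, reverse) oligos, both written 5' to 3'."""
    guide = guide.upper()
    # Add a 5' G for U6-driven transcription if the guide lacks one
    insert = guide if guide.startswith("G") else "G" + guide
    forward = "CACC" + insert           # 5' overhang for BbsI-digested pX458
    reverse = "AAAC" + revcomp(insert)  # ends in C when a G was prepended
    return forward, reverse


# Toy 18 nt guide for illustration only
fwd, rev = design_sgrna_oligos("ATTACAGATTACAGATTA")
print(fwd)
print(rev)
```

When the two oligos are annealed, the CACC and AAAC sequences remain single-stranded as the 4-nucleotide 5′ overhangs.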

Figure 2. (A) Heteroduplexed small guide RNA with overhangs for ligation with BbsI digested pX458; (B) BbsI digested pX458 creates overhangs that are homologous to those on heteroduplexed sgRNAs; (C) oligo duplex design for a guide targeting the sense strand (forward oligo 5′-CACC-G-18 nt-3′; reverse oligo 5′-AAAC-18 nt-C-3′), the G/C shown in red is optional and may be added where the 5′ end of the sequence does not terminate in a guanine.
Clone oligonucleotides encoding sgRNAs into pX458 by heteroduplexing followed by the one-step protocol described by Cost [18]. Briefly, heteroduplex oligonucleotides by combining the reagents listed below in a single well of a microtiter plate and heating at 95 °C for 5 min before cooling to 12 °C at a rate of 0.1 °C per second using a thermocycler. Perform enzymatic ligation assisted by nucleases (ELAN) reactions by combining the following reagents in a single well of a microtiter plate:
Incubate ELAN reactions at 37 °C for 1 h before transforming into DH10B competent E. coli bacteria according to the manufacturer's instructions. Clonally select and screen using the LKO1.5 forward primer in combination with each specific sgRNA reverse oligonucleotide. Amplification conditions: 0.5 μM each oligonucleotide, 1 μL of bacterial culture (template), 200 μM dNTPs, 1× polymerase specific amplification buffer and one unit thermostable DNA polymerase combined in a total of 20 μL. Amplification reactions should be incubated at 95 °C for 3 min and then cycled 35 times at 95 °C for 30 s, 58 °C for 30 s and 72 °C for 30 s before a final incubation at 72 °C for 10 min. Correctly recombined clones should produce a discrete amplification product of ~100 bp, visible by agarose gel electrophoresis. Grow correctly recombined clones overnight in 50 mL LB broth supplemented with 0.1 mg/mL ampicillin and isolate plasmid DNA using the Plasmid Plus Midi kit (Qiagen) in accordance with the manufacturer's directions.
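For planning bench time, the programmed durations of the heteroduplexing ramp and the colony screening PCR can be estimated from the parameters given above. The sketch below uses hypothetical function names and counts programmed hold times only; it ignores the instrument's own ramping between PCR steps, so real run times will be somewhat longer.

```python
def anneal_ramp_seconds(start_c=95.0, end_c=12.0, rate_c_per_s=0.1):
    """Time to cool from start_c to end_c at a fixed ramp rate."""
    return (start_c - end_c) / rate_c_per_s


def pcr_hold_seconds(initial=180, cycles=35, steps=(30, 30, 30), final=600):
    """Sum of programmed hold times for the screening PCR."""
    return initial + cycles * sum(steps) + final


ramp = anneal_ramp_seconds()
pcr = pcr_hold_seconds()
print(f"Cooling ramp: about {ramp / 60:.0f} min (after the 5 min hold at 95 °C)")
print(f"Screening PCR holds: about {pcr / 60:.1f} min")
```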
CRITICAL STEP It is important to elute plasmids in a small volume (50 μL) to achieve concentrations of 4-6 μg/μL.

Surveyor Assay. Time for Completion: 1 Day
To assess the efficiency with which each Cas9 plasmid cuts its target site, transfect a 35 mm well of HEK 293T cells (or other available cell line) with 2 μg of plasmid using jetPRIME reagent (or other transfection reagent of choice) in accordance with the manufacturer's instructions. Grow cells for 48 h post transfection and prepare genomic DNA: lyse cells for 2–4 h at 37 °C in a buffer containing 50 mM Tris pH 8.0, 1 mM ethylenediaminetetraacetic acid (EDTA), 0.5% Tween 20 and 60 μg/mL proteinase K, then heat inactivate proteinase K at 95 °C for 10 min. Design oligonucleotides to generate amplification products of 500–800 bp around the target cut site and treat amplified DNA with Surveyor nuclease in accordance with the manufacturer's directions. Analyse digestion products by agarose gel electrophoresis and use Cas9 plasmids that generate samples with the highest ratio of digested to undigested product to target HUDEP-2 cells. Example Surveyor assays are shown in Section 4 Expected Results (Figure 3).
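Where band intensities have been quantified by densitometry, the comparison of digested to undigested product can be made numerical. The sketch below is not part of this protocol: it applies a commonly used estimate for mismatch-cleavage assays, in which the cleaved fraction is f = (b + c)/(a + b + c) for an undigested band a and digestion products b and c, and the indel percentage is 100 × (1 − √(1 − f)). The function name and the example intensities are hypothetical.

```python
import math


def indel_percent(uncut, cut1, cut2):
    """Estimate indel frequency (%) from Surveyor band intensities."""
    total = uncut + cut1 + cut2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    f_cut = (cut1 + cut2) / total
    # The square root accounts for random re-annealing of mutant and
    # wild-type strands into nuclease-sensitive heteroduplexes.
    return 100 * (1 - math.sqrt(1 - f_cut))


# Hypothetical densitometry values for two candidate plasmids
for name, bands in {"sgRNA-1": (75, 15, 10), "sgRNA-2": (90, 6, 4)}.items():
    print(name, round(indel_percent(*bands), 1))
```

The plasmid giving the higher estimate would be the one taken forward to target HUDEP-2 cells.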

Design of Oligonucleotide Donor. Time for Completion: 2 h
To introduce short DNA sequences into endogenous loci, ssODNs may be used. Donors may be obtained from IDT (Leuven, Belgium) as Ultramers® 199 bp in length. The donors used in our example (see Section 4 Expected Outcomes) included a 24 bp sequence encoding a Strep-tag [19] or FLAG-tag [20] and a four amino acid flexible linker (Gly-Gly-Ser-Gly) for integration into the CDAN1 or C15ORF41 locus, respectively. To prevent repeated cutting at the same locus, ssODNs included a silent mutation, which alters either the first or second "G" of the "NGG" PAM site required for Cas9 binding. Changing the PAM to NAG should be avoided where possible as S. pyogenes Cas9 may still be able to cut, albeit with reduced efficiency. We found that homology arms equidistant from the cut site efficiently facilitated homology driven repair of the locus following the double strand break generated by the Cas9 protein. Additionally, we recommend the small molecule agonist of RAD51, RS-1, be used to promote repair of the double stranded break with homology directed repair [21]. Expected outcomes may be seen in Figure 4.
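The donor layout described above can be sketched as follows. This is a minimal illustration with several assumptions: the function name `build_ssodn` is hypothetical, the insert is placed exactly at the cut site, the 199 nt total length is split as evenly as possible between the two homology arms, and the silent PAM mutation is assumed to have been introduced into the locus string beforehand. The placeholder insert stands in for the tag and linker coding sequence, which the user must supply.

```python
def build_ssodn(locus, cut, insert, total_len=199):
    """Assemble a donor: left arm + insert + right arm, centred on the cut.

    locus: sequence around the target, with the PAM mutation already applied
    cut:   index of the first base to the right of the Cas9 cut site
    """
    arm_total = total_len - len(insert)
    if arm_total <= 0:
        raise ValueError("insert is too long for the requested donor length")
    left_len = arm_total // 2
    right_len = arm_total - left_len
    if cut - left_len < 0 or cut + right_len > len(locus):
        raise ValueError("locus sequence too short for the homology arms")
    return locus[cut - left_len:cut] + insert + locus[cut:cut + right_len]


# Toy locus and placeholder 36 nt insert, for illustration only
toy_locus = "ACGT" * 100
donor = build_ssodn(toy_locus, cut=200, insert="N" * 36)
print(len(donor))
```

If the desired integration site is offset from the cut site, the arms would need to be adjusted accordingly; that case is not handled here.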
If required, heterozygous genotypes can be reliably generated with two rounds of editing. The first targeting experiment uses an ssODN to introduce a silent mutation in the PAM site. Due to the incomplete efficiency of biallelic targeting, numerous clones will have successful integration of the ssODN on one allele and an indel resulting from non-homologous end joining on the other allele. In the second experiment, the allele containing the indel is targeted specifically with a newly designed sgRNA. A ssODN which repairs the indel, disrupts the remaining PAM and introduces the desired mutation is transfected with the new sgRNA.

Transfection of Human Umbilical Derived Erythroid Precursor 2 Cells. Time for Completion: 3 Days
Cells should be expanded for 5–7 days and be in an exponential growth phase prior to transfection. Change media to adjust density to 0.5 × 10⁶/mL 24 h prior to transfection and maintain wild-type untargeted cells in expansion phase for use as a negative control during the selection of GFP-positive clones.
CRITICAL STEP The ratio of DNA to transfection reagents must be maintained as 1:10.
CRITICAL STEP It is important to elute plasmids in a small volume (50 μL) to achieve concentrations of 4-6 μg/μL.

Surveyor Assay. Time for Completion: 1 Day
To assess the efficiency with which each Cas9 plasmid cuts its target site, transfect a 35 mm well of HEK 293T cells (or other available cell line) with 2 μg of plasmid using jetPRIME reagent (or other transfection reagent of choice) in accordance with manufacturer's instructions. Grow cells for 48 h post transfection and prepare genomic DNA (lyse cells for 2-4 h at 37 °C in a buffer containing 50 mM Tris pH 8.0, 1 mM ethylenediaminetetraacetic acid (EDTA), 0.5% Tween 20 and 60 μg/mL proteinase K. Heat inactivate proteinase K at 95 °C for 10 mins). Design oligonucleotides to generate amplification products of 500-800 bp around the target cut site and treat amplified DNA with Surveyor nuclease in accordance with the manufacturer's directions. Analyse digestion products by agarose gel electrophoresis and use Cas9 plasmids that generate samples with the highest ratio of digested to undigested product to target HUDEP-2 cells. Example Surveyor assays are shown in Section 4 Expected Results (Figure 3).

Design of Oligonucleotide Donor. Time for Completion: 2 h
To introduce short DNA sequences into endogenous loci, ssODNs may be used. Donors may be obtained from IDT (Leuven, Belgium) as Ultramers® of 199 bp in length. The donors used in our example (see Section 4 Expected Results) included a 24 bp sequence encoding a Strep-tag [19] or FLAG-tag [20] and a four amino acid flexible linker, Gly-Gly-Ser-Gly, for integration into the CDAN1 or C15ORF41 locus, respectively. To prevent repeated cutting at the same locus, ssODNs included a silent mutation that alters either the first or second "G" of the "NGG" PAM site required for Cas9 binding. Changing the PAM to NAG should be avoided where possible, as S. pyogenes Cas9 may still be able to cut, albeit with reduced efficiency. We found that homology arms equidistant from the cut site efficiently facilitated homology-directed repair of the locus following the double-strand break generated by the Cas9 protein. Additionally, we recommend that the small molecule agonist of RAD51, RS-1, be used to promote repair of the double-strand break by homology-directed repair [21]. Expected outcomes may be seen in Figure 4.
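The donor layout described above can be sketched in a few lines. This is a minimal illustration, not the design tool used in this work: it interprets "equidistant" as two homology arms of near-equal length flanking the insert, assumes the insert is placed exactly at the cut site, and does not itself introduce the silent PAM mutation. The function names, the toy locus and the placeholder insert are all hypothetical.

```python
def build_ssodn(locus, cut_index, insert, total_len=199):
    """Assemble left arm + insert + right arm centred on the Cas9 cut site."""
    arm_total = total_len - len(insert)
    if arm_total <= 0:
        raise ValueError("insert is too long for the requested donor length")
    left_len = arm_total // 2
    right_len = arm_total - left_len   # absorbs the odd base, if any
    if cut_index - left_len < 0 or cut_index + right_len > len(locus):
        raise ValueError("locus sequence does not cover both homology arms")
    left_arm = locus[cut_index - left_len:cut_index]
    right_arm = locus[cut_index:cut_index + right_len]
    return left_arm + insert + right_arm


def pam_still_cuttable(pam):
    """True for NGG, or for NAG, which S. pyogenes Cas9 may still cut weakly."""
    pam = pam.upper()
    return len(pam) == 3 and pam[1] in "GA" and pam[2] == "G"


# Toy example: a 400 bp placeholder locus and a 36 bp placeholder insert
# (standing in for a 24 bp tag plus a 12 bp linker).
toy_locus = "ACGT" * 100
toy_insert = "G" * 36
donor = build_ssodn(toy_locus, cut_index=200, insert=toy_insert)
print(len(donor))

# Check candidate replacement PAMs before committing to a silent mutation.
for candidate in ("TGG", "TAG", "TGA", "TCG"):
    print(candidate, pam_still_cuttable(candidate))
```

In practice the chosen PAM change must also be checked against the reading frame so that it remains silent, which this sketch does not attempt.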
If required, heterozygous genotypes can be reliably generated with two rounds of editing. The first targeting experiment uses an ssODN to introduce a silent mutation in the PAM site. Due to the incomplete efficiency of biallelic targeting, numerous clones will have successful integration of the ssODN on one allele and an indel resulting from non-homologous end joining on the other allele. In the second experiment, the allele containing the indel is targeted specifically with a newly designed
CRITICAL STEP Do not exceed 110 µL total volume in the cuvette.

3. Prepare the appropriate volume of media to resuspend cells at a density of 1 × 10⁶ per mL with 2 µg/mL DOX.
4. Aliquot 3.25 × 10⁶ cells into a tube and centrifuge at 270× g for 5 min.
5. Aspirate supernatant, resuspend the cells by flicking the tube, then add >10 mL of phosphate buffered saline (PBS) and centrifuge at 270× g for 5 min.
6. Aspirate PBS, removing as much liquid as possible. This is important in ensuring an efficient transfection.
7. Resuspend cell pellets in buffer/supplement/plasmid/donor solution and transfer (<110 µL) into a Nucleofector Amaxa 2B cuvette by gently dispensing the cells down the side of the vessel between the metallic plates, without introducing any bubbles.
8. Nucleofect the cells by placing the cuvette in the Amaxa 2B Nucleofector using protocol U-08.
9. Immediately following nucleofection, return the cuvette to the tissue culture hood and add growth media (see Section 2.2 for description) containing RS-1 (if required) to the cuvette. Aim to dilute the buffer a minimum of 5:1 media:buffer, but ideally 10:1, within the first minute post nucleofection. Keep 1 mL of the resuspension volume to wash the cuvette after transferring the cells into a culture vessel.
10. Using the Pasteur pipette provided in the nucleofection kit, remove cells from the cuvette and gently triturate the solution to evenly distribute them.
11. Rinse the cuvette with the remaining 1 mL of media and add this to the culture vessel.
12. Resuspend the cells to a density of 1 × 10⁶ cells/mL in the prewarmed media with 2 µg/mL DOX.
OPTIONAL STEP If you are attempting to integrate a donor at the cut site, it is possible to use the small molecule RS-1 to promote homology-directed repair (0.75 µM final concentration in the cell resuspension media) [21,22]. RS-1 is a small molecule activator of RAD51, which is thought to be involved in finding a homologous repair template and facilitating strand exchange [23].
15. Using fluorescence activated cell sorting (FACS), sort single cells into each well of a 60-well microtiter plate (Terasaki plate). Each well should be prefilled with 20 µL media containing 2 µg/mL DOX and maintained in an incubator until the cells have been prepared for sorting.
16. Prepare both the transfected cells and 1 × 10⁵ untransfected cells by centrifugation at 270× g for 5 min and washing once with PBS. After recentrifugation and discarding the PBS, resuspend the cell pellet in complete media with 2 µg/mL DOX and 1 µg/mL Hoechst 33258 to a density of 1 × 10⁶ to 5 × 10⁶ per mL, as appropriate for the FACS machine and preferred flow rate. Hoechst is used to differentiate live/dead cells during sorting.
17. Set the gate for GFP-positive cells based on the untransfected population and sort single GFP-positive cells into each well of 10 Terasaki plates.
18. Once cells have been sorted, transfer the Terasaki plates to a humidity box (any container that allows gas exchange but retains moisture) and place in an incubator.
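The volumes implied by the nucleofection steps above follow from simple proportional arithmetic, sketched here. The cell number, target density, DOX and RS-1 final concentrations and the 5:1 and 10:1 dilution targets come from the text; the stock concentrations (1 mg/mL DOX, 1 mM RS-1), the 100 µL cuvette volume and the helper names are assumptions for illustration.

```python
def resuspension_volume_ml(n_cells, density_per_ml=1e6):
    """Media volume needed to bring n_cells to the target density."""
    return n_cells / density_per_ml


def stock_volume_ul(final_conc, stock_conc, final_volume_ml):
    """Stock volume (uL) to add; final_conc and stock_conc share one unit."""
    return final_conc / stock_conc * final_volume_ml * 1000.0


def dilution_media_ul(buffer_ul, fold):
    """Media volume (uL) giving a fold:1 media:buffer dilution."""
    return buffer_ul * fold


media_ml = resuspension_volume_ml(3.25e6)            # cells from step 4
dox_ul = stock_volume_ul(2.0, 1000.0, media_ml)      # ug/mL; ASSUMED 1 mg/mL stock
rs1_ul = stock_volume_ul(0.75, 1000.0, media_ml)     # uM; ASSUMED 1 mM stock
min_media_ul = dilution_media_ul(100.0, 5)           # ASSUMED 100 uL in cuvette
ideal_media_ul = dilution_media_ul(100.0, 10)

print(f"media: {media_ml:.2f} mL, DOX: {dox_ul:.2f} uL, RS-1: {rs1_ul:.2f} uL")
print(f"add {min_media_ul:.0f}-{ideal_media_ul:.0f} uL media to the cuvette")
```

With a 100 µL cuvette volume, the 5:1 minimum corresponds to 0.5 mL of media and the 10:1 target to 1 mL added within the first minute.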

Clone Expansion. Time for Completion: 14 Days
3.6.1. Days 5-9
Incubate cells undisturbed for five days but ensure that the media is not evaporating from the plates. If wells appear to have reduced volume on Day 7, carefully add an additional 5-10 µL of media to each well without disturbing the cells.
28. DNA can be extracted from the cell pellets (as described in Section 3.2) in microcentrifuge tubes and screened by polymerase chain reaction (PCR) amplification and Sanger sequencing coupled with TIDER software [24], or by restriction digest, to determine correct integration of the required genomic modification. Indels may be analyzed by combining Sanger sequencing with TIDE software [25], or amplification products may be cloned and sequenced. Next generation sequencing (NGS) is extremely useful in the analysis of HDR and indels, and a novel method describing high-throughput multiplexed screening of modified clones is reported in Nussbaum et al. [26].
29. Recover the correctly targeted clones in 2× 200 µL 96 wells and check them 24 h post-thaw.

Expected Results
To circumvent the lack of specific antibodies and allow detection of Codanin-1 and C15ORF41 proteins, we inserted DNA sequences encoding a Strep-tag [19] and FLAG-tag [20] at the 3′ end of CDAN1 and the 5′ end of C15ORF41, and we show the outcome of this as expected results here. Surveyor assays for CDAN1 sgRNAs 1 and 2 (Figure 3) show that both sgRNAs are capable of successfully inducing Cas9 activity at the correct targeting site; however, sgRNA 1 was selected because it produced a cut closest to the desired region of insertion, immediately preceding the termination codon. Similarly, for C15ORF41 both sgRNA 2 and sgRNA 4 allowed good cutting efficiency (Figure 3), but because sgRNA 4 generated the most efficient cutting it was used for targeting.
To date we have used the protocol described to generate four lines of HUDEP-2 cells harbouring inserted sequences (Table 2 and Figure 4). These figures are given as an example and it should be noted that efficiencies are likely to vary in a locus-dependent fashion. From 600 GFP-positive cells sorted for each experiment, ~30 survive the initial expansion phase and of those ~20-42% contain at least one successfully targeted allele.
The majority of the remaining alleles are unedited and the rest contain indels mediated by nonhomologous end joining (NHEJ). Table 2 shows the outcomes of the edited alleles. It should be noted that, by performing long-range amplification of target loci, we have routinely found deletions of several hundred base pairs associated with CRISPR/Cas9 cut sites. Therefore, care should be taken when designing assays to characterise target sites. As we observe some non-genetic clonal variability, we would recommend making 2 or 3 clones for each line required. Although the karyotype of the wild-type HUDEP-2 cells reported here is conserved in our modified clones, there may be some variation, as the chromosome counts in the clones of HUDEP-2 cells in our hands differ slightly from those reported by Vinjamur and Bauer (2018) [27].
This protocol provides detailed step-by-step instructions for achieving cutting of genomic DNA and HDR-mediated integration of short donor sequences in HUDEP-2 cells using CRISPR/Cas9. HUDEP-2 cells provide an alternative model system to erythroleukaemic lines and, unlike in vitro differentiated iPSCs, express predominantly adult globins, making them a useful tool for unpicking the pathogenesis of erythroid disorders. HUDEP-2 cells double every 24-36 h, allowing the generation of large quantities of material for downstream analyses previously impeded by low cell numbers. HUDEP-2 cells also provide an isogenic background, which circumvents variable patient phenotypic abnormalities. This protocol can be applied to additional immortalized erythroid cell lines as they become available (e.g., Ref. [28]) and provides a means to efficiently generate modified cell lines.
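The clone yields reported in Expected Results (600 sorted cells, ~30 surviving clones, ~20-42% of survivors carrying at least one targeted allele) can be used for rough planning of how many cells to sort. The sketch below simply propagates those example figures; the function name is hypothetical, and since efficiencies are locus dependent the output should be read as an order-of-magnitude guide rather than a prediction.

```python
def expected_targeted_clones(sorted_cells=600, surviving=30,
                             targeted_fraction=(0.20, 0.42)):
    """Return the survival rate and the (low, high) expected targeted clones."""
    survival_rate = surviving / sorted_cells
    low, high = targeted_fraction
    return survival_rate, (round(surviving * low, 1), round(surviving * high, 1))


# 10 Terasaki plates x 60 wells gives the 600 sorted cells used in the example.
rate, (low, high) = expected_targeted_clones(sorted_cells=10 * 60)
print(f"survival: {rate:.0%}, targeted clones: {low}-{high}")
```

On these figures a single experiment would be expected to yield roughly 6 to 13 clones with at least one targeted allele, which is compatible with the recommendation to keep 2 or 3 clones per line.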
Our approach takes advantage of recent studies that identify short ssODNs as a useful source of template for HDR [29,30]. If larger fragments are required for HDR then use of a delivery vector may facilitate this, see Reference [13]. HUDEP-2 cells also have scope for the use of longer linear double stranded donors, which have been efficiently integrated in HEK293T and human embryonic stem cells, particularly when paired with cell synchronization [31]. Treatment with Nocodazole and Cyclin-D1 causes cells to accumulate in the S/G2/M phase where HDR is the favoured pathway for DNA repair [31,32], although it remains to be shown whether this will work as efficiently in HUDEP-2 cells. Donors with an antibiotic resistance gene or that encode a fluorescent protein provide a means of selection at the level of integration and would also increase the rate of modification. Recently, modified S. pyogenes Cas9 proteins have been generated that show reduced off-target cutting events while preserving a high level of activity at on-target sites (e.g. [33,34]). These proteins offer a useful alternative where the user wishes to use RNP editing of erythroid cells as described in Reference [13].
A major consideration when utilizing the CRISPR/Cas9 system is that the enzyme tolerates mismatches in both the PAM and the protospacer element [35–40]. Although off-target activity of Cas9 is more concerning in a clinical context than when generating model systems, non-specific cleavage could lead to off-target modifications. In this study we used a short guide of 18 nucleotides to minimize the homology available for off-target binding [41]. It should be noted that shorter protospacer sequences can lead to a drop in editing efficiency; if low efficiency is an issue, longer protospacers of 20 or 21 nucleotides should be considered. Off-target sites can differ from the targeted locus by up to seven nucleotides [39,42] and these differences can take the form of single nucleotide mismatches, small indels or DNA-RNA bulges [33]. However, off-target activity generally decreases with increasing numbers of mismatches, and three mismatches have been reported to ablate almost all off-target activity [35,38,42]. Off-target activity varies according to cell type and transfection conditions [38,43–45] and there are no simple rules that can be applied to its prediction [38,42]. Unbiased detection of off-target effects is nevertheless possible using Digenome-seq [40] or CIRCLE-seq [45]; alternatively, predicted off-target sites may be amplified using specific primers, followed by mismatch detection or sequencing. Where necessary, perhaps the most effective way to navigate this limitation of the Cas9 system is to generate the same mutation using two different guides and compare the resultant cell lines. This would help to distinguish phenotypic abnormalities arising from the intended modification from the possible effects of off-target cleavage and clonal variability.
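As a first-pass triage of predicted off-target sites before amplification and sequencing, candidate sites can be compared with the guide by counting mismatches. The sketch below is deliberately simple and is not the prediction method used in this study: it handles equal-length, ungapped comparisons only, so it ignores the small indels, DNA-RNA bulges and PAM variation mentioned above. The cut-off of two mismatches reflects the reports that three mismatches ablate almost all off-target activity; the sequences and function names are hypothetical.

```python
def count_mismatches(guide, site):
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must be the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def flag_candidate_sites(guide, sites, max_mismatches=2):
    """Return the sites close enough to the guide to merit follow-up."""
    return [s for s in sites if count_mismatches(guide, s) <= max_mismatches]


# Hypothetical 18-nt guide and candidate genomic sites.
guide = "GACGTTACCGGATCAAGC"
candidates = [
    "GACGTTACCGGATCAAGC",   # identical (the on-target site)
    "GACGTTACCGGATCAAGT",   # one mismatch
    "TACGTTACCGCATCAAGT",   # three mismatches
]

for site in candidates:
    print(site, count_mismatches(guide, site))
print(flag_candidate_sites(guide, candidates))
```

Because no simple rule predicts off-target cleavage reliably, sites passing such a filter are candidates for experimental checking, not confirmed off-targets.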
Immortalized cell lines are a useful resource for unpicking the relationship between patient genotype and disease phenotype as long as their assays are interpreted within the limitations of an immortalized system. As such, we karyotyped the HUDEP-2 cells using the Infinium Omni5-4 v1.2 CGH array and found that they have a modal chromosome count of 51. These cells have relatively fewer abnormalities than HEL or K562 cells [4], which also have whole-chromosome duplication events with modal chromosome numbers of 66 [2] and 67 [46] respectively.
In vitro differentiation of the cells confirmed that, despite the potential off-target effects of the genome editing process, each clone still followed a normal differentiation process and produced mature erythroblasts (Figure 5).
Since their inception in 2013, HUDEP-2 cells have been widely utilised in conjunction with the CRISPR/Cas9 system. However, few details are available for the modification of these cells, especially where methods alternative to lentiviral transduction are preferable [9,47–51]. Here we describe a detailed protocol, similar to the strategy employed in Reference [9], for the efficient modification of immortalised erythroid progenitor cells in an easy to follow and scalable method. In our experience, HUDEP-2 cell differentiation can provide adult-globin expressing erythroblasts in numbers 4-7 fold higher than starting cell numbers within 9-12 days. These cells provide an alternative to existing immortalised erythroleukemia lines and support the need to replace animal models [6].