Integrated single-cell sequencing of 5-hydroxymethylcytosine and genomic DNA using scH&G-seq

Summary The asymmetric distribution of 5-hydroxymethylcytosine (5hmC) between two DNA strands of a chromosome enables endogenous reconstruction of cellular lineages at an individual-cell-division resolution. Further, when integrated with data on genomic variants to infer clonal lineages, this combinatorial information accurately reconstructs larger lineage trees. Here, we provide a detailed protocol for single-cell 5-hydroxymethylcytosine and genomic DNA sequencing (scH&G-seq) to simultaneously quantify 5hmC and genomic DNA from the same cell to reconstruct lineage trees at a single-cell-division resolution. For complete details on the use and execution of this protocol, please refer to Wangsanuwat et al., 2021.

Highlights scH&G-seq enables quantification of 5hmC and genomic DNA (gDNA) from the same cell Strand-specific 5hmC enables reconstructing trees at a single-celldivision resolution Genomic variants allow identification of clonal subtrees within a lineage tree Integrated 5hmC and gDNA detection improves prediction accuracy of larger lineage trees SUMMARY The asymmetric distribution of 5-hydroxymethylcytosine (5hmC) between two DNA strands of a chromosome enables endogenous reconstruction of cellular lineages at an individual-cell-division resolution. Further, when integrated with data on genomic variants to infer clonal lineages, this combinatorial information accurately reconstructs larger lineage trees. Here, we provide a detailed protocol for single-cell 5-hydroxymethylcytosine and genomic DNA sequencing (scH&G-seq) to simultaneously quantify 5hmC and genomic DNA from the same cell to reconstruct lineage trees at a single-cell-division resolution. For complete details on the use and execution of this protocol, please refer to Wangsanuwat et al., 2021.

BEFORE YOU BEGIN
This protocol for scH&G-seq (Single-cell 5-hydroxymethylcytosine and genomic DNA sequencing) was recently developed in a publication to simultaneously quantify 5hmC and genomic DNA from a single cell (Wangsanuwat et al., 2021). scH&G-seq builds upon scAba-Seq (Mooijman et al., 2016), where the addition of restriction enzymes BseRI and/or AluI enables sequencing of genomic DNA together with 5hmC in single cells. BseRI was selected because it generates the same overhang after digestion of genomic DNA as AbaSI, which is used to detect 5hmC. In addition, the recognition sequence of BseRI is devoid of CpG, eliminating ambiguity in assigning reads to genomic DNA or 5hmC. Thus, both genomic DNA-derived reads and 5hmC-derived reads can be separated computationally post-sequencing while being ligated and amplified from the same double-stranded adapter. AluI was selected to increase the coverage of the genome, as described previously (Rooijers et al., 2019). As digestion with AluI leaves a blunt end, an additional blunt double-stranded adapter is required to capture genomic DNA reads derived from AluI. The AluI double-stranded adapters and AbaSI/BseRI double-stranded adapters use different sets of unique barcodes. This protocol provides step-by-step instructions for performing scH&G-seq using both BseRI and AluI simultaneously. It also includes modifications to the protocol if only one of the two enzymes are used. Finally, scH&G-seq can be performed on any cell type that can be sorted by fluorescence-activated cell sorting (FACS) and has 5hmC in its genome.
The scH&G-seq protocol is completed over several days. Day 1 in the table below is the day a sorted 384-well plate containing single cells is thawed from À80 C for use in the scH&G-seq protocol. Use 1. Phosphorylate the bottom strand of each barcoded adapter (for both AbaSI/BseRI and AluI) a. Perform reaction in a 96-well plate. Keep each barcoded bottom strand primer in a separate well.
b. Seal the plate with a Greiner multiwell plate seal, and briefly spin the plate down at 2500 rpm. Then incubate as follows: 2. Anneal top and bottom strand primers to generate double-stranded adapters.

Protocol steps Performed on day
Restriction digestion of genomic DNA 1 Stripping chromatin from genomic DNA 1 Glucosylation of 5hmC sites in the genome 2 Protease treatment to degrade T4-BGT 3 AbaSI digestion 4 Ligating double-stranded adapters to fragmented genomic DNA molecules 4 Pooling single cells and DNA cleanup 5 In vitro transcription and amplified RNA clean-up 5 & 6 Illumina library preparation from amplified RNA 6 Computational pipeline to analyze scH&G-seq sequencing 7 and beyond Heat inactivation 65 C 20 min 1

Final hold 4 C Forever
Maintain thermocycler lid at 75 C. The total volume per well will be 20 mL.

OPEN ACCESS
a. Perform this step in a 96-well plate. Add a top strand primer to each well containing the corresponding bottom strand primer that was phosphorylated in step 1. b. Before removing the seal from the plate, allow the plate to cool to 4 C in the thermocycler, then briefly spin the plate down at 2500 rpm. c. Add to step 1: d. As described in the table below, the thermocycler is preheated to 95 C, and once it has reached this temperature, insert the plate and advance to the next step. Incubate the plate for 5 min at 95 C, then decrease temperature by 1 C per minute till the plate reaches 20 C (Maintain thermocycler lid at 105 C). The total volume per well will be 50 mL.
e. The barcoded double-stranded adapters are now double-stranded and are at a stock concentration of 20 mM.
3. Dilute the double-stranded adapters to a concentration of 1 mM. a. In a new 96-well plate, for each barcoded double-stranded adapter, combine 5 mL of the 20 mM stock (step 2) with 95 mL of nuclease-free water. The total volume per well will be 100 mL. This 96-well plate now contains 1 mM phosphorylated double-stranded adapters. 4. Further dilute the double-stranded adapters. Take the 1 mM phosphorylated double-stranded adapters from step 3 and follow the steps below to dilute them to their corresponding working concentrations in a total volume of 100 mL per well. a. For all scH&G-seq strategies, the working concentration of the random 2-nucleotide 3 0 overhang phosphorylated double-stranded adapters (that is, for ligating to AbaSI/BseRI generated fragments) is 75 nM. Make a new 96-well plate for this working concentration. To do this, for each barcoded double-stranded adapter, combine 7.5 mL of 1 mM stock (step 3) with 92.5 mL of nuclease-free water. The total volume per well will be 100 mL. The adapter sequences have been described in detail previously (Mooijman et al., 2016). b. The working concentration of the blunt phosphorylated double-stranded adapters (that is, for ligating to AluI generated fragments) is 64 nM. Make a new 96-well plate for this working concentration. To do this, for each barcoded double-stranded adapter, combine 6.4 mL of 1 mM stock (step 3) with 93.6 mL of nuclease-free water. The total volume per well will be 100 mL. The adapter sequences have been described in detail previously (Rooijers et al., 2019;Markodimitraki et al., 2020). 5. Seal all the plates with a Greiner multiwell plate seal, and briefly spin the plates down at 2500 rpm. Note: To expedite the above process, use a 12-channel pipette when handling the barcoded double-stranded adapters and for making dilutions. Mix the solution well by pipetting up and down at least 10 times at each stage. All primers and double-stranded adapters can be stored long term at À20 C.
Note: The bottom strand primer refers to the primer that will ligate to the free 3 0 end of genomic DNA after enzymatic digestion. The top strand primer refers to the primer that will ligate to the free 5 0 end of genomic DNA after enzymatic digestion.

Add Vapor-Lock to 384-well plates
Timing: 10 min per plate To avoid evaporation of small volumes of aqueous solutions added in later steps, Vapor-Lock is added to each well of a 384-well plate. All pipetting at this step is performed manually.
6. Add 4 mL of Vapor-Lock to each well of a 384-well plate. a. A 12-channel pipette can be used for dispensing. To do this, use a serological pipette to dispense Vapor-Lock into a disposable pipetting reservoir. Typically, 5.5 mL of Vapor-Lock is needed to make 3 plates. 7. Seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm.
CRITICAL: Avoid touching the tips containing Vapor-Lock onto the top of the plate as this will result in a small amount of oil on the surface of the plate, considerably reducing the ability of the Greiner multiwell plate seal to effectively seal the plate.
Note: The same pipette tips can be used to fill 2 plates with Vapor-Lock. Use new tips if any of the tips touch any other surface that could be contaminated with RNases.
Note: PCR safe mineral oil can be used as a cheaper alternative to Vapor-Lock. The Vapor-Lock or mineral oil is added manually to avoid introducing oil into the liquid-handling robot. In this protocol, we prepared double-stranded adapters in 96-well plates and Vapor-Lock in 384-well plates, but scH&G-seq can be performed in just 96-well or 384-well plates.

Culture H9 cell line
Timing: 2+ weeks The cells of interest are grown under the desired experimental conditions, and when ready they are isolated for scH&G-seq. As an example here, H9 human embryonic stem cells are grown.
8. On ice, dilute the Matrigel in DMEM/F-12 according to the lot specific dilution factor. Then precoat a 6-well dish with the diluted Matrigel (1 mL/well). Incubate the plate at 20 C-25 C for 1 h for same day use or wrap in parafilm and store at 4 C for up to 1 week. Aspirate off superfluous liquid before plating cells. 9. Thaw a vial containing 1 million H9 cells. Put 70% of the cells into one Matrigel coated well and the other 30% into another Matrigel coated well of a 6-well dish.

OPEN ACCESS
a. Over the next few days continue culturing the well that looks healthier and grows at the appropriate rate. Healthy colonies have smooth edges and are round, containing closely packed cells with a high nucleus to cytoplasm ratio. 10. Fill each well with 2 mL of pre-warmed mTeSR1. 11. Refresh the media daily and wash with 13 DPBS if cell debris is visible. 12. When cells reach approximately 70% confluency (this should take 5-7 days), and before colonies begin to combine, passage the cells using the cell dissociation reagent Versene. a. Aspirate the media and wash with 13 DPBS b. Add 1 mL of Versene to each well that requires passage. c. Incubate the plate at 37 C for 3-5 min. d. Aspirate Versene gently. The colonies should still be attached to the plate and should be visible to the unaided eye. e. Use a P1000 pipette to slowly add cell culture media onto the colonies, removing them from the plate. Do not use the pipette tip to scratch the colonies off the plate. f. Pipette up and down to create clumps of cells. Be gentle to prevent a single-cell suspension.
During pipetting, avoid the formation of air bubbles. g. Passage the cells in small clumps. Typically, one well at approximately 70% confluency can be passed into 6-12 new Matrigel coated wells. h. Add up to 2 mL of media per well and disperse the clumps of cells evenly throughout the well. 13. Continue to grow the cells for at least one more passage, or until the cells are ready for single cell collection by FACS according to your experimental procedure. For H9 cells grown here, cells are ready for FACS sorting when they reach approximately 70% confluency and before colonies begin to combine.
CRITICAL: Pipetting with the appropriate amount of force to create small clumps of H9 cells is critical. Creating a single-cell suspension of H9 cells during passage will result in substantial cell death. Passaging in large clumps will result in the new plate quickly becoming overgrown. When initially attempting the protocol, it is recommended to err on the side of passaging cells in larger clumps and checking cell growth often.
Note: Replace the cell culture steps as appropriate. It is critical that the resulting cells can be sorted by FACS and that the cells contain detectable levels of 5hmC.

MATERIALS AND EQUIPMENT
For high-throughput processing, the Nanodrop II low-volume liquid-handling robot was used unless otherwise noted. For dispensing reagents into the 384-well plate, the ''dispense plate discrete stop'' program built into the Nanodrop II robot was used. When dispensing reagents, 100 mL (12.5 mL per dispensing tip) of additional master mix is needed to account for the dead volume in the dispensing reservoir. For dispensing double-stranded adapters, the ''dispense plate in-the-well'' program built into the Nanodrop II robot was used, and between each dispense step the tips were washed three times with the pre-programed ''Eject Wash Rinse'' program to prevent inadvertently mixing barcoded double-stranded adapters.
Note: Any liquid-handling robot with the ability to dispense into 384-well plates down to 200 nL volumes can be used. For manual pipetting, aerosol barrier, low retention pipet tips should be used to reduce losses and the possibility of cross contamination. Additionally, handling all materials in a PCR workstation can also reduce the chances of cross contamination. Finally, clean all surfaces and items (including gloves) using 70% ethanol followed by RNaseZap to reduce the chances of cross contamination.
Note: To ensure that the aRNA fragmentation buffer remains RNase-free, split the Tris-acetate into two parts. Use a pH probe to measure the exact amount of hydrochloric acid or sodium hydroxide required to balance the pH of one of the Tris-acetate aliquots to 8.1. Discard this aliquot, and to the other aliquot, add the measured amount of acid or base. Do not use the pH probe on this aliquot as it could result in RNase contamination.

STEP-BY-STEP METHOD DETAILS
In the protocol below, we provide details of performing scH&G-seq and the corresponding data processing steps to analyze the Illumina sequencing data ( Figure 1). Note: In each reaction mix table, the final concentration for each reagent added is calculated based on its concentration in the final solution volume in the 384-well plate rather than its concentration in the mix being added.

Prepare plates for FACS by adding lysis buffer
Timing: 1-2 h Add a lysis buffer to each well of a 384-well plate. This step is performed prior to FACS to ensure that once single cells are sorted into this solution, they are lysed and the plate can be stored at À80 C until processed.
1. To each well of a Vapor-Lock containing 384-well plate, add the lysis buffer as described below using a liquid handling robot: Figure 1. Schematic of scH&G-seq In scH&G-seq, cells of interest are first sorted into 384-well plates containing the lysis buffer. Next, AluI and/or BseRI are added to digest genomic and mitochondrial DNA. Following this step, protease treatment is used to strip off chromatin. Thereafter, 5hmC marks in the genome are glucosylated by T4-BGT and then protease treatment is used to degrade T4-BGT. Genomic DNA marked by modified 5hmC sites are then cut by AbaSI, which is followed by ligation with AbaSI-/BseRI-specific and AluI-specific double-stranded adapters. Next, single cells are pooled and amplified by in vitro transcription. Finally, after reverse transcription (RT) and PCR amplification, the Illumina libraries are ready for sequencing.

OPEN ACCESS
2. Seal the 384-well plates with a Greiner multiwell plate seal, and briefly spin the plate down at 2500 rpm. Always keep plates on ice except when directly handling them.

Isolate and FACS sort single cells
Timing: 2-6 h (depending on the number of plates sorted) To make a single cell suspension of H9 cells (or cells of interest) and to sort one cell into each well of a 384-well plate, proceed as follows: 3. Wash cells twice with 2 mL of 13 DPBS. 4. Add 0.5 mL of TrypLE select (13) solution to one well of a 6-well plate and incubate the plate at 37 C until cells begin to lift (3-5 min). 5. Add serum-containing media to neutralize the TrypLE solution, collect the cells in a new 15 mL conical centrifuge tube and spin the tube at 300 g for 5 min to pellet the cells. 6. Aspirate the medium and resuspend the pellet in 1 mL of 13 DPBS. Make a single-cell suspension by gently triturating with a P1000 pipette. Always keep the tube on ice except when handling it. a. Add propidium iodide (PI) to distinguish between live and dead cells, thereby increasing the capture efficiency. The final concentration of PI should be between 1 and 50 mg/mL. b. Filter cells using a FACS test tube with a cell strainer snap cap to remove large cellular clumps.
After filtration, sort cells from this test tube. 7. Set up the FACS machine to sort single cells into a 384-well plate.
a. If you are unsure whether the microfluidics of the FACS machine was cleaned prior to use, clean the instrument with 10% bleach. For the Sony SH800 cell sorter, follow the Bleach Cleaning wizard in the sorter software. b. Set the sample area to 4 C and, if possible, set the collection area on the FACS machine to 4 C as well. c. Use an empty 384-well plate sealed with a Greiner multiwell plate seal to perform a mock sort.
Do this by dispensing 50 droplets of sheath fluid onto the seal covering the plate. Check that the droplets are centered on the wells of the plate. If they are not, adjust the alignment and reperform the mock sort. Perform this mock sort on one fourth of all wells spread across the plate to ensure that the plate is properly aligned.
CRITICAL: Accurate alignment of droplets on the 384-well plate is needed for successfully sorting single cells into the plate.
8. Once the 384-well plate is well-aligned, sort a single cell into each well of the 384-well plate. Identify single cells using forward scatter and backscatter (or side scatter depending on the sorter). 9. After sorting a plate, seal it with a Greiner multiwell plate seal, and briefly spin the plate down at 2500 rpm. If sorting multiple plates, place the sorted plate on ice (or transfer to À80 C freezer) and sort the next plate following the same protocol.
Optional: Leave a few wells unsorted as a negative control. Sequencing results from these negative wells will behave similar to poorly sorted wells. If you get a significant number of sequencing reads that map to these negative wells, it indicates genomic DNA contamination. Note: mTeSR1 does not contain serum. If serum containing media is unavailable, neutralize the TrypLE solution with a 5% fetal bovine serum solution in 13 DPBS.
Note: Gentle pipetting is important to make a single cell suspension but over pipetting can reduce cell viability and increase cellular debris.
Note: Single cells have a smaller forward scatter area and width when compared to doublets, but a similar forward scatter height. Additionally, single cells display higher forward scatter and lower back scatter than cellular debris. A high proportion of your cells should be viable single cells. Viable cells should contain no emission from propidium iodide.
Pause point: After sorting, plates can be stored at À80 C until ready to continue. Stored plates have similar success in generating high-quality Illumina libraries, even for plates that have been stored for up to 1 year.

Timing: 2 h
Genomic DNA is digested with the enzyme(s) of choice to sequence the genome in scH&G-seq. Simultaneous digestion with both BseRI and AluI is described here, but the options for digesting with only one of the two enzymes are also discussed.
10. Incubate the plate for 5 min at 65 C and then return it to 4 C or ice.
11. Digest genomic DNA simultaneously with both BseRI and AluI by adding the following mix to each well of the 384-well plate using a liquid handling robot. a. BseRI and AluI digestion master mix: b. Seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm. c. Incubate in a thermocycler as follows: The total volume per well will be 700 nL (excluding Vapor-Lock).
Optional: If only BseRI is used, replace the digestion mix in step 11a with the following mix. Perform steps 11b and 11c as described above. Optional: If only AluI is used, replace the digestion mix in step 11a with the following mix.
Perform steps 11b and 11c as described above.
Optional: If the goal is to detect only 5hmC and no genomic DNA, replace the digestion mix in step 11a with the following mix. Perform step 11b as described above but skip step 11c.
Note: Digestion of genomic DNA here is performed prior to stripping off chromatin to maximize reads derived from mitochondrial DNA. Mitochondrial reads can be useful for clonal lineage reconstruction as regions of the mitochondrial genome have higher rates of mutation compared to genomic DNA (Wallace, 1994;Ludwig et al., 2019;Xu et al., 2019).

Timing: 1 day
Protease is added to remove chromatin, enabling the genome-wide detection of 5hmC.
12. If no Qiagen protease solution is available, starting from the lyophilized powder make a fresh Qiagen protease solution at a concentration of 100 mg/mL in nuclease-free water.
CRITICAL: Make the Qiagen protease solution under RNase-free conditions. A sterile pipette tip can be used instead of a traditional spatula to scoop out the lyophilized powder, while maintaining RNase-free conditions.
Note: Qiagen protease solution can be stored at À20 C for up to one month.
13. Add the following mix to each well of the 384-well plate using a liquid handling robot. a. Protease master mix: b. Seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm. c. Perform the following: d. The total volume per well will be 2.5 mL.
Pause point: Plates can be stored for 12 h at 4 C.
14. Convert 5hmC to 5ghmC by adding the following mix to each well of the 384-well plate using a liquid handling robot. a. BGT master mix: b. Seal the plate with a Greiner multiwell plate seal and briefly spin down the plate at 2500 rpm. c. Perform the following: Pause point: Plates can be stored for 12 h at 4 C.
Protease treatment to degrade T4-BGT

Timing: 5 h
Protease is added to the reaction mixture to degrade T4-BGT.
15. If no Qiagen protease solution is available, make a fresh Qiagen protease solution at 100 mg/mL concentration in nuclease-free water as described in step 12. 16. Degrade T4-BGT by adding the following mix to each well of the 384-well plate using a liquid handling robot. a. Protease master mix: b. Seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm. c. Perform the following: d. The total volume per well will be 3.5 mL.
Pause point: Plates can be stored for 12 h at 4 C.

AbaSI digestion
Timing: 3 h AbaSI is added to selectively cut genomic DNA downstream of 5hmC and 5ghmC sites, leaving a 2-nucleotide 3 0 overhang.
17. Cut genomic DNA downstream of 5hmC and 5ghmC sites by adding the following mix to each well of the 384-well plate using a liquid handling robot. a. AbaSI digestion master mix: b. Seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm. c. Perform the following: d. The total volume per well will be 4 mL.
Ligating double-stranded adapters to fragmented genomic DNA molecules Timing: 1 day The double-stranded adapters are ligated to genomic DNA fragments cut by AluI, BseRI, and AbaSI. These adapters are specific for each enzyme and uniquely barcoded so that single cells can be pooled downstream in the protocol. The adapters also contain a part of the Illumina read 1 adapter sequence and a T7 promoter sequence.
18. Add 200 nL of the 64 nM blunt phosphorylated double-stranded adapter to each well of the 384-well plate using a liquid handling robot. Use one unique adapter per well ( Figure 2). a. After adding the double-stranded adapter, seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm. 19. Add 200 nL of the 75 nM random 2-nucleotide 3 0 overhang phosphorylated double-stranded adapter to each well of the 384-well plate using a liquid handling robot. Use one unique adapter per well (Figure 2). a. After adding the double-stranded adapter, seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm. 20. Ligate the adapters to the fragmented genomic DNA molecules by adding the following mix to each well of the 384-well plate using a liquid handling robot. a. T4 DNA ligase master mix: b. Seal the plate with a Greiner multiwell plate seal and briefly spin the plate down at 2500 rpm. d. The total volume per well will be 5 mL. The combined concentration of NEBuffer 4 and T4 DNA ligase buffer is 13.
CRITICAL: The cells that you want to pool together must have uniquely barcoded doublestranded adapters added at this step. Do not add more than one barcoded doublestranded adapter of each adapter type to a single reaction well.
CRITICAL: Wash the tips of the robot three times with the pre-programed ''Eject Wash Rinse'' program after each adapter dispenses to prevent inadvertently mixing the double-stranded adapters.

Thermocycler conditions
Steps Temperature Time Cycles CRITICAL: Document the double-stranded adapters used for each cell. 5hmC derived (from AbaSI) and BseRI derived genomic DNA reads will have the same barcode for a given cell, while AluI derived genomic DNA reads will have another set of barcodes. AbaSI/BseRI barcodes and AluI barcodes for the same cell will later be matched during data processing. Without this documentation, it will not be possible to pair the 5hmC derived reads, and the BseRI and AluI derived reads to the same cell.
Note: ATP and the T4 DNA ligase buffer, which contains ATP, degrade with multiple freeze thaw cycles. Aliquot these components as necessary to limit the number of freeze thaw cycles.
Optional: If AluI is not used to digest genomic DNA, do not add the blunt phosphorylated double-stranded adapters (skip step 18). In addition, replace the ligation mix (step 20a) with the following mix shown below. Thereafter, perform steps 20b and 20c as described above.
Pause point: Plates can be stored at À20 C until ready to continue.

Pooling single cells and DNA cleanup
Timing: 3 h Each well containing distinctly barcoded DNA molecules from an individual cell in the 384-well plate is pooled together manually. A standard 13 DNA bead cleanup is performed to exchange the buffer.
21. Use a manual pipette (a 12-channel P20 is recommended) to pool cells that are uniquely barcoded (both AbaSI/BseRI barcodes and AluI barcodes) into a 1.5 mL microcentrifuge tube. Each well of the 384-well plate should contain a total volume of 9 mL (4 mL Vapor-Lock and 5 mL of aqueous solution) ( Figure 2). a. Depending on the numbers of AbaSI/BseRI and AluI barcodes used, more than one 1.5 mL microcentrifuge tube might be required to pool cells from the entire plate. For example, if 96 AbaSI/BseRI barcodes and 96 AluI barcodes were distributed over a 384-well plate, four 1.5 mL microcentrifuge tubes will be required to pool cells. b. To pool a plate formatted as described in Figure 2, for the first barcoded set of 96 cells, pool all odd rows and odd columns. Use a 12-channel P20 pipette to aspirate all liquid from wells A1, A3, A5.A23, then eject the liquid into a 12-tube PCR strip. c. Return to the plate and aspirate wells C1, C3, C5.C23, then eject the liquid into the same 12-tube PCR strip. d. Continue this pattern until the odd columns of row O have been transferred into the same 12tube PCR strip. The same tips can be used throughout this process but should be discarded after transferring the odd columns of row O. e. Transfer the material from the 12-tube PCR strip into a 1.5 mL microcentrifuge tube. The pooling of one set of 96 uniquely barcoded cells are now complete. f. Repeat this process using new tips, a new 12-tube PCR strip and a new 1.5 mL microcentrifuge tube to pool all even rows and odd columns. This will result in the second set of 96 uniquely barcoded cells. g. Repeat this process using new tips, a new 12-tube PCR strip and a new 1.5 mL microcentrifuge tube to pool all odd rows and even columns. This will result in the third set of 96 uniquely barcoded cells. h. Repeat this process again using new tips, a new 12-tube PCR strip and a new 1.5 mL microcentrifuge tube to pool all even rows and even columns. This will result in the fourth set of 96 uniquely barcoded cells. At this point the 384-well plate has been distributed into 4 libraries and each library has its own 1.5 mL microcentrifuge tube. 22. Spin the tubes at 5,000 g for 1 min. The mixture should separate into two distinct phases. 23. Use a P200 pipette to remove as much of the upper oil phase as possible. 24. Next, use a P1000 pipette to quickly plunge through any of the remaining oil phase and transfer the aqueous phase to a new tube. a. The pipette should go through the oil phase only once to collect all the solution in the aqueous phase. If the oil and aqueous phases begin to mix prior to complete separation of the aqueous phase, repeat steps 22-24. b. The samples are not amplified at this point. To minimize sample loss, collect as much of the aqueous layer as possible, and ideally leave behind less than 10 mL of the aqueous phase. It is best practice to not aspirate any of the oil phase as it can complicate step 25. Note: When using a vacufuge, if the sample volume reduces to less than 6.4 mL, add nucleasefree water to bring the volume back up to 6.4 mL.
Pause point: Samples can be stored at À20 C until ready to continue.

In vitro transcription and amplified RNA clean-up
Timing: 1 day In vitro transcription (IVT) is used to amplify the barcoded DNA molecules, generating amplified RNA (aRNA). Next, ExoSAP-IT is used to remove any excess primers. The aRNA is then fragmented to a suitable length (200-1000 nucleotides) for sequencing. Finally, a standard 0.8253 RNA bead cleanup is performed to exchange the buffer.
27. The MEGAscript T7 transcription kit is used for this step. To each pooled sample containing 6.4 mL of barcoded DNA molecules add: a. IVT master mix: b. The total volume per sample will be 16 mL containing 13 T7 reaction buffer.
c. Perform the following: d. The samples are now amplified and contain aRNA.
CRITICAL: As IVT takes 13 h to complete, it is likely that this step will be performed at the end of the day and the samples will be held at 4 C. While aRNA is stable at 4 C for several hours, it is recommended to either store aRNA at À20 C for short-term storage (a few days) or proceed to the next step within a few hours after the IVT step is complete. the tube from the magnetic stand and elute the fragmented aRNA in 22 mL of nuclease-free water. e. Allow the fragmented aRNA to elute from the beads for 10 min before placing the tube on the magnetic stand. Once the beads form a pellet, transfer the supernatant to a new tube.
CRITICAL: All surfaces and items must be RNase-free. Use RNaseZap to wipe down all equipment, surfaces, and gloved hands. RNase contamination can lead to library preparation failure.
Optional: At the completion of this section, an Agilent RNA 6000 Pico chip can be run to check the size distribution of the fragmented aRNA ( Figures 3A and 3B).
Pause point: Samples can be stored at À80 C until ready to continue. In the following steps, not all aRNA will be used. The remaining aRNA can be stored at À80 C to generate additional Illumina libraries in the future if necessary.

Timing: 1 day
In this section, aRNA is reverse transcribed using random hexamer primers with overhangs that contain part of the Illumina read 2 adapter sequence. The reverse transcribed single-stranded DNA now contains parts of the Illumina read 1 and read 2 adapters at the two ends of the molecules. Next, these molecules are amplified by PCR, and the full Illumina adapter sequences are introduced using overhangs in the PCR primers. Two 0.8253 DNA bead cleanups are performed to remove primers and dimers, and to exchange the buffer. The size distribution and concentration of the final Illumina libraries are then quantified using an Agilent Bioanalyzer and Qubit fluorometer, respectively. b. The total volume will be 6.5 mL. c. Perform the following:

Thermocycler conditions
Steps Temperature Time Cycles Heat denature the aRNA 65 C 5 min 1 d. Immediately place the samples directly on ice. e. Add the remaining reverse transcription mix: f. The total volume will be 10.5 mL. g. Perform the following: 33. PCR amplification a. Transfer 5.25 mL of the reverse transcribed aRNA from step 32 to a new PCR tube and store the other half at À20 C for later use in case the number of PCR cycles performed here turns out to be non-optimal. b. Each sample should receive its own indexed RNA PCR Primer (RPI1-RPI48) so that it can be multiplexed with other samples in one Illumina sequencing lane if desired. c. To 5.25 mL of the reverse transcribed aRNA, add the following PCR mix: d. The total volume is now 25.25 mL. e. Perform the following (Note that the thermocycler is preheated to 98 C, and once it has reached this temperature, insert the sample and advance to the next step):

Thermocycler conditions
Steps Temperature Time Cycles Initial extension 25 C 10 min 1 Reverse 34. Briefly spin down the tubes to collect all the liquid to the bottom of the tube. Add 24.75 mL of nuclease-free water. Transfer the solution in each PCR tube to a new 1.5 mL microcentrifuge tube. 35. Perform a 0.8253 AMPure XP magnetic bead cleanup, eluting in 50 mL of nuclease-free water.
a. This cleanup is similar to the one described in step 25. b. Add 41.25 mL of AMPure XP magnetic beads to 50 mL of the PCR product. Incubate the sample for 15 min. Transfer the tube to a magnetic stand. c. Once the beads form a pellet, discard the supernatant. Wash the pellet twice with freshly made 80% ethanol. d. Dry the beads on the magnetic stand by leaving the cap of the tube open for 3-10 min.
Remove the tube from the magnetic stand and elute in 50 mL of nuclease-free water. e. Allow the DNA to elute from the beads for 10 min before placing the tube on the magnetic stand. Once the beads form a pellet, transfer the supernatant to a new tube. 36. Perform a second 0.8253 AMPure XP magnetic bead cleanup, eluting in 15 mL of nuclease-free water. a. This cleanup is the same as step 35, except that the final elution volume is 15 mL. 37. Determine the fragment size distribution of the Illumina library using an Agilent high-sensitivity DNA kit following the manufacturer's recommendations. 1 mL of each sample is loaded on a high-sensitivity DNA chip. a. If the Illumina library does not show a size distribution similar to Figure 3C, see troubleshooting 2, 3, 4, and 5 CRITICAL: The bioanalyzer plot should show that greater than 90% of the product library has a size distribution between 200 and 1000 bp. There should be little to no primer (<100 bases) or dimer peaks ($120 bases) present. If there is a substantial primer and/or dimer peaks present, or if a significant percentage of the library is greater than 1000 bp, see troubleshooting 2 and 3.
38. If the bioanalyzer results warrant continuation, quantify the concentration of the Illumina library using the Qubit dsDNA high-sensitivity assay by following the manufacturer's recommendations. 1 mL of the Illumina library is used to estimate the concentration. a. If the sample concentration is above the lower limit of detection for the Qubit dsDNA highsensitivity assay, the concentration is typically sufficient for Illumina sequencing. b. The concentration will vary depending on the number of PCR cycles, but concentrations are typically in the range of 0.5-4 ng/mL. Low Qubit concentrations might reflect that many of the cells in the library have failed to amplify and may not produce high-quality data. 39. Combine Illumina libraries with distinct multiplexed barcodes added during the PCR step (step 33). Sequence on an Illumina platform to a depth of approximately 0.5-1 million reads per cell. For the analysis pipeline presented below, sequencing should contain at least 76 cycles on read 1. No custom Illumina sequencing primers are required.
Note: During PCR (step 33), one of the primers is universal (RP1) and is added to all samples. To multiplex multiple samples on the same Illumina sequencing lane, the other primer (RPI1, RPI2, . RPI48) is unique to each sample.
Note: 9 PCR cycles were found to be sufficient for making successful Illumina libraries containing 48 H9 cells.
Note: These Illumina libraries can be sequenced without any custom primers, and they can be multiplexed with any TruSeq-based Illumina sequencing library.
Note: While the steps 25, 35, and 36 are for cleanup of DNA molecules, the cleanup in step 31 is for RNA molecules. Make sure to use the correct magnetic beads (AMPure XP vs. RNAClean XP) accordingly.

OPEN ACCESS
Pause point: Samples can be stored at À20 C after the reverse transcription or PCR step (prior to or after AMPure XP magnetic bead cleanup).
Computational pipeline to analyze scH&G-seq sequencing results

Timing: 1-2 days
Sequencing results are processed into output files that provide 5hmC and genomic/mitochondrial DNA locations in each cell. Each read type -AbaSI, BseRI, and AluI derived reads -is partitioned into different files. Each file contains a list of unique locations detected (with duplicate reads removed) with the following format: cell barcode number, chromosome, mapped genomic location, and DNA strand information. The DNA strand information column is replaced with a unique molecule identifier (UMI) for the AluI derived file. All scripts and barcodes used can be found at GitHub https://github.com/alexchialastri/scH-G-seq. The associated scH&G-seq sequencing data for H9 cells can be found at GEO: GSE131678.
40. Process reads derived from AbaSI (5hmC) and BseRI (genomic/mitochondrial DNA). These reads have the same barcode (which is a different set from AluI derived reads) and are deconvoluted using the following steps: a. Sequenced reads (read 1) are trimmed to 76 bases. i. perl MakeFastqShorter.pl -Input InputFastqFileName.fastq -LengthMake 76 b. Trimmed reads containing AbaSI/BseRI barcodes are extracted.
Note: Prior to mapping, index the genome using BWA version 0.7.15. Follow the steps outlined in the BWA manual to create an indexed genome.
Note: BWA parameters in steps 40.c and 41.c can be adjusted as appropriate. However, ''-B'' parameter indicating the barcode length should be kept as 6 and 13, respectively.
Note: The names of output files can be adjusted as appropriate, as long as the names of files generated in steps 40 and 41 remain distinct.
Note: For the pipeline processing BseRI and AbaSI derived reads, the resulting text files will indicate chromosome X as 23, chromosome Y as 24, and mitochondrial-based reads as chromosome 25. The following scripts will also work on the mouse genome (or any organism with less than 22 autosomes) and will still result in the same naming convention for the sex chromosomes and mitochondrial-based reads. Organisms with more than 22 autosomes will require slight modification to line 65 of process_scaba_with_BseRI_AnyInput.pl to account for more chromosomes. For the pipeline processing AluI derived reads, the resulting text files will identify the sex chromosomes as chromosome X and Y, and the mitochondrial-based reads as chromosome M.
Optional: After mapping the fastq files, samtools can be used to quantify the mapping efficiency using the flagstat tool on both sam files. Samtools can also be used to convert the sam file into a compressed bam file for long term data storage.
Optional: After completing this protocol, follow the steps described for single-cell Probabilistic Endogenous Cellular Lineage Reconstruction (scPECLR) in Wangsanuwat et al., 2021 to reconstruct cellular lineages.

EXPECTED OUTCOMES
Prior to sequencing, the bioanalyzer plot should show a size distribution that contains greater than 90% of the Illumina library between 200 and 1000 bp. There should be little to no primer (<100 bases) or dimer peaks ($120 bases). The bioanalyzer plot for the Illumina library should be similar to that shown in Figure 3C. The concentration of the Illumina library can vary but it should be high enough to be detected on the Qubit dsDNA high-sensitivity assay with 1 mL of sample. Successfully sequenced cells typically result in several orders of magnitude more sequencing reads compared to barcodes that were not used (negative controls), or cells that were poorly sorted ( Figure 4A). Nearly all cells that have high read counts derived from one enzymatic digestion also have high read counts from the other enzymatic digestion(s) ( Figure 4B).
In the example shown in Figure 4, scH&G-seq was performed using BseRI (B), AluI (A), both enzymes (AB), or neither enzyme (N), along with AbaSI. As expected, genomic and mitochondrial BseRIderived reads were detected in conditions B and AB ( Figures 4C and 4D). Similarly, genomic and mitochondrial AluI-derived reads were detected in conditions A and AB ( Figures 4E and 4F). In all conditions, high levels of AbaSI-derived reads were identified, demonstrating that in all these conditions 5hmC can be detected ( Figure 4G).

QUANTIFICATION AND STATISTICAL ANALYSIS
Strand-specific 5hmC can be used to infer cellular lineages at an individual cell division resolution. Further, sequencing of genomic and mitochondrial DNA can be used to identify copy number variations (CNV) and single-nucleotide polymorphisms (SNP) that together with strand-specific 5hmC can be used to reconstruct larger lineage trees. The detailed methods can be found in Wangsanuwat et al., 2021. LIMITATIONS Certain cell types contain low levels of 5hmC and in these cases successful detection of this mark may be challenging in single cells. However, for applications related to reconstructing lineage trees, it is only necessary to quantify the relative levels of 5hmC between the two strands of a chromosome; therefore, in these cases, the absolute levels of 5hmC in a cell does not impact the accuracy of lineage reconstruction (Mooijman et al., 2016;Wangsanuwat et al., 2021).

TROUBLESHOOTING Problem 1
Prior to IVT, the sample does not reduce to a volume of 6.4 mL, even after several minutes in the vacufuge (step 26).

Potential solution
This problem most likely occurs because the sample contains a layer of oil, reducing the efficiency of evaporation. Return the sample to the vacufuge, and spin for an additional 10 min at 20 C-25 C. If no evaporation occurs, use the 30 C temperature mode on the vacufuge and continue spinning for an additional 10 min. If this fails, transfer the liquid to a new 1.5 mL microcentrifuge tube and continue spinning at 30 C. If necessary, a short vacufuge spin of less than 10 min at a temperature of 45 C can also be attempted. If the issue persists, continue with the protocol by proportionally scaling up all reagents of the IVT reaction, aRNA fragmentation and the aRNA bead cleanup. IVT has been successfully performed with an initial volume as high as 10 mL. Continued (E and F) Using either AluI or a combination of AluI and BseRI enables the detection of AluI-derived reads from genomic DNA (E) and mitochondrial DNA (F). As cells in the control sample received no AluI adapter, it has no associated AluI-derived reads, and is therefore not shown in these two panels.

Problem 2
After running the Agilent high-sensitivity DNA kit on the bioanalyzer, a significant amount of material and/or peaks are observed at sizes less than 200 bp (step 37).

Potential solution
Perform an additional 0.8253 AMPure XP magnetic bead cleanup and elute in 15 mL of nuclease-free water. Then rerun the Illumina library on the bioanalyzer using the Agilent high-sensitivity DNA kit.

Problem 3
After running the Agilent high-sensitivity DNA kit on the bioanalyzer, a significant amount of material is observed at sizes greater than 1000 bp (step 37).

Potential solution
Starting from the 5.25 mL of saved reverse transcribed aRNA (step 33.a), perform PCR with 2-3 less cycles. Continue to follow the protocol and re-evaluate the Illumina library on the bioanalyzer using the Agilent high-sensitivity DNA kit.

Problem 4
After running the Agilent high-sensitivity DNA kit on the bioanalyzer, a very limited amount of material or no material is observed (step 37).

Potential solution
Starting from the 5.25 mL of saved reverse transcribed aRNA (step 33.a), perform PCR with 3 more cycles. Continue to follow the protocol and re-evaluate the Illumina library on the bioanalyzer using the Agilent high-sensitivity DNA kit. If there is still a very limited amount of material, run an Agilent RNA 6000 Pico chip on the aRNA following the manufacturer's recommendations to determine if the ligated genome DNA molecules were successfully amplified by IVT. If no material is observed on the Agilent RNA 6000 Pico chip, restart the protocol by processing a new plate (starting from step 10).

Problem 5
Continued issues with generating high-quality Illumina libraries based on the bioanalyzer plots and the previous troubleshooting steps have not worked after multiple attempts (step 37).

Potential solution
If none of the plates from a given FACS sort have been verified to work, restart by sorting single cells again into 384-well plates (starting from step 1). Replace reagents with new nuclease-free aliquots. If issues persist, lower the double-stranded adapter concentration by a factor of two. Lower doublestranded adapter concentrations typically produce high-quality Illumina libraries on the bioanalyzer at the expense of sequencing complexity. If issues persist, try sorting 100 cells into each well of a 384-well plate and perform scH&G-seq as described above to determine if single-cell sorting and/or handling of limited starting material is the problem.

RESOURCE AVAILABILITY
Lead contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Siddharth S. Dey (sdey@ucsb.edu).

Materials availability
This study did not generate new unique materials or reagents.

Data and code availability
The raw and processed single-cell sequencing data have been deposited at GEO and are publicly available as of the date of publication. The accession number is listed in the key resources The codes required to analyze sequencing data produced in this protocol have been deposited at GitHub and are publicly available as of the date of publication. The GitHub link is listed in the key resources table.