scGR-seq: Integrated analysis of glycan and RNA in single cells

SUMMARY
Glycans are structurally diverse molecules found on the surface of living cells. This protocol details a system developed for the combined analysis of glycan and RNA in single cells (scGR-seq) using human induced pluripotent stem cells (hiPSCs) and hiPSC-derived neural progenitor cells (NPCs). scGR-seq consists of DNA-barcoded lectin-based glycan profiling by sequencing (scGlycan-seq) and single-cell transcriptome profiling (scRNA-seq). scGR-seq will be an essential technique to delineate the cellular heterogeneity of glycans across multicellular systems. For complete details on the use and execution of this protocol, please refer to Minoshima et al. (2021).

BEFORE YOU BEGIN
The protocol below describes the specific steps for using human induced pluripotent stem cells (hiPSCs) and hiPSC-derived neural progenitor cells (NPCs). However, we have also used this protocol in other cells, such as human dermal fibroblasts, hiPSC-derived neurons, and several cell lines.
This protocol consists of three major steps (Figure 1):
(i) Single cell Glycan-seq (scGlycan-seq)
(ii) Single cell RNA-seq (scRNA-seq)
(iii) Integrated data analysis (scGR-seq)
Standard cell culture procedures and humidified incubators are required for the maintenance of cell culture.
We use the next-generation sequencer, MiSeq (Illumina) and the barcode DNA counting system (Minoshima et al., 2021) to count DNA-barcodes derived from each lectin.
Note: 1 g of Sepharose gel contains 0.5 mmol of epoxy groups. The amount of gel to prepare can be reduced depending on the purification scale.
CRITICAL: Epichlorohydrin is a toxic substance.

Purification of DNA-barcoded lectins
Timing: 2 days
13. Wash the sugar-immobilized Sepharose CL-4B column (1 mL in a miniature column) with 1 mL of PBSE at 4 °C.
14. Add 100 μL of PBSE to the DNA-barcoded lectin solution and apply it onto the sugar-immobilized Sepharose CL-4B column. Recover the flow-through fraction (100 μL).
15. Wash the sugar-immobilized Sepharose CL-4B column with 400 μL of PBSE three times. Recover each wash fraction (400 μL each).
16. Add 400 μL of the elution solution, which is PBSE containing the appropriate sugar for each lectin (see Table 1). Repeat this step three times. Recover each elution fraction (400 μL each).
17. Analyze the DNA-barcoded lectins by SDS-PAGE (Figure 2). Mix 4 μL of each fraction from the purification steps (lectin only, flow-through, wash, elution) with 4 μL of SDS sample buffer.
18. Load 8 μL of each sample and 5 μL of Prestained Protein Size Marker onto a 17% SDS-PAGE gel. Run the gel in SDS running buffer at 100 V for 20 min.
19. Stain the SDS-PAGE gel with GelRed following the manufacturer's protocol. This step stains free as well as lectin-conjugated DNA barcodes.
20. Stain the SDS-PAGE gel with silver staining reagents following the manufacturer's protocol. The gel used for GelRed staining can be reused for silver staining.
21. Recover the elution fractions and dialyze the purified DNA-barcoded lectins against 0.1× PBS (for dialysis) using Tube-O-Dialyzer, Medi 8kD.
22. Concentrate the DNA-barcoded lectins using a centrifugal filter (Amicon ultra 0.5 mL 10K) with a 10 kDa molecular weight cut-off.
23. Quantify the protein and DNA concentrations using the Bradford assay and the Quant-iT OliGreen ssDNA Reagent Kit, respectively, and determine the DNA-to-lectin ratio.
24. Mix the 41 DNA-barcoded probes (5 μg/mL final concentration for each lectin) (Table 1) in a 1.5 mL tube and fill up to 100 μL with PBS/BSA.
25. Store at −30 °C.
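The DNA-to-lectin ratio in step 23 is a molar ratio, so the mass concentrations from the Bradford and OliGreen assays have to be divided by the respective molecular weights before they are compared. The following is a minimal Python sketch of that conversion. The function name, the example concentrations, and both molecular weights are hypothetical placeholders and are not values from this protocol.

```python
def dna_to_lectin_ratio(dna_ug_per_ml, dna_mw_da, lectin_ug_per_ml, lectin_mw_da):
    """Return mol of DNA barcode per mol of lectin.

    Mass concentration (ug/mL) divided by molecular weight (g/mol) gives a
    molar concentration in umol/mL; the unit cancels in the final ratio.
    """
    dna_molar = dna_ug_per_ml / dna_mw_da
    lectin_molar = lectin_ug_per_ml / lectin_mw_da
    return dna_molar / lectin_molar


# Hypothetical example: 10 ug/mL of a 20 kDa barcode and 25 ug/mL of a 50 kDa lectin
ratio = dna_to_lectin_ratio(
    dna_ug_per_ml=10.0, dna_mw_da=20_000,
    lectin_ug_per_ml=25.0, lectin_mw_da=50_000,
)
print(f"DNA-to-lectin molar ratio: {ratio:.2f}")
```

With real measurements, substitute the molecular weight of each barcode oligonucleotide and of each lectin, since both differ between probes.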
Note: Any lectin can be used for scGR-seq, but we recommend checking that the lectins do not react with each other, using assays such as lectin blotting. We labeled 41 probes with DNA barcodes (Table 1), which cover a wide range of glycans such as sialylated, galactosylated, mannosylated, GlcNAcylated, and fucosylated glycans.
Caution: Some lectins are eluted in the wash fractions. In this case, recover the wash fractions and use them for the experiments.
Note: Dissolve Tris-HCl and NaCl in 500 mL of MilliQ water, adjust the pH to 4.0 with acetic acid, and fill up to 1 L with MilliQ water.

KEY RESOURCES
Note: Fill up to 10 mL with MilliQ.
Note: Fill up to 10 mL with MilliQ.
Note: Fill up to 1 L with MilliQ.
Note: Fill up to 1 L with MilliQ.
Note: Wash Sephadex G-25 fine with TBS, dispense 0.8 mL of Sephadex G-25 fine into the column, and store it at 4 °C.
Note: Filter the reagent using a 0.22 μm PVDF membrane and store it at 4 °C.
Note: Aliquot and store at −20 °C; however, it can be stored at 2 °C-8 °C for up to 2 weeks if not used immediately.

STEP-BY-STEP METHOD DETAILS
Cell culture of hiPSCs
Timing: 3-4 days
CRITICAL: Perform all cell culture experiments inside a biosafety cabinet, and wear personal protective equipment, including gloves and goggles.
1. Coat each well of a 6-well plate with 1 mL of Matrigel and let it sit at room temperature (RT) for 1 h.
2. Thaw the mTeSR Plus medium at 37 °C for 5-15 min. Plate the appropriate number of 201B7 hiPSCs in a 6-well plate containing 2 mL of the mTeSR Plus medium.
3. Culture the cells for 2-3 days in a CO2 incubator (with the CO2 level set to 5%).
4. Recover the cells with gentle cell dissociation reagent and resuspend them in the mTeSR Plus medium supplemented with 10 μM Y-27632.
Pause point: hiPSCs can be suspended in mFreSR Cryopreservation Medium and stored in liquid nitrogen.

Generation of hiPSC-derived neural progenitor cells (NPCs)
Timing: 11 days
5. Thaw STEMdiff™ Neural Induction Medium and STEMdiff™ SMADi Neural Induction Supplement at room temperature (15 °C-25 °C) or overnight at 2 °C-8 °C. Swirl both thoroughly.
6. Add 0.5 mL of STEMdiff™ SMADi Neural Induction Supplement to 250 mL of STEMdiff™ Neural Induction Medium (NIM). Mix thoroughly and warm to room temperature before use.
7. Coat each well of a 6-well plate with 1 mL of Matrigel and let it sit at room temperature for 1 h.
8. Wash the well containing the cultured hiPSCs with 2 mL of phosphate-buffered saline (PBS).
9. Add 1 mL of gentle cell dissociation reagent and incubate at 37 °C for 8-10 min.
10. Pipet up and down 3-5 times to dissociate the cell aggregates. Collect the cells in a 15 mL conical tube.
11. Wash the plate with 2 mL of PBS and collect the remaining cells into the 15 mL conical tube.
12. Count the viable cells with a hemocytometer using the Trypan blue dye method.
13. Centrifuge the 15 mL conical tube at 300 × g for 4 min. Aspirate and discard the supernatant without disturbing the cell pellet.
14. Resuspend the cell pellet in NIM supplemented with 10 μM Y-27632 to achieve a final concentration of 1 × 10^6 cells/mL.
15. Aspirate the Matrigel from a 6-well plate and add 2 mL of the cell suspension (2 × 10^6 cells/well) into a single well of the Matrigel-coated plate.
16. Incubate the cells at 37 °C, 5% CO2 for 11 days. Change the medium every other day using fresh NIM without Y-27632.
17. Aspirate the medium from the plates, add 1 mL of Accutase per well, and incubate at 37 °C, 5% CO2 for 5 min.
18. Collect the cell suspension in a 15 mL conical tube. Wash the plates with 2 mL of pre-warmed DMEM/F12 medium, and collect the residual cells in the same tube. Centrifuge the cell suspension at 300 × g for 4 min. Resuspend the resulting cell pellet in NIM and use it to evaluate the differentiation state by qRT-PCR and fluorescence staining of hiPSC markers (POU5F1) and NPC markers (SOX1, NESTIN, PAX6, FOXG1).
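Steps 12 to 15 imply a short calculation: the viable cell count determines how much NIM is needed to reach 1 × 10^6 cells/mL, which gives 2 × 10^6 cells in the 2 mL plated per well. The Python sketch below illustrates it under stated assumptions: a standard hemocytometer in which each large square holds 10^-4 mL, a 1:1 dilution with Trypan blue, and about 3 mL of collected suspension (1 mL of dissociation reagent plus the 2 mL PBS wash). The function names and the example counts are hypothetical.

```python
def viable_cells_per_ml(viable_counts, dilution_factor=2.0):
    """Concentration from viable-cell counts of hemocytometer large squares.

    Each large square of a standard chamber holds 1e-4 mL, hence the 1e4 factor.
    dilution_factor=2.0 assumes a 1:1 mix of cell suspension and Trypan blue.
    """
    mean_count = sum(viable_counts) / len(viable_counts)
    return mean_count * dilution_factor * 1e4


def resuspension_volume_ml(total_viable_cells, target_cells_per_ml=1e6):
    """Volume of NIM needed to bring the pellet to the target concentration."""
    return total_viable_cells / target_cells_per_ml


counts = [52, 48, 50, 50]             # hypothetical viable cells per large square
concentration = viable_cells_per_ml(counts)
total_cells = concentration * 3.0     # assumed 3 mL of collected suspension
volume = resuspension_volume_ml(total_cells)
print(f"{concentration:.2e} cells/mL, resuspend in {volume:.1f} mL of NIM")
```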

Pause point: hiPSC-derived NPCs can be suspended in mFreSR Cryopreservation Medium and stored in liquid nitrogen.

Single cell glycan-seq
Timing: 2 days
CRITICAL: Contamination with DNA and DNase will significantly affect the experiments. Perform all experiments inside a biosafety cabinet in the dark, and wear personal protective equipment, including gloves and goggles. Take extreme care while handling all the reagents to prevent contamination with DNA and DNase. Prepare and dispense all reagents on ice unless otherwise stated.
CRITICAL: Be careful not to touch the cells with the pipette tip. You can leave 0.5 μL of supernatant in the tube.

Incubate the cells with
Pause point: The supernatant can be stored at −80 °C.
23. Add 2.5 μL of cell lysis buffer to the tube and cap the tube.
24. Spin down and store at −80 °C for single-cell RNA-seq (see the next section, "Single cell RNA-seq").
25. Perform PCR to amplify the DNA barcodes for sequencing.
a. Prepare the PCR mix as follows: each sample has a total reaction volume of 25 μL, comprising 9.75 μL of supernatant (template), 12.5 μL of NEBNext UltraII Q5 Master Mix, 0.25 μL of i5 index primer (100 μM), and 2.5 μL of i7 index primer (10 μM).
b. Perform the PCR reactions as follows.
a. Combine 16 samples into a 1.5 mL microtube.
b. Add 320 μL of AMPure and gently pipette the contents 10 times.
c. After incubation at RT for 5 min, place the tube on the magnetic stand for 2 min.
d. Discard the supernatant without disturbing the magnetic beads on the magnetic stand.
e. Add 1 mL of 80% ethanol to the 1.5 mL microtube and incubate at RT for 30 s on the magnetic stand. Discard the supernatant and repeat the washing step three times.
f. Air-dry the magnetic beads at RT for 10 min.
CRITICAL: When the beads are dry, their color becomes lighter. If the beads are overdried, the DNA becomes difficult to elute.
g. Remove the 1.5 mL microtube from the magnetic stand and add 160 μL of 10 mM Tris (pH 8.5).
h. Gently pipette the contents 10 times and incubate at RT for 2 min.
i. Place the tube on the magnetic stand and carefully collect the supernatant into a 15 mL tube.
Transfer the column to a 1.5 mL microcentrifuge tube and centrifuge at 20,400 × g for 30 s at 4 °C to elute the DNA.
28. Use 6.5 μL of the elution fraction to analyze the size and quantity of the PCR products using the microchip electrophoresis system (MultiNA with DNA-500 kit) according to the manufacturer's instructions.

PCR cycling conditions
Pause point: DNA library can be stored at −20 °C.
Alternatives: For the analysis of the size and quantity of the PCR products, Agilent Bioanalyzer or Agilent TapeStation could be considered.
29. Denaturing library DNA
(for > 4 nM of library DNA)
a. Dilute each library DNA to 4 nM with nuclease-free water and mix the libraries in equal amounts.
b. Mix 4 μL of the library DNA mixture of all the samples with 4 μL of 0.1 N NaOH. Mix briefly by vortexing and spin down.
c. Incubate at RT for 5 min and keep on ice.
d. Add 4 μL of the 2 nM library DNA to 796 μL of pre-chilled HT1 for a total of 800 μL (10 pM library DNA). Mix briefly by vortexing and spin down.
(for < 4 nM of library DNA)
e. Mix each library DNA in equal amounts.
f. Mix 2 μL of the library DNA mixture of all the samples with 2 μL of 0.1 N NaOH. Mix briefly by vortexing and spin down.
g. Incubate at RT for 5 min and keep on ice.
h. Add 2 μL of 200 mM Tris-HCl (pH 7.0). Mix briefly by vortexing and spin down.
i. Add 6 μL of library DNA to 534 μL of pre-chilled HT1 for a total of 540 μL of library DNA. Mix briefly by vortexing and spin down.
30. Denaturing PhiX
a. Mix 1 μL of 10 nM PhiX, 4 μL of nuclease-free water, and 5 μL of 0.1 N NaOH for a total of 10 μL of 1 nM PhiX. Mix briefly by vortexing and spin down.
b. Incubate at RT for 5 min and chill on ice.
c. Mix 2 μL of 1 nM denatured PhiX with 248 μL of pre-chilled HT1 for a total of 250 μL (8 pM PhiX). Mix briefly by vortexing and spin down.
31. Mix 540 μL of library DNA (step 29) with 130 μL of 8 pM PhiX (step 30).
32. Heat at 96 °C for 2 min, then immediately chill on ice and incubate for 5 min.
33. Load 600 μL of the library mix into the reagent cartridge of the MiSeq Reagent kit and set up the run according to the manufacturer's instructions.
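The concentrations quoted in steps 29 and 30 follow from the dilution relation C1 × V1 = C2 × V2. As a quick check, the Python sketch below recomputes them from the volumes given in those steps. The helper name is illustrative, and the volume unit is arbitrary as long as the same unit is used for the stock and the total volume.

```python
def diluted_concentration(stock_conc, stock_volume, total_volume):
    """C2 = C1 * V1 / V2; the result has the same unit as stock_conc."""
    return stock_conc * stock_volume / total_volume


# Step 29b: 4 volumes of 4 nM library + 4 volumes of NaOH -> 2 nM
library_after_naoh_nm = diluted_concentration(4.0, 4, 8)

# Step 29d: 4 volumes of 2 nM (2000 pM) library in a total of 800 -> 10 pM
library_final_pm = diluted_concentration(2_000.0, 4, 800)

# Step 30a: 1 volume of 10 nM PhiX in a total of 10 -> 1 nM
phix_after_naoh_nm = diluted_concentration(10.0, 1, 10)

# Step 30c: 2 volumes of 1 nM (1000 pM) PhiX in a total of 250 -> 8 pM
phix_final_pm = diluted_concentration(1_000.0, 2, 250)

print(library_after_naoh_nm, library_final_pm, phix_after_naoh_nm, phix_final_pm)
```

The same helper can be reused when a library is loaded at a different final concentration, by solving for the stock volume instead.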

Single cell RNA-seq
Timing: 1-2 days
CRITICAL: Contamination with RNase and DNA will significantly affect the experiments. Perform all experiments inside a biosafety cabinet in the dark, and wear personal protective equipment, including gloves, masks, and goggles. Wipe all instruments used for the experiment (pipette, centrifuge, mixer, thermal cycler, laboratory bench) with an RNase remover, RnaseAway. Take extreme care while handling all the reagents to prevent contamination with RNase. To keep RNase inactive and maintain enzyme activity, prepare and dispense all reagents on ice, unless otherwise stated.
34. Prepare the cDNA library from single cells using a full-length total RNA-sequencing method, random displacement amplification sequencing (RamDA-seq), according to the manufacturer's instructions.
35. Quantify the library DNA from individual samples derived from single cells using the microchip electrophoresis system (MultiNA with DNA-12000 kit) according to the manufacturer's instructions. A band of 150-600 bp will appear if the DNA library is constructed successfully.
36. Pool and mix each library DNA and transfer 50-100 fmol into a 1.5 mL tube.
37. Sequence the mixed library DNA using a next-generation sequencer such as NovaSeq 6000 according to the sequencer guidelines.

EXPECTED OUTCOMES
In a successful scGR-seq experiment, the total DNA barcode count is approximately 5,000.
In the case of scRNA-seq, approximately 10,000 genes should be detected. In UMAP, hiPSCs and NPCs are separated into two clusters based on Glycan-seq and RNA-seq data (Figure 3). The hiPSC-specific lectin rBC2LCN shows higher binding to hiPSCs than to NPCs (Figure 4A). In contrast, rBanana shows higher binding to NPCs than to hiPSCs (Figure 4A). hiPSCs show higher expression of hiPSC-specific genes such as NANOG and POU5F1 (Figure 4B); in comparison, NPCs show higher expression of NPC marker genes such as NES (NESTIN), PAX6, and SOX1 (Figure 4B).

QUANTIFICATION AND STATISTICAL ANALYSIS
Preprocessing of data
The Glycan-seq readout in FASTQ format is processed with our in-house software, the Barcode DNA counting system (Mizuho Information & Research Institute, Inc., Tokyo, Japan), which is available on GitHub (https://github.com/bioinfo-tsukuba/barcode-dna-counting-system).
In this system, each read sequence is aligned with the DNA-barcode reference that corresponds to each lectin. Up to two mismatches in the flanking region and one mismatch in the middle region are tolerated. The output is the DNA barcode count for each cell. Each lectin count is normalized to the total DNA barcode count and expressed as % of the total count.
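To make the counting and normalization logic above concrete, here is a minimal Python sketch. It is not the Barcode DNA counting system itself. The read layout (a constant 5' flank, a lectin-specific middle barcode, a constant 3' flank), all sequences, and the choice to apply the two-mismatch limit to both flanks combined are assumptions made for illustration. Only the lectin names and the mismatch tolerances come from the text.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_lectin(read, flank5, flank3, barcodes, max_flank_mm=2, max_mid_mm=1):
    """Return the lectin whose barcode matches the read, or None."""
    mid_len = len(next(iter(barcodes.values())))
    if len(read) != len(flank5) + mid_len + len(flank3):
        return None
    start, end = len(flank5), len(flank5) + mid_len
    # Assumption: the flank tolerance applies to both flanks combined
    flank_mm = hamming(read[:start], flank5) + hamming(read[end:], flank3)
    if flank_mm > max_flank_mm:
        return None
    hits = [name for name, seq in barcodes.items()
            if hamming(read[start:end], seq) <= max_mid_mm]
    return hits[0] if len(hits) == 1 else None  # discard ambiguous reads


def percent_of_total(counts):
    """Normalize each lectin count to % of the total barcode count of a cell."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical flanks and barcode sequences for two lectins named in this protocol
FLANK5, FLANK3 = "GGAT", "CCTA"
BARCODES = {"rBC2LCN": "ACGTACGT", "rBanana": "TTGGCCAA"}
reads = [
    "GGATACGTACGTCCTA",  # exact match to rBC2LCN
    "GGATACGTACGACCTA",  # one mismatch in the middle region
    "GCATTTGGCCAACCTA",  # one mismatch in the 5' flank, rBanana barcode
    "GGATAAAAAAAACCTA",  # middle region matches no barcode
]
assigned = [assign_lectin(r, FLANK5, FLANK3, BARCODES) for r in reads]
counts = Counter(name for name in assigned if name is not None)
percent = percent_of_total(counts)
print(dict(counts), percent)
```

Reads that match more than one barcode within the tolerance are dropped here rather than assigned arbitrarily; the actual software may resolve such cases differently.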

Quality control
Extremely low counts may bias scGlycan-seq data, so it is necessary to check whether the total barcode count influences component 1 or component 2 obtained from principal component analysis (Figure 5A). If a total count-dependent bias is detected, a cut-off value for the total barcode count is determined by Otsu's method (Otsu, 1979) using R with the tidyverse packages, as follows (Figure 5B).

3. Perform binarization in log10-transformed total count data with Otsu's method.
4. Visualize the binarized total count data.
In scRNA-seq data, cells with low-quality data are identified by several parameters, such as read count, gene count, or mitochondrial read ratio, with the Seurat R package (version 4.02). Dead or damaged cells exhibit a low read count and an increased mitochondrial read ratio, whereas aggregated cells show an abnormally high read count. Since the optimum cut-off value for excluding low-quality cells depends on cell type, read depth, and read quality, it needs to be determined for each data set.
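The Otsu cut-off used for scGlycan-seq quality control can be sketched as follows. This is a Python and NumPy illustration rather than the R and tidyverse code used in this protocol: it log10-transforms the total barcode counts, then picks the histogram split that maximizes the between-class variance. The function name, the bin number, and the simulated counts are assumptions chosen for the example.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return the cut-off that maximizes the between-class variance (Otsu, 1979)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    prob = hist / hist.sum()
    w0 = np.cumsum(prob)                  # weight of the low class for each split
    w1 = 1.0 - w0                         # weight of the high class
    cum_mean = np.cumsum(prob * centers)
    total_mean = cum_mean[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    # Splits that leave one class empty are undefined; score them as zero
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    k = int(np.argmax(between))
    return float(edges[k + 1])            # upper edge of the last low-class bin


# Simulated total barcode counts: 30 low-count cells and 170 well-covered cells
rng = np.random.default_rng(0)
log_low = rng.normal(2.0, 0.1, 30)
log_high = rng.normal(3.7, 0.1, 170)
total_counts = np.round(10 ** np.concatenate([log_low, log_high]))

log_counts = np.log10(total_counts)
cutoff = otsu_threshold(log_counts)
keep = log_counts > cutoff                # cells retained for downstream analysis
print(f"cut-off: {10 ** cutoff:.0f} counts, cells kept: {int(keep.sum())}")
```

When the two populations are well separated, every split inside the gap scores equally and the sketch returns the lowest one, just above the low-count cells. Plotting the binarized counts, as in step 4, shows whether that cut-off is sensible for a given data set.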

Integrated data analysis
The Seurat R package performs dimensionality reduction, cellular clustering, and identification of differential gene expression and thus can be used to analyze scGlycan-seq and scRNA-seq data. The Seurat platform also supports the integrated analysis of scGlycan-seq and scRNA-seq based on the weighted-nearest neighbor (WNN) workflow. A detailed protocol for the Seurat R package is described in Hao et al. (2021).

LIMITATIONS
Like flow cytometry and lectin microarray, absolute amounts of glycans and accurate glycan structures cannot be determined directly from the signal intensities described above. Another limitation of the current system is the throughput. Since scGR-seq is a plate-based platform, processing of cell numbers is limited to hundreds of cells, while it can perform full-length total RNA sequencing (

Problem 1
The molar ratio of DNA barcode relative to lectin is too low (''purification of DNA-barcoded lectins'' step 23).

Potential solution
Increase the amount of PC-DBCO-NHS and incubation time.

Potential solution
Cells might be agglutinated during incubation with DNA-barcoded lectins. The cell aggregates can be removed using a 100 μm filter. In manual picking, single cells can be selected by visual inspection. In FACS, gating with FSC-H and FSC-W can exclude aggregated cells from the analysis.

Problem 3
No band is detected when the DNA barcodes obtained from each single cell are run on the microchip electrophoresis system, MultiNA ("single cell glycan-seq" step 28).

Potential solution
It is recommended to include bulk samples (1 × 10^4 cells) as a positive control to validate the PCR reactions. If the band is detected only in bulk samples, the amount of DNA barcodes obtained from each single cell is likely too low. Even in that case, the barcodes may be detectable with a next-generation sequencer such as MiSeq if you follow the part of step 29 (Denaturing library DNA) for library DNA concentrations of < 4 nM.

Potential solution
Keep the experimental space clean to prevent degradation of RNA by RNase contamination.
Confirm that the ethanol has dried off after the washing steps with AMPure beads, since residual ethanol may inhibit the subsequent reactions.
Increase the number of PCR cycles.
Exclude low-yield samples from the next-generation sequencing analysis.
Include the whole volume of low-yield samples in the mixed cDNA library.

Problem 5
A high amount of primer dimers is detected around 110-130 bp when the cDNA library obtained from each single cell is run on the microchip electrophoresis system, MultiNA ("single cell RNA-seq" step 35).

Potential solution
It is recommended to remove primer dimers by size selection of DNA fragments with AMPure XP beads, because primer dimers compete with cDNA for binding to the flow cell in the next-generation sequencer. Adding 1.0-1.2 volumes of AMPure XP relative to the PCR reaction solution is sufficient to remove 110-130 bp fragments. Note that this selection step may slightly reduce short cDNA fragments of around 150 bp.

RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Hiroaki Tateno (h-tateno@aist.go.jp).

Materials availability
Recombinant lectins are available from FUJIFILM Wako Pure Chemical Corporation or the lead contact upon request.

Data and code availability
The code of the barcode DNA counting system is available from GitHub (https://github.com/bioinfo-tsukuba/barcode-dna-counting-system).