Sensitive and reproducible cell-free methylome quantification with synthetic spike-in controls

Summary Cell-free methylated DNA immunoprecipitation sequencing (cfMeDIP-seq) identifies genomic regions with DNA methylation, using a protocol adapted to work with low-input DNA samples and with cell-free DNA (cfDNA). We developed a set of synthetic spike-in DNA controls for cfMeDIP-seq to provide a simple and inexpensive reference for quantitative normalization. We designed 54 DNA fragments with combinations of methylation status (methylated and unmethylated), fragment length (80 bp, 160 bp, 320 bp), G + C content (35%, 50%, 65%), and fraction of CpG dinucleotides within the fragment (1/80 bp, 1/40 bp, 1/20 bp). Using 0.01 ng of spike-in controls enables training a generalized linear model that absolutely quantifies methylated cfDNA in MeDIP-seq experiments. It mitigates batch effects and corrects for biases in enrichment due to known biophysical properties of DNA fragments and other technical biases.


In brief
Wilson et al. developed synthetic DNA controls for normalizing cell-free methylation DNA immunoprecipitation sequencing (cfMeDIP-seq) assays used in liquid biopsies. These spike-in controls allow absolute quantification of methylated cell-free DNA in picomoles rather than arbitrary read counts. They also adjust for batch effects and reduce systematic noise related to technical biases.

INTRODUCTION
Cell-free methylated DNA immunoprecipitation sequencing (cfMeDIP-seq) identifies DNA methylation using low-input samples of cell-free DNA (cfDNA). This method detects DNA methylation patterns reflective of distinct cancer types from circulating tumor DNA, which arises from tumor cells shedding DNA into an individual's blood (Shen et al. 2018. cfMeDIP-seq proves ideal when assessing peripheral blood plasma from cancer patients, where one may obtain only a small amount of circulating tumor DNA and no indication of from where the circulating tumor DNA originates. Sequencing assay methods, such as cfMeDIP-seq or RNA sequencing (RNA-seq), require a control for more accurate quan-titative comparison across samples and batches. Reference controls for sequencing assays have consisted of spike-in reference DNA or RNA fragments of known sequence (Jiang et al., 2011;Chen et al., 2016;Orlando et al., 2014;Deveson et al., 2016;Blackburn et al., 2019). Without spike-in controls, one must assume that a given amount of assayed material produces equal DNA or RNA yields in different experimental conditions, and that this also holds true across all genomic regions (Chen et al., 2016). By normalizing quantification to a known amount of spike-in DNA added into a sample, we can overcome this assumption, leading to more accurate results (Chen et al., 2016). When carefully designed, spike-in controls can adjust for specific technical biases.
In addition to adjusting for technical biases, spike-in controls act as experimental standards for quality control. The addition MOTIVATION For robust quantitative comparisons between samples, immunoprecipitation enrichment methods such as cfMeDIP-seq require normalization against common reference controls. Common reference controls can correct for technical variation in the processing of different samples. They can also correct for biases in enrichment due to known biophysical properties of DNA fragments such as fragment length, G + C content, and fraction of CpG dinucleotides. Furthermore, common reference controls can provide an experimental standard for quality control. We sought to provide synthetic spike-in DNA fragments that would fulfill these reference control purposes in cfMeDIP-seq experiments.
of spike-in controls dramatically changes the interpretation of RNA-seq, chromatin immunoprecipitation sequencing, and other genomic assay results (Jiang et al., 2011;Chen et al., 2016;Orlando et al., 2014;Deveson et al., 2016;Blackburn et al., 2019). As such, all quantitative genome-wide assays would benefit from the addition of spike-in controls (Chen et al., 2016).
The most common approach to normalizing sequencing assay data consists in dividing the number of reads at each genomic region by the total number of reads genome-wide. This approach addresses technical variance owing to sequencing depth, but it can mask differences in biological variables of interest. Normalizing data to a known amount of spike-in DNA for each sample allows for more accurate detection of differences and adjustment of biophysical properties of DNA fragments that can influence results (Orlando et al., 2014;Chen et al., 2016).
While other genomic assays have long utilized spike-in controls, methods measuring genome-wide DNA methylation have rarely used them. The previous cfMeDIP-seq protocol uses methylated and unmethylated Arabidopsis thaliana DNA as a spike-in control to assess immunoprecipitation and binding efficiency to methylated DNA . This approach cannot correct for properties of specific methylated DNA fragments likely to influence results, such as G + C content, fragment length, and CpG fraction. Other controls used in DNA methylation enrichment methods include setting aside and sequencing a portion of input DNA without the enrichment procedure (Plongthongkum et al., 2014). This provides a reference point to assess enrichment of DNA methylation overall. Sequencing input DNA, however, cannot adjust for properties of individual fragments that can affect enrichment. Spike-in controls for bisulfite conversion methods for assaying DNA methylation (Liu et al., 2019;Olova et al., 2018) have no use in bisulfite-free enrichment methods, such as cfMeDIP.
Here, we introduce new synthetic DNA spike-in controls for cfMeDIP-seq. Our synthetic spike-ins can measure how DNA fragment properties, such as length, G + C content, and number of CpGs, can affect the number of reads produced by some known amount of fragment. Our spike-in controls assess nonspecific binding, an integral part of cfMeDIP-seq analysis. The spike-in controls also mitigate technical effects such as experiments performed by different labs. We also use unique molecular indices (UMIs) to adjust for PCR bias. To calculate methylation specificity, we compare methylated fragments with unmethylated fragments after cfMeDIP-seq. We add a known molar amount of spike-in controls to each sample. With this information, we apply a generalized linear model to calculate molar amount, accounting for fragment length, G + C content, and CpG fraction. These spike-in controls generate an absolute quantitative measure of methylated DNA, allowing for more robust comparisons between samples and experiments.

Spike-in controls confirm cfMeDIP's high efficiency and specificity
We performed cfMeDIP-seq directly on the synthetic spike-in control fragments ( Figure 1A, left; Table S1). For each sample, we saved 10% of the mass before performing cfMeDIP-seq.
This acts as an input control. In input samples, we observed 51% of the input fragments methylated and 49% unmethylated. After cfMeDIP, 97% of the output fragments were methylated (Figures 2A and 2B; Table S2). The enrichment for methylated sequences further supports the validity and high efficiency of cfMe-DIP-seq.
After cfMeDIP, signal from methylated fragments for both the synthetic spike-in control alone (97% of spike-in control fragments) and 10 ng of HCT116 colon cancer cell line DNA with 0.01 ng of spike-in (97% of spike-in control fragments) showed an enrichment of 160 bp fragments. We expected this enrichment, owing to our size selection step for insert size fragments between 80 bp and 380 bp. We designed fragments outside these insert sizes (1) to assess the efficiency of size selection and (2) for use in experiments without size selection. We also observed enrichment of fragments with higher G + C content and higher CpG fraction (Figures 2B and 2C; Table S2), and observed depletion in fragments with fewer CpGs. Immunoprecipitation less often enriches these fragments, methylated or unmethylated. Our spike-in controls allow us to account for this bias, facilitating the accurate interpretation of cfMeDIP-seq data.
Signal from unmethylated fragments for both the synthetic spike-in control alone (3% of spike-in control fragments) and 10 ng of HCT116 DNA with 0.01 ng of spike-in (3% of spike-in control fragments) had no association with fragment lengths, G + C content, or CpG fraction (Figures 2B and 2C; Table S2). This suggests that the low amount of unmethylated fragment signal arose from random non-specific binding and that it was not confounded systematically with fragment length, G + C content, or CpG fraction.
We also observed similar results in the acute myeloid leukemia (AML) samples. These samples had more unmethylated reads and lower methylation specificity in Lab C (Table S3). The lower methylation specificity arose from not using the methylated filler specified in the original protocol (Shen et al. 2018. We assessed the range of number of sequenced reads where we had alternative sets of 320 bp fragments with different G + C contents and CpG fractions. Because no unmethylated fragment yielded more than 20 reads in any experiment, we focused our assessment on methylated fragments. The similar numbers of reads sequenced for alternative fragments with the same fragment properties show the robustness of the spike-in controls (Table 1).
Even in the fragments with the largest range between alternatives (50% G + C content, 1/20 bp CpG fraction), this range represents only %4% of the total number of sequenced reads per sample. This difference may represent unknown fragment properties that influence cfMeDIP-seq results. The other sets of alternative fragments had even smaller ranges between alternatives.

Low-input spike-in control accounts for technical variance
To determine the optimal amount of spike-in control DNA to add to each sample, we assessed the proportion of reads used toward our spike-in controls. We compared spike-in control reads with the total number of reads used for our biological sample, HCT116 genomic DNA ( Figure 1A, right). We optimized the amount of spike-in controls to be used in subsequent experiments. This allowed us to maximize reads to our biological sample of interest while obtaining sufficient reads from the spike-in controls to correct for technical bias. Adding in 0.01 ng of our synthetic spike-in control DNA before cfMeDIPseq used <1% of the reads for the spike-in controls. We retained >650,000 reads of spike-in control sequence for analysis while leaving the rest of the reads for our biological sample. Therefore, we decided to use 0.01 ng of spike-in control fragments in subsequent experiments.
The 0.01 ng of spike-in control DNA added to 10 ng of HCT116 genomic DNA revealed R97% specificity of methylated DNA with %0.01% total non-specific binding to non-methylated fragments ( Figure 2C and Table S2). We calculated methylation specificity by dividing the total number of methylated fragments by the total number of spike-in control fragments. The cfMeDIP process also enriched for the 160 bp fragments we physically size selected for and for higher G + C content. Fragments that have CpG present at only 1/80 bp in the experiment with 10 ng of HCT116 with 0.01 ng of spike-in control DNA were represented by %1% of reads ( Figure 2C and Table S2). The same patterns persisted when spiking in 0.05 ng and 0.1 ng of spike-in control DNA into 10 ng of HCT116 cell line.
Removing problematic regions eliminates some technical artifacts From our normalized data, we removed regions containing simple repeats, the Encyclopedia of DNA Elements (ENCODE) Proj-ect (ENCODE Project Consortium 2012) blacklist (Amemiya et al., 2019) regions, and regions with mappability scores of %0.5. We noticed that many regions with high molar amount also had high standard deviation of molar amount between replicate samples. Thus, we removed regions where standard deviation of molar amount between replicates was R0.05.
After removing the high standard deviation regions, we observed no relationship between molar amount and standard deviation of molar amount, and no relationship between molar amount and mappability ( Figure 3). We further examined the 11 Figure 2. Assessing biases in fragment length, G + C content, and CpG fraction (A-C) In (A) input spike-in control DNA without cfMeDIP-seq, (B) output spike-in control DNA, after cfMeDIP-seq, and (C) 0.01 ng spike-in control DNA added to HCT116 replicates. Blue, methylated fragments; gray, unmethylated fragments. Circle, sample 1; triangle, sample 2. Solid line, mean of the two samples. Columns marked with numerals 1 and 2 represent alternative sets of fragments with identical properties but different sequences. See also Table S2. genomic windows with the highest estimated molar amounts (R0.25 pmol; Figures 3A and 3B). Of these 11 windows, seven overlapped repetitive elements (Table 2). Of those seven, six overlapped with Alu elements. This suggests that removing regions that have high standard deviation of molar amount between replicates can reduce technical artifacts. Regions with high standard deviation of molar amount between replicates perform inconsistently, leading to inaccurate quantification of DNA methylation.

Absolute quantification correlates with M-values
We compared molar amount with M-values from the EPIC array ( Figure 1A, right). We used our generalized linear model to calculate the molar amount of cfMeDIP-seq methylated DNA fragments for each 300 bp genomic window. Molar amount significantly correlated with array M-values over 300 bp in our HCT116 genomic DNA samples (r R 0.62; Figures 4A, 4C, 4E, and 4G). Correlation increased when we restricted analyses to 300 bp windows with R5 CpG probes on the array (r = 0.82; Figure 4C). This reflected cfMeDIP-seq's preference for methylated, CpG-dense reads.
We compared the current standard of read counts with M-values ( Figures 4B, 4D, 4F, and 4H). Molar amount correlated with M-values similarly to read counts, but provided the advantage of absolute quantification.

Spike-in controls accurately predict molar amount
To assess the accuracy of models built on spike-in controls, we trained models using only some spike-ins as training data, testing on the held-out spike-ins. The mean absolute error between known molar amount and predicted molar amount was %0.002 pmol in every case. Adding more spike-in controls to the training data further increased model accuracy ( Figure 5 and Table S4). While the largest training sets had the lowest median mean absolute error, they had larger variance in mean absolute error over 100 iterations. The increased variance likely derived from the increased noise from using smaller test sets, down to a single test spike-in used for the case with 25 training spike-ins.

Spike-in controls significantly mitigate batch effects
To determine whether our spike-in controls mitigate batch effects for a clinical test, we provided aliquots of cfDNA samples from five AML patients to three different labs ( Figure 1B). Each lab performed cfMeDIP-seq on each of the five samples with slight variations. We preprocessed each batch as described above. This yielded three models (Table S5).
We performed principal component analysis (PCA) to assess whether any batch variables drive any of the top principal components. For raw read count data without spike-in controls, principal component 1 explains 80% of the variance and showed DNA methylation changes non-significantly associated with processing batch, filler (methylated or unmethylated), and adapter. For the raw reads, principal component 3, explaining 5% of the variance, was significantly associated with batch ( Figure 6A).
Quantitative sequencing enrichment analysis (QSEA) normalization increased the variance explained by principal component 1 to 81%. After QSEA normalization, principal component 1 became significantly associated with filler. Principal component 1 was also associated non-significantly with batch and adapter ( Figure 6B). The correlation of principal component 1 with filler DNA only appeared after QSEA normalization, suggesting that QSEA may have introduced spurious variance into the data.
Using spike-in controls reduced the variance explained by principal component 1 to 78%, which shows a non-significant effect size change in DNA methylation associated with batch variables. Similar to the raw data, we still see association with the batch variable in principal component 3, explaining 6% of the variance ( Figure 6C).
Using our suggested filtering, which includes removing simple repeats, ENCODE blacklist regions, and regions with low mappability, improves variance associated with batch variables in the data. Upon removing these regions, batch and filler was associated with principal component 2, explaining 6% of the variance. These variables were not confounded with the known non-technical, biological variables-sample and sex-that one would want to associate with the top principal components. Principal component 1, explaining 83% of the variance, was associated with sample and sex more than other variables, though not significantly ( Figure 6D). These changes in DNA methylation associated with biological variable were absent in principal component 1 of either the raw data ( Figure 6A) or the QSEA data ( Figure 6B). Removing the confounding of these changes with batch variables aids their use for assessing biological variance of interest.
We further investigated principal component 2 using spike-in controls and filtering. Examining the top 10% of windows (N = 258,254) driving the explained variance, 71% matched to RepeatMasker (Smit et al., 2010) repetitive elements. Of the To examine whether spike-in controls reduce technical variance, we performed sample-sample correlations in the raw read count data and predicted molar amount using spike-in controls. For each of the five AML samples, we compared the Lab B and Lab C data against Lab A. This generated 10 correlations for raw read count data and a paired 10 correlations for molar amount. We expect technical replicates to have correlations closer to 1. In all 10 of the technical replicate pairings examined, the molar amounts performed better than the raw read count data (Table S6).

DISCUSSION
The data presented here establish the validity of using our synthetic spike-in control DNA to absolutely quantify cfDNA in cfMe-DIP-seq experiments. We showed that technical bias exists in the cfMeDIP-seq data and that the use of our spike-in controls helps mitigate these biases. For example, Lab C used unmethylated lambda filler DNA, but the spike-in controls successfully mitigated the effect of this change. While using spike-in controls significantly mitigates batch effects compared with QSEA, the variance in the data explained by batch variables remains similar between spike-in controls and raw data. Major advantages of using spike-in controls over raw data include the ability to adjust for fragment length, G + C content, and CpG fraction, and to absolutely quantify cfDNA. Additionally, one can use the signal from unmethylated spike-in control reads to calculate methylation specificity for each sample. To reduce technical artifacts, one must remove problematic genomic regions prior to analysis.
Despite removing problematic genomic regions, some remaining regions had much higher predicted molar amount than all other genomic regions of a given standard deviation or mappability score. The high molar amount genomic regions consisted Histograms only include windows that do not overlap with UCSC simple repeats and the ENCODE blacklist, and regions with Umap k100 multi-read mappability scores % 0:5. Asterisks indicate 11 outlier genomic windows chosen for further examination. mostly of repetitive elements, predominantly short interspersed nuclear elements (SINEs) ( Table 2). While these regions had high CpG density, our model already adjusted for CpG fraction. This made CpG density an unlikely driver of high molar amount. The two highest molar amount genomic regions from our HCT116 colon cancer cells overlap MIER2, part of the MIER gene family, associated with suppression of colorectal cancer (Peng et al., 2017). This shows the utility of our method to reveal important pathways for the gene regulation of the sample examined by a liquid biopsy.
Out of the top five genomic regions we followed up, four had high expression in the testis (two regions overlapping MIER2 and one region overlapping each of RNF126 and EXOC2) and one had high expression in the ovary (PRPF4B). Several cancers exhibit high expression of genes associated with cell lines and low methylation in these genes' promoters (Kutilin 2020;Pfeifer 2018;Janson et al., 2017).
The chr19:343,501-343,800 window resided within the first intron of MIER2, overlapping the EH38E1929888 candidate cis-regulatory element from ENCODE SCREEN (Moore et al., 2020). This element has a proximal enhancer-like signature in other cell types, but not in HCT116. HCN2 and RNF126 windows also overlapped high-density CpG islands. The other MIER2 window at chr19:308,701-309,000 contained a short coding exon and flanking intronic sequence on both sides. Both MIER2 windows also overlapped high-density CpG islands.
We may see over-representation of high molar amount in some repetitive regions owing to the characteristic hypermethylation of these regions (Deininger 2011). Had we not removed many repetitive regions when removing regions listed in the ENCODE blacklist and regions with low mappability, we may have observed more genomic regions with predicted high molar amount. Regions with high molar amount that map to repetitive elements contain many extra copies not present in the reference genome. These regions would appear uniquely mappable, and thus were not removed by our previous filtering steps.
HCT116 likely has problematic genomic regions not found in the ENCODE blacklist. This arises from the relative dearth of ENCODE data available for HCT116 in the blacklist generation process, compared with cell types in the ENCODE tier 1 and tier 2 categories.
Depending on the experimental question, some may choose to go beyond our filtering recommendations and remove all repetitive elements, such as all long interspersed nuclear elements (LINEs) and all SINEs. Given that, after filtering, only 11 genomic windows had molar amount R0.25 pmol, removing all repetitive elements would not affect results drastically.
Both molar amount in picomoles and read counts correlated with M-values. The strength of this correlation varied with the number of CpGs represented on the EPIC array for a genomic window. We expect the increase in correlation with more CpGs as cfMeDIP-seq preferentially enriches for hypermethylated regions and picks up fewer fragments in CpG-sparse regions. Additionally, if the array has fewer probes representing a 300 bp genomic region than that region has CpGs, it may poorly represent DNA methylation for the entire 300 bp.
To facilitate the use of our spike-in controls, we have created an R package, spiky, to help process data generated from cfMeDIPseq experiments that include the spike-in controls. This package trains the Gaussian generalized linear model and predicts molar amount in picomoles on user data. The spiky package is available on Bioconductor (Huber et al., 2015; https://bioconductor.org/ packages/spiky) and GitHub (https://github.com/trichelab/spiky).
Incorporating these spike-in controls in future cfMeDIP-seq experiments will adjust for technical biases and mitigate batch effects, improving cfMeDIP-seq data overall.

Limitations of the study
The spike-in controls presented here allow absolute quantification of cfDNA, but the limited number of controls focus on a specific set of DNA fragment properties. Spike-in controls encompassing a larger number of fragment lengths, CpG fractions, and G + C content will further improve absolute quantification.
Using both unmethylated and methylated spike-in controls can estimate methylated DNA enrichment efficiency, but this applies to an entire experiment or sample. We still cannot tell whether any given fragment remaining after cfMeDIP is methylated or unmethylated, although we know that almost all of them are.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:    For each number of spike-in fragments between 6 and 25 inclusive, we 100 times randomly selected that number of spike-ins as training data. We used the remaining spike-ins as test data. Each point shows the mean absolute error over all the test spike-ins for that iteration. The vertical limits of the plot include at least 98/100 iterations in every case. We denote outliers for 6 or 7 training spike-ins with a cross at the top of the plot, labeled with the mean absolute error for that case. Red line denotes median mean absolute error. See also to reduce the left skew of CpG fraction, we used a cube root transformation.
To calculate the proportion of a given fragment that overlapped with the defined 300 bp windows, we used bedtools version 2.29.2 intersect (Quinlan and Hall 2010). We calculated an adjusted molar amount h 0 to only consider the portion of the window each fragment overlapped. For this calculation, we multiplied the molar amount h by the length of the overlap between the fragment and the genomic window [˛[1 bp, 300 bp], divided by the window size [ Ã = 300 bp:

OPEN ACCESS
Identifying regions to remove during filtering To assess multimapping reads that might influence the molar amount estimate, we used Umap (Karimzadeh et al., 2018) multi-read mappability scores. We used k100 mappability scores, representing the largest read lengths available. We annotated each 300 bp window with its minimum mappability score. We assessed the relationship between molar amount and mappability scores. We calculated standard deviation of molar amount between the two replicate samples for which we spiked 0.01 ng of synthetic DNA into 10 ng of HCT116 genomic DNA. We assessed the relationship between standard deviation of molar amount between replicates, excluding 1,0 70,387 simple repeat regions (Karolchik et al., 2004), 239,461 regions listed in the ENCODE Project (ENCODE Project Consortium 2012) blacklist (Amemiya et al., 2019), 549,876 regions with mappability score % 0:5, and 906 regions with standard deviation R 0:05. This left 4,446,375 genomic windows in the analyses.

Correlation between picomoles and M-values
We removed simple repeat regions (Karolchik et al., 2004), regions listed in the ENCODE blacklist (Amemiya et al., 2019), regions with Umap k100-multi-read mappability % 0:5, and regions with standard deviation of molar amount R 0:05. As described above, we estimated molar amount, using a generalized linear model ðr 2 = 0:93Þ. We prioritized models that performed better on 160 bp fragments, as we physically size selected for these fragments.
To compare molar amount to another complimentary measure of DNA methylation, we had genomic DNA from the cell line HCT116 profiled (Princess Margaret Genomics Centre, Toronto, ON, Canada) using the Infinium MethylationEPIC BeadChip array (Illumina, San Diego, CA, USA). We prepared these samples as technical replicates of the HCT116 genomic DNA that we later spiked with 0.01 ng spike-in control. We normalized and preprocessed array data using sesame (Zhou et al., 2018) version 1.8.2. We annotated CpGs on the array to our 300 bp genomic windows. When > 1 CpG probe annotated to a window, we calculated the mean probe M-values across the window.
We assessed correlation between array M-values and picomoles, and between array M-values and read counts. We compared to M-values rather than b values because b values have high heteroscedasticity (Du et al., 2010). As cfMeDIP-seq preferentially enriches for highly methylated regions, we hypothesized that regions for which the array has more CpG probes would correlate to both molar amount and read counts better than regions with less CpG coverage. As such, we assessed the correlation independently at windows containing R 3, R 5, R 7, and R 10 CpG probes.

Examining accuracy of predicted molar amount
To determine how well our spike-in controls predict molar amount, we trained Gaussian generalized linear models using only some spike-ins as training data, and held-out spike-ins as test data. We performed this analysis on AML data from Lab A. We chose Lab A as this lab adhered most closely to the previously published protocol (Shen et al., 2018).
We parameterized generalized linear models using training sets containing at least 6 and at most 25 spike-in controls. Since the model had 5 terms, using a minimum of 6 spike-ins meant we always had more training data points than the number of terms in the model. This avoided likely inaccurate rank-deficient fits of the model. Then, we predicted the remaining spike-in controls, which had known molar amount, as a test set. For each number of spike-in controls for training, we randomly selected 100 different training sets. For each of these 100 iterations, we calculated the mean absolute error between the known molar amount and predicted molar amount.

Examining consistency across experimental batches
To experimentally introduce batch effects, we provided a sample of 10 ng of cfDNA obtained from peripheral blood plasma of 5 AML patients containing 0.01 ng of our spike-in controls to 3 independent labs ( Figure 1B). Blood was collected at the time of diagnosis in EDTA tubes. Samples were spun and plasma frozen in Eppendorf tubes at À80 C until use. As a blood cancer, leukemia generates a high amount of plasma cfDNA, allowing us to have sufficient cfDNA to divide into three technical replicates of 10 ng each.
Each lab performed the cfMeDIP-seq method described by Shen et al., (2019), with a variety of procedural modifications, detailed here. The changes emulated batch effects commonly found in publicly available data from different labs. Labs A and C used the same IDT xGen Duplex Seq adapters with 3 bp-4 bp UMI, as described above. Lab B used custom IDT adapters with 2 bp degenerate UMI, as previously described (Wang et al., 2019). For ligation of adapters, Labs A and B incubated at 4 C for 16 h, while Lab B incubated at 20 C for 2 h. Labs A and C used Antibody 1 (Diagenode, Denville, NJ, USA, Cat #C15200081-100, Lot #RD004, RRID: AB_2572207), while Lab B used Antibody 2 (Diagenode, Denville, NJ, USA, Cat #C15200081-100, Lot #RD001, RRID: AB_2572207). Lab A and B used 50% methylated lambda filler DNA and 50% unmethylated lambda filler DNA. Lab C used 100% unmethylated lambda filler DNA. The lambda filler tops up the sample to 100 ng so that each sample has the same amount of input DNA.
For amplifying the final library, the batches had different numbers of PCR cycles. Lab A ran 13-15 cycles, optimized per sample, Lab B ran 13 cycles, and Lab C ran 11 cycles. Lab A and B sequenced DNA with a NovaSeq 6000 (Illumina, San Diego, CA, USA), paired-end 2 3 100 bp. Lab C sequenced DNA on a NextSeq 550 (Illumina, San Diego, CA, USA), paired-end 2 3 75 bp. Lab A obtained 60 million reads per sample, Lab B obtained 100 million reads per sample, and Lab C obtained 85 million reads per sample.
We assessed whether spike-in controls mitigated batch effects on samples for which we calculated molar amount. To do this, we performed PCA on different quantification methods, which included raw read counts, our molar amount estimates, and the output of QSEA (Lienhard et al., 2017), the current standard processing pipeline for MeDIP-seq data. QSEA normalizes MeDIP-seq data