Assessment of genome packaging in AAVs using Orbitrap-based charge-detection mass spectrometry

Adeno-associated viruses (AAVs) represent important gene therapy vectors with several approved clinical applications and numerous more in clinical trials. Genome packaging is an essential step in the bioprocessing of AAVs and needs to be tightly monitored to ensure the proper delivery of transgenes and the production of effective drugs. Current methods to monitor genome packaging have limited sensitivity, a high demand on labor, and struggle to distinguish between packaging of the intended genome or unwanted side-products. Here we show that Orbitrap-based charge-detection mass spectrometry allows the very sensitive quantification of all these different AAV bioprocessing products. A protocol is presented that allows the quantification of genome-packed AAV preparations in under half an hour, requiring only micro-liter quantities of typical AAV preparations with ∼1013 viral capsids per milliliter. The method quickly assesses the integrity and amount of genome packed AAV particles to support AAV bioprocessing and characterization of this rapidly emerging class of advanced drug therapies.


INTRODUCTION
Adeno-associated viruses (AAVs) are small viruses with a singlestranded DNA (ssDNA) genome that is encapsidated by a stochastic mixture of 60 VP1, VP2, or VP3. 1,2 The virus belongs to the genus Dependoparvovirus, and consequently is dependent on coinfection with helper viruses like adeno-or herpesviruses. AAVs' inability to replicate on their own, combined with a low immunogenic profile, make them ideal gene therapy vectors, with several AAV-based gene therapy treatments already approved by the European Medicines Agency (EMA) and US Food and Drug Administration (FDA). 3 For gene-delivery purposes, the natural AAV genome is replaced with a transgene, flanked by the natural AAV inverted terminal repeats (ITRs). These recombinant AAVs (rAAVs) are typically produced in dedicated host expression systems where the transgene, the structural AAV genes, and the required helper genes are co-transfected or stably integrated. [4][5][6][7] Human HEK293 or Sf9 insect cells are currently the most often used expression systems used for expression of AAV, although other and/or adapted cell lines have been explored as well. 8 No matter what host cell is used, the produced rAAV particles must be purified from the cultured cell stock. Besides removing crude biological materials, this step also requires the proper separation of the empty from genome-filled AAV particles. The empty particles are still immunogenic but are unproductive for transgene delivery, while accounting for up to 80% of the total number of particles produced in the cell culture system. Hence, just a fraction of the total yield of AAV particles from the production system contains the desired genome ( Figure 1). This diversity of products, combined with the poor scalability of the used expression systems, makes AAV production and purification challenging and expensive.
Ideally, the filled AAV particles will only contain the ITR-flanked genome. However, several processes may lead to genome heterogeneity resulting in capsids partially filled with truncated genomes or even particles that package multiple genome copies (see Figure 1). First, if the packaged transgene exceeds the natural genome capacity of $4.8 kb, the genome may become truncated during the encapsidation. 9 Such AAV preparations, as in dual-vector strategies, can still transduce the corresponding gene of interest (GOI) via homologous recombination in the target cell, albeit at a greatly reduced efficiency. 10,11 Second, genome heterogeneity can also already be induced during replication. Like all dependoparvoviruses, AAV genome replication follows the "rolling hairpin" model where the Rep endonuclease nicks the terminal resolution site (trs) to allow the synthesis of the second downstream ITR. 12 If this process fails, the final product can be a self-complementary single-strand genome at approximately twice the size of the monomeric DNA. 13 This process can be exploited by modifying the Reb binding element (RBE) to promote the formation of self-complementary AAV (scAAV), but this process occurs also at a low rate in wild-type ITRs (wt-ITRs). [14][15][16] Administration of scAAV has the benefit that transcription can start directly from the internally hybridized dsDNA transgene and does not require the rate-limiting synthesis of the complementary DNA strand and intermolecular hybridization. 17,18 Third, secondary structure elements in the DNA can cause genome heterogeneity, as has been documented in studies whereby rAAV had been designed to deliver short hairpin RNAs and CRISPR elements, which were shown to interfere with the genome replication. 19,20 This latter feature is especially important in targeted gene delivery, as many mammalian transgenes contain highly structured elements to assist in promotor binding.
Sensitive and specific methods to investigate the processes leading to unwanted genome heterogeneity may lead the way to improved bioprocessing of homogeneous AAV particles for clinical applications. Typical laboratory-scale preparations of rAAVs yield no more than a few milliliters of purified samples (equivalent to 0.1-10 mL at $10 13 viral capsids per milliliter [vc/mL]), which makes subsequent analysis challenging, especially when screening several different experimental growth conditions. From the biophysical methods available to investigate potential heterogeneity in rAAV preparations, analytical ultracentrifugation (AUC), electron microscopy (EM), and polymerase chain reaction (PCR)-based methods are currently the industry standards (see Figure 1), but all these approaches have also their downsides (as reviewed in Gimpel et al. 21 ). While AUC is also used as a preparative purification method to fractionate empty from filled rAAV particles, it requires relatively large amounts of samples (0.5 mL of $2 Â 10 12 -5 Â 10 12 vc/mL). Negative stain EM typically requires much less sample, but staining and drying of the samples may disturb particle integrity, leading to genome release and a bias toward empty capsids. Moreover, these staining artifacts further complicate distinguishing fully packed capsids from the partially filled side-products by EM. More recently, mass photometry has also been explored to characterize empty and filled AAV particles. 22 PCR-based methods require just minute amounts of sample but mostly can only report on genome titer, without distinction of genome integrity. Thus, currently, the field is missing a quantitative method to distinguish between full, empty, and partially packaged rAAVs with sufficient resolution, sensitivity, and dynamic range to detect even the least abundant populations. Moreover, methods that can also be used to characterize the nature of these potential genome intermediates, for instance by measuring their mass or sequencing their DNA, would be useful for optimizing the rAAV production processes.
A relatively new alternative approach used for rAAV characterization is charge-detection mass spectrometry (CD-MS), as first demonstrated by using home-built instruments. 23 The feasibility to perform CD-MS on commercial Orbitrap ultra high mass range (UHMR) platforms was recently demonstrated and also used for analyzing Figure 1. AAV bioprocessing and analysis Schematic overview of rAAV production in either HEK293 or SF9 host cells for a designed transgene of around 2.6 kb. After transfection of (or infection with) the corresponding transgene (1) (if replicated properly) they will yield an encapsulated 800-kDa genome (2). Other encapsulated off-target genomes could originate from selfcomplementary ssDNA dimeric variants (3) as well as truncated genomes (4). These gene products are all encapsulated (5) and will yield a mixture of empty and (partly) filled particles. The initial number of other contaminants (DNA, side product, host cell proteins) can be several orders of magnitude higher than that of rAAV capsids and require extensive purification. rAAV capsids show additionally a wide distribution of empty, partially filled, and filled capsids, requiring additional purification and monitoring. Detailed characterization is important as only a small number of the infectious particles will transduce their genome. This efficiency is sensitive to the presence of empty capsids and capsids filled with off-target genomes. For monitoring this, AUC and EM are the industry standards, whereby the former can also serve as preparative method. Here, Orbitrapbased CD-MS is explored for quality control, as it is sensitive, facile, and can yield information on capsid/genome integrity. www.moleculartherapy.org rAAVs. 24,25 In CD-MS, particles are measured individually as opposed to conventional native MS. Inherently, single-particle measurements offer extreme sensitivity and thus low sample consumption, which is highly beneficial for successful application to clinical preparations of rAAV.
Here we present an optimized workflow enabling the mass analysis of heterogeneous rAAV particles, whereby the mass resolving power and accuracy attainable allow us to determine and quantify genome integrity. We determine the mass and abundance of various co-occurring rAAV particles in less than 30 min, using only 1 mL of sample at typical AAV working concentrations (2 Â 10 13 vc/mL). We introduce an improved acquisition method with better signal utilization that can accurately and reproducible detect rAAV particle populations as low as 2%, also identifying some less-abundant, likely scAAV, variants as off-target products in rAAV preparations. We show that the quantitative accuracy of the CD-MS method to distinguish filled from empty particles is in the range of 1%-3% (see Table S1). This new approach can contribute to the optimization of the bioprocessing of rAAVs, producing more particles containing the desired intact transgene.

Optimizing the analysis of co-occurring AAV particles in CD-MS
Here we aimed to quantitatively measure co-occurring rAAV particles ranging in mass from about 3.6 (empty capsids) to 5.3 MDa (genome packed capsids). Toward this goal we build further on our previous study of AAV by Orbitrap-based CD-MS. 25 In this approach, rAAV particles are directly diluted, from their storage so- lution, into aqueous ammonium acetate and introduced into a mass spectrometer where they are ionized by electrospray ionization under native conditions. 26 Subsequently each particle is individually detected within the Orbitrap mass analyzer and we can estimate the number of charges for each ion directly from its intensity. This enables mass determination on a single-particle basis with the particle counts providing quantitative information on population distributions. The unbiased detection of particles of different composition and mass is not trivial in MS, as ion transmission and ion decay processes are charge and mass dependent. 27,28 Therefore, we initially investigated the effect of the pressure settings and transient recording times on the detection of all different co-occurring rAAV particles and observed that transient times of 512 ms, and pressure settings between 1.5 and 3 (i.e., 2.1 Â 10 À10 to 5.3 Â 10 À10 mbar UHV readout), provided the optimal conditions for the least biased detection of all co-occurring particles, as described in detail in the supplemental information and Figure S1.

Assessing AAV particle diversity
The importance of using this optimized acquisition approach becomes most evident when aiming to analyze and quantify low-abundant encapsulated genome variants, as demonstrated in Figure 2. The analyzed particles were expressed with a common expression cassette with an approximately 2.5-kb long genome (with a molecular weight [MW] of $800 kDa). Besides packing this genome, other possible genome variants that may be packed by the AAV are a self-complementary ssDNA (MW $2 Â 800 kDa) and truncated forms of the genome as further depicted in Figure 1. By CD-MS, the obtained two-dimensional (2D) histogram and the corresponding mass histogram for an AAV6a serotype, expressed with the above-mentioned transgene, are shown in Figures 2A and 2B. Cumulatively, the 2D CD-MS spectrum shown in Figure 2A shows the data for about $76,000 individual rAAV6a particles. From 2D CD-MS, two prominent co-occurring distributions can be observed, but also two weaker particle populations, all with distinct masses. The two most prominent particle populations represent the empty ($23,000 particles, MW = 3.85 MDa) and genome-filled rAAV6a particles ($40,400 particles, MW = 4.63 MDa), with a relative mass difference of $800 kDa, due to the packaging of the expected genome. All measured masses are summarized in Table 1. The additional, low-abundant densities in the 2D mass histogram appearing between the empty and full, as well as at higher m/z beyond the full particles, exhibit masses of approximately 4.3 MDa ($3,700 particles) and 5.3 MDa ($1,800 particles), respectively.
To further validate that these less abundant populations of AAV particles are also observed in other rAAV preparations and other rAAV serotypes, we next analyzed by CD-MS a rAAV9 sample, expressed with the same transgene. The CD-MS-obtained 2D histogram and the corresponding mass histogram for an AAV9 serotype are shown in Figures 2C and 2D. Also in this 2D CD-MS histogram, beside the two co-occurring distributions, originating from the empty ($1,000 particles, MW = 3.78 MDa) and genome-filled rAAV9 particles ($5,600 particles, MW = 4.59 MDa), again a lower abundance particle population is observed.
Assessing AAV particle abundances With the above-described improved data acquisition method, we developed a dedicated workflow for the sensitive and accurate mass analysis of rAAV preparations, and next wanted to assess its performance, especially in the quantification of co-occurring particle distributions. The aim here was to minimize sample consumption as well as demands on time and labor ( Figure 3) to meet industry standards. Based on its improved sensitivity and direct charge assessment, CD-MS is less negatively affected by high salt concentrations present in the storage buffer and consequently we observed that the rAAV samples could be directly diluted into aqueous ammonium acetate instead of requiring a time-and sample-consuming buffer exchange by dialysis or spin filters. We found for our AAV preparations (2 Â 10 13 vc/mL) a dilution factor of 20 reduced the salt concentration sufficiently, causing less than 1% increase in mass due to salt adducts compared with buffer exchanged preparations (see Figure S2), but leaving a high enough AAV concentration for fast data acquisition. We were able to measure and record around 30,000 particles in 20 min, which is sufficient for good statistics in the downstream data analysis. The whole sample preparation as well as data acquisition could be performed in less than 30 min. The data processing pipeline is largely automated and can be, after initial setup, executed with a runtime of less than a minute for the here recorded individual experiments.
Establishing this sensitive and efficient workflow, we focused for our quantitative analysis on a research-stage AAV6 sample with a proprietary transgene, which we term here AAV6b, to distinguish it from the commercial AAV6a sample. The produced AAV6b particles were first separated into full and empty particles by using a CsCl gradient centrifugation. Following this fractionation, the empty and full rAAV particles were mixed in defined ratios of 0%, 25%, 50%, 75%, and 100%. These five distinct samples were subsequently analyzed by CD-MS. The setup as well as an illustrative example of the recorded data are visualized in Figures 3B and 3C for a rAAV6b sample, in which the empty and full fractions had been pre-mixed in a ratio of 25%:75%. However, for this rAAV6b sample, we clearly observed in the 2D CD-MS histogram three distinct particle distributions, which we assigned, based on their mass, to empty, intermediate, and full genome packed rAAV capsids. We constructed for each observed distribution in the 2D CD-MS spectrum a polygon capturing the majority of the corresponding ions, used to filter and assign all ions laying within the polygons margins. From the filtered The measured experimental masses correspond to the average masses determined from the mass histograms extracted from the CD-MS data.
The expected masses are based on the amino acid sequences of the AAV VP proteins and an assumed stoichiometry of 60 capsid proteins (assumed 5:5:50 for VP1:VP2:VP3 and with ±5 VP as composition can vary 1 ), while the mass of the DNA is estimated from its nucleic acid sequence. For the AAVs of serotype 6 the AAV6a particles were expressed with a GFP genome, whereas the AAV6b were expressed with a proprietary genome.
ion distributions, we can calculate for each particle population the average mass as well as count the particle population, as presented in Figure 3C. For this particular rAAV6b sample, we assigned a capsid mass of 3.7 MDa (5,651 particles, orange), a mass of 5.2 MDa (10,529 particles, red) for the full genome packed AAV particles, and 4.4 MDa (4,490 particles, green) for AAV particles packaging a truncated genome, respectively.
To further evaluate the method, we next measured samples from the wider range of mixing ratios, also under two distinct experimental conditions, varying the pressure settings (see Figures 4A and S3) to further confirm that they do not, in the applied pressure range, influence the qualitative and quantitative outcome of the analysis. The resulting 2D CD-MS data and mass histograms are shown in Figure 4A, with, from left to right, the empty to completely full rAAV particles. For the quantitative analysis, the same three polygons, and color-coding, were used to count the particle distributions. The relative contributions of each of the three distinct particle populations in the five measured pre-mixed samples are depicted in Figure 4B, where the two bars shown for each sample represent the measurements at the two different experimental pressure settings. Although the ratio assessed by CD-MS is already pleasingly in agreement with the expected ratio, we noticed that, in the measurements of the pre-mixed samples, there were slightly more empty particles detected by CD-MS than expected based on the mixing ratio. To investigate this slight discrepancy, we also carefully examined the CD-MS data of the non-mixed empty and full samples. While the empty sample seems to be pure, we still observed empty particles in the presumable pure full fraction (Figures 4A and 4B).

DISCUSSION
The workflow presented here enables the assessment of genome integrity with an accurate quantification of each respective species, present in a heterogeneous mixture of AAV particles. By using Orbitrap-based CD-MS, it is possible to perform these experiments with only microliter quantities of typical AAV preparations and fast turnaround times, of less than 30 min per experiment, including sample preparation, data acquisition, and mass analysis, as depicted in Figure 3A.
Assessing AAV particle diversity The improved acquisition method allows the detection of low-abundant species and provides confident mass assignments in the 2D CD-MS spectra enabling the estimation of capsid cargo, as demonstrated in Figure 2A. The particle population with the highest mass likely represents AAV capsids packaging a self-complementary transgene version. The intermediate-mass ions are likely AAV capsids encapsulating truncated transgenes and show a less clearly defined mass distribution (see also Figure S4). The mass difference (relative to the empty capsid) of these lower-mass particles correlate quite well with the location of GC-rich structures in the EGFP sequence, previously identified as a source for likely genome truncations. 19 Notably, all these co-occurring particles can be resolved in the 2D CD-MS spectra, whereby the detectable difference in abundance in between the most abundant and least abundant particle distribution is close to 40, with the rAAV6a particles seemingly packaging dimeric DNA representing just 2.4% of all particles, and the rAAV6a particles packaging truncated DNA representing just 4.8% of all particles. Also, in the 2D CD-MS experiments for rAAV9 presented in Figure 2C, beside the two co-occurring major distributions, originating from the empty and genome-filled rAAV9 particles, a lower-abundance particle population is observed originating from AAV particles containing likely dimeric DNA.

Assessing AAV particle abundances
To better assess the soundness of the quantification performed here, we used previously sized AAV particles, which were mixed again at defined ratios and then subjected to CD-MS analysis, as shown in Figure 3. The quantification based on these particle counts yields a ratio of 27%:22%:51% (empty/partially full/full) for this sample, which is in good agreement with the fact that this sample had been pre-mixed with 25% empty particles. However, our data reveal that a substantial part of the seemingly filled rAAV6b particles, fractionated and purified by using analytical ultracentrifugation, are not filled with the desired transgene but just a truncated part of that. To further investigate this, these pre-sized samples were subjected to further analysis over a wider range of mixing ratios as well as two distinct pressure settings (see Figure 4).
First, this analysis reveals that, in the pressure regime used, the change in pressure does not affect the particle distribution. Second, when summing up all the particles packing the intact and truncated transgenes, and considering them as full, we observe that the ratio between empty and full rAAV, as measured by CD-MS, matches the expected ratio very well, based on the known mixing ratio. The expected ratio is depicted by the dashed black line in Figure 4B.
This contamination of empty rAAV particles is likely caused by the insufficient resolving power in the ultracentrifugation. Measuring the pure full sample, we could assess the percentage of partially and fully filled particles, and observed that, in this particular sample, there is a relatively high amount of particles that seemingly have packed a truncated DNA molecule (about 26% of all filled particles), as shown in Figure 4C. A similar 30%:70% ratio between half-full and full rAAV was consistently measured also after mixing the full fraction with an empty fraction, as shown in Figure 4C. Evidently, this is to be expected but also shows that the quantification of particle populations by CD-MS can be quite robust and accurate, using only minute amounts of sample and time. The achieved overall quantification accuracy is 1.3%, after correcting for the $4% empty capsids present in the full fraction (3% before correction; see Table S1).
In summary, we here show that Orbitrap-based CD-MS allows the accurate quantification of various co-occurring AAV bioprocessing products in under half an hour, requiring only micro-liter quantities of typical AAV preparations at $10 13 vc/mL. The method can be used to rapidly monitor the integrity and amount of genome packed AAV particles and thus support the detailed characterization of this rapidly emerging class of advanced drug therapies.

AAV samples
Full and empty mixtures of AAV6a and AAV9 were purchased from Vigene Biosciences containing a GFP genome and were stored in the provided buffer (PBS, 0.001% Pluronic F-68) from the manufacturer. The alternate AAV6b samples were produced by Pfizer using a baculovirus expression system containing a proprietary genome and stored in a PBS buffer containing 0.001% poloxamer 188 (P188).
For the AAV6b samples, the enriched empty and full particles were fractionated via cesium chloride ultracentrifugation using a Beckman Coulter Optima L80XP ultracentrifuge with an SW 41Ti swinging bucket rotor run at 40,000 rpm for 44 h at 15 C. Following centrifugation, two distinct viral bands were visible: a low-density and highdensity band, corresponding to empty and full capsids, respectively. Each purified band was collected via side puncture and buffer exchanged into PBS using a 10-kDa MW cutoff dialysis device. The capsid concentration of each bulk enriched fraction was determined by size exclusion chromatography, followed by subsequent dilution in a proprietary formulation buffer (containing 0.001% poloxamer 188) at equivalent sample concentrations. The theoretical percentage of empty capsids assumes that the low-density band is 100% empty capsids and that the high-density band is 100% full capsids.

AAV CD-MS
For CD-MS analysis, 1 mL of 2 Â 10 13 vp/mL were diluted directly in 19 mL of aqueous ammonium acetate (75 mM, pH 7.5) or buffer exchanged into aqueous ammonium acetate (75 mM, pH 7.5) by several concentration and dilution rounds using Vivaspin centrifugal concentrators (50 kDa MWCO, 9,000 g, 4 C). Samples were analyzed either immediately or stored at 4 C for up to 1 week before analysis. For the subsequent single-particle CD-MS analysis, an aliquot of 1 mL was loaded into gold-coated borosilicate capillaries (Kwik-Fil, World Precision Instruments, Sarasota, FL) made in house for nano ESI on a P97 puller (Sutter Instruments, Novato, CA) and coated by using an Edwards Scancoat six pirani 501 sputter coater (Edwards Laboratories, Milpitas, CA). Samples were analyzed on a standard Q Exactive UHMR instrument (Thermo Fisher Scientific, Bremen, Germany). 28,29 The instrument parameters were optimized for the transmission of AAV particles. In short, spray voltage was set 1.4 kV, S-lens RF was set to 200 V, and ion transfer target m/ z and detector optimization were set to high m/z. In-source trapping was enabled with desolvation voltage of À150 V. The ion transfer optics (injection flatapole, inter-flatapole lens, bent flatapole, and transfer multipole) were set to 10, 10, 4, and 4 V. We used xenon as neutral gas in the collision cell and complexes were desolvated via activation in the HCD cell (120-140 V). Gas pressure was varied during experiments and the mentioned gas pressures correspond to UHV cold cathode gauge readouts. Data were recorded with noise threshold set to 0, at either 512 or 1,024 ms transient time as specified in the manuscript.

AAV quantification
For AAV quantification acquired.raw files were converted to mzXML with vendor peak picking enabled using msConvert. If not mentioned otherwise, data files were used for quantification without filtering. If filtering was applied, first, all centroids above the noise level were removed, then split peak occurrences were removed by removing each peak with a neighboring peak within three times the FWHM of the theoretical achievable resolution. Conversion from intensity to charge was performed with the formula charge = intensity/ 12.521 where 12.521 is the given conversion factor for the instrument used here, as described in more detail previously. 25 Ions were quantified based on their position in the 2D histograms while applying appropriate filtering margins as demonstrated in the supplied python code example (see Data S1).