Evaluation of diversity indices to estimate clonal dominance in gene therapy studies

In cell and gene therapy, achieving the stable engraftment of an abundant and highly polyclonal population of gene-corrected cells is one of the key factors to ensure the successful and safe treatment of patients. Because integrative vectors have been associated with possible risks of insertional mutagenesis leading to clonal dominance, monitoring the relative abundance of individual vector insertion sites in patients’ blood cells has become an important safety assessment, particularly in hematopoietic stem cell-based therapies. Clinical studies often express clonal diversity using various metrics. One of the most commonly used is the Shannon index of entropy. However, this index aggregates two distinct aspects of diversity, the number of unique species and their relative abundance. This property hampers the comparison of samples with different richness. This prompted us to reanalyze published datasets and to model the properties of various indices as applied to the evaluation of clonal diversity in gene therapy. A normalized version of the Shannon index, such as Pielou’s index, or Simpson’s probability index is robust and useful to compare sample evenness between patients and trials. Clinically meaningful standard values for clonal diversity are herein proposed to facilitate the use of vector insertion site analyses in genomic medicine practice.


INTRODUCTION
Gene transfer vectors derived from lentiviruses or retroviruses have been successfully used for the cell and gene therapy of rare genetic disorders including primary immune deficiencies, hemoglobin disorders, cytopenias, or neurodegenerative diseases 1 and for cancer immunotherapy with T cell receptor or chimeric antigen receptor (CAR)-T cells. 2,3 As vectors stably integrate into the genome of target cells, their genomic insertion sites (ISs) provide important biological and safety information on the effects of gene therapy. ISs inform on the genomic and epigenetic positional preferences of vectors 4 and enable cellular lineage ontology studies, 5 estimations of number of engrafted transduced cells, 6 and in vivo hematopoiesis reconstitution in patients 7 or in animal models. 8 The diversity of genomic ISs describes the polyclonal nature of the gene-modified cell population, which is an important safety assessment considering the clinical risks of insertional mutagenesis caused by retroviral vectors. 1,9 In hematopoietic stem cells (HSCs), the insertions of retroviral vectors using active long terminal repeat (LTR) promoters have caused the transactivation of proto-oncogenes resulting in clonal dominance and leukemic events. 10 Self-inactivated lentiviral vectors (LVs) have reduced this risk, but LVs can still disrupt gene expression in HSCs or in T cells, resulting in proliferative advantages and clonal dominance without transformation, as observed in some patients treated with LVs for thalassemia 11 or with CAR-T cells for cancer. 12,13 Consequently, vector genomic IS analyses have become integrated into patient safety monitoring protocols when using integrative gene therapy vectors.
Yet, the measure of gene therapy vector ISs is complex and not a standard assay in medical biology or genomic medicine practice. Methods for IS sequence retrieval and for DNA sequencing have largely evolved over the last decades to increase coverage, output, and speed. 14 Bioinformatic analyses yield multiple results that include genomic mapping data, information relative to ISs near specific sets of genes like oncogenes, estimation of cellular lineage relationships, measures of gene-modified cell counts, and measures of population diversity. Statistical methods for measuring clonal diversity mostly originate from the ecology field 15 and consider that cells sharing the same unique IS (also called clones) are species of which we want to measure the number and the relative abundance. Richness (the number of different clones) can estimate gene therapy efficacy, whereas the absence of any dominant species of clone over time (high evenness) can define gene therapy safety. Many indices are used to estimate clonal diversity either by the relative abundance of gene-modified cells with Gini or Shannon indices or by estimating population size with the Chao1 index or the number of clones in the top 50% of cell abundance with the UC 50 value, often reported together in gene therapy studies. [16][17][18][19][20][21][22][23] The Shannon index of diversity (or entropy) is a popular diversity index for gene therapy studies. It considers both the richness and the evenness of the sample but does not allow one to compare samples with different richness. It is sensitive to the amount of DNA, the number of vector copies in the sample, and the sequencing depth. 24 It also assumes that all species are represented in a sample and are randomly sampled. Moreover, the formula contains logarithms, but different bases are used in the literature (i.e., base 2 23 ), so inter-study comparisons of the Shannon index diversity in gene therapy must be made with caution, especially in samples of different sizes. The use of a normalized index would be preferable, enabling comparisons in studies with different size datasets. A normalized version of the Shannon index, also known as Pielou's index, 25 is used to estimate evenness. Simpson's index is another measure of diversity taking into account the number of species present, as well as the relative abundance of each species.
We herein compared the use and performance of these different indices that estimate relative abundance for the analysis of IS diversity in gene therapy studies in which some adverse events occurred or not. Gene therapy trials for the treatment of Wiskott Aldrich syndrome (WAS), metachromatic leukodystrophy (MLD), or X-linked severe combined immunodeficiency (SCID-X1) have used the Shannon index to evaluate the polyclonality of the gene-corrected cell populations in patients over time. [17][18][19][20]22,23,26 We modeled the performance of these indices upon the occurrence of clonal dominance and found differential impacts of sample size on the results obtained. Using an already published experimental IS dataset with spiked clones, 27 we show that Pielou's index provides a sensitive measure of clonal dominance. Using published gene therapy studies, we also show that the Pielou index enables inter-study comparisons and can be used to define a safety threshold in trials where leukemias occurred. Finally, we used Pielou's and Simpson's indices to demonstrate the sustained clonal diversity in a gene therapy clinical trial for WAS. 6,20,22,26

RESULTS
Published datasets demonstrate the interest of using a normalized diversity index The relative abundance of gene-modified cells was analyzed in several published gene therapy clinical trial datasets ( Table 1). Two of these trials used gamma-retrovirus-derived vectors, which have induced proliferative disorders in some patients. 19,23 In other trials, an SINlentiviral vector was used to treat WAS and MLD patients, 17,18,20,22,26 and no clonal dominance was observed. Figure 1 presents the longitudinal evolution of several indicators in these published studies. For clarity, we only presented the diversity analysis of peripheral blood mononuclear cells (PBMCs) or myeloid cell samples. The number of unique ISs varies over a wide range of values and among studies, patients, and time points. It is therefore obvious that the unique IS value is insufficient as a single indicator to estimate the polyclonality of gene-modified cells ( Figure 1A). The Shannon index also varies between studies ( Figure 1B). The Shannon index value drops in patients when a clonal dominance occurs (between 20% and 98% of cells are leukemic blasts at the indicated time points circled in red in Figure 1B). 19,23 In the Wang et al. study, diversity was restored upon chemotherapy treatment in patients 7 and 10, and the Shannon index value rose. However, Shannon index values fluctuated in other patients and other studies without evidence for clonal dominance. The values of this indicator were also found to be very variable not only between patients but also during the follow-up of each patient, ranging from 3.4 to 8.8 in the Magnani et al. study in which no clonal dominance occurred. 22 Patient-to-patient fluctuations in this index are mostly related to the richness of samples, as shown in Figure 1A, which may be the consequence of different sampling size, differences in the methodology used to identify ISs, or the efficacy of cell transduction. In the Wang et al. and Braun et al. studies,19,28 fewer than a thousand clones were identified per sample because DNA pyrosequencing technology was used. Instead, Illumina next-generation DNA sequencing used in the other studies (Table 1) generates up to 10,000 unique ISs per sample and therefore larger values of the Shannon index. Figure 1C shows the advantage of using Pielou's normalized Shannon index, which was calculated here from values of Shannon and ISs to control for differences in sample richness. Pielou's index shows that evenness is stable in healthy patients over time, confirming that changes in the Shannon index were related to different richness. For patients with adverse events, the evenness drops rapidly as expected and discriminates more obviously the healthy and leukemic patients. The control of the sample evenness using this normalized index makes it possible to define a threshold value below which it would be a health concern for patient. According to the values observed in patients with leukemia and healthy patients across the different trials, a threshold value of Pielou's index at 0.5 seems to adequately define low-and high-diversity samples (dashed horizontal line in Figure 1C). Below this threshold, there is an increasingly high probably of clinically relevant clonal dominance as shown in a logistic regression analysis ( Figure S1). This does not take into account asymptomatic patients. One of the patients in Braun et al. 19 and one patient in Wang et al. 23 show that evenness is decreasing over time reaching the threshold. These two patients have high frequency of clones with an IS in MDS1 and CCND2 genes, respectively, but no adverse event diagnosed at the last reported time point.

Computer simulation of clonal dominance and evaluation of the performance of diversity indices
In order to evaluate the effectiveness of several richness and evenness indices, we simulated a polyclonal population with an increasing clonal dominance over time ( Figure 2). A polyclonal population of 1,000 clones (richness indicated by a dashed line in Figure 2) was simulated using a random proliferation rate for each clone (see simulation details in materials and methods). Supplemental Figures S2 and S3 show the same simulations for 100 or 10,000 clones, respectively.
Independently of the true population richness and sample size, we observed that the estimated proportion of the "dominant clone" in the different samples is a good estimation of its abundance in the population (Figures 2A, S2A, and S3A) even if small samples tend to be  Table 1. In some graphs, red circles indicate the reported diagnosis of adverse events such as lymphoproliferative disorders. Patients' identification numbers were attributed incrementally and iteratively for each study so that all panels share the same legend for patient identification.
more variable when abundance is around 50%, as expected. Similarly, when sample size is large enough compared with population richness, all the clones are detected at least once in each sample for early time points ( Figures 2B, S2B, and S3B). Increasing sample size increases the number of unique detected clones up to saturation (e.g., all possible clones detected). The simulation shows that above 20% of clonal dominance in the population, the estimated richness is negatively biased even for large samples ( Figures 2B, S2B, and S3B).
Different indices have been proposed to estimate population richness even from small samples. One of them is the Chao1 index that takes advantage of the presence of rare species to estimate unseen species. 29 Figures 2C, S2C, and S3C show that the Chao1 index can correctly estimate the population richness even for small samples but only when the evenness is relatively high (early pseudotime points) and up to a moderate 15% clonal dominance if sample size is large.
The frequently used Shannon index decreases with the appearance of clonal dominance (Figures 2D, S2D, and S3D). However, because the Shannon index considers both the richness (which increases with sample size) and the evenness in the calculation, we also observed that its value increases with sample size (compare Figures 2D, S2D, and S3D for results in different sample size). Comparing different values of diversity with the Shannon index is then risky if sample size/richness is not equivalent between samples. Two other indices have been proposed in gene therapy monitoring of cell polyclonality.
The Gini index was developed in the field of economy to evaluate the amount of inequality in the contribution of different entities. Here, cell clones that have the same abundance contribute equally, and the Gini index is close to 0. If one clone becomes dominant, it will skew the frequency distribution, and the Gini index becomes close to 1. Figures 2E, S2E, and S3E show that the Gini index is influenced by sample size (relative to population richness) and is biased even at early pseudotime points.
Another index proposed 12,16,30 is the UC 50 value expressing how many clones contribute to the most expanded 50% of the total cell abundance. Figures 2F, S2F, and S3F show that UC 50 rapidly drops to 1 when the dominant clone reached 50% of relative abundance and is sensitive to sample size (and richness estimation). Thus, comparing two samples is then difficult without considering the richness.
In order to remove sample size/richness effect, one strategy is to correct the Shannon index value using an estimation of the true sample richness. A proposed solution is to use a correction based on sample coverage and the Horvitz-Thompson method (Figures 2G, S2G, and S3G). 31 Sample size effect is removed except for very small samples, and the corrected value is still influenced by the population richness.
Another approach to remove sample size/richness effect is to normalize the Shannon index by the maximum value expected if the evenness was perfect between identified clones, which corresponds to Pielou's index in the literature. 25 Figures 2H, S2H, and S3H show that Pielou's index ranges from 1 for even clonal abundance to 0 when the clonal dominance has reached its maximum value. Pielou's index is very close to the population value (dark dashed line) even though the value is slightly overestimated at late pseudotime points and when sample size is very small. Pielou's index values are independent of the richness and may be used to compare different samples with different sampling efforts or different true richness. Depending on the species abundance distribution shape, Pielou's index Simpson's dominance index is another estimator of sample evenness that gives more weight to abundant species. It expresses the probability that two independent sampled cells taken at random from a dataset belong to the same clone. 15 The less diverse the population is, the higher is this probability. Figures 2I, S2I, and S3I show that Simpson's index performs similarly to Pielou's but is less biased independently of sample size and richness.
The three most robust indices for our purpose, Shannon's, Pielou's, and Simpson's, were computed in simulations with increasing richness and clonal abundance. In the simulations, one clone is dominant, and other clones' abundance is distributed using a random uniform distribution. Surface plots in Figure 3 clearly show the dependence of the Shannon index on sample richness, whereas Pielou's and Simpson's indices are almost independent from the richness except when really small. Thus, both the Pielou and Simpson indices are almost linearly dependent on the amount of clonal dominance, with all others clones being distributed following a random uniform distribution.

Comparison of Pielou and Simpson indices in real conditions
In order to evaluate the behavior of the Simpson and Pielou indices in experimental conditions, we used a dataset of ISs with different amounts of clonal dominance from a previously reported study. 27 To artificially increase clonal dominance, a DNA from a polyclonal population of LV-transduced cells was spiked with known and increasing quantities of DNA extracted from stable cell clones containing either one, two, or three copies of LV, generating increasing amounts of multiple detectable ISs. Under these conditions, an www.moleculartherapy.org average of 900 ISs (683-920) for 2,500 estimated cells per condition (2,388-3,502) were analyzed. The contribution of major ISs is represented in Figure 4. Both Simpson's and Pielou's indices detected the drop in diversity due to the increased clonal dominance, but Pielou's index seemed to be the most sensitive to the change in evenness over the gradually increasing dominance (Figure 4). Figure S4 confirmed that the Simpson index is less sensitive to the increase in dominance than Pielou, especially when multiple ISs contribute to the change in evenness.
A comparison of Pielou's and Simpson's indices was also performed on IS analysis done in WAS patients' PBMCs in a clinical trial longterm follow-up of WAS gene therapy 22 ( Figure 5). Even though Pielou's index seems to be more variable than Simpson's, the longitudinal tracking of both indices is consistent with the absence of any notable clonal dominance in this study.

DISCUSSION
The analysis of retroviral or lentiviral vector ISs in gene therapy provides a large amount of complex sequencing and genomic and statistical data that present a challenge for the medical interpretation of gene therapy results. Monitoring the gene-modified cellular diversity in patients is particularly important to follow considering the possible occurrence of clonal dominance with clinical impact. Considering that many different indicators of population diversity have been used in published studies, [16][17][18][19][20][21][22][23] the purpose of the present work was to seek an index, insensitive to the effects of richness and sample size, thus allowing intra-and inter-study comparisons, with meaningful clinical significance. We herein propose to use Pielou's or Simpson's indices to obtain normalized or relative measures of evenness, which is a biologically meaningful aspect of diversity. Here, the analysis of published gene therapy and experimental datasets and in silico modeling studies have shown that the popularly used Shannon index is not sufficiently robust as a single indicator, as it varies according to sample richness and size as shown by us and also by others. 24 This Shannon entropy is not useful alone to predict the risk associated with increased clonal dominance in patients with proliferative disorders. While Del Core et al. 24 propose to apply an additional model to reduce the confounding effects on Shannon, we propose to use existing indices. Pielou's and Simpson's indices are robust and measure evenness of clonal abundance in samples. Being scaled between 0 and 1, these indices also provide calibrated values that facilitate the interpretation of results. Based on the reanalysis of gamma-retroviral and lentiviral gene therapy trials in which leukemias occurred or not, we propose that a Pielou's index value of 0.5 may define an alert threshold of low evenness in samples with clinical importance. Simpson's index value could not be modeled from the publication datasets as all information was not available to calculate it, but this index should also provide similar threshold values. The comparison between Pielou's and Simpson's indices shows that Pielou can be more biased by very low sample sizes than Simpson and that Simpson's index is less dynamic and less sensitive to clonal abundance changes than Pielou. Otherwise, the behavior of these two indices is rather comparable.
Other values are important to consider to better interpret results of clonal abundance in gene therapy. The estimation of the number of gene-modified cells (i.e., population richness or population size) is a well-recognized challenge in gene therapy. 32 Counting the numbers of unique ISs is not enough, technical bias exists, and sampling is almost always incomplete. When retrieving ISs after random DNA shearing, it is possible to estimate the size of cellular clones by counting the different linker positions (i.e., different DNA fragments) obtained for each unique IS. 32 Sample richness depends on the sampling effort as the number of expected ISs is proportional to sample size, amount of material, and information obtained (number of cells, DNA quantity, DNA sequencing method, and throughput). As we showed, the observed sample richness may not be a sufficient metric to estimate the polyclonality over time, across patient or trials if sampling effort varies. Thus, the Chao1 estimator is useful to estimate the size of populations of gene-modified cells. 32 Chao1 provides a higher estimate of population size by using the information on the number of times that ISs are detected or not in each sampling. If all clones are detected several times in a sample, sampling effort may be saturating, indicating that no other clones are expected if sample size increase. Otherwise, other clones are expected from additional or larger samples indicating a large polyclonality or an undersampling issue. Reporting both the observed and the predicted richness using the Chao1 is then of particular interest to interpret clonal abundance in the light of sample completeness.
Diversity indicators are useful in many fields of population studies as reviewed recently with microbiome analyses. 33 The medical importance of normalizing the analyses of diversity in gene-modified cell populations in gene therapy is obvious. It is useful for safety evaluations of integrative vectors or to follow treatment effects as suggested by CAR-T cell studies 12 or to track population dynamics as shown in the case of antiviral immune responses. 34 Normalizing or calibrating the evenness index is essential in gene therapy to express the level of clonal abundance of gene-modified cells in a reproducible and clinically meaningful manner. The use of Pielou's or Simpson's indices and the identification of threshold alert values such as a Pielou <0.5 may facilitate the medical interpretation of gene therapy protocols biosafety. Below this value, diversity should be closely monitored through additional parameters and longitudinal verifications.

Data and scripts
Data processing and graphical representation were done using R software (4.0.2). Diversity indices were computed using specific libraries (Vegan 2.5.7, ineq 0.2.13). Scripts are available at https://github.com/ gcorre/GeneTherapyClonalDiversityIndex_study.

Formulas
Pielou's index named "J" in the formula also referred to as the Shannon's equitability index or the Shannon evenness is defined as the ratio of the observed diversity (H) by the maximum diversity (Hmax) expected considering the number of observed clones (S) (i.e., if they were all equi-abundant) with the following formula: Where, H is the Shannon index, S the number of species, n is the log base, and p i is the relative abundance of clone i.
The Simpson index was calculated using the following: Gini was determined using Brown's formula: Where n is the number of clones and y i the abundance of the ith clone ranked in increasing value.
Where, S is the total number of observed clones, and f 1 and f 2 are the number of clones observed once and twice respectively.

Horvitz-Thompson correction
Here, c = 1 À f1 S and S the total number of observed clones, with f 1 observed once.
In silico simulation of clonal dominance Clonal dominance was simulated using the R software in order to evaluate different richness and evenness metrics. At T0, we generated N clones, which all have the same relative abundance. Each of them is attributed a similar proliferation rate, except one that is attributed a higher value to mimic a selective advantage and to generate a progressive clonal dominance. At each step of the simulation, the relative abundance of each clone is updated according to the proliferation rate, and the proliferation rate for the next time point is updated with a random value from the normal distribution N ð0; 1). The proliferation rate of the "malignant" clone was chosen so that clonal dominance raised progressively and reached its maximum after 20 pseudotime steps. Random sampling with replacement was performed 20 times for different sample sizes (one-tenth to 10 times the total number of clones). Diversity indices were estimated for each sampling and a summary calculated for graphical representation (mean and standard deviation).

Datasets analyzed
The Shannon diversity index value of different participants to ex vivo gene therapy clinical trials listed in Table 1 were extracted from these publications using either the information in the text, tables, or figures. A logarithm base change was applied to Shannon index values when necessary to compare the different datasets.

DATA AVAILABILITY
Data are available upon request. Scripts are available at https://github. com/gcorre/GeneTherapyClonalDiversityIndex_study.