Leveraging homologous hypotheses for increased efficiency in tumor growth curve testing

Hutson, Alan D.; Yu, Han; Attwood, Kristopher

doi:10.1038/s41598-023-47202-9

Download PDF

Article
Open access
Published: 14 November 2023

Leveraging homologous hypotheses for increased efficiency in tumor growth curve testing

Alan D. Hutson¹,
Han Yu¹ &
Kristopher Attwood¹

Scientific Reports volume 13, Article number: 19890 (2023) Cite this article

386 Accesses
1 Altmetric
Metrics details

Subjects

Abstract

In this note, we present an innovative approach called “homologous hypothesis tests” that focuses on cross-sectional comparisons of average tumor volumes at different time-points. By leveraging the correlation structure between time-points, our method enables highly efficient per time-point comparisons, providing inferences that are highly efficient as compared to those obtained from a standard two-sample t test. The key advantage of this approach lies in its user-friendliness and accessibility, as it can be easily employed by the broader scientific community through standard statistical software packages.

Statistical analysis of comparative tumor growth repeated measures experiments in the ovarian cancer patient derived xenograft (PDX) setting

Article Open access 13 April 2021

Standardised survival probabilities: a useful and informative tool for reporting regression models for survival data

Article Open access 01 September 2022

Universal scaling laws rule explosive growth in human cancers

Article 10 August 2020

Introduction

Tumor growth modeling in pre-clinical cancer research is a pivotal analysis that has been extensively explored in countless research papers. However, the utilization of animal models in this context can impose significant financial burdens due to their high costs. Consequently, optimizing testing procedures for growth curve modeling becomes crucial to ensure cost-efficiency without compromising accuracy and reliability.

In terms of background, and without loss of generality, let us focus on a two group comparison of an experimental treatment A versus treatment B in terms of comparing changes in tumor volume over time. Figure 1 depicts a commonly encountered plot in the literature, showcasing the average tumor volume (${\text{mm}}^3$) for mice treated with IL-1Ra versus scrIL-1a. plotted against time¹. The scrIL-1a values are offset for easier readability.

One strategy for modeling rates of change in tumor volume generally assume that the log tumor volume has linear relationship with time², with time measured on a continuous scale. Zavrakidis et al.² recommend a linear regression model with an autoregressive (AR-1) covariance structure for analyzing log-transformed tumor volumes. This model effectively accounts for the correlation among repeated measurements per mouse and provides unbiased results in comparing tumor growth rates between treatment groups. However, the accuracy of the model’s performance depends on the correct specification of the variance-covariance structure, as misspecification can affect the type I error and coverage rates. A similar study was carried forth in patient derived xenograft models³.

A series of nonlinear mixed-effects models that mathematically describe tumor size dynamics in cancer patients undergoing anticancer drug treatment has been developed as the Drug Disease Model Resources (DDMoRe) repository for oncology models⁴. More recently, Forrest et al.⁵ propose a nonparametric approach to overcome the linearity assumptions using regression splines in a generalized additive mixed model to estimate group-level response trends in logarithmically scaled tumor volume. This approach improves the fidelity of describing nonlinear growth scenarios and enhances statistical power for detecting differences between treatment regimens. Vaghi et al.⁶ analyzed tumor growth kinetics using a nonlinear mixed-effects approach and found that the Gompertz model provided the best fit to the experimental data. They confirmed a correlation between the Gompertz model parameters and proposed a reduced Gompertz function that improved predictive accuracy and precision, offering potential clinical applications in personalized tumor age prediction based on limited diagnostic data.

Alternatively, when monitoring tumor volume at specific time intervals, a useful approach is the application of a standard mixed model analysis of variance. This method treats each time-point as a distinct category and incorporates factors such as treatment, time, and their interaction, providing a nonparametric perspective on the relationship between time and tumor volume. If a significant overall difference in growth curves is observed, the subsequent step involves examining cross-sectional comparisons at each time point as specific contrasts within the mixed model. In general, these contrasts are simplified to two-sample t tests assuming normality, disregarding the correlation structure between time points. The main objective of these cross-sectional analyses is to statistically determine the time point at which the growth curve diverges and ascertain whether the growth curves remain separated in subsequent measurements. This information proves valuable in understanding the temporal dynamics of tumor growth and treatment effects. This analytical approach can be argued to be the predominant analysis presented in the field of tumor volume growth.

In “Definition of a homologous hypothesis” Section, we provide a precise definition of the homologous hypothesis and draw a clear contrast between this approach and the current mean-based tests. Section “Regression framework for testing a homologous hypothesis” delves into formulating the homologous hypothesis within the regression framework, leading to a more concise presentation of the results. Additionally, in “Simulation study” section, we offer power comparisons between the homologous hypothesis test and the traditional two-sample t test, demonstrating the robustness and effectiveness of our method. To illustrate the practicality of our approach, we provide a real-life example in “Example” section, followed by concluding remarks in the final section. Our intention is to make this methodology accessible and applicable, fostering advancements in tumor volume analysis and facilitating broader adoption within the scientific community.

Definition of a homologous hypothesis

Let $Y_{x_{i},ij}$ denote the tumor volume for the ith animal, $i=1,2,\cdots , n$, at the jth time-point, $j=1,2, \cdots , m$, and let $x_i$ indicate the treatment assignment for the ith animal ($x_i=0$ for treatment A, $x_i=1$ for treatment B), with the total sample size denoted as $n=n_0+n_1$. In cross-sectional analyses the null hypothesis of interest is to compare the mean tumor volume between treatment A and treatment B at specific time, given as

$$\begin{aligned} H_0: E(Y_{0,j})=E(Y_{1,j}), \end{aligned}$$

(1)

where $E(Y_{0,j})$ and $E(Y_{1,j})$ are the expected values for tumor volumes for treatment A and treatment B at time j. The alternative hypothesis may be two-sided or one-sided depending upon the needs of the analyst. This test is generally carried out on the raw values or log-transformed tumor volume values using a two-sample t test.

Now, let us assume a linear relationship between between the mean tumor volume at time point j and time-point $j-1$ for treatment groups A and B, respectively, and given as follows:

$$\begin{aligned} E(Y_{0,j}|Y_{0,j-1}=y_{0,j-1})=E(Y_{0,j})+ \rho _{0,j} \frac{\sigma _{Y_{0,j}}}{\sigma _{Y_{0,j-1}}}(y_{0,j-1}-E(Y_{0,j-1})), \end{aligned}$$

(2)

$$\begin{aligned} E(Y_{1,j}|Y_{1,j-1}=y_{1,j-1})=E(Y_{1,j})+ \rho _{1,j} \frac{\sigma _{Y_{1,j}}}{\sigma _{Y_{1,j-1}}}(y_{1,j-1}-E(Y_{1,j-1})), \end{aligned}$$

(3)

where $\rho _{0,j}$ is the correlation between $Y_{0,j}$ and $Y_{0,j-1}$, $\rho _{1,j}$ is the correlation between $Y_{1,j}$ and $Y_{1,j-1}$, $\sigma _{Y_{0,j}}$ and $\sigma _{Y_{0,j-1}}$ are the standard deviations for $Y_{0,j}$ and $Y_{0,j-1}$, respectively, and $\sigma _{Y_{1,j}}$ and $\sigma _{Y_{1,j-1}}$ are the standard deviations for $Y_{1,j}$ and $Y_{1,j-1}$, respectively.

An immediate examination of Eqs. (2) and (3) reveals that $E[Y_{0,j}|Y_{0,j-1}=E(Y_{0,j-1})]=E(Y_{0,j})$ and $E[Y_{1,j}|Y_{1,j-1}=E(Y_{1,j-1})]=E(Y_{1,j})$. This interesting relationship suggests a potentially more efficient approach for testing (1), leveraging the correlation between $Y_{0,j}$ and $Y_{0,j-1}$, as well as $Y_{1,j}$ and $Y_{1,j-1}$. Furthermore, it is worth noting that $E(Y_{0,j}|Y_{0,j-1}=y_{0,j-1})=E(Y_{0,j})$ holds true when $\rho _{0,j}=0$, and similarly, $E(Y_{1,j}|Y_{1,j-1}=y_{1,j-1})=E(Y_{1,j})$ when $\rho _{1,j}=0$. In other words, no additional information is gained in cases where there is no correlation between time points. However, in general, tumor growth curve models exhibit a high degree of correlation between adjacent time points. The sets of dependence relationships between tumor volumes over time form the basis for our concept of a homologous hypothesis as an alternative to the standard cross-sectional hypothesis at (1) for comparing two means.

Definition of a homologous hypothesis

We define the homologous null hypothesis for time point j, $j>1$, as follows:

$$\begin{aligned} H_0: E(Y_{0,j}| Y_{0,j-1}=\bar{y}_{0,j-1})=E(Y_{1,j}|Y_{1,j-1}=\bar{y}_{1,j-1}), \end{aligned}$$

(4)

where $\bar{y}_{0,j-1}=\sum _{i=1}^n Y_{x_{i},ij-1} (1-x_i)/n_0$ and $\bar{y}_{1,j-1}=\sum _{i=1}^n Y_{x_{i},ij-1} x_i/n_1$ are the moment estimators for the expected tumor volumes $E(Y_{0,j-1})$ and $E(Y_{1,j-1})$, respectively, at time-point $j-1$.

The similarity between the standard cross-sectional null hypothesis at (1) and the homologous null hypothesis at (4) may be seen by noting that

$$\begin{aligned} E(Y_{0,j}| Y_{0,j-1}=\bar{y}_{0,j-1})=E(Y_{0,j})+ \rho _{0,j} \frac{\sigma _{Y_{0,j}}}{\sigma _{Y_{0,j-1}}}(\bar{y}_{0,j-1}-E(Y_{0,j-1})), \end{aligned}$$

(5)

$$\begin{aligned} E(Y_{1,j}|Y_{1,j-1}=\bar{y}_{1,j-1})=E(Y_{1,j})+ \rho _{1,j} \frac{\sigma _{Y_{1,j}}}{\sigma _{Y_{1,j-1}}}(\bar{y}_{1,j-1}-E(Y_{1,j-1})), . \end{aligned}$$

(6)

i.e., $E(Y_{0,j}| Y_{0,j-1}\bar{y}_{0,j-1})$ is within a neighborhood of $E(Y_{0,j})$ and $E(Y_{1,j}| Y_{0,j-1}\bar{y}_{1,j-1})$ is within a neighborhood of $E(Y_{1,j})$.

In particular, through standard central limit arguments with bounded variances assumed, we have $\bar{y}_{0,j-1}\overset{p}{\rightarrow } E(Y_{0,j-1})$ and $\bar{y}_{1,j-1} \overset{p}{\rightarrow }\ E(Y_{1,j-1})$ as $n \rightarrow \infty $. Therefore, in an asymptotic sense, the homologous hypothesis stated in (4) can be considered equivalent to the standard cross-sectional hypothesis in (1). In other words we can rewrite the homologous null hypothesis at (4) as

$$\begin{aligned} H_0: E(Y_{0,j})-E(Y_{1,j})=o_p(1). \end{aligned}$$

(7)

The standard cross-sectional null hypothesis at (1) and the homologous null hypothesis at (4) exhibit subtle differences, except when $\rho _{0,j}=0$ and $\rho _{1,j}=0$. However, the primary reason for rejecting the homologous null hypothesis lies in the discrepancies between the population mean growth tumor volumes $E(Y_{1,j})-E(Y_{0,j})$ at time j. Emphasizing this point, if the investigator is willing to accept these subtle distinctions between the standard cross-sectional null hypothesis and the homologous null hypothesis substantial gains in statistical efficiency may be achieved. This can be accomplished by capitalizing on the correlation structure between successive tumor growth values over time. This in turn can reduce sample size requirements dramatically, where certain animal models may cost several thousand dollars per unit.

Furthermore, if the parameters for Eqs. (2) and (3) are estimated via standard least-squares regression of $Y_{0,j}$ on $Y_{0,j-1}$ and $Y_{1,j}$ on $Y_{1,j-1}$ we arrive at the following estimators:

$$\begin{aligned} \hat{E}(Y_{0,j}|Y_{0,j-1}=y_{0,j-1})=\bar{y}_{0,j}+ \hat{\rho }_{0,j} \frac{\hat{\sigma }_{Y_{0,j}}}{\hat{\sigma }_{Y_{0,j-1}}}(y_{0,j-1}-\bar{y}_{0,j-1}), \end{aligned}$$

(8)

$$\begin{aligned} \hat{E}(Y_{1,j}|Y_{1,j-1}=y_{1,j-1})=\bar{y}_{1,j}+ \hat{\rho }_{1,j} \frac{\hat{\sigma }_{Y_{1,j}}}{\hat{\sigma }_{Y_{1,j-1}}}(y_{1,j-1}-\bar{y}_{1,j-1}), \end{aligned}$$

(9)

where $\bar{y}_{0,j-1}$ and $\bar{y}_{1,j-1}$ are defined above,

$$\begin{aligned} \bar{y}_{0,j}= & {} \hat{E}(Y_{0,j}) =\frac{\sum _{i=1}^n Y_{x_{i},ij} (1-x_i)}{n_0} \;\; \text{ and } \;\; \bar{y}_{1,j} = \hat{E}(Y_{1,j})=\frac{\sum _{i=1}^n Y_{x_{i},ij} x_i}{n_1} \\ \hat{\sigma }^2_{Y_{0,j}}= & {} \frac{\sum _{i=1}^n (1- x_i)( Y_{x_{i},ij}-\bar{y}_{0,j})^2}{n_0-1} \;\; \text{ and } \;\; \hat{\sigma }^2_{Y_{0,j-1}} = \frac{\sum _{i=1}^n (1-x_i)( Y_{x_{i},ij-1}-\bar{y}_{0,j-1})^2}{n_0-1}, \\ \hat{\sigma }^2_{Y_{1,j}}= & {} \frac{\sum _{i=1}^n x_i( Y_{x_{i},ij}-\bar{y}_{1,j})^2}{n_1-1} \;\; \text{ and } \;\; \hat{\sigma }^2_{Y_{1,j-1}} = \frac{\sum _{i=1}^n x_i( Y_{x_{i},ij-1}-\bar{y}_{1,j-1})^2}{n_1-1}, \\ \hat{\rho }_{0,j}= & {} \frac{\sum _{i=1}^n (1- x_i)( Y_{x_{i},ij}-\bar{y}_{0,j})( Y_{x_{i},ij-1}-\bar{y}_{0,j-1})}{\sqrt{\sum _{i=1}^n (1- x_i)( Y_{x_{i},ij}-\bar{y}_{0,j})^2}\sqrt{\sum _{i=1}^n (1- x_i)( Y_{x_{i},ij-1}-\bar{y}_{0,j-1})^2}} \\ \hat{\rho }_{1,j}= & {} \frac{\sum _{i=1}^n x_i( Y_{x_{i},ij}-\bar{y}_{1,j})( Y_{x_{i},ij-1}-\bar{y}_{1,j-1})}{\sqrt{\sum _{i=1}^n x_i( Y_{x_{i},ij}-\bar{y}_{1,j})^2}\sqrt{\sum _{i=1}^n x_i( Y_{x_{i},ij-1}-\bar{y}_{1,j-1})^2}}. \end{aligned}$$

Now, it should be clear from (8) and (9) that the sample estimators for the conditional and unconditional are identically the sample mean at time j, i.e.,

$$\begin{aligned} \hat{E}(Y_{0,j}|Y_{0,j-1}=\bar{y}_{0,j-1})&= {} \hat{E}(Y_{0,j})=\bar{y}_{0,j}, \end{aligned}$$

(10)

$$\begin{aligned} \hat{E}(Y_{1,j}|Y_{1,j-1}=\bar{y}_{1,j-1})&= {} \hat{E}(Y_{1,j})=\bar{y}_{1,j}. \end{aligned}$$

(11)

However,

$$\begin{aligned} \text{ Var }[Y_{0,j}|Y_{0,j-1}=y_{0,j-1})]&= {} \text{ Var }(Y_{0,j})(1-\rho _{0,j}^2), \end{aligned}$$

(12)

$$\begin{aligned} \text{ Var }[Y_{1,j}|Y_{1,j-1}=y_{1,j-1})]&= {} \text{ Var }(Y_{1,j})(1-\rho _{1,j}^2). \end{aligned}$$

(13)

Thus if the correlation between tumor volumes at time point j and $j-1$ is strong a high degree of efficiency can be gained in terms of testing the homologous hypothesis stated at (4) as compared to the standard cross-sectional hypothesis in (1).

The variance estimates for $\hat{E}(Y_{0,j}|Y_{0,j-1}=\bar{y}_{0,j-1})$ and $ \hat{E}(Y_{1,j}|Y_{1,j-1}=\bar{y}_{1,j-1})$ at (10) and (11), respectively, follow from standard least- squares theory and are as follows:

$$\begin{aligned} \widehat{\text{ Var }}(\hat{E}(Y_{0,j}|Y_{0,j-1}= & {} \bar{y}_{0,j-1}))=\frac{MSE_0}{n_0}, \end{aligned}$$

(14)

$$\begin{aligned} \widehat{\text{ Var }}(\hat{E}(Y_{1,j}|Y_{1,j-1}= & {} \bar{y}_{1,j-1}))=\frac{MSE_1}{n_0}, \end{aligned}$$

(15)

where

$$\begin{aligned} MSE_0= & {} \frac{\sum _i^{n_{0}}[Y_{0,ij}- \hat{E}(Y_{0,j}|Y_{0,j-1}=y_{0,ij-1})]^2}{n_0-2}, \end{aligned}$$

(16)

$$\begin{aligned} MSE_1= & {} \frac{\sum _i^{n_{1}}[Y_{1,ij}- \hat{E}(Y_{1,j}|Y_{1,j-1}=y_{1,ij-1})]^2}{n_1-2}, \end{aligned}$$

(17)

and $\hat{E}(Y_{0,j}|Y_{0,j-1}=y_{0,j-1})$ and $\hat{E}(Y_{1,j}|Y_{1,j-1}=y_{1,j-1})$ are given at (8) and (9), respectively.

Regression framework for testing a homologous hypothesis

We can create a more streamlined approach for testing the homologous hypothesis

$$\begin{aligned} H_0: E(Y_{0,j}| Y_{0,j-1}=\bar{y}_{0,j-1})=E(Y_{1,j}|Y_{1,j-1}=\bar{y}_{1,j-1}), \end{aligned}$$

(18)

at time point j, $j>1$, using a regression framework. Combining Eqs. (5) and (6) into a single regression framework at time j we arrive at the model

$$\begin{aligned} Y_{x_i, ij}=\beta _{0,j} + \beta _{1,j} x_i+ \beta _{2,j} y_{x_i,j-1}+ \beta _{3,j} x_i y_{x_i,j-1}+\epsilon _{ij}, i=1,2, \cdots , n, \end{aligned}$$

(19)

where as before $Y_{x_{i},ij}$ denotes the tumor volume for the ith animal, $i=1,2,\cdots , n$, at the jth time-point, $j=1,2, \cdots , m$, $x_i$ indicates the treatment assignment for the ith animal ($x_i=0$ for treatment A, $x_i=1$ for treatment B) and the $\epsilon _i$’s are assumed independent and identically distributed (i.i.d.), $\epsilon _i \sim N(0,\sigma _j^2)$. As will be evident from the discussion below, the regression framework provides a streamlined approach for estimating the standard errors of our quantities of interest and utilizing the well-known classical inferential framework.

Defining the regression model at (19) allows us to reformulate our conditional estimators $ \hat{E}(Y_{0,j}| Y_{0,j-1}=\bar{y}_{0,j-1})$ at (8) and $\hat{E}(Y_{1,j}| Y_{1,j-1}=\bar{y}_{1,j-1})$ at (9) as follows:

$$\begin{aligned} \hat{E}(Y_{0,j}| Y_{0,j-1}=\bar{y}_{0,j-1})= & {} \hat{\beta }_{0,j} + \hat{\beta }_{2,j} \bar{y}_{0,j-1}=\bar{y}_{0,j}, \end{aligned}$$

(20)

$$\begin{aligned} \hat{E}(Y_{1,j}| Y_{1,j-1}=\bar{y}_{1,j-1})= & {} \hat{\beta }_{0,j} + \hat{\beta }_{1,j} + (\hat{\beta }_{2,j} +\hat{\beta }_{3,j}) \bar{y}_{1,j-1} =\bar{y}_{1,j}, \end{aligned}$$

(21)

where $\hat{\beta }_{0,j}, \hat{\beta }_{1,j}, \hat{\beta }_{2,j}$ and $\hat{\beta }_{3,j}$ are standard least-squares regression slope estimators. This approach leads to the mean difference estimator

$$\begin{aligned} \hat{D}_j= & {} \hat{E}(Y_{1,j}| Y_{1,j-1}=\bar{y}_{1,j-1})=\bar{y}_{1,j-1})- \hat{E}(Y_{0,j}| Y_{0,j-1}=\bar{y}_{0,j-1}) \nonumber \\= & {} \hat{\beta }_{1,j} + \hat{\beta }_{2,j} (\bar{y}_{1,j-1}- \bar{y}_{0,j-1}) + \hat{\beta }_{3,j} \bar{y}_{1,j-1}. \nonumber \\= & {} \bar{y}_{1,j}-\bar{y}_{0,j} \end{aligned}$$

(22)

In terms of matrix formulation let

$$ {\mathbf{Y}}_{j} = \left[ {\begin{array}{*{20}l} {Y_{{0,1j}} } \\ {Y_{{0,2j}} } \\ \vdots \\ {Y_{{0,n_{0} j}} } \\ {Y_{{1,n_{0} + 1j}} } \\ {Y_{{1,n_{0} + 2j}} } \\ \vdots \\ {Y_{{1,n_{0} + n_{1} j}} } \\ \end{array} } \right]\;{\mathbf{X}}_{j} = \left[ {\begin{array}{*{20}l} 1 & 0 & {y_{{0,1j - 1}} } & 0 \\ 1 & 0 & {y_{{0,2j - 1}} } & 0 \\ \vdots & \vdots & {} & {} \\ 1 & 0 & {y_{{0,n_{0} j - 1}} } & 0 \\ 1 & 1 & {y_{{1,n_{0} + 1j - 1}} } & {y_{{1,n_{0} + 1j - 1}} } \\ 1 & 1 & {y_{{1,n_{0} + 2j - 1}} } & {y_{{1,n_{0} + 2j - 1}} } \\ \vdots & \vdots & {} & {} \\ 1 & 1 & {y_{{1,n_{0} + n_{1} j - 1}} } & {y_{{1,n_{0} + n_{1} j - 1}} } \\ \end{array} } \right]. $$

(23)

The regression model at (19) may be re-written compactly in matrix form as

$$\begin{aligned} \textbf{Y}_j= \textbf{X}_j\varvec{\beta }_j+ \varvec{\epsilon }_j, \end{aligned}$$

(24)

where the vector of $\epsilon _{ij}$’s are i.i.d. $N(0,\sigma _j^2)$ and $\textbf{Y}_j$ and $ \textbf{X}_j$ are given at (23).

It follows from standard linear model least-squares theory that the $4 \times 1$ vector of regression coefficient estimators has the following form:

$$\begin{aligned} \hat{\varvec{\beta }}_j=(\textbf{X}_j'{} \textbf{X}_j)^{-1} \textbf{X}_j'\textbf{Y}_j. \end{aligned}$$

(25)

To obtain the regression form of the estimator $\hat{D}_j $ at (22) first define the $1 \times 4$ vector:

$$\begin{aligned} \textbf{z}_{D_{j}}= & {} (0, 1, \bar{y}_{1,j-1}-\bar{y}_{0,j-1}, \bar{y}_{1,j-1}). \end{aligned}$$

(26)

Then

$$\begin{aligned} \hat{D}_j= & {} \textbf{z}_D \hat{\varvec{\beta }}_j. \end{aligned}$$

(27)

The sample variance of $\hat{D}_j$ (27) based on standard linear models formulations is given as:

$$\begin{aligned} s^2_{\hat{D}_j}= & {} \textbf{z}_{D_{j}} s^2(\hat{\varvec{\beta }}_j) \textbf{z}_{D_{j}}' , \end{aligned}$$

(28)

where $SSE_j=\textbf{Y}_j'{} \textbf{Y}_j-\hat{\varvec{\beta }}_j' \textbf{X}_j'{} \textbf{Y}_j,$ $MSE_j = SSE_j/(n-4),$ and $s^2(\hat{\varvec{\beta }}_j) = MSE_j (\textbf{X}_j'\textbf{X}_j)^{-1}$.

Under model assumptions stated at ( 24 ) we have that

$$\begin{aligned} \frac{\hat{D}_j-D}{s_{\hat{D}_j}} \sim t_{n-4}, \end{aligned}$$

(29)

where $\hat{D}_j$ is defined at (28) and $s^2_{\hat{D}_j}$ is defined at (28). The distributional result at (29) follows from standard least-squares theory. The homologous hypothesis test is available within the R homologous package available at GitHub (https://github.com/hyu-ub/homologous).

Simulation study

Study 1

We conducted a simulation study using the regression model specified in Eq. (19). To simplify our analysis, we assumed that $\beta _{0,j}$ is set to zero, without any loss of generality. In Tables 3, 4 and 5, you’ll find the results of simulated statistical power for testing the homologous hypothesis, as defined in equation (4), compared to the standard cross-sectional hypothesis outlined in Eq. (1). We varied the values of parameters such as $\sigma $, $\beta _{1,j}$, $\beta _{2,j}$, and $\beta _{3,j}$, while maintaining a significance level of $\alpha =0.05$ with two-sided alternative hypotheses.

Each simulation run encompassed 10,000 Monte Carlo replications. In each replication, we generated samples for the variable $y_{x_i,j-1}$ from a standard normal distribution with a sample size of $n=10$, equally divided between two experimental groups. In practical terms, this simulation is akin to analyzing data based on log-transformed tumor volumes.

It’s worth noting that when $\beta _{2,j}=0$ and $\beta _{3,j}=0$, the homologous hypothesis and the standard cross-sectional hypothesis are essentially equivalent, with only minor differences in the test statistics due to variations in the degrees of freedom in the null t-distributions.

As previously mentioned, the homologous hypothesis (4) and the standard cross-sectional hypothesis (1) share similarities but are not entirely equivalent. To calibrate the simulation results, we establish the equality: $ \rho _{0,j} \frac{\sigma _{Y_{0,j}}}{\sigma _{Y_{0,j-1}}}(\bar{y}_{0,j-1}-E(Y_{0,j-1})) = \rho _{1,j} \frac{\sigma _{Y_{1,j}}}{\sigma _{Y_{1,j-1}}}(\bar{y}_{1,j-1}-E(Y_{1,j-1})) $ within the homologous testing framework. Consequently, the primary factor influencing the power values is the disparity between Treatment A and Treatment B. By allowing the above equality to vary across replications, the power values for the homologous test would exhibit an increase.

The correlation between time point $j-1$ and time point j in log-transformed tumor volumes varies as $\sigma $ changes from 0.4 to 1, with Table 3 showing the highest correlation and Table 5 the lowest. As expected, when $\beta _{2,j}=0$ and $\beta _{3,j}=0$, both tests yield nearly equivalent results. However, the power of the homologous test is significantly enhanced in cases of high correlation between time points, as demonstrated in Table 3 where $\sigma =0.4$, $\beta _{1,j}=1$, $\beta _{2,j}=1$, and $\beta _{3,j}=0$, resulting in a power of 0.968, compared to 0.386 for the standard two-sample t test. Moreover, even with moderate correlation between time points, there are still considerable power gains. For example, in Table 3, when $\sigma =1$, $\beta _{1,j}=1$, $\beta _{2,j}=1$, and $\beta _{3,j}=0$, the power of the homologous test is 0.424, compared to 0.282 for the standard two-sample t test. These findings highlight the significance of considering correlation between time points when conducting tests, as it can lead to substantial improvements in statistical power.

Study 2

In our ongoing investigation, we conducted a second simulation study to further scrutinize the homologous test in comparison to the linear mixed model (LMM) when treating time as a continuous variable. This time, we generated data involving six time points and two distinct groups following the model specified as:

$$\begin{aligned} Y_{ij} = \beta _{g_i, j} + \epsilon _{ij}, \end{aligned}$$

(30)

In this equation, the variables $Y_{ij}$ and $\epsilon _{ij}$ represent the outcomes and observation errors for observation i at time j. The error term $\epsilon _i$ for each individual i exhibits a compound symmetry covariance structure, with diagonal elements set to 1 and off-diagonal elements set to 0.5. The variable $g_i$ indicates the group assignment for the ith observation, taking on values of 1 and 2. The parameter $\beta _{g_i, j}$ signifies the expected outcome for group $g_i$ at time j.

For this simulation study, we considered six distinct scenarios for $\beta _{g_i, j}$ (as illustrated in Fig. 2). We maintained a fixed sample size of $n=16$ per group, ensuring that the two-sided independent sample t test yields 78% power when the difference between the means of the two groups is 1, at a significance level of 0.05. This is given the assumption of a standard deviation of the error term equal to 1.

We compared the performance of the homologous test, the standard two-sample t test, and the LMM that incorporates time by group interaction effects and random intercepts. Our primary focus is on assessing the difference between the two groups at the final time point. Each simulation iteration encompassed 10,000 Monte Carlo replications, providing robust results for our analysis.

Table 1 provides insights into the rejection probabilities across six distinct scenarios. In scenarios 1-3, where no differences exist in group means at the final time point, the rejection probabilities correspond to type I error rates. The results clearly demonstrate that both the homologous test and the standard two-sample t test effectively control type I errors at the desired level. However, in scenarios 2 and 3, characterized by non-linear patterns, the Linear Mixed Model (LMM) exhibited a notable inflation in type I error.

Moving to scenarios 4–6, the rejection probabilities represent statistical power. In all three cases, the homologous test exhibited superior power when compared to the standard two-sample t test. This is attributed to the homologous test’s efficient utilization of information from previous time points. When the linear assumption holds, as in scenario 4, the homologous test and LMM demonstrated similar power. However, when this assumption doesn’t hold, as seen in scenarios 5 and 6, the homologous test exhibited higher power under the studied conditions.

In summary, when compared to the standard t test and LMM, the homologous test outperformed in terms of type I error control and efficiency, proving its effectiveness across a range of scenarios.

Table 1 The rejection probabilities from the simulation Study 2.

Full size table

Example

To demonstrate our method, we analyzed the tumor growth curves from the study by Sass et al.¹, which showed that the IL-1$\alpha $ expression facilitate tumor cell proliferation. The data showed that IL-1$\alpha $ knockdown by shIL-1$\alpha $ can delay the tumor growth when compared with the control group (scrIL-1$\alpha $). In addition, the blockade of IL-1$\alpha $ paracrine effect by a natural antagonist IL-1Ra also resulted in a significant delay in tumor growth. Here we evaluated the homologous hypothesis test and traditional two-sample t test in comparing the tumor volumes between the scrIL-1$\alpha $ and IL-1Ra groups across the time points from day 3 to day 24 ($n=4$ in each group). Table 2 and Figure 3 show that the standard errors of the estimated mean tumor volumes are remarkably smaller than those from the standard method. Correspondingly, the homologous test achieves higher power than the two-sample t test. This makes it possible to detect the difference between two groups at day 21 (Tables 3, 4, 5). On the other hand, the t test did not find any significant differences at $\alpha =0.05$. It is notable that this significant gain in power is attributed to the high correlation between tumor volumes at neighboring time points of measurements (Table 2).

Table 2 The estimated mean tumor volumes and standard errors (SEs) using the proposed and conventional methods. The Pearson’s correlation coefficients with tumor volume at previous time point $\rho $ and p-values from the test of homologous hypothesis and two-sample t tests are also shown.

Full size table

Table 3 Simulated power results given $\sigma $, $\beta _{1,j}$ , $\beta _{2,j}$ and $\beta _{3,j}$ for the homologous and standard cross-sectional hypotheses.

Full size table

Table 4 Simulated power results given $\sigma $, $\beta _{1,j}$ , $\beta _{2,j}$ and $\beta _{3,j}$ for the homologous and standard cross-sectional hypotheses.

Full size table

Table 5 Simulated power results given $\sigma $, $\beta _{1,j}$ , $\beta _{2,j}$ and $\beta _{3,j}$ for the homologous and standard cross-sectional hypotheses.

Full size table

Conclusions

In this manuscript, we have presented a straightforward approach to harnessing the correlation structure between time-points in a cross-sectional analysis of mean tumor volumes. Our novel method, the homologous hypothesis approach, offers significant advantages in terms of statistical power, especially when faced with a fixed sample size or the need to reduce sample sizes and costs while maintaining a fixed power, as compared to the traditional t test.

One of the key strengths of our method is its simplicity, as it allows for a clear and efficient implementation of the analysis using time moving forward. Nevertheless, we recognize that there are opportunities for further advancements and extensions to our approach.

For instance, future investigations could explore the use of multiple time-points in either direction along the time scale. Incorporating additional time-points could potentially enhance the precision of our results and provide a more comprehensive understanding of the treatment effects over time.

Furthermore, an exciting avenue for future research lies in developing methods to combine p-values across multiple tests for a global assessment of treatment effects over time. This would offer a more holistic perspective on the efficacy of the treatments under investigation and could lead to more robust and insightful conclusions.

In conclusion, our work represents an important step towards a more powerful and flexible approach for analyzing mean tumor volumes in cross-sectional studies. While we have presented the most straightforward version of our method using time moving forward, there is considerable potential for further enhancement and expansion, which could open up new possibilities for the analysis of time-dependent data in medical research. We hope that our findings will inspire further investigations and foster the development of innovative statistical methods in this domain.

Data availability

All data generated or analysed during this study are included in a supplemental file.

References

Sass, S. N. et al. Tumor-associated myeloid cells promote tumorigenesis of non-tumorigenic human and murine prostatic epithelial cell lines. Cancer Immunol. Immunother. 67(6), 873–883. https://doi.org/10.1007/s00262-018-2143-y (2018) (Epub 2018 Mar 3. PMID: 29502208; PMCID: PMC5951898.).
Article CAS PubMed PubMed Central Google Scholar
Zavrakidis, I., Jóźwiak, K. & Hauptmann, M. Statistical analysis of longitudinal data on tumour growth in mice experiments. Sci. Rep. 10(1), 9143. https://doi.org/10.1038/s41598-020-65767-7 (2020) (PMID: 32499558; PMCID: PMC7272435.).
Article ADS CAS PubMed PubMed Central Google Scholar
Oberg, A. L. et al. Statistical analysis of comparative tumor growth repeated measures experiments in the ovarian cancer patient derived xenograft (PDX) setting. Sci. Rep. 11(1), 8076. https://doi.org/10.1038/s41598-021-87470-x (2021) (PMID: 33850213; PMCID: PMC8044116.).
Article ADS CAS PubMed PubMed Central Google Scholar
Ribba, B. et al. A review of mixed-effects models of tumor growth and effects of anticancer drug treatment used in population analysis. CPT Pharmacomet. Syst. Pharmacol. 3(5), e113. https://doi.org/10.1038/psp.2014.12 (2014) (PMID: 24806032; PMCID: PMC4050233.).
Article CAS Google Scholar
Forrest, W. F. et al. Generalized additive mixed modeling of longitudinal tumor growth reduces bias and improves decision making in translational oncology. Cancer Res. 80(22), 5089–5097. https://doi.org/10.1158/0008-5472.CAN-20-0342 (2020) (Epub 2020 Sep 25. PMID: 32978171.).
Article CAS PubMed Google Scholar
Vaghi, C. et al. Population modeling of tumor growth curves and the reduced Gompertz model improve prediction of the age of experimental tumors. PLoS Comput. Biol. 16(2), e1007178. https://doi.org/10.1371/journal.pcbi.1007178 (2020) (PMID: 32097421; PMCID: PMC7059968.).
Article ADS CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

This work was supported by a National Cancer Institute (NCI) Cancer Center Support Grant (CCSG) to Roswell Park Comprehensive Cancer Center (grant no. P30CA016056), and the following three NCI grants to Dr. Hutson: NRG Oncology Statistical and Data Management Center grant (grant no. U10CA180822); Immuno-Oncology Translational Network (IOTN) Moonshot grant (grant no. U24CA232979-01); Acquired Resistance to Therapy Network (ARTN) grant (grant no.U24CA274159-01). No potential competing interests were reported by the authors. We wish to thank the reviewers for their thoughtful comments, which led to an improved version of our work.

Author information

Authors and Affiliations

Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Elm and Carlton Streets, Buffalo, NY, 14623, USA
Alan D. Hutson, Han Yu & Kristopher Attwood

Authors

Alan D. Hutson
View author publications
You can also search for this author in PubMed Google Scholar
Han Yu
View author publications
You can also search for this author in PubMed Google Scholar
Kristopher Attwood
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

A.H. wrote the main manuscript H.Y. prepared the example section and developed the corresponding statistcal software K.A., A.H., H.Y. had weekly meetings to develop the underlying theory. K.A. edited the manuscript.

Corresponding author

Correspondence to Alan D. Hutson.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Hutson, A.D., Yu, H. & Attwood, K. Leveraging homologous hypotheses for increased efficiency in tumor growth curve testing. Sci Rep 13, 19890 (2023). https://doi.org/10.1038/s41598-023-47202-9

Download citation

Received: 07 August 2023
Accepted: 10 November 2023
Published: 14 November 2023
DOI: https://doi.org/10.1038/s41598-023-47202-9

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.