Simultaneous confidence band for the difference of segmented linear models

https://doi.org/10.1016/j.jspi.2010.09.011

Abstract

Consider comparing two treatments with respect to a response variable whose expectation depends on the value of a continuous covariate in some nonlinear fashion. We fit a separate segmented linear model to each treatment to approximate the nonlinear relationship. For this setting, we provide a simultaneous confidence band for the difference between the treatments' expected value functions. The treatments are said to differ significantly on intervals of the covariate where the simultaneous confidence band does not contain zero. We consider segmented linear models with known changepoint locations as well as models with unknown changepoint locations. The band is obtained from asymptotic results.

Section snippets

Introduction and background

In many settings, one wants to compare the effects of two treatments when the expected treatment responses depend on the values of a covariate. If the expected response is linear in the covariate and the treatment effect is identical regardless of the covariate value, i.e., the response functions are parallel, analysis of covariance is the standard approach. However, when the treatment effect varies depending on the value of the covariate, the parallelism assumption fails to hold. In the linear

Segmented linear model with unknown changepoints for two treatments

Let $f$ define an $m$-segment continuous piecewise linear function with parameter vector $\theta=(\theta_{11},\theta_{12},\ldots,\theta_{m1},\theta_{m2})^{T}$ by

$$f(x,\theta)=\theta_{11}+\theta_{12}x+\sum_{k=2}^{m}\theta_{k2}\,(x-\theta_{k1})\,\Delta[x>\theta_{k1}],$$

where $\Delta[A]$ is the indicator function of the set $A$. The parameters have the following interpretations: $\theta_{11}$ is the $y$ intercept of the first segment; $\theta_{12}$ is the slope of the first segment; and, for $k=2,\ldots,m$, $\theta_{k1}$ is the $x$ coordinate of the changepoint of the $k$th segment and $\theta_{k2}$ is the change in slope of the $k$th segment relative to the $(k-1)$st, where we assume,
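As a minimal sketch of how this parametrization can be evaluated, the function below computes the segmented mean for a flat parameter vector ordered as intercept, first slope, then (changepoint, change in slope) pairs. The function name `segmented_mean` is hypothetical; the two example parameter vectors are the ones implied by the mean functions quoted in the simulation section.

```python
def segmented_mean(x, theta):
    """Evaluate theta11 + theta12*x + sum_{k>=2} theta_k2 * (x - theta_k1) * 1[x > theta_k1].

    theta is the flat vector (theta11, theta12, theta21, theta22, ..., theta_m1, theta_m2).
    """
    if len(theta) < 2 or len(theta) % 2 != 0:
        raise ValueError("theta must hold an even number (at least 2) of parameters")
    intercept, first_slope = theta[0], theta[1]
    value = intercept + first_slope * x
    # Each later pair is (changepoint, change in slope at that changepoint).
    for k in range(2, len(theta), 2):
        changepoint, slope_change = theta[k], theta[k + 1]
        if x > changepoint:  # strict inequality, matching the indicator in the model
            value += slope_change * (x - changepoint)
    return value


# Parameter vectors implied by the two mean functions in the simulation section:
# (x - 6) 1[x > 6]  and  0.25 x + (x - 4) 1[x > 4].
theta_1 = (0.0, 0.0, 6.0, 1.0)
theta_2 = (0.0, 0.25, 4.0, 1.0)

# Difference of the two expected value functions at a single covariate value.
difference_at_8 = segmented_mean(8.0, theta_1) - segmented_mean(8.0, theta_2)
```

Because the function is continuous at each changepoint, using a strict or non-strict inequality in the indicator gives the same value.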

Asymptotic background

Cox and Ma (1995) give a method for constructing a simultaneous confidence band about the mean function for generalized nonlinear regression models. Following their notation, we let $Y_1,\ldots,Y_n$ be $n$ independent random variables with associated covariates $x_1,\ldots,x_n$ and CDFs $H_1(y\mid x_1,\theta),\ldots,H_n(y\mid x_n,\theta)$, $\theta=(\theta_1,\ldots,\theta_p)^{T}\in\Theta$. Cox and Ma (1995) provide a simultaneous confidence band for the mean $E_{\theta}(Y)=g(x,\theta)$ viewed as a function of $x$, where $g(x,\theta)$ is a known (suitable) function depending on an unknown parameter $\theta$.

Simultaneous confidence band

The approach we follow in obtaining simultaneous confidence bands is to first apply the ideas of Cox and Ma with the awareness that certain continuity requirements are at issue for the general case of unknown changepoints. As a follow-up to this, we show that in the simpler case of known changepoints, Cox and Ma’s approach can be directly applied to obtain simultaneous confidence bands. For the model with unknown changepoints, the essence of the concern is that the resulting band for covariate

Practical issues

To handle the numerical difficulties of fitting segmented linear models with unknown changepoints to data, we follow Larson’s (1992) method, which finds global MLEs. Larson details the two-segment linear model and describes an iterative method for an m-segment linear model. We note that his method becomes computationally prohibitive if more than a few changepoints are required (SAS/IML® code which implements Larson’s (1992) estimation procedure for two-segment linear models is available from the
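To give a feel for the fitting problem, here is a rough sketch of a profile least squares grid search for a two-segment model. This is not Larson's (1992) algorithm: it only minimizes the residual sum of squares over a user-supplied grid of candidate changepoints, so it finds the best grid point rather than a global optimum. All function names are hypothetical. The least squares fit coincides with the MLE only under the added assumption of normal errors with constant variance.

```python
def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design for this changepoint")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = m[r][n] - sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / m[r][r]
    return sol


def fit_given_changepoint(xs, ys, c):
    """Least squares fit of intercept, slope, and slope change with the changepoint fixed at c."""
    rows = [(1.0, x, max(x - c, 0.0)) for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    beta = solve_3x3(xtx, xty)  # normal equations
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2 for r, y in zip(rows, ys))
    return beta, rss


def grid_search_changepoint(xs, ys, candidates):
    """Return (changepoint, coefficients, rss) for the candidate with the smallest RSS."""
    best = None
    for c in candidates:
        try:
            beta, rss = fit_given_changepoint(xs, ys, c)
        except ValueError:
            continue  # skip candidates that give a rank-deficient design
        if best is None or rss < best[2]:
            best = (c, beta, rss)
    if best is None:
        raise ValueError("no candidate changepoint gave a usable fit")
    return best


# Noise-free illustration with a kink at 4 (illustrative values only).
xs_demo = [10.0 * j / 21 for j in range(1, 21)]
ys_demo = [0.25 * x + max(x - 4.0, 0.0) for x in xs_demo]
grid_demo = [1.0 + 0.5 * i for i in range(17)]  # 1.0, 1.5, ..., 9.0
c_hat, beta_hat, rss_hat = grid_search_changepoint(xs_demo, ys_demo, grid_demo)
```

The cost of such a search grows quickly with the number of changepoints, since every combination of candidate locations needs its own linear fit.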

Application

Our method is illustrated with data from R.A. Cook’s unpublished Ph.D. dissertation as published in Bacon and Watts (1971, Tables 2 and 4, Examples 1 and 2). The experimental data were obtained by Cook during his investigation of the behavior of stagnant surface layer height (also referred to as “band height”) in a controlled flow of water down an inclined channel using different surfactants. As noted by Bacon and Watts, a two-segment linear model is quite plausible for these data.

Fig. 1 shows

Simulation of coverage probabilities

We simulated the coverage probabilities for a particular pair of 2-segment linear models with varying common error variances, sample sizes, and sets of observed covariates. For the first treatment group, $f(x,\theta^{(1)})=(x-6)\,\Delta[x>6]$, and for the second treatment group, $f(x,\theta^{(2)})=0.25x+(x-4)\,\Delta[x>4]$. The simulations used fixed covariates equally spaced on (0, 10), where $x_{ij}=10j/(n_i+1)$, $j=1,\ldots,n_i$. The choices for the common sample size were $n_1=n_2=10$, 20, 50, 80, 150, 250, and 500 and for the common error
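The design described above can be sketched as follows. The covariate grid and the two mean functions are taken from the text; the normal error distribution, the standard deviation value 0.5, the seed, and all function names are assumptions made only for illustration, since the error settings are not given in this excerpt.

```python
import random


def design_points(n):
    """Fixed covariates x_j = 10 j / (n + 1), j = 1, ..., n, equally spaced on (0, 10)."""
    return [10.0 * j / (n + 1) for j in range(1, n + 1)]


def mean_treatment_1(x):
    """(x - 6) 1[x > 6]: flat at zero up to the changepoint at 6, then slope 1."""
    return (x - 6.0) if x > 6.0 else 0.0


def mean_treatment_2(x):
    """0.25 x + (x - 4) 1[x > 4]: slope 0.25 up to the changepoint at 4, then slope 1.25."""
    return 0.25 * x + ((x - 4.0) if x > 4.0 else 0.0)


def simulate_responses(n, sigma, mean_fn, rng):
    """Draw one data set: mean function plus independent errors (normality is an assumption)."""
    xs = design_points(n)
    ys = [mean_fn(x) + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


rng = random.Random(0)   # fixed seed so the sketch is reproducible
sigma = 0.5              # hypothetical common error standard deviation
x1, y1 = simulate_responses(10, sigma, mean_treatment_1, rng)
x2, y2 = simulate_responses(10, sigma, mean_treatment_2, rng)

# True difference of the expected value functions at the design points;
# a coverage study would check whether a band contains this curve everywhere.
true_difference = [mean_treatment_1(x) - mean_treatment_2(x) for x in x1]
```

Repeating the two `simulate_responses` calls many times and recording how often a fitted band contains `true_difference` over the whole covariate range would give an empirical coverage estimate.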

Comments

Another approach to the problem of constructing a simultaneous confidence band about the difference of segmented linear models might be to naïvely apply the Working-Hotelling band (for example, Kutner et al. (2005, p. 230) for a single regression line or Potthoff (1964) for the difference of two regression lines) to each segment of the difference with a Bonferroni correction for the confidence level. A comparison of this approach to our simultaneous confidence band is described in detail in

Acknowledgements

The work of the first author was supported in part by a grant from the NIH/NCI (U10CA69651). The work of the second author was supported in part by grants from the NSF (DMS00 72207) and NIH (1 P50 MH084053). The authors want to thank the referee for extremely thoughtful comments that have improved the presentation of our paper.

References (14)

  • G. Chiu et al. Asymptotic theory for bent-cable regression—the basic case. Journal of Statistical Planning and Inference (2005)
  • H.J. Larson. Least squares estimation of linear splines with unknown knot locations. Computational Statistics and Data Analysis (1992)
  • D.W. Bacon et al. Estimating the transition between two intersecting straight lines. Biometrika (1971)
  • P. Billingsley. Probability and Measure (1995)
  • C. Cox et al. Asymptotic confidence bands for generalized nonlinear regression models. Biometrics (1995)
  • P.I. Feder. On asymptotic distribution theory in segmented regression problems—identified case. Annals of Statistics (1975)
  • D.V. Hinkley. Bootstrap methods. Journal of the Royal Statistical Society, Series B (Methodological) (1988)
There are more references available in the full text version of this article.