Skip to main content
Log in

COBS: qualitatively constrained smoothing via linear programming

  • Published:
Computational Statistics Aims and scope Submit manuscript

Summary

Popular smoothing techniques generally have a difficult time accommodating qualitative constraints like monotonicity, convexity or boundary conditions on the fitted function. In this paper, we attempt to bring the problem of constrained spline smoothing to the foreground and describe the details of a constrained B-spline smoothing (COBS) algorithm that is being made available to S-plus users. Recent work of He & Shi (1998) considered a special case and showed that the L1 projection of a smooth function into the space of B-splines provides a monotone smoother that is flexible, efficient and achieves the optimal rate of convergence. Several options and generalizations are included in COBS:it can handle small or large data sets either with user interaction or full automation. Three examples are provided to show how COBS works in a variety of real-world applications.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Figure 1a
Figure 1b
Figure 2a
Figure 2b
Figure 3a
Figure 3b

Similar content being viewed by others

References

  • Bartels, R. & Conn, A. (1980), ‘Linearly constrained discrete l1 problems’, ACM Transaction on Mathematical Software 6, 594–608.

    Article  Google Scholar 

  • Cleveland, W. (1979), ‘Robust locally weighted regression and smoothing scatterplots’, Journal of the American Statistical Association 74, 829–836.

    Article  MathSciNet  Google Scholar 

  • De Boor, C. (1978), A Practical Guide to Splines, Vol. 27 of Applied Mathematical Sciences, Springer-Verlag, New York.

    Google Scholar 

  • Delecroix, X., Simioni, M. & Thomas-Agnan, C. (1995), ‘A shape constrained smoother: Simulation study’, Computational Statistics 10, 155–175.

    MATH  Google Scholar 

  • Dierckx, P. (1993), Curve and Surface Fitting with Splines, Clarendon, Oxford.

    MATH  Google Scholar 

  • Eilers, P. H. C. & Marx, B. D. (1996), ‘Flexible smoothing with b-splines and penalties’, Statistical Science 11, 89–102.

    Article  MathSciNet  Google Scholar 

  • Friedman, J. (1984), A variable span smoother, Technical Report 5, Laboratory for Computational Statistics, Department of Statistics, Stanford, University, California.

    Book  Google Scholar 

  • Green, P. J. & Silverman, B. W. (1994), Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, Chapman Hall, London.

    Book  Google Scholar 

  • Hansen, J. & Lebedeff, S. (1987), ‘Global trends of measured surface air temperature’, 92, 13345–13372.

    Google Scholar 

  • Hansen, J. & Lebedeff, S. (1988), ‘Global surface air temperature: Update through 1987’, Geophys. Lett. 15, 323–326.

    Article  Google Scholar 

  • Härdle, W. (1990), Applied Nonparametric Regression, Cambridge University Press, New York.

    Book  Google Scholar 

  • Haslett, J. & Raftery, A. (1989), ‘Space-time modelling with long-memory dependence: Assessing ireland’s wind power resource’, 38, 1–21.

    Google Scholar 

  • Hastie, T. & Tibshirani, R. (1990), Generalized additive models, Chapman & Hall.

  • Hawkins, D. M. (1994), ‘Fitting monotonic polynomials to data’, 9, 233–237.

    Google Scholar 

  • He, X. (1997), ‘Quantiles without crossing’, American Statisticians 51, 186–192.

    Google Scholar 

  • He, X. & Shi, P. (1994), ‘Convergence rate of b-spline estimators of nonparametric conditional quantile functions’, Journal of Nonparametric Statistics 3, 299–308.

    Article  MathSciNet  Google Scholar 

  • He, X. & Shi, P. (1998), ‘Monotone b-spline smoothing’, Journal of the American Statistical Association. Forthcoming.

  • He, X., Ng, P. & Portnoy, S. (1998), ‘Bivariate quantile smoothing splines’, Journal of the Royal Statistical Society, Series B. Forthcoming.

  • Koenker, R. & Bassett Gilbert, J. (1978), ‘Regression quantiles’, Econometrica 46, 33–50.

    Article  MathSciNet  Google Scholar 

  • Koenker, R. & Ng, P. (1996), ‘A remark on bartels and conn’s linearly constrained 11 algorithm’, ACM Transaction on Mathematical Software 22, 493–495.

    Article  Google Scholar 

  • Koenker, R. & Schorfheide, F. (1994), ‘Quantile spline models for global temperature change’, Climate Change 28, 395–404.

    Article  Google Scholar 

  • Koenker, R., Ng, P. & Portnoy, S. (1994), ‘Quantile smoothing splines’, Biometrika 81, 673–680.

    Article  MathSciNet  Google Scholar 

  • Ng, P. (1996), ‘An algorithm for quantile smoothing splines’, Computational Statistics and Data Analysis 22, 99–118.

    Article  MathSciNet  Google Scholar 

  • Portnoy, S. (1997), ‘Local asymptotics for quantile smoothing splines’, Annals of Statistics 25, 414–434.

    Article  MathSciNet  Google Scholar 

  • Portnoy, S. & Koenker, R. (1997), ‘The gaussian hare and the laplacian tortoise: Computability of squared-error vs. absolute-error estimators (with discussion)’, Statistical Science 12, 279–296.

    Article  MathSciNet  Google Scholar 

  • Ramsay, J. O. (1988), ‘Monotone regression splines in action’, Statistical Science 3, 425–441.

    Article  Google Scholar 

  • Schumaker, L. (1981), Spline Functions: Basic Theory, Wiley & Son, New York.

    MATH  Google Scholar 

  • Schwarz, G. (1978), ‘Estimating the dimension of a model’, 6, 461–464.

    Google Scholar 

  • Wahba, G. (1990), Spline Models for Observational Data, SIAM: PA.

    Book  Google Scholar 

  • Wang, F. T. & Scott, D. W. (1994), ‘The l1 method for robust nonparametric regression’, JASA 89, 65–76.

    Google Scholar 

  • Watson, G. S. (1966), ‘Smooth regression analysis’, Sankhya 26, 359–378.

    MathSciNet  MATH  Google Scholar 

  • Wright, I. W. & Wegman, E. J. (1980), ‘Isotonic, convex and related splines’, The Annals of Statistics 8, 1023–1035.

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Additional information

The authors would like to thank two anonymous referees, David Scott and Roger Koenker for their insightful comments. Xuming He’s research is supported in part by National Security Agency Grant MDA904-96-1-0011 and National Science Foundation Grant SBR 9617278. All the computation in this paper was performed on the Sparc 20 in the Econometric Lab at the University of Illinois supported by NSF Grant SBR-95-12440.

Appendix

Appendix

Following is the calling sequence for COBS.

cobs(x, y, constraint, z, minz = knots[1], maxz = knots[nknots], nz = 100, knots, nknots, method = ‘quantile’, degree = 2, tau = 0.5, lambda = 0, ic = ‘aic’, knots.add = F, pointwise, print.warn = T, print.mesg = T, coef = rep(0, nvar), w = rep(1, n), maxiter = 20*n, lstart = log(.Machine$single.xmax) ** 2, factor = 1)

ARGUMENTS

  • x vector of covariate.

  • y vector of response variable. It must have the same length (n) as x. constraint ‘increase’, ‘decrease’, ‘convex’, ‘concave’, ‘periodic’ or ‘none’.

OPTIONAL ARGUMENTS

  • z vector of grid points at which the fitted values are evaluated; default to an equally spaced grid with nz grid points between minz and maxz. If the fitted values at x are desired, use z = unique(x).

  • minz needed if z is not given; default to min(x) or the first knot if knots are given.

  • maxz needed if z is not given; default to max(x) or the last knot if knots are given.

  • nz number of grid points in z if z is not given; default to 100.

  • knots vector of locations of the knot mesh; if missing, nknots number of knots will be created using the specified method and automatic knot selection will be carried out for regression B-spline (lambda = 0); if not missing and length(knots) == nknots, the provided knot mesh will be used in the fit and no automatic knot selection will be performed; otherwise, automatic knots selection will be performed on the provided knots.

  • nknots maximum number of knots; default to 6 for regression B-spline, 20 for smoothing B-spline.

  • method method used to generate nknots number of knots when knots is not provided; ‘quantile’ (equally spaced in percentile levels) or ‘uniform’ (equally spaced in covariate); default to ‘quantile’.

  • degree degree of the splines; 1 for linear spline and 2 for quadratic spline; default to 2.

  • tau desired quantile level; default to 0.5 (median).

  • lambda penalty parameter; lambda == 0:no penalty (regression B-spline); lambda>0:smoothing B-spline with the given lambda; lambda<0:smoothing B-spline with lambda chosen by a Schwarz-type information criterion.

  • ic information criterion used in knot deletion and addition for regression B-spline method when lambda == 0; ‘aic’ (Akaike-type) or ‘sic’ (Schwarz-type); default to ‘aic’.

  • knots.add logical; an additional step of stepwise knot addition will be performed for regression B-spline if T; the default is F.

  • pointwise an optional three-column matrix with each row specifying one of the following constraints:(1,xi,yi) − fitted value at xi will be >= yi; (−1,xi,yi) − fitted value at xi will be <= yi; (0,xi,yi) − fitted value at xi will be = yi; (2,xi,yi) − derivative of the fitted function at xi will be yi.

  • print.warn logical flag for printing of warning messages; default to T; probably needs to be set to F if performing monte carlo simulation.

  • print.mesg logical flag for printing of intermediate messages; default to T; probably needs to be set to F if performing monte carlo simulation.

  • coef initial guess of the B-spline coefficients; default to a vector of zeros.

  • w vector of weights the same length as x (y) assigned to both x and y; default to uniform weights adding up to one; using normalized weights that add up to one will speed up computation.

  • maxiter upper bound of the number of iteration; default to 20*n.

  • lstart starting value for lambda when performing parametric programming in lambda if lambda<0; default to log(.Machine$single.xmax)**2.

  • factor determines how big a step to the next smaller lambda should be while performing parametric linear programming in lambda; default to one will give all unique lambda’s; use of bigger factor (> 1& < 4) will save time for big problems.

VALUE

  • coef B-spline coefficients.

  • fit fitted value at z.

  • resid vector of residuals from the fit.

  • z as in input.

  • knots the final set of knots used in the computation.

  • ifl exit code:1 — ok; 2 — problem is infeasible, check specification of the pointwise argument; 3 — maxiter is reached before finding a solution, either increase maxiter and restart the program with coef and knots set to the value upon previous exit or use a smaller lstart value when lambda<0 or use a smaller lambda value when lambda>0; 4 — program aborted, numerical difficulties due to ill-conditioning.

  • icyc number of cycles taken to achieve convergence.

  • k the effective dimensionality of the final fit.

  • lambda the penalty parameter used in the final fit.

  • pp.lambda vector of all unique lambda’s obtained from parametric programming when lambda < 0 on input.

  • sic vector of Schwarz information criteria evaluated at pp.lambda.

Rights and permissions

Reprints and permissions

About this article

Cite this article

He, X., Ng, P. COBS: qualitatively constrained smoothing via linear programming. Computational Statistics 14, 315–337 (1999). https://doi.org/10.1007/s001800050019

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s001800050019

Keywords

Navigation