Investigating health-related time use with partially observed data

Review of Economics of the Household

Abstract

This paper suggests analytical strategies for obtaining informative parameter bounds when multivariate health-related time use data are partially observed in a particular yet common manner. One familiar context is where M>1 outcomes’ respective totals across N>1 time periods are observed but where questions of interest involve features—probabilities, moments, etc.—of their unobserved joint distribution at each of the N time periods. For instance, one might wish to understand the distribution of any type of unhealthy day experienced over a month but have access only to the separate monthly totals of physically unhealthy and mentally unhealthy days that are experienced. After demonstrating methods to partially identify such distributions and related parameters under several sampling assumptions, the paper proceeds to derive bounds on partial effects involving exogenous covariates. These results are applied in three empirical exercises. Whether the proposed bounds prove to be sufficiently tight to usefully inform decisionmakers can only be determined in context, although in this paper’s empirical analysis some of the estimated bounds turn out to be perhaps surprisingly tight. Moreover, it is suggested in the paper’s conclusion that the issues considered in this paper may become increasingly salient for analysts as data privacy policies increasingly constrain analyses.

Data availability

To be posted on author’s website upon publication.

Code availability

To be posted on author’s website upon publication.

Notes

  1. In the specific context where n indexes time units (days), this translates into identifying the distribution of the number of time units (days) on which no or, reciprocally, any event occurs.

  2. The sizes of the sets L(n) and U(n) depend on n and M:

    $$\# L(n) = \begin{cases} 1, & n = 0 \\ M, & n \in \{1, \ldots, N-1\} \\ (n+1)^M - n^M, & n = N \end{cases} \qquad \# U(n) = \begin{cases} 1, & n = 0 \\ (n+1)^M - \binom{M+n-1}{M}, & n \in \mathcal{N} \end{cases}$$
  3. A concrete example is where the outcome of interest is a dichotomization of S, 1(S ≥ t) for some threshold t, with the parameter of interest being Pr(1(S ≥ t) = 1) = Pr(S ∈ {t,…,N}).

  4. The sizes of these sets are:

    $$\# L_E(n) = (n+1)^M - n^M, \quad n \in \mathcal{N} \qquad \# U_E(n) = \begin{cases} \binom{M+n-1}{M-1}, & n < N \\ (n+1)^M - \binom{M+n-1}{M}, & n = N \end{cases}$$
  5. For E[S], note that (18) and (19) are equivalent to what might be some readers' intuitive approach (see U.S. CDC, 2000) of defining lower and upper bounds on E[S] as, respectively, \(\sum_{\boldsymbol{q} \in \mathcal{A}} \Pr(\boldsymbol{y} = \boldsymbol{q}) \times \max_{m \in \mathcal{M}} \{q_m\}\) and \(\sum_{\boldsymbol{q} \in \mathcal{A}} \Pr(\boldsymbol{y} = \boldsymbol{q}) \times \min\{\sum_{m \in \mathcal{M}} q_m,\, N\}\).

  6. https://www.cdc.gov/brfss/index.html

  7. https://digital.nhs.uk/data-and-information/publications/statistical/health-survey-for-england

  8. Time patterns are considered relevant features of individuals' physical activities. For example, the USDHHS Physical Activity Guidelines for Americans (USDHHS, 2018) recommends: "For substantial health benefits, adults should do at least 150 minutes (2 hours and 30 minutes) to 300 minutes (5 hours) a week of moderate-intensity, or 75 minutes (1 hour and 15 minutes) to 150 minutes (2 hours and 30 minutes) a week of vigorous-intensity aerobic physical activity, or an equivalent combination of moderate- and vigorous-intensity aerobic activity. Preferably, aerobic activity should be spread throughout the week."

  9. https://www.cdc.gov/healthyyouth/data/yrbs/index.htm
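
The bounds on E[S] in note 5 and the set sizes in notes 2 and 4 can be checked numerically. The Python sketch below is illustrative only. It assumes that each observed vector of totals q implies S lies between max(q) and min(sum(q), N), as in note 5, and it further assumes (this is a reading of the notes, not a definition given here) that L(n) collects the vectors for which S = n is certain, U(n) those for which S = n is possible, and L_E(n) and U_E(n) those whose lower and upper bound on S equal n. The function names and the sample data are hypothetical.

```python
from itertools import product
from math import comb


def day_count_bounds(q, N):
    """Bounds on S (periods with any event) implied by the totals vector q."""
    return max(q), min(sum(q), N)


def mean_bounds(sample, N):
    """Sample-analog lower and upper bounds on E[S], following note 5."""
    lows, highs = zip(*(day_count_bounds(q, N) for q in sample))
    return sum(lows) / len(sample), sum(highs) / len(sample)


def set_sizes(M, N, n):
    """Brute-force sizes of L(n), U(n), L_E(n), U_E(n) under the assumed definitions."""
    L = U = L_E = U_E = 0
    for q in product(range(N + 1), repeat=M):
        lo, hi = day_count_bounds(q, N)
        L += (lo == n == hi)    # S = n is certain
        U += (lo <= n <= hi)    # S = n is possible
        L_E += (lo == n)        # lower bound on S equals n
        U_E += (hi == n)        # upper bound on S equals n
    return L, U, L_E, U_E


# Hypothetical respondents: (physically, mentally) unhealthy days out of N = 30.
sample = [(0, 0), (2, 5), (30, 10), (20, 15)]
print(mean_bounds(sample, N=30))  # (13.75, 16.75)

# Compare the enumerated sizes with the closed forms for a small case.
M, N = 3, 4
for n in range(1, N + 1):
    L, U, L_E, U_E = set_sizes(M, N, n)
    assert L == (M if n < N else (n + 1) ** M - n ** M)
    assert U == (n + 1) ** M - comb(M + n - 1, M)
    assert L_E == (n + 1) ** M - n ** M
    assert U_E == (comb(M + n - 1, M - 1) if n < N else (n + 1) ** M - comb(M + n - 1, M))
```

Enumeration is feasible only for small M and N, since the number of vectors is (N + 1)^M; the closed forms are what make the counts usable at, say, N = 30.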

References

  • Abowd, J. M., & Schmutte, I. M. (2019). An economic analysis of privacy protection and statistical accuracy as social choices. American Economic Review, 109, 171–202.

  • Burke, L. G., et al. (2020). Healthy days at home: A novel population-based outcome measure. Healthcare, 8. https://doi.org/10.1016/j.hjdsi.2019.100378

  • Burns, M., & Mullahy, J. (2016). Healthy-time measures of health outcomes and healthcare quality. NBER Working Paper 22562.

  • Bynum, J. P. W., et al. (2016). Our parents, ourselves: health care for an aging population. Lebanon, NH: The Dartmouth Institute of Health Policy and Clinical Practice.

  • Grossman, M. (1972). On the concept of health capital and the demand for health. Journal of Political Economy, 80, 223–255.

  • Hamermesh, D. S. (2016). What’s to know about time use? Journal of Economic Surveys, 30, 198–203.

  • Manski, C. F. (1988). Analog estimation methods in econometrics. New York, NY: Chapman and Hall.

  • Manski, C. F. (1989). Anatomy of the selection problem. Journal of Human Resources, 24, 343–360.

  • Manski, C. F. (1990). Nonparametric bounds on treatment effects. American Economic Review Papers and Proceedings, 80, 319–323.

  • Manski, C. F. (2020). The lure of incredible certitude. Economics and Philosophy, 36, 216–245.

  • Manski, C. F., & Tamer, E. (2002). Inference on regressions with interval data on a regressor or outcome. Econometrica, 70, 519–546.

  • Pesko, M. F., et al. (2016). The effect of potential electronic nicotine delivery system regulations on nicotine product selection. Addiction, 111, 734–744.

  • U.S. Centers for Disease Control and Prevention. (2000). Measuring healthy days. Atlanta: CDC.

  • U.S. Department of Health and Human Services. (2018). Physical activity guidelines for Americans, 2nd edition. Washington, DC: U.S. Department of Health and Human Services.

Acknowledgements

Thanks are owed to Chris Adams, Ciaran O’Neill, and two referees for helpful comments and suggestions.

Funding

Partial support was provided by RWJF Evidence for Action Grant 73336, NICHD Grant P2CHD047873 to the Center for Demography and Ecology, and NIA Grant P30AG017266 to the Center for Demography of Health and Aging, all at UW-Madison.

Author information

Correspondence to John Mullahy.

Ethics declarations

Conflict of interest

The author declares no competing interest.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Appendix A: General results on bounds

It proves useful for several of the derivations presented in the main text to describe some general results. For a set of generic parameters Θ = {θj}, suppose lower and upper bounds, LB(θj) and UB(θj), have been determined. Suppose h(θj) is a smooth, monotone function. Then:

$$LB(h(\theta_j)) = \begin{cases} h(LB(\theta_j)), & h' > 0 \\ h(UB(\theta_j)), & h' < 0 \end{cases} \qquad UB(h(\theta_j)) = \begin{cases} h(UB(\theta_j)), & h' > 0 \\ h(LB(\theta_j)), & h' < 0 \end{cases} \tag{a}$$

$$LB\Big(\sum_j h_j(\theta_j)\Big) = \sum_j LB(h_j(\theta_j)) \qquad UB\Big(\sum_j h_j(\theta_j)\Big) = \sum_j UB(h_j(\theta_j)), \tag{b}$$

where the summations in (b) run over some or all of the elements of Θ.

For example (and supposing θj > 0): \(LB(1/\theta_j) = 1/UB(\theta_j)\); \(UB(-\theta_j) = -LB(\theta_j)\); \(UB(\theta_j^2) = (UB(\theta_j))^2\); \(LB(\theta_j - \theta_k) = LB(\theta_j) - UB(\theta_k)\). Note that even when LB(hj(θj)) and UB(hj(θj)) are the tightest bounds on each hj(θj) from (a), this does not imply that the bounds on their sums defined in (b) are the tightest possible bounds on Σj hj(θj), even though in general the bounds in (b) will be valid.
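
Rules (a) and (b) amount to simple interval arithmetic, which the following Python sketch illustrates. It is a minimal sketch, not code from the paper: the helper names `bound_monotone` and `bound_sum` and the numerical bounds are hypothetical. The last lines illustrate the caveat about tightness under an added assumption, namely that the two parameters are complementary probabilities (θk = 1 − θj), in which case their sum is known exactly while (b) still returns a valid but wide interval.

```python
def bound_monotone(h, lb, ub, increasing):
    """Rule (a): bounds on h(theta) for monotone h, given bounds on theta."""
    return (h(lb), h(ub)) if increasing else (h(ub), h(lb))


def bound_sum(intervals):
    """Rule (b): add the lower bounds and the upper bounds term by term."""
    return (sum(lo for lo, _ in intervals), sum(hi for _, hi in intervals))


# Hypothetical bounds on two positive parameters.
theta_j = (0.2, 0.5)
theta_k = (0.5, 0.8)

# 1/theta is decreasing for theta > 0, so the bounds swap roles.
recip_j = bound_monotone(lambda t: 1 / t, *theta_j, increasing=False)

# -theta is decreasing, so UB(-theta) = -LB(theta) and LB(-theta) = -UB(theta).
neg_k = bound_monotone(lambda t: -t, *theta_k, increasing=False)

# theta^2 is increasing for theta > 0.
square_j = bound_monotone(lambda t: t ** 2, *theta_j, increasing=True)

# Difference via (a) then (b): LB(theta_j - theta_k) = LB(theta_j) - UB(theta_k).
diff = bound_sum([theta_j, neg_k])

# Sum via (b): valid, but not tight if theta_k = 1 - theta_j, since the sum is then 1.
total = bound_sum([theta_j, theta_k])

print(recip_j, neg_k, square_j, diff, total)
```

Under the complementarity assumption the interval for the sum from (b) is roughly [0.7, 1.3], which contains the true value 1 but is far from the tightest possible bound; (b) ignores any such links among the elements of Θ.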

Cite this article

Mullahy, J. Investigating health-related time use with partially observed data. Rev Econ Household 20, 103–121 (2022). https://doi.org/10.1007/s11150-021-09570-x
