Elsevier

Labour Economics

Volume 14, Issue 6, December 2007, Pages 870-893
Identifying and Estimating the Distributions of Ex Post and Ex Ante Returns to Schooling

https://doi.org/10.1016/j.labeco.2007.06.002

Abstract

This paper surveys a recent body of research by Carneiro, Hansen, and Heckman [Carneiro, P., K. Hansen, and J.J. Heckman, 2001, Fall. Removing the veil of ignorance in assessing the distributional impacts of social policies. Swedish Economic Policy Review 8 (2), 273–301; Carneiro, P., K. Hansen, and J.J. Heckman, 2003, May. Estimating distributions of treatment effects with an application to the returns to schooling and measurement of the effects of uncertainty on college choice. International Economic Review 44 (2), 361–422. 2001 Lawrence R. Klein Lecture], Cunha and Heckman [Cunha, F. and J.J. Heckman, 2006. The evolution of earnings risk in the US economy. Presented at the 9th World Congress of the Econometric Society, London], Cunha, Heckman, and Navarro [Cunha, F., J.J. Heckman, and S. Navarro, 2004, March. Separating heterogeneity from uncertainty in an Aiyagari–Laitner economy. Presented at the Goldwater Conference on Labor Markets, Arizona; Cunha, F., J.J. Heckman, and S. Navarro, 2005, April. Separating uncertainty from heterogeneity in life cycle earnings, The 2004 Hicks Lecture. Oxford Economic Papers 57 (2), 191–261; Cunha, F., J.J. Heckman, and S. Navarro, 2006. Counterfactual analysis of inequality and social mobility. In S.L. Morgan, D.B. Grusky, and G.S. Fields (Eds.), Mobility and Inequality: Frontiers of Research in Sociology and Economics, Chapter 4, pp. 290–348. Stanford, CA: Stanford University Press], Heckman and Navarro [Heckman, J.J. and S. Navarro, 2007, February. Dynamic discrete choice and dynamic treatment effects. Journal of Econometrics 136 (2), 341–396] and Navarro [Navarro, S., 2005. Understanding Schooling: Using Observed Choices to Infer Agent's Information in a Dynamic Model of Schooling Choice When Consumption Allocation is Subject to Borrowing Constraints. Ph.D. Dissertation, University of Chicago, Chicago, IL] that identifies and estimates the ex post distribution of returns to schooling and determines ex ante distributions of returns on which agents base their schooling choices. We discuss methods and evidence, and state a fundamental identification problem concerning the separation of preferences, market structures and agent information sets. For a variety of market structures and preference specifications, we estimate that over 50% of the ex post variance in returns to college is forecastable at the time agents make their schooling choices.

Introduction

The literature on the returns to schooling attempts to estimate the ex post rate of return. Ex post returns are interesting historical facts that describe how economies reward schooling. Ex ante returns are, however, what agents act on. To explain choices and evaluate their optimality, it is necessary to know what is in the agent's information set in order to determine the ex ante rate of return.

This paper describes new methods developed to estimate ex ante returns to schooling. We describe methods that characterize what is in the agent's information set at the time schooling decisions are made. The literature surveyed in this paper exploits the key idea that if agents know something and use that information in making their schooling decisions, it will affect their schooling choices. With panel data on earnings we can measure realized outcomes and assess what components of those outcomes are known at the time schooling choices are made.

The literature on panel data earnings dynamics (e.g., Lillard and Willis, 1978; MaCurdy, 1982) is not designed to estimate what is in agent information sets. It estimates earnings equations of the following type:

$$Y_{i,t} = X_{i,t}\beta + S_i\tau + U_{i,t}, \tag{1}$$

where $Y_{i,t}$, $X_{i,t}$, $S_i$, and $U_{i,t}$ denote (for person $i$ at time $t$) the realized earnings, observable characteristics, educational attainment, and unobservable characteristics, respectively, from the point of view of the observing economist. The variables generating outcomes realized at time $t$ may or may not have been known to the agents at the time they made their schooling decisions. Many economists mistakenly equate their ignorance about the $U_{i,t}$ with what the agents they study know about it.

The error term $U_{i,t}$ is often decomposed into two or more components. For example, it is common to specify that

$$U_{i,t} = \phi_i + \delta_{i,t}. \tag{2}$$

The term $\phi_i$ is a person-specific effect. The error term $\delta_{i,t}$ is often assumed to follow an ARMA($p$, $q$) process (see Hause, 1980; MaCurdy, 1982) such as $\delta_{i,t} = \rho\delta_{i,t-1} + m_{i,t}$, where $m_{i,t}$ is a mean zero innovation independent of $X_{i,t}$ and the other error components. The components $X_{i,t}$, $\phi_i$, and $\delta_{i,t}$ all contribute to measured ex post variability across persons. However, the literature is silent about the difference between heterogeneity or variability among persons from the point of view of the observer economist and uncertainty, the unforecastable part of earnings as of a given age. The literature on income mobility and on inequality measures all variability ex post as in Chiswick (1974), Mincer (1974) and Chiswick and Mincer (1972).
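The following is a minimal simulation sketch of the error-components specification above, not code from the surveyed papers. It assumes normal draws, a stationary AR(1) for the transitory component, independence of the permanent and transitory components, and hypothetical parameter values (`RHO`, `SD_PHI`, `SD_M`). It checks that the cross-sectional (ex post) variance of the error is close to the sum of the variances of its components, which is all that a purely statistical decomposition recovers.

```python
import random

# Hypothetical parameter values chosen for illustration only.
RHO, SD_PHI, SD_M = 0.6, 1.0, 0.5


def simulate_u(n_people=20000, n_periods=5, seed=0):
    """Simulate U[i][t] = phi_i + delta_{i,t}, with delta following an AR(1)."""
    rng = random.Random(seed)
    # Start delta from its stationary distribution (assumes |RHO| < 1).
    sd_delta = SD_M / (1.0 - RHO ** 2) ** 0.5
    panel = []
    for _ in range(n_people):
        phi = rng.gauss(0.0, SD_PHI)      # person-specific effect
        delta = rng.gauss(0.0, sd_delta)  # transitory component
        row = []
        for _ in range(n_periods):
            row.append(phi + delta)
            delta = RHO * delta + rng.gauss(0.0, SD_M)  # AR(1) update
        panel.append(row)
    return panel


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


panel = simulate_u()
# Cross-sectional variance of U in the last simulated period.
ex_post_var = variance([row[-1] for row in panel])
# Variance implied by the assumed components: Var(phi) + Var(delta).
theory_var = SD_PHI ** 2 + SD_M ** 2 / (1.0 - RHO ** 2)
print(f"simulated: {ex_post_var:.3f}  implied: {theory_var:.3f}")
```

The simulated variance says nothing about which component the agent could forecast; that requires information beyond the earnings panel itself.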

An alternative specification of the error process postulates a factor structure for earnings,

$$U_{i,t} = \theta_i\alpha_t + \varepsilon_{i,t}, \tag{3}$$

where $\theta_i$ is a vector of skills (e.g., ability, initial human capital, motivation, and the like), $\alpha_t$ is a vector of skill prices, and the $\varepsilon_{i,t}$ are mutually independent mean zero shocks independent of $\theta_i$. Hause (1980) and Heckman and Scheinkman (1987) analyze such earnings models. Any process in the form of Eq. (2) can be written in terms of (3). The latter specification is more directly interpretable as a pricing equation than is (2).
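As a worked illustration of what the factor structure implies for second moments, consider the special case of a single (scalar) skill. The variance symbols $\sigma_\theta^2$ and $\sigma_{\varepsilon,t}^2$ are notation introduced here for the illustration and do not appear in the original text. Using only the stated assumptions (mean zero shocks, mutually independent and independent of the skill), the implied moments are:

```latex
\begin{aligned}
\operatorname{Var}(U_{i,t}) &= \alpha_t^{2}\,\sigma_\theta^{2} + \sigma_{\varepsilon,t}^{2}, \\
\operatorname{Cov}(U_{i,t},\,U_{i,s}) &= \alpha_t\,\alpha_s\,\sigma_\theta^{2}, \qquad t \neq s .
\end{aligned}
```

In this special case all dependence in earnings residuals across periods runs through the common skill, priced differently in each period.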

The predictable components of $U_{i,t}$ will have different effects on choices and economic welfare than the unpredictable components, if people are risk averse and cannot fully insure against uncertainty. Statistical decompositions based on (1), (2), and (3) or versions of them describe ex post variability but tell us nothing about which components of (1), (2), or (3) are forecastable by agents ex ante. Is $\phi_i$ unknown to the agent? $\delta_{i,t}$? Or $\phi_i + \delta_{i,t}$? Or $m_{i,t}$? In representation (3), the entire vector $\theta_i$, components of the $\theta_i$, the $\varepsilon_{i,t}$, or all of these may or may not be known to the agent at the time schooling choices are made.
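To make the question concrete, the following worked example computes the unforecastable variance of the period-$t$ error under three hypothetical information sets. It rests on assumptions that are not in the original text: the error has a permanent component $\phi_i$ plus a stationary AR(1) component $\delta_{i,t} = \rho\delta_{i,t-1} + m_{i,t}$ with $|\rho| < 1$, the two components are independent, the agent forecasts as of period 0, and $\sigma_\phi^2$ and $\sigma_m^2$ (notation introduced here) are the variances of $\phi_i$ and $m_{i,t}$.

```latex
\begin{aligned}
\text{neither component known:}\quad
  & \operatorname{Var}(U_{i,t}) = \sigma_\phi^{2} + \frac{\sigma_m^{2}}{1-\rho^{2}}, \\
\text{only } \phi_i \text{ known:}\quad
  & \operatorname{Var}(U_{i,t} \mid \phi_i) = \frac{\sigma_m^{2}}{1-\rho^{2}}, \\
\text{both } \phi_i \text{ and } \delta_{i,0} \text{ known:}\quad
  & \operatorname{Var}(U_{i,t} \mid \phi_i, \delta_{i,0})
    = \sigma_m^{2}\sum_{k=0}^{t-1}\rho^{2k}
    = \sigma_m^{2}\,\frac{1-\rho^{2t}}{1-\rho^{2}} .
\end{aligned}
```

The ex post variance is the same in all three cases. Only the share of it that counts as uncertainty from the agent's point of view changes with the information set.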

The methodology developed in Carneiro, Hansen, and Heckman (2003), Cunha et al. (2004, 2005) and Cunha and Heckman (2006) provides a framework within which it is possible to distinguish components of life cycle outcomes that are forecastable and acted on at the time decisions are taken from those that are not. In order to choose between high school and college, agents forecast future earnings (and other returns and costs) for each schooling level. Using information about an educational choice at the time the choice is made, together with the ex post realization of earnings and costs that are observed at later ages, it is possible to estimate and test which components of future earnings and costs are forecast by the agent. This can be done provided we know, or can estimate, the earnings of agents under both schooling choices and provided we specify the market environment under which they operate as well as their preferences over outcomes.
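The logic of using choices to reveal information can be sketched with a toy simulation. This is an illustration only, not the estimator used in the surveyed papers. It assumes a hypothetical college earnings gain equal to a component `theta` plus a shock `eps` that the agent never knows, a cost the agent knows that is independent of both, and a rule under which the agent attends college when the expected gain exceeds the cost. Unlike real data, the simulation observes the realized gain for everyone.

```python
import random


def choice_gain_covariance(theta_known, n=50000, seed=1):
    """Covariance between the college choice S and the realized earnings gain.

    theta_known: whether the component theta is in the agent's information set
    when the schooling choice is made (a hypothetical switch for illustration).
    """
    rng = random.Random(seed)
    choices, gains = [], []
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)  # component of the gain, possibly forecastable
        eps = rng.gauss(0.0, 1.0)    # shock realized later, never forecastable
        cost = rng.gauss(0.0, 1.0)   # cost known to the agent
        # The agent's forecast of the gain uses theta only if theta is known.
        expected_gain = theta if theta_known else 0.0
        choices.append(1 if expected_gain - cost > 0 else 0)
        gains.append(theta + eps)    # ex post realized gain
    mean_s = sum(choices) / n
    mean_g = sum(gains) / n
    return sum((s - mean_s) * (g - mean_g) for s, g in zip(choices, gains)) / n


cov_known = choice_gain_covariance(theta_known=True)
cov_unknown = choice_gain_covariance(theta_known=False)
print(f"theta known to agent:   cov(S, gain) = {cov_known:.3f}")
print(f"theta unknown to agent: cov(S, gain) = {cov_unknown:.3f}")
```

When `theta` is in the information set, the choice covaries with the realized gain. When it is not, the covariance is close to zero. In actual data the gain is never observed for the same person under both schooling levels, which is why the earnings of agents under both choices must be known or estimated.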

For market environments where separation theorems are valid, so that consumption decisions are made independently of wealth maximizing decisions, it is not necessary to know agent preferences to decompose realized earnings outcomes in this fashion. Carneiro, Hansen, and Heckman (2003), Cunha et al. (2004, 2005) and Cunha and Heckman (2006) use choice information to extract ex ante, or forecast, components of earnings and to distinguish them from realized earnings for different market environments. The difference between forecast and realized earnings allows them to identify the distributions of the components of uncertainty facing agents at the time they make their schooling decisions.

A Generalized Roy Model

To state the problem addressed in the recent literature more precisely, consider a version of the generalized Roy (1951) economy with two sectors. Let $S_i$ denote different schooling levels. $S_i = 0$ denotes choice of the high school sector for person $i$, and $S_i = 1$ denotes choice of the

Identifying Information Sets in Card's Model of Schooling

Consider decomposing the “returns” coefficient on schooling in an earnings equation into components that are known at the time schooling choices are made and components that are not known. Write the log of annualized discounted lifetime earnings of person $i$ as

$$\ln y_i = \alpha + \rho_i S_i + U_i,$$

where $\rho_i$ is the person-specific ex post return, $S_i$ is years of schooling, and $U_i$ is a mean zero unobservable. We seek to decompose $\rho_i$ into two components $\rho_i = \eta_i + \nu_i$, where $\eta_i$ is a component known to the agent when he/she makes

The Method of Cunha, Heckman, and Navarro

Cunha et al. (2004, 2005; henceforth CHN) and Cunha and Heckman (2006) exploit covariances between schooling and realized earnings that arise under different agent information structures to test which information structure characterizes the data. They build on the analysis of Carneiro, Hansen, and Heckman (2003). To see how the method works, simplify the model back to two schooling levels: $S_i = 1$ (college); $S_i = 0$ (high school). Heckman and Navarro (2007) extend this analysis to

More general preferences and market settings

To focus on the main ideas in the literature, we have used the simple market structure of complete contingent claims markets. What can be identified in more general environments? In the absence of perfect certainty or perfect risk sharing, preferences and market environments also determine schooling choices. The separation theorem used thus far, which allows consumption and schooling decisions to be analyzed in isolation, breaks down.

If we postulate information processes a priori, and

Evidence on Uncertainty and Heterogeneity of Returns

Few data sets contain the full life cycle of earnings along with the test scores and schooling choices needed to directly estimate the CHN model and extract components of uncertainty. It is necessary to pool data sets. See CHN, who discuss how to combine the NLSY and PSID data sets. We summarize the analysis of Cunha and Heckman (2006) in this subsection. See their paper for their exclusions and identification conditions.

Following the preceding theoretical analysis, they consider only two schooling

Extensions and Alternative Specifications

Carneiro, Hansen, and Heckman (2003) estimate a version of the model just discussed for an environment of complete autarky. Individuals have to live within their means each period. Cunha, Heckman, and Navarro (2004) estimate a version of this model with restrictions on intertemporal trade as in the Aiyagari–Laitner economy. Different assumptions about credit markets and preferences produce a range of estimates of the proportion of the total variability of returns to schooling that are

Summary and Conclusions

This paper surveys the main models and methods developed in Carneiro, Hansen, and Heckman (2003), Cunha et al. (2004, 2005, 2006) and Cunha and Heckman (2006) for estimating models of heterogeneity and uncertainty in the returns to schooling. The goal of this work is to separate variability from uncertainty and to estimate the distributions of ex ante and ex post returns to schooling. The key idea in the recent literature is to exploit the relationship between

References (50)

  • P. Carneiro et al. Removing the veil of ignorance in assessing the distributional impacts of social policies. Swedish Economic Policy Review (2001)
  • P. Carneiro et al. Estimating distributions of treatment effects with an application to the returns to schooling and measurement of the effects of uncertainty on college choice. International Economic Review (2003)
  • B.R. Chiswick. Income Inequality: Regional Analyses Within a Human Capital Framework (1974)
  • B.R. Chiswick et al. Time-series changes in personal income inequality in the United States from 1939, with projections to 1985. Journal of Political Economy (1972)
  • F. Cunha et al. The evolution of earnings risk in the US economy
  • F. Cunha et al. The technology of skill formation. American Economic Review (2007)
  • Cunha, F., Heckman, J.J., in press. A framework for the analysis of inequality. Macroeconomic...
  • F. Cunha et al. Separating heterogeneity from uncertainty in an Aiyagari–Laitner economy
  • F. Cunha et al. Separating uncertainty from heterogeneity in life cycle earnings, The 2004 Hicks Lecture. Oxford Economic Papers (2005)
  • F. Cunha et al. Counterfactual analysis of inequality and social mobility
  • J. Durbin. Errors in variables. Review of the International Statistical Institute (1954)
  • Ellwood, D.T., 2001. The sputtering U.S. Labor Market. Kennedy School of Government, Harvard (Unpublished...
  • M.A. Flavin. The adjustment of consumption to changing expectations about future income. Journal of Political Economy (1981)
  • W.M. Gorman. A possible procedure for analysing quality differentials in the egg market. Review of Economic Studies (1980)
  • R. Gronau. Wage comparisons — a selectivity bias. Journal of Political Economy (1974)
    This research was supported by NIH R01-HD043411, NSF SES-024158 and the Geary Institute, University College Dublin, Ireland. The views expressed in this paper are those of the authors and not necessarily those of the funders listed here. We wish to thank the editor and two anonymous referees, as well as Lars Hansen, Lance Lochner, Salvador Navarro, Robert Townsend, Sergio Urzua, and Petra Todd for helpful comments.
