An analysis of US domestic migration via subset-stable measures of administrative data

  • Research Article
Journal of Computational Social Science

Abstract

How does the likelihood of moving across US regions vary with changes in household characteristics, and how does the risk of a change in status vary given a move? Statistics aimed at these questions are calculated for households that earned formal market income in the US, 2001–2015, totaling about 1.7 billion observations with 82.7 million long-distance moves, and covering statuses such as income, school enrollment, age, number of children, local cost of living, and retirement or marital status. The key theoretical result of this article shows that the Cochran–Mantel–Haenszel statistic is the unique aggregate risk ratio within a broad class that has the “subset stability” property: If a statistic has value \(s_1\) for one subset and \(s_2\) for another, then the statistic for the union of the two sets is between \(s_1\) and \(s_2\). A sequence of pseudo-experiments generates a wealth of tests regarding the relationship between moving and a broad range of household characteristics, for the full population and salient subsets, with some focus on the characteristics of the 44.2% of movers who see negative income returns relative to the counterfactual of staying.
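
The subset-stability property can be illustrated with a short sketch. It assumes the standard textbook form of the Mantel–Haenszel pooled risk ratio, which may differ in detail from the article's definition, and it assumes that each subset is a collection of whole ceteris paribus cells (strata). The function names and all counts are hypothetical and are not drawn from the article's data.

```python
def mh_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio (standard textbook form).

    Each stratum is a tuple:
    (exposed_events, exposed_total, unexposed_events, unexposed_total).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        n = n1 + n0
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


def crude_risk_ratio(strata):
    """Risk ratio computed from pooled totals, ignoring the strata."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)


# Hypothetical counts, e.g. "exposed" = movers, "event" = a change in status.
subset_1 = [(10, 100, 5, 100)]
subset_2 = [(90, 300, 10, 50)]
union = subset_1 + subset_2

s1 = mh_risk_ratio(subset_1)
s2 = mh_risk_ratio(subset_2)
s_union = mh_risk_ratio(union)

print(f"s1={s1:.3f}  s2={s2:.3f}  union (MH)={s_union:.3f}")
print(f"union (crude)={crude_risk_ratio(union):.3f}")
print("MH union between s1 and s2:", min(s1, s2) <= s_union <= max(s1, s2))
```

With these made-up counts, the crude ratio computed from pooled totals lands outside the range spanned by the two subset values, while the Mantel–Haenszel ratio for the union stays inside it.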

Availability of data and material

Due to IRS restrictions, the data cannot be made public, but will be made available upon request to the IRS Statistics of Income division, after the appropriate clearance under 26 USC §6103.

Code availability

Portions not containing IRS-restricted information are available upon request. See also the author’s Cochran–Mantel–Haenszel statistic calculator at https://github.com/b-k/cmh.py/.

Notes

  1. It is not always clear that a truly random experiment is desirable. Military moves due to redeployment perhaps approximate a truly random allocation [7], but individuals who chose to be in military families may have unobservable characteristics systematically different from those who did not. Whether these results would apply to families where one member is randomly drafted into the military is unknown. Similarly, a randomized trial hoping to describe outcomes for future movers would first find households who chose to move of their own volition, then make a randomized intervention in some subset of that subpopulation. This may be impossible using general population surveys or administrative records.

  2. It also features exposure to climate change; McLeman and Hunter [44] discuss the resulting out-migration.

  3. As discussed in the appendix, 80 km is also the definition of a move used by the US Internal Revenue Service. These are not small moves: the IRS Statistics of Income division estimates $3.5 billion in moving expenses claimed by those moving over 80 km in 2016.

  4. Alternatives to the strict adherence to a controlled pseudo-experiment, instead relying on household history, create more difficulties than they resolve. Classifying movers by their full pattern of moves is error-prone (what if a household moves twice in the same year?), and requires arbitrary decisions about how to treat different series. Is a mover who moves in years 1, 2, and 4 comparable to one who moves in years 1, 3, and 4? Throwing out moving households after they move again creates a sample that answers the question “what is the outcome from moving once and never moving again relative to the counterfactual of never moving?”, but this is a biased measure of any activity among the full population. Specific questions about chain migrants versus once-in-a-lifetime movers are reserved for future research.

  5. The more common version of the CMH statistic is an odds ratio, not a risk ratio. The odds of an event are the count of occurrences divided by the count of non-occurrences; the risk is the same occurrence count divided by the full population count [54]. An odds ratio or risk ratio is the ratio of two odds or two risks so defined.

    This article relies on the risk ratio. Colloquial references to the chance, likelihood, and typically even odds of an event refer to the risk, not the odds as defined here. The odds ratio is symmetric: the odds ratio of moving among retirees versus non-retirees equals the odds ratio of retiring among movers versus stayers, for example. The risk ratio gives distinct values for the two, which can better inform causal inquiries.

    For relatively unlikely events, such as a health condition in a typical medical study, the odds ratio approximates the risk ratio, but as the likelihood of the event grows, the odds overestimate the risk to the point of being almost unusable for discussing the relative chance that an event will occur [62].

  6. In medical studies, when subjects are selected ex ante and split into ceteris paribus cells based on observed covariates, there is bias in the measure of odds or risk ratios, and therefore in the CMH statistic, to the extent that those controlled covariates correlate with the outcome [12, 14, 58]. But that is not the situation in typical administrative record or commercial data sets, which have a defined universe of observations and no subject selection. Multiple testing issues in mereological methods [17] are not a consideration for descriptive studies, or can be adjusted via methods such as Bonferroni corrections.

  7. Via https://apps.bea.gov/iTable/index_regional.cfm, accessed April 2021.
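
As a minimal sketch of the distinction drawn in note 5, the following computes both ratios from a 2×2 table. The function names and all counts are hypothetical and are not taken from the article's data or code.

```python
def risk_ratio(table):
    """table = ((a, b), (c, d)): rows are groups, columns are event / no event."""
    (a, b), (c, d) = table
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(table):
    """Same layout as risk_ratio; the odds are events over non-events."""
    (a, b), (c, d) = table
    return (a / b) / (c / d)


def transpose(table):
    """Swap the roles of the grouping variable and the outcome."""
    (a, b), (c, d) = table
    return ((a, c), (b, d))


# Hypothetical counts. Rows: retirees, non-retirees. Columns: moved, stayed.
by_retirement = ((30, 970), (200, 8800))
# Transposed. Rows: movers, stayers. Columns: retired, not retired.
by_move = transpose(by_retirement)

print("odds ratios:", odds_ratio(by_retirement), odds_ratio(by_move))
print("risk ratios:", risk_ratio(by_retirement), risk_ratio(by_move))

# A common event (hypothetical): 80% risk in one group, 40% in the other.
common = ((80, 20), (40, 60))
print("common event:", risk_ratio(common), odds_ratio(common))
```

The odds ratio is unchanged by the transposition while the risk ratio is not, and for the common event the odds ratio is three times the risk ratio.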

References

  1. Akgündüz, Y. E., Bağır, Y. K., Cılasun, S. M., & Kırdar, M. G. (2021). Consequences of a massive refugee influx on firm performance and market structure. Technical Report 21/01.

  2. Arah, O. A. (2008). The role of causal reasoning in understanding Simpson’s paradox, Lord’s paradox, and the suppression effect: Covariate selection in the analysis of observational studies. Emerging Themes in Epidemiology, 5(1), 1–5.

  3. Barros, A. J., & Hirakata, V. N. (2003). Alternatives for logistic regression in cross-sectional studies: an empirical comparison of models that directly estimate the prevalence ratio. BMC Medical Research Methodology, 3(1), 1–13.

  4. Benson, M., & O’Reilly, K. (2015). From lifestyle migration to lifestyle inmigration: Categories, concepts and ways of thinking. Migration Studies, 4(1), 20–37.

  5. Borjas, G. (1998). Immigration and welfare magnets. Technical report.

  6. Brown, D. (2008). Rural Retirement Migration. Dordrecht: Springer.

  7. Burke, J., & Miller, A. R. (2017). The effects of job relocation on spousal careers: Evidence from military change of station moves. Economic Inquiry, 56(2), 1261–1277.

  8. Card, D. (2001). Estimating the return to schooling: Progress on some persistent econometric problems. Econometrica, 69(5), 1127–1160.

  9. Cebula, R. J. (1974). Interstate migration and the Tiebout hypothesis: An analysis according to race, sex and age. Journal of the American Statistical Association, 69(348), 876–879.

  10. Chau, N. H. (1997). The pattern of migration with variable migration cost. Journal of Regional Science, 37(1), 35–54.

  11. Clark, D. E., & Hunter, W. J. (1992). The impact of economic opportunity, amenities and fiscal factors on age-specific migration rates. Journal of Regional Science, 32(3), 349–365.

  12. Costanza, M. (1995). Matching. Preventive Medicine, 24(5), 425–433.

  13. Dao, M., Furceri, D., & Loungani, P. (2017). Regional labor market adjustment in the United States: Trend and cycle. The Review of Economics and Statistics, 99(2), 243–257.

  14. Deeks, J. (1998). When can odds ratios mislead? Odds ratios should be used only in case-control studies and logistic regression analyses. BMJ, 316(7136), 989–991.

  15. Detang-Dessendre, C., Drapier, C., & Jayet, H. (2004). The impact of migration on wages: Empirical evidence from French youth. Journal of Regional Science, 44(4), 661–691.

  16. DeWaard, J., Johnson, J. E., & Whitaker, S. D. (2018). Internal migration in the United States: A comparative assessment of the utility of the Consumer Credit Panel.

  17. Dixon, D. O., & Simon, R. (1992). Bayesian subset analysis in a colorectal cancer clinical trial. Statistics in Medicine, 11(1), 13–22.

  18. Duncan, D. T., Aldstadt, J., Whalen, J., White, K., Castro, M. C., & Williams, D. R. (2012). Space, race, and poverty: Spatial inequalities in walkable neighborhood amenities? Demographic Research, 26, 409–448.

  19. Dyck, D. V., Cardon, G., Deforche, B., & Bourdeaudhuij, I. D. (2011). Do adults like living in high-walkable neighborhoods? Associations of walkability parameters with neighborhood satisfaction and possible mediators. Health & Place, 17(4), 971–977.

  20. Edwards, C. (2018). Tax reform and interstate migration. Technical Report 84, Cato Institute.

  21. Faggian, A., & McCann, P. (2009). Universities, agglomerations and graduate human capital mobility. Tijdschrift voor Economische en Sociale Geografie, 100(2), 210–223.

  22. Fee, K., Wardrip, K., & Nelson, L. (2019). Opportunity occupations revisited: Exploring employment for sub-baccalaureate workers across metro areas and over time. Philadelphia Federal Reserve: Technical report.

  23. Fournier, G. M., Rasmussen, D. W., & Serow, W. J. (1988). Elderly migration: For sun and money. Population Research and Policy Review, 7(2), 189–199.

  24. Fox, W. F., Herzog, H. W., & Schlottman, A. M. (1989). Metropolitan fiscal structure and migration. Journal of Regional Science, 29(4), 523–536.

  25. Garip, F. (2008). Social capital and migration: How do similar resources lead to divergent outcomes? Demography, 45(3), 591–617.

  26. Greenwood, M. J. (1981). Migration and Economic Growth in the United States: National, Regional, and Metropolitan Perspectives. London: Academic Press.

  27. Greenwood, M. J., & Sweetland, D. (1972). The determinants of migration between standard metropolitan statistical areas. Demography, 9(4), 665.

  28. Gurak, D. T., & Kritz, M. M. (2000). The interstate migration of US immigrants: Individual and contextual determinants. Social Forces, 78(3), 1017–1039.

  29. Hernán, M. A., Clayton, D., & Keiding, N. (2011). The Simpson’s paradox unraveled. International Journal of Epidemiology, 40(3), 780–785.

  30. Hernández-Murillo, R., Ott, L. S., Owyang, M. T., & Whalen, D. (2011). Patterns of interstate migration in the United States from the Survey of Income and Program Participation. Federal Reserve Bank of St. Louis Review, 93(3), 169–185.

  31. Herzog, H. W., & Schlottmann, A. M. (1986). State and local tax deductibility and metropolitan migration. National Tax Journal, 39(2), 189–200.

  32. Hidalgo, M. D., & López-Pina, J. A. (2004). Differential item functioning detection and effect size: A comparison between logistic regression and Mantel–Haenszel procedures. Educational and Psychological Measurement, 64(6), 903–915.

  33. Hyatt, H., McEntarfer, E., Ueda, K., & Zhang, A. (2018). Interstate migration and employer-to-employer transitions in the United States: New evidence from administrative records data. Demography, 55(6), 2161–2180.

  34. Ihrke, D. K., & Faber, C. S. (2012). Geographical mobility: 2005–2010. Technical report: US Census Bureau.

  35. Jackman, R., & Savouri, S. (1992). Regional migration in Britain: An analysis of gross flows using NHS central register data. The Economic Journal, 102(415), 1433.

  36. Katz, L. F., & Blanchard, O. (1992). Regional evolutions. Technical Report 1.

  37. Kennan, J., & Walker, J. R. (2011). The effect of expected income on individual migration decisions. Econometrica, 79(1), 211–251.

  38. Kleven, H., Landais, C., Muñoz, M., & Stantcheva, S. (2020). Taxation and migration: Evidence and policy implications. Journal of Economic Perspectives, 34(2), 119–142.

  39. Krieg, R. G. (1997). Occupational change, employer change, internal migration, and earnings. Regional Science and Urban Economics, 27(1), 1–15.

  40. Lerman, K. (2017). Computational social scientist beware: Simpson’s paradox in behavioral data. Journal of Computational Social Science, 1(1), 49–58.

  41. Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719–748.

  42. McKinnish, T. (2005). Importing the poor. Journal of Human Resources, XL(1), 57–76.

  43. McKinnish, T. (2007). Welfare-induced migration at state borders: New evidence from micro-data. Journal of Public Economics, 91(3–4), 437–450.

  44. McLeman, R. A., & Hunter, L. M. (2010). Migration in the context of vulnerability and adaptation to climate change: Insights from analogues. Wiley Interdisciplinary Reviews: Climate Change, 1(3), 450–461.

  45. Molloy, R., Smith, C. L., & Wozniak, A. (2011). Internal migration in the United States. Journal of Economic Perspectives, 25(3), 173–196.

  46. Nelson, M. A., & Wyzan, M. L. (1989). Public policy, local labor demand, and migration in Sweden, 1979–1984. Journal of Regional Science, 29(2), 247–264.

  47. Nunn, R., Kawano, L., & Klemens, B. (2018). Unemployment insurance and worker mobility. Urban-Brookings Tax Policy Center: Technical report.

  48. O’Reilly, K. (2016). Lifestyle Migration. London: Routledge.

  49. Pack, J. R. (1973). Determinants of migration to central cities. Journal of Regional Science, 13(2), 249–260.

  50. Palloni, A., Massey, D. S., Ceballos, M., Espinosa, K., & Spittel, M. (2001). Social capital and international migration: A test using information on family networks. American Journal of Sociology, 106(5), 1262–1298.

  51. Pearce, N. (2004). Effect measures in prevalence studies. Environmental Health Perspectives, 112(10), 1047–1050.

  52. Preuhs, R. R. (1999). State policy components of interstate migration in the United States. Political Research Quarterly, 52(3), 527–547.

  53. Quinn, M. A., & Rubb, S. (2005). The importance of education-occupation matching in migration decisions. Demography, 42(1), 153–167.

  54. Ranganathan, P., Aggarwal, R., & Pramesh, C. (2015). Common pitfalls in statistical analysis: Odds versus risk. Perspectives in Clinical Research, 6(4), 222.

  55. Rapoport, J. (2018). The faster growth of larger, less crowded locations. Economic Review, 103(4), 5–38 (Fourth Quarter).

  56. Roback, J. (1982). Wages, rents, and the quality of life. Journal of Political Economy, 90(6), 1257–1278.

  57. Rogers, H. J., & Swaminathan, H. (1993). A comparison of logistic regression and Mantel–Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17(2), 105–116.

  58. Rose, S., & van der Laan, M. J. (2009). Why match? Investigating matched case-control study designs with causal effect estimation. The International Journal of Biostatistics. https://doi.org/10.2202/1557-4679.1127.

  59. Rothman, K. J. (2012). Modern Epidemiology. Philadelphia: LWW.

  60. Sala, H., & Trivín, P. (2014). Labour market dynamics in Spanish regions: Evaluating asymmetries in troublesome times. SERIEs, 5(2–3), 197–221.

  61. Sander, N., & Bell, M. (2013). Migration and retirement in the life course: An event history approach. Journal of Population Research, 31(1), 1–27.

  62. Schmidt, C. O., & Kohlmann, T. (2008). When to use the odds ratio or the relative risk? International Journal of Public Health, 53(3), 165–167.

  63. Simpson, E. H. (1951). The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society: Series B (Methodological), 13(2), 238–241.

  64. Sjaastad, L. A. (1962). The costs and returns of human migration. Journal of Political Economy, 70(5 Part 2), 80–93.

  65. Stark, O., & Bloom, D. E. (1985). The new economics of labor migration. The American Economic Review, 75(2), 173–178.

  66. Tcha, M. (1995). Altruism, household size and migration. Economics Letters, 49(4), 441–445.

  67. Tiebout, C. M. (1956). A pure theory of local expenditures. Journal of Political Economy, 64(5), 416–424.

  68. Tunaru, R. (2001). Models of association versus causal models for contingency tables. Journal of the Royal Statistical Society Series D (The Statistician), 50(3), 257–269.

  69. Vedder, R. (1990). Tiebout, taxes, and economic growth. Cato Journal, 10(1), 91–108.

  70. Wacholder, S. (1986). Binomial regression in GLIM: Estimating risk ratios and risk differences. American Journal of Epidemiology, 123(1), 174–184.

  71. Walker, K. E. (2017). The shifting destinations of metropolitan migrants in the US, 2005–2011. Growth and Change, 48(4), 532–551.

  72. Yarnold, P. R. (1996). Characterizing and circumventing Simpson’s paradox for ordered bivariate data. Educational and Psychological Measurement, 56(3), 430–442.

  73. Young, C., Varner, C., Lurie, I. Z., & Prisinzano, R. (2016). Millionaire migration and taxation of the elite. American Sociological Review, 81(3), 421–446.

  74. Yule, G. U. (1903). Notes on the theory of association of attributes in statistics. Biometrika, 2(2), 121–134.

Funding

This article was written in the course of business by a US Treasury employee, as part of a project to improve tax modeling via improvements in demographic modeling.

Author information

Contributions

Sole author. Much of the data preparation work was done before and independently of this study, as acknowledged in Sect. 3.

Corresponding author

Correspondence to Ben Klemens.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is a component of a larger study on the population characteristics underlying models of the inputs to tax revenue calculations, and how they evolve over time. Thanks to David Bridgeland, Randy Capps, Adam Cole, Aaron Schumacher, Bethany DeSalvo, Robin Fisher, Chung Kim, Gray Kimbrough, Elizabeth Landau, Ithai Lurie, Nick Turner, Elizabeth Maggie Penn, Joshua Tauberer, and the compilers of the data bank, Raj Chetty, John Friedman, Emmanuel Saez, Danny Yagan, and their counterparts at IRS.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 284 KB)

About this article

Cite this article

Klemens, B. An analysis of US domestic migration via subset-stable measures of administrative data. J Comput Soc Sc 5, 351–382 (2022). https://doi.org/10.1007/s42001-021-00124-w

  • DOI: https://doi.org/10.1007/s42001-021-00124-w
