Population scenarios for U.S. states consistent with shared socioeconomic pathways

There is a growing demand for subnational population projections to inform potential demographic influences on many aspects of society and the environment at the scale at which interactions occur and actions are taken. Existing US subnational population projections have not fully accounted for regional variations in demographic rates and therefore underestimate the uncertainties in and heterogeneity of population trends. We present a first set of population projections for US states that span a wide but plausible range of population outcomes driven by changing state-level demographic rates consistent with the widely used SSP scenario framework. The projections are carried out for all 50 states integrated through bilateral gross migration flows. They update the original national-level SSP population projections based on recently available data and introduce more plausible assumptions on long-term international migration. We project a national population ranging from about 250 to 650 million by 2100, somewhat lower than the SSP projections due mainly to updated base year data. Utah and other states in the Rocky Mountain region see the largest increases in population in proportional terms, while the Northeast and Great Lakes regions see the slowest growth or most decline, along with individual states like Alaska, California, Louisiana, and Mississippi. Aging occurs in all states and scenarios, but is most prominent in the Northeast, Florida, and in some cases states in the West and the Great Lakes region. The relative contributions of fertility, mortality, and migration to population change vary substantially across states.


Introduction
Population projections play an important role in understanding how potential changes in demographics may influence many aspects of society and environment. They inform our understanding of future needs, risks, and opportunities related to health, employment, housing, education, financial transfer programs and taxation, and demands for transportation, energy facilities, and various other goods and services (Fauth and Gomez-Ibanez 1979, Bjornstad 1979, Hogan and Roberts 2015, Mccue and Herbert 2016, Shaw 2018, CBO 2018). Not least among these issues are the implications such changes may have for human interactions with the environment. Population changes play a key role in demands for natural resources, with consequences for pollutants, biodiversity, and climate, and are also a determinant of the vulnerability or resilience of society to environmental changes (Muttarak and Jiang 2016).
Population projections have typically been produced and used at the national level, including those with global coverage (UNDP 2019). They are also produced for smaller regions, such as local areas, cities, or larger administrative units (Smith et al 2001). Gridded projections of local outcomes that are national in coverage have been produced as well. For example, the Shared Socioeconomic Pathways (SSPs) are a widely used scenario framework developed at the level of large world regions (O'Neill et al 2014), with quantitative elements, including population, projected at the national level (KC and Lutz 2017). These national projections have been downscaled to grid cells (Jones and O'Neill 2016, Murakami and Yamagata 2019) in order to provide spatial information.
However, in large countries such as the US, this approach insufficiently captures regional heterogeneity in demographic processes and outcomes (Jones and O'Neill 2016), which requires population scenarios at a scale that lies between the national and the local. Projections at this regional scale would improve local-scale projections by providing subnational boundary conditions that capture regional heterogeneity. They would also better inform research and decision-making that occur at the level of sectors and systems that themselves have substantial regional heterogeneity. For example, research and decision-making on subnational dimensions of climate change (Hsiang et al 2017), energy supply systems (Feijoo et al 2018, Hostick et al 2014), and watersheds (Voisin et al 2019) occur at subnational scales, between the national and local. Subnational scenario development at a scale more relevant to the systems being affected or taking actions is also occurring, for example integrating across sectors to assess future climate-related risks and response options (Absar and Preston 2015, Shi et al 2016). Planning decisions also benefit from regionally differentiated population projections, including for transportation infrastructure (Chi et al 2019), housing (Li et al 2019), and schooling (Maryland Department of Planning 2017).
In the US, there are currently no regular subnational projections carried out by demographic institutions. The most recent state population projections by the US Census Bureau (2005) date from 2005, and the Bureau has no plan to produce new ones. Although many states produce their own population projections, they use various methods with a range of mathematical sophistication and, considered as a group, do not constitute meaningful national scenarios. Moreover, these projections typically extend only to 2030 or 2040, so they cannot be used in longer-term applications.
Existing long-term subnational population projections for the US, such as the county-level projections by the EPA (Bierwagen et al 2010, EPA 2017), Oak Ridge National Laboratory (Mckee et al 2014), and Hauer (2019), are valuable contributions but also have some methodological and data limitations. The EPA's projection is based on a hybrid of a cohort-component model and a gravity-regression model. In the cohort-component model, the same fertility and mortality rates are assumed for all counties, leaving out the large variations in these demographic conditions across subnational areas. The gravity-regression model produces the number of net migrants for two large age groups (under or above 50), adding the numbers to the populations for each county in turn rather than simultaneously, which can lead to mis-estimation of subnational population changes. Mckee et al (2014) focus primarily on the spatial allocation of population within counties, but also produce a county-level population projection for the US through 2050 using the cohort-component method. Due to data limitations, their county-to-county migration rates under-represent some subgroups of the population, and the demographic rates that produce the ultimate county- and state-level outcomes remain unclear.
The projections by Hauer derive population size at the county level based on past trends in population size rather than modeling outcomes as a result of projected changes in fertility, mortality, and migration. While this approach has the advantage of greatly reduced data requirements (Smith 1987), this type of trend extrapolation method is not suitable for long-term population projections (Woodward 1991, Smith et al 2001). Moreover, this method does not explicitly account for the effects of demographic events, and the results do not reflect regional variations in population age structure.
We present a first set of US demographic projections at the state level, providing outcomes that are methodologically transparent and constitute meaningful scenarios at both the state and national levels. The projection model is initialized with recent state-level data on age- and sex-specific population, fertility, mortality, and internal and international migration rates. This step captures substantial current demographic heterogeneity across states. Future scenarios of state-specific fertility, mortality, and migration are constructed with two purposes: (1) to span a wide but plausible range of population outcomes for the western United States, the geographic focus of an integrated project on energy, water, and land systems for which the projections were developed (https://im3.pnnl.gov/), and (2) to be consistent with the Shared Socioeconomic Pathways (SSPs; O'Neill et al 2014, Jiang 2014), a widely used scenario framework for integrated human-environment analysis. The latter purpose is intended to make the projections more broadly relevant beyond their initial application.
To achieve these purposes, we developed projections based on three SSPs (SSP 2, 3, and 5) whose qualitative assumptions would produce the widest range of western region population size. Definition of scenario assumptions was consistent with the SSP narratives (O'Neill et al 2017) and the existing national-level SSP population projections from KC and Lutz (2017). We develop a novel set of scenarios for state-to-state migration rates within the US with a rationale linked to the broader SSP storylines.
At the national level, our scenarios also span the full range of population outcomes across SSPs, although we do not require our national outcomes to exactly match the numeric outcomes of the original SSP projections. In addition, our projections use recent demographic data through 2015, compared to the 2005 initial year of the SSP national projections. They therefore represent an update of the SSP national population projections for the US.
Subsequent sections of the paper briefly describe the multiregional model used for the state population projections, then introduce the data used as input for running the projections, and summarize the development of scenario assumptions for the demographic components of state population dynamics. We then present the main results and close with discussion and conclusions.

Method and data
The method used in this research is a cohort-component population projection model extended for multiregional demography (Rogers 1975, 1995), in which the national population is divided into subnational units according to their socioeconomic, political, or demographic significance (here we use US states). It simultaneously projects population changes for all the states, with the national total produced as the sum across the states. This approach provides a transparent accounting for the factors driving demographic change in each state and nationally, allowing for clearer interpretation of outcomes. The model is described in detail in the supplementary information. Here we provide a summary of its key features:
• Population is disaggregated by gender and single year of age in each state, beginning in the base year 2010.
• Bilateral gross in- and out-migration flows by age and gender across all states are explicitly used in the modeling process.
• Women of childbearing ages give birth according to age-specific fertility rates for the states in which they reside or to which they move.
• Mortality occurs according to age-, gender-, and state-specific mortality rates.
• Population changes as a result of the joint effects of migration, fertility, and mortality according to a large matrix of demographic rates (a transition matrix of (101 ages × 2 genders × 50 states)^2 ≈ 1.02 × 10^8 cells).
• The total number of net international migrants is added at the beginning of each year and allocated to each state according to the age, gender, and state distribution of international migrants in the base year.
• Assumptions on future changes in the total fertility rate (TFR), life expectancy at birth, state-to-state migration rates, and number of net international immigrants are derived and applied to the age- and gender-specific demographic rates to derive future population by age, gender, and state.
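The key features listed above can be illustrated with a minimal Python sketch of a single projection year. This is not the model documented in the supplementary information: the function name, the array layout, the order of operations within the year, the treatment of the last age as an open-ended group, the omission of infant mortality, and the `female_birth_share` value are all simplifying assumptions made here for illustration.

```python
import numpy as np

# Size of the full transition matrix quoted in the text:
# (101 ages x 2 genders x 50 states) squared.
N_CELLS = (101 * 2 * 50) ** 2  # 102,010,000, i.e. about 1.02e8


def project_one_year(pop, survival, migration, fertility,
                     net_intl, intl_share, female_birth_share=0.488):
    """Advance a multiregional population by one year (illustrative only).

    pop        : (S, G, A) people by state, gender (0 = female), single year of age
    survival   : (S, G, A) one-year survival probabilities
    migration  : (S, S, G, A) share of residents of origin state i found in
                 destination state j after one year (diagonal = stayers)
    fertility  : (S, A) births per woman by state and age
    net_intl   : total net international migrants in the year (scalar)
    intl_share : (S, G, A) base-year distribution of those migrants, sums to 1
    female_birth_share : assumed share of births that are female (hypothetical)
    """
    # 1. Add net international migrants at the beginning of the year.
    pop = pop + net_intl * intl_share

    # 2. Redistribute people across states using bilateral gross flows.
    moved = np.einsum("ijga,iga->jga", migration, pop)

    # 3. Births follow the fertility rates of the state of residence
    #    after migration (women are gender index 0).
    births = (fertility * moved[:, 0, :]).sum(axis=1)

    # 4. Apply state-, gender-, and age-specific survival, then age by one year.
    survivors = moved * survival
    nxt = np.zeros_like(survivors)
    nxt[:, :, 1:] = survivors[:, :, :-1]
    nxt[:, :, -1] += survivors[:, :, -1]  # open-ended oldest age group

    # 5. Newborns enter age 0 (infant mortality ignored in this sketch).
    nxt[:, 0, 0] = births * female_birth_share
    nxt[:, 1, 0] = births * (1.0 - female_birth_share)
    return nxt
```

Because all states are advanced in the same step, the national total is simply the sum of the returned array, which is the accounting property the multiregional approach relies on.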
The model requires data for the base year (2010) population, mortality, and migration (domestic and international) by age, gender, and state, as well as fertility by age and state. We derive model inputs based on data mainly from the 2010 US Census, the American Community Survey (ACS), and the US Centers for Disease Control (CDC) Wonder database. Derivation of input data is described in supplementary information; here we only give a brief description and highlight the heterogeneity in outcomes for bilateral migration across states, which is key to multiregional projection but usually difficult to obtain.
Using the ACS data, we derive domestic migration for each state and aggregate the results for divisions and regions, which reveals broader regional variations in the migration flows (supplementary information figures S6, S7, and S8 (https://stacks.iop.org/ERL/15/094097/mmedia)). Figure 1 shows differences in migrant profiles by age and gender across states, using the migration rates between New York and Florida and between New York and California as examples. It shows that a large proportion of older New Yorkers migrate to Florida, while migrants from Florida to New York are mostly young adults. Migration rates from New York to California are much higher than from California to New York. Migrants from New York to California are predominantly young adults, and more females move from California to New York than males.

Scenario development
Our scenario development aims to span a wide range of outcomes for the western US while also being consistent with the SSP framework. In the SSPs, the US experiences a medium level of population growth in both SSP1 ('Sustainability') and SSP2 ('Middle of the Road'), driven by medium levels of fertility, mortality, and international migration. Population growth is low in SSP3 ('Regional Rivalry') due to low fertility and international migration along with high mortality, and growth is high in SSP5 ('Fossil-fueled Development') due to high fertility and international migration along with low mortality. We use this set of SSPs for our state-level projections, choosing SSP2 over SSP1 as our preferred representative of medium population changes due to its middle of the road assumptions across most aspects of societal change.
To make assumptions for each state on future changes in fertility, mortality, and international migration, we assume that the changes in these rates at the national level apply equally (in proportional terms) to all states. While it is certainly plausible and even likely that experience will vary across states, we view this first attempt at state-level projections based on SSPs to be the most logical benchmark projection, from which future work could deviate.
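The proportional assumption described above can be sketched as follows. The function name and the numeric values in the usage note are hypothetical and are not taken from the projection inputs; the sketch only shows that each state keeps its base-year level relative to the nation while following the national trend.

```python
def scale_state_rates(state_base, national_base, national_future):
    """Apply the national proportional change in a rate to every state.

    state_base      : dict mapping state name -> base-year rate (e.g. TFR)
    national_base   : national rate in the base year
    national_future : national rate in the target year under a given SSP
    """
    # The same proportional change is applied to all states, so
    # between-state differences are preserved in relative terms.
    ratio = national_future / national_base
    return {state: rate * ratio for state, rate in state_base.items()}
```

For example, with illustrative base-year TFRs of 2.4 and 1.8 for two states and a national TFR falling from 2.0 to 1.8, the state values become 2.16 and 1.62, respectively.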
We introduce three important changes relative to the national level SSPs that we believe improve on those projections. First, we update them with more recent data. The original national SSP projections (KC and Lutz 2017) are based on data for the year 2005. We use a base year of 2010 for our projection, and assume that national level fertility, mortality, and international migration are consistent with recent data in all scenarios through 2015, and then gradually merge with the original scenario assumptions by 2040 (figure 2), to preserve the judgments of the SSP projection authors on the range of future uncertainty. Experience over the period 2005-2015 for fertility and international migration has followed SSP3 more closely than other SSPs, toward lower fertility and migration levels. Thus the modification has the effect of accounting for this recent trend and lower starting point for these rates.
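One simple way to express the gradual merging of recent data with the original scenario assumptions is a weighted average whose weight shifts over 2015-2040. The text does not state the functional form, so the linear weighting below, like the function and argument names, is an assumption made for illustration.

```python
def blended_rate(year, observed_2015, ssp_path, start=2015, end=2040):
    """Blend the observed 2015 level with the original SSP path.

    observed_2015 : rate observed in recent data (held through `start`)
    ssp_path      : callable mapping a year to the original SSP value
    start, end    : years over which the two are merged (2015 and 2040 in the text)
    """
    if year <= start:
        return observed_2015
    if year >= end:
        return ssp_path(year)
    # Assumed linear weight: 0 at `start`, 1 at `end`.
    weight = (year - start) / (end - start)
    return (1.0 - weight) * observed_2015 + weight * ssp_path(year)
```

By construction the blended value equals the observed level through 2015 and the original SSP value from 2040 onward, which preserves the long-term range judged plausible by the SSP projection authors.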
The second change we introduce is to modify the original assumption about the long-term level of international migration. The original SSP projections assume that net immigration after 2050 will linearly decline to zero by the end of the century, based on an argument that it is better to assume zero net migration when there is deep uncertainty about future trends. However, this assumption has been shown to have significant consequences in countries, like the US, with substantial international migration (Abel 2018). We therefore assume that the numbers of net immigrants will remain constant beyond 2050 at their 2050 levels for SSP2 and SSP5 and that they will decline to zero by the end of the century, as in the original SSPs, only in SSP3, the low migration scenario (figure 2(b)). These assumptions are broadly consistent with historical experience in the US, where the numbers of net international migrants fluctuated over time, ranging from almost zero during the economic depression in the 1930s to about 1.8 million per year in the late 1990s (Martin 2014).
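The post-2050 assumption on net international migration can be written compactly as below. This is a sketch with a hypothetical function name; the 2050 level is passed in as an argument because the scenario-specific values are shown only in figure 2(b), and the linear decline for SSP3 follows the original SSP assumption described above.

```python
def net_international_migrants(year, level_2050, scenario):
    """Annual net international migrants for 2050 <= year <= 2100."""
    if not 2050 <= year <= 2100:
        raise ValueError("this sketch covers 2050-2100 only")
    if scenario in ("SSP2", "SSP5"):
        # Held constant at the 2050 level.
        return level_2050
    if scenario == "SSP3":
        # Linear decline from the 2050 level to zero in 2100.
        return level_2050 * (2100 - year) / 50
    raise ValueError(f"unknown scenario: {scenario}")
```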
The third difference is that we make assumptions on changes in internal migration, a process that is not included in the national SSP population projections. We based our assumptions on the SSP narratives and on historical experience, aiming to span a range of migration that broadly captures US trends and levels over the past 70 years (Molloy et al 2011). We assumed first that trends in internal migration would be qualitatively similar to those in international migration; that is, in a scenario like SSP5 with high international migration, domestic migration would also be high. This extends the SSP5 concept of a globalizing world, with low barriers to people, capital, and ideas crossing international borders, to low barriers domestically as well. The same match between domestic and international assumptions is applied to the low migration SSP3 scenario. Quantitatively, we assume no changes to the bilateral migration rates under SSP2, a doubling of these rates by 2040 under SSP5, and a reduction in the rates by half by the end of the century under SSP3. These assumptions produce levels of migration, measured as the proportion of the total population of the US
Another consideration in scenario development is to account for the impacts of the recent COVID-19 pandemic on life expectancy. We do not explicitly include it in our scenarios because the anticipated effects on mortality, while important to public health, do not substantially change the outlook for long-term population outcomes (Goldstein and Lee 2020).
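The quantitative assumptions on state-to-state migration in this section (no change under SSP2, a doubling by 2040 under SSP5, and a halving by 2100 under SSP3) can be expressed as a time-varying multiplier on the base-year bilateral rates. The text gives only the endpoints, so the linear ramps starting from the 2010 base year, and the function name, are assumptions made here for illustration.

```python
BASE_YEAR = 2010


def internal_migration_multiplier(year, scenario):
    """Multiplier applied to base-year bilateral migration rates."""
    def progress(target_year):
        # Fraction of the way from the base year to the target year, in [0, 1].
        raw = (year - BASE_YEAR) / (target_year - BASE_YEAR)
        return min(max(raw, 0.0), 1.0)

    if scenario == "SSP2":
        return 1.0                          # rates unchanged
    if scenario == "SSP5":
        return 1.0 + progress(2040)         # doubles by 2040, then constant
    if scenario == "SSP3":
        return 1.0 - 0.5 * progress(2100)   # halves by 2100
    raise ValueError(f"unknown scenario: {scenario}")
```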

Projection results
We carried out projections for the period 2010-2100 for all US states, for three SSPs. The aggregate projection results for the country differ from the original SSP population projections (figure 4(a)) because they are based on more recent data, have different assumptions on international migration, and are carried out at the state level where shifts in the population compositions across states can change national population growth. In general, our projections are somewhat lower than the original SSPs, due largely to the fact that recent data have closely followed the relatively low trend in demographic rates in SSP3.
Consequently, the original SSP projections are similar to ours for SSP3, about 250 million by 2100, which is about 70 million less than the current (2015) population. For SSP5 we project a population of about 650 million by 2100, which is about 70 million less than the original SSP, although still almost twice the current population.
At the subnational level, projected population trends vary substantially across states. Figure 4(b) shows four states as examples. In Colorado, projected changes are to some degree similar to the national projections, ranging from a modest decline to more than a doubling of population size over the century. New York in the Northeast and Illinois in the Midwest are projected to experience population decline even in SSP2, and to decline by nearly half in SSP3 by the end of the century, while increasing by less than 50% in SSP5. The population decline in New York and Illinois in SSP2 is largely due to net out-migration in state-to-state migration as well as to low fertility rates, even though both states gain population from international migration (see SI figure S4). In contrast, Utah will experience rapid population growth, increasing 2.5 times and 3.5 times by 2100 in SSP2 and SSP5, respectively, and growing less substantially in SSP3. The rapid population growth in Utah is largely driven by its well above replacement level fertility and young population age structure in the base year. We also compare our results to the state population projections by the US Census Bureau (Jiang et al 2018), which extend to 2030 (figure 4(b)). They are similar for most states but differ in some cases given the different base year data and assumptions on demographic rates in the scenarios.
The projection results show considerable changes in spatial population distribution across states in different SSPs. Figure 5 shows the percentage changes in state population size in 2050 and 2100 compared to the population size in 2010. While most states increase in population under SSP2 by 20%-50% in 2050 (figures 5-1(a)) and by 50%-100% in 2100 (figures 5-1(b)), populations in the Northeast and East North Central (Great Lakes) regions do not grow much or even decline (e.g. New Jersey) because of population loss from internal migration and because of low fertility.
Alaska experiences the largest population decline, mainly due to its large net out-migration during the last decade and the assumption of unchanged migration rates in the future. California, Louisiana, and Mississippi are the other states with slow population growth due to out-migration. In contrast, Utah is projected to increase in population by more than 50% in 2050 and by more than 100% in 2100, mainly because of large natural growth. Other states in the Rocky Mountain region also see substantial growth. North Dakota experiences growth on par with Utah, owing to high net in-migration in recent years associated with employment opportunities in the energy industry, which may not continue over the long term. Compared to the changes in SSP2, population growth in all states is much larger in SSP5 in 2050 (figures 5-2(a)). By the end of the century, population in most states more than doubles (figures 5-2(b)). In contrast, population in most states in SSP3 is projected to grow by only 5%-20% in 2050 (figures 5-3(a)). By the end of the century, the majority of states experience population decline (figures 5-3(b)).
Spatial variation is also observed in population age composition. Figure 6-1 shows that in 2010 the elderly aged 70 and above accounted for 10%-15% of the total population in most states. Only Utah and Alaska had less than 10%, while the figure in Florida, Maine, West Virginia, and Pennsylvania was between 15% and 20%. The projection results show that population aging will occur in all states in all SSPs, but to different extents across states and over time. In SSP2, the proportion of elderly population reaches 25%-30% by 2100 in most states, except for some states in the Central and South regions, where it remains at 20%-25%. Florida, Maine, and Vermont are the states with the highest proportion (above 30%, figures 6-2). Population aging differs substantially across states and SSPs, with much higher proportions of the elderly in SSP3 and much lower proportions in SSP5, because of the different assumed changes in fertility, migration, and mortality across SSPs.
Using the multiregional population model, we examine the net effects of different demographic components (fertility, mortality, internal and international migration) on the changes of population sizes and age compositions. For example, a decomposition analysis of the projection results in SSP2 shows very different sources of population growth across states (see figures S4 and S5). While natural growth (births minus deaths), internal migration, and net international migration contribute almost equally to population growth in Texas in 2010-2050, the population of New York during the same period grows only due to international migration and is substantially reduced due to internal migration, while the population of Florida increases through both internal and international migration but is reduced due to negative natural growth.
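The accounting identity behind such a decomposition can be sketched as follows. The function and argument names are hypothetical, and the sketch covers only the bookkeeping of components over a period, not the decomposition procedure used for figures S4 and S5.

```python
def decompose_change(births, deaths, in_migrants, out_migrants, net_international):
    """Split a state's population change over a period into its components.

    All inputs are counts accumulated over the same period for one state.
    """
    natural = births - deaths               # natural growth
    internal = in_migrants - out_migrants   # net state-to-state migration
    total = natural + internal + net_international
    return {
        "natural": natural,
        "internal": internal,
        "international": net_international,
        "total": total,
    }
```

A state can therefore grow overall while one component is negative, as in the New York and Florida cases described above.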

Discussion
This paper presents a first set of SSP-based population projections for US states using a multiregional demographic method and accounting for varying population dynamics across states. Results show that the spatial distribution of population and its age structure can vary widely across SSPs. In addition, our aggregate results for the whole US revise the original SSP population projections, based on updated demographic data showing that the country has followed demographic trends closest to the low growth SSP3 scenario over the past decade.
Compared to existing subnational population projections, our projections apply a multiregional model that projects all states simultaneously, linked by migration, and do not require iterative adjustments afterward to match a separately produced national projection. Using age- and gender-specific gross bilateral migration rates between states, we explicitly account for the interactions between states over time and for the effect of changes in state population sizes and age and gender structures on migration flows. Results show distinctive patterns of regional variation in both population size and age composition, and in the demographic driving forces of these variations between states and across SSPs. These projection results provide relevant information for researchers and practitioners to consider demographically differentiated challenges of climate change impacts, as well as other socioeconomic and environmental stresses over the coming decades.
Moreover, the multiregional method provides a useful tool for examining the relative contributions of each demographic component (fertility, mortality, and internal and international migration) to population trends. This capacity is of particular use for developing extended SSP scenarios for subnational regions, with or without considering climate change and mitigation and adaptation policies.
In this first set of subnational population scenarios, we have not considered variations in overall patterns of differences in demographic behavior across states. Our next step is to explore scenarios that deviate from these overall patterns in order to fully capture future uncertainty in regional heterogeneity in social, economic, institutional, and geographic conditions. Another limitation of our current projections is that we do not account for the impacts of climate change on demographic rates. For instance, a changing climate may cause more people to leave states that experience higher temperatures, altering current patterns of southward and westward migration flows, and could also affect the age profiles of migrants. Finally, it would be beneficial to carry out similar subnational population projections in other large countries.