Disease and fertility: Evidence from the 1918–19 influenza pandemic in Sweden

What are the consequences of a severe health shock like an influenza pandemic on fertility? Using rich administrative data and a difference-in-differences approach, we evaluate fertility responses to the 1918–19 influenza pandemic in Sweden. We find evidence of a small baby boom following the end of the pandemic, but we show that this effect is second-order compared to a strong long-term negative fertility effect. Within this net fertility decline there are compositional effects: we observe a relative increase in births to married women and to better-off families. Several factors – including disruptions to the marriage market and income effects – contribute to the long-term fertility reduction. The results are consistent with studies that find a positive fertility response following natural disasters, but we show that this effect is short-lived. © 2021 Elsevier B.V. All rights reserved.


Introduction
A central line of inquiry in economic and demographic research concerns how fertility responds to changes in mortality. Yet we have limited knowledge of the causal relationship between pandemics and fertility, and particularly few insights into the time dynamics of fertility responses to major health shocks. A pandemic can cause major losses, and in a globalized world where viruses can spread quickly, insights into whether, when, and why fertility changes together with mortality seem highly relevant.
A handful of studies examine fertility responses to pandemics. The results suggest there are immediate negative effects with fewer births six to nine months after the mortality peak, pointing to increased miscarriages, stillbirths and pre-term deliveries (Bloom-Feshbach et al., 2012; Chandra and Yu, 2015b,a; Chandra et al., 2018; Guimbeau et al., 2020), followed by increased fertility in the short run (Mamelund, 2004; Donaldson and Keniston, 2015). 1 This short-run positive response aligns with findings in the literature on the fertility effects of mortality following wars and natural disasters, which shows that birth rates tend to increase in the short term (see, e.g. Nobles et al. (2015) on the tsunami in South-East Asia; Pörtner (2008) on hurricanes in Guatemala; Finlay (2009) on severe earthquakes; Lindstrom and Berhanu (1999) and Agadjanian and Prata (2002) on war). Short-run positive fertility effects can be explained by postponement or replacement fertility, but in theory such effects should no longer be present in the longer run.
This paper uses detailed information on the 1918-19 influenza pandemic in Sweden to study its effects on subsequent fertility rates using a difference-in-differences approach. The influenza pandemic was unforeseen and provides a unique opportunity to study fertility dynamics following a severe morbidity and mortality shock. Assembling administrative information from parish records, censuses, chief medical officer reports and midwife journals, we create a purpose-built historical database of high-quality data for a country that was neutral during World War I (WWI). As discussed by Beach et al. (2020), WWI is a potential confounder when studying fertility effects, as the war likely affected the marriage market, but also incomes and female participation in the labor market in countries involved in the war. Our data and design allow us to study immediate, short-term and long-term fertility responses to the pandemic, and to assess the plausibility of various mediating factors.
A major contribution of the paper is that we contrast fertility effects observed over different time horizons, and we are able to show that different observation windows may lead to very different conclusions about the impact of the pandemic. We also seek to analyze possible mechanisms behind the observed fertility changes beyond biological effects. Despite its relevance for improving our understanding of the relationship between mortality and fertility, the role of mechanisms has often been overlooked in the empirical literature. A pandemic may have psychological effects but also alter economic conditions, introduce uncertainty and influence fertility decisions by disrupting family structures and marriage markets. 2 We examine economic mechanisms and marriage market effects, but also provide insights into whether certain groups changed their fertility behavior more than others.
The paper is the result of a vast data collection effort, combining various individual-level and aggregate data covering the entire Swedish population over a 13-year period. Our analytical sample includes the number of deaths from all causes, births, stillbirths, influenza and pneumonia cases and various mother and birth characteristics for about 400 urban and rural health districts located within 25 counties. The comprehensiveness of the data allows us to make a number of additional contributions to the literature. First, we can carefully assess the plausibility of the identifying assumptions, which leads to less concern about confounding factors biasing estimates. Second, covering the entire population implies high external validity compared to studies providing specific sub-population effects. This also implies that we have data from both rural and urban areas and can explicitly investigate different dynamics in different types of districts. 3 Third, while previous empirical studies generally focus on overall mortality, we consider both adult and child mortality as well as morbidity, which allows for different mechanisms operating in the immediate, short and long run. Finally, we can explicitly deal with internal migration, which otherwise confounds any analysis of the effects of a mortality shock.
After a short dip in conception rates during the pandemic, we find evidence of a small baby boom in rural areas after the peak of the pandemic. These results corroborate the fertility response noted after natural disasters. We further show that the positive short-term effect is driven by high social status parents: married couples, higher socioeconomic groups and mothers who already have at least one child contribute more than proportionally to the short-term increase in conceptions. This finding is interesting per se, but also of relevance for the large and widely cited literature on the fetal origins hypothesis following Barker (1990). Numerous studies show that in utero exposure to a health shock has consequences for health and socioeconomic status later in life (see, e.g. Almond et al., 2018; Helgertz and Bengtsson, 2019). These results rest on the assumption that people conceived during a health shock do not differ from those conceived shortly after other than through exposure. Some recent research revisits the literature which evaluates in utero exposure to the 1918-19 influenza and assesses the role of parental selection for the exposed cohorts (Brown and Thomas, 2018; Beach et al., 2018). 4 We find a shift towards higher social status parents after the pandemic. If children conceived shortly after the shock have a better predisposition than those conceived during the pandemic, adverse health and income effects of an in utero shock will be overestimated. The same caveat applies to results on intergenerational effects (see, e.g. Veenendaal et al., 2013) and parental responses (see, e.g. Almond and Mazumder, 2013) to prenatal exposure. Parman (2015) demonstrates, using the example of the Spanish flu, that the negative effects of in utero exposure can be further reinforced by parents reallocating resources towards older siblings who were not affected in utero by the shock. This emphasizes how a large mortality shock can disrupt family structures and the allocation of resources among children. We demonstrate that a large mortality shock will also affect decisions regarding family size.
Notably, the short-term fertility increase is swamped by a strong negative effect in the longer term. Areas greatly affected by the pandemic experience decreased fertility rates for years after the pandemic. Moving from the quartile of districts least affected by the flu in terms of adult mortality to the quartile of districts most affected is associated with a decline in the monthly conception rate of about 10.5 percent in the long run. We show that this negative fertility effect goes beyond the 'mechanical' effect of those adults lost to the flu not having children, and rather represents behavioral and economic effects, including disruptions in the marriage market (a persistent reduction in the proportion of married individuals in the population). The noted composition effect is exacerbated by a disproportionate reduction in fertility among unmarried people: the number of children born to married couples decreases less than general fertility in the aftermath of the disease, so that the noted marriage market disruptions explain a substantial part of the fertility drop.
All in all, the results suggest that a deadly pandemic will be felt decades later and that the long-run effects may be very different from the short-term effects. The historical context corresponds to a country during the fertility transition which makes our findings pertinent to many contemporary epidemic settings. 5 Our findings contribute to the understanding of the mortality-fertility link and show that the effects go well beyond those of direct exposure.

The 1918-19 influenza pandemic
The first recorded case of the influenza in Sweden was in June 1918. Initially, the seemingly mild flu caused little concern, but this situation soon changed. Influenza-related mortality and morbidity rates were particularly high from August 1918 to February 1919, peaking in October and November. A milder wave appeared in March 1919 and a final wave in early 1920. 6 Knowledge about the virus was limited. Flu vaccines were yet to be invented and the only effective measures were rest, hot blankets, cold compresses for headaches and drinking plenty of water (Mamelund, 2011).
According to official sources around 10 percent of the Swedish population was infected (Richter and Robling, 2013) and nearly one percent died (Karlsson et al., 2014), but death rates varied considerably across age groups and across the country. 7 The most heavily affected counties experienced death rates almost three times higher than the least affected counties (Åman, 1990). Despite a clear north/south county gradient, with higher mortality in the north, there was considerable heterogeneity across districts within each county: Fig. 1 shows district-level influenza and pneumonia morbidity, all-cause adult and child mortality rates for the period August 1918 to March 1919 (per 1,000 inhabitants). 8
Fig. 1. Influenza and pneumonia morbidity and overall mortality rates in Sweden during August 1918 to March 1919 (per 1,000 inhabitants). Note: Data correspond to the health district level. Legend categories represent quintiles.
The pandemic had several characteristics. First, in its most virulent form the influenza struck swiftly and unexpectedly. Most people died within 6-11 days after contracting the illness (Taubenberger and Morens, 2006). Second, the influenza affected the bronchus and the lungs which induced more pneumonia deaths (Morens and Fauci, 2007). Third, the pandemic was unique in that it primarily killed adults aged 20-40. Fig. 2 illustrates the age distribution of mortality during the pandemic in different ways. Fig. 2a compares influenza and pneumonia mortality rates by age and separated by gender in 1918 and 1917. Fig. 2b shows the elevation of overall mortality during the influenza period at the health district level. Fig. 2c and d show that the share of adults aged 20-40 in total mortality in relation to child mortality was many times higher during the pandemic than before and after. Research suggests that the reason for this mortality pattern was cytokine shock, an overreaction of the immune system (Kobasa et al., 2007) such that a strong immune system was a liability rather than an asset, and possibly a lack of prior exposure to similar viruses (Mamelund, 2011).
Given that the most deadly wave of the pandemic was unanticipated and short, it is unlikely that people adjusted fertility behavior in anticipation. 9 It was also impossible to foresee who had a higher infection risk. Men exhibited slightly higher mortality rates than women (see Table B1 in the appendix), but some evidence suggests that pregnant women in the last trimester were especially susceptible to the influenza, leading to early termination of pregnancy (Bloom-Feshbach et al., 2012; Barry, 2004; Bland, 1919). 10
Several European countries experienced a baby boom in the 1920s, commonly ascribed to WWI ending. The U.K. birth rate jumped from 18.3 births per 1,000 population in 1918 to more than 23 in 1919, but neutral countries like Sweden and Norway also exhibited elevated birth rates in the 1920s. Despite being neutral, these countries may of course still have been affected by the war ending, but it is notable that they did not experience any wartime fertility dip (Chesnais, 1992). Swedish fertility rates declined linearly from 1911 to 1919, and WWI neither accelerated nor decelerated this decrease (Statistics Sweden, 1999). The 1920 baby boom has therefore also been linked to the influenza pandemic (Mamelund, 2004). Fig. 3 shows the crude birth rate (CBR) for Sweden from 1915 to 1927, along with the CBR distribution across all health districts in each year. Fertility rates generally declined throughout the period, but a clear deviation from the trend appears in 1920-21.
2 Since marriage is traditionally seen as a proxy of fertility (Bongaarts, 1978) and was the main setting for childbearing in the early 20th century this may be an important pathway in the context of study.
3 A rural-urban divide seems highly relevant in a historical context, but seems to matter also in contemporary settings: Aassve et al. (2020) note possibly different post-pandemic fertility trajectories by income level and rural or urban area for the [...]
4 [...] Both studies focus on the U.S. which participated in WWI. While Brown and Thomas (2018) find no significant flu effects on later life outcomes after including proxies for parental characteristics of the 1919 cohort, the results of Beach et al. (2018) are more in line with Almond (2006) and largely unaffected by controlling for parental SES.
5 In Sweden, fertility began to decline around 1880 when the number of children to married women was above four. The fertility transition to below two children per woman was completed by the mid-1930s (cf. Strulik and Vollmer, 2015). According to Bengtsson and Dribe (2014) [...]
8 Geographic heterogeneity in 1918-19 influenza related mortality rates is also noted in other countries, and some studies try to identify possible determinants. For example Clay et al. (2019) show significant cross-city variation in excess mortality in the US and find high poverty and poor health levels contributed to pandemic severity. Similarly the on-going COVID-19 pandemic shows a very unequal impact of the virus across countries and regions (see e.g. Fenoll and Grossbard (2020) [...]

World War I and economic conditions
Sweden was neutral during WWI. Mortality rates were normal in the years prior to the pandemic, and morbidity and mortality record keeping was uninterrupted. Still the war affected the Swedish economy. The U.K. naval blockade and German naval belligerence hurt the country's import trade (Jörberg and Krantz, 1978) and price controls and food rationing were introduced. A poor harvest in the fall of 1916 led to food shortages and social unrest for a short period, but in general, the wartime period was characterized by adequate food supply (Nyström, 1994), and historical sources suggest that the economic impact of WWI was generally even across the country (Östlind, 1945).
Some sectors of the economy benefited from the war. Raw material exports increased and Swedish agriculture did well because of the lack of competitive imports (Schön, 2010), leading to a large trade surplus (Magnusson, 1996). Conversely, these sectors experienced a downturn after the war. After a period of growth the economy experienced a brief decline in 1920-21, where GDP dropped by five percent in one year and unemployment increased, but the country recovered quickly. Real wages were also positively affected, due in part to the introduction of the eight-hour working day (Jörberg and Krantz, 1978).
As pointed out by Beach et al. (2020) studying the fertility effect of the 1918-19 influenza in countries participating in WWI is difficult as the war may have affected fertility through, e.g., changing marriage markets since more men than women died in the war. Another issue related to WWI, pointed out by Brown and Thomas (2018) and Beach et al. (2018), is that there seems to have been a shift in fertility in the U.S. during the war, whereby cohorts exposed to the pandemic in utero were born to fathers of lower socioeconomic status compared to earlier cohorts. The shift followed from a social gradient in WWI mobilization and enlistment and from men being stationed outside the U.S. and likely also because of active family planning. With Sweden being neutral in WWI, there was no change in the male-to-female sex ratio following war-related deaths and war mobilization is less of a concern. Swedish men were not stationed outside the country, but families may have delayed births following the uncertainty surrounding the war. 11 Such behavior would depress birth rates before the influenza pandemic outbreak, which implies that estimated fertility effects will present a lower bound. Since we use the variation in birth rates over time and across districts, such behavior is only a problem if districts reacting to the war were also those being more affected by the pandemic.

Conceptual framework
This section provides a short outline of the theory guiding our analysis on fertility effects with a special focus on mechanisms and time dynamics. An important starting point is that a pandemic is only a temporary shock and should therefore not have an impact on desired fertility in the long run, unless it also affects determinants of desired fertility.
In terms of pathways we may first consider biological effects following a pandemic. Fertility may change if a pandemic reduces sexual activity or the ability to conceive. Infections may lead to pregnancy termination, and a spouse's death clearly reduces fertility prospects for the surviving spouse. Regarding the dynamics of this relationship, we expect an immediate negative fertility effect. While the effect stemming from infections is expected to fade out over time, a negative effect may also remain in the long run following spousal deaths, and with partner matching and remarriage taking time. 12 Such a long-run negative fertility effect may also follow as individuals not directly affected through the death of a spouse may face consequences of a mortality shock on the marriage market, as the sex ratio determines chances of finding a spouse (Becker, 1960, 1973, 1974) and as the population composition in terms of other traits, important for marriage market outcomes through assortative mating, may be affected.
Second, there may be important behavioral effects following a pandemic affecting fertility. These effects can be classified as psychological or economic. Psychological effects may be distinguished as either postponement or replacement fertility. Postponement fertility refers to delaying fertility decisions due to uncertainty about survival prospects or fear of contagion (Lee, 1981; Menken et al., 1981; Castro et al., 2015). With such postponement, fertility will decrease in the immediate term and then increase right after the peak of a pandemic, as couples who would have conceived anyway and couples who postponed fertility will conceive in the short run. Replacement fertility stems from couples losing children who then conceive again to replace a lost offspring (see Preston, 1978). It has also been shown that high mortality events may even trigger a society-wide action of population rebuilding, leading to new conceptions at the intensive and extensive margins (see Geertz, 1963; Grimard, 1993; Townsend, 1994; Conning and Udry, 2007). This kind of replacement effect is possibly stronger in more rural settings where communities are closer. In terms of time dynamics, replacement fertility increases conceptions in the short term, 13 but the short-run positive fertility effect following replacement and postponement should not be present in the long run.
A major epidemic may also have economic effects impacting the fertility decision, triggered by changes in relative prices and opportunity costs, but also by introducing uncertainty. Mortality within a family likely reduces incomes which may delay fertility as children are costly (Alam and Pörtner, 2018). Economic theory also suggests that the death of young adults will increase wages and wealth in the economy as labor supply sharply decreases, and fixed factors such as land and capital are shared by fewer people (see Boucekkine et al., 2009; Herlihy, 1997; Young, 2005). The substitution effect associated with such wage increases will reduce fertility as female labor supply likely increases and having children becomes relatively more costly. At the same time the income effect may increase fertility, as agents can afford to have more children (cf. Del Bono et al., 2015), although Galor and Weil (1996) show that the substitution effect may dominate if women's relative wages increase. A major pandemic may also fuel general perceptions of uncertainty about future economic conditions. If individuals avoid making long-term commitments following such uncertainty, family formation may be negatively affected. In terms of dynamics, the economic effects are expected to have an impact on fertility in the short and the long run.
11 See Richter and Robling (2013) for a discussion.
12 A fading out of an immediate negative effect also aligns with a story where women experiencing a termination of their pregnancy soon after can get pregnant again.
13 An alternative view attributes increased fertility after a mortality shock to a hoarding effect: parents have more children than initially intended because a recent mortality shock instills doubt about their children's survival prospects (National Research Council, 1998; Palloni and Rafalimanana, 1999; Preston, 1978; Rosenzweig and Schultz, 1983; LeGrand et al., 2003). This is more pertinent for long-duration events and less pertinent for a short-term mortality shock following a pandemic. This mechanism would only be relevant if the 1918-19 influenza pandemic shifted the expectations of children's future survival over the longer term.

Data and empirical strategy
We build a unique dataset combining data from several official administrative sources collected from archives and public libraries. 14 To create the dataset, we combine individual-level data with aggregate information corresponding to three administrative partitions. The smallest geographical unit is a parish (around 2,500 at the time). The next unit is a health district, grouping together several parishes served by the same medical personnel. There were about 400 health districts at the time of varying sizes and populations. The largest administrative unit is the county, of which there were 25 at the time.
The main unit of our analysis is the health district level, but some of the data refer to the parish level. We therefore map parishes to health districts and track changes in the allocation and borders of health districts. 15 We aggregate health districts to obtain units with stable borders over the entire study period. This leaves us with a total of 396 districts, including 65 aggregated districts. The empirical analysis examines rural and urban health districts separately as fertility dynamics are likely to be different in these contexts. The division used is the contemporary classification of districts into extra provincial, provincial, municipal district, and city in the source material. We group the first two categories into rural and the latter two into urban health districts. 16
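The aggregation into units with stable borders can be pictured as merging any health districts that are linked by a boundary change during the study period. The following is only a minimal sketch of that idea, not the authors' documented procedure: the function name, the district identifiers and the input format (pairs of districts that exchanged territory) are all hypothetical.

```python
def stable_units(districts, changes):
    """Group districts into units whose outer borders never change.

    districts: iterable of district ids.
    changes: iterable of (a, b) pairs, meaning territory moved between
             districts a and b at some point (hypothetical input format).
    Returns a dict mapping each district id to a representative unit id.
    """
    districts = list(districts)
    parent = {d: d for d in districts}

    def find(d):
        # Follow parent links to the root, shortening the path as we go.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for a, b in changes:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Merge the two groups; keep the smaller id as representative.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {d: find(d) for d in districts}


# Illustrative data: A, B and C are linked by border changes, D is untouched.
units = stable_units(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
```

In this toy example the four districts collapse into two stable units, one aggregated unit containing A, B and C, and district D on its own.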

Sources
A central source is the parish church books recording all deaths in Sweden. As early as 1686, local priests were obliged to record all births, deaths and marriages in a parish in church books that today are publicly available in local archives (Wicksell, 1922). The Federation of Swedish Genealogical Societies has digitized church records as the Swedish Death Index, which includes parish location and birth and death dates for all individuals who died in Sweden between 1901 and 2013. For a majority of individuals the civil status at time of death is also recorded. We use this source to calculate the monthly death numbers across age groups. We also use the Swedish Death Index to derive the monthly birth numbers for each health district. Some people in the cohorts of interest were still alive in 2013 and we thus do not observe them in the Swedish Death Index. We identify those individuals using the 1950 Census, which includes people born between 1915 and 1927 who were still alive in 2013 (and therefore also alive in 1950). The census reports their date and parish of birth, which we use to supplement the birth numbers from the Swedish Death Index.
A second source is historical records from the National Medical Board, which collected monthly data from physicians on district morbidity; we digitize these records. This variable correlates strongly with influenza mortality at the local level in Sweden (Karlsson et al., 2014), but there is an ongoing debate regarding its accuracy, especially in periods of high influenza mortality (Bloom-Feshbach et al., 2012; Mamelund et al., 2016). Doctors were obliged to report verified cases of the flu (Influensakommittén, 1924) and historical records suggest that people did visit health care centers when they had the flu and that the pandemic clearly increased the demand for GPs (see, e.g. Influensabyrån, 1919). But morbidity is likely under-reported, and more so in rural compared to urban areas, as a sick patient had to visit a physician to get recorded and the distance to health care was longer. 17 Reporting consistency across districts may also be a potential issue when it comes to morbidity data: although the symptoms of the influenza were well known, there was no microbiological testing. 18 As influenza was often complicated by pneumonia, we combine information on influenza and pneumonia incidents in our morbidity measure. 19 The historical records from the National Medical Board also include demographic information and the number of inhabitants at the beginning of each year in each health district. We digitize this information and combine it with the monthly birth and death numbers to calculate the monthly district population. Appendix Fig. B3 shows time trends in influenza and pneumonia morbidity and all-cause mortality for different age groups from 1915 to 1927. All series exhibit very pronounced spikes in the fall of 1918. The timing and severity of the increase in deaths in late 1918 suggest that it is reasonable to assume that the majority of the excess deaths in this period were caused by the pandemic.
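The monthly district population described above can be illustrated with a short sketch. It assumes, as a simplification not spelled out in the text, that the value for each month is the population at the start of that month; the function name and the example figures are hypothetical. Only births and deaths move the series within a year, so any migration shows up as the gap between the last monthly value and the next recorded January 1st count.

```python
def monthly_population(jan1_population, births, deaths):
    """Population at the start of each month within one calendar year.

    jan1_population: recorded number of inhabitants on January 1st.
    births, deaths: sequences of 12 monthly counts (January..December).
    Returns 12 values; element k is the population at the start of
    month k + 1, before that month's births and deaths are applied
    (the start-of-month convention is an assumption of this sketch).
    """
    if len(births) != 12 or len(deaths) != 12:
        raise ValueError("expected 12 monthly counts")
    population = [jan1_population]
    for month in range(11):
        # Natural increase only: add births, subtract deaths.
        population.append(population[-1] + births[month] - deaths[month])
    return population


# Illustrative district: 1,000 inhabitants, 2 births and 1 death per month.
example = monthly_population(1000, [2] * 12, [1] * 12)
```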
Our third main source is midwife journals. Swedish midwifery was professionalized early on. Trained midwives attended around 80 percent of births by the turn of the 20th century, while less than 10 percent of women gave birth in hospitals (Högberg et al., 1986). 20 Midwives had to keep diaries on all attended births and reported them annually to the main district physician (Bhalotra et al., 2017). We digitize the information from the midwife journals from 1915 to 1927, including data on the number of midwives in each district, birth type (live births, stillbirths, and miscarriages, and the number of pre-term and full-term births), and mother characteristics (the number of births to married, unmarried or widowed mothers and whether a woman was a first-time mother). 21
Finally, we use annual information on local poverty rates, income and capital income. Poverty rates, taken from the annual publication on poor relief (Statistics Sweden, 1917), are defined as the proportion of the population living in public poorhouses. Income includes all taxable earnings reported to the tax authorities, and capital income includes asset yields, rents and dividends, and comes from the yearbook of municipalities (Statistics Sweden, 1920). In heterogeneity analyses and in balancing tests we also use information from municipality yearbooks on private property assessed value, public revenue, public assets, public debts and population density. We also use data on the number of railway stations in a district in 1918 from Olofsson (1921). Appendix A provides definitions of all variables used in the following.
14 Most of the information was scanned from hard copies and digitized by the authors and research assistants.
15 The initial allocation is based on an official list of health districts and which parishes they include from 1930. Changes are identified using information from royal decrees, http://sara.moricz.se/Kommungränskonverterare/ and individual web searches.
16 The distinction between extra provincial and provincial was usually one of timing, where a newly formed district would start as an extra provincial district which was later turned into a provincial district if the separation proved viable. The urban category mainly corresponds to smaller towns. Our results are robust to defining only city districts as urban.
17 As discussed by Mamelund et al. (2016) under-reporting could also follow from a shortage of doctors. A general under-reporting of morbidity is corroborated by sickness reports for workplaces across Sweden suggesting higher morbidity rates, see e.g. Helgertz and Bengtsson (2019).
18 It is worth noting that the long tradition and the well-defined responsibilities of the main district physician likely improved consistency in reporting. Disease control was one of the main responsibilities of the main district physicians already in the 19th century (Edvinsson, 2011) and the district physician had an obligation to make reports regarding monthly cases of epidemic disease to the National Medical Board using standardized forms separating between disease types (Jonsson, 2009), which is one of the main reasons why Sweden is one of the few countries that have historical monthly morbidity data by type.
19 Using measures combining influenza and pneumonia incidents should also better capture any eventual early spring or summer wave of the influenza (Andreasen et al., 2008). Morbidity data corresponds to all influenza and pneumonia incidents in a district and is not available for separate age groups.
20 By 1819, every parish was required to employ a licensed and trained midwife. In 1870 the ratio of midwives to doctors was 3.1 in Sweden, compared to 1.4 in the rest of Scandinavia (Romlid, 1997) and 1.2 in France (Thomson, 1997).
Appendix Table B1 provides summary statistics on all variables for the periods before, during and after the pandemic. Notably, there is considerable variation in the pandemic across districts, with an overall mortality rate ranging between 3.85 and 46 deaths per 1,000 inhabitants during the influenza period, and a corresponding morbidity rate ranging between 0 and 635 infections per 1,000 inhabitants.

Main variable definitions
Since it is the conditions at the time of conception that matter for the fertility decision, we specify the model in terms of conception rates. Conception rates are estimated based on the universe of live births, which are observed at the individual level and aggregated up to the health district-month level. The exact number of conceptions is unobserved because stillbirths and miscarriages are not observed with the same frequency, so we impute the number of conceptions for health district i, month m, and corresponding year t. We lag the number of live births by nine months and adjust this number for stillbirths and miscarriages. We only have information on stillbirths and miscarriages as reported by midwives on an annual level, and therefore assume an equal distribution of stillbirths and miscarriages throughout the year. 22 We also assume that we observe the correct share of stillbirths and miscarriages as a share of total births in the data reported by midwives and then calculate the 'true' number of stillbirths and miscarriages by assuming the same share of stillbirths and miscarriages for the births observed in our main Death Index source and the 1950 Census. Stillbirths include pregnancy losses in months seven to nine. Hence, we lag one third of the calculated number of stillbirths occurring in month m by seven months, one third by eight months and one third by nine months. A miscarriage is a pregnancy loss occurring less than seven months into the pregnancy, but likely only miscarriages after three months of pregnancy are noted in the data. We thus lag one third of the calculated number of miscarriages in month m by four, five and six months, respectively. As early miscarriages are likely to have increased during and shortly after the flu (see Bloom-Feshbach et al., 2012; Chandra and Yu, 2015b,a), our results will represent a lower bound, especially in the short run. 23, 24

The conception rate is calculated by dividing the number of conceptions in district i in month m by the corresponding monthly population. 25 Ideally, we would define the conception rate with respect to the population at risk (women in ages 15-49), but this information is not available on the district level. On the other hand, we measure influenza exposure with reference to the same population number, which means that our estimates may be interpreted as elasticities. This will prove useful when we consider the cumulative net impact of the pandemic. Nevertheless, we carefully assess the extent to which compositional changes induced by the pandemic might be driving some of the results.

We apply an extended difference-in-differences framework and use variation in pandemic severity across districts and variation in conception rates over time within districts. For flu intensity, the influenza period is defined as August 1918 to March 1919. 26 We allow for persistent effects of the pandemic but rule out anticipation effects. Therefore, our treatment variable FluIntensity is a cumulative influenza intensity measure capturing all-cause deaths or influenza and pneumonia morbidity up to conception month m in district i. This implies that only mortality/morbidity incidents in August 1918 are assumed to matter for conceptions in this month, whereas the sum of incidents in August and September 1918 matters for conceptions in September 1918, and so on (Eq. (3)).

As adult and child mortality may affect the fertility decision differently, we calculate age-specific mortality rates. Adult mortality is the sum of deaths in the 20-40 age group representing potential parents, and child mortality is the sum of deaths in the 0-10 age group. Fig. 4 shows how conception rates and the three influenza variables evolve over time. Conception rates were at their lowest in September to November 1918, dropping as mortality and morbidity increased. Fig. 4b further shows that conceptions gradually increased after the influenza peak. As outlined above, we expect fertility effects to differ during different periods. Our analysis focuses on three time periods: Peak (August to November 1918), where we expect a negative effect on conceptions from the beginning of the pandemic up to its peak due to biological effects and/or postponement fertility; After (December 1918 to December 1920), where we expect an increase in conception rates due to postponement and/or replacement fertility leading to a baby boom in 1920-21; 27 and Later (1921-1927), where we expect a negative effect mainly stemming from long-term economic effects. 28

22 [...] Barnett and Dobson, 2010; Bruckner et al., 2014; Eriksson and Fellman, 2000; Strand et al., 2012) with higher stillbirth rates during summer and/or winter when temperatures are at extremes. We find no evidence of seasonality in stillbirths in our data (see Appendix Fig. B2).
23 Around one in four pregnant women experience a miscarriage, with the vast majority occurring well before week 12 of gestation. One could argue that miscarriages are part of the natural process of pregnancy and should not be included in the conception numbers. In our data, miscarriages constitute on average around 4.1% of all annual conceptions. Our results do not change when we exclude miscarriages.
24 Current research on COVID-19 also links the ongoing pandemic to increased risk of preterm births (Delahoy, 2020). Also, a study by Khalil et al. (2020) documents that the overall stillbirth rate has increased during the COVID-19 pandemic.
25 Monthly population is calculated by using the population numbers as of January 1st for each year from the demographic data, provided in the health district yearbooks, and adding/subtracting the monthly number of births/deaths. Migration is thus attributed to the last month of the year.
26 Appendix Fig. B4 shows the distribution of the peak month of morbidity and mortality across districts, defined as the month with the highest increase in incidents/deaths compared to the average morbidity/mortality between January 1916 and December 1917 in a district. The vast majority of districts have their peak within our defined peak flu period.
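The imputation and exposure rules described in this subsection can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the month-index helper and the dictionary inputs are hypothetical, and holding the cumulative measure constant after March 1919 is an assumption made here to reflect persistent effects.

```python
from collections import defaultdict


def month_index(year, month):
    """Map a calendar month to a running integer so that lags are subtractions."""
    return year * 12 + (month - 1)


def impute_conceptions(births, stillbirths, miscarriages):
    """Impute conceptions per month for one district.

    Each argument maps month_index -> count in the month the event occurred.
    """
    conceptions = defaultdict(float)
    for m, n in births.items():
        conceptions[m - 9] += n              # live births: nine-month lag
    for m, n in stillbirths.items():
        for lag in (7, 8, 9):                # one third each at lags 7, 8, 9
            conceptions[m - lag] += n / 3
    for m, n in miscarriages.items():
        for lag in (4, 5, 6):                # one third each at lags 4, 5, 6
            conceptions[m - lag] += n / 3
    return dict(conceptions)


FLU_START = month_index(1918, 8)   # August 1918
FLU_END = month_index(1919, 3)     # March 1919


def cumulative_flu_intensity(monthly_rates, m):
    """Sum flu-period rates up to month m (zero before the pandemic).

    Assumption: after March 1919 the measure stays at its flu-period total.
    """
    last = min(m, FLU_END)
    return sum(r for idx, r in monthly_rates.items() if FLU_START <= idx <= last)
```

In this sketch, annual stillbirth and miscarriage totals would first be spread evenly over the twelve months of the year before being passed in, in line with the equal-distribution assumption stated above.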

Econometric approach
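For our main analysis, we specify the following model:

$$ConceptionRate_{im} = \alpha_i + \lambda_m + \beta_1 \left(FluIntensity_{im} \times D^{Peak}_m\right) + \beta_2 \left(FluIntensity_{im} \times D^{After}_m\right) + \beta_3 \left(FluIntensity_{im} \times D^{Later}_m\right) + \varepsilon_{im} \qquad (4)$$

for district i in period $m \in [1915\text{m}1, 1927\text{m}12]$. Our main specification includes district fixed effects ($\alpha_i$) and month-year fixed effects ($\lambda_m$). The dummy variables $D^{Peak}$, $D^{After}$ and $D^{Later}$ indicate whether period m falls within the influenza peak period (immediate effects), in the one to two years following the pandemic (short-term effects), or later years (long-term effects).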
The reference period is the pre-influenza period ranging from January 1915 to July 1918.
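As an illustration of how the period dummies split the treatment variable, the following Python sketch assigns each sample month to a period and builds the three interaction regressors. The function names are hypothetical and the sketch covers only the construction of regressors, not the estimation with fixed effects.

```python
def period(year, month):
    """Classify a sample month using the period definitions in the text."""
    ym = (year, month)
    if (1915, 1) <= ym <= (1918, 7):
        return "Pre"      # reference period
    if (1918, 8) <= ym <= (1918, 11):
        return "Peak"
    if (1918, 12) <= ym <= (1920, 12):
        return "After"
    if (1921, 1) <= ym <= (1927, 12):
        return "Later"
    raise ValueError("month outside the 1915m1-1927m12 sample")


def interaction_terms(flu_intensity, year, month):
    """Return FluIntensity interacted with the Peak, After and Later dummies."""
    p = period(year, month)
    return {name: (flu_intensity if p == name else 0.0)
            for name in ("Peak", "After", "Later")}
```

For a pre-pandemic month all three interactions are zero, so the coefficients are identified relative to the reference period.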
The coefficients of interest are $\beta_1$, $\beta_2$ and $\beta_3$. With treatment defined as the degree of influenza exposure, $\beta_1$ corresponds to the differential effect of greater influenza intensity at the district level on conception rates during the peak period, $\beta_2$ captures the short-term effect after the peak, while $\beta_3$ corresponds to the long-term effect. We consider the overall effect, but also split the analysis by rural and urban districts. Specification (4) represents a difference-in-differences model with variable treatment intensity. The crucial identifying assumption is that in the absence of the pandemic, conception rates in differently affected districts would have followed a common time trend. Appendix B provides evidence supporting this assumption: graphical evidence suggests parallel trends in conceptions in the years preceding the pandemic, and balancing tests show that local observables were unrelated to excess mortality during the pandemic. Appendix B also presents event study graphs showing the coefficients of a flexible difference-in-differences model interacting the treatment variable with quarterly dummies. The flexible estimation allows for a placebo test, confirming that the flu had no effect on conceptions before it happened. All estimates before August 1918 are insignificant. 29 As a robustness check, we include a set of control variables $X_{it}$ and county-specific linear trends. The control variables include per capita earnings and capital income (both normalized by 1917 prices, in logs), the poverty rate and the log of the number of midwives proxying the local medical infrastructure. 30 Notably, some of the control variables can be seen as bad controls due to endogeneity (Angrist and Pischke, 2008) and some caution is required when interpreting estimates in specifications with controls. We therefore show results with and without controls.
Clearly, our outcome variable will react to changes in the composition of the population. The pandemic represents a shock to the population and may thus cause a mechanical change in the conception rates. We return to this issue in the next section, both by keeping population constant at 1917 levels and by quantifying the estimated effects relative to mechanical effects.

Fig. 4. Conception, morbidity and overall mortality rates (per 1,000 people).

27 During this time period a few districts experienced second or third flu waves, which we might expect to depress their fertility. Excluding these districts from the analysis leaves results qualitatively similar.
28 We do not include the fourth wave in 1920 as it was mild and concentrated in the north. Our results do not change when we exclude northern districts from the analysis. Similarly, our results do not change when excluding districts where influenza morbidity and mortality increased already in July 1918, or when including July in the Peak period and in the treatment variable calculation in Eq. (3). Results also remain qualitatively the same when we redefine the pre-flu reference period to last until May 1918, the month when the first media reports on the influenza came from Spain.
29 As mentioned above, we do not find evidence for a pronounced early summer wave of the pandemic in Sweden. This finding is further supported by the insignificance of the coefficients before August 1918.
30 The pandemic strained the health care system and financial means to cope with the flu fell short in some districts (Holtenius and Gillman, 2014). This may also have had an impact on the medical care in those districts afterwards, which in turn may influence the decision to have children or not.

Table 1 presents estimates of the pandemic's impact on fertility for reported influenza and pneumonia incidents (Panel A) and adult and child mortality (Panels B and C).
For morbidity, we note a small negative effect on conceptions during the peak period, completely driven by rural areas. This immediate response is in line with biological effects, where women have difficulties conceiving if they or their husbands are ill, or with psychological effects of not wanting to conceive in uncertain times. There are no significant short- and long-term effects for either rural or urban areas and thus no indication of postponement fertility due to high morbidity. The lack of effects in the Later period corroborates the idea that morbidity primarily measures biological effects, mainly expected to be present during the influenza peak and some time after.
Turning to adult mortality (Panel B), there is again an immediate negative fertility effect, evident in both urban and rural areas. After the peak period, fertility bounces back in rural areas, but fertility is then depressed in the long run. This negative long-term pattern is also very pronounced in urban areas, where no bounce-back is observed right after the pandemic peak. 31

31 A potential concern for the observed difference between rural and urban areas is differences in measurement errors. Regressing the mortality rate on the morbidity rate and including an interaction term with an urban dummy, the interaction term is however not significant, suggesting that there were no significant differences in reporting influenza and pneumonia cases between rural and urban areas.
We thus find evidence of a small baby boom in the After period (December 1918 to December 1920), which can be explained by postponement fertility in the Peak period and later catch-up in the After period, but this should not be unique to rural areas. Yet, there could be differences across rural and urban contexts regarding replacement fertility. 32 In many respects rural societies were culturally and socially more close-knit than urban Sweden in the early 20th century. For example, households were interdependent during sowing and harvesting periods, tightening social ties. Also, divorces were predominantly an urban phenomenon (Sandström, 2011). Such close-knit ties may have initiated community rebuilding in rural districts that lost many adults, increasing collective fertility. An alternative explanation for why conception rates in urban districts did not rebound in the same manner as in rural areas is that the incentives to have children differed across these settings. In rural areas children represented an investment good, as they provided labor on the farm and care for the parents during old age, while children were more of a consumption good in urban areas. With costs and potential pay-offs of having children being different in the two settings, and if the influenza increased uncertainty, the decision to have a child or not could go in opposite directions.
In the long term (Later period), both rural and urban districts that exhibited high adult mortality decreased their fertility compared to less affected districts. In the full sample without additional controls, each additional adult death per 1,000 people reduced the monthly conception rate in the long-term period by 0.09. With a baseline monthly conception rate of 1.81, this translates to about 5 percent fewer conceptions. Comparing districts at the 25th percentile of adult mortality (3.13 deaths per 1,000 people) with districts at the 75th percentile (5.24 deaths per 1,000 people), the difference corresponds to a 10.5 percent reduction in the monthly conception rate. This pattern is in line with economic effects including negative income effects and changes on the marriage market, as shown in greater detail in the next section. 33 Clearly, population size depends on mortality. Therefore, the short-run positive effect on conceptions in particular may stem from a mechanical effect of reduced population. Appendix Table B2 provides results when keeping population constant at 1917 levels in the calculation of conception rates. The previously noted short-run positive effects also appear with this specification.
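The percentage magnitudes quoted in this paragraph follow from simple arithmetic on the reported coefficient, baseline rate and quartile mortality rates. A minimal Python sketch (the helper name is hypothetical) makes the calculation explicit:

```python
def percent_reduction(coef, delta_exposure, baseline):
    """Implied percentage change in the outcome for a given change in exposure."""
    return 100 * coef * delta_exposure / baseline


# One additional adult death per 1,000 people, baseline monthly rate of 1.81
per_death = percent_reduction(0.09, 1.0, 1.81)

# Moving from the 25th (3.13) to the 75th (5.24) percentile of adult mortality
interquartile = percent_reduction(0.09, 5.24 - 3.13, 1.81)

print(round(per_death, 1), round(interquartile, 1))  # prints 5.0 10.5
```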
We may also be concerned about a mechanical fertility effect following the death of potential parents. In order to assess whether our estimates go beyond mechanical fertility effects in the long run, we estimate the number of conceptions that would have happened if adults killed by the pandemic had remained alive and reproduced at pre-pandemic rates. This estimate is given by Eq. (5), where f is the monthly fertility rate in the population of reproductive age in the 1911-17 period (derived from and calculated based on Statistics Sweden, 1929), $FluIntensity_{im}$ is adult influenza mortality measured according to Eq. (3), and the last term adjusts for the fact that the 1918 population of reproductive age gradually moved out of that age bracket (we normalize m = 0 at the outbreak of the pandemic so that m = 240 after 20 years have passed). Fig. 5(a)-(c) graphs the resulting cumulative fertility effect using our point estimates (illustrated by the solid black line) from columns (1), (5), and (7) of panel B (adult mortality) of Table 1, respectively. Confidence intervals are estimated analogously based on the estimated covariance matrix of coefficients. The dashed horizontal lines in each of the figures correspond to 1 and −1, which are useful benchmarks as 1 represents a situation where the pandemic is completely undone in the sense that there is an additional conception for each individual dying. The figures thus demonstrate the net cumulative effect of the sometimes conflicting short- and long-term responses. Fig. 5b shows that the short initial decline in rural areas is offset by the rebound in the medium term: 16 months after the onset of the pandemic (in December 1919), the cumulative effect is one new conception for each adult killed in the pandemic. This replacement is, however, completely undone 63 months after the beginning of the pandemic (in November 1923), after which the cumulative effect turns negative.
In the urban areas the cumulative effect is always negative (Fig. 5c). In the pooled sample (Fig. 5a), the cumulative effect becomes significantly negative after approximately 83 months (by the fall of 1925). The blue solid lines in Fig. 5(a)-(c) represent the cumulative "mechanical" effect of missing conceptions, calculated according to Eq. (5). In the pooled sample, the initial dip and the long-term decline are both significantly larger than the predicted mechanical effect. In rural areas, the intermediate increase in fertility occurring in the aftermath of the pandemic is also significantly different from the mechanical effect. Thus, we conclude that our analysis demonstrates behavioral responses that go well beyond mechanical effects driven by deceased potential parents.

32 Table B8 shows results for including both adult and child mortality in the regression. The results indicate that the noted positive short-run effects stem from adult mortality, indicating general replacement rather than child replacement. It should, however, also be noted that the correlation between child and adult mortality is very high, at 0.78. It is, thus, difficult to gauge the true effect of one over the other.
33 Results using child mortality (Panel C) are qualitatively similar to the results from adult mortality.
As the cumulative effect is strongly negative in both rural and urban areas, we aggregate the periods Peak, After, and Later into one Post period and focus on the total effect of cumulative mortality during the flu months in the following analyses. 34 We will focus on adult mortality as the effects do not vary across mortality measures and morbidity only exhibits temporary effects. 35 As noted in Table 1, results are insensitive to the inclusion of covariates, therefore we exclude additional controls in the following.

Treatment effect heterogeneity
Next we conduct a heterogeneity analysis and explore whether the impact of adult mortality on conception rates differs across district types. For this exercise we use baseline district characteristics in 1917 collected from official yearbooks. To classify districts, we generate dummy variables indicating whether the district was above the median for a specific characteristic in 1917 and interact this with our treatment variable. We also check whether rural districts with different shares of Sami people (measured in 1910) responded differently. 36 We further include a measure for how well connected a district was. Here, we use information on the number of railway stations in a district and generate a dummy variable taking the value one if a district had at least one railway station in 1918. 37 Table 2 presents results for the whole country and for rural and urban areas separately. Mainly three characteristics associate with the fertility effect: high poverty rates, low population density and worse railway connection. The interaction term attains statistical significance for poverty in the urban sample and population density, which can also be interpreted as a measure of poverty in rural settings, in the rural sample. Urban areas with above-median poverty rates experienced disproportionate declines in fertility rates and more densely populated rural areas experienced smaller declines. We also note that fertility declines induced by the pandemic were less pronounced in rural districts connected to the railway network.
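The median-split classification used for the interaction terms can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the example district values are hypothetical, and treating districts exactly at the median as not above it is an assumption.

```python
from statistics import median


def above_median_dummies(baseline_values):
    """Map district -> 1 if its 1917 characteristic exceeds the median, else 0."""
    cut = median(baseline_values.values())
    return {district: int(value > cut)
            for district, value in baseline_values.items()}


def exposure_interaction(exposure, dummy):
    """Interaction of the treatment variable with the above-median dummy."""
    return exposure * dummy


# Hypothetical 1917 poverty rates for four districts
poverty_1917 = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
high_poverty = above_median_dummies(poverty_1917)
```

The interaction regressor is then nonzero only for above-median districts, so its coefficient measures how the fertility response differs in those districts.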
Appendix Table B3 provides the results from a complementary heterogeneity analysis using continuous variables instead of a median cut-off. A possible concern with such a specification is that it gives disproportionate weight to districts that are at the extremes of the distribution of the interaction variables. The results lend further support for high poverty rates and low population density being characteristics that associate with the fertility effect, although the interaction term for the urban sample is imprecisely estimated. This exercise also suggests that rural areas with higher conception rates saw a disproportional fertility effect of the pandemic. In line with Table 2 we also find that the fertility decline was less pronounced in rural areas that had access to a railway line. 38 Taken together, the heterogeneity analysis suggests that the fertility declines induced by the pandemic were particularly pronounced in relatively poor areas. This finding underlines the importance of adverse economic conditions in fertility declines, and is interesting in the light of previous research that suggests that the 1918-19 pandemic had negative economic impacts. For example, Karlsson et al. (2014) find that the pandemic led to a significant increase in poorhouse rates in Sweden. Also Barro et al. (2020), Correia et al. (2020) and Guimbeau et al. (2020) find higher influenza-related mortality associated with persistent economic decline.

Note: Monthly data on health district level. N refers to the number of health districts × the number of time periods. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. The dependent variable is conception rate. All regressions include district and month-year fixed effects. Standard errors in parentheses, clustered at the district level. Exposure is used for readability and is defined as [...]. Exposure × X denotes the interaction of Exposure with the variable in the column heading. All interaction variables in specifications (1)-(7) are dummy variables taking on the value one for districts being above the median for the specific variable in 1917. The Sami share is taken from the 1910 census and since its median equals zero, the actual share is used in the interaction. Railway is a dummy variable taking the value one if the district had at least one railway station in 1918.

34 For consistency we also provide tables with results for the three different periods in Appendix B.
35 Some districts may have experienced high morbidity but low mortality, or high adult mortality and low child mortality, or vice versa. In order to gauge the relative importance of the three influenza variables, we run regressions jointly including morbidity and mortality measures. Results do not change when including morbidity and mortality in the same specification.
36 Being an indigenous population, the Sami people could exhibit divergent fertility behavior and previous work has shown that the local Sami population was an important predictor of influenza mortality in Norway (Mamelund, 2003).
37 The Swedish state started to build a national railway network around 1850, and the country soon had an extensive network of overland transport routes (Hedin, 1967). Around 60 percent of districts had a station in 1918.

Marriage market
Given the observed long-term negative fertility response it is natural to look at changes in nuptiality as a potential pathway. We start by discussing the implications for the marriage market stemming from a mortality shock. Although a loss of one percent of the population may seem irrelevant, there is large variation in mortality across districts and the historical literature provides plenty of stories about families falling apart. 39 The flu was especially hard on individuals between 20 and 40 years of age and the pandemic likely broke up existing marriages by the death of a spouse. Remarriage was common after widowhood in early 20th century Sweden, but this process could take time (Lundh, 2007). 40 In fact, widowers were not unattractive on the marriage market as they generally could offer an established household. For women it was often a necessity to remarry to support themselves and their children. Young widows generally had better prospects of remarriage, but also stronger incentives to remarry as older widows could expect support from their adult children (Dribe et al., 2007;Lundh, 2007).
Individuals not directly affected through the death of a spouse may also face the consequences of a large mortality shock on the marriage market. Following Becker (1960, 1973, 1974), the sex ratio determines the chances of finding a spouse in a monogamous society for obvious reasons, but also population composition in terms of other traits plays an important role for marriage market outcomes through assortative mating (see, for example, Angrist (2002), Abramitzky et al. (2011), and Dribe and Lundh (2005) for an account on assortative mating in 19th-century Sweden). There are also reasons to expect that the marriage market may be affected by the economic uncertainty and psychological effects that followed the pandemic. Research on family formation during economic downturns has found adverse economic shocks to have negative effects on nuptiality and consequent childbearing (Neels, 2010), 41 and research in psychology provides theoretical grounds for a large mortality shock affecting the marriage market. 42 With information on the last civil status of a deceased person and the date of the last change in civil status, we estimate the number of people getting married or becoming widowed (those changing to the status 'married' or 'widow' for the last time before death) in each district following the pandemic. Appendix Fig. B5 shows the evolution of these series over time, and Appendix Fig. C2 shows trends for the highest and lowest district quartiles in terms of influenza exposure, suggesting that there were no significant differences in the trends before the flu. Some caveats should be kept in mind. First, our data come from the Swedish Death Index and we do not observe the civil status of individuals that are still alive. 43 Second, we do not know the order of a marriage, i.e. whether it is a first or second marriage. Also, we do not know in which parish the marriage/widowhood took place or with whom.
We use the birth parish whenever the birth date is closer to the marriage/widowhood date and the death parish whenever the death date is closer to this date. This leads to an assignment of the birth parish in around 70 percent of the cases. Comfortingly, incorrect assignments will largely be reduced by the aggregation of parishes to health districts, as birth and death parish lie within the same health district in almost half of all cases.
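The parish assignment rule can be written as a short Python sketch. The function and argument names are hypothetical, and assigning exact ties to the birth parish is an assumption made here, since the text does not specify how ties are handled.

```python
from datetime import date


def assign_parish(birth_date, death_date, event_date, birth_parish, death_parish):
    """Pick the parish whose date lies closer to the marriage/widowhood date."""
    days_from_birth = abs((event_date - birth_date).days)
    days_to_death = abs((death_date - event_date).days)
    # Assumption: exact ties are assigned to the birth parish.
    return birth_parish if days_from_birth <= days_to_death else death_parish


# Hypothetical person: born 1895, married 1920, died 1980
parish = assign_parish(date(1895, 5, 1), date(1980, 3, 1), date(1920, 6, 15),
                       "birth parish", "death parish")
```

Here the marriage lies about 25 years after birth and about 60 years before death, so the birth parish is assigned.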
In a first step we estimate whether districts that were particularly hard hit by the pandemic (in terms of adult mortality) experienced a change in marriage rates and/or widow rates afterwards. We therefore estimate the model in Eq. (6), where $CivilStatusRate_{it}$ is the marriage or widow rate relative to population numbers, in district i in period $t \in [1915, 1927]$. In a second step, we examine if changes in widow and/or marriage rates stemmed from changes in the sex ratio induced by differential mortality rates among men and women. We calculate the absolute difference between adult male and female deaths normalized by the 1917 population and use this as the treatment variable in Eq. (6) instead of FluIntensity. This variable, Gender-Distortion, measures whether more men than women (or vice versa) died in the district, possibly making it more difficult to find a (new) partner.

Table 3 presents the results. The upper panel of Table 3 shows the marriage market effects of adult mortality. Columns (1), (3), and (5) suggest that the widow(er) rate increased significantly following the pandemic. Notably, examining the dynamics of this effect, this increase only originated from the immediate and short-term (the 1918-19 and 1920-21) time windows; the effect is negative and not significant in the 1922-27 (Later) time period. 44 Table 3 also shows that the pandemic lowered marriage rates in rural areas (column 4), which likely had a less dynamic marriage market than urban areas. 45 Examining dynamics here (see Table B4 in the Appendix), there are negative effects on marriage rates during the pandemic, but the main effect stems from depressed marriage rates in the long run. The pandemic hence caused a one-off shock to the marriage market during its peak, which was not compensated in later time periods. Instead marriage rates declined further.

38 Using a continuous measure of the number of train stations, a reverse significant relationship appears for the urban areas. Rural areas that had a railway station very rarely had more than one. We therefore expect the continuous measure to matter mainly for the urban sample. Here, however, a few districts stand out with many railway stations. We therefore judge the estimate based on the median split more reliable in this case.
39 See for example Lundgren (1989, 1991) on the story of a family and its survivors in Arjeplog, a parish in northern Sweden most severely hit by the flu. The local newspaper even had a special category on tragic family stories during the pandemic (Norrbottens-Kuriren, 1918-1920. Available from the archive of Norrbottens museum: https://norrbottensmuseum.se/arkivcentrum/arkiv-bibliotek/tidningsarkiv.aspx).
40 The Protestant Church accepted remarriage but imposed a mourning period of six months on men and one year on women.
41 [...] 2018) on a link between economic recessions and fertility. A negative relationship between both births and marriage rates and economic crises has also been observed in historical studies (Tzannatos and Symons, 1989; Lee, 1990; Bavel, 2001; Bengtsson et al., 2004; Teitelbaum, 2014).
42 On the one hand, stress theory suggests that community-wide exposure to mortality brought by disasters and pandemics has a negative psychological impact, in turn reducing marriage rates (Cohan and Cole, 2002; Goldmann and Galea, 2014). Research shows that adverse psychological effects are present years after a community shock (Bland et al., 1996; Bolton et al., 2000; Bonanno et al., 2008; DiGrande et al., 2010; Jalloh et al., 2018) and that enduring psychological damage is typically observed in up to 30% of exposed individuals (see Goldmann and Galea (2014) for a review). Depression and anxiety may also increase following stigmatization and discrimination of epidemic survivors (see, e.g. Karafillakis et al., 2016; Rabelo et al., 2016; O'Leary et al., 2018). On the other hand, attachment theory suggests that marriage rates will increase after a large mortality shock, as a society-wide pandemic will bring survivors closer together (Hill and Hansen, 1962; Bowlby, 1973, 1988; Hazan and Shaver, 1994). Empirical evidence on the two hypotheses and their link to marriage rates is however scarce and mixed (see Nobles et al., 2015).
The noted widowhood effect of 0.03 translates to a 7.4 percent increase above the pre-flu mean in the 1918-21 period, and the reduction in the share of marriages added to the disturbance in the marriage market. With a baseline annual marriage rate of 3.01 per 1,000 people in rural areas, the estimate of 0.017 implies around 5.6 percent fewer marriages. Accordingly, the long-term decline in marriage rates, rather than pandemic-induced couple-disruptions, seems decisive to the overall marriage market effect.
The lower panel of Table 3 suggests that imbalances in the sex ratio played a role in the decreased marriage rates in rural areas. Including both treatment variables (FluIntensity and Gender-Distortion) at the same time, however, only FluIntensity remains significant with an unchanged coefficient of 0.017, a result which provides support for the importance of economic conditions and behavioral changes driving the decline in nuptiality, rather than mechanical effects.
In conclusion it seems that a substantial part of the observed fertility effect stems from the marriage market, in particular long-term declining marriage rates. 46 This result is interesting from a contemporary perspective. For the COVID-19 pandemic mortality of potential parents is not a viable mechanism for fertility changes, but pandemic-induced economic and psychological uncertainty may well change family formation behavior.

Mother characteristics
Given that we find the decline in birth rates to be substantially driven by a decrease in nuptiality it is also relevant to investigate who changed their fertility behavior. In this and the following section, we are thus examining compositional changes within the reduced number of births we identified in Table 1. We, therefore, change the specification from rates to a logarithmic specification in this subsection in order to identify changes in birth characteristics given the knowledge that highly affected districts experienced lower birth numbers after the pandemic. Information on mother characteristics from the midwife journals gives a unique opportunity to answer this question. This analysis is of interest in itself, but is also motivated by the fact that changes in birth characteristics due to the flu would have great consequences for the interpretability of results in studies examining later life effects of in utero influenza exposure. If children conceived shortly after the mortality shock have better predisposition than those conceived during the pandemic, adverse health and income effects of being in utero during a shock will be overestimated.
In examining compositional changes we look for differences as compared to the 'normal' years 1915-17. With annual data we focus on the time of actual birth and specify a model in which the dependent variable, $\ln(MotherType_{it})$, is the natural logarithm of the number of births in year $t$ to married, single, first-time or not first-time mothers, and $\ln(FluIncidents_{it})$ is the logarithm of the cumulative number of deaths between August 1918 and March 1919. We also include the log of the total number of births in district $i$ in year $t$, $\ln(births_{it})$, to account for the fact that fertility was reduced in districts heavily affected by the pandemic. 47 We thus only examine which type of births was or was not reduced disproportionately to the general fertility decline caused by the pandemic. As the dependent variables represent interdependent states we allow the error terms to be correlated across regressions. Appendix Figs. C3 and C4, showing the time trends for the variables married, single, first-time and not first-time mothers in the highest and lowest district quartiles in terms of influenza exposure, respectively, indicate no significant difference in trends before the flu.

The upper panel of Table 4 shows a fertility shift to married mothers in rural areas. This indicates that more stable families had children after the flu, in line with a shift towards higher social status parents (Richter and Robling, 2013). In urban areas the reduction in fertility is more evenly distributed, with negative effects for both married and single mothers, though stronger for single mothers. We also note relatively fewer births to first-time mothers, which is consistent with the postponement of first births by would-be parents during economic downturns documented in the literature (Goldstein et al., 2013; Lanzieri, 2014; Neels, 2010). This indicates that first-time mothers delayed births, again implying a shift of the remaining births into existing families. Overall, we find an indication that urban areas were more affected by economic uncertainty in their fertility decisions. 48

N. Boberg-Fazlic, M. Ivets, M. Karlsson et al. Economics and Human Biology 43 (2021) 101020

Note: Annual data on health district level. N refers to the number of health districts × the number of time periods. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. The dependent variables are marriage rate and widow rate. Exposure is used for readability and is defined as […] 1927, otherwise 0. All regressions include district and year fixed effects. Standard errors in parentheses, clustered at the district level.

44 For results separated by the three time periods Peak, After and Later, see Appendix Table B4.

45 Stockholm was exceptional in its acceptance of fertility and co-habitation without marriage (see Matovic, 1986). Our results on marriages will not capture this. For fertility effects, however, we find a reduction in both legitimate and illegitimate births in urban areas (see Table 4). It could thus well be that this form of family formation was also distorted by the pandemic.

46 The quartiles of rural districts least and most affected by the pandemic exhibited adult mortality of 3.1 and 5.0 per 1,000 people, respectively. The difference between these two types of districts corresponds to a 10.6 percent reduction in the annual marriage rate. With a long-term decrease in conception rates of 10.2 percent in rural areas it seems that a substantial part of the observed fertility decrease stems from reduced marriage rates.
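The compositional specification described in this subsection can be sketched in one line. This is a reconstruction under assumptions, not the authors' equation: the coefficient names $\beta$ and $\gamma$, the fixed-effect symbols $\mu_i$ and $\tau_t$, and the reading of $Exposure_{it}$ as the log flu measure interacted with a post-pandemic indicator are all hypothetical. Only the variables $\ln(MotherType_{it})$, $\ln(FluIncidents_{it})$ and $\ln(births_{it})$ and the presence of district and year fixed effects come from the text and table notes.

```latex
\ln(MotherType_{it}) = \beta \, Exposure_{it} + \gamma \, \ln(births_{it}) + \mu_i + \tau_t + \varepsilon_{it}
```

Under this reading, $\beta$ would capture whether births of a given type fell more or less than proportionally to total births in heavily affected districts, and estimating the equations for the different mother types jointly would allow $\varepsilon_{it}$ to be correlated across them.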

Social gradient
As a third way to examine potential mechanisms we investigate the social gradient in changed fertility behavior. This investigation is also motivated by the results in Brown and Thomas (2018) and Beach et al. (2018) who note a shift towards lower social status parents shortly after the pandemic in the U.S.
With the available data we cannot directly observe socioeconomic status, as income or occupational data for those born during our period are not available at the individual or the district level. Instead, we take advantage of having information on individuals' last names and follow Clark (2014), who shows in a detailed study of several countries, including Sweden, that last names provide a good measure of social position. We classify individuals into social groups according to their last name. Here, we define two social groups: (1) nobility/high social status (aristocratic and Latin names) and bourgeoisie (names including or ending in Lund/-lund, Berg/-berg, Gren/-gren, -quist, -ström) and (2) others (including names ending in -son or -dotter). The vast majority of our individuals (76.4 percent) fall into the second category; 22.2 percent are born into the category 'bourgeoisie' and 1.4 percent are children of nobility/high social status parents. 49 We create a dummy variable for being born with 'high social status' (HighSES) taking the value one when the last name is 'noble' or 'bourgeoisie', and zero otherwise. We lag the date of birth by nine months in order to approximate the date of conception and estimate a linear probability model (LPM) of the probability of being conceived in a family with high social status. 50 As the number of births to high-SES parents varies considerably between months and across districts and is often zero for a particular month-district combination, we use the individual-level data in this subsection. Thereby, we are able to estimate the likelihood that a person born in month $m$ in year $t$ has high-SES parents, again given the lower number of births due to the pandemic.

In the estimated model, $HighSES_{yim}$ indicates whether an individual $y$ is born with high social status in district $i$ in period (month-year) $m$, and $D_{Post} = 1$ if $m \in [\text{Aug 1918}, \text{Dec 1927}]$, otherwise 0. FluIntensity is defined as above. Appendix Fig. C5 shows that the time trends for births to high-SES parents in the highest and lowest district quartiles in terms of influenza exposure were not different before the flu. Table 5 shows no differential effect in the overall and the rural sample. In urban areas, however, we observe a clear shift towards parents of higher social status after the flu, with a higher proportion of individuals with high-status names being conceived. In the previous section, we found a shift towards more stable families in rural areas, with negative results across the board for urban areas. Social status seems to be a more relevant indicator for urban areas, where high social status parents were clearly less affected by economic conditions and uncertainty and therefore did not reduce fertility as a consequence of the flu. 51

Table 5. Social gradient in conceptions.

Note: Annual data on health district level. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. Results from estimating SUR models for married/unmarried and first birth/not first birth separately, standard errors in parentheses. All regressions include district and year fixed effects and the log of the total number of births. Exposure is used for readability and is defined as […]

47 This is, of course, a bad control variable. However, results are unchanged when not including it.

48 For results separated by the three time periods Peak, After and Later, see Appendix Table B5.

49 These numbers mirror official statistics and census data on the share of high-SES individuals in fertile age quite well. The annual publication Befolkningsrörelsen provides statistics on the occupation of fathers to newborns, and suggests that about 30 percent of fathers in the period 1911-1919 were classified as high-SES individuals.

50 A logit model produces similar results.
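The surname-based construction of the HighSES dummy, the nine-month lag and the post-period indicator described in this subsection can be illustrated with a minimal sketch. Several pieces are assumptions rather than the authors' procedure: `noble_names` is a hypothetical lookup set (the text does not say how aristocratic and Latin names are identified), the rule that a patronymic ending takes precedence over a bourgeois name part is a guess, and all function names are invented for illustration.

```python
# Name parts the text lists as markers of the 'bourgeoisie' group.
BOURGEOIS_PARTS = ("lund", "berg", "gren", "quist", "ström")
# Endings the text assigns to the 'others' group.
PATRONYMIC_SUFFIXES = ("son", "dotter")


def is_high_ses(last_name, noble_names=frozenset()):
    """Return 1 for a 'noble' or 'bourgeoisie' surname, else 0.

    noble_names is a hypothetical set of lower-cased aristocratic and
    Latin surnames; how such a list is built is not given in the text.
    """
    name = last_name.strip().lower()
    if name in noble_names:
        return 1
    # Assumption: a -son/-dotter ending places the name in 'others'.
    if name.endswith(PATRONYMIC_SUFFIXES):
        return 0
    return int(any(part in name for part in BOURGEOIS_PARTS))


def conception_month(birth_year, birth_month):
    """Shift a (year, month) of birth back by nine months."""
    index = birth_year * 12 + (birth_month - 1) - 9
    return index // 12, index % 12 + 1


def d_post(year, month):
    """Return 1 if (year, month) lies in Aug 1918 to Dec 1927, else 0."""
    return int((1918, 8) <= (year, month) <= (1927, 12))
```

For example, a child born in May 1919 is assigned a conception month of August 1918 and therefore falls in the post period, while the surname rule separates names such as Lindström or Berg from patronymics such as Andersson.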

Robustness checks
In this section, we present several robustness checks to address potential concerns with our analysis. The main issue of concern is that the observed negative fertility effect may follow from migration. If life became more difficult in severely affected districts, people might choose to move away. Also, one spouse may move temporarily to avoid the risk of infection if the other was ill, restricting the possibilities of conception. Although we are looking at conception rates it could be the case that young people in fertile age migrated more, which would bias our measure of conception rates downwards.
The left part of Table 6 presents estimates for the impact of the pandemic on annual migration rates. 52 All estimates in columns 1-3 are insignificant, but suggest that, if anything, there was an inflow into heavily affected areas. The right part of Table 6 assesses the importance of this inflow. In column 4 we use larger geographical units and repeat the analysis of Eq. (4) on the county level, reducing the number of geographical units from 367 to 25. Results are very similar to the main analysis. 53 Columns 5-7 drop counties that were characterized by particularly high out-migration (Blekinge, Västmanland, and Kronoberg). Again, the results are similar to Table 1. All in all, we conclude that selective migration does not represent a major threat to identification.
Furthermore, biological effects may be present for longer than we assume, i.e. beyond the Peak and After periods. The literature does not have a clear answer to how long women, and possibly also men, are negatively affected in their ability to reproduce following an influenza infection (Wiwanitkit, 2010). The positive effect on marital fertility in 1920-21 in rural areas contradicts this notion. Also, possible negative health effects would affect women who were infected but survived the infection. We would therefore expect such fertility effects to stem from morbidity, not mortality. Table 1 illustrates that this is not the case.
In order to take account of possible spatial correlation across districts, we also estimate our main results using Conley standard errors. 54 Estimates using cut-off levels of 50 km and 100 km are presented in Appendix Table B7 and show that results are unaffected.
Finally, we check the sensitivity of results to changes in district borders over time, by including dummy variables which take the value one for the year of a border change and thereafter, and to changes in the urban-rural classification. These results can be found in Table B9 in the Appendix. The results are unchanged. In summary, the robustness checks support the research design and the validity of the findings.

Note: Columns (1)-(3) annual data on health district level, column (4) monthly data on county level, columns (5)-(7) monthly data on health district level. N refers to the number of health districts/counties × the number of time periods. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. In columns (1)-(3) the dependent variable is net migration rates (positive numbers representing in-migration) defined as migrants per 1,000 population. In columns (4)-(7) the dependent variable is the conception rate. All regressions include district and month-year fixed effects. Standard errors in parentheses, clustered at the district level.

51 For results separated by the three time periods Peak, After and Later, see Appendix Table B6.

52 We calculate the net migration rate for every district using population numbers, number of deaths and births. For every year the residual provides a measure of how many people moved in or out of the district, subject to random measurement error.

53 The exercise where we aggregate and run the analysis on the county level is also informative regarding the potential challenge that pandemic intensity in one district influences pandemic intensity in a district close by, and should also handle potential outlier districts.

54 We use the procedure written by Hsiang (2010).
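The residual calculation behind the net migration rate in footnote 52 can be sketched as follows. This is an illustration under assumptions: the function name and arguments are hypothetical, and the choice of start-of-year population as the denominator is a guess, since the text only says the rate is expressed per 1,000 population.

```python
def net_migration_rate(pop_start, pop_end, births, deaths):
    """Net migrants per 1,000 population; positive means net in-migration.

    Population change not explained by births and deaths is attributed
    to migration. The start-of-year denominator is an assumption.
    """
    natural_change = births - deaths
    net_migrants = (pop_end - pop_start) - natural_change
    return 1000.0 * net_migrants / pop_start
```

A district that grows from 10,000 to 10,050 inhabitants while recording 200 births and 120 deaths has a natural increase of 80, so the residual implies 30 net out-migrants, or a rate of -3 per 1,000. Any error in the population, birth or death counts ends up in this residual, which is why the footnote notes that it is subject to measurement error.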

Conclusion
In this paper we have examined the fertility response to the 1918-19 influenza pandemic in Sweden, which implied a great mortality and morbidity shock in a country that was neutral in WWI. We show that the pandemic affected fertility rates not only in the short term, but even a decade later. Specifically, we find some evidence of a positive fertility response in rural areas following the pandemic. However, this short-term effect is of second-order importance and is overshadowed by a large fertility reduction in the long run. Furthermore, in urban areas the effect of the pandemic on fertility is negative throughout the whole study period. Our results thus suggest that the often noted positive fertility response to mortality shocks and pandemics is short-lived.
Examining heterogeneous effects, we find that poor, underdeveloped districts largely drive the negative long-term effect, suggesting a negative income effect on fertility. We further identify changes in the marriage market as an important mechanism. These changes represent mechanical effects due to the need to find a new partner, but more importantly behavioral and economic effects following increased uncertainty and reduced incomes. Overall, the mortality shock increased the cost of having children and, thus, reduced fertility in the long run.
We also find compositional effects: within the net fertility decline we observe a relative increase in births to married women and parents of higher social status. This result on parental composition is interesting in itself, but may also have implications for how we interpret the often noted later life effects of in utero exposure to health shocks. A recent literature assesses the implications of the observation that cohorts with in utero exposure to the 1918-19 influenza pandemic were born to lower socioeconomic households in the U.S. (Beach et al., 2018;Brown and Thomas, 2018). Our results suggest that the composition of cohorts born after the pandemic may also be important to consider.
It is altogether evident that a deadly pandemic can have fertility effects that go far beyond the infection period itself. Putting the noted negative effect on family size in the perspective of a quantity-quality tradeoff, we may expect parents to invest more in the education of those (fewer) children born after the pandemic in highly affected districts. The fact that we find compositional effects in favor of parents with high socioeconomic status may further reinforce this effect. According to the results presented by Parman (2015), these effects may even hold true for older siblings if resource allocation within the family changes due to the pandemic. For future research, it would therefore be interesting to examine educational outcomes of the children of those families who altered their fertility behavior due to the flu.
As stated in the beginning of this paper the event of a pandemic can cause major losses, and the past year has reminded us that deadly viruses may spread very quickly across countries. There are many similarities between the 1918-19 influenza and the COVID-19 pandemic, not least that both pandemics were caused by new and very contagious viruses and that transmission was similar. At the same time there are significant differences regarding whom the two pandemics affected the most and society is very different compared to a century ago, which makes it difficult to draw straightforward conclusions from effects following the 1918-19 influenza for developed countries today. Nevertheless we believe that our results may still be informative for the sizeable population that lives under similar conditions to early 20th century Sweden.

Authors' contribution
The paper "Disease and fertility: Evidence from the 1918-1919 influenza pandemic in Sweden" is joint work between Nina Boberg-Fazlic (University of Southern Denmark), Maryna Ivets (CINCH, University of Duisburg-Essen), Martin Karlsson (CINCH, University of Duisburg-Essen), and Therese Nilsson (Lund University and IFN). All authors contributed significantly to the paper and have been involved in all parts of the project.

Acknowledgement
We wish to thank participants at seminars and conferences for useful comments, and the Jan Wallander and Tom Hedelius Foundation (P2017-0081:1) and the Crafoord Foundation (20190686) for financial assistance.

Appendix A: Variable definitions
Information comes from church records digitized by the Federation of Swedish Genealogical Societies in the Swedish Death Index, the 1950 Census, and purposely digitized historical records from the National Medical Board, the historical midwife journals, the Swedish yearbook of municipalities and the annual publication on poor relief. The data on railway stations come from Olofsson (1921).
Adult mortality: All-cause deaths between August 1918 and March 1919 in the age group 20-40 up to conception month in district i.
Child mortality: All-cause deaths between August 1918 and March 1919 in the age group 0-10 up to conception month in district i.
Influenza: […]
Urban: Dummy variable taking on value one if a health district is classified as municipal or city, otherwise 0.
Midwives: Number of midwives working in a health district (proxy of local medical infrastructure).
Married mothers: Number of births in a year to married mothers in a district.
Single: Number of births in a year to unmarried mothers in a district.
First birth: Number of births in a year to first-time mothers in a district.
Not first: Number of births in a year to not-first-time mothers.
Poverty: The share of the population living in public poorhouses in a district.
Taxable income: Per capita taxable earnings as reported to tax authorities, normalized by 1917 prices.
Capital income: Per capita asset yields, rents and dividends in a district as reported to tax authorities, normalized by 1917 prices.
Private property: Per capita assessed value of private properties in a district, normalized by 1917 prices.
Local revenue: Per capita public revenue in a district.
Local assets: Per capita value of public assets by December 31 in a district.
Local debt: Per capita public debt by December 31 in a district.
Population density: Population per hectare of area of a district.
Sami: Share of population belonging to the Sami people in 1910 in rural districts.
Widowed rate: Widow rate (incidence of new widowhood) relative to population in a health district.
Marriage rate: Marriage rate (incidence of new marriages) relative to population in a health district.
High SES: Dummy variable taking on value one if an individual is born with a last name defined as belonging to the nobility (aristocratic and Latin names) or bourgeoisie (last names including or ending with lund, berg, gren, quist or ström), otherwise 0.
Migration: Net migration rates (positive numbers representing in-migration), migrants per 1,000 population.
Stillbirths: Number of stillbirths per 1,000 births in a district.
Miscarriage: Number of miscarriages per 1,000 births in a district.
Births: Total number of births in a district.
Railway: Dummy taking on value one if the district had a railway station in 1918, otherwise 0.
No. of railway stations: Number of railway stations in the district in 1918.

Figures

Fig. B1. Birth numbers from different sources compared, as percent of births recorded in official population statistics.

Note: Annual data on health district level. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. Results from estimating SUR models for married/unmarried and first birth/not first birth separately, standard errors in parentheses. All regressions include district and year fixed effects and the log of the total number of births.

Note: Monthly data on the individual level. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. Dependent variable: dummy variable taking on the value one if born with a surname representing high social status, and 0 otherwise. All regressions include district and month-year fixed effects. Standard errors in parentheses, clustered at the district level.

Appendix C: Evidence supporting identification

Common time trend
In a difference-in-differences design the key identifying assumption is that fertility behavior in heavily and less affected areas would have followed a common time trend in the absence of the pandemic. This assumption is untestable, but having access to 43 months of pre-exposure data we assess its plausibility in different ways. Fig. C1 plots conception rates in the highest and lowest district quartiles in terms of influenza exposure. There is no significant difference in the trends before the flu (conception rates in high-mortality districts adjust already in the spring of 1918, but the confidence intervals overlap) and a clearly diverging trend thereafter. We also generate similar time trend graphs for morbidity and mortality rates for different age groups (available upon request). All provide very similar evidence to that of Fig. C1.
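The quartile comparison behind a graph like Fig. C1 can be sketched as follows. This is a simplified illustration, not the authors' procedure: it compares group means only and leaves out the confidence intervals the text relies on, and the function names, the data layout (one list of period values per district) and the quartile cut-off rule are all assumptions.

```python
from statistics import mean, quantiles


def quartile_groups(exposure):
    """Return district ids in the lowest and highest exposure quartiles."""
    q1, _, q3 = quantiles(exposure.values(), n=4)
    low = [d for d, x in exposure.items() if x <= q1]
    high = [d for d, x in exposure.items() if x >= q3]
    return low, high


def group_series(rates, districts):
    """Average the per-period rates over the given districts."""
    n_periods = len(rates[districts[0]])
    return [mean(rates[d][t] for d in districts) for t in range(n_periods)]


def pre_trend_gap(rates, exposure, n_pre):
    """High-minus-low quartile gap in mean rates for the first n_pre periods."""
    low, high = quartile_groups(exposure)
    low_series = group_series(rates, low)
    high_series = group_series(rates, high)
    return [h - l for h, l in zip(high_series[:n_pre], low_series[:n_pre])]
```

A gap that stays roughly constant across the pre-exposure periods is what a common trend would look like in this simplified view; a formal assessment would also need the sampling uncertainty around each group mean.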

Balancing tests
To further test the common time trend assumption we perform balancing tests and regress our influenza intensity measures on pre-flu values from the years 1916 and 1917. If the degree of influenza exposure is predicted by several baseline variables there is a concern that the intensity of the pandemic correlates with relevant unobservables. Guided by previous work on pandemics and seasonal influenza (see, e.g. Clay et al., 2019; Markowitz et al., 2019) we regress a number of different district pre-influenza characteristics on two different measures of influenza intensity, defined as the absolute (left column) and relative (right column) increases in adult influenza mortality compared to the pre-influenza period, in Table C1. For each observable characteristic, we also estimate differences in levels (the absence of which is not a requirement for identification) and in trends (which should not be related to influenza exposure). The estimates show that heavily affected districts had slightly lower birth rates before the pandemic, but there are no systematic differences in trends with regard to birth rates, midwife density, infant mortality and overall mortality. The only exception in this regard is midwife density in urban areas, where the trend of heavily affected areas was more negative.
In Table C2 we conduct the same balancing tests for some indicators of the local economy and public finances. Again, we find some evidence that heavily affected districts had different pre-influenza means of these variables, but the common time trend assumption cannot be rejected for average taxable earnings, local public revenue, local public assets, local public debt and local poverty rates. Only for property values is there some evidence of diverging trends in rural areas, where more affected districts had a more positive trend in the pre-influenza period. Taken as a whole, however, the evidence provided in Tables C1 and C2 supports the main identification strategy: out of 60 tests, only 2 are significant at the 5 percent level, and 5 are significant at the 10 percent level. We also provide a set of figures illustrating the time trend in the highest and lowest district quartiles in terms of influenza exposure for the dependent variables used in the section where we examine potential mechanisms, i.e. marriage rate, widowed rate, married mothers, single mothers, first births, higher births and births to high SES parents. In none of the cases do we note any significant differences in the trends before the flu.

Note: Monthly data on health district level. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. Labels Abs and Rel refer to whether excess mortality during the pandemic is described in absolute or relative terms. All dependent variables are based on health district data.

Note: Monthly data on health district level. The stars represent significance at the following p-values: *p < 0.1, **p < 0.05, ***p < 0.01. Labels Abs and Rel refer to whether excess mortality during the pandemic is described in absolute or relative terms. All dependent variables have been taken from municipality yearbooks and aggregated up to the health district level.

[…] (4) for each quarter at a time. Clearly, there are no influenza effects before August 1918.
The positive effect on conceptions after the influenza peak in rural areas is significant for a period of 19 months, but the estimate in fact stays positive for a total of 31 months. Around 1922 this trend is reversed and the districts most affected by the influenza pandemic exhibit lower conception rates than less affected districts. Altogether the event study graphs confirm and corroborate the findings of the regression analysis.
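The balancing-test tally reported for Tables C1 and C2 (2 of 60 tests significant at the 5 percent level and 5 at the 10 percent level) can be put next to what chance alone would produce. The sketch below is a back-of-the-envelope check under a strong assumption that is not made in the text, namely that the 60 tests are independent; the function names are hypothetical.

```python
from math import comb


def expected_rejections(n_tests, alpha):
    """Expected number of rejections if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(X >= k) for X ~ Binomial(n_tests, alpha)."""
    return sum(
        comb(n_tests, j) * alpha**j * (1 - alpha) ** (n_tests - j)
        for j in range(k, n_tests + 1)
    )


# Chance benchmarks for 60 tests at the two significance levels.
expected_at_5 = expected_rejections(60, 0.05)
expected_at_10 = expected_rejections(60, 0.10)
```

With 60 tests, about 3 rejections at the 5 percent level and about 6 at the 10 percent level would be expected by chance, so the reported counts of 2 and 5 are no higher than what true null hypotheses would generate. Because the tests share districts and related outcomes, the independence assumption is only an approximation.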