Poor housing quality and the health of newborns and young children

This study uses linked administrative data on live births, hospital stays, and census records for children born in Hungary between 2006 and 2011 to examine the relationship between poor housing quality and the health of newborns and children aged 1–2 years. We show that poor housing quality, defined as lack of access to basic sanitation and exposure to polluting heating, is not a negligible problem even in a high-income EU country like Hungary. This is particularly the case for disadvantaged children, 20–25% of whom live in extremely poor-quality homes. Next, we provide evidence that poor housing quality is strongly associated with lower health at birth and a higher number of days spent in inpatient care at the age of 1–2 years. These results indicate that lack of access to basic sanitation, hygiene, and non-polluting heating and their health impacts cannot be considered as the exclusive problem for low- and middle-income countries. In high-income countries, there is also a need for public policy programs that identify those affected by poor housing quality and offer them potential solutions to reduce the adverse effects on their health.


Population census
Information on housing quality comes from the 2011 population census of the Hungarian Central Statistical Office (HCSO). The anonymized dataset available for research purposes covers the entire population of Hungary. The 2011 census was conducted in October 2011; the reference date was October 1st, 2011. In addition to the information on individuals, the census also included a separate housing questionnaire, which measured both the characteristics of the dwelling and how long the respondent had lived there. We defined six binary indicators of poor housing quality: (i) lack of a flush toilet, (ii) lack of a bathroom, (iii) lack of piped water, (iv) lack of hot running water, (v) adobe house, (vi) polluting heating. Heating with solid fuels (wood or coal) was considered to be a polluting heating method if each room was heated separately. These six variables were summed to create an index of poor housing quality, ranging from 0 to 6. A value of 0 means that the dwelling is not considered to be of poor quality in any of the aspects assessed, whereas a value of 6 indicates the worst-quality dwellings, i.e., the dwelling is considered to be of poor quality according to all the indicators assessed.
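The construction of the index can be illustrated with a minimal sketch. The indicator names and the example dwelling below are hypothetical (the census variable names are not given in the text); only the six-item definition, the solid-fuel rule, and the 0 to 6 range come from the description above.

```python
# Hypothetical indicator names; the census variable names are not given in the text.
INDICATORS = (
    "no_flush_toilet",
    "no_bathroom",
    "no_piped_water",
    "no_hot_running_water",
    "adobe_house",
    "polluting_heating",
)


def polluting_heating(fuel: str, rooms_heated_separately: bool) -> int:
    """Return 1 if solid fuel (wood or coal) is burned and each room is heated separately."""
    return int(fuel in {"wood", "coal"} and rooms_heated_separately)


def poor_housing_quality_index(dwelling: dict) -> int:
    """Sum the six binary indicators into an index from 0 (best) to 6 (worst)."""
    values = [dwelling.get(name) for name in INDICATORS]
    # Records with any missing item are excluded in the study, so refuse to score them.
    if any(v not in (0, 1) for v in values):
        raise ValueError("every indicator must be present and coded 0 or 1")
    return sum(values)


# Illustrative dwelling: no flush toilet, no bathroom, room-by-room wood heating.
example = {
    "no_flush_toilet": 1,
    "no_bathroom": 1,
    "no_piped_water": 0,
    "no_hot_running_water": 0,
    "adobe_house": 0,
    "polluting_heating": polluting_heating("wood", True),
}
example_index = poor_housing_quality_index(example)
```

Raising an error on missing items mirrors the sample selection rule that drops records with an incomplete index, rather than silently treating a missing item as zero.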
Beyond housing quality, the Roma ethnicity of the parents is also derived from the 2011 census. The Roma are one of the largest and poorest ethnic minorities in Europe. In Hungary, it is estimated that more than 8 percent of the total population is Roma [72]. They face poverty, multiple disadvantages, and discrimination [73–79]. Ethnicity was measured by two questions, allowing for multiple identities. All mothers and fathers were categorized as Roma if they identified themselves as Roma in either of the questions on ethnicity. Information from the 2011 census also allows us to take into account the characteristics of the geographical micro-environment of the children. The smallest unit of the neighborhood in the Hungarian census is the census tract, containing around 250 individuals on average. Each census respondent belongs to a census tract.

Inpatient care
The health care system in Hungary is a single-payer system. The vast majority of individuals are insured; inpatient and outpatient care is financed by compulsory health insurance and is free of charge. Total opt-out is prohibited, but people can use private care for certain services. This is typically the case for outpatient services, but even there it accounts for a small number of cases. Private inpatient care is most common in obstetrics. Private inpatient care for young children is practically non-existent.
Anonymized data on inpatient care are obtained from the medical records of the Hungarian National Healthcare Services Center (NHSC). For the period 2008-2017, we have information on all inpatient stays in public healthcare for children born between 2008 and 2016. Inpatient care events can be transformed into a panel database using an anonymized identifier. For each event, the patient's sex, date of birth, place of residence (zip code), and the date of the care event are known. Specific health conditions can be identified by the International Classification of Diseases (ICD) codes. We focus on inpatient care at the age of 1-2 years, as the anonymized identifier may have changed during the first few weeks/months of life due to administrative reasons.
From the inpatient care records, we created three indicators of early childhood health: (i) the number of days spent in inpatient care for any disease (ICD codes: A00-Z99), (ii) the number of days spent in inpatient care for respiratory diseases (ICD codes: J00-J99), and (iii) the number of days spent in inpatient care for infectious diseases (ICD codes: A00-B99). Each of these shows the total number of days spent in hospitals over the two years from age 1 to the end of age 2.
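A minimal sketch of how the three indicators could be computed from stay-level records is given below. The record layout (keys `admitted`, `days`, `icd`) is hypothetical, and assigning a stay to the age window by its admission date is an assumption; the text only specifies the ICD ranges and the two-year window.

```python
from datetime import date


def completed_age(birth: date, on: date) -> int:
    """Age in completed years on a given date."""
    return on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))


def inpatient_days(birth: date, stays: list) -> dict:
    """Total hospital days at ages 1-2, overall and for two ICD-10 chapters.

    Each stay is a dict with hypothetical keys 'admitted' (date), 'days' (int), 'icd' (str).
    """
    totals = {"any": 0, "respiratory": 0, "infectious": 0}
    for stay in stays:
        # Keep stays from the first birthday up to the day before the third birthday.
        if completed_age(birth, stay["admitted"]) not in (1, 2):
            continue
        chapter = stay["icd"][0].upper()
        totals["any"] += stay["days"]            # any disease, A00-Z99
        if chapter == "J":                       # respiratory, J00-J99
            totals["respiratory"] += stay["days"]
        elif chapter in "AB":                    # infectious, A00-B99
            totals["infectious"] += stay["days"]
    return totals


# Illustrative child with five stays, two of which fall outside the age window.
child_birth = date(2009, 3, 15)
child_stays = [
    {"admitted": date(2009, 12, 1), "days": 2, "icd": "A09"},   # age 0, ignored
    {"admitted": date(2010, 4, 1), "days": 3, "icd": "J18"},    # age 1
    {"admitted": date(2011, 6, 10), "days": 1, "icd": "A08"},   # age 2
    {"admitted": date(2012, 3, 14), "days": 2, "icd": "S00"},   # last day of age 2
    {"admitted": date(2012, 3, 20), "days": 4, "icd": "J20"},   # age 3, ignored
]
child_totals = inpatient_days(child_birth, child_stays)
```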

Data linkage and sample selection
The population used to study the relationship between housing quality and health at birth consists of singleton births in the live birth dataset between September 2006 and August 2011. In the first step, we excluded births with missing information on health at birth. Next, the birth records were linked to the census data. Neither birth records nor census records contain any personal identifiers, such as social security numbers, that would help link them. The main variables used for the linkage are the exact date of birth of the child and mother, the sex of the child, and the place of residence of the mother at the time of the child's birth. We found some additional matches when we narrowed down the multiple matches by including other variables (father's birth date and parents' education). In the linked dataset, we excluded records where moving into the census dwelling occurred after the start of pregnancy. (This was possible because one of the questions in the census asks how long the respondent has lived in the current dwelling.) Finally, records where any item of the housing quality index was missing were excluded. The number of observations excluded at each step of the sample selection is reported in Table A1 (Online Appendix A). While problems of missing key variables only occur in the case of 1-3% of observations, we lose around 10% of the sample when linking the birth registry to the census and around 30% when we exclude those who have moved since the beginning of their pregnancy. This, however, is a necessary step as we want to ensure that the housing conditions derived from the census characterize the mother's living conditions while pregnant. The final sample covers 253,929 children.
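The linkage logic described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run on the registry data: all field names and records are hypothetical, matching is exact, and a birth is kept only when exactly one census candidate remains after the additional variables are applied.

```python
from collections import defaultdict

# Hypothetical field names standing in for the linkage variables named in the text.
MAIN_KEY = ("child_birth_date", "mother_birth_date", "child_sex", "residence")
TIE_BREAKERS = ("father_birth_date", "mother_education", "father_education")


def link_births_to_census(births, census):
    """Return {birth_id: census_id} for births with exactly one census candidate."""
    by_key = defaultdict(list)
    for person in census:
        by_key[tuple(person[k] for k in MAIN_KEY)].append(person)

    links = {}
    for birth in births:
        candidates = by_key.get(tuple(birth[k] for k in MAIN_KEY), [])
        if len(candidates) > 1:
            # Narrow multiple matches with the additional variables.
            candidates = [c for c in candidates
                          if all(c[k] == birth[k] for k in TIE_BREAKERS)]
        if len(candidates) == 1:
            links[birth["birth_id"]] = candidates[0]["census_id"]
    return links


def record(**fields):
    """Build an illustrative record from defaults so the example stays short."""
    base = {"child_birth_date": "2009-05-02", "mother_birth_date": "1985-01-10",
            "child_sex": "F", "residence": "7621", "father_birth_date": "1983-07-04",
            "mother_education": "tertiary", "father_education": "tertiary"}
    return {**base, **fields}


births = [
    record(birth_id="b1"),                                                   # unique match
    record(birth_id="b2", residence="1011", father_birth_date="1980-02-02"), # needs tie-break
    record(birth_id="b3", residence="9999"),                                 # no candidate
]
census = [
    record(census_id="c1"),
    record(census_id="c2", residence="1011", father_birth_date="1979-09-09"),
    record(census_id="c3", residence="1011", father_birth_date="1980-02-02"),
]
links = link_births_to_census(births, census)
```

Births without a unique candidate are simply left unlinked, which corresponds to the sample loss at the linkage step reported in Table A1.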
When we analyze associations with early childhood health, we have to work with a narrower sample that includes children born between January 2008 and August 2011. Beyond the data linkage steps described above, in this case, we require successful linkage to the inpatient care data, which results in excluding around 27% of the relevant original sample (Table A2, Online Appendix A). The main reason for this relatively high failure rate is that we were only able to use the following information to link inpatient care data: date of birth, sex, and place of residence. The final sample, which is used to examine the relationship between housing quality and early childhood health, consists of 107,934 children. We can form an understanding of the introduced bias by examining the evolution of the key outcome variables over the steps of the sample selection, which we report in Table A3 (Online Appendix A). The magnitudes of the induced differences are small. For instance, the final analysis sample has a mean birth weight around 20 g higher than the starting singleton dataset, so the final analysis sample contains information on children with slightly better health outcomes on average. Additionally, the observations for the analysis of early childhood health are even closer to the original sample in terms of health outcomes: for birth weight, the difference is only around 10 g. As we control for several observable characteristics in the regressions and the selection does not seem to impact the key outcomes substantially, it is likely that our results are not far from what we would estimate for the entire population. Nevertheless, we re-estimated our main results with inverse probability weighting in our robustness checks. In this exercise, the weights are derived from a probability model that runs on the baseline dataset (singleton births in the live birth registry) and predicts the probability of being included in the final analysis samples with all information available in the birth records.
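The idea behind the inverse probability weights can be sketched with a toy example. The text does not specify the probability model, so the sketch substitutes the simplest possible one as an assumption: inclusion probabilities are estimated as inclusion rates within cells defined by the predictors. The variable names and data are hypothetical.

```python
from collections import defaultdict


def inclusion_weights(records, predictors):
    """Weights 1 / P(included | cell) for the included records, in input order.

    The inclusion probability is estimated by the inclusion rate within each
    cell of the predictors (a stand-in for the paper's probability model).
    """
    totals = defaultdict(int)
    kept = defaultdict(int)
    for r in records:
        cell = tuple(r[p] for p in predictors)
        totals[cell] += 1
        kept[cell] += int(r["included"])
    weights = []
    for r in records:
        if r["included"]:
            cell = tuple(r[p] for p in predictors)
            weights.append(totals[cell] / kept[cell])
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy baseline: half of the 'primary' group is lost during sample selection.
baseline = (
    [{"education": "primary", "birth_weight": 3200, "included": i < 2} for i in range(4)]
    + [{"education": "tertiary", "birth_weight": 3400, "included": True} for _ in range(4)]
)
weights = inclusion_weights(baseline, ["education"])
analysis = [r["birth_weight"] for r in baseline if r["included"]]

unweighted_mean = sum(analysis) / len(analysis)
ipw_mean = weighted_mean(analysis, weights)
baseline_mean = sum(r["birth_weight"] for r in baseline) / len(baseline)
```

In this toy case the unweighted analysis sample overstates mean birth weight, because the lighter group is under-represented, while the weighted mean matches the baseline mean.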
Table 1 shows the descriptive statistics of the outcome variables and the index of poor housing quality for the two analysis samples of our study. The average birth weight is somewhat more than 3300 g, while the average gestation length is nearly 39 weeks. Around 6% of the newborns were born with a low birth weight (< 2500 g) or premature (before the 37th week of pregnancy), 5% of the sample have a low APGAR score, and the share of small-for-gestational-age (SGA) newborns is almost 10%. The children in the early childhood health sample spent, on average, nearly two days in hospital at the age of 1-2 years. Nearly one hospital day was for respiratory illnesses and 0.6 days for infectious diseases. The average score of the poor housing quality index is 0.5 in the health at birth sample and slightly higher (0.6) in the early childhood health sample. The descriptive statistics of the control variables are shown in Table A4 (Online Appendix A).

Methods
The association between poor housing quality and the health of newborns and young children is estimated by the following regression:
$$H_{iymc} = \beta\,\mathrm{PHQI}_{i} + X_{i}'\gamma + \rho_{ym} + \tau_{c} + \varepsilon_{iymc} \qquad (1)$$
where H is the health of child i, born in year y and month m, and living in census tract c. PHQI is the index of poor housing quality (ranging from 0 to 6), and β shows how a one-point higher value of the index is associated with health. When the outcome variables are health at birth, β indicates the influence of housing quality during pregnancy. When the outcome variables are early childhood health, β indicates the joint influence of housing quality during pregnancy and early childhood. γ is the coefficient vector on the control variables, and ε is the error term.
X denotes the vector of control variables. It includes the sex of the child, the mother's and father's age (13-17, 18-24, 25-29, 30-34, 35-39, 40-), education (primary or less, vocational, high school, tertiary), labor market status (employed, unemployed, on maternity leave, student, other), Roma ethnicity, and occupation (Hungarian standard classification of occupations codes). For the mothers, the marital status (single, married, divorced, widowed), the number of previous live births (0, 1, 2, 3, 4, 5+), induced abortions (0, 1, 2, 3+), and spontaneous pregnancy losses (0, 1, 2, 3+) were also considered. For the occupation codes, the most detailed categories were considered. The classification system distinguishes nearly 500 occupations. For each of these four-digit occupation codes, a binary indicator variable was included in the regressions, allowing us to control for the effect of occupation on health in the most flexible way. Missing dummies for all control variables are also included.
Year-by-month fixed effects (ρ) control for those unobserved factors that uniformly affect the health of children born in the same year and month. Census tract fixed effects (τ) control for all unobserved location-specific factors that do not change over the years studied and affect the health of children living in the same small neighborhood (e.g., quality and availability of outpatient and GP care in the neighborhood, quality of drinking water, etc.).
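To illustrate what the census tract fixed effects do, the following sketch estimates a slope after demeaning the outcome and the index within tracts (the within transformation). It is a deliberate simplification and an assumption of this example: it includes a single set of fixed effects and no control variables or year-by-month effects, and the data are synthetic.

```python
from collections import defaultdict


def within_slope(y, x, group):
    """OLS slope of y on x after removing group means (one-way fixed effects)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for yi, xi, g in zip(y, x, group):
        s = sums[g]
        s[0] += yi
        s[1] += xi
        s[2] += 1

    num = den = 0.0
    for yi, xi, g in zip(y, x, group):
        sum_y, sum_x, n = sums[g]
        dy = yi - sum_y / n   # deviation from the tract mean of the outcome
        dx = xi - sum_x / n   # deviation from the tract mean of the index
        num += dx * dy
        den += dx * dx
    if den == 0:
        raise ValueError("no within-group variation in x")
    return num / den


# Synthetic data: two tracts with different baseline birth weights and a
# common slope of -24 g per index point.
tract = ["A", "A", "A", "B", "B", "B"]
phqi = [0, 1, 2, 0, 3, 6]
baseline_weight = {"A": 3400.0, "B": 3200.0}
birth_weight = [baseline_weight[t] - 24.0 * p for t, p in zip(tract, phqi)]

beta_hat = within_slope(birth_weight, phqi, tract)
```

Because only within-tract deviations enter the estimate, the difference in baseline weight between the two tracts does not affect the slope, which is the sense in which the fixed effects absorb time-invariant neighborhood characteristics.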

Prevalence of low housing quality
Poor-quality housing is not an uncommon phenomenon among children in Hungary (Fig. 1). One-quarter of the children in the health at birth sample live in a home that does not meet at least one of the basic quality criteria we examined, and 4% of children live in a home that scores 5 or 6 on the poor housing quality index. The latter children lack access to basic sanitation facilities such as piped water, flush toilets, or bathrooms, and their homes are characterized by a heating system that is considered polluting. These results are qualitatively similar when examining the distributions in the early childhood health sample (Fig. A1, Online Appendix A). Importantly, these rates are significantly worse for disadvantaged children. Among children of mothers with at most primary education, 19% live in very poor-quality housing (index scores 5-6), while only 29% live in a home that is not considered poor quality on any of the indicators assessed. For children of Roma mothers, these figures are 27% and 19%, respectively. These deprived groups represent significant segments of society. Children of mothers with at most primary education account for 16% of the sample, while children of Roma mothers account for 6%. These results clearly show that poor-quality housing can be quite widespread among the poorest members of society, even in a developed country like Hungary.
Figure A2 in the Online Appendix shows the prevalence of the components of the poor housing quality index. Polluting heating and homes made of adobe are the two most common quality problems, affecting 17.5% and 13.1% of children, respectively. However, the prevalence of the other components is also not negligible, ranging from 3.1% to 6.7%.

Health at birth
Table 2 summarizes the associations between poor housing quality and health at birth estimated using Eq. (1). The results show overall that the poorer the housing quality, the worse the health of newborns. A one-point higher index value is associated with a 24-g lower birth weight and a 0.64 percentage-point increase in the chance of being born with a low birth weight (LBW). In terms of gestation length, a one-point higher index value of poor housing quality is associated with a 0.01-week shorter pregnancy length and a 0.18 percentage-point higher chance of preterm birth (PTB). A one-point higher index value is also associated with a 1.4 and 0.1 percentage-point higher chance of being born SGA and with a low APGAR score, respectively.
These values are especially substantial when comparing children with minimum (0) and maximum (6) index values. The difference is 146.3 g for birth weight, 3.8 percentage points for LBW, 8.4 percentage points for SGA, 0.06 weeks for gestation length, 1.1 percentage points for PTB, and 0.7 percentage points for low APGAR.
The sensitivity of the results is explored by a series of robustness tests: different location fixed effects, the inclusion of additional control variables, weighting, and the use of a narrower sample. First, we experimented with ZIP code fixed effects instead of census tract fixed effects (Table A5, Online Appendix A). Second, we added further control variables that describe the household composition and characteristics in the 2011 census (Table A6, Online Appendix A). These were the number of household members of different ages, the proportion of employed and unemployed persons among 25-59-year-olds living in the household, the proportion with tertiary and secondary education among 25-59-year-olds, the proportion of people speaking foreign languages (English, German) among 25-59-year-olds, the proportion of people with long-lasting disease or impairment among 25-59-year-olds, and floor space per inhabitant in the dwelling. These additional control variables help us to capture more accurately those permanent socioeconomic circumstances of the children's household that may have shaped their health at birth and may be correlated with poor housing quality. Third, we re-estimated the regressions with weights that represent the inverse probability of being included in the final analysis samples (Table A7, Online Appendix A). Finally, we restricted the sample to children born between September 2008 and August 2011 (Table A8, Online Appendix A). In this way, we tried to ensure that housing quality measured in 2011 describes as accurately as possible the housing conditions in the fetal period. Due to housing renovations, the housing conditions in the fetal period may, in some cases, differ from the 2011 situation, but narrowing the time window reduces this risk. The main results are robust: none of these changes alters the conclusions.
Poor housing quality is associated with lower health at birth in all specifications. Similar results are obtained when, instead of simply summing the six indicators of poor housing, we calculate our index by summing the z-scores of the six items (Table A9, Online Appendix A). The strength of the associations is similar to the baseline result when interpreting the coefficients for a change of one standard deviation. For example, a one standard deviation increase in the poor housing quality index is associated with a 29.3 g lower birth weight in the baseline approach (24.39 × 1.2), and a 27.4 g lower birth weight when the index is calculated using the z-scores (5.96 × 4.6). In addition, significant associations are observed even when the continuous outcome variables are log-transformed (Table A10, Online Appendix A).
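The rescaling used in these comparisons is a simple multiplication of a coefficient by the size of the change considered. The short sketch below re-computes the birth weight figures quoted in the text; it assumes, as the parenthetical products suggest, that 1.2 and 4.6 are the standard deviations of the summed index and of the z-score index, respectively. The helper name `scaled` is illustrative.

```python
def scaled(coefficient: float, change: float, digits: int = 1) -> float:
    """Association implied by a given change in the index, rounded for reporting."""
    return round(coefficient * change, digits)


# Per-point birth weight coefficient (in grams, absolute value) from the text.
BETA_BIRTH_WEIGHT = 24.39

# Contrast between index values 0 and 6.
extreme_contrast = scaled(BETA_BIRTH_WEIGHT, 6)

# One standard deviation change: baseline index vs. z-score index.
per_sd_baseline = scaled(BETA_BIRTH_WEIGHT, 1.2)
per_sd_zscore = scaled(5.96, 4.6)
```

These products reproduce the 146.3 g, 29.3 g, and 27.4 g differences reported in the text.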
Next, we examined the potential nonlinearity of the relationship between poor housing quality and the indicators of health at birth. The seven values of the index of poor housing quality are grouped into four categories to ensure that the categories have a sufficient number of observations and that the standard errors are not too large. Figure 2 summarizes these results. We can see that the estimated relationships can be considered mostly linear. Even at low index values, the health indicators of newborns are worse, and the marginal effects appear to be roughly constant. For some outcome variables, however, the estimates are quite noisy, so the coefficients cannot be considered statistically significant. Nevertheless, the general trend still holds in these cases as well. Finally, we estimated regressions using the six indicators of housing quality instead of the index of them (Fig. A3, Online Appendix A). These results show, not surprisingly, that the total value of the six coefficients is about six times the estimated coefficient of the housing quality index obtained by summing them (see Table 2). Although the standard errors of the individual coefficients are sometimes large, the point estimates suggest that lack of access to hot water, flush toilet, and bathroom, and polluting heating are the most strongly associated with most indicators of health at birth. For example, in the case of birth weight, lack of access to hot water is associated with a 48.6-g lower birth weight, while for polluting heating, lack of access to a bathroom, and lack of access to a flush toilet, the estimated coefficients are −47.5, −21.1, and −16.9, respectively. However, for other outcome variables, the relative strength of associations may differ.
Table 2. Housing quality and health at birth. Controls: sex of the child, the highest level of education, labor market status, occupation code, ethnicity, and age of the mother and father, marital status of the mother, number of previous live births, induced abortions, and spontaneous fetal losses of the mother. Robust standard errors are in parentheses. ***p < 0.01, **p < 0.05, *p < 0.1.

Early childhood health
The relationship between poor housing quality and early childhood health is summarized in Table 3. The results show that children living in poor-quality homes spent more time in hospitals than children living in good-quality homes. A one-point higher value of the index of poor housing quality is associated with 0.11 more days spent in inpatient care at age 1-2 years. The results in Column 2 and Column 3 of Table 3 suggest that this overall increase is mainly due to an increase in hospital stays for respiratory and infectious diseases. A one-point higher index value is associated with 0.07 and 0.03 more hospital days for respiratory and infectious diseases, respectively.
As earlier, the robustness of the results is tested by using alternative location fixed effects (Table A11, Online Appendix A), including further control variables (Table A12, Online Appendix A), and applying weights that correct for the non-random chances of being selected in the sample (Table A13, Online Appendix A). The main conclusions remain unchanged in all these specifications. We also estimated regressions using the sum of the z-scores of the six housing quality items. If the coefficients are interpreted for a change of one standard deviation, the results of a regression using the z-scores are virtually identical to the results of the baseline approach (Table A14, Online Appendix A).
In addition, a new specification is estimated in which indicators of health at birth are controlled for (Table A15, Online Appendix A). Since the estimated coefficients in this specification are only slightly lower than in the baseline regressions, it can be concluded that the association between poor housing quality and early childhood health is not simply due to worse health at birth but that the exposure to poor hygiene and sanitation in the early years plays an independent role. Examining the potential nonlinearity of the relationship between poor housing quality and early childhood health, we find that the relationship is rather nonlinear (Fig. 3). Large differences are observed between children living in very poor-quality homes (index score 5-6) and children living in good-quality homes (index score 0). The difference in hospital days for any disease is 0.7 days, whereas for respiratory and infectious diseases, it is 0.4 and 0.2 days, respectively. However, the early childhood health indicators of children with index scores of 1-4 are not particularly different from those of children with an index score of 0.
When we use the six indicators of housing quality instead of our poor housing quality index, we find that lack of access to piped water is the most strongly associated with the number of days spent in hospitals (Fig. A4, Online Appendix A). It should be noted, however, that some items are highly correlated, which makes it more challenging to interpret the results correctly. Our preferred specification is therefore the one that uses the poor housing quality index.

Discussion
By linking birth certificates, census records, and administrative data on inpatient care for children born in Hungary between 2006 and 2011, this paper addressed the question of how poor housing quality is associated with health at birth and in early childhood. Unlike most of the previous literature, it used data from a high-income country and showed that although the average standard of living is high in Hungary, poor housing quality is not at all a marginal problem, especially among disadvantaged children. It is worth pointing out that the index of poor housing quality includes, among other things, the lack of access to basic sanitation requirements such as a bathroom, running water, or a flush toilet in the home. One might think that these problems were almost non-existent in the 2010s in a member state of the European Union (especially since adequate housing was recognized as a fundamental human right in the Universal Declaration of Human Rights as early as 1948), but we showed that the vast majority of poor people have low-quality housing on at least one criterion. In fact, a fifth of children of mothers with at most primary education and a quarter of children of Roma mothers live in extremely poor-quality homes, characterized by a lack of piped water, flush toilets, and bathrooms, as well as by adobe walls and polluting heating. This means that the potential impacts should not be considered as a marginal issue but as a substantial public policy problem.
We showed that poor housing quality is associated with lower health at birth and a higher number of days spent in inpatient care at the age of 1-2 years. Importantly, the estimated health differences are especially large when comparing children with minimum and maximum index values of housing quality. It is also important to note that in the case of early childhood health, the associations seem to be nonlinear. Only children with extremely low housing quality spend more time in hospitals than children living in good-quality homes. No difference is observed for those with lower index scores.
Direct comparisons of our results with previous literature are difficult because earlier studies usually examine different indicators of housing quality, and early childhood health indicators also differ. We have analyzed data from a high-income country, which means that the results may be different in lower-income countries, where poor-quality housing is likely to have different specific characteristics. For example, solid fuel heating (and cooking) is likely to represent much higher levels of air pollution in lower-income countries. This may be one reason why exposure to indoor air pollution from solid fuel use is more strongly associated with birth weight in these countries than in Hungary [80]. At the same time, the importance of housing quality is indicated by the fact that our estimated differences in birth weight and LBW rates (between children with low and high index scores) are very similar to the black-white differences [81, 82] (adjusted for socio-economic and behavioral factors). Some papers have also found similar differences between newborns from immigrant and non-immigrant backgrounds in high-income countries [83, 84], but in this case the pattern is less clear and the results are mixed, mainly due to selection [85–87].
Housing quality seems to be more important for some outcomes than for others. Calculating the health differences, in terms of standard deviation, between children living in the worst-quality homes and children living in good-quality homes reveals that the difference is particularly large for birth weight, LBW, and SGA. For these variables, the estimated differences are between 0.16 and 0.28 standard deviation (Table A16, Online Appendix A).
It is worth pointing out that the relationship between housing quality and children's health was examined while controlling for local neighborhoods. The census tract fixed effects allowed us to control for any unobserved, place-specific factors that uniformly affect the health of all children living in the same small area. This means that our estimates do not include the impact of exposure to poor-quality housing in the neighborhood, which can also have a significant impact on the health of newborns and young children [88–90].
Table 3. Housing quality and early childhood health. The dependent variables are the number of days spent in inpatient care at the age of 1-2 years. Respiratory diseases = ICD-10 codes J00-J99. Infectious diseases = ICD-10 codes A00-B99. Controls: sex of the child, the highest level of education, labor market status, occupation code, ethnicity, and age of the mother and father, marital status of the mother, number of previous live births, induced abortions, and spontaneous fetal losses of the mother. Robust standard errors are in parentheses. ***p < 0.01, **p < 0.05, *p < 0.1.
Although at first glance the recommendations and policy conclusions from the results seem clear, i.e., that the housing conditions of the people affected need to be improved, many questions remain about how to do this. It is important to consider whether, once a dwelling has been renovated and basic sanitation needs have been met, without other changes, residents will be able to pay the increased overhead costs. It may also not be clear whose housing conditions should be improved, as in most cases low-quality housing is geographically clustered. Improving all the affected homes can be extremely costly, while selective refurbishment can lead to tensions within the community, creating external costs that may not have been anticipated. Furthermore, access to higher-quality housing is available not only through renovation but also through moving. In this case, the potential impact of changes in the environment and the social network must also be taken into account. These are social policy dilemmas that are not easy to answer and solve. Programs aimed at improving housing conditions for the most deprived require careful planning and considerable expertise. However, it is also worth considering that investments in early childhood health are likely to pay off many times over in later life and can therefore significantly reduce the future costs of social security.
Our findings, based on high-quality administrative data, provide important evidence on the relationship between housing quality and the health of infants and young children, but they have limitations. Most importantly, we could estimate only correlations, not causal relationships. Although we controlled for a number of factors, ranging from several characteristics of family background to time-invariant unobserved characteristics of the geographic microenvironment, there may still remain confounders that could be behind the observed relationship. Such factors may include health-related behaviors such as smoking, alcohol consumption or diet during pregnancy, the frequency of use of antenatal care, or young children's dietary habits. Of these unobserved confounders, maternal smoking during pregnancy and exposure to secondhand smoke are probably the most important. Unfortunately, our data do not contain information on these factors, but survey evidence suggests that even if smoking were included in the estimations, a sizeable association between poor housing quality and children's health would still remain (see Online Appendix B). In other words, the baseline control variables are likely to capture a large part of the variation in behavioral factors. Nevertheless, if we could control for all unobserved confounders, the estimated coefficients would likely be lower in absolute terms. At the same time, adverse conditions are likely to increase the likelihood of fetal losses, and these fetal losses are likely to disproportionately affect those in poor health, so fetuses who survive to live birth are positively selected [91, 92]. This means that the association between poor housing quality and health may be somewhat underestimated in our analysis due to this selection issue. A finer measurement of housing quality would also be very useful. In developed countries, damp, moldy, or drafty dwellings may be an even more common problem than the indicators examined here. Temperature and humidity in the home may also be relevant for health. Finally, we would like to point out that although our study examined the relationship between housing quality and children's health, the effects of housing quality may be much broader than this. The lack of basic hygiene facilities and polluting heating can also affect outcomes such as learning or general well-being, partly through health and partly independently, which may also have consequences for later adult life.
This work was supported by the Hungarian National Research, Development and Innovation Office (NKFIH, Grant no. K-132484) and the "Lendület" program of the Hungarian Academy of Sciences (Grant no. LP2018-2/2018). The sources of funding had no role in the study design; in the analysis and interpretation of data; in the writing of the article; or in the decision to submit it for publication. Barnabás Benyák provided outstanding research assistance. We thank József Hegedüs for his advice. The present study has been produced using the de-identified live birth and population census records of the Hungarian Central Statistical Office, and the de-identified inpatient care records of the Hungarian National Healthcare Services Center. The Databank of the Centre for Economic and Regional Studies helped in the preparation of the raw data. We thank the staff of the Databank for their support. The calculations and conclusions are the intellectual products of the authors.

Figure 1. Distribution of observations by values of the index of poor housing quality in the health at birth sample. (A) N = 253,929, (B) N = 44,388, (C) N = 17,552.

Figure 2. Housing quality and health at birth. The reference category is 0. Shaded areas represent 95% confidence intervals. Control variables, year-by-month fixed effects, and census tract fixed effects are included.

Figure 3. Housing quality and early childhood health. The dependent variables are the number of days spent in inpatient care at the age of 1-2 years. Respiratory diseases = ICD-10 codes J00-J99. Infectious diseases = ICD-10 codes A00-B99. The reference category is 0. Shaded areas represent 95% confidence intervals. Control variables, year-by-month fixed effects, and census tract fixed effects are included.