Environmental Noise and Effects on Sleep: An Update to the WHO Systematic Review and Meta-Analysis

Background: Nighttime noise carries a significant disease burden. The World Health Organization (WHO) recently published guidelines for the regulation of environmental noise based on a review of evidence published up to the year 2015 on the effects of environmental noise on sleep. Objectives: This systematic review and meta-analysis will update the WHO evidence review on the effects of environmental noise on sleep disturbance to include more recent studies. Methods: Investigations of self-reported sleep among residents exposed to environmental traffic noise at home were identified using Scopus, PubMed, Embase, and PsycINFO. Awakenings, falling asleep, and sleep disturbance were the three outcomes included. Extracted data were used to derive exposure–response relationships for the probability of being highly sleep disturbed by nighttime noise [average outdoor A-weighted noise level (Lnight) 2300–0700 hours] for aircraft, road, and rail traffic noise, individually. The overall quality of evidence was assessed using Grading of Recommendations, Assessment, Development, and Evaluations (GRADE) criteria. Results: Eleven studies (n=109,070 responses) were included in addition to 25 studies (n=64,090 responses) from the original WHO analysis. When sleep disturbance questions specifically mentioned noise as the source of disturbance, there was moderate quality of evidence for the probability of being highly sleep disturbed per 10-dB increase in Lnight for aircraft [odds ratio (OR)=2.18; 95% confidence interval (CI): 2.01, 2.36], road (OR=2.52; 95% CI: 2.28, 2.79), and railway (OR=2.97; 95% CI: 2.57, 3.43) noise. When noise was not mentioned, there was low to very low quality of evidence for being sleep disturbed per 10-dB increase in Lnight for aircraft (OR=1.52; 95% CI: 1.20, 1.93), road (OR=1.14; 95% CI: 1.08, 1.21), and railway (OR=1.17; 95% CI: 0.91, 1.49) noise. 
Compared with the original WHO review, the exposure–response relationships closely agreed at low (40 dB Lnight) levels for all traffic types but indicated greater disturbance by aircraft traffic at high noise levels. Sleep disturbance was not significantly different between European and non-European studies. Discussion: Available evidence suggests that transportation noise is negatively associated with self-reported sleep. Sleep disturbance in this updated meta-analysis was comparable to the original WHO review at low nighttime noise levels. These low levels correspond to the recent WHO noise limit recommendations for nighttime noise, and so these findings do not suggest these WHO recommendations need revisiting. Deviations from the WHO review in this updated analysis suggest that populations exposed to high levels of aircraft noise may be at greater risk of sleep disturbance than determined previously. https://doi.org/10.1289/EHP10197


Introduction
Sleep is a vital component of human life that serves many critical roles in physical and mental health and well-being. 1 Sufficient quantity and quality of sleep are requirements for optimal daytime alertness and performance, and high quality of life. 2 Experimental studies suggest that restricted sleep duration causes blood vessel dysfunction, 3 induces changes in glucose metabolism 4,5 and appetite regulation, 6 and impairs memory consolidation. 7 Accordingly, epidemiological studies have consistently found that chronic short or interrupted sleep is associated with negative health outcomes, including obesity, 8 diabetes, 9 hypertension, 10 cardiovascular disease, 11 all-cause mortality, 12 and poorer cognitive function. 13 Chronic insufficient or disrupted sleep is therefore of public health relevance, and sleep disturbance is considered a major adverse consequence of exposure to environmental noise. 14 In Europe, there is a substantial burden of disease from environmental noise, primarily from aircraft, road, and rail traffic. 15,16 In 2011, the World Health Organization (WHO) attributed the majority of this disease burden to noise-induced sleep disturbance, with 903,000 disability-adjusted life years lost annually in Western Europe alone. 14 Environmental noise is also a problem outside of Europe. For example, recent data from the U.S. Bureau of Transportation Statistics estimate that 41.7 million people in the United States are exposed to air and road traffic noise at 24-h average levels (L AEq,24h ) >50 dB. 17 This noise level, per conversion data from Brink et al., 18 is equivalent to a nighttime (2300-0700 hours) level of 45.3 dB (L night ), which is around or above the level associated with adverse effects on sleep. 15 Nighttime noise can fragment sleep structure by inducing awakenings and shifts to lighter, less restorative sleep. 19
Importantly, these effects do not seem to habituate fully, and arousals and awakenings induced by aircraft noise can occur even among chronically exposed individuals. 20-22 Although noise-induced sleep fragmentation and reductions in total sleep time are less severe than in sleep restriction studies, sleep disturbance by chronic noise exposure may lead to the development of disease in the long term. Experimental studies have found adverse effects of nocturnal aircraft noise on parameters of endothelial function, oxidative stress, and inflammation. 23,24 This points to the importance of noise-induced sleep disturbance for cardiovascular disease risk, and, indeed, this is supported by epidemiological data in which nighttime noise is more strongly associated with indicators of vascular stiffness and hypertension than daytime noise. 25 The ubiquity of exposure to environmental noise in industrialized nations, and the chronic nature of that exposure, therefore pose a significant threat to health. 26 In 2018, the WHO published recommendations for protecting human health from exposure to environmental noise. 15 These guidelines included strong recommendations for target nighttime noise levels to mitigate adverse effects of traffic noise on sleep, which were 45 dB L night for road traffic, 44 dB L night for rail traffic, and 40 dB L night for air traffic. These recommendations were based primarily on a systematic review and meta-analysis on the effects of noise on sleep, which included studies published up to the year 2015 only. 19 There has been continued and substantial interest and research in the domain of noise and sleep during the intervening years. We therefore updated the earlier systematic review and meta-analysis to include studies published up to the year 2021.
This updated analysis is restricted to field studies on the effects of nocturnal traffic noise on self-reported sleep in adults, and it has the overarching aim of synthesizing updated exposure-response relationships for the probability of being highly sleep disturbed.

Methods
This review and analysis was prepared following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement guidelines. 27 The completed PRISMA checklist is given in Table S1. The review and analysis protocol was defined a priori and registered in PROSPERO (record CRD42021229587) before conducting any preliminary searches, screening of articles, or data extraction. The University of Pennsylvania institutional review board (IRB) determined that the study did not meet the definition of human subjects research given that no identifiable information was being obtained, and therefore review or approval of the study by the IRB was not required.
The analytic approach is described in detail below and was consistent with the previous WHO review, 19 with the following exceptions: a) exposures were limited to traffic noise from aircraft, road, and rail traffic, and b) effects on sleep were limited to self-reported questionnaire outcomes. These outcomes form the basis of the highly sleep disturbed exposure-response relationships and of calculations of the burden of disease by noise and are, therefore, critical from a noise policy perspective. Studies on acute noise-induced awakenings using objective measures, such as actigraphy or polysomnography, were not included.

Eligibility Criteria
Studies were restricted to primary investigations in humans exposed to environmental noise from aircraft, road, and rail traffic at home. Studies investigating other sources, such as wind turbine noise or hospital noise, were excluded. Studies were eligible only if sound pressure levels were measured or predicted at the participant's home. Studies with subjective evaluation of the noise levels, distance to the noise source as a surrogate measure of noise level, or noise levels not specific to a participant's home address were excluded. A minimum of two different noise level categories were required so that exposure-response relationships for sleep disturbance could be constructed.
Studies were eligible if they employed prospective, retrospective, cohort, longitudinal, cross-sectional, or case-control study designs. Laboratory studies, intervention studies, or studies in which noise was introduced artificially were excluded due to low generalizability in real-world settings. Studies were restricted to original research published or accepted for publication in the year 2000 or later. Article language was restricted to English, Dutch, French, and German.
This review and analysis focuses on self-reported sleep disturbance by traffic noise. Eligible studies included at least one of the three most common outcomes of self-reported disturbance that were identified in the original WHO review 19 :
• Awakenings from sleep
• The process of falling asleep
• Sleep disturbance
Studies were eligible if they either explicitly mentioned noise as the source of disturbance, for example, "How often is your sleep disturbed by noise from aircraft?", or included more general sleep questions that did not explicitly mention noise, for example, "How often do you have difficulties sleeping?". So that the probability of being highly sleep disturbed could be determined, eligible studies were required to include outcome scales that indicated either the severity or the frequency of symptoms or disturbance on a nonbinary scale. A binary response scale was, however, permitted if the phrasing of the question was such that a binary response would indicate being highly sleep disturbed, for example, "Is your sleep highly disturbed by noise from road noise?". Studies reporting other measures of self-reported sleep not described above (e.g., perceived sleep quality, estimated total sleep time, morning sleepiness), and studies on objective sleep (e.g., polysomnography, actigraphy) or sleep medication use, were excluded.

Study Selection
All studies identified in the WHO evidence review 19 for which data were already available for meta-analysis were included in the updated synthesis. We also identified studies published later than the WHO review from a scoping synthesis by van Kamp et al. 28 Because van Kamp et al. 28 included studies published up to June 2019 only, we further searched four electronic databases (Scopus, PubMed, Embase, PsycINFO), to identify more recent relevant studies published up to 31 December 2021. This search was done with the same search terms and strategy from van Kamp et al. 28 that were relevant for traffic noise and self-reported sleep. The full electronic search strategy is given in Table S2. Any studies of which we were aware but that were not identified during the literature search were also screened for eligibility.
Two reviewers (M.G.S. and M.C.) independently and manually screened the title and abstract of each identified study against the study eligibility criteria. If eligibility could not be determined from the title and abstract alone, the full text was reviewed. Any differences in eligibility judgments were resolved by discussion and consensus, with input from a third reviewer (M.B.) if needed.

Data Extraction and Synthesis
The following variables were extracted by a single investigator from the original records for review by the authorship team: article title, authors, publication year, traffic mode, noise level, noise metric and time base, noise exposure methodology, sleep disturbance question(s) and response scale(s), study design, country, city, effective sample size, number of data points per respondent, and sleep disturbance point estimates. If data could not be extracted directly from the published articles and supplemental materials, we directly contacted all study authors for whom contact details were available to request data. We requested a list of relevant questions on sleep and the response scales used, the total number of respondents in 5-dB bins, and the percentage of respondents reporting being highly sleep disturbed in each 5-dB bin. We requested only these summary data, and no identifiable information on any study respondents was requested or obtained. If the study authors did not reply after they were sent two reminders, the contact was considered a nonresponse and the study was excluded.
The exposure variable of interest for the meta-analysis was average nighttime outdoor A-weighted noise level from a single traffic mode (air, road, and rail) during the night, hereafter termed L night , measured in decibels. A-weighting is a filter network that is used to simulate the nonlinear frequency response of human hearing. The night period was defined as 2300-0700 hours, in line with EU Environmental Noise Directive 2002/49/EC. 29 In studies where noise levels were reported as a different metric, we converted to L night using the conversion formulae from Brink et al. 18 given below. L night was not treated as a continuous variable but, rather, was categorized into 5-dB bins, following the approach used in the WHO review. 19 For open-ended noise level categories, we assigned a noise level that was 2.5 dB above or below the cutoff, for instance, <50 dB and >50 dB would be coded as 47.5 dB and 52.5 dB, respectively. The midpoints of each 5-dB bin were used as the noise exposure levels in the statistical analyses.
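The binning rule described above can be illustrated with a minimal sketch. The function names are hypothetical, and the alignment of bins to multiples of 5 dB with an inclusive lower edge is an assumption made for illustration; the text specifies only the 5-dB width, the use of midpoints, and the 2.5-dB rule for open-ended categories.

```python
import math

BIN_WIDTH_DB = 5.0  # bin width stated in the text


def bin_midpoint(level_db: float) -> float:
    """Midpoint of the 5-dB bin containing level_db.

    Assumption: bins start at multiples of 5 dB and include their lower edge,
    e.g. 45.0 <= level < 50.0 maps to 47.5.
    """
    lower_edge = math.floor(level_db / BIN_WIDTH_DB) * BIN_WIDTH_DB
    return lower_edge + BIN_WIDTH_DB / 2


def open_ended_level(cutoff_db: float, direction: str) -> float:
    """Level assigned to an open-ended category such as '<50 dB' or '>50 dB'.

    The assigned level lies 2.5 dB below or above the cutoff, as in the text.
    """
    if direction == "below":
        return cutoff_db - BIN_WIDTH_DB / 2
    if direction == "above":
        return cutoff_db + BIN_WIDTH_DB / 2
    raise ValueError("direction must be 'below' or 'above'")
```

For example, `open_ended_level(50, "below")` and `open_ended_level(50, "above")` reproduce the 47.5 dB and 52.5 dB values given in the text.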
The primary outcome of interest was the probability of self-reporting high sleep disturbance for a given noise level. We a priori defined three separate domains of questions that were used to determine sleep disturbance. First, "awakenings from sleep," referring to the period between sleep onset and final awakening. These awakenings are defined as events where a participant wakes from sleep, regains consciousness, and recalls the awakening the following morning. Second, the "process of falling asleep," defined as the transition from wakefulness to sleep. Third, "sleep disturbance," defined as the internal or external interference with sleep onset or sleep continuity. Included studies had to address at least one of these domains in the form of at least one self-reported question. For each of these three question types, the coding of whether a respondent was highly sleep disturbed depended on the response scale used. For responses using 5- or 11-point scales referring to the severity of the disturbance, the top two and top three categories were, respectively, defined as highly sleep disturbed, following previous conventions for the International Commission on the Biological Effects of Noise (ICBEN) annoyance scale. 30 For responses that referred to the frequency of symptoms, a frequency of "often" or at least three times per week was considered as highly sleep disturbed because this frequency of difficulty sleeping is a diagnostic criterion of insomnia. 31 One study used a dichotomous filter question, "Do you have any trouble with your sleep?", to determine if a respondent would answer a question on the frequency of difficulty falling asleep. 32 Any responses of "no" to this filter question were coded as not highly sleep disturbed.
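The coding rules above can be sketched as two small functions. This is an illustration only: the function names are hypothetical, and the numbering of the scales (1 to 5 for the 5-point scale, 0 to 10 for the 11-point scale, higher meaning more disturbed) is an assumption, because the text states only that the top two and top three categories count as highly sleep disturbed.

```python
def hsd_from_severity(response: int, scale_points: int) -> bool:
    """Return True if a severity response counts as highly sleep disturbed.

    Assumed numbering: 5-point scale coded 1..5, 11-point scale coded 0..10.
    Top two categories (5-point) or top three (11-point) are coded as HSD.
    """
    if scale_points == 5:
        if not 1 <= response <= 5:
            raise ValueError("5-point responses are assumed to be coded 1..5")
        return response >= 4
    if scale_points == 11:
        if not 0 <= response <= 10:
            raise ValueError("11-point responses are assumed to be coded 0..10")
        return response >= 8
    raise ValueError("only 5- and 11-point severity scales are covered")


def hsd_from_frequency(times_per_week: float, filter_answer: str = "yes") -> bool:
    """Return True if a frequency response counts as highly sleep disturbed.

    A 'no' to a preceding filter question is coded as not HSD; otherwise
    at least three times per week is coded as HSD.
    """
    if filter_answer == "no":
        return False
    return times_per_week >= 3
```

Verbal frequency categories such as "often" would need a study-specific mapping to one of these two outcomes, which is not shown here.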

Study-Specific Exposure and Response Characterization
One study reported noise exposure as 24-h average levels (L AEq,24h ). 33 These noise levels were converted to L night using the following conversion equations 18 :

Road traffic: $L_{\mathrm{night}(23\text{-}07)} = L_{\mathrm{AEq,24h}} - 4.7\ \mathrm{dB}$
Railway traffic: $L_{\mathrm{night}(23\text{-}07)} = L_{\mathrm{AEq,24h}} - 0.6\ \mathrm{dB}$

One study reported road noise as the day-evening-night level (L den ), 34 which was converted to L night as follows 18 :

$L_{\mathrm{night}(23\text{-}07)} = L_{\mathrm{den}} - 8.3\ \mathrm{dB}$

One study reported noise level as Livello di Valutazione del Aeroportuale (LVA), 35 which is similar to the day-night level (L dn ), except that the night period is 7 h (2300-0600 hours) rather than 8 h. 36 Formulae to convert directly from LVA to L night are unavailable; therefore, we made the following assumptions in converting to L night : The 1-h shorter night when using LVA means that the same exposure assessed as L dn will be lower because L dn applies a 10-dB penalty to the night period. We assume −0.7 dB given that that is the difference in L dn metrics with a 1-h difference in the night period (8 vs. 9 h) for aircraft noise. 18 We then incorporated this difference into an appropriate conversion equation to convert from LVA to L night 18 :

$L_{\mathrm{dn}} = \mathrm{LVA} - 0.7\ \mathrm{dB}$
$L_{\mathrm{night}(23\text{-}07)} = L_{\mathrm{dn}} - 8.9\ \mathrm{dB}$
$\therefore\ L_{\mathrm{night}(23\text{-}07)} = \mathrm{LVA} - 0.7\ \mathrm{dB} - 8.9\ \mathrm{dB} = \mathrm{LVA} - 9.6\ \mathrm{dB}$

One study used a noise category that was 10 dB wide (65-75 dB LVA). 35 We subdivided these data into 5-dB-wide bins, assuming n/2 respondents in each bin (35 respondents per bin) and the same prevalence of high sleep disturbance in each bin as in the 10-dB-wide category.
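The conversions in this section amount to subtracting a fixed offset from the reported metric. A minimal sketch follows; the offsets are those quoted in the text, while the function name, the metric labels, and the keying of offsets by traffic mode are hypothetical choices made for illustration.

```python
# Offsets in dB subtracted from the reported metric to obtain Lnight (23-07 h),
# as quoted in the text. Keying by (metric, traffic mode) is an assumption.
OFFSETS_TO_LNIGHT_DB = {
    ("LAeq24h", "road"): 4.7,
    ("LAeq24h", "rail"): 0.6,
    ("Lden", "road"): 8.3,
    ("Ldn", "aircraft"): 8.9,
}
LVA_TO_LDN_OFFSET_DB = 0.7  # assumed LVA-to-Ldn difference stated in the text


def to_lnight(level_db: float, metric: str, mode: str) -> float:
    """Convert a reported noise level to Lnight using the fixed offsets above."""
    if metric == "Lnight":
        return level_db
    if metric == "LVA":
        # Two-step conversion: LVA -> Ldn -> Lnight (9.6 dB in total).
        level_db -= LVA_TO_LDN_OFFSET_DB
        metric = "Ldn"
    try:
        offset = OFFSETS_TO_LNIGHT_DB[(metric, mode)]
    except KeyError:
        raise ValueError(f"no conversion listed for {metric} and {mode}") from None
    return level_db - offset
```

Under these offsets, a road traffic level of 50 dB L AEq,24h corresponds to about 45.3 dB L night , and an aircraft level of 70 dB LVA to about 60.4 dB L night .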
Two studies assessed noise exposure as both calculated long-term outdoor noise levels and measured indoor noise levels over 3-6 nights. 20,21 We used the calculated outdoor noise levels as the exposure metric to be consistent with other studies in the meta-analysis.
In one study, 21 sleep in the previous night was assessed repeatedly over several mornings. Because of these repeated measures, we first calculated the probability of being highly disturbed using all five to six responses per respondent. We then used these probabilities to determine the number of individuals that would have reported being highly sleep disturbed if only one response was obtained per person. In this way, each respondent contributed only a single data point to the analysis.
One study calculated exposure to railway traffic as including noise from trains, trams, and subways. 37 The questions regarding "sleep disturbance by tram/subway noise" and "sleep disturbance by train noise" in this study were therefore averaged into a single sleep disturbance variable.

Risk of Bias and Quality of Evidence
The risk of bias at the outcome level within individual studies was assessed using the methodology developed within the WHO review, 19 with the following two amendments to the assessment criteria (Table 1). First, in line with recommendations for cross-sectional studies by the National Institutes of Health, 38 a study was considered at high risk of selection bias if the response rate was <50%, down from the 60% criterion in the WHO review. Second, bias due to the sleep measurement outcome was not assessed because our updated analysis focused on only a single sleep measurement outcome (sleep questionnaires), whereas the WHO review also included heart rate or blood pressure, actigraphy, polysomnography, and other objective physiologic measurements. The risk of bias in each domain was assessed independently by two investigators (M.G.S. and M.C.). All studies were included in the meta-analysis regardless of the bias assessment.
To evaluate heterogeneity between studies, we calculated odds ratios (ORs) for each outcome within each study using binary logistic regression in SPSS (version 26; IBM Corp.). For consistency with the WHO review, 19 the range of L night was not restricted in this analysis. Forest plots for all outcomes across studies were generated using RevMan (version 5.4.1; Cochrane Collaboration) using an inverse-variance (IV) random effects method. Heterogeneity between studies for each outcome was assessed using the I 2 statistic. We interpreted I 2 values using thresholds defined by the Cochrane Collaboration. 39 Publication bias across studies was investigated using funnel plots of the individual study estimates.
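For readers unfamiliar with inverse-variance random-effects pooling and the I 2 statistic, the following sketch shows one common formulation, using the DerSimonian-Laird estimate of between-study variance. It is not the RevMan implementation, the function name and return structure are hypothetical, and the inputs (per-study log ORs and their standard errors) are assumed to come from the per-study logistic regressions.

```python
import math


def pool_random_effects(log_ors, std_errs, z=1.96):
    """Pool per-study log odds ratios with inverse-variance random-effects weights.

    Uses the DerSimonian-Laird estimate of the between-study variance (tau^2)
    and returns the pooled OR, its CI, tau^2, and I^2 in percent.
    """
    if len(log_ors) != len(std_errs) or not log_ors:
        raise ValueError("need one standard error per log odds ratio")
    weights = [1.0 / se ** 2 for se in std_errs]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errs]
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "tau2": tau2,
        "i2": i2,
    }
```

When all studies report the same OR, Q is effectively zero, so both tau^2 and I 2 are zero and the pooled estimate equals the common OR.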
The quality of evidence across studies for the effects of exposure to aircraft, road, and rail traffic noise on self-reported sleep outcomes where noise was specified, and self-reported sleep outcomes where noise was not specified, was assessed independently by two investigators using the Grading of Recommendations, Assessment, Development, and Evaluations (GRADE) criteria. 40 Any differences in the risk of bias assessments for individual studies, or in the quality of evidence across studies for each outcome (GRADE), were resolved by consensus with input from a third investigator if needed.

Meta-Analytic Approach
The primary goal of the meta-analysis was to generate updated exposure-response relationships for the probability of high sleep disturbance for each of the three disturbance types (awakenings, falling asleep, and sleep disturbance) for each traffic mode (air, road, and rail). In line with the WHO review, 19 we also generated a combined estimate for high sleep disturbance across the three different types of disturbance questions, using the following approach: If a study included two or three relevant sleep disturbance questions, the combined estimate was calculated by averaging the responses to those questions for each respondent within a study. This approach was adopted so that each respondent would contribute only a single data point to the analysis of each separate outcome. If a study included only one sleep outcome, the combined estimate and the single study outcome assessed would be the same.
Data for individual studies were provided directly by the authors of each study, binned in 5-dB-wide noise categories. One line of data was created for each sleep disturbance question from each study respondent. For instance, if a study had 500 respondents in the noise category with a 47.5 dB L night midpoint, and 10% were classified as highly sleep disturbed, we generated 450 data lines with non-highly sleep disturbed respondents (binary outcome = 0) and 50 data lines with highly sleep disturbed respondents (binary outcome = 1). Each data line also carried the midpoint of the 5-dB L night exposure category, a three-level categorical variable for traffic mode (air, road, and rail), a dichotomous variable indicating whether the questionnaire data originated from questions that did or did not explicitly mention noise as the source of disturbance, a dichotomous variable indicating whether the study was European or non-European, and a study identification number.
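The expansion of binned summary data into one data line per respondent can be sketched as follows. The function and field names are hypothetical, and rounding the number of highly sleep disturbed respondents to the nearest integer is an assumption for cases where the reported percentage does not yield a whole number.

```python
def expand_bin(study_id, mode, noise_mentioned, european,
               lnight_midpoint_db, n_respondents, share_hsd):
    """Expand one 5-dB bin of summary data into per-respondent data lines.

    share_hsd is the proportion (0..1) classified as highly sleep disturbed.
    Assumption: the HSD count is rounded to the nearest whole respondent.
    """
    n_hsd = round(n_respondents * share_hsd)
    base = {
        "study_id": study_id,
        "mode": mode,                      # "air", "road", or "rail"
        "noise_mentioned": noise_mentioned,
        "european": european,
        "lnight_db": lnight_midpoint_db,
    }
    disturbed = [dict(base, hsd=1) for _ in range(n_hsd)]
    not_disturbed = [dict(base, hsd=0) for _ in range(n_respondents - n_hsd)]
    return disturbed + not_disturbed
```

Applied to the example in the text (500 respondents at 47.5 dB L night , 10% highly sleep disturbed), this yields 50 lines with outcome 1 and 450 lines with outcome 0.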

Statistical Analysis
Exposure-response relationships were generated with the following approach: Random study effect logistic regression models with L night (midpoint of the noise exposure category) as the only explanatory variable were fitted with the NLMIXED procedure in SAS (version 9.4; SAS Institute, Inc.). This approach accounts for the fact that respondents were clustered within studies, and the weight of a study increases with its sample size. Analyses were restricted to levels between 40 and 65 dB L night because of inaccuracy in predicting noise levels <40 dB and because the highest exposure limit common to all three traffic modes was 65 dB L night . Separate regression models were run stratified by the three traffic modes (air, road, or rail), four sleep disturbance outcomes (awakenings, falling asleep, sleep disturbance, or combined estimate of all questions within a study), and the dichotomous noise-specificity of the disturbance question (noise mentioned or noise not mentioned), yielding a total of 3 × 4 × 2 = 24 separate regression analyses. Estimate statements were used to generate point estimates and 95% confidence intervals (CIs). Data are reported as dose-response curves and as ORs per 10-dB increase in L night .
To investigate whether the response differed between European and non-European studies, we added study location as a covariate to the logistic regression model and repeated the analysis for the combined estimates of sleep disturbance. These analyses were restricted to the four outcomes where both European and non-European data were available.
We performed a sensitivity analysis to investigate the influence of exposure assessment bias on sleep disturbance. We repeated the logistic regression for the combined estimates of sleep disturbance, restricted to 40-65 dB L night , and stratified the analysis by studies that were judged to have a low or high risk of bias in the exposure assessment.
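The step from a fitted logistic slope to an OR per 10-dB increase, and the enumeration of the 24 stratified models, can be sketched as below. This is not the SAS NLMIXED code used in the study; the function names are hypothetical, the slope and standard error are assumed to be expressed per 1 dB on the log-odds scale, and the predicted probability is evaluated with the random study effect set to zero.

```python
import math
from itertools import product

MODES = ("air", "road", "rail")
OUTCOMES = ("awakenings", "falling asleep", "sleep disturbance", "combined")
NOISE_MENTIONED = (True, False)


def model_grid():
    """All (traffic mode, outcome, noise mentioned) strata, one model each."""
    return list(product(MODES, OUTCOMES, NOISE_MENTIONED))


def or_per_10_db(beta_per_db, se_per_db, z=1.96):
    """OR and CI for a 10-dB increase, from a per-dB log-odds slope."""
    point = math.exp(10 * beta_per_db)
    lower = math.exp(10 * (beta_per_db - z * se_per_db))
    upper = math.exp(10 * (beta_per_db + z * se_per_db))
    return point, lower, upper


def probability_hsd(intercept, beta_per_db, lnight_db):
    """Predicted probability of high sleep disturbance at a given Lnight.

    Assumption: random study effect set to zero (a 'typical' study).
    """
    return 1.0 / (1.0 + math.exp(-(intercept + beta_per_db * lnight_db)))
```

For instance, a slope of ln(2)/10 per dB corresponds to an OR of 2 per 10-dB increase, and `len(model_grid())` returns the 24 analyses counted in the text.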

Study Selection
Study identification, screening, and selection are summarized in Figure 1. All 25 studies in the WHO review were included. 19 Twenty-one studies published between January 2014 and June [...] 41 and German Noise-Related Annoyance, Cognition and Health (NORAH) 42 projects]. We manually extracted the study documents from project webpages 41,42 and judged both studies to be eligible for inclusion after undergoing the standard screening protocol. Two studies initially deemed eligible could not be included in the meta-analysis 43,44 because data could not be obtained or noise exposure specific to the home address was unavailable (Table S3). We therefore identified 11 studies in total published since the WHO review to include in the meta-analysis, 20,21,32,34,35,37,41,42,45-47 in addition to the 25 studies included in the original review 19 (Tables 2-4).

Comparison with Previous WHO Review
The effective sample size for each sleep outcome and for each traffic mode, determined using all data in the updated analysis (responses from the WHO analysis plus the 11 newly identified studies), is compared against the sample sizes from the WHO analyses in Figure 2. Sample sizes for the combined estimates where responses to multiple questions were averaged within studies are given in Figure S1. For all three traffic modes, our updated analysis includes a substantially higher number of respondents for all self-reported disturbance questions.

Figure 1. Flow diagram of study identification, screening, and selection. Diagram labels: Previous studies; Reports sought for retrieval (n = 2); Excluded from meta-analysis: Could not obtain data (n = 1), Noise exposure specific to home address unavailable (n = 1); Total studies included in meta-analysis (n = 36). "Study" refers to a data collection campaign including a defined group of participants and one or more outcomes. In one instance, a study was reported in multiple articles 41,42 and is counted as n = 1 study. "Report" is a journal article, preprint, conference abstract, study register entry, clinical study report, dissertation, unpublished manuscript, government report, or other document supplying relevant information about a particular study or studies.

Phan et al. 57 Shimoyama et al. 58; 1,460; Ho Chi Minh City, Vietnam; Same as above; L night , 2200-0600 hours (67.5-77.5 dB)
Phan et al. 57 Shimoyama et al. 58; 479; Da Nang, Vietnam; Same as above; L night , 2200-0600 hours (57.5-67.5 dB)
Phan et al. 57 Shimoyama et al. 58; 680; Hue, Vietnam; Same as above; L night , 2200-0600 hours (52.5-72.5 dB)
Phan et al. 57 Shimoyama et al. 58; 777; Thai Nguyen, Vietnam; Same as above; L night , 2200-0600 hours (52.5-67.5 dB)
Sato et al. 59 1
Phan et al. 57 Shimoyama et al. 58; 1,458; Ho Chi Minh City, Vietnam; Same as above; L night , 2200-0600 hours (67.5-77.5 dB)
Phan et al. 57 Shimoyama et al. 58; 481; Da Nang, Vietnam; Same as above; L night , 2200-0600 hours (57.5-67.5 dB)
Phan et al. 57 Shimoyama et al. 58; 682; Hue, Vietnam; Same as above; L night , 2200-0600 hours (52.5-72.5 dB)
Phan et al. 57 Shimoyama et al. 58; 781; Thai Nguyen, Vietnam; Same as above; L night , 2200-0600 hours (52.5-67.5 dB)
Sato et al. 59 1

Sleep Disturbance by Noise: Individual Studies
ORs for the probability of being highly sleep disturbed by noise for each study are shown in Figure 3 (aircraft), Figure 4 (road traffic), and Figure 5 (railway). Also shown is the risk of bias assessment for each study (Table S4 for the rationale for each judgment). With a 10-dB increase in L night , there was a statistically significant probability of being sleep disturbed by noise for all three traffic modes. This increased probability was independent of whether noise was specifically mentioned in the sleep question. There were significant differences between the subgroups for each traffic mode, and the ORs were lower in studies that did not specifically mention noise. There was considerable heterogeneity (I 2 ≥ 75%) for all three traffic modes when the sleep question mentioned noise. There was substantial heterogeneity (50% ≤ I 2 ≤ 90%) between studies of aircraft and road traffic when the sleep question did not specifically mention noise. The heterogeneity between studies of railway noise was deemed unimportant (I 2 ≤ 40%) when the sleep question did not specifically mention noise.

Sleep Disturbance by Noise: Overall Analysis
The ORs for the probability of being highly sleep disturbed by nighttime noise, calculated using data from all studies and restricted to 40-65 dB L night , are presented in Table 5. When the question mentioned noise as the source of disturbance, there was a higher probability of being significantly disturbed by noise for all three outcomes, as well as for the combined estimate. When the question did not mention noise, significant relationships were observed only for aircraft and road noise, and for only some of the sleep disturbance outcomes. A substantial proportion of studies into road and railway noise were judged as having a high risk of exposure assessment bias when the question mentioned noise. We decided post hoc to perform a sensitivity analysis for these traffic types, to elucidate the influence of these risks of bias on sleep disturbance. There was a greater probability of being highly sleep disturbed by noise in studies with a low risk of exposure assessment bias compared with studies with a high risk of exposure assessment bias (Table S5). The ORs for the probability of being highly sleep disturbed, stratified by studies performed in Europe and outside of Europe, are given in Table S6. Analyses were restricted to aircraft, road, and railway traffic when the question mentioned noise, plus aircraft traffic when noise was not specifically mentioned, because these were the outcomes where sleep disturbance data were available for both locations. Non-European study respondents were more highly sleep disturbed by railway traffic when noise was mentioned in the question and by aircraft traffic when noise was not specifically mentioned. Non-Europeans were also less disturbed by road traffic when noise was mentioned. However, none of these effects were significant.

Exposure-Response Curves: Questions Specifically Mentioning Noise
The exposure-response curves for the probability of being highly sleep disturbed, derived using data from questions that specifically mentioned noise, are given in Figure 6. Second-order polynomial equations for each curve are given in Table S7. Disturbance was substantially higher for aircraft noise for all three disturbance questions than for road or railway noise of the same level. Disturbance was similar for road and rail noise at low noise levels, and it was slightly higher for railway noise than road noise at higher noise levels.
We compared the updated exposure-response curves to curves derived using only the 11 new studies published since the WHO review 19 (Figure 7). This was done for the combined estimate only, given that there was a limited sample size for certain sleep questions in these recent studies. For aircraft noise, the recent studies indicated a higher probability of being highly sleep disturbed compared with the analysis incorporating all available data. For road traffic noise, the point estimates were slightly higher at the highest noise levels in the recent studies compared with the overall analysis (2.6% higher at 65 dB L night ). For railway noise, the recent studies were essentially identical to the overall analysis.

Figure 4. Forest plot for the odds of being highly sleep disturbed by road noise per 10-dB increase in L night (combined estimate derived from all relevant outcomes within studies). Subgroups are presented for questions that mentioned noise as the source of the disturbance, and questions that did not specify noise as the source of the disturbance. Risk of bias: A: selection bias; B: exposure assessment; C: confounding; D: reporting bias. Green (+) denotes low risk of bias, red (-) denotes high risk of bias, yellow (?) denotes unclear risk of bias. Plots were generated using an inverse-variance (IV) random effects method across the full noise range for each individual study (not restricted to 40-65 dB L night ). Note: CI, confidence interval; df, degrees of freedom; L night , nighttime noise; NORAH, Noise-Related Annoyance, Cognition and Health.
The exposure-response curves calculated in the original WHO review 19 are given in Figure 6. Relationships for the sleep disturbance question were not calculated in the WHO review due to an insufficient number of studies at the time. Point estimates for aircraft noise are generally slightly higher in the present analyses compared with the previous relationships, particularly at higher noise levels, although they still lie within the 95% CIs of the WHO review. Point estimates for the falling asleep and combined estimate outcomes are almost identical for road and rail traffic in the present analysis compared with the WHO review. For each disturbance question and traffic mode, all of the previous curves lie within the 95% CIs of the updated analyses. As expected, given that no additional studies were included for awakenings by aircraft or road traffic, exposure-response curves for these outcomes were identical to curves in the WHO review.
Figure 5. Forest plot for the odds of being highly sleep disturbed by railway noise per 10-dB increase in Lnight (combined estimate derived from all relevant outcomes within studies). Subgroups are presented for questions that mentioned noise as the source of the disturbance, and questions that did not specify noise as the source of the disturbance. Risk of bias: A: selection bias; B: exposure assessment; C: confounding; D: reporting bias. Green (+) denotes low risk of bias, red (-) denotes high risk of bias, yellow (?) denotes unclear risk of bias. Plots were generated using an inverse-variance (IV) random effects method across the full noise range for each individual study (not restricted to 40-65 dB Lnight). Note: CI, confidence interval; Lnight, nighttime noise; NORAH, Noise-Related Annoyance, Cognition and Health.
Note: ORs were calculated in logistic regression models with Lnight included as the only fixed effect and study included as a random effect, restricted to the noise exposure range 40-65 dB Lnight. Models were run separately for each traffic mode and for sleep questionnaire outcomes that did or did not mention noise. The combined estimate was calculated using average responses of the awakening, falling asleep, and sleep disturbance questions within studies. CI, confidence interval; Lnight, nighttime noise; OR, odds ratio. In the Lnight range 40-65 dB for which ORs were calculated.
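The forest-plot captions above state that study-level odds ratios were pooled with an inverse-variance (IV) random effects method. The following is a minimal sketch of how such pooling can work, not the authors' implementation. It assumes the DerSimonian-Laird estimator for the between-study variance (the source does not name the estimator), and the function name `pool_random_effects` and the example inputs are hypothetical.

```python
import math


def pool_random_effects(log_ors, std_errs):
    """Pool study-level log odds ratios with inverse-variance weights.

    The between-study variance (tau^2) uses the DerSimonian-Laird
    estimator. That choice is an assumption of this sketch; the source
    only says "inverse-variance (IV) random effects method".
    """
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / se ** 2 for se in std_errs]
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)

    # Cochran's Q and the DerSimonian-Laird tau^2 (truncated at zero)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errs]
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    lower = math.exp(pooled - 1.96 * se_pooled)
    upper = math.exp(pooled + 1.96 * se_pooled)
    return {"or": math.exp(pooled), "ci": (lower, upper),
            "tau2": tau2, "q": q, "df": df}


# Hypothetical study-level ORs per 10 dB and standard errors of log(OR);
# these are illustrative values, not data from the included studies.
example = pool_random_effects(
    [math.log(2.0), math.log(2.6), math.log(3.1)],
    [0.10, 0.15, 0.20],
)
print(example)
```

When the studies agree closely, tau^2 is truncated to zero and the result reduces to the fixed-effect estimate; larger heterogeneity widens the pooled confidence interval and evens out the study weights.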

Exposure-Response Curves: Questions Not Specifically Mentioning Noise
The exposure-response curves for the probability of being highly sleep disturbed, derived using data from general sleep questions that did not specifically mention noise, are given in Figure 8. Second-order polynomial equations for each curve are given in Table S7. With increasing Lnight, there was a small increase in disturbance for all questions, although the gradient of the exposure-response curves was generally smaller compared with questions that mentioned noise (Figure 6). The differences between the three traffic modes were also less clear compared with questions mentioning noise (Figure 6).

Quality of Evidence for Being Highly Sleep Disturbed by Noise
Funnel plots of the combined estimate for each traffic mode are given in Figure S2. The plots were approximately symmetrical, indicating a low likelihood of publication bias. The GRADE profile for the assessment of the quality of evidence across studies is given in Table 6. In the assessment, we deemed that for the majority of studies to be considered high quality (study limitations domain), there should be a low risk of selection bias and also a low risk of exposure assessment bias. If there was a high risk for one or both of these biases in the majority of studies, then overall study quality was deemed low. The overall quality of evidence for nighttime noise from aircraft, road, and railway traffic was rated as moderate when the question mentioned noise. When the question did not mention noise, the quality of evidence was low for aircraft and road traffic noise and very low for railway noise.

Noise-Specific Sleep Disturbance
In an update to the latest WHO evidence review and meta-analysis for the effects of traffic noise on self-reported sleep disturbance, 19 we found significant exposure-response relationships for being highly sleep disturbed by nighttime aircraft, road, and railway traffic when the sleep questions explicitly mentioned noise. With increasing nighttime noise levels, and for all three traffic modes, there were increased probabilities of reporting awakenings, having difficulties falling asleep, or having disrupted or disturbed sleep. When the sleep disturbance outcomes were combined for each traffic mode separately, the resulting exposure-response curves for road and railway noise were very similar to those calculated in the WHO review (Figure 6). The similarity in the exposure-response curves improves confidence in the earlier estimates, which informed recent WHO recommendations for nighttime noise from road (45 dB Lnight) and rail (44 dB Lnight). 15 For aircraft noise, our updated estimates show a higher probability of being highly sleep disturbed at high Lnight levels. At 40 dB Lnight, however, which is the WHO recommendation for nighttime aircraft noise, 15 our updated estimates closely match the point estimates from the previous evidence review. 19 The ORs for aircraft noise were lower than for both road and railway noise. This is a consequence of the properties of ORs as a relative measure, given that a much higher proportion of people were sleep disturbed by aircraft noise at low reference noise levels. The exposure-response curves show that aircraft noise was in fact more disturbing than road or rail noise of the same level. This finding, although also seen in the original WHO review, 19 is superficially surprising in light of experimental studies showing that aircraft noise is less disruptive to physiological sleep than road or rail traffic. 67
The reasons for higher self-reported disturbance by aircraft are unclear but could result from the timing of aircraft noise events. Nighttime noise levels from aircraft are typically dominated by passenger plane takeoffs and landings that occur at the very start and the very end of the night period (2300-0700 hours). The early night is a period when many people are trying to fall asleep, and the end of the night is a period when people may be awakened by noise more easily, or have greater difficulty falling back asleep after awakenings, because sleep pressure has been dissipated over the preceding night. Noise around these times could therefore have a greater impact on self-reported disturbance than at other times of night. Such an explanation is supported by the higher disturbance for specific questions on awakenings and difficulties falling asleep owing to aircraft noise.
It is also possible that the higher disturbance by aircraft is a result of exposure misclassification. In most studies, noise was assessed at the most exposed façade, and the exposure levels specifically in the bedroom are not known. Noise levels in the bedroom for road and railway traffic are most likely lower, on average, than at the most exposed façade, because bedrooms may be located on quieter sides of the building. There is probably less exposure misclassification for aircraft noise, especially for homes that lie under flight paths, given that the positions of aircraft as noise sources are more dynamic relative to the home. Finally, it is possible that particular characteristics of air traffic are somehow more disturbing than road or rail noise of the same level. Aircraft noise events have a much longer duration than the other traffic modes, and so there are longer windows to become cognizant of the noise and attribute it as a source of sleep disturbance. However, each of these explanations cannot be thoroughly explored without additional temporal, spatial, and acoustical data for the noise sources.
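The point made earlier in this section, that a lower OR can coexist with higher absolute disturbance, can be illustrated with a short calculation. This is a sketch under stated assumptions: the ORs per 10 dB are the pooled values reported in the abstract for questions mentioning noise, whereas the baseline probabilities at 40 dB Lnight and the helper name `prob_after_increase` are hypothetical and chosen only for illustration.

```python
def prob_after_increase(p_ref, or_per_10db, delta_db):
    """Probability of high sleep disturbance after a noise increase.

    p_ref is the probability at the reference level, or_per_10db is the
    odds ratio per 10-dB increase, and delta_db is the increase in dB.
    """
    odds_ref = p_ref / (1.0 - p_ref)
    odds_new = odds_ref * or_per_10db ** (delta_db / 10.0)
    return odds_new / (1.0 + odds_new)


# ORs per 10 dB: pooled values from the abstract (noise-specific questions).
# Baselines at 40 dB Lnight: hypothetical, not taken from the study.
aircraft = prob_after_increase(p_ref=0.10, or_per_10db=2.18, delta_db=20)
road = prob_after_increase(p_ref=0.03, or_per_10db=2.52, delta_db=20)
print(f"aircraft at 60 dB: {aircraft:.1%}, road at 60 dB: {road:.1%}")
```

With a higher assumed baseline for aircraft noise, the absolute probability at 60 dB remains higher for aircraft than for road noise even though the aircraft OR is the smaller of the two.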

Non-Noise-Specific Sleep Disturbance
The probability of being highly sleep disturbed was less clear when studies used general sleep questions that did not mention noise. For those sleep outcomes, all ORs were in the same direction and >1.0, suggesting potentially increasing disturbance with noise level. However, the effect sizes were smaller compared with noise-specific questions, and they were significant for only a minority of outcomes (5 of 12) assessed across all traffic modes.
Differences in sleep disturbance between studies employing general sleep questions and studies that specifically mention noise could result from heterogeneity between studies generally, which is discussed in detail later. When a question mentions a particular traffic source, a respondent may be better able to correctly attribute noise-induced sleep disturbance to that source, which could also explain the higher effect sizes in studies mentioning noise. Misattributing noise as the reason for an endogenous sleep disturbance is also possible, for instance, if respondents awaken spontaneously in the absence of noise, and a noise event that is later recalled coincidentally occurs during the awakening bout. A further important effect modifier could be noise sensitivity. Because noise-sensitive individuals may be more likely to report sleep disturbance than their less-sensitive counterparts, 68-70 they might rate themselves as more sleep disturbed in response to questions explicitly mentioning noise.

Risk of Bias, Quality of Evidence, and Study Heterogeneity
Most newly included studies were rated as having a high risk of selection bias. In most cases, this was due to response rates being <50%. Low survey response rates in public health research are becoming increasingly common, 71 something that can increase the risk of nonresponse bias. 72 However, nonresponse bias can occur in studies with both low and high response rates. 73 More important than response rates is that the survey responses are representative of the target population sampled, 74 and surveys can still be representative even with lower response rates.
Lacking nonresponse analyses, we cannot be certain of the representativeness of the exposure-response relationships, although the high risk of selection bias in the included studies does not necessarily mean that the sleep outcomes are unrepresentative of the overall population exposed to noise. Further studies with increased response rates would decrease the likelihood of nonresponse bias.
Figure 6. Probability of being highly sleep disturbed (%HSD) by nighttime noise, determined via questions that mention noise as the source of disturbance, stratified by disturbance question and traffic mode. Exposure-response relationships were derived using all available data, from the original WHO review 19 and the 11 newly identified studies. Results of the present updated analysis (solid purple lines with dotted 95% CIs) are compared against results of the 2018 WHO review 19 (dashed orange lines with shaded 95% CIs). Relationships for the sleep disturbance questions were not calculated previously. Asterisks (*) indicate sleep outcomes for which no new studies have been published since the WHO review. Parameter estimates were calculated in logistic regression models with Lnight included as the only fixed effect and study included as a random effect, restricted to the noise exposure range 40-65 dB Lnight. Models were run separately for each traffic mode and disturbance question. The combined estimate was calculated using average responses of the awakening, falling asleep, and sleep disturbance questions within studies. Note: CI, confidence interval; Lnight, nighttime noise; WHO, World Health Organization.
Sensitivity analysis revealed that sleep disturbance was lower in studies with a high risk of exposure assessment bias. One possible explanation is that road and railway noise exposure in the bedroom was overestimated in studies judged to have a high risk of bias. This would, in effect, shift the exposure-response relationships to the right in these studies. Alternatively, differences in sleep disturbance could be confounded by the fact that all studies with high risk of exposure assessment bias were published between 2002 and 2010, whereas the low risk of bias studies were published more recently, between 2013 and 2021. It is plausible that the higher probability of high sleep disturbance in newer studies is attributable to nonacoustical factors, such as changes in attitudes to noise. Temporal changes in self-reported response would align with observed trends for increasing annoyance by a given level of traffic noise, although these trends have been observed predominantly for aircraft rather than road or rail traffic. 75 There have also been changes in the acoustical character of noise, with newer vehicles being typically quieter but with noise occurring more often as traffic flows increase, which may negatively influence perceived sleep disturbance.
The overall quality of evidence differed between studies where sleep disturbance questions did or did not mention noise. The assessment of a moderate quality of evidence for sleep disturbance when the question mentioned noise agrees with the assessment in the WHO review. 19 When the question did not specifically mention noise, we graded the quality of evidence for exposure to railway noise as very low, again agreeing with the WHO review, and the quality of evidence as low for aircraft and road traffic noise, which is one level higher than the very low quality assessment in the WHO review. The upgrade for aircraft and road noise was due to the statistically significant trends for awakenings (road only), falling asleep, and the combined estimates that were not found previously. Since the previous review, three major cross-sectional studies involving road traffic noise exposure, with a combined sample size of approximately 29,000 respondents, were published. 34,37,47 The exposure-response relationships for non-noise-dependent disturbance are thus more representative, and with substantially greater power, than previously found.
There was substantial heterogeneity between studies for all outcomes except studies of railway noise that employed general sleep questionnaires. The heterogeneity could result from variations in the specific phrasing of the sleep disturbance question across studies, even when ostensibly measuring the same outcome. There was also a diverse range of response scales, with 11-point numerical and 3-, 4-, or 5-point verbal scales used to assess sleep disturbance, further diversified by assessing either the severity or the frequency of disturbance. These questions were administered in 14 nations; hence, there may be linguistic differences in the interpretation of certain phrases, cultural differences in attitudes to sleep or noise, and contextual differences generally across specific studies. Questions also differed in the reference time frame for sleep disturbance, varying from the last 12 months to the last 4 weeks to referencing specific noise events or no time frame at all. Finally, self-reported response to noise can be modified by contextual factors separate from noise level alone, including lifestyle, access to green space, access to quiet areas, social interaction, recreational activities, and local economy of the neighbourhood. 76 One or several of these factors could have contributed to study heterogeneity within specific sleep outcomes, across studies of different traffic modes, or across studies that used either general sleep questions or noise-specific disturbance questions.

Study Location
The majority of new studies originated from Europe. All newly included studies of road noise 34 were European, as were the majority of respondents across the studies of aircraft noise. 35,41,42,46 Although there was one study of aircraft noise from Asia, 32 and three from the United States, 20,21,45 these studies were small, with sample sizes ranging from n = 33 to n = 559. European studies continue to be overrepresented (Figure S3). However, we found no statistically significant differences in sleep disturbance between European and non-European studies. On one hand, this suggests that there are, in fact, no differences in response between the two locations, that the degree of sleep disturbance by noise is rather global in nature, and that results of the present analyses are relevant outside of Europe. Conversely, the point estimates were rather different between study locations for several sleep disturbance outcomes. This could indicate underlying cultural differences in attitudes to noise and perceived sleep disturbance that have not been captured in studies to date. Future investigations outside of Europe may uncover relevant international differences, as well as increasing confidence that existing studies are representative of noise-induced sleep disturbance among these underinvestigated regions.
Figure 7. Exposure-response relationships for the probability of being highly sleep disturbed (%HSD) by nighttime noise for questions that mention noise. Curves are shown for the updated analysis that includes all available data (solid purple lines), and for analysis including only newly identified studies published after the WHO review 19 (dashed green lines). Data are calculated as the combined response using average responses of the awakening, falling asleep, and sleep disturbance questions within studies, determined as the within-study average of disturbance questions that explicitly mentioned noise as the source of sleep disturbance. Parameter estimates were calculated in logistic regression models with Lnight included as the only fixed effect and study included as a random effect, restricted to the noise exposure range 40-65 dB Lnight. Models were run separately for each traffic mode. Note: Lnight, nighttime noise; WHO, World Health Organization.
Figure 8. Probability of being highly sleep disturbed (%HSD) by nighttime noise, determined via questions that did not specifically mention noise as the source of disturbance, stratified by disturbance question and traffic mode. Exposure-response relationships were derived using all available data, from the original WHO review 19 and the 11 newly identified studies. Dotted lines indicate 95% CIs. Parameter estimates were calculated in logistic regression models with Lnight included as the only fixed effect and study included as a random effect, restricted to the noise exposure range 40-65 dB Lnight. Models were run separately for each traffic mode and disturbance question. The combined estimate was calculated using average responses of the awakening, falling asleep, and sleep disturbance questions within studies. Note: CI, confidence interval; Lnight, nighttime noise; WHO, World Health Organization.

Considerations on Self-Reported Sleep Disturbance
Our overall findings of self-reported disturbance by noise should be treated with some caution when considering noise-induced effects on sleep. Sleep is, by its nature, an unconscious process, meaning that its subjective evaluation is difficult. Accordingly, there can be substantial differences between self-reported and physiologically derived measures of sleep and noise-induced sleep disturbance. 77-79 Self-report may also suffer from recall bias, particularly when questions relate to the preceding 12 months, as was typical for questions on sleep disturbance in most studies included in our meta-analysis. It is likely that responses to questions on these timescales are driven by noise exposure in the more recent past. However, self-reported sleep outcomes are methodologically convenient and inexpensive to implement in field studies, meaning that we could perform the meta-analysis with a number of studies and sample size that would not have been possible if focusing on physiologic outcomes. As such, we have higher confidence in the accuracy and representativeness of the analysis. A further advantage is that self-reported disturbance is a valuable end point per se, considered by the WHO as a primary health outcome. By focusing our analysis on these outcomes, the results may be useful in future estimates of the disease burden of environmental noise 80 and recommendations for nighttime noise limits, 15 both of which derive from self-reported sleep disturbance. Finally, self-reported outcomes capture habitual sleep quality and disturbance, unlike physiologic measurements that capture only acute effects within single nights. It does, however, remain unclear how long-term self-reported sleep disturbance by noise relates to overall health.
Future large-scale field studies with objective measurements of noise and sleep can offer mechanistic insights linking nocturnal noise, sleep disruption, and epidemiological observations of the development of cardiovascular and metabolic disease associated with exposure to environmental noise in addition to the derivation of exposure-response relationships. 81 A better understanding of the underlying pathophysiological pathways is especially valuable when considering vulnerable populations who may be at increased risk of disturbance. These vulnerable groups include the elderly, who can suffer from age-related declines in sleep quantity and quality 82; populations who may have already poor sleep quality, such as people with mental health or sleep disorders 83; and populations with obesity, who are at increased risk of suffering from obstructive sleep apnea, as well as having increased risk for cardiometabolic diseases generally. 84,85 Infants, children, and adolescents can also be considered as vulnerable groups because of the importance of sleep of sufficient quality and duration for development. 80,86,87

Limitations
Data could not be obtained for two studies that were initially deemed to be eligible for inclusion. It is unlikely that including the study of road traffic noise 44 would have substantially altered the updated relationships because the sample size was low (n = 225) compared with the overall sample size for all road traffic studies (n = 31,738). Including the study of aircraft noise, 43 however, may have altered the sleep outcomes where noise was not mentioned for falling asleep, sleep disturbance, and the combined estimate. Compared with sample sizes of n = 4,379 for questions on falling asleep and just n = 195 for sleep disturbance questions that were included in our analysis, the omitted study had a sample size of n = 2,831, which would have reflected a substantial proportion of the total data set.
The change in effect size that would have resulted from including this study is unclear because the relevant sleep-disturbance questions were single items that formed only part of the insomnia severity index (ISI). Because only overall results from the ISI were published, we do not know whether the relevant items were related to noise exposure, or to what extent.
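To make the scale of the two omitted studies concrete, the sample sizes quoted above can be turned into the share of the pooled data each study would have contributed had it been included. This is only an illustrative calculation: defining the share as omitted / (omitted + included) is an assumption of this sketch, and the helper name `share_if_included` is hypothetical.

```python
def share_if_included(omitted_n, included_n):
    """Fraction of the pooled sample an omitted study would represent."""
    return omitted_n / (omitted_n + included_n)


# Sample sizes are those reported in the Limitations text.
road_share = share_if_included(225, 31_738)
aircraft_falling_asleep = share_if_included(2_831, 4_379)
aircraft_sleep_disturbance = share_if_included(2_831, 195)

print(f"road study: {road_share:.1%}")
print(f"aircraft study, falling asleep: {aircraft_falling_asleep:.1%}")
print(f"aircraft study, sleep disturbance: {aircraft_sleep_disturbance:.1%}")
```

Under this definition the omitted road study would account for under 1% of the road traffic data, whereas the omitted aircraft study would account for more than a third of the falling asleep responses and the large majority of the sleep disturbance responses where noise was not mentioned.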
A limitation of the meta-analysis was that many studies modeled noise exposure at the most exposed façade of the residence, and thus noise levels specifically at the bedroom façade are unknown. This means there is probably some exposure misclassification, with lower noise levels if the bedroom faces away from the noise source. This is more likely for road and railway noise than aircraft noise, with the latter source being less fixed in position relative to the bedroom. Using noise levels at the bedroom façade would, in effect, shift the exposure-response curves to the left, leading to an increased probability of disturbance at lower noise levels, given that noise levels at the bedrooms are, on average, probably lower than if all bedrooms were positioned at the most exposed façade. This was supported by two studies in the meta-analysis that found that a lower proportion of respondents were highly sleep disturbed by road traffic noise 46 or reported insomnia symptoms 37 when the bedroom faced away from the street. Furthermore, disturbance was lower when the difference in noise level between the bedroom and the most exposed façade was greater. 46
A second limitation of the meta-analysis is that we did not adjust for potentially relevant effect modifiers. We adopted this approach so that results would be directly comparable to those in the WHO review, which also did not include such adjustments. 19 Sleep, and its disturbance by noise, may differ depending on age, sex, socioeconomic status, and preexisting sleep disorders. Further, sleep disturbance is not unique to noise exposure and may arise from other environmental stressors, including air pollution, 88-90 vibration (from, for instance, freight trains on railway lines), 91 light, 92 and temperature and humidity. 93,94 Future studies should consider the consequences of exposure to multiple stressors, and their interactions, on sleep.

Summary of Evidence
Our main objective was to update the WHO meta-analysis on sleep disturbance by traffic noise with evidence published after 2015. 19 The main findings and quality of evidence are summarized in Table 7. There was a significant probability of being highly sleep disturbed by nocturnal noise from aircraft, road, and railway noise when the disturbance question mentioned noise, and the quality of evidence for these outcomes was moderate. Exposure-response curves were similar to the WHO review for road and railway noise in our updated analysis, and we found an increased probability of being highly sleep disturbed by aircraft noise at high noise levels. Because of the number of studies published since 2015, for the first time, we were able to generate exposure-response relationships for sleep outcomes that did not explicitly mention noise. Point estimates for these outcomes were smaller than questions mentioning noise, and were often not statistically significant, and the quality of evidence was graded lower, from low to very low. Our findings do not suggest that the recent WHO recommendations for nighttime noise need to be revisited, 15 although quantitative assessments of sleep disturbance by aircraft noise at high exposure levels should consider the implications of our analysis.
We did not find significant indications of international differences in sleep disturbance by noise, but future large-scale studies in non-European nations may necessitate a reevaluation of the evidence.
Note: ORs were calculated in logistic regression models with Lnight included as the only fixed effect and study included as a random effect, restricted to the noise exposure range 40-65 dB Lnight. Models were run separately for each traffic mode and for sleep questionnaire outcomes that did or did not mention noise. Data shown are for the combined estimates calculated using average responses of the awakening, falling asleep, and sleep disturbance questions within studies. Lnight, nighttime noise; OR, odds ratio. a In the Lnight range 40-65 dB for which ORs were calculated.