Proximity to livestock farms and COVID-19 in the Netherlands, 2020–2021

Objectives In the Netherlands, during the first phase of the COVID-19 epidemic, the hotspot of COVID-19 overlapped with the country's main livestock area, while in subsequent phases this distinct spatial pattern disappeared. Previous studies show that living near livestock farms influence human respiratory health and immunological responses. This study aimed to explore whether proximity to livestock was associated with SARS-CoV-2 infection. Methods The study population was the population of the Netherlands excluding the very strongly urbanised areas and border areas, on January 1, 2019 (12, 628, 244 individuals). The cases are the individuals reported with a laboratory-confirmed positive SARS-CoV-2 test with onset before January 1, 2022 (2, 223, 692 individuals). For each individual, we calculated distance to nearest livestock farm (cattle, goat, sheep, pig, poultry, horse, rabbit, mink). The associations between residential (6-digit postal-code) distance to the nearest livestock farm and individuals' SARS-CoV-2 status was studied with multilevel logistic regression models. Models were adjusted for individuals' age categories, the social status of the postal code area, particulate matter (PM10)- and nitrogen dioxide (NO2)-concentrations. We analysed data for the entire period and population as well as separately for eight time periods (Jan–Mar, Apr–Jun, Jul–Sep and Oct–Dec in 2020 and 2021), four geographic areas of the Netherlands (north, east, west and south), and for five age categories (0–14, 15–24, 25–44, 45–64 and > 65 years). Results Over the period 2020–2021, individuals' SARS-CoV-2 status was associated with living closer to livestock farms. This association increased from an Odds Ratio (OR) of 1.01 (95% Confidence Interval [CI] 1.01–1.02) for patients living at a distance of 751–1000 m to a farm to an OR of 1.04 (95% CI 1.04–1.04), 1.07 (95% CI 1.06–1.07) and 1.11 (95% CI 1.10–1.12) for patients living in the more proximate 501–750 m, 251–500m and 0–250 m zones around farms, all relative to patients living further than 1000 m around farms. This association was observed in three out of four quarters of the year in both 2020 and 2021, and in all studied geographic areas and age groups. Conclusions In this exploratory study with individual SARS-CoV-2 notification data and high-resolution spatial data associations were found between living near livestock farms and individuals' SARS-CoV-2 status in the Netherlands. Verification of the results in other countries is warranted, as well as investigations into possible underlying exposures and mechanisms.


Introduction
The first case of COVID-19 in the Netherlands, early 2020, was living in the province of Noord-Brabant in the south of the Netherlands. In the subsequent weeks, it became apparent that COVID-19 incidence remained elevated and largely concentrated in the southeast of the Netherlands (Fig. S1 panel A). The early COVID-19 hotspots were in part explained by the multiple introduction events involving infected persons who had returned from February holidays spent in northern Italy and Austria, with increased spread by intensive local Carnival celebrations at the end of February. The southeast of the Netherlands is however a region with a high density of livestock farms, including pig, poultry, mink, cattle, goats and others (Fig. 1, Fig. S1 panel C,D), and had also been the epicentre of the large goat-related Q fever pneumonia epidemic, running from 2007 to 2010 ( Fig. S1 panel B). This triggered societal discussion about a possible relation between COVID-19 and livestock farming.
After the first epidemic wave, and when testing upon symptoms became available to the general public on June 1, 2020, the initial hotspot in the southeast of the Netherlands was no longer visible. The changing geographical focal points for SARS-CoV-2 over different periods illustrates the complexity of studying environmental risk factors for an infectious disease that spreads through human-to-human transmission in human networks, driven by behaviour and other human factors. The changing spatial transmission patterns also show that if exposure to livestock plays a role, it will most likely play a minor role relative to other factors driving the transmission. However, questions on a possible association between exposure to livestock and COVID-19 remain, and plausible mechanisms do exist. For instance, previous studies learnt that residential proximity to livestock farms and exposure to (parts of) micro-organisms, endotoxins, and ammonia emitted from livestock farms, was associated with various positive and negative health effects, including modulated immune responses, increased risk of pneumonia (Kalkowska et al., 2018;Klous et al., 2018;Post et al., 2019), reduced lung function (Borlée et al., 2017), mortality from respiratory diseases (Simões et al., 2022), and a lower prevalence of asthma and COPD (Borlée et al., 2015;de Rooij et al., 2019;Post et al., 2021;Smit et al., 2014). Although the mechanisms underlying these associations are not yet fully understood, possibly, similar exposures and mechanisms also influence the probability to acquire SARS-CoV-2, or the probability to develop symptoms and therefore to be tested.
In this exploratory study, we investigated whether residential proximity to livestock farms was associated with individuals' SARS-CoV-2 Fig. 1. Nine-Panel plot illustrating the spatial patterns of home address distances to the livestock farm types included in this study. status in the period 2020-2021, and whether results were consistent across geographic regions, time periods, and age categories.

Study population
Our study was based on data on the Dutch population on January 1, 2019 and notified SARS-CoV-2 infected patients with an estimated symptom onset during 2020-2021.

Patients
Laboratory-confirmed SARS-CoV-2 is mandatory notifiable in the Netherlands. We included all notified patients with disease onset, a positive test result, or a notification date before January 1, 2022 in this study. For this, on February 4, 2022, data were extracted from the national database at the National Institute for Public Health and the Environment, to which all 25 Public Health Services in the Netherlands report the laboratory-confirmed SARS-CoV-2 cases. Patient data included age, self-reported date of disease symptoms onset and the sixdigit postal code of the residential address. Six-digit postal-code areas comprised on average about eighteen residential addresses. In case date of disease onset was not registered, the date of the laboratory test result or else the date of notification to the Public Health Service was used as proxy. Patients with disease onset in 2022 or for whom the database lacked information on postal code or age were excluded.

General population
The study population was based on the digital population on January 1, 2019 of 436,748 six-digit postal-code areas from all 355 municipalities of the Netherlands. To be able to compare patients to the general population taking into account age, a synthetic study population per sixdigit postal-code area was constructed with a spatially-representative age distribution. This synthetic population was created, because individual-level data including age and address is not publicly available due to privacy issues. The synthetic population consisted of an attributed number of individuals per six-digit postal code for five age-groups (0-14, 15-24, 25-44, 45-64 years, and 65 years and older). These numbers had to be attributed based on neighborhood data, which is a part of a municipality that is seen as homogeneously based on historical or urban planning characteristics, and the smallest area for which the age distribution is publicly available. Attribution was done using Hamilton's method (Kohler and Zeh, 2012). Population statistics were retrieved from Statistics Netherlands, which annually provides publicly available statistical data on municipalities, districts, and neighbourhoods (Prins, 2000). The Netherlands had 13,379 neighbourhoods on January 1, 2019 with an average population of about 1,300 inhabitants.
To combine the patient populations with the total synthetic population, the number of notified patients per combination of six-digit postal code and age group were removed from the total synthetic number of inhabitants to obtain the number of non-patients per combination of six-digit postal code and age group. In case this procedure led to a negative number of non-patients for the combination, the number of non-patients was set to zero.

Distance to livestock farms
The distances of the centroid of address locations per six-digit postalcode area to the nearest livestock farm were calculated with ArcGIS 10.6 (ESRI [Environmental Systems Research Institute], 2011), based on information about locations of farms according to the national agricultural census of April 1, 2018 (re: horses, pigs, and poultry); the identification and registration data of July 1, 2019 (re: cattle, goats, and sheep); a list of active farm locations from the Netherlands Food and Consumer Product Safety Authority from June 15, 2020 (re: mink and rabbits). Only farms with a minimum number of animals were included, as in previous studies (e.g. Borlée et al., 2015;Post et al., 2021;Smit et al., 2014): cattle farms (at least 5 animals), pig farms (≥25), poultry farms (≥200), goat farms (≥50), sheep farms (≥50), horse farms (≥20), rabbit farms (≥200), and mink farms (≥200). The distances to livestock farms of any type were based on the minimum Euclidean distance of the six-digit postal-code centroid to the closest farms of any type.
Based on the distances to various types of livestock farms, we defined exposure bands of 0-250, 251-500, 501-750 and 751-1000 m, with >1000 m as reference category, for livestock farms of any type. Farm types were each included in separate analyses. No analyses were performed including multiple farm types at once, due to expected multicollinearity issues, and because only a small selected population was expected to live within 10 km of each of the farm types.

Air pollution
Since ambient air pollution was indicated as a possible risk factor for COVID-19, we included air pollutants with the known largest health impact in the Netherland: particulate matter (PM 10 ) and nitrogen dioxide (NO 2 ). The modelled annual concentration of PM 10 and NO 2 for 2019 was assessed for each six-digit postal code by linking maps yielding 1 × 1 km 2 grids of the concentrations to all residential addresses in the Netherlands on January 1, 2019, then averaging the concentrations per six-digit postal-code area. These were included as continuous variables in the statistical models. The PM 10 and NO 2 concentrations were calculated with the Operational Priority Substances (OPS) dispersion model, which takes into account dispersion, transport, chemical conversion, deposition, and the meteorological conditions in 2019 (Sauter et al., 2018;Van Jaarsveld and De Leeuw, 1993). Source data for this OPS model were the 2018 emissions reported to the Netherlands Pollutant Release and Transfer Register (Wever et al., 2020) and emissions from neighbouring countries (EMEP/CEIP, 2020). NO 2 levels were derived from the modelled NO x concentration using an empirical relationship between measured NO x and NO 2 concentrations (van de Kassteele and Velders, 2006;Velders et al., 2014). The concentrations of particulate matter and NO 2 were calibrated against results from Air Quality Monitoring Networks at 35-45 rural and urban background locations in the Netherlands (number depends on the contaminant). The modelled ambient concentrations represented the average of spatial background concentrations with a resolution of about 1 × 1 km 2 (Velders et al., 2020).

Social status
To adjust for contextual confounding due to social status, we used a social status score at the four-digit postal-code level (on average, 1,987 residential addresses), derived most recently for 2017 by the Netherlands Institute for Social Research. This score is based on income level, unemployment rate, and education level (Knol, 1998), and was standardised with an average of zero and a standard deviation of one. A low social status for a postal code was indicated by a low score. The social status scores were applied to all six-digit postal-codes in the four-digit postal-code area (on average, 109 six-digit postal code areas per four-digit area) to be included as continuous variable in the statistical models.

Urbanisation
The degrees of urbanisation of the six-digit postal-code areas were based on statistical data for 2018 from Statistics Netherlands pertaining to 500 x 500-m squares. This indicator was based on the average address density within a radius of 1 km divided into five categories: very strongly urbanised (≥2500 addresses per km 2 ), highly urbanised (1500-2499 addresses per km 2 ), moderately urbanised (1000-1499 addresses per km 2 ), low-urbanised (500-999 addresses per km 2 ) to nonurban (<500 addresses per km 2 ).

Excluded postal code areas
The populations living in very strongly urbanised areas (≥2500 addresses per km 2 ) were excluded from the statistical analyses because the population in these areas tends not to live in proximity to livestock farms, and to exclude risk factors associated with living in these areas. The populations living in six-digit postal codes known to include a nursing home were excluded because of possible data quality issues with regard to the number of cases and non-cases in nursing homes. Locations of livestock farms in Belgium and Germany were not known, therefore, persons living in six-digit postal codes within two km of the border with Belgium or Germany were excluded.

Statistical analyses 2.4.1. Models
To estimate associations (odds ratio OR and 95% confidence interval CI) between SARS-CoV-2 status and distance to livestock farms, we applied logistic regression models using a random effect for the regional catchment areas of the 25 Public Health Services in the Netherlands. Included covariates were age category, social status score, and air pollution (PM 10 and NO 2 ).
Data management was carried out in Stata version 16 (StataCorp, 2019) and in R version 4.0.1 (R Core Team, 2022). Statistical analyses were performed with R, using glmer function of the lme4 package for the multilevel regression analyses (Bates et al., 2017).

Stratifications by epidemic phases, geographic regions and age categories
In addition to analyses of cumulative positive SARS-Cov-2 tests for the entire study period in all studied postal code areas, subsets of the data were explored. This was done to assess whether obtained results were robust over space, time, age groups, and urbanicity levels, and not mainly driven by differences in testing behaviour or virus exposure due to e.g. changing testing policies, triage in hospitals, virus variants, immunity build-up, and the levels of community transmission at the time when social distancing measures were implemented.
Separate analyses were performed for each of eight phases: four quarters of each studied year (January-March, April-June, July-September and October-December in 2020 and in 2021). Each individual notified with a positive SARS-CoV-2 test was assigned to a phase based on the date of symptom onset. In the prior phases they were treated as population control while in subsequent phases they were treated as immune and therefore excluded.
And separate analyses were performed for five age classes (0-14, 15-24, 25-44, 45-64 years, and 65 years and older), the four geographic regions in The Netherlands (North, East, West and South), according to the NUTS (Nomenclature of territorial units for statistics) level 1 classification in the European Union as depicted in Fig. S2.

Sensitivity analyses
To assess as sensitivity analysis whether inclusion of the covariates affected the outcomes, we performed the analyses without all covariates, without the age categories, without the social score, without the air pollution variables, and with only age as covariate. To assess if nonlinear relations between SARS-CoV-2 status and covariates may have affected the outcomes, we performed three sensitivity analyses using categories based on quintiles for social scores, PM 10 and NO 2 concentrations. Two additional sensitivity analyses were performed to evaluate whether the removal of border areas and areas with a nursing home influenced the results. Separate analyses were performed for the four degrees of urbanisation (highly urbanised, moderately urbanised, low-urbanised, and non-urban) and for the very strongly urbanised areas that had been excluded from the main study population. An additional analysis was limited to the more rural areas by excluding the highly urbanised areas. To assess if the lack of widely available testing during the first two quarters of 2020 affected the results, we performed the analyses for the period July 2020-December 2021.
Finally, to assess exploratory whether obtained results were driven by particular livestock species, we performed analyses separately per farm type: cattle farms (at least 5 animals), pig farms (≥25), poultry farms (≥200), goat farms (≥50), sheep farms (≥50), horse farms (≥20), rabbit farms (≥200), and mink farms (≥200). For livestock farms of any type and for each farm type, we excluded populations living over 10 km. Next, we defined quintile exposure bands for each farm type, resulting in an equal distribution of the study populations across the distance bands (Table S1). Case data for the entire study period were used, except for analyses on mink farms. Due to SARS-CoV-2 infections in mink farms that were first detected in April 2020, all mink were culled between June and December 2020. Therefore, the analyses on mink farms were performed separately for the first quarter of 2020 (before culling) and for the remaining months of 2020, before mink farming was prohibited in January 2021.

Privacy
Dutch Civil Code allows the use of health records for statistics or research in the field of public health under strict conditions. All data management and statistical analyses were carried out within the National Institute for Public Health and the Environment. COVID-19 is listed as a notifiable disease and SARS-CoV-2 is listed as a notifiable causative agent in law. No informed consent from patients nor approval by a medical ethics committee is obligatory for registry-based health studies of this type.

Population characteristics
As of January 1, 2019, the total Dutch population consisted of 17, 278, 309 inhabitants. The merger of population data with the individual patient data and exclusion of postal code areas with a missing social status and with nursing home presence, the very strongly urbanised areas and the 2 km border zone with Belgium or Germany led to a synthetic study population of 12, 628, 244 individuals with full data on exposure at residential address and potential confounders. The majority of exclusions was due to living in a very strongly urbanised area (87%), followed by border areas (10%).
The cumulative number of notified individuals with a positive SARS-CoV-2 tests on the date of database assessment (Feb 4, 2022) with disease onset in 2020 or 2021 was 3,190,258, of which 3,121,352 were individuals' first infections. Among these, 3,089,123 had available data on age and a valid postal code and could be merged with the population data. After exclusion of areas with a missing social score, presence of a nursing home, very strongly urbanised areas and the 2 km border zone with Belgium or Germany, 2,223,692 cases remained and were included in the study. Fig. S2 provides a map of the included postal code areas. Fig. S3 provides an epicurve of the notified positive SARS-CoV-2 tests in the study population, and Fig. S4 shows how these cases are distributed spatially during the eight phases that we distinguished in this study.
The characteristics of the study population, including the distributions of the distances to the nearest livestock farms, are depicted in Table 1. Over the period 2020-2021, individuals notified with a positive SARS-CoV-2 test were younger than the total study population. Further, notified SARS-CoV-2 cases lived less often in region North, in postalcode areas with on average higher air pollution levels and similar social status scores, and more often in the closer distance bands to livestock in comparison to the total study population. Of the total study population, 38.5% lived within 1 km (Table 1) and all lived within 6.5 km from the closest livestock farm.

Distance to livestock farms
People living close to livestock farms had a higher probability of being notified with SARS-CoV-2. Expressed in ORs, there was a trend from an OR of 1.11 (1.10-1.12) in the 0-250 m distance band to 1.07 (1.06-1.07) in 251-500 m, to 1.04 (1.04-1.04) in 501-750 m, and to 1.01 (1.01-1.02) in the 751-1000 m distance band compared to > 1000 m. Analyses per region, per age group and per phase of the epidemic show similar results, except for the third quarters (July-September) of both studied years (2020 and 2021), the quarters that included periods with the lowest incidences (Table 2).

Sensitivity analyses
Exclusion of covariates, inclusion of covariates as categorical instead of continuous variables, and the inclusion of border areas or areas with a nursing home did not affect the observed associations (Table S2). Also analyses for the period July 1, 2020-December 31, 2021 and analyses excluding of the highly urbanised areas gave similar results. Stratification by degree of urbanisation resulted in similar pattens albeit with lower ORs. Within the very strongly urbanised areas, which were excluded from the study, no associations were seen except one OR slightly below 1 in the 251-500 m distance band.
The use of quintile distance bands for the distance to livestock up to a maximum of 10 km resulted in ORs of 1.09 (1.08-1.09) for the quintile of people closest to farms, to 1.04 (1.03-1.04), 1.01 (1.01-1.02) and 1.00 (1.00-1.01), versus 1 for the reference band (Table 3). Results for cattle, goat, pig, poultry and rabbit farms were similar in size and pattern of the highest ORs for people living the closest to farms with ORs gradually decreasing along the larger distances to farms. For sheep farms, ORs followed a similar pattern but were somewhat lower. Results for mink farms showed a similar pattern, but ORs were different in size for the two studied periods: OR 1.25 (1.15-1.36) for January-March 2020 (prior to the culling of mink) and OR 1.03 (1.02-1.05) for April-December 2020 (during culling until the ban on mink farms) for people living 0-3.55 km from the closest mink farm. Results for horse farms were an exception, with all ORs smaller than 1 and without any pattern along the distances (Table 3).

Discussion
Livestock farm proximity enhanced the probability of individuals to be notified with a positive SARS-CoV-2 test. Results were similar across regions and age groups, and for six out of eight studied time periods. The same association between livestock proximity and SARS-CoV-2 status was observed for the first, second and fourth quarter of 2020 and 2021 but not for the July-September periods, when the incidence was lowest.
Sensitivity analyses per livestock species showed comparable results across farms of any type, cattle, goat, pig, poultry and rabbit farms. For sheep farms, ORs were somewhat lower, possibly related to sheep grazing at locations other than the farm itself. Exposure misclassification is likely to be more limited for other ruminants. Cattle commonly graze during part of the year, but less than sheep, while most goats remain within the farm. For horses, results were not in line with the other species, with ORs below 1. As for sheep, farm location may not be a good proxy for the location of horses. Also, these 'farms' often included horse riding schools, while many other riding school locations were not available to our study, making results for horses hard to interpret. Analyses for mink resulted in higher ORs for the first quarter of 2020, before the culling of mink due to SARS-CoV-2 infections in mink. For the three subsequent periods, during culling and until mink farming in the Netherlands completely stopped by the end of 2020, ORs were reduced. The coherence of results across species suggest that the observed associations are not driven by one particular farm type, despite that in the analyses per species, no adjustments were made for the proximity to other species.
When limiting the population to the more rural areas (<1500 addresses/km 2 ) ORs remained similar. But when stratifying analyses by urbanisation level, ORs lowered in all strata. This could point at residual confounding by urbanisation level. Studies have pointed at population density as a driver of SARS-CoV-2 transmission (e.g. Smith et al., 2021), but also physical and mental health morbidity, social neighborhood factors like cohesion and physical exposures like air pollution are correlated with the degree of urbanisation (Zock et al., 2018). Possibly, address density acts as a proxy for unknown mechanisms that may affect with a nursing home, in very strongly urbanised areas, or within 2 km from the border with Belgium or Germany. b Individuals with a notified positive SARS-CoV-2 test and with estimated symptom onset in 2020 or 2021, excluding those without available data on age and 6-digit postal code area.
individuals' COVID-19 status that are correlated with the level of urbanisation or rurality but are not directly related to livestock. When comparing ORs across regions, ORs are lower in region South, a region that includes areas with high farm densities, and where the initial COVID-19 hotspot occurred that triggered unrest about livestock farming and COVID-19. While this finding should not be over interpreted, as it was primarily intended as part of a sensitivity analyses, this could be a topic for in depth follow up investigations. Future studies could also take into account combined effects of proximity to multiple farms and farm size or other farm characteristics.
Ambient air pollution has been associated with SARS-CoV-2 incidence in the Netherlands (Andree, 2020;Cole et al., 2020) and in several other countries among others Canada (Stieb et al., 2020), USA (Sidell et al., 2022), Italy (De Angelis et al., 2021) and Germany (Prinz and Richter, 2022), mostly in ecological settings. We therefore included ambient PM 10 and NO 2 concentrations as covariates. Livestock production is one of the sources that contributes to air pollution, in particular to particulate matter (PM) concentrations, so the associations Table 2 Odds ratios (95% confidence interval) for categories of distance to nearest livestock farm (0-250, 251-500, 501-750, 751-1000 m, and over 1000 m) for being notified with a positive SARS-CoV-2 test. Results are for the Netherlands and for various subsets (eight quarters, four geographic regions, and five age groups). Excluded from the analyses are residential addresses in areas with a missing social score, with presence of a nursing home, in very strongly urbanised areas, or within 2000 m of the border of Germany or Belgium. p < 0.1, *p < 0.05, **p < 0.01, ***p < 0.001. a Individuals with a notified positive SARS-CoV-2 test and with estimated symptom onset before January 1, 2022, excluding those without available data on age and 6-digit postal code area. b Model with 25 regional catchment areas of Public Health Services as random effect adjusted for age category, social status of the four-digit postal-code area, and annual average concentration of PM 10 and NO 2 in 2019 of the six-digit postal-code area. c Regions according to NUTS 1.

Table 3
Odds ratios (95% confidence interval) for quintiles of distance to nearest livestock farm and for distance to different farm types for being notified with a positive SARS-CoV-2 test. Results are for the Netherlands and excluded from the analyses are residential addresses with a distance of more than 10 km from the respective type of farm, in areas with a missing social score, with presence of a nursing home, in very strongly urbanised areas, or within 2000 m of the border of Germany or Belgium. p < 0.1, *p < 0.05, **p < 0.01, ***p < 0.001. a Individuals with a notified positive SARS-CoV-2 test and with estimated symptom onset before January 1, 2022, excluding those without available data on age and 6digit postal code area. b The upper and lower bounds of the quintiles are specified per type of farm in Table S1. c Model with 25 regional catchment areas of Public Health Services as random effect adjusted for age category, social status of the four-digit postal-code area, and annual average concentration of PM 10 and NO 2 in 2019 of the six-digit postal-code area. d Quarter 1: January-March 2020, before culling of mink; quarter 2-4: April-December 2020, during culling of mink and before mink farming was prohibited.
with distance to livestock farms may have been over-adjusted. However, sensitivity analyses without PM 10 and NO 2 gave similar results (Table S2) suggesting that overadjustment due to general air pollution is not an issue. Using modelled livestock-specific PM concentrations to better disentangle the different PM fractions, and to also account for example for the presence of multiple farms simultaneously as in Post et al. (2021), could be a refinement in follow-up research.
The results of our study are in line with previously reported associations between proximity to livestock farms and various health outcomes, including lower respiratory infections, where multiple livestock species have been implicated (Freidl et al., 2017;Kalkowska et al., 2018;Klous et al., 2018;Post et al., 2019;Poulsen et al., 2018;Simões et al., 2022;Smit et al., 2012). Hypotheses about underlying biological and physical mechanisms have been proposed. For instance, persons living in livestock areas having an enhanced responsiveness to livestock specific particulate matter (PM) including microbial contaminated PM, or Bio-PM triggering innate immune responses, possibly contributing to airway diseases (Liu et al., 2019;Poole and Romberger, 2012;Sahlander et al., 2012). Possibly, a similar mechanism might enhance the risk of SARS-CoV-2 infection (Diamond and Kanneganti, 2022). But this requires further investigation. Also, health conditions associated with exposure to livestock, such as a reduced lung function, may lead to more severe symptoms upon a SARS-CoV-2 infection, and possibly a higher inclination to be tested. Information on hospitalisations and deaths was not included in this study, but can be used in follow up studies. There is speculation that ambient particulate matter could transport virus particles and therefore increase SARS-CoV-2 transmission (for example Bontempi, 2020;Setti et al., 2020). However, we investigated proximity to livestock explicitly, while including ambient PM 10 concentrations as covariate. Since multiple sources contribute to PM 10 concentrations, and sensitivity analyses without air pollutants as covariates resulted in almost identical results, it seems unlikely PM 10 in itself would be the main explanation of the observed patterns for livestock proximity.
Of the types of farms included in this study, only mink farms have been shown to be infected by humans with SARS-CoV-2 (ECDC, 2020; Enserink, 2020;Oreshkova et al., 2020). Whole genome sequences provided evidence of mink-to-human transmission following genetic evolution in the animals . But spill-back of a mink sequence into the community, as occurred in Denmark (Hammer et al., 2021), was not observed in the Netherlands (Oude Munnink et al., 2020) so this route is unlikely to explain our study results for mink farms. In our study, we found similar results for multiple time periods and regions, also in absence of mink. This means that mink were not the main driver of the study outcomes. A similar reasoning applies to the former goat-related Q fever epidemic, of which the main affected area overlapped with the initial COVID-19 hotspot (van Gageldonk-Lafeber et al., 2021;Weehuizen et al., 2022). The associations that we found were not limited to the former Q fever areas.
In recent years, outbreaks of animal coronaviruses have occurred in e.g. pigs (porcine epidemic diarrhoea virus: PEDV) (Dortmans et al., 2018), poultry (de Wit et al., 2021) and horses (equine coronavirus) (Zhao et al., 2019), and many other animal coronaviruses are endemic in the Netherlands and worldwide and present in the environment (Decaro et al., 2020;Khamassi Khbou et al., 2021). One consideration is whether these animal coronaviruses, when inhaled and present on mucus, could result in false positive SARS-CoV-2 tests specifically in people around livestock farms. However, this possible explanation of our study results seems very unlikely as we expect this would have been noticed given ongoing whole genome sequencing activities worldwide.
Our study was able to use individual patient data and the six-digit postal-code of the address of SARS-CoV-2 cases, avoiding some of the inherent limitations of studies that rely on publicly available information at higher aggregation levels (Heederik et al., 2020;Villeneuve and Goldberg, 2020). However, we could not control for individual factors such as comorbidities, household income and education level.
The main challenge of the study was to avoid possible interference by local, under-the-radar, virus introductions and spread. During the start of the epidemic, SARS-CoV-2 was introduced unevenly frequent across the country, for example by persons returning from February 2020 holidays, and locally amplified by carnival celebrations, but data to reconstruct such spread across networks are sparse. The areas with the highest level of transmission at the moment of the implementation of control measures (lockdown) may have happened to coincide with, in this case, intensive livestock production. Other factors that may have influenced the dynamics of the epidemic include weather conditions (e. g. rainfall, temperature) fluctuating over time across the seasons (Smith et al., 2021), immunity build up in the population and upcoming new variants. When exposure to the virus is unknown, risk estimates for the incidence may be biased and may change as an epidemic progresses (Koopman et al., 1991;Villeneuve and Goldberg, 2022). While all reported SARS-CoV-2 cases were available to this study, not all infected individuals were tested or reported. Testing policies varied and changed substantially across settings (e.g. for healthcare workers, children, people in nursing homes) and over time (restrictive at first, broader later). Also, people living closer to testing facilities, for instance in the cities, were more prone to be tested than those living further away (Statistics Netherlands, 2021). This was especially relevant during the initial months after opening of testing streets in June 2020. One could speculate that together with lower case numbers, possibly due to summer weather conditions, distance to the testing streets in the third quarter of 2020 may result in OR's smaller than 1 for that period. The effect of testing can be seen in the epicurves in Fig. S3, which show that particularly in the first phases of the pandemic testing capacity was very limited. It is unknown if selective underdiagnosis and underreporting affected our results, since it is not known yet if the underreporting is related to proximity to livestock farms.
To address the issues related to virus transmission and underreporting, we performed the analyses for several phases over a period of two years, for several regions, and in various sensitivity analyses. Results for proximity to livestock were consistent, however with ORs turning lower or below 1 during the third quarter of the year, and with ORs turning lower in analyses stratified by urbanisation level.

Conclusions
This study suggests that proximity to livestock was associated with individuals' probability of being notified with a positive SARS-CoV-2 test in 2020-2021 in the Netherlands. This result adds to a range of other respiratory health effects that have been found to be associated with proximity to livestock farms. As mechanisms underlying these effects are only limitedly understood, while a considerable proportion of the Dutch population lives in the proximity of a livestock farm, more research regarding possible biological and physical explanations and their interactions is warranted. Moreover, better insight in a potential relation between SARS-CoV-2 infections and proximity to livestock farms requires international replication and verification, as well as follow-up observational studies with more advanced methods to account for individual risk characteristics such as comorbidities, exposure to multiple farms, and for the underling human behaviour and transmission and probabilities of testing and reporting.

Funding
This study was funded from the regular budget (project V/150207/ 20/RI), COVID-19 budget (projects D/111001/01/CO and V/190035/ 22/EB) and the Strategic Program (SPR) (project S/113002/01/IC) of the National Institute for Public Health and the Environment (RIVM), made available by the Ministry of Health, Welfare and Sport of the Netherlands The funding sources had no role in the study design; in the collection, analysis and interpretation of data; in the writing of the manuscript; and in the decision to submit the article for publication. Declarations of interest: none.

Declaration of interest
None.