Food safety vulnerability: Neighbourhood determinants of non-compliant establishments in England and Wales.

This paper uses logistic regression to identify ecological determinants of non-compliant food outlets in England and Wales. We consider socio-demographic, urbanness and business type features to better define vulnerable populations based on the characteristics of the area within which they live. We find a clear gradient of association between deprivation and non-compliance, with outlets in the most deprived areas 25% less likely (OR = 0.75) to meet hygiene standards than those in the least deprived areas. Similarly, we find outlets located in conurbation areas have a lower probability of compliance (OR = 0.678) than establishments located in rural and affluent areas. Therefore, individuals living in these neighbourhoods can be considered more situationally vulnerable than those living in rural and non-deprived areas. Comparing compliance across business types, we find that takeaways and sandwich shops (OR = 0.504) and convenience retailers (OR = 0.905) are significantly less likely to meet hygiene standards than restaurants. This is particularly problematic for populations who may be unable to shop outside their immediate locality. Where traditional food safety interventions have failed to consider the prospect of increased risk based on proximity to unsafe and unhygienic food outlets, we re-assess the meaning of vulnerability by considering the type of neighbourhoods within which non-compliant establishments are located. In lieu of accurate foodborne illness data, we recommend prioritised inspections for outlets in urban and deprived areas, particularly takeaways, sandwich shops and small convenience retailers.


Introduction
Historically, public health interventions, including those in the food safety domain, have focused their attention on distinct and well-defined populations with specific socio-demographic characteristics. These include young children, pregnant women, individuals with Limiting Long-Term Illness (LLTI), and people aged over 65 (Lund and O'Brien, 2011). Whilst national-scale outbreak data (GOV.UK, 2019) evidence that these individuals are more susceptible to foodborne disease, these data have a number of problems; chiefly, they are inaccurate and unrepresentative of the whole population. National surveillance data not only severely underestimate the true incidence of foodborne illness, but are also biased towards the aforementioned groups, who are more likely to visit their GP. Therefore, traditional food safety interventions often fail to address situational vulnerability, whereby certain factors increase an individual's exposure to risk.
In the context of food safety, these factors could describe behaviours that increase the likelihood of contact with a foodborne pathogen, for example, frequently eating at food establishments that do not comply with recommended hygiene practices, or consuming food after its use-by date, when it is no longer safe to consume. Whilst studies have been undertaken to capture people's attitudes towards food (Food Standards Agency, 2016), identifying spatial patterns of risky food behaviours at scale is difficult, and comprehensive data are often unavailable. Alternatively, exploring situational vulnerability through the location and incidence of non-compliant establishments is possible, as the Food Standards Agency (FSA) publishes hygiene-related data for all businesses serving food in the UK. Many studies have mapped negative environmental features with a view to understanding their effect on health outcomes; however, few studies have investigated associations between hygiene scores and demographic data, particularly in the UK setting. This paper uses data from the Food Hygiene Rating Scheme (FHRS) alongside small area socio-demographic data and neighbourhood characteristics to support the enforcement of food standards by identifying determinants of non-compliant food establishments in England and Wales.

Vulnerability to foodborne illness
An estimated 1.7 million cases of foodborne illness occur each year in the UK, resulting in 22,000 hospital admissions and 700 deaths (O'Brien et al., 2016). Of these cases, the majority occur among groups of people considered inherently vulnerable. These groups are often immunocompromised, putting them at higher risk of contracting infection. For example, individuals aged over 65 experience age-related deterioration of the immune system, diminishing the ability to protect against dangerous foodborne pathogens. Additionally, decreased gastric acid production and slow bowel motility are common for this age group, prolonging exposure of the colonic tissue to toxins and further increasing susceptibility (Smith, 1998).
Alongside individuals aged over 65 years, prenates and children under the age of five are also considered high risk. A study estimating the burden of foodborne illness in the USA found that Salmonella was the leading cause of bacterial illness in children. Infections among children contributed 40% of total Salmonella cases and accounted for 60% of hospitalisations and deaths (Scallen et al., 2011). With low body mass, only a small quantity of pathogen is required to cause infection, and as the body is extremely sensitive to small amounts of fluid loss, dehydration is a common and dangerous side effect. People with an LLTI or chronic disease are also at high risk. Individuals suffering prolonged illness or taking immunosuppressant drugs have weakened defences, reducing the body's ability to fight infection (Rosenblum et al., 2012).
Whilst Public Health England reporting data suggest that groups with under-developed or compromised immune systems are 2.6 times more likely to contract a foodborne illness (Lund and O'Brien, 2011), these numbers are potentially misleading if not scrutinised further. The data do not account for cases where individuals do not visit a medical practitioner, or where a sample is not submitted for laboratory testing. Predicated upon biased and inaccurate data, food safety interventions have focused on the inherently vulnerable, and fail to consider populations at risk due to negative neighbourhood features. These features include non-compliant food establishments, where studies have shown that outbreaks are twice as likely to occur compared to establishments which comply with the FSA's recommended hygiene practices (Poppy, 2017; Fleetwood et al., 2019). As 60% of all foodborne illness cases occur when consumers eat outside the home (Jones et al., 2017), an individual is far more likely to contract an infection if they frequently eat at non-compliant establishments. Jackson and Meah (2017) discuss the importance of considering both the situational and contextual nature of vulnerability in the domain of foodborne illness. They advise that an alternative view of risk should be adopted whereby '… particular pathways and practices are emphasized rather than, or in addition to, the current emphasis on the inherent vulnerability of particular socio-demographic groups' (Jackson and Meah, 2017, p. 91). Whilst identifying and assessing risky food practices at scale is problematic, national FHRS data are available for the UK, allowing spatial associations with demographic groups to be investigated. A number of similar studies have explored geographical determinants of disease and public health ailments, but few studies have looked at predictors of food safety compliance; we discuss this further in section 2.2.

Geographical determinants of health
As the role of geography has become increasingly integral to public health and epidemiological research, significant emphasis has been placed on assessing the role of neighbourhood features as determinants of health. Demonstrating how distance from a feature of interest dictates the likelihood of contracting a disease, or suffering ill health, has been the focus of many studies. For example, Green et al. (2018) presented a multi-dimensional index, known as the Access to Healthy Assets and Hazards (AHAH) index, comprising 14 health-related features, including fast food outlets, which are thought to often feature in foodborne illness outbreaks. This study found a significant association between hazardous areas and a decrease in mental wellbeing, showcasing an example of mapping situational vulnerability. In recent years, there has been substantial investment in examining the impact of fast-food access on obesity (Wilkins et al., 2019). Specifically, studies have scrutinised the effect of unhealthy food landscapes on Body Mass Index (BMI) (Burgoine et al., 2018; Richardson et al., 2015); the incidence of colorectal cancer (Canchola et al., 2017); and the development of diabetes (Polsky et al., 2016).
Similar health-related studies have explored the amenity of takeaway alcohol and its impact on consumption (Sherk et al., 2018) and evaluated the role of health care service accessibility on health outcomes (Goyal et al., 2015; Joseph and Boeckh, 1981; Etzioni et al., 2013). Although the literature relating to geographical determinants of vulnerability is multidisciplinary and comprehensive, geospatial and national-scale studies relating to food safety, particularly in the UK setting, are limited. Darcey and Quincey (2011) investigated Critical Health Violations (CHV) for food outlets in Philadelphia, United States, by demographic group. They found an increased number of food outlets in more deprived areas, and that these were subject to a higher number of public health inspections. They also found that food establishments in less-deprived areas and areas with high Hispanic populations had a larger number of CHV compared to other demographic groups, and that the average number of days between public health inspections was lower for facilities in primarily Hispanic and African American areas. In a similar study, Pothukuchi et al. (2008) investigated demographic patterns for CHV in Detroit, Michigan; however, their findings contradict those of Darcey and Quincey (2011). Pothukuchi et al. (2008) found that facilities in deprived neighbourhoods and neighbourhoods with high African American populations had an increased number of CHV compared to other areas.
To our knowledge, our study is the first to utilise the FHRS data to identify determinants of non-compliant food establishments in England and Wales. We contribute to the characterisation of situationally vulnerable populations in terms of their socio-demographic traits and the neighbourhood within which they live.

Food Hygiene Rating Scheme
Local and Unitary Authorities (LA) are responsible for enforcing hygiene standards at food businesses in the UK. Environmental Health Officers (EHO) are accountable for undertaking measures to protect public health, including administering and enforcing legislation, providing advice on all aspects of food safety, undertaking inspections, and assessing hygiene standards. Data relating to interventions, sampling and enforcement actions are uploaded to the Local Authority Enforcement Monitoring System (LAEMS) and held centrally for analysis and reporting by the FSA. These data are also used to calculate Food Hygiene Rating Scheme (FHRS) scores for all food-serving businesses in England and Wales (Food Standards Agency, 2020). The frequency of inspections varies by risk level, with higher risk establishments such as schools and hospitals inspected more often than low risk outlets. New food businesses should be inspected and receive a FHRS score within 28 days; however, only 85% of planned food safety inspections were undertaken by LAs in the reporting period -19 (Food Standards Agency, 2018). Individual food outlets are given a FHRS score ranging from zero (urgent improvement required) to five (very good hygiene standards), determined by hygiene standards at the time of the inspection (Food Standards Agency, 2018). A score of two or less indicates that the premises is not 'broadly-compliant' as it does not align with the FSA's definition of food safety compliance. A score of three or higher indicates 'broad-compliance'. Henceforth we use the terms 'compliant' and 'non-compliant' for ease of interpretation. If an EHO believes that a facility poses an immediate risk to public health, they are obliged to take preventative action, such as closing down the establishment. However, many non-compliant businesses continue to operate (Food Standards Agency, 2020).
The overall FHRS is a composite score of three separate measures: Confidence in Management; Structural Integrity; and Food Hygiene. These are scored as violations on a scale of 0 to 50, where the higher the number of violations, the lower the FHRS score. See Table 1 for the mapped scores.
Whilst LA inspections of food outlets are a legal requirement in England and Wales, display of FHRS scores is only mandatory in Wales and optional in England. Recent research from the Food and You Survey (Food Standards Agency, 2016) suggests that only 43% of consumers actively consider the FHRS before deciding where to eat and 64% prefer to choose a food establishment based on their own personal experience. As 60% of outbreaks occur outside the home and are twice as likely to occur at non-compliant premises as at higher scoring premises (Poppy, 2017; Fleetwood et al., 2019), these findings suggest that a large proportion of consumers unknowingly put themselves at risk. Moreover, consumers are considered at higher risk than ever before. Not only has the number of fast food outlets increased dramatically in the past decade, but consumption of food outside the home has also risen by 29% (Burgoine et al., 2014).
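The compliance threshold described above can be written as a short sketch. The function name `is_compliant` and the example scores are illustrative assumptions, not part of any FSA tooling.

```python
def is_compliant(fhrs_score: int) -> bool:
    """Return True for 'broadly compliant' scores (3 to 5), False for 0 to 2."""
    if not 0 <= fhrs_score <= 5:
        raise ValueError("FHRS scores range from 0 to 5")
    return fhrs_score >= 3


# Hypothetical scores, one per possible FHRS value.
scores = [0, 1, 2, 3, 4, 5]
labels = ["compliant" if is_compliant(s) else "non-compliant" for s in scores]
```

Keeping the threshold in one function makes it straightforward to derive the binary response variable used later in the analysis.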

Setting and study design
Predicated on biased and inaccurate data, existing food safety interventions have focused on inherently vulnerable populations. This paper aims to better characterise situational vulnerability by identifying the characteristics of high-risk areas, and to address the research question: what are the neighbourhood and socio-demographic determinants of non-compliant food establishments in England and Wales? This is an ecological study of cross-sectional design, which aims to identify small-area socio-demographic, urbanness and outlet type determinants of non-compliant food establishments in England and Wales.

Data
Datasets were collected from numerous sources.
The Townsend Deprivation Score (TDS) is calculated from four variables, where ln is the natural logarithm, unemp is the percentage of unemployed individuals, and overcrowd, nocar and renting are the percentages of households which are overcrowded, have no access to a car or van, and are renting, respectively. The resulting score is then standardised using z-score standardisation:
$$z = \frac{x - \mu}{\sigma}$$
where x is the score, μ is the mean and σ is the standard deviation.
Table 1 Numerical violation scores mapped to the six FHRS ratings, where 0-2 equals non-compliance and 3+ indicates broad compliance.
The FHRS data were collected in October 2018 using a Python script and collated into one dataset for further analysis. The dataset has the latest inspection data for each food outlet.
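The deprivation score described above can be sketched as follows. The exact equation is not reproduced in the text, so this example assumes a commonly used form of the Townsend score: unemployment and overcrowding are log-transformed as ln(x + 1), each of the four components is z-scored across areas, and the standardised components are summed. The function names, the use of the population standard deviation, and the input percentages are all hypothetical.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values as z = (x - mu) / sigma (population sigma assumed)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]


def townsend_scores(unemp, overcrowd, nocar, renting):
    """Sum of standardised components, one score per area (assumed form)."""
    components = [
        z_scores([math.log(u + 1) for u in unemp]),      # log-transformed
        z_scores([math.log(o + 1) for o in overcrowd]),  # log-transformed
        z_scores(nocar),
        z_scores(renting),
    ]
    return [sum(parts) for parts in zip(*components)]


# Hypothetical percentages for three areas, from least to most deprived.
tds = townsend_scores(
    unemp=[2.0, 5.0, 12.0],
    overcrowd=[1.0, 4.0, 10.0],
    nocar=[10.0, 25.0, 50.0],
    renting=[15.0, 30.0, 60.0],
)
```

Higher scores indicate greater deprivation; areas can then be grouped into quintiles of the score.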

Data preparation
We first standardise the socio-demographic data by converting raw counts to percentages. For both age and ethnicity variables we use counts of individuals as the numerator and the total population in each OA as the denominator (Office for National Statistics, 2016). We collapse our 18 ethnicity variables into five categories based on the ONS recommended groupings for ethnicity (Office for National Statistics, 2019): White; Mixed; Asian; Black; Other. 14 age categories are also collapsed into six age categories: 0-4; 5-14; 15-19; 25-44; 45-64; 65+. We maintain critical age groups as distinct categories, namely <5 and 65+. See Table 2 for the mean, standard deviation, minimum and maximum values of variables for OAs in England and Wales. Nemes et al. (2009) state that logistic regression models can overestimate coefficients for variables with a small sample size. Therefore, to ensure a sufficiently large number of data points in each category, we collapse the ten RUC categories into five variables: Urban cities and towns; Rural hamlets and isolated dwellings; Rural town and fringe; Rural village; and Urban conurbation. An urban conurbation is defined as a large urban area whereby the density of Dwellings Per Hectare (DPH) is sustained (DPH > 3.75) as distance from the settlement focal point increases (Bibby and Brindley, 2013). These areas are sometimes known as urban agglomerations; however, we use the term 'conurbation' throughout this paper. There are 14 extended urban areas, known as conurbations, in England and Wales: Cardiff; Derby; Greater Bristol; Greater Nottingham; London; Mersey Belt; North Staffordshire; Portsmouth; Southampton; South Yorkshire; Teesside; Tyneside; West Midlands and West Yorkshire. See Fig. 1 for an example of the mapped categories.
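The two preparation steps above, collapsing detailed categories and converting counts to percentages, can be illustrated with a minimal sketch. The category labels, counts and function names below are hypothetical and cover only a subset of the detailed groups.

```python
def collapse(counts, mapping):
    """Sum detailed category counts into broader groups."""
    collapsed = {}
    for detailed, n in counts.items():
        broad = mapping[detailed]
        collapsed[broad] = collapsed.get(broad, 0) + n
    return collapsed


def to_percentages(counts, total):
    """Convert counts to percentages of the area's total population."""
    return {group: 100.0 * n / total for group, n in counts.items()}


# Hypothetical counts for a single output area (OA) of 300 residents.
oa_counts = {
    "White: British": 240,
    "White: Irish": 10,
    "Asian: Indian": 30,
    "Asian: Pakistani": 10,
    "Black: African": 10,
}
# Illustrative mapping from detailed to broad groups.
to_broad = {
    "White: British": "White",
    "White: Irish": "White",
    "Asian: Indian": "Asian",
    "Asian: Pakistani": "Asian",
    "Black: African": "Black",
}

broad_counts = collapse(oa_counts, to_broad)
broad_pct = to_percentages(broad_counts, total=sum(oa_counts.values()))
```

Collapsing before dividing keeps the percentages for the broad groups consistent with the OA population total.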
Using the Business type ID in the FHRS dataset, we subset the food establishments based on geographical reach. As we are interested in identifying situationally vulnerable populations as determined by the neighbourhood within which they live, we exclude food businesses with large geographical reach whose produce is unlikely to be consumed locally. For example, food businesses concerned with distribution, growing, importing and exporting are likely to have large distribution networks that extend beyond their immediate output area. Only food businesses serving the immediate locality are included in subsequent analysis. Schools, colleges and hospitals are also removed as they are subject to higher quality control and do not serve the entirety of the population. We include hotels, guesthouses and bed and breakfasts in further analysis because, although it would be unusual for an individual to seek overnight accommodation in their local area, many establishments in this category serve food to the local community and not solely to patrons staying overnight. The excluded food business types are therefore: hospitals, childcare centres, care homes; distributors, transporters; importers, exporters; farmers, growers; manufacturers, packers; schools, colleges, universities; mobile caterers. To attach neighbourhood characteristics to each food outlet we explore two methods. Firstly, ESRI ArcMap 10.4 is used to map the XY coordinates of each food outlet as reported in the FHRS data. The point locations are converted from the WGS84 coordinate reference system to OSGB36 to align with the OA digital boundary data from ONS, and a polygon-to-point spatial join is then used to establish the OA of each food establishment.
Secondly, a postcode to OA lookup file (Office for National Statistics, 2018b) is attached to the FHRS data using the postcode field in both datasets. We then check the validity of the OA codes determined by the spatial join by comparing them to the postcode to OA lookup, and find only 51.1% agreement between the two. Therefore, we use the OA codes matched by the postcode to OA lookup to attach our ethnicity, age, RUC and deprivation variables for further analysis. We match 99.7% of food outlet postcodes when using this method. We find a number of duplicate records in the FHRS dataset, where a food outlet appears twice with exactly the same data. This could be due to a fault when uploading data to the LAEMS database; therefore, we remove duplicate entries to ensure coefficients in the model are not overestimated. Of 392,345 food businesses in the original data, 297,119 are included in subsequent analysis across the following categories: restaurants, cafés, canteens; other retailers; super and hyper markets; other catering; pubs, bars, nightclubs; takeaways, sandwich shops; hotel, guesthouse, bed and breakfast.
Exploratory analysis indicates that 19,700 (6.6%) food businesses are non-compliant as they have a FHRS score of two or below. In terms of compliance within business type categories, takeaways and sandwich shops have the largest percentage of non-compliant outlets (12%), followed by other retailers (8%); restaurants, cafés and canteens (6%); and pubs, bars and nightclubs (5%). Super and hypermarkets (2%); hotels, guesthouses, bed and breakfasts (2%); and other caterers (2%) have the smallest percentage of non-compliant businesses. See Fig. 2 for a breakdown of compliance by business type. Overall, the largest proportions of non-compliant establishments are contributed by restaurants, cafés and canteens (31%); takeaways and sandwich shops (27%); other retailers including convenience stores and newsagents (25%); and pubs, bars and nightclubs (10%). Conversely, super and hypermarkets (1%); hotels, guesthouses, bed and breakfasts (2%); and other caterers (2%) contribute the smallest proportion of non-compliant businesses.
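The lookup, validation and de-duplication steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the field names (`postcode`, `oa_spatial`, `oa_lookup`), postcodes and OA codes are hypothetical placeholders, and postcodes are assumed to match after removing spaces and upper-casing.

```python
def drop_duplicates(outlets):
    """Remove records that are identical in every field, keeping the first."""
    seen = set()
    unique = []
    for outlet in outlets:
        key = tuple(sorted(outlet.items()))
        if key not in seen:
            seen.add(key)
            unique.append(outlet)
    return unique


def attach_oa(outlets, postcode_to_oa):
    """Attach the lookup OA code; outlets with unmatched postcodes are dropped."""
    matched = []
    for outlet in outlets:
        key = outlet["postcode"].replace(" ", "").upper()
        oa = postcode_to_oa.get(key)
        if oa is not None:
            matched.append({**outlet, "oa_lookup": oa})
    return matched


def agreement_rate(outlets):
    """Share of outlets whose spatial-join OA equals the lookup OA."""
    same = sum(1 for o in outlets if o["oa_spatial"] == o["oa_lookup"])
    return same / len(outlets)


# Hypothetical records: one exact duplicate and one unmatched postcode.
records = [
    {"name": "Cafe A", "postcode": "AA1 1AA", "oa_spatial": "OA_001"},
    {"name": "Cafe A", "postcode": "AA1 1AA", "oa_spatial": "OA_001"},
    {"name": "Shop B", "postcode": "aa1 1ab", "oa_spatial": "OA_002"},
    {"name": "Pub C", "postcode": "ZZ9 9ZZ", "oa_spatial": "OA_003"},
]
lookup = {"AA11AA": "OA_001", "AA11AB": "OA_009"}

unique = drop_duplicates(records)
matched = attach_oa(unique, lookup)
match_rate = len(matched) / len(unique)
agreement = agreement_rate(matched)
```

Reporting the match rate and the agreement rate separately mirrors the two checks in the text: how many postcodes could be linked, and how often the two methods assign the same OA.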
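The paragraph above reports two different percentages for each business type: the non-compliance rate within a category, and the category's share of all non-compliant outlets. A minimal sketch of both calculations, using hypothetical outlet records and a hypothetical function name, is:

```python
from collections import Counter


def compliance_summary(outlets):
    """Per business type: % non-compliant within type ('rate') and
    % of all non-compliant outlets contributed by that type ('share')."""
    totals = Counter(btype for btype, _ in outlets)
    failing = Counter(btype for btype, score in outlets if score <= 2)
    n_failing = sum(failing.values())
    return {
        btype: {
            "rate": 100.0 * failing[btype] / totals[btype],
            "share": 100.0 * failing[btype] / n_failing,
        }
        for btype in totals
    }


# Hypothetical (business type, FHRS score) records.
outlets = (
    [("restaurant", 5)] * 18 + [("restaurant", 1)] * 2
    + [("takeaway", 4)] * 4 + [("takeaway", 2)] * 1
)
summary = compliance_summary(outlets)
```

In this toy data the takeaway category has the higher non-compliance rate, yet restaurants contribute the larger share of non-compliant outlets because there are more of them, which is the same distinction drawn in the text.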

Addressing multicollinearity
A statistical assumption of the logistic regression model is that the independent variables should be independent of one another. If a variable in the regression model linearly predicts another variable, a problem known as collinearity, the variance can be inflated and coefficients become unstable. Dormann et al. (2012) suggest that pairwise correlation coefficients of 0.5-0.7 and above indicate high collinearity. Therefore, to address the problem, we calculate pairwise correlation coefficients for continuous independent variables: (percentage of) ethnicity, age and TDS. Fig. 3 shows the pairwise correlation coefficients as a correlation plot where red coefficients indicate positive correlation and blue coefficients show negative correlations. We remove a number of variables that exceed the lower end of the threshold (r = ±0.5). These include the 25-44 and 45-64 age groups and White ethnicity. Removal of these variables allows us to retain the maximum number of variables whilst reducing collinearity. Furthermore, we calculate Variance Inflation Factors (VIF) for each remaining variable in the model. VIF provides a measure of the increase in variance caused by collinearity. For example, a VIF of 10 indicates a 10 times increase in variance due to the presence of a collinear variable compared to inclusion of the term alone (O'Brien, 2007; Dormann et al., 2012). A VIF of >10 usually indicates a problematic value. As our highest value is 2.314, we can be confident that strongly collinear variables were removed in the prior step. See Table 1 in the supplementary material for calculated VIF values.
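The screening described above can be illustrated with a small sketch. It computes Pearson correlations between every pair of variables and flags pairs at or beyond |r| = 0.5. The VIF helper covers only the two-predictor case, where VIF = 1 / (1 - r²); with more predictors, r² is replaced by the R² from regressing one predictor on all the others. Variable names and values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def high_correlation_pairs(columns, threshold=0.5):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs


def vif_two_predictors(xs, ys):
    """VIF when the model contains exactly two predictors."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical percentages for five areas.
columns = {
    "pct_white": [90, 80, 70, 60, 50],
    "pct_asian": [5, 12, 18, 30, 35],
    "pct_age_65plus": [20, 15, 22, 14, 19],
}
flagged = high_correlation_pairs(columns)
vif = vif_two_predictors(columns["pct_white"], columns["pct_age_65plus"])
```

Here only the pair of ethnicity percentages is flagged, so one of the two would be a candidate for removal before fitting the model.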

Logistic regression model
A logistic regression model is used to identify the determinants of non-compliant food establishments in England and Wales. Logistic regression is useful when the response variable's variance is not constant or not normally distributed or, as in this case where we are concerned with non-compliance, when it is constrained to binary form. The response variable is transformed by the 'link' function, which allows the model to be fitted by iteratively reweighted least squares. Let y be the binary response variable indicating compliance (1) and non-compliance (0), and let p be the probability that y equals 1:
$$p = P(y = 1)$$
The model is fitted using the glm function in R, with binomial as the family and logit as the link function. More formally, it takes the form:
$$\ln\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$$
where x_1, ..., x_k are the predictor variables and β_0, ..., β_k are the coefficients to be estimated.
In addition, we explore the association of the removed variables with food outlet compliance: percentages of white ethnicity, age 25-44 and 45-64. We also explore the individual components of the TDS (percentages of: unemployed persons, overcrowded households, persons without access to a car or van, and rented households). These models are run separately so that the statistical assumption that no independent variables in the model are collinear is upheld; however, we adjust the models for age, deprivation, RUC and business type. We fit all models using R 3.4 statistical software.
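To make the fitting step concrete, the following is a minimal pure-Python sketch of a logistic regression with an intercept and a single binary predictor, fitted by Newton-Raphson (which is equivalent to iteratively reweighted least squares for this model). It is not the R code used in the study, and the data are hypothetical: 100 outlets of a reference type (90 compliant) and 100 of a comparison type (80 compliant).

```python
import math


def fit_logistic(xs, ys, iterations=25):
    """Fit ln(p / (1 - p)) = b0 + b1 * x by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: x = 0 is the reference type, x = 1 the comparison type;
# y = 1 means compliant, y = 0 means non-compliant.
xs = [0] * 100 + [1] * 100
ys = [1] * 90 + [0] * 10 + [1] * 80 + [0] * 20

b0, b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)
```

With a single binary predictor the fitted slope equals the log of the odds ratio from the 2x2 table, here ln((80/20) / (90/10)), which gives a simple check that the fit has converged.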

Results
Our logistic regression examines the determinants of non-compliant food establishments (n = 297,119) in England and Wales, the results of which are displayed in Tables 3 and 4. We discuss each predictor variable in terms of the Odds Ratio (OR), which is the exponential of the estimated coefficient, and the 95% confidence interval of the OR. An OR is a measure that describes the probability of an outcome, in this case food safety compliance, occurring given the predictor variable. A value less than 1 indicates that the variable decreases the probability of compliance and, conversely, a value greater than 1 indicates an increase in the probability of compliance. The confidence interval provides an estimate of the level of uncertainty around the odds ratio, where a narrow interval indicates higher precision. Although likelihood and probability are not statistically equivalent, we use the terminology 'less likely' and 'more likely' throughout our discussion and results to aid interpretation. Coefficients calculated using the multivariate model are adjusted for all other variables (Table 3). Coefficients calculated univariately (Table 4) are adjusted for age, deprivation, ethnicity, RUC and business type.
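As a worked illustration of the quantities described above, the sketch below converts a coefficient and its standard error into an odds ratio with a 95% confidence interval, and expresses an odds ratio as a percentage change in the odds. The function names and the standard error are hypothetical; the odds ratios 0.75, 0.678 and 1.243 are those quoted in this paper.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Exponentiate a coefficient and its Wald interval (95% when z = 1.96)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds relative to the reference category."""
    return (odds_ratio - 1.0) * 100.0


# Coefficient corresponding to OR = 0.75, with a hypothetical standard error.
beta = math.log(0.75)
or_, lower, upper = odds_ratio_with_ci(beta, se=0.02)

most_deprived = percent_change_in_odds(0.75)
conurbation = round(percent_change_in_odds(0.678), 1)
rural_hamlet = round(percent_change_in_odds(1.243), 1)
```

Strictly, these percentages describe changes in the odds of compliance; they correspond to the 'less likely' and 'more likely' wording used in the text.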
As expected with such a large number of data points, the majority of variables in the multivariate model are highly statistically significant (p<0.01), providing sufficient evidence that changes in the dependent variable are associated with changes in the independent variable. Six variables are not statistically significant (p>0.1): Age 5 to 14 and 20 to 24; Townsend quintile 2 and 3; RUC rural town and fringe, and rural village.
Whilst controlling for all other variables, we find that the presence of non-white ethnicities is negatively associated with the probability of food establishment compliance: Mixed (OR = 0.984), Asian (OR = 0.985), Black (OR = 0.984), Other Ethnicities (OR = 0.99). Furthermore, the results of the univariate model show that food outlets in areas with a higher percentage of white individuals have an increased probability of compliance (OR = 1.012).
Age has a very small effect, with odds ratios of 0.994, 1.01 and 1.003 for age groups 0 to 4, 15 to 19 and 65+ respectively. These results show that food establishments in areas with a higher percentage of 0 to 4 year olds have a decreased probability of meeting hygiene regulations compared to areas with a higher percentage of individuals aged 15 to 19 and 65 or over. Furthermore, the univariate model results show that the presence of the age group 45-64 does not have a significant association with compliance (p = 0.512); however, food establishments in areas with higher percentages of 25-44 year olds have a higher probability of non-compliance (OR = 0.995), with a narrow confidence interval and significant p-value (p<0.001).
In terms of deprivation, food establishments in the second and third quintiles of deprivation have neither an increased nor a decreased probability of compliance. These results are not statistically significant and both confidence intervals cross 1. However, food outlets in the fourth (OR = 0.838) and fifth (OR = 0.75) quintiles have a statistically significant decrease in the probability of compliance, showing a clear gradient of association between compliance and deprivation. To summarise, the probability of food establishment compliance decreases as deprivation increases, with those in the most deprived areas 25% less likely to meet hygiene standards compared to the least deprived areas. Furthermore, we consider the individual components of the TDS. The results of the univariate model show that food outlets located in areas with high percentages of individuals without access to a car, and areas with high percentages of overcrowded households, have a decreased probability of compliance (OR = 0.994 and 0.981 respectively), whereas increased renting seems to have a small but positive association with compliance (OR = 1.002, p<0.001). Percentage of unemployed persons has neither a positive nor a negative association with compliance, with a CI that crosses one and a high p-value (p = 0.33).
Table 3 Results of multivariate logistic regression model. The odds ratio is the exponentiated coefficient and indicates the probability of food establishment compliance given the presence of the predictor variable. An odds ratio above 1 indicates increased probability and below 1 indicates decreased probability of compliance. We also present unadjusted odds ratios. The confidence interval is exponentiated to aid interpretation of the level of certainty around the odds ratio.
Considering the effect of urbanness on compliance, we find that when adjusted for confounders, areas classed as rural town and fringe (p = 0.960) are not statistically significant; however, the unadjusted ORs for these variables show a much higher probability of compliance. For food establishments in rural hamlets and isolated dwellings the probability of compliance is 24% higher (p<0.001, OR = 1.243) than for those located in an urban city or town. Conversely, we find that premises located within an urban conurbation have a decreased probability of compliance and are 32% less likely (OR = 0.678, p<0.001) to meet the FSA's hygiene standards, with a confidence interval ranging from 0.654 to 0.703.
Finally, we consider the impact of business type on hygiene standards when compared to the reference category: restaurants, cafés and canteens. All business types aside from two (takeaways and sandwich shops, and other retailers) have positive associations. Supermarkets and hypermarkets have the largest odds ratio and are over three times more likely to meet hygiene standards than restaurants, cafés and canteens (OR = 3.393). Although the confidence interval ranges from 2.969 to 3.899, we can be confident that there is a positive association (p<0.001). Similarly, other catering (OR = 2.785); hotels, guesthouses and bed and breakfasts (OR = 2.048); and pubs, bars and nightclubs (OR = 1.186) all have positive associations with narrow confidence intervals.
Conversely, takeaways and sandwich shops are 50% (OR = 0.504) less likely to be compliant compared to restaurants, with a confidence interval ranging from 0.485 to 0.525. Other retailers are also 9% (OR = 0.905) less likely to align with recommended practices, with a narrow confidence interval ranging from 0.87 to 0.942.

Discussion
Our results show that age and ethnicity have small but significant associations with hygiene standards, whereas deprivation, urbanness and outlet type have larger and significant impact. Overall, takeaways, sandwich shops, small retailers such as convenience stores, and outlets located in deprived and conurbation areas have significantly decreased probability of compliance compared to restaurants, cafes, canteens, and outlets in affluent areas, rural areas, cities and towns. We find that an increase in the presence of certain socio-demographic characteristics, specifically non-white ethnicities and individuals aged 0 to 4, has a small but significant negative association with compliance. Therefore, populations of non-white ethnicity and individuals under 5 years of age should be considered at higher risk of exposure to a foodborne pathogen than white populations and individuals aged 5� when eating food outside the home.
We find a clear association between increased deprivation and noncompliance, particularly for areas with decreased car access and overcrowded households; however, the strength of the association weakens when adjusting for confounders. Li and Kim (2017) report that 'people with lower incomes have significantly smaller activity spaces and poorer food accessibility' than people with high incomes. This is mainly due to lack of car access among these populations combined with the rise of out of town hypermarkets which provide fierce competition for smaller outlets. Individuals without a car, and those who may not be able to shop outside their immediate locality (Evans et al., 2015) have reduced access to healthy and fresh food as a result. Furthermore, Mills et al. (2018) find that higher SES groups are more likely to consume home-cooked meals, whereas consumption of convenience food and takeaways is associated with lower SES groups. In line with these findings, Bivoltsis et al. (2019) find a significantly higher number of convenience stores in areas of low SES compared to areas of higher SES. Although SES and deprivation are not entirely equal, they do compare a similar construct and our findings show that decreased car access is associated with higher probability of non-compliance.
Interestingly, our findings show that supermarkets and hypermarkets are significantly more likely to have better hygiene practices than smaller convenience stores such as newsagents, which further increases food safety risk for deprived populations and those who may not be able to shop outside their immediate locality. Furthermore, although studies are limited, there is evidence to suggest that populations with low SES are also more likely to suffer from foodborne illness (Bytzer et al., 2001; Gillespie et al., 2010; Olowokure et al., 1999). Again, although SES and deprivation are not entirely comparable, they often coexist geographically. This evidence is therefore consistent with our finding that small convenience stores and food establishments in deprived areas are less likely to meet regulations.
Although there are significant associations between urbanness, ethnicity, deprivation and FHRS scores, it is unclear how these predictors drive non-compliance, and further work is required to unpick the relationship. We believe some associations may be explained by high population transience, which in turn is related to both high staff turnover and high business turnover. Although we find only a small negative association with compliance for 0-4 year olds, many studies have shown that this age group, alongside 25-44 year olds (their parents), is more transient than the majority of other age groups (Bailey and Livingston, 2007). This may be because these age ranges encompass many life events that result in increased migration, for example, individuals having children and upsizing to accommodate their larger families (Lomax and Stillwell, 2018). There is also evidence to suggest that migration among 15-19 year olds is lower as education becomes increasingly important for this age group. Our results show an increased probability of compliance in areas with higher percentages of 15-19 year olds, which could support the argument that population transience decreases the probability of compliance. While interviewing food business owners, Yapp and Fairman (2006) found that many proprietors did not send their staff on food hygiene courses because of high staff turnover within their businesses. Staff training is an important consideration when awarding FHRS scores.
Before adjusting for confounders, our results show a clear gradient of association between non-compliance and urbanness, where food establishments in urban areas have a lower probability of compliance compared to rural areas. However, the association weakens when the model is adjusted, and disappears for areas classed as rural town and fringe and rural village, suggesting that the consideration of other variables such as age, ethnicity, deprivation and business type is important. Our adjusted results show a decreased probability of compliance in conurbation areas compared to cities and towns, and an increased probability of compliance in rural hamlets and isolated dwellings. Again, these findings could be explained by population transience, which is generally higher in urban areas. As rural areas have lower net migration compared to urban areas, this could result in better staff retention, more in-depth training, and a better understanding of food hygiene practices. Food businesses in remote areas are also more likely to rely on repeat custom due to a limited consumer market (Schiff, 2015). This could drive compliance through both increased social responsibility on the part of the business owner, and increased sensitivity to bad reviews, particularly via word of mouth (Han and Ryu, 2009). In urban areas, higher business turnover could be a driver of decreased probability of compliance. For example, whilst validating FSA food establishment data against street audit data, Wilkins et al. (2017) found less agreement in urban areas compared to rural areas, which was attributed to a higher turnover of food businesses. High streets in less affluent areas of England and Wales continue to cause concern due to the high number of shop vacancies and high business turnover (Whysall, 2014).
Furthermore, high business turnover results in a higher number of newly established businesses, which are often disadvantaged during the inspection process as they have neither a track record of compliance nor extensive records of food safety practices. Hawkins (1984) states that when faced with non-compliance during an inspection visit, the approach adopted by the EHO is often dependent on the wilfulness of the violation, the likelihood of recurrence, and the past behaviour of the firm.
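The weakening of the urban gradient after adjustment, described above, is the pattern expected when a confounder such as deprivation is unevenly distributed between urban and rural areas. The following is a minimal sketch of that mechanism only. It uses Mantel-Haenszel stratification as a simple stand-in for the adjusted logistic regression used in this paper, and all counts are hypothetical values chosen for illustration, not data from this study.

```python
# Hypothetical counts, for illustration only (not data from this study).
# Each stratum: (urban compliant, urban non-compliant, rural compliant, rural non-compliant)
strata = {
    "low deprivation": (170, 30, 700, 100),
    "high deprivation": (560, 240, 150, 50),
}


def odds_ratio(a, b, c, d):
    """Odds of compliance in urban outlets relative to rural outlets."""
    return (a * d) / (b * c)


def crude_odds_ratio(tables):
    """Pool all strata into one 2x2 table, ignoring the confounder."""
    a, b, c, d = (sum(column) for column in zip(*tables))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(tables):
    """Weighted average of stratum-specific odds ratios (Mantel-Haenszel)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


tables = list(strata.values())
crude = crude_odds_ratio(tables)
adjusted = mantel_haenszel_odds_ratio(tables)
print(f"crude OR = {crude:.3f}, deprivation-adjusted OR = {adjusted:.3f}")
# prints: crude OR = 0.477, deprivation-adjusted OR = 0.789
```

In this constructed example most urban outlets sit in the high-deprivation stratum, so the crude odds ratio overstates the urban effect; holding deprivation fixed moves the estimate towards 1 without removing it.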
Considering ethnicity, many studies have found an association between restaurants which serve ethnic cuisine and hygiene violations (Harris et al., 2015; Kwon et al., 2010; Roberts et al., 2011). For example, Schomberg et al. (2016) found that Chinese restaurants in San Francisco had a higher number of health violations compared to restaurants of other cuisines. Again, the reasons behind these findings are unclear, but there is evidence to suggest that some ethnic groups adopt food preparation practices for cultural reasons, even though they conflict with food safety guidance. The 'Kitchen Life Study', undertaken by the University of Hertfordshire, found that many Pakistani participants routinely washed chicken during food preparation. Although this practice increases the risk of infection by Campylobacter, it relates to the importance of ritual purity within Islam (Wills et al., 2013). This study focussed on domestic practices, so it is not clear whether these findings would translate to a business setting. Furthermore, we are unable to comment on the association between cuisine type and the ethnicity of staff, which may drive particular food safety behaviours such as washing raw chicken, nor on whether the presence of particular cuisines is related to the demographics of the area within which they are located. Unfortunately, cuisine type is not reported in the FHRS data and so we are unable to establish whether certain cuisine types and specific ethnicities coexist in space. This certainly warrants further research.

Strengths and limitations
This paper has taken small steps towards identifying situationally vulnerable populations based on proximity to non-compliant food establishments. We have used openly accessible data to identify populations who may be at higher risk of foodborne illness based upon the type of neighbourhood within which they live. As Local Authority resources are scarce, our results could be used to prioritise inspections in areas where the probability of compliance is lower. This would be with a view to educating and supporting food businesses that are struggling to meet hygiene guidelines. However, there is an argument that increasing inspections in deprived and primarily non-white areas could be seen as an oppressive measure, which would place a larger burden on the proprietors. As an intervention, this would therefore have to be carefully considered. Our study has not considered the impact of current enforcement practices or examined area-based response variables; future work will therefore involve examining the different rates of inspections and enforcement actions across Local Authorities and their impact on FHRS data. Similarly, it would be interesting to investigate the components of the FHRS, including confidence in management, food hygiene and structural integrity scores, in different types of neighbourhoods.
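As a rough illustration of how such prioritisation might work, the sketch below ranks three hypothetical outlets by their estimated probability of compliance, inspecting the lowest first. The odds ratios are those reported in the abstract. The baseline odds, the outlet descriptions and the feature names are assumptions introduced here, and multiplying the odds ratios together assumes they come from a single adjusted model without interactions, which may not hold.

```python
# Odds ratios for compliance as reported in the abstract. Combining them
# multiplicatively assumes one adjusted model with no interaction terms.
ODDS_RATIOS = {
    "most_deprived": 0.75,
    "conurbation": 0.678,
    "takeaway_or_sandwich_shop": 0.504,
    "convenience_retailer": 0.905,
}

# Hypothetical odds of compliance for a reference outlet (not from the paper):
# odds of 19 correspond to a 95% probability of compliance.
BASELINE_ODDS = 19.0


def compliance_probability(features):
    """Convert baseline odds, scaled by each feature's odds ratio, to a probability."""
    odds = BASELINE_ODDS
    for feature in features:
        odds *= ODDS_RATIOS[feature]
    return odds / (1.0 + odds)


# Hypothetical outlets described by the risk features that apply to them.
outlets = {
    "affluent rural restaurant": [],
    "urban convenience store": ["conurbation", "convenience_retailer"],
    "deprived urban takeaway": [
        "most_deprived",
        "conurbation",
        "takeaway_or_sandwich_shop",
    ],
}

# Lowest estimated probability of compliance is inspected first.
inspection_order = sorted(
    outlets, key=lambda name: compliance_probability(outlets[name])
)

for name in inspection_order:
    print(f"{name}: {compliance_probability(outlets[name]):.3f}")
```

A ranking of this kind only orders outlets by modelled risk; whether and how to act on it remains subject to the fairness concerns raised above.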
Health violation data are often used as a proxy for foodborne illness (Darcey and Quincey, 2011); increasing inspections in areas with a higher number of non-compliant food outlets could therefore decrease outbreaks. Many studies have suggested that Consumer Generated Data (CGD) could also prove useful in targeting inspections by identifying dynamic risk, primarily outbreaks of foodborne illness (Harris et al., 2017; Kang et al., 2013; Nsoesie et al., 2014; Schomberg et al., 2016; Sadilek et al., 2016; Oldroyd et al., 2018). These data, which include Trip Advisor and Twitter data, could not only provide insight into the consumer's perception of the restaurant environment, and therefore hygiene standards, but also act as an early warning system for foodborne illness outbreaks. Future work may look at the feasibility of combining CGD and FHRS data with a view to capturing both the dynamic and static elements of food safety risk.
Considering the limitations of this study, there are problems associated with the geographical scale of analysis. The first is the Modifiable Areal Unit Problem (MAUP), a statistical bias resulting from the aggregation of point-based measures into areas. Increasing the size of an area amplifies the problem (Openshaw and Taylor, 1979). The shape and scale of the aggregated units will influence the ecological predictors in our model, and our results may therefore change when using an alternative geography. Although we use a small-area geography to minimise the effect, we cannot mitigate the problem entirely without using individual-level data. Secondly, our method assumes that an individual will purchase food from an establishment located within the OA in which they live, and is unable to capture more complicated behaviours, for example, when a consumer purchases food in a neighbouring OA. Our study also does not take into account food ordered and consumed via online delivery services such as Uber Eats, which typically have a larger delivery network than food establishments offering a traditional delivery service. Future work will attempt to use distance-based measures to capture complex interactions between food outlets and consumers.
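The MAUP described above can be illustrated with a small sketch. The eight outlets, their positions along a single street and the three zoning schemes are all hypothetical. The point is only that the same point-level data yield different area-level compliance rates depending on the scale and boundaries used for aggregation.

```python
from collections import defaultdict

# Hypothetical outlets along one street: (position, compliant flag).
# Illustrative values only, not FHRS data.
outlets = [(0, 1), (1, 1), (2, 0), (3, 0), (4, 0), (5, 1), (6, 1), (7, 1)]


def zone_rates(points, zone_of):
    """Aggregate point-level compliance flags into per-zone compliance rates."""
    totals = defaultdict(lambda: [0, 0])  # zone -> [compliant count, outlet count]
    for position, compliant in points:
        zone = zone_of(position)
        totals[zone][0] += compliant
        totals[zone][1] += 1
    return {zone: c / n for zone, (c, n) in sorted(totals.items())}


# Scale effect: the same outlets grouped into zones of width 2, then width 4.
small_zones = zone_rates(outlets, lambda x: x // 2)
large_zones = zone_rates(outlets, lambda x: x // 4)

# Zoning effect: two zones again, but with the boundary moved.
shifted_zones = zone_rates(outlets, lambda x: 0 if x < 3 else 1)

print(small_zones)    # rates range from 0.0 to 1.0
print(large_zones)    # rates range from 0.5 to 0.75
print(shifted_zones)  # both zones close to 0.6
```

With small zones, one area appears entirely non-compliant; with larger or differently drawn zones, that contrast largely disappears, even though the underlying outlets are unchanged.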
Food safety risk is a complex phenomenon, and we have proposed a method to better identify situationally vulnerable populations based on negative environmental features. However, we have not taken into account food behaviours. Adopting unsafe practices, such as consuming food after its use-by date; not reheating, storing and defrosting food appropriately; and not cooking food thoroughly, increases the risk of foodborne illness. Furthermore, our analysis does not allow for variations in eating outside the home. Although the Food and You Survey (Food Standards Agency, 2016) finds that 96% of respondents reported eating out and 43% did so at least once a week, individuals who frequently eat at non-compliant food establishments are at much higher risk than those who do not eat food cooked outside the home.
Finally, we did not find compelling evidence in the literature to justify testing for interactions in our model. We are therefore unable to comment on the strength of association between an independent variable and the FHRS given the presence of another variable. However, future work may look at the strength of these associations; for example, the presence of renting populations may have a stronger negative association with compliance in rural areas than in urban areas.
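As a sketch of what such a test could look like, an interaction can be written as a product term in a logistic model. The symbols below are illustrative assumptions rather than the specification used in this paper: $p$ is the probability that an outlet is compliant, $R$ is the percentage of renting households in an area, and $U$ is an indicator equal to 1 for rural areas and 0 for urban areas.

```latex
\begin{align}
\log\frac{p}{1-p} &= \beta_0 + \beta_1 R + \beta_2 U + \beta_3 (R \times U) \\
\mathrm{OR}_{R \mid U=0} &= e^{\beta_1}, \qquad
\mathrm{OR}_{R \mid U=1} = e^{\beta_1 + \beta_3}
\end{align}
```

Under this parameterisation, a negative $\beta_3$ would mean that the association between renting and compliance is more negative in rural areas than in urban areas, and a test of $\beta_3 = 0$ would be a test for the interaction.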

Conclusion
Based on our findings, we recommend that food establishment inspections are prioritised for takeaways, sandwich shops and small retailers such as convenience stores, especially in deprived and large urban areas. Conversely, restaurants, cafes, supermarkets, pubs, bars, hotels and guesthouses can be considered low risk, especially in more affluent and rural areas. Whilst further work needs to be undertaken to understand the complex relationship between ethnicity and compliance, including the role of cuisine, we conclude that neighbourhoods with higher proportions of Black, Asian, Mixed and other ethnicities should be considered at higher risk of foodborne illness than areas with primarily white populations.
Given the aforementioned problems concerning traditional public health interventions and the data they are based upon, our study has utilised readily available FHRS, socio-demographic and RUC data to identify the determinants of non-compliant food establishments in England and Wales. Our findings can be used not only to further characterise situationally vulnerable populations, who are often ignored, but also to help prioritise food establishment inspections where LA resources are scarce.