Urbanicity and psychotic experiences: Social adversities, isolation and exposure to natural environments predict psychosis

Background: Research has shown that incidence rates of psychotic experiences are higher in urban areas, defined by their population density, and that an increasing number of people are moving to cities. It is therefore critical to understand which characteristics of cities drive this association. To date, efforts to explore urban risk have predominantly focused on single or a few preselected candidate factors and on clinical populations. Methods: We aimed to identify the best predictors of psychotic experiences (PE) in a subclinical population, considering 26 factors describing the physical and social environment. Two feature selection models were employed: a Boruta algorithm, i.e., a random forest approach, and an elastic net penalised logistic regression model. Results: Individual-specific social environment features emerged as the most robust predictors of PE, including childhood adversity, stressful life events, social isolation and low household income. Exposure to natural environments was found to be negatively associated with PE. Conclusions: Spending more time in residential natural environments could be an actionable target for preventing and treating psychosis.


Introduction
In the general population, the lifetime prevalence of psychotic disorders is estimated to be around 3% (Perälä et al., 2007). For psychotic experiences (PEs), including hallucinations, delusions, disorganised thought and psychotic fear, this figure is notably higher: approximately 5.3% of people report psychotic experiences in their lifetime (Linscott & Os, 2012). In cities, the risk of developing psychotic experiences or psychotic disorders, such as schizophrenia, is even more elevated. This association has been consistently observed among individuals who currently reside in (Fett et al., 2019; Gottesman & Gould, 2003; Os et al., 2008b; Sundquist et al., 2004), grew up in (Mortensen et al., 1999; Vassos et al., 2012) or were born in (Laursen & Munk-Olsen, 2007; Mortensen et al., 1999; Solmi et al., 2019) cities. Given the global trend of urbanisation, with projections indicating that 68% of the world's population will reside in urban areas by 2050 (Ritchie & Roser, 2018), it becomes imperative to discern the characteristics of urban settings that play a role in the manifestation of psychosis.

The role of the physical environment
Cities are typically defined based on population density, which has been consistently linked to higher risks of psychotic experiences (Linscott & Os, 2012; Os et al., 2008a) and psychotic disorders (Os et al., 2008a; Sariaslan et al., 2014). However, the urban risk may not solely stem from living in a densely populated area but is more likely attributable to the cumulative effects of various environmental and social stressors associated with urbanicity (Galea et al., 2011; Rapp et al., 2015).
For instance, a growing body of scientific evidence has shown that limited exposure and access to nature have adverse impacts on mental health (van den Berg, Maas, Verheij, & Groenewegen, 2010; van den Berg et al., 2015; Bratman et al., 2012, 2015; Fong et al., 2018; Gascon et al., 2015; Lee & Maheswaran, 2010; Shanahan et al., 2016). Exposure to nature can be delineated into exposure and access to residential green (vegetation) and blue spaces (water bodies), which have both been found to benefit mental health (Gascon et al., 2015, 2017; Kondo et al., 2018; McCormick, 2017; Völker & Kistemann, 2011) and decrease the risk of psychosis (Chang et al., 2019; Engemann et al., 2018, 2020). A Danish case registry study observed that low exposure to nature was associated with a 1.5-fold increased risk of developing schizophrenia (Engemann et al., 2018). Furthermore, exposure to nature has been linked to reduced levels of air and noise pollution (Engemann et al., 2018; Gidlöf-Gunnarsson & Öhrström, 2007; Nowak et al., 2006) and urbanicity to higher levels (Heinz et al., 2013; Kelly et al., 2010). Previous studies on air pollution have predominantly focused on particulate matter (PM), which refers to microscopic particles suspended in the Earth's atmosphere. However, it seems that PM can only partially account for the risk of developing psychosis (Attademo et al., 2017). Another recent review reported no significant association between long-term exposure to PM2.5 or PM10 and psychosis (Braithwaite et al., 2019). Gaseous pollutants, i.e., oxides of nitrogen (NOx), particularly nitrogen dioxide (NO2), were identified as significant modifiers between urban residency and psychotic experiences during adolescence (Newbury et al., 2019). Furthermore, noise pollution has been postulated to have an adverse impact on general health and well-being, similar to the effects of chronic stress (Jariwala et al., 2017).

The role of the social environment
Access to green spaces and unpolluted areas is associated with various socio-economic factors, including income and housing prices, which may confound the association between mental health risks and residential environments (Adler & Snibbe, 2003; Weich et al., 2002). Likewise, adverse neighbourhood-related socio-environmental factors were associated with higher incidence rates of psychotic disorders (Allardyce et al., 2005; Ivory et al., 2011; Kirkbride et al., 2008; Kirkbride et al., 2007b; Omer et al., 2014; O'Donoghue et al., 2016; Richardson et al., 2018; Sariaslan et al., 2014; Veling et al., 2014; Zammit et al., 2010) and psychotic experiences (Binbay et al., 2023). For example, Bhavsar et al. (2014) reported that crime and educational deprivation were significant predictors of schizophrenia incidence, adjusting for population density and ethnicity. Another study found that increased inequality was associated with a higher incidence of schizophrenia, but only in the most deprived areas (Boydell et al., 2004). Additionally, an elevated incidence of psychotic disorders has been observed among children and adolescents residing in neighbourhoods characterised by low social cohesion and high crime rates (Newbury et al., 2016; Polanczyk et al., 2010).
In contrast, a large-scale study in China found that rural upbringing was linked to an increased, not a decreased, risk for PEs (Wang et al., 2019). In this study, rural areas were characterised by higher levels of social stress and economic deprivation, which, in turn, have been linked to an increased risk of PE (DeVylder et al., 2018). Likewise, urban residency may protect against psychosis due to better healthcare or resource access, particularly in less developed countries (Hou et al., 2018). This may, amongst other reasons, explain why DeVylder et al. (2018) could not find an association between urban residence and psychosis in low- and middle-income countries.

The role of social and sociodemographic factors
Just as social and psychological stress has been linked to one's neighbourhood (Wang et al., 2019), it has also been associated with psychosis (Green et al., 2014) and PEs (Campbell & Morrison, 2007; DeVylder et al., 2018; Mamah et al., 2021; Turley et al., 2019).
Additionally, Stilo et al. (2016) suggested that first-episode psychosis patients experienced more social disadvantages in adulthood, such as living below the official poverty level, residing in overcrowded conditions, and living alone. Multiple other studies have found a similar link between low socioeconomic status (SES) and psychotic experiences (e.g., Linscott & Os, 2012; Loch et al., 2017; Mamah et al., 2021; Os et al., 2008a). SES may also be decreased by social disadvantage, discrimination and unstable housing situations (Veling & Adriaanse, 2013).
Moreover, there is evidence that individuals with psychosis have higher levels of loneliness compared to the general population (Chrostek et al., 2016; Meltzer et al., 2012; Stain et al., 2012). However, there is also evidence that loneliness already occurs before the onset of psychosis (Häfner et al., 1999; Moller & Husby, 2000) and in individuals with psychotic experiences (Chau et al., 2019; V. A. Morgan et al., 2012; Narita, Stickley, & DeVylder, 2020; Narita et al., 2022; Stain et al., 2012). In a study by Narita, Stickley, and DeVylder (2020), loneliness was still positively correlated with psychotic experiences when adjusted for adverse childhood experiences, sociodemographic factors and common mental disorders. Furthermore, higher incidence rates of psychosis have been documented among Black, Asian and ethnic minority and migrant populations, who more frequently reside in urban areas (Bourque et al., 2010; Cantor-Graae & Selten, 2005; Coid et al., 2008; Del-Ben et al., 2019; Selten et al., 2019; Tortelli et al., 2018). Various explanations have been proposed to elucidate ethnic disparities in psychosis, such as structural and systemic racism, discrimination, ethnic minority status, ethnic density, access to health services, use of cannabis, poorer social support and social disadvantages (Austin et al., 2020; Gayer-Anderson & Morgan, 2012; Leaune et al., 2018; Linscott & Os, 2012). Evidence suggests that the risk of PEs for ethnic minorities is not or only partially explained by cumulative social disadvantage (C. I. Cohen & Marino, 2013; Selten et al., 2019; Veling & Adriaanse, 2013). In a meta-analysis, ethnic minority position was found to be a consistent risk factor for psychotic disorders and PEs, with the risk level differing across ethnic groups and within countries (Leaune et al., 2018).
Lastly, incidence rates of psychotic disorders are higher among men (Jongsma et al., 2019; J. McGrath et al., 2004), and men frequently experience an earlier onset of psychosis than women (Ochoa et al., 2012). Additionally, vulnerability to environmental risk factors may vary across different age groups (Pedersen, 2001).

Cumulative effects of these risk factors
Most of these risk factors of PE are not independent but are highly correlated, pleiotropic and interactive (Guloksuz et al., 2018). For example, communities with low SES or minority groups are disproportionately exposed to environmental hazards, such as air pollution and noise, tend to reside in neighbourhoods lacking greenspace and access to health services and experience more psychosocial stress, including discrimination and racism (Europe, 2012; Hajat et al., 2015; Morello-Frosch & Shenassa, 2006; O'Neill et al., 2003). However, many studies have adopted a hypothesis-driven one-exposure to one-outcome approach, akin to the candidate-gene approach, or have only investigated a subset of potential risk factors. Just as many genetic factors contribute to the genetic risk of psychosis, environmental factors are likely to contribute to psychosis risk cumulatively (Nierop et al., 2013).
To date, most studies have predominantly focused on investigating the impact of urbanicity on psychotic disorders in clinical samples, with schizophrenia being the prototypical disorder. Although 80% of PEs are transitory, about 20% persist, and 7% progress to full-blown psychotic disorders (Kaymaz et al., 2012; Linscott & Os, 2012; Zammit et al., 2013). Hence, the identification of urbanicity-related risk factors associated with PEs holds the potential to inform the development of targeted preventative strategies and could potentially reduce the likelihood of psychosis.
To our knowledge, only a few papers have investigated how multiple different aspects of an urban environment are jointly associated with psychotic experiences. There have been investigations about features of the physical environment in adolescence (e.g., Bouter et al., 2023; Newbury et al., 2019), neighbourhood characteristics at birth (e.g., Solmi et al., 2019), neighbourhood and individual-specific socioeconomic features as well as other factors in adulthood (e.g., Binbay et al., 2023; Zammit et al., 2010). Most of these studies tackled the statistical challenge posed by high intercorrelations and interactions of environmental features by using hierarchical and adjusted regression models. However, certain feature selection methods may be better suited for this type of research question, as they a) can handle interdependencies between environmental exposures better, b) can handle large feature spaces, i.e., facilitate a more data-driven approach, and c) allow for conclusions about the generalisability of the results (Orrù et al., 2012). We implemented these algorithms in an adult sample to determine how well long-term exposure to a particular environment predicted PEs above and beyond more widely known risk factors and to identify which urbanicity-related factors held relatively greater importance.

Participants
Data were obtained from the United Kingdom (UK) Biobank, a comprehensive database comprising a general adult population cohort of over 500,000 British individuals aged 40 to 69 (Sudlow et al., 2015). Among this cohort, 157,348 individuals completed the online mental health questionnaire, which included questions regarding their PEs (Davis et al., 2020). Informed consent was obtained from all participants, and the UK Biobank received ethical approval from the National Research Ethics Service. This research was conducted under UK Biobank approval (application number 64615).
Only participants who maintained the same residential address since baseline were included in the analysis to ensure consistent long-term exposure to the same physical environment. Participants with missing data on psychotic experiences were excluded from the study. Furthermore, we screened all features and participants for more than 25% missing data. This cut-off was chosen to exclude potentially nonrandom missingness of data. More specifically, for some participants, all the data about their social and/or physical environment was missing. As the missingness may be systematic, we excluded these individuals from the final analysis. As a result, 2116 participants were dropped from the analysis. We employed a non-parametric multivariate imputation approach to address the remaining incomplete data using a chained random forest combined with predictive mean matching (Mayer, 2021). This imputation method was selected to enhance the sample's representativeness and, consequently, improve the generalisability of the resulting prediction model (Gorelick, 2006; Moons et al., 2022). The final sample consisted of 22,952 participants (Table 1), of whom 1.15% reported at least one PE within the last year at the second follow-up assessment.
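The 25% missingness cut-off can be illustrated with a minimal Python sketch. It is not the authors' R pipeline: the function names, the toy records and the use of None as the missing-value marker are assumptions made for illustration only.

```python
def missing_fraction(row):
    """Share of entries in one participant record that are missing (None)."""
    return sum(value is None for value in row) / len(row)


def drop_sparse_rows(rows, threshold=0.25):
    """Keep records whose missing share does not exceed the threshold."""
    return [row for row in rows if missing_fraction(row) <= threshold]


# Toy records with four features each; None marks a missing value.
records = [
    [1.0, 0.3, 2.0, 5.0],    # complete: kept
    [1.0, None, 2.0, 5.0],   # exactly 25% missing: kept
    [None, None, 2.0, 5.0],  # 50% missing: dropped
]
kept = drop_sparse_rows(records)
```

The same check could be applied to features by transposing the data; the remaining gaps would then be passed on to the imputation step.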

Psychotic experiences
Participants were asked if they had experienced PEs in the last year. PEs included seeing a vision, hearing a voice, believing that something strange was trying to communicate with them or that there was a plot against them. A positive answer to any of these experiences led to the classification of PEs. The data were collected in 2016, i.e., six to ten years after the baseline measures.

Predictors
Based on the literature, 26 candidate variables were selected. These variables can be broadly categorised into three groups: Eleven factors describe the physical environment, seven features capture the social neighbourhood environment, and eight refer to the individual-specific social environment and sociodemographic information. These predictors are described in Table 2. The Supplementary Materials (Fig. 3) provide additional details on the intercorrelations among these variables.

Algorithms
We aimed to develop an accurate prediction model for PEs while identifying the urbanicity-related variables that enable a good prediction and contain more information than noise, i.e., produce a parsimonious model.
We employed a Boruta algorithm, a wrapper built around a random forest classifier algorithm (Breiman, 2001; Kursa & Rudnicki, 2010b). A random forest classifier is an ensemble learning algorithm that combines multiple decision trees, each trained on a random subset of features and data (Breiman, 2001). The Boruta algorithm follows a three-step process for variable selection: (1) Shadow attributes are produced by shuffling values of the original attributes across objects. (2) A random forest model is trained on the extended dataset, including the original and the shadow attributes, and normalised permutation variable importance scores are computed. (3) The algorithm uses statistical testing to compare the z-scores of the original attributes with those of the shadow attributes. Variables with lower z-scores than the shadow features are removed from the model. This process is repeated iteratively until all features are classified or a predetermined number of runs has been performed. Here, we ran the Boruta algorithm with 500 iterations. This algorithm was deemed the most powerful approach compared to other variable selection methods using random forests (Degenhardt et al., 2019; Kursa, 2013). It can detect non-linear relationships, effectively protect against overfitting, and is well-equipped to handle collinearity and interaction between variables, which is particularly important when working with environmental variables. The analysis was carried out using the 'Boruta' package (Kursa & Rudnicki, 2010a), built around the package 'randomForest' (Liaw & Wiener, 2002) in R (version 4.2.1; R Core Team, 2023).
Although different variable selection methods may lead to models with similar prediction performance, the sets of selected variables may differ (Kursa, 2013). To ensure the robustness of feature selection, we trained a second feature selection model, i.e., an elastic net penalised logistic regression. A penalised (also called regularised or shrinkage) logistic regression method is a classification algorithm that incorporates a penalty term into the loss function, resulting in regression coefficients shrunken toward zero. An elastic net regularisation combines the L1 and L2 regularisations, integrating the strengths of the least absolute shrinkage and selection operator (LASSO; Tibshirani, 1996) and Ridge (Hoerl & Kennard, 1970) methods, respectively. This hybrid approach allows for selecting a parsimonious model through the LASSO term while considering the collinearity between variables via the Ridge penalty. Note that the L2 (Ridge) penalty provides a strong regularisation for large values, while the L1 (LASSO) penalty forces smaller terms to zero. The elastic net's two hyperparameters, α and λ, provide the ability to either balance or select between the values of these two regularisation terms and to determine the total weight of regularisation to apply, that is, whether and by how much to reduce the impact of or remove remaining features. In contrast to the Boruta algorithm, elastic net models require tuning of their hyperparameters, α and λ. Optimal hyperparameters were selected by exhaustive grid search, testing 25 possible values spread across the space of possible or likely candidate values. Using the R package 'caret' (Kuhn, 2022), we implemented a 10-fold cross-validation (CV) to choose the best set, which was repeated five times with different fold selections to ensure the stability and reliability of the results. Furthermore, we fitted two non-selective models, i.e., a random forest and an unpenalised logistic regression, to compare the generalisability of the selective models with that of these non-selective models. Numeric data were centred and scaled for the logistic regression models to allow for the comparison between regression coefficients.
Table note: For categorical variables, the range refers to the factor levels of the variable. Data were extracted from two time points, i.e., from baseline (0) and from a second time point (2).
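The shadow-attribute idea behind Boruta can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the 'Boruta' package: the importance score is a stand-in (absolute Pearson correlation with the outcome) rather than random forest permutation importance, the final statistical test on the hit counts is omitted, and all names and toy data are hypothetical.

```python
import random
import statistics


def abs_correlation(xs, ys):
    """Stand-in importance score: |Pearson r| between a feature and the label."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def shadow_hits(features, labels, runs=50, seed=0):
    """Count, per feature, how often it beats the best shuffled (shadow) copy."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(runs):
        shadow_scores = []
        for values in features.values():
            shadow = values[:]
            rng.shuffle(shadow)  # shuffling breaks any link with the label
            shadow_scores.append(abs_correlation(shadow, labels))
        best_shadow = max(shadow_scores)
        for name, values in features.items():
            if abs_correlation(values, labels) > best_shadow:
                hits[name] += 1
    return hits


# Toy data: one feature that tracks the label and one pure-noise feature.
rng = random.Random(1)
labels = [i % 2 for i in range(200)]
features = {
    "informative": [y + rng.gauss(0, 0.5) for y in labels],
    "noise": [rng.gauss(0, 1) for _ in labels],
}
hits = shadow_hits(features, labels)
```

A feature that carries real signal should beat the best shadow in nearly every run, whereas a noise feature should do so only occasionally; the full algorithm turns these hit counts into accept or reject decisions through statistical testing.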

Evaluating the feature selection
To assess each feature's contribution to the models' overall performances, we compared their mean variable importance (VI). In the logistic regression models, variable importance was determined as the absolute value of the t-statistic for each model parameter (Kuhn, 2022). In the random forest models, VI was measured as the Mean Decrease Gini. This measure quantifies the total reduction in node impurities, as measured by the Gini index, resulting from splitting on the variable, averaged over all decision trees in the random forest (Han et al., 2016, pp. 219-224).
Partial dependence plots (PDP) were generated to visually examine the marginal effects of the individual features on the predicted outcome in the tree-based models. PDPs assume independence between the feature for which the plot is computed and the other variables.
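As a worked illustration of the impurity reduction that underlies the Mean Decrease Gini, the following Python sketch computes the Gini decrease for a single binary split. It is a minimal example with hypothetical function names, not the 'randomForest' implementation, which accumulates such decreases over all nodes that split on a variable and averages them over trees.

```python
def gini(labels):
    """Gini impurity of a list of binary class labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def gini_decrease(parent, left, right):
    """Impurity reduction achieved by splitting parent into left and right."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


parent = [0, 0, 0, 0, 1, 1, 1, 1]
# A split that separates the classes perfectly removes all impurity.
perfect = gini_decrease(parent, [0, 0, 0, 0], [1, 1, 1, 1])
# A split that leaves both children as mixed as the parent removes none.
useless = gini_decrease(parent, [0, 0, 1, 1], [0, 0, 1, 1])
```

Variables that repeatedly produce splits like the first one accumulate a large importance, while variables that only produce splits like the second stay close to zero.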
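The computation behind a one-dimensional partial dependence curve can be sketched as follows. The model here is a hypothetical linear risk score chosen so that the result is easy to verify by hand; the function names and toy data are assumptions and do not reflect the fitted models of this study.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Hypothetical risk score: rises with feature 0, falls with feature 1."""
    return 0.2 + 0.1 * row[0] - 0.05 * row[1]


rows = [[0, 0], [1, 2], [2, 4]]
curve = partial_dependence(toy_model, rows, feature_index=1, grid=[0, 2, 4])
```

Because every row keeps its observed values for the other features while the feature of interest is overwritten, the curve can include combinations that never occur in the data when features are correlated, which is why the independence assumption matters.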

Evaluating the predictive power
The external validity of the models was established by predicting PEs in an independent test dataset. We applied a 10-fold cross-validation, i.e., the models were trained on nine of the folds and then tested on the remaining data. This process was repeated ten times, with each fold used as the test set exactly once. Because tree-based methods tend to underperform in highly unbalanced datasets (Gu et al., 2008), the training data were balanced by randomly selecting control participants without PEs. Using the base function 'sample' (R Core Team, 2023), the same number of controls as individuals with PEs were drawn from the training data. The training dataset contained 490 participants (245 with PEs). Subsequently, the resulting models were tested on 2296 participants, of whom 20 reported at least one PE.
Random forests were constructed using the features previously selected by the Boruta model in more than half of the CV folds to evaluate the performance of the Boruta algorithm. Balanced accuracy was calculated to identify how often the classifiers were correct at predicting the experience of psychosis, taking the unequal representation of classes into account. Sensitivity, i.e., how well the models detected participants with PEs, and specificity, i.e., how well the models avoided false alarms, were also computed. Root mean square errors (RMSE) were used to evaluate and compare the quality of predictions. In addition, we calculated the area under the receiver operating characteristic curve (AUC) to capture the model's ability to discriminate between the positive, i.e., PEs, and negative classes, i.e., controls, across all possible classification thresholds. The AUC is robust to class imbalance.
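The combination of 10-fold cross-validation with undersampling of controls in the training data can be sketched in Python as below. This is an illustration only: the original analysis used R's 'sample' function, and the fold assignment, seed, function name and toy class ratio here are assumptions.

```python
import random


def balanced_folds(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with controls undersampled in training."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        cases = [j for j in train if labels[j] == 1]
        controls = [j for j in train if labels[j] == 0]
        # Draw as many controls as there are cases; the test fold stays untouched.
        sampled_controls = rng.sample(controls, len(cases))
        yield cases + sampled_controls, test_idx


labels = [1] * 30 + [0] * 970  # toy data with 3% positives
splits = list(balanced_folds(labels))
```

Only the training portion is balanced in this sketch. Each test fold keeps the original class ratio, so performance is evaluated under realistic imbalance.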
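The evaluation metrics named above can be computed from first principles, as in the following Python sketch. The labels, scores and the 0.5 classification threshold are toy assumptions; the AUC is computed through its rank interpretation, i.e., the probability that a randomly chosen case receives a higher score than a randomly chosen control.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and balanced accuracy for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)  # share of cases that were detected
    specificity = tn / (tn + fp)  # share of controls without a false alarm
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def auc(y_true, scores):
    """AUC as the probability that a random case outscores a random control."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Balanced accuracy averages the two class-wise rates, so a classifier that labels everyone as a control scores 0.5 rather than the misleadingly high value that plain accuracy would give in an imbalanced sample.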

Feature selection
The Boruta algorithm selected six features which consistently demonstrated high importance across multiple cross-validation folds. Those features were childhood adversity, social isolation, stressful events, age, living environment and exposure to natural environments.
The elastic net penalised regression model selected 12 variables that exhibited significant importance. All these features were also picked by the Boruta algorithm except for living environment deprivation. In order of decreasing mean variable importance, the elastic net regression selected stressful events, childhood adversity, household income, social isolation, cannabis, age, exposure to natural environments, education deprivation, crime, exposure to blue space, coastal distance, housing and living environment deprivation. The number of features chosen can be mathematically explained by the tuned hyperparameter of α = 0.43 (range 0 to 1, where 0.5 represents equal weights for LASSO and Ridge penalties), which indicates a slightly greater impact of the Ridge penalty term. However, the composite penalty term λ had a value of 0.06 (range from 0 to infinity), suggesting that the overall weight of the regularisation term was low.
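To make the roles of α and λ concrete, the sketch below evaluates an elastic net penalty at the tuned values reported above. The parameterisation, λ[α‖β‖₁ + (1 − α)/2 · ‖β‖₂²], is the one commonly used by glmnet-style implementations and is an assumption here, since the text does not state the exact form; the coefficient vector is hypothetical.

```python
def elastic_net_penalty(coefs, alpha, lam):
    """Penalty lam * (alpha * L1 + (1 - alpha) / 2 * squared L2) on the coefficients."""
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)


coefs = [0.5, -0.2, 0.0]  # hypothetical standardised coefficients
penalty = elastic_net_penalty(coefs, alpha=0.43, lam=0.06)     # tuned values from the text
lasso_only = elastic_net_penalty(coefs, alpha=1.0, lam=0.06)   # pure LASSO
ridge_only = elastic_net_penalty(coefs, alpha=0.0, lam=0.06)   # pure Ridge
```

With α below 0.5, the mixing weight leans towards the Ridge term, so fewer coefficients are forced exactly to zero than under a pure LASSO, and the small λ keeps the overall shrinkage modest.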
When considering the combined information from both feature selection models, childhood adversity and stressful life events emerged as the two most robust predictors of PEs, followed by social isolation and age. These variables were selected in 80% to 100% of the cross-validation folds. Exposure to natural environments was repeatedly selected by both feature selection models but was only chosen 60% of the time and only reached relatively high variable importance in the Boruta model (VI = 2.6) compared to, for example, childhood adversity (VI = 10.75) and social isolation (VI = 7.72). In contrast, household income showed the third highest variable importance in the elastic net but was not selected by the Boruta algorithm. Education deprivation, crime and living environment deprivation showed feature importances in the Boruta algorithm comparable to that of exposure to natural environments but were not reliably selected (from VI = 2.58 to VI = 2.76). Standardised variable importances of the features selected by the two feature selection models are visualised in Fig. 1 and listed in Table 3. A colour-coded version of the table, including the results of the random forest and logistic regression, is provided in the Supplementary Materials (Table 1). A complete list of the unstandardised variable importances of the Boruta algorithm and regression coefficients of the elastic net regression of all features, including individual factor levels, can be found in Table 4.
Based on the analysis of the partial dependence plots and the regression coefficients obtained from the logistic regression models (detailed in Fig. 5 in the Supplementary Materials), the following directions of effects were found: Childhood trauma exhibited a positive trend, i.e., the more trauma was experienced, the higher the probability that the model would classify a participant as having psychotic experiences. A similar trend was observed for stressful events and social isolation. Moreover, the Boruta model yielded predictions of a higher risk of PE before age 65, followed by a sudden decrease thereafter. The predicted probability of PE displayed a non-monotonic reduction with increasing exposure to natural environments and decreasing levels of living environment deprivation. Further examination of two-dimensional partial dependence plots derived from the Boruta model revealed that before age 65, a deprived living environment was associated with an increased predicted risk of psychotic experiences. Conversely, low exposure to natural environments was linked to an elevated probability of psychotic experiences across all ages (Fig. 6 in the Supplementary Materials). Additionally, the Boruta model predicted a high likelihood of PE for low exposure to natural environments irrespective of the level of living environment deprivation. The model further suggests that social isolation and the experience of stressful events attenuate the negative impact of a deprived living environment on the predicted risk of PE (Fig. 7 in the Supplementary Materials).

Predictive power
The models achieved prediction accuracies and AUCs between 0.62 and 0.64, i.e., above chance level but indicating only low to moderate accuracy. The models correctly identified individuals with PEs 63.9% to 79.17% of the time. However, they showed a limited capacity to avoid false alarms, i.e., misclassifying individuals without PEs, as evidenced by specificity values between 40.5% and 63%. These values per model can be found in Table 5. With only six selected features, Boruta produced the most parsimonious and sparse model (depicted in Fig. 5 of the Supplementary Materials).

Discussion
This study aimed to predict psychotic experiences (PEs) by identifying the key variables characterising an individual's urban environment. PEs seem to be associated with and potentially arise from an accumulation of social and environmental stressors. We identified five variables that consistently emerged as important across feature selection methods and dataset splits, i.e., childhood adversity, stressful events in the past two years, social isolation, age and exposure to natural environments. Population density was not selected as a predictor of PEs.
In our analyses, we achieved a moderate level of predictive ability for PEs in independent datasets, which aligns with the well-established understanding that psychosis is influenced by both genetic and environmental factors, with 30% to 50% explained by genetics and 50% to 70% by the environment (Ronald, 2015; Zavos et al., 2014). This study's most novel and noteworthy finding is the evidence supporting the protective role of exposure to natural environments against PEs. Natural environment refers to surroundings comprising green and blue spaces with little or no apparent evidence of human presence or intervention. In our sample, exposure to natural environments close to one's home address was highly correlated with exposure to greenspace (r = 0.98) while being only weakly associated with blue space (r = 0.24). Thus, the measure of the natural environment used in this study primarily seems to capture the presence of green natural environments. Our findings support the work of Boers et al. (2018), who reported significantly lower amounts of residential green but not blue space in psychotic patients compared to a subclinical population. This highlights the importance of considering the quantity and quality of green spaces when designing residential areas.
Natural environments have high restorative and salutogenic properties, which can re-calibrate psychological and physiological systems, including improving immune function, decreasing blood pressure, facilitating the restoration of attention and recovery from fatigue, improving mood and reducing depressive symptoms (Amoly et al., 2014; Berman et al., 2008; Bratman et al., 2012; Ebisu et al., 2016; Fong et al., 2018; Hartig et al., 2014; James et al., 2015; R. Kaplan & Kaplan, 1989; Kuo, 2015; Maas et al., 2009; Markevych et al., 2017; Preuß et al., 2019; G. A. Rook, 2013 1991). Several mechanisms have been proposed through which exposure to nature may benefit mental and physical health. The attention restoration theory (ART) suggests that natural environments can help to restore cognitive functioning (R. Kaplan & Kaplan, 1989; S. Kaplan, 1995; S. Kaplan & Talbot, 1983; Ohly et al., 2016). According to the stress reduction theory (SRT), being in nature can facilitate recovery from psychophysiological stress and promote relaxation (Ulrich, 1983; Ulrich et al., 1991; Ward Thompson et al., 2012). Additionally, exposure to natural environments can encourage healthful activities, such as physical activity and socialising (James et al., 2015; Kaczynski & Henderson, 2007; Maas et al., 2008; Markevych et al., 2017; Twohig-Bennett & Jones, 2018). The causality of this effect has been established through intervention studies. For instance, participants who engaged in walks through a natural environment showed decreased levels of rumination and reduced neural activity in the amygdala, a brain region associated with the risk for mental illness and stress, compared to those who took a walk in an urban environment (Bratman et al., 2015; Sudimac et al., 2022). Genes, as the alternative player, may influence psychosis risk by altering the sensitivity to (gene-environment interaction) or by increasing the exposure to certain environments (gene-environment correlation). The evidence speaks for the latter in schizophrenia (Engemann et al., 2020) and psychosis patients (Boers et al., 2018). Furthermore, living environment deprivation, a composite measure describing the quality of the outdoor and indoor living environment, including air quality and housing conditions, was repeatedly selected by the Boruta model. Previous studies have established these environmental attributes as contributors to psychosis (Attademo et al., 2017; V. A. Morgan et al., 2012, 2016; Newbury et al., 2019; Nowak et al., 2006). However, air pollution by itself was not deemed important in the present analyses.
Table note: This table lists the features' mean variable importance, along with their standard deviations, and selection frequency across the cross-validation (CV). Only features selected by either of the selective algorithms in more than half of the CV folds are included. The variables are ranked based on their mean predictive importance as calculated by the two feature selection algorithms. To facilitate comparability between the models, the feature importance was scaled. Note that for logistic regressions, categorical variables' levels have been summarised into a single feature to ensure comparability. a Feature was selected in more than half of the CV folds. b Values are averaged across levels of this variable. Information about the individual levels of these variables is provided in the Supplementary Materials.
Table note: The number of selections refers to the frequency with which each variable was selected across cross-validation folds. If a variable was chosen in half or less than half of the cross-validation folds, it was rejected, as listed in the columns "Decision". Variable importances and regression coefficients were averaged across cross-validation folds. Note that due to the high intercorrelations of the predictors, the logistic regression's assumptions are violated, and it was only calculated for comparison. RMSE = root mean square error, AUC = area under the receiver operating characteristic curve.
Nonetheless, the most predictive variables of PEs in the present sample captured individual-specific social stressors. Both types of stressors, i.e., during childhood and in later life, have already been extensively discussed in the literature (Daalman, 2012; Linscott & Os, 2012; Matheson et al., 2012; C. Morgan & Fisher, 2006; Nierop et al., 2014; Reeves et al., 2014; Schäfer & Fisher, 2011; Trotta et al., 2015; Varese et al., 2012). Moreover, the present results add evidence to the dose-response association between childhood traumatic experiences and PEs (Coughlan & Cannon, 2017; Cunningham et al., 2015; Dam et al., 2012; Shevlin et al., 2010; Trotta et al., 2015; Varese et al., 2012), aligning with the vulnerability-stress model or diathesis-stress model. This model states that susceptibility to psychosis results from an interaction of (epi)genetic risk factors and early social stress (Guloksuz et al., 2019; Monroe & Simons, 1991; Zubin & Spring, 1977). Moreover, stressful life events, regardless of the time at which they occur, can lead to abnormal activity of the hypothalamic-pituitary-adrenal (HPA) axis, triggering a cascade of neurophysiological processes, dysfunction of dopaminergic neurotransmission and neural circuits relevant for facilitating onset, exacerbation and relapse of schizophrenia and psychosis (Collip et al., 2007; Corcoran et al., 2012; Faravelli et al., 2010; Grace, 2016; Howes et al., 2004; Howes & Kapur, 2009; Winkel et al., 2008; Yuii et al., 2007). However, the presence of greenspace may attenuate these detrimental health impacts of stressful life events (van den Berg et al., 2010). Additionally, the impact of stressors, as well as recovery from psychosis, may also be mitigated by social support (Heinrichs et al., 2003; Lim et al., 2013; Temesgen et al., 2018). Previous research has demonstrated that social isolation, particularly long-lasting isolation, renders the individual more vulnerable to psychotic disorders (Appiah-Kusi et al., 2017; Badcock et al., 2015; Kirkbride et al., 2007a; Kirkbride et al., 2007c; Morgan et al., 2016; Stain et al., 2012) and mental health issues in general (Cohen, McGowan, Fooskas, & Rose, 1984; S. Cohen & Wills, 1985; House, 2001; Leavy, 1983). Our study found evidence linking social isolation to PEs in accordance with these findings. This may support the social defeat hypothesis, which posits that long-term social defeat, i.e., the feeling of being excluded from the majority group or social exclusion, can lead to sensitisation of the mesolimbic dopamine system and increase the risk for psychotic disorders (Selten & Cantor-Graae, 2005, 2007; Selten et al., 2013).
The present study also identified an association between age and the existence of PEs. Younger age has been previously reported in the literature as a significant demographic risk factor of PEs (Brandizzi et al., 2014; Linscott & Os, 2012; Loch et al., 2017) and psychosis (Kirkbride et al., 2008; Os et al., 2008a), and the negative trend found by the elastic net regression is consistent with these findings. However, the examination of the PDPs suggests a possible period of higher sensitivity in the early 60s and before the age of 65 in general (see Fig. 5 in Supplementary Materials). Although such examination is exploratory, this finding suggests that future work should apply time-varying models capable of detecting multiple sensitive periods (e.g., a TVEM; Shiyko et al., 2014).
Lastly, our results suggest a weak link between PEs and specific forms of social deprivation. On an individual-specific level, our findings indicated that lower household income is associated with an increased risk of reporting PEs, aligning with previous work (Burns & Esterhuizen, 2008; Karcher et al., 2021; Linscott & Os, 2012; Loch et al., 2017; Mamah et al., 2021; V. A. Morgan et al., 2016; Stilo et al., 2016). On a neighbourhood level, the results revealed a non-linear increase in PE risk among participants whose neighbourhoods were characterised by deprived living environments, attenuated crime rates and education deprivation. Multiple studies have linked the elevated risk of psychosis and schizophrenia to social deprivation in general (Anderson et al., 2012; Bhavsar et al., 2014; Croudace et al., 2000; Hardoon et al., 2013; Kirkbride et al., 2007a; Kirkbride et al., 2012; Kirkbride et al., 2007b; Kirkbride et al., 2007c; Linscott & Os, 2012; Omer et al., 2014; Veling et al., 2014), education deprivation (Bhavsar et al., 2014) and crime rates (Bhavsar et al., 2014). Interestingly, when participants come from socioeconomically deprived neighbourhoods or have low education levels, the association between the presence of green spaces in the living environment and health outcomes is particularly pronounced (Twohig-Bennett & Jones, 2018).
No link between air pollution and PEs was captured with the present analysis and study design, in accordance with previous research (Braithwaite et al., 2019). A potential explanation for the lack of findings may be the time of exposure: research has shown that individuals are particularly vulnerable to air pollution early in life. For example, Bradley et al. (2023) found a link between early life exposure to air pollution and psychotic-like experiences. In contrast, air pollution in the present analysis was captured only a few years before the measurement of PE risk. Future studies may instead focus on the early-life window of exposure.
Furthermore, ethnic background was not deemed important in the present analyses, in line with previous research (see meta-analysis: Leaune et al., 2018). A meta-analysis by Tortelli et al. (2018) indicated that accounting for life and social adversity events mitigated the impact of ethnicity on subclinical psychosis prevalence rates. In our analysis, the variable importances of childhood adversity and current life stress may have decreased the effect of ethnicity. Arguably more crucially, the present operationalisation of ethnic background disregarded diversity and personal experiences tied to ethnicity. Future studies should focus on variables such as perceived personal discrimination and racism, ethnic identity, migration status and own-group density (Bouhaddani et al., 2019; Das-Munshi et al., 2012; Leaune et al., 2018).
Exposure to greenspace was infrequently selected despite its high correlation with the frequently chosen variable describing exposure to natural environments. In the elastic net model, the L1 regularisation may have dominated the L2 penalty, so that only one of the two variables was selected and the coefficient of its correlated counterpart was set to zero. In contrast, Boruta follows an all-relevant variable selection method, yet, surprisingly, only one of the two variables was chosen. This may happen due to inherent randomness in the model produced by the shadow features and the random forest variability. Additionally, Boruta considers both main effects and interactions between features. If the two correlated variables are highly redundant in terms of predictive power, the algorithm might choose one over the other based on its internal criteria.
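The elastic net behaviour described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: it assumes a squared-error loss instead of the penalised logistic model, simulated data, and the extreme case of a predictor that is duplicated exactly. The function names, the penalty weight `lam` and the mixing weight `alpha` are hypothetical choices made for illustration only.

```python
import random

def centre(v):
    """Subtract the mean so that no intercept is needed."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def soft_threshold(z, t):
    """Soft-thresholding operator that produces exact zeros under an L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def elastic_net_cd(columns, y, lam, alpha, sweeps=500):
    """Coordinate descent for the (assumed) objective
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2)."""
    n = len(y)
    b = [0.0] * len(columns)
    resid = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Covariance of column j with the residual that excludes its own fit.
            rho = sum(c * (r + c * b[j]) for c, r in zip(col, resid)) / n
            norm = sum(c * c for c in col) / n
            new_bj = soft_threshold(rho, lam * alpha) / (norm + lam * (1 - alpha))
            delta = new_bj - b[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                b[j] = new_bj
    return b

# Simulated toy data (not study data): one predictor, duplicated exactly.
rng = random.Random(0)
x = centre([rng.gauss(0, 1) for _ in range(200)])
y = centre([xi + 0.5 * rng.gauss(0, 1) for xi in x])
columns = [x, list(x)]

lasso_like = elastic_net_cd(columns, y, lam=0.1, alpha=1.0)  # pure L1 penalty
mixed = elastic_net_cd(columns, y, lam=0.1, alpha=0.2)       # L2-heavy mix

print("L1 only :", lasso_like)
print("L2-heavy:", mixed)
```

With the pure L1 penalty, the first copy of the predictor absorbs the whole effect and the second copy stays at zero, whereas the L2-heavy mix shares the effect equally between the two copies. Real predictors are correlated rather than identical, so the contrast is less clear-cut in practice.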
Likewise, exposure to domestic gardens was also not chosen. Although several theories cover why and how urban green may affect mental and physical health, the most commonly proposed pathways are reducing harm by alleviating heat, noise and air pollution, relieving mental and physiological stress and promoting healthful activities, e.g., exercise and socialising (Markevych et al., 2017). Domestic gardens may not facilitate interpersonal interactions and physical activity to the extent observed in natural environments. However, future studies ought to investigate this further.
In summary, this study suggests that the urbanicity effect for psychotic experiences represents a proxy of individual-specific features of the social and physical environment. Individuals with a high risk of PEs encounter social adversities and present with deficient tools to handle these stressors effectively, including a lack of social support.

Limitations
This study was not without limitations. Firstly, due to constraints of data availability, we could not control for the amount of time participants spent at their homes and in their neighbourhoods. Additionally, the data on the physical environment were collected several years before the mental health assessment. Thus, the present analyses were conducted under the assumption that the characteristics of the environment did not undergo drastic changes over time. Although our two approaches capture both linear (elastic net regression) and non-linear (Boruta) relationships with the outcome, some types of relationships may not be identified as clearly by these approaches. For example, it may be the case that a more complex combination of factors is required to see a strong effect of education deprivation or that crime rates matter to a small enough fraction of the population that their effects are difficult to see. Similarly, our decision to consider Likert-scale items as categorical may have influenced the selection of variables, especially for the elastic net regression. Future investigation into the effects of these factors may provide additional information. Furthermore, the generalisability of the present findings is somewhat limited since the UK Biobank sample consisted primarily of older individuals, with the youngest participant being 45 years old. Nevertheless, the number of extensive datasets that encompass numerous variables describing the physical environment and capture psychotic experiences is extremely limited. To address this, we currently plan to investigate subclinical psychosis and paranoia in twins. The sample was also subject to a volunteer bias, which may have excluded individuals with more severe forms of PEs. To measure the environmental impact, we only included participants with long-term residence in the final analyses, thereby introducing further bias in the sample. Most importantly, we cannot entirely dismiss that the choice of residential location may be confounded with genetics, i.e., via selective migration, or some other selection criteria. Further, this analysis does not directly test causation; individuals with pre-existing mental disorders might tend to move to a lower socioeconomic status and cluster in urban areas, i.e., social drift hypothesis (Colodro-Conde et al., 2018; Saraceno & Barbui, 1997; Sariaslan et al., 2016). Conversely, individuals with better psychological health may actively seek out residential environments that offer greater exposure to natural surroundings. However, multiple studies have provided evidence against these theories (Krabbendam, 2005; Os et al., 2000; Paksarian et al., 2017). Lastly, no objective measurements of PEs or clinical validation of psychosis were included in the UK Biobank data.

Conclusions
Despite the limitations of this study, we argue that these results have clinical implications. PEs have adverse health outcomes, such as suicidal behaviour (Bromet et al., 2017; DeVylder et al., 2015; Narita, Wilcox, & DeVylder, 2020), psychiatric comorbidities (DeVylder et al., 2014a; Kelleher et al., 2012; J. J. McGrath et al., 2016) and greater need for psychiatric intervention (DeVylder et al., 2014b; Murphy et al., 2010). Individuals with PEs have a 4-fold increased risk of psychotic disorder and a 3-fold risk for mental disorders (Healy et al., 2019). This makes early intervention critical. Encouraging individuals at risk for or with PEs to increase their time spent in natural environments, ideally with other people, may represent a cost-effective, easily administered prevention and intervention strategy with minimal, if any, side effects (Van den Berg, 2017). Taking this one step further, if policymakers advocate for readily accessible, high-quality natural environments, we may prevent the progression to psychosis in individuals with, or at high risk of, psychotic experiences without active interventions.

Funding
UK Biobank was established by the Wellcome Trust medical charity, the Medical Research Council, the Department of Health, the Scottish Government and the Northwest Regional Development Agency. It has also had funding from the Welsh Government, the British Heart Foundation, Cancer Research UK and Diabetes UK. UK Biobank is supported by the National Health Service. All participants provided written informed consent. Ethical approval of the UK Biobank study was given by the North West Multicentre Research Ethics Committee, the National Information Governance Board for Health & Social Care, and the Community Health Index Advisory Group.
During the work on her dissertation, MB was a pre-doctoral fellow of the International Max Planck Research School on Computational Methods in Psychiatry and Ageing Research (COMP2PSYCH, https://www.mps-ucl-centre.mpg.de/comp2psych; participating institutions: Max Planck Institute for Human Development, University College London).
The work has been partially funded by the European Union (ERC-2022-CoG-BrainScape-101086188 awarded to SK).Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency (ERCEA).Neither the European Union nor the granting authority can be held responsible for them.
During the work on this project, TB was partially funded by the Pennsylvania State University's Institute for Computational and Data Sciences.

Declaration of generative AI and AI-assisted technologies in the writing process
During the preparation of this work, the author(s) used ChatGPT in order to improve language and readability. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Table 1
Demographic information in cases and controls in the entire dataset.

Table 3
Feature importance and selection frequency.

Table 4
Variable importance and regression coefficients of all candidate variables.

Table 5
Mean predictive ability of the different model approaches.