Nested species distribution models of Chlamydiales in tick host Ixodes ricinus in Switzerland

The tick Ixodes ricinus is the vector of various pathogens, including Chlamydiales bacteria, potentially causing respiratory infections. In this study, we modelled the spatial distribution of I. ricinus and associated Chlamydiales over Switzerland from 2009 to 2019. We used a total of 2293 ticks and 186 Chlamydiales occurrences provided by a Swiss Army field campaign, a collaborative smartphone application and a prospective campaign. For each tick location, we retrieved from Swiss federal datasets the environmental factors reflecting the topography, climate and land cover. We then used the Maxent modelling technique to estimate the suitability for I. ricinus and to subsequently build the nested niche of Chlamydiales bacteria. Results indicate that I. ricinus high habitat suitability is determined by higher temperature and vegetation index (NDVI) values, lower temperature during driest months and a higher percentage of artificial and forests areas. The performance of the model was increased when extracting the environmental variables for a 100 m-radius buffer around the sampling points and when considering the data over the two years previous sampling date. For Chlamydiales bacteria, the suitability was favoured by lower percentage of artificial surfaces, driest conditions, high precipitation during coldest months and short distances to wetlands. From 2009 to 2018, we observed an extension of tick and Chlamydiales suitable areas, associated with a shift towards higher altitude. The importance to consider spatio-temporal variations of the environmental conditions for obtaining better prediction was also demonstrated. Importance Ixodes ricinus is the vector of pathogens, including the agent of Lyme disease, the tick borne encephalitis virus and the less known Chlamydiales bacteria at the origin of some respiratory infections. In this study, we identified the environmental factors influencing the presence of I. ricinus and Chlamydiales in Switzerland and generated maps of their distribution from 2009 to 2018. We found an important expansion of suitable areas for both the tick and the bacteria during the last decade. Results provided also the environmental factors that determine the presence of Chlamydiales within ticks. Distribution maps as generated here are expected to bring valuable informations for decision-makers to control tick-borne diseases in Switzerland and establish prevention campaigns. The methodological framework presented could be used to predict the distribution and spread of other host-pathogen couples, to identify environmental factors driving their distribution and to develop control or prevention strategies accordingly.

2008) and we use the function "Proximity" in the QGIS 2.14.7 software (QGIS Development 197 Team, 2016) to derive four indices characterising the minimal Euclidean distance to watery 198 areas: distance to wetland, to watercourses, to stagnant water and to any watery elements. 199 Thirdly, we retrieved the 16-days composite Normalised Difference Vegetation Index (NDVI) values at a 1km-resolution were obtained from MeteoSwiss. From these datasets, we 218 estimated the daily saturated and ambient vapour pressure using the Tetens formula (Murray, -10-1966) and by approximating the temperature at dew point by the minimum temperature 220 (Running et al., 1987). We used them to compute the daily relative humidity and to derive 22 221 indicators summarising the monthly (9 indicators) and daily (13 indicators) values of relative 222 humidity. All these climatic predictors were computed in R, with the detailed procedure 223 presented in Supp. File 3. In total, this resulted in 77 environmental indicators, each of which 224 were resampled to a final spatial resolution of 100 m. 225 Data extraction 226 The values of the 77 environmental predictors were extracted for each data point site (tick 227 occurrence) according to their coordinates using the function "extract" from the R "raster" 228 package. The climatic and NDVI variables were retrieved as a function of the sampling dates. 229 To assess the influence of the conditions before sampling, we retrieved these variables for 1 230 month, 3 months, 6 months, 1 year, 2 years and 3 years before sampling date. For the other 231 stable predictors such as morphometric predictors, land cover type, percentage of coniferous 232 in forest and distances to watery areas one single extraction was used for all sampling dates 233 over the period of analysis (from 2009 to 2019).

234
To assess the influence of the environmental conditions surrounding the sampling points, for 235 each environmental predictor we also computed the mean value observed in square buffers 236 centred on the sampling point, with radius of 100 m, 200 m, 500 m, 700 m, 1 km and 1.5 km.

237
Raster layers were also computed for each of these indicators, with every buffer radius and 238 time period, for June months from 2009 to 2019. For each pixel, the computation of mean 239 values considering a square buffer around the pixel was done with a moving-window 240 procedure implemented in R, based on the "focal" function from the "raster" package.

241
Finally, we also extracted all predictors for a randomly generated data set (to test it against 242 sampling data, see hereafter). This generated data set is composed by sites with 10'000 coordinates randomly localised in Switzerland, for which dates were selected randomly within 244 the distribution of observed sampling dates (Supp. File 4).

Selection of environmental variables 247
The species distribution models were successively derived using the variables extracted for 248 each combination of buffer radius (100 m, 200 m, 500 m, 700 m, 1 km and 1.5 km) and time 249 period (1 month, 3 months, 6 months, 1 year, 2 years and 3 years). In addition, to select the 250 most significant combination of buffer radius and time period individually for each variable, 251 we performed a Student T-test to identify the variables that best discriminate the tick's 252 presences from random points. The computation was done using the function "t.test" in R and 253 variables were considered as significant if the p-value of the T-test was lower than 0.01 after a 254 Bonferroni correction for multiple comparisons. For each variable, we then kept only the 255 combination of buffer radius and time period showing the highest T-value. A "combination" 256 model was then derived using this "combination" set of variable.

257
As some environmental variables considered might be correlated, we used two methods to 258 pre-select uncorrelated environmental predictors. In the first one, we run a Principal 259 Component Analysis (PCA) on the variables to retrieved independent components. The with 3751 presence points were kept as an independent dataset used to test the models.

277
Since the performance of the Maxent models is known to be influenced notably by the considered either all components of the PCA or only the components needed to retain 50% of 288 the variance, resp. 70%, 80%, 90% or 95%. For the feature types, we tested the use of linear 289 features only, or the combination of linear and product, linear and quadratic or linear, product and quadratic together. Finally, we varied the regularisation constant parameters with values 291 equal to 1, 2, 5 or 10.

292
In order to perform a cross-validation procedure, we used 75% of the occurrences and 293 background points to train the model and kept 25% to test it. The training and testing 294 occurrences were selected randomly and 20 different runs were computed. All models were 295 projected using the "cloglog" scaled output (Phillips, 2017), interpreted in terms of suitability 296 index to avoid making assumptions regarding the prevalence of the species.

Model evaluation 298
The models were compared based on four criteria. First the Area under the Receiver

299
Operating curve (AUC) (Fielding and Bell, 1997) was computed on the testing dataset. The   The same procedure was then applied as for the modelling of the tick's suitability: 1) 346 computation of a T-test to select a "combination" dataset of environmental variables, 2) 347 selection of uncorrelated variables with either a PCA or a correlation/VIF procedure, 3) run of 348 Maxent models by testing various parameters (method to select uncorrelated variables, feature 349 types and regularisation parameters). In order to build models for the suitability of Chlamydiales within areas suitable for ticks, the predicted suitability for Chlamydiales 351 obtained by the Maxent model was then multiplied by the suitability obtained for I. ricinus. 352 As for I. ricinus, twenty runs were computed for each model, using 75% of the data to train 353 the model and 25% to test it. The ranking procedure used to evaluate the models was slightly Chlamydiales were not identified as compared to occurrence sites would therefore be 362 considered as more performant.

Best model 366
Among the 56 models tested with various parameters, the best one, according to the ranking 367 procedure, was obtained with the following parameters: 1) background points selected below 368 1500 m in altitude (corresponding to 6049/10 000 points), 2) a PCA procedure to avoid  the Swiss territory is predicted as favourable for Chlamydiales bacteria (using the threshold 505 maximising the sum of sensitivity and specificity). As the niche of the bacteria is nested 506 within the niche of the tick, modelling Chlamydiales bacteria suitability involved a 507 multiplication by the suitability results for Ixodes ricinus. Therefore, the areas predicted to be 508 unfavourable for the presence of the tick species are also predicted as weakly suitable for 509 Chlamydiales. On the contrary, some areas predicted to be highly favourable for the presence 510 of Ixodes ricinus on Figure 3 did not match and showed very low values on Figure 6. This is the case for the areas situated within urban settlements, in which a large portion was predicted 512 to be suitable for ticks but not for Chlamydiales. Indeed, the distribution of the favourable 513 areas within the various categories of land cover classes indicates that they are essentially 514 observed in natural areas, covered either by tree (74%) or grass (12%) vegetation, and only 515 4% of them are observed in regions characterised by a large portion of artificial elements.

516
When considering the altitudinal distribution, areas favourable for Chlamydiales seem to be 517 essentially predicted in forest suitable for ticks, between 500 and 1000 m in altitude.

518
However, due to other factors influencing the model, notably the climatic conditions, 52% of 519 those forests are also predicted to be unfavourable for the bacteria.  536 Distribution maps for ticks and bacteria from 2009 to 2019 highlighted an extension of the 537 suitable areas for both species and a spread towards higher altitude. Ixodes ricinus expended 538 from 16% to 25% of the Swiss territory, and a subsequent extension for Chlamydiales bacteria 539 is observed from 8% to 9.3%. Ixodes ricinus expansion occurred all over the Swiss Plateau 540 and toward higher altitude in the alpine valleys and was more extended in the South-West.

541
Newly available habitat concerned mostly grass and forest areas. Extension of Chlamydiales 542 followed similar trends, restricted to forest areas. As Ixodes ricinus presence is favoured by 543 higher temperature, we might expect that, in the future, this expansion might continue