Skip to main content
Advertisement
  • Loading metrics

Machine-learning model led design to experimentally test species thermal limits: The case of kissing bugs (Triatominae)

  • Jorge E. Rabinovich ,

    Contributed equally to this work with: Jorge E. Rabinovich, Nicholas M. Fountain-Jones

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    jorge.rabinovich@gmail.com

    Affiliation Centro de Estudios Parasitológicos y de Vectores (CEPAVE CONICET-CCT La Plata, UNLP), National University of La Plata, La Plata, Argentina

  • Agustín Alvarez Costa ,

    Roles Formal analysis, Methodology, Software, Writing – review & editing

    ‡ AC, IM and PS also contributed equally to this work.

    Affiliations Laboratorio de Ecofisiología de Insectos, Departamento de Biodiversidad y Biología Experimental, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina, Instituto de Biodiversidad y Biología Experimental y Aplicada (IBBEA), CONICET–Universidad de Buenos Aires, Buenos Aires, Argentina

  • Ignacio J. Muñoz ,

    Roles Formal analysis, Writing – review & editing

    ‡ AC, IM and PS also contributed equally to this work.

    Affiliations Laboratorio de Ecofisiología de Insectos, Departamento de Biodiversidad y Biología Experimental, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina, Instituto de Biodiversidad y Biología Experimental y Aplicada (IBBEA), CONICET–Universidad de Buenos Aires, Buenos Aires, Argentina

  • Pablo E. Schilman ,

    Roles Formal analysis, Funding acquisition, Methodology, Resources, Writing – review & editing

    ‡ AC, IM and PS also contributed equally to this work.

    Affiliations Laboratorio de Ecofisiología de Insectos, Departamento de Biodiversidad y Biología Experimental, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina, Instituto de Biodiversidad y Biología Experimental y Aplicada (IBBEA), CONICET–Universidad de Buenos Aires, Buenos Aires, Argentina

  • Nicholas M. Fountain-Jones

    Contributed equally to this work with: Jorge E. Rabinovich, Nicholas M. Fountain-Jones

    Roles Conceptualization, Formal analysis, Methodology, Software, Visualization, Writing – review & editing

    Affiliation School of Natural Sciences, University of Tasmania, Hobart, Australia

Abstract

Species Distribution Modelling (SDM) determines habitat suitability of a species across geographic areas using macro-climatic variables; however, micro-habitats can buffer or exacerbate the influence of macro-climatic variables, requiring links between physiology and species persistence. Experimental approaches linking species physiology to micro-climate are complex, time consuming and expensive. E.g., what combination of exposure time and temperature is important for a species thermal tolerance is difficult to judge a priori. We tackled this problem using an active learning approach that utilized machine learning methods to guide thermal tolerance experimental design for three kissing-bug species: Triatoma infestans, Rhodnius prolixus, and Panstrongylus megistus (Hemiptera: Reduviidae: Triatominae), vectors of the parasite causing Chagas disease. As with other pathogen vectors, triatomines are well known to utilize micro-habitats and the associated shift in microclimate to enhance survival. Using a limited literature-collected dataset, our approach showed that temperature followed by exposure time were the strongest predictors of mortality; species played a minor role, and life stage was the least important. Further, we identified complex but biologically plausible nonlinear interactions between temperature and exposure time in shaping mortality, together setting the potential thermal limits of triatomines. The results from this data led to the design of new experiments with laboratory results that produced novel insights of the effects of temperature and exposure for the triatomines. These results, in turn, can be used to better model micro-climatic envelope for the species. Here we demonstrate the power of an active learning approach to explore experimental space to design laboratory studies testing species thermal limits. Our analytical pipeline can be easily adapted to other systems and we provide code to allow practitioners to perform similar analyses. Not only does our approach have the potential to save time and money: it can also increase our understanding of the links between species physiology and climate, a topic of increasing ecological importance.

Author summary

Species Distribution Modelling determines habitat suitability of a species across geographic areas using macro-climatic variables; however, micro-habitats can buffer or exacerbate the influence of macro-climatic variables, requiring links between physiology and species persistence. We tackled the problem of the combination of exposure time and temperature (a combination difficult to judge a priori) in determining species thermal tolerance, using an active learning approach that utilized machine learning methods to guide thermal tolerance experimental design for three kissing-bug species (Triatoma infestans, Rhodnius prolixus, and Panstrongylus megistus), vectors of the parasite causing Chagas disease. These bugs are found in micro-habitats with associated shifts in microclimate to enhance survival. Using a limited literature-collected dataset, we showed that temperature followed by exposure time were the strongest predictors of mortality, that species played a minor role, that life stage was the least important, and a complex nonlinear interaction between temperature and exposure time in shaping mortality of kissing bugs. These results led to the design of new laboratory experiments to assess the effects of temperature and exposure for the triatomines. These results can be used to better model micro-climatic envelope for species. Our active learning approach to explore experimental space to design laboratory studies can also be applied to other environmental conditions or species.

Introduction

The main environmental requirements for any organism to be in thermodynamic equilibrium over a reasonable length of time in order to survive are well known for more than a half century [1]. Of these, radiation absorbed, wind speed, and air temperature are physiological requirements for a certain body temperature or temperature range, and are referred to as the ‘climate space’, and constitute the conditions which animals must fulfil in order to survive [2]. Much effort has gone into applying Species Distribution Modelling (SDM) to model climate space across different geographic areas to determine habitat suitability, including species that are disease vectors [39]. In general, the SDM methodology utilizes macro-climatic variables to predict the distribution of a species; however, for many disease vector species natural and human-made micro-habitats can buffer or exacerbate the influence of the macro-climatic variables [10,11]. While these models increasingly harness high resolution climatic data down to ~1km2 (e.g., WorldClim 2 [12]), the links between physiology, disease transmission and vector survival to micro-climate are opaque. Consequently, the use of these climatic variables leads to some caveats on the reliability of the suitability estimates produced by the SDM. However, accurately quantifying micro-climatic relationships and tolerances pose significant problems for most disease vector species.

Mathematical models have helped predict operative temperatures but also show how evaporative cooling and metabolic heating might cause the body temperature of an organism to deviate from the operative temperature [13]. However, despite mathematical models have provided important insights into what factors influence operative temperatures, they are impractical for mapping thermal environments at a sufficient resolution to understand selective pressures on behavior and physiology [14,15]. This is because to compute operative temperatures from a mathematical model many variables must be known (solar radiation, ground reflectivity, air temperature, ground temperature, and wind speed), which represents an overwhelming task for a large number of locations as needed for fine-scale mapping [15]. Operative temperatures have been computed for a limited number of microclimates, such as full sun and full shade, and for a few animal species [1618], but a general methodology is still needed.

Because of their use of a variety of micro-habitats, the trouble of using macroclimatic variables to predict vector species distributions is particularly serious [19]. This is true for the kissing bugs (Hemiptera: Reduviidae: Triatominae), a group of species vectors of Trypanosoma cruzi, the parasite causing Chagas disease in Latin America. This disease is endemic in 21 countries, and it is estimated that affects between 6 and 8 million individuals (with 25 million people at risk of infection), resulting in approximately 12,000 deaths per year [20,21]. The triatomines are a subfamily that comprise around 150 species grouped in 18 genera and six tribes [22]. A commonly observed behavior of most of these species is to enter domestic and peri-domestic structures (“intrusion”), with some of them trying to colonize the human habitat (“domiciliation”), making the sylvatic species a possible source of infection by T. cruzi [23]. Due to their generally nocturnal habits and hidden refuges, they are not only inconspicuous and hard to collect in the field, but also endure extreme values of macro-climatic variables that would not allow them to survive without the use of micro-habitats [24].

For triatomines, as with other disease vector species, there is very limited information available to adequately link insect physiology to micro-climatic variables. Much more experimental work is necessary, but those experiments are laborious, costly, and demanding a large number of insects. Additionally, the experimental design is a complex one for, in general, the micro-climatic variables (e.g., temperature) need to be included with at least two factors: the value of the micro-climatic variable itself, and the exposure time (or duration) to each value of the micro-climatic variable. It is difficult to guess beforehand the limits and number of those two factors, or the impact of each of them (and their combination) on the demographic parameters, in order to design a laboratory experiment. To facilitate the design of such kind of experiments (which, in theory, could involve hundreds or thousands of combinations) we propose a methodology based upon a machine learning pipeline approach [25] in order to predict the survival of triatomines by different combinations of micro-climatic temperature values and exposure times. This approach leverages recent advance in machine learning to construct powerful but interpretable predictive models that can guide what combinations of variables could be important in shaping a species thermal limit and thus can configure feasible experimental designs. ‘Interpretable machine learning’ approaches that enable complex ‘black box’ models to be interrogated and understood are increasingly used in ecology [25,26] including those for interrogating SDMs [27]. Importantly these models can quantify complex non-linear interactions even for small datasets [28] between variables that can be difficult to include a priori in a model. Further, this framework is ‘model agnostic’ in that the predictive performance from linear regression to gradient boosted models (GBMs) can be easily compared using a mathematically coherent set of statistics [25]. For these reasons, model agnostic interpretable machine learning offers exciting possibilities for exploring and understanding experimental design.

Finding computational solutions to guide experimental design or ‘active learning’ is not a new idea and has been used to guide experiments to better understand complex gene regulatory networks [29]. Active learning is an iterative process in that a model is formed from preliminary data that guides new experiments that in turn generates novel data that is used to update the original model (Fig 1). For example, machine learning approaches have been successfully used to guide gene knock-out experiments where the number of experiments is quadratic to the number of genes (see [29] for a review on the topic). However, active learning approaches are rarely applied to explore species thermal limits. We show the utility of the interpretable active learning approach to better understand the thermal ecology of an important group of disease vector species that could easily be adapted to guide ecological experiments more broadly.

thumbnail
Fig 1. Schematic description of the active learning approach used in this study.

We have highlighted the context and purpose of the work, the approach and methods, and an outline of the application of the main results. GBM: Gradient boost model, SVM: Support vector machine, RF: Random forests, BGLM: Bayesian general linear model. *Best model: model with the highest root mean square error (RMSE).

https://doi.org/10.1371/journal.pntd.0008822.g001

Materials and methods

Available laboratory information for triatomines

Data source.

The only previous data available were the results from thermal shock experiments (usually at 40°C) applied on different stages and adults of several triatomine species, exposed for 1 and 12 hours. In total we were able to find only seven bibliographic sources with adequate data at various temperatures and exposure times; only three species of triatomines were available with that information: Triatoma infestans, Rhodnius prolixus, and Panstrongylus megistus (See Data Cases below.)The first two species are considered as the ones with the highest T. cruzi transmission risk of all triatomines known [6,23]; P. megistus, although not as important a vector as T. infestans and R. prolixus, is becoming so because of its increasing colonization behaviour and domiciliation process.

Data preparation.

As in most of the cases results were presented as survival values (lx, with values from 1 to ≥0, which express the proportion of the initial number of individuals alive at day x, and where 1 ≤ x ≤ 30, 30 being the number of days of the control period of all experimental insects), we converted these survival values to mortality (dx, or number of individuals dying in day x, where again x is from day 1 to day 30 of the control period; we expressed the result of this conversion as dx = (lx—lx-1). The mortality values were inserted into a file with a database structure, and were considered as the end points (as percentages), for each combination of temperature, exposure time, and stage (and sex in the case of adults) predictor variables (hereafter ‘features’ in line with computer science terminology), making a total of 228 combinations (S1 Data). Table 1 represents a summary of the different cases analyzed.

thumbnail
Table 1. Source and characteristics of the original information.

https://doi.org/10.1371/journal.pntd.0008822.t001

The machine learning pipeline approach

We adapted the interpretable multi-model machine learning framework from [25] using our collated feature set to predict percentage mortality as a regression problem. This framework offers tools such as centered individual conditional expectation curves (cICE, [37]) that can be employed even on small datasets (< 100 data points) to detect heterogenous relationships between survival and each feature [25]. For our initial dataset, we checked for correlations between features prior to running our machine learning pipeline. We compared support vector machine, GBMs, Bayesian generalized linear models as well as linear models fitted with ordinary least squares. In each case we used 10-fold cross validation to compute model performance (root mean square error, RMSE) and to prevent overfitting the model. The model with the lowest RMSE and thus highest predictive performance was further interrogated. We also compared predictive performance by calculating R2, in this case the correlation between the observed and predicted values [37]. To interpret how each feature was affecting model performance we computed ranked feature importance using model class reliance approach [38] and plotted individual responses using the cICE approach. Interactions in the model were quantified by calculating Friedman’s H statistic [39] and visualized plotting multidimensional partial dependency plots. See [25] for more details and https://github.com/nfj1380/ThermalLimits_ActiveLearning for the code used.

Results

The machine learning pipeline results

The GBM model had the highest predictive power compared to the other algorithms we tested, with an R2 of 0.8 and RMSE of 18.2. In this model, temperature followed by the number of hours exposed were the strongest predictors of mortality based on the preliminary experimental results (Fig 2A). Species played a more minor role in our predictive models with life stage the least important of the variables included in the model (Fig 2A). When we further interrogated our model, we not only found some strong non-linear effects of each variable on mortality (Fig 2B–2E) but some striking non-linear interactions between variables (Fig 3). For example, triatomine mortality was generally stable up until the 39°C mark when it rapidly increased before plateauing between 40–44°C, and this response was an essential component of the experimental design for the combination of temperature and exposure time on insect´s survival (see below). However, mortality predictions for some individuals in the dataset were much lower than others which is evidence that interactions are important in this predictive model (indicated by the spread of the lines in Fig 2B, see [40]). The relationship between exposure and mortality was even more variable across individuals showing no clear trend (Fig 2C). Exposure time, however, had the strongest interactions with the other variables (Fig 3), with the relationship between exposure time and temperature being of the strongest predictive importance. For the three species here analyzed low mortality was predicted with short exposure times (<7 hours to temperatures between 0–39°C), yet any exposure to temperatures above 42–44°C was associated with high mortality, although an exact threshold was not evident. Our model predicted that temperatures between 10–39°C to be optimal for triatomine survival with no effect of exposure time on mortality. Our analysis also revealed that T. infestans experienced lower mortality overall (Fig 2C), and this may be linked to temperature as we found lower mortality of this species compared to the others at temperatures >39°C.

thumbnail
Fig 2. Results from the best performing machine learning model.

Variable importance plot (a) and centered individual expectation plot (cICE) plots (b-e) from the best performing machine learning model (GBM). Y axes represent the marginalized effect of each variable on vector estimated mortality (%) whilst controlling for other predictors in the model. The red line represents the average effect of the variable of triatomine survival.

https://doi.org/10.1371/journal.pntd.0008822.g002

thumbnail
Fig 3. Interaction between variables as detected from the machine learning model.

Plots showing the overall interactive strength of each variable (a), the top three interactions between variables based on Friedman’s H (b), followed by 2D partial dependencies of the most important interactions in our GBM model using Friedman’s H Index (c-d), where is the estimated mortality (%). Darker red in the heatmap (c) reflect higher predicted mortality. Fig A of the S1 Fig shows the interaction between exposure time and species.

https://doi.org/10.1371/journal.pntd.0008822.g003

Machine learning led experimental design

The results of our machine learning model revealed that any exposure to temperatures above 42–44°C was associated with high mortality. Subsequently, we exposed insects to one of three temperatures (40, 42 and 44°C) for different exposure times (1, 2, 4, 8 and 12 hours). We used fifth-instar nymphs of T. infestans provided by the National Chagas Control Service (Córdoba, Argentina). The fifth instar nymphs were selected because there is evidence of its temperature tolerance from previous studies [41,42]. Insects were fed on live chicken two weeks before experiments, and they were acclimated in the laboratory at 25 ± 0.5°C, and 12:12 light/dark photoperiod (light on 08:00 am) for one week, because thermotolerance showed a plastic response in this species [43].

The insects were placed in a chamber (PTC-1 Peltier Effect Cabinet; Sable Systems International (SSI), Las Vegas, NV, USA), connected to a temperature controller (Pelt-5 (SSI)) set at 40, 42 or 44°C for each of the exposure times. A group of insects handled in the same way as the experimental groups, but without being exposed to high temperature, and kept at 25°C, were used as the control group (C). After their exposure to the treatment the insects were maintained at the basic laboratory conditions described above, for 70 days. Ten insects were used for each replicate; the 40°C treatment was replicated twice because we did not find differences between the control treatment and the period of exposure, while the 42 and 44°C treatments were replicated three times. A total of 480 insects were used. The number of dead and molted insects was recorded each day after the heat treatment. Molted insects were eliminated from the survival analysis. A survival percentage was calculated as the ratio of number of live insects over the total number of treated insects (i.e., the “initial” number) for each replicate and treatment. To determine the effects of heat for different temperatures and exposure periods on survival, a Kaplan-Meier analysis was carried out for each temperature, i.e., 40, 42 and 44°C. A multi comparison procedure (the Holm Sidak method) was applied to evaluate the differences between exposure times for each temperature, with an overall significance level of 0.05. Also, a GLM (Generalized Linear Model) analysis was carried out to determine the importance of the covariates (the treatment temperature and exposure times of the laboratory experiments), as well as their interactions, on mortality; the GLM enables the use of linear models in cases where the response variable has an error distribution that is non-normal. The GLM fit to the data was then used to predict the insects’ mortality for various combinations of temperatures and exposure times; those mortality predictions were converted into age specific survival values (lx) and then used to estimate the life expectancy (days) after the heat treatment (e0). All these calculations included the 95% confidence intervals and were carried out in R [44](R Core Team, 2020); see https://github.com/nfj1380/ThermalLimits_ActiveLearning for the code used.

The laboratory experiment results

At 40°C survival did not differ significantly across exposure periods (Log-rank test, χ2 = 5.33, df = 5, p = 0.377) (Table 2). However, survival differed significantly across exposure periods at the other two temperatures, i.e., 42°C (Log-rank test, χ2 = 66.92, df = 5, p< 0.001), and 44°C (Log-rank test, χ2 = 138.25, df = 5, p< 0.001). At 42°C, the exposure periods of 8 and 12 h showed a lower mean survival time than the other exposure times (Table 2). At 44°C, the mean survival time of 2 and 4 h treatments were lower than the control and 1 h treatment, and 8 and 12 h showed the lowest mean survival time from all treatments (Table 2).

thumbnail
Table 2. Laboratory experiment results of the effect of temperature on survival of 5th instar T. infestans.

https://doi.org/10.1371/journal.pntd.0008822.t002

The fit of the mortality data using the survival function (with interaction) of the Cox analysis showed that both covariates (temperature and exposure times) were highly significant, as well as their interaction, in determining the mortality level (Concordance = 0.872 (se = 0.015), Likelihood ratio test = 295.9 on 3 df, p≤ 2e-16, Wald test = 276 on 3 df, p≤ 2e-16, and Score (logrank) test = 447.6 on 3 df, p≤ 2e-16). As an example of the fit of the Cox model to the laboratory data we show in Fig 4 the survival curves estimated for only one temperature (44°C) and three arbitrary exposure times (2, 4, and 6 hours). The survival curves for all combinations of temperature and exposure times are given in Fig B of S1 Fig.

thumbnail
Fig 4. Results of Cox analysis of laboratory experiments.

The survival curves (lx) predicted by the Cox analysis for 44°C temperature and three exposure times (2, 4, and 6 hours).

https://doi.org/10.1371/journal.pntd.0008822.g004

The GLM analysis confirmed this kind of results, except that it showed a much weaker effect of temperature (NS at the 5% level: p = 0.0552) and a highly significant effect of exposure time and the interaction of temperature with exposure time (p = 0.00408, and p = 0.00285, respectively); the multiple adjusted R2 was 0.822, with a p value of 4.274e-06. Fig 5 shows the proportion of insects alive after 70 days of the heat treatment for various simulated combinations of temperatures and exposure times.

thumbnail
Fig 5. Results of GLM analysis of laboratory experiments.

Proportion of insects alive (lx) after 70 days of the heat treatment for various simulated combinations of temperatures and exposure times.

https://doi.org/10.1371/journal.pntd.0008822.g005

Another useful parameter to represent the effects of the heat treatments is the expectation of life (ex) after the day of treatment (x); this parameter is defined as the average number of days of life remaining to an individual alive at age or time x, and is calculated as , where lx is the probability of being alive at age or time x, and lw is the same where w refers to the time of death [45]. Fig 6 shows the number of days expected to be lived by an average insect after each heat treatment (e0). These e0 estimates were calculated from both the experimental data and the survival curves predicted by the Cox analysis. It is clearly seen that the variability of the insects’ mortality response becomes more variable as the temperature increases.

thumbnail
Fig 6. Comparison of observed and expected life expectancies.

Life expectancy (e0, in days) after the day of the heat treatment from both the experimental data and the survival curves predicted by the Cox analysis. The dashed lines are the 95% confidence intervals.

https://doi.org/10.1371/journal.pntd.0008822.g006

Discussion

We demonstrate the power of an active learning approach to explore experimental space to design studies that can provide greater understanding of species thermal limits. From a preliminary and very limited dataset, we were able to identify complex but biologically plausible nonlinear interactions between temperature and exposure times shaping mortality, and setting potential thermal limits for three triatomine species for SDM (species distribution modelling). This led to successful new experiments that generated novel insights into this important vector group that can be utilized by species distribution models based upon micro-climatic information. This approach can not only guide experimental efforts for the specific study of disease vectors but can also be extended to reduce the experimental envelope in any systems with multiple interacting variables.

The harmful effect on insects´ survival of exposing them to extreme temperatures (close to the upper lethal temperature) for extended periods of time is not linear, and thus difficult to predict [46]. Without our active learning approach, researchers would have had to do a much larger number of experimental combinations to find these non-linear thresholds between exposure and temperature for each species; not only impractical but unaffordable. Our experimental temperatures of between 40–44°C and exposure times of 1 to 6 hours at high temperatures are found on T. infestans´ natural habitats during summer [24]; we extended the recorded exposure times to 8 and 12 hours to provide some margin for potential global climate change. Similar to [33], our results showed that survival of fifth nymph of T. infestans was not affected by any of the exposure times tested at 40°C. However, it could affect other physiological traits that were not measured like the germinal cells [47]. In addition, a survival reduction was observed after 8 or more hours of exposure at 42°C or 4 or more hours of exposure at 44°C. Similarly, in nymphs of the related species, Pastrongylus megystus, a brief exposure of 1 h to 40°C did not compromise its survival, but a long period of exposure, i.e., 12 h, showed an important drop on survival [30,31]. The combined effect of temperature and exposure time on survival is quite variable, possibly due to thermal tolerance differences across triatomine species, as shown by [48] for seven species of triatomines. Those species were chosen because of their epidemiological relevance, and it was observed that thermo-tolerance range increases with increasing latitude mainly due to better cold tolerances, and limiting their southern distribution [48].

A sub-lethal temperature acquires a considerable ecological relevance, when an extended time of exposures to such a temperature turns into lethal [46]. In our experiments this was the case for 42 and 44°C. These temperatures did not reduce insects´ survival when exposures times were small, e.g., 1 h, while longer exposure times, i.e., 2 or 8 h at 44 or 42°C respectively, highly decreases survival. There is little information about the effects of exposure times and temperatures in kissing bugs. There is one study where a brief exposure (0.17 h) at a high temperature (55°C) produced 100% mortality in nymphs and adults of T. infestans, while lower temperatures presented various effects: at 40°C there was an increase of insect´s activity, while 50°C produced severe damage, like knock-down, proboscis extension and leg paralysis [34]. Although the exposure times of our experiments were longer, we observed similar damage, especially knock-down, for almost all exposure times at 44°C, also in T. infestans (A.A.C. and I.M. personal observations).

Lastly, our methodology opens up further opportunities to model population dynamics based upon the responses to micro-climatic environments (as [49] have done for in a tephritid fly Bactrocera dorsalis). The effects of temperature and exposure on other demographic parameters like development time and fecundity in order to estimate the thermal effects on the population growth rate are reasonably understood. Under laboratory conditions daily temperature fluctuations (DTF) did not affect development time and fertility for Rhodnius prolixus [49]. However, fecundity was lower in females reared at DTF than at constant temperature, and males had higher body mass reduction rate and lower survival in the DTF regime, suggesting higher costs associated to fluctuating thermal environments [50]. Importantly, plotting cICE curves (Fig 2B–2E) enhances our ability to detect heterogenous relationships between survival and each feature (the effect of sex was not included). Also, the humidity factor has to be considered when performing SDM, which interacts with temperature (expressed as the vapor pressure saturation deficit [51]). Using an active learning approach could further help to understand this interaction and thus, the dryness dimension of the fundamental niche of small ectotherms such as insects, because of their high surface area-to-volume ratios, are usually at risk of dehydration in arid environments. Undoubtedly the lower thermal tolerance has also to be taken into account for a complete bioclimatic analysis of ectotherm population dynamics. The ultimate goal of all these physiological and SDM modelling efforts is to predict population density in different areas of the triatomine’s geographical range; population density of pathogens’ vectors is an important component of transmission risk, so this kind of micro-SDM modelling is of important epidemiological significance. We hope that our successful results will prompt experiments with other species in that direction.

Incorporating computational advances and active learning into ecological experimental design decision making is rare, but this study provides a case study on how this approach can be used in the context of species thermal ecology. As the world climate shifts even more swiftly, being able to search rapidly through an experimental space to design better experiments to identify species tolerance thresholds is increasingly important. Moreover, as ecological systems are inherently complex and ecological experiments often costly, an active learning approach has the potential to be of more general use in various ecological sub-disciplines.

Supporting information

S1 Data. Complete dataset used for the analysis.

https://doi.org/10.1371/journal.pntd.0008822.s002

(XLSX)

Acknowledgments

We are grateful to Yuedong Wang, University of California, Santa Barbara, who made valuable suggestions for the implementation of the GLM analysis to our laboratory results. Raúl Stariolo, of the National Chagas Control Service (Córdoba, Argentina), kindly provided the live specimens of T. infestans used to carry out the experiments.

References

  1. 1. Porter WP, Gates DM. Thermodynamic Equilibria of Animals with Environment. Ecological Monographs. 1969; 39(3):227–244.
  2. 2. Grant BW, Porter WP. Modeling Global Macroclimatic Constraints on Ectotherm Energy Budgets. American Zoologist. 1992; 32(2):154–178. URL: http://www.jstor.org/stable/3883756
  3. 3. Townsend Peterson A. Ecologic Niche Modeling and Spatial Patterns of Disease Transmission. Emerging Infectious Diseases. 2006; 12(12):1822–1826. pmid:17326931
  4. 4. Nieto P, Malone JB, Bavia ME. Ecological niche modeling for visceral leishmaniasis in the state of Bahia, Brazil, using genetic algorithm for rule-set prediction and growing degree day-water budget analysis. Geospatial Health. 2006; 1:115–126. pmid:18686237
  5. 5. Fuller DO, Ahumada ML, Quiñones ML, Herrera S, Beier JC. Near-present and future distribution of Anopheles albimanus in Mesoamerica and the Caribbean Basin modeled with climate and topographic data. International Journal of Health Geographics. 2012; 11(1):1–12. pmid:22545756
  6. 6. Gurgel-Goncalves R, Galvao C, Costa J, Townsend Peterson A. Geographic Distribution of Chagas Disease Vectors in Brazil Based on Ecological Niche Modeling. Journal of Tropical Medicine. 2012; Article ID 705326, 15 pages. pmid:22523500
  7. 7. Moriguchi S, Onuma M, Goka K. Potential risk map for avian influenza A virus invading Japan. Diversity and Distributions. 2013; 19:78–85.
  8. 8. Escobar LE, Romero-Alvarez D, Leon R, Lepe-Lopez MA, Craft ME, Borbor-Cordova MJ, et al. Declining Prevalence of Disease Vectors Under Climate Change. Scientific Reports. 2016; 6(39150). pmid:27982119
  9. 9. Bender A, Python A, Lindsay SW, Golding N, Moyes CL. Modelling geospatial distributions of the triatomine vectors of Trypanosoma cruzi in Latin America. bioRxiv. 2019; http://dx.doi.org/10.1101/738310.
  10. 10. Jatta E, Jawara M, Bradley J, Jeffries D, Kandeh B, Knudsen JB, et al. How house design affects malaria mosquito density, temperature, and relative humidity: an experimental study in rural Gambia. The Lancet Planetary Health. 2018; 2(11):e498–e508. pmid:30396441
  11. 11. Ribeiro AC, Sarquis O, Lima MM, Abad-Franch F. Enduring extreme climate: Effects of severe drought on Triatoma brasiliensis populations in wild and man-made habitats of the Caatinga. PLoS Neglected Tropical Diseases. 2019; 13(10):e0007766. 3.2–3. pmid:31600199
  12. 12. Fick SE, Hijmans RJ. WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. International Journal of Climatology. 2017; 37(12):4302–4315.
  13. 13. Casey TM. Biophysical Ecology and Heat Exchange in Insects. American Zoologist. 1992; 32(2):225–237.
  14. 14. O’Connor MP, Spotila JR. Consider a spherical lizard: animals, models, and approximations. American Zoologist. 1992; 32:179–193.
  15. 15. Angilletta MJ Jr. Thermal Adaptation: A Theoretical and Empirical Synthesis. Oxford: Oxford University Press. 2009. ISBN 10:0198570872. https://doi.org/10.1890/08-2224.1 pmid:19886501
  16. 16. Christian KA, Bedford GS. Seasonal changes in thermoregulation by the frillneck lizard, Chlamydosaurus kingii, in tropical Australia Ecology. 1995; 76:124–132.
  17. 17. Seebacher F, Grigg GC. Patterns of body temperature in wild freshwater crocodiles, Crocodylus johnstoni: thermoregulation versus thermoconformity, seasonal acclimatization, and the effect of social interactions. Copeia. 1997; 1997(3):549–557.
  18. 18. Van Damme R, Bauwens D, Castilla AM, Verheyen RF. Altitudinal variation of the thermal biology and running performance in the lizard Podarcis tiliguerta. Oecologia. 1989; 80:516–524. pmid:28312838
  19. 19. Parlin AF, Schaeffer PJ, Jezkova T. Modelling the effect of environmental temperatures, microhabitat and behavioural thermoregulation on predicted activity patterns in a desert lizard across its thermally diverse distribution. Journal of Biogeography. 2020 (In press),
  20. 20. Lee BY, Bacon KM, Bottazzi ME, Hotez PJ. Global economic burden of Chagas disease: a computational simulation model. The Lancet Infectious Diseases. 2013; 13(4):342–348. pmid:23395248
  21. 21. Cucunubá ZM, Okuwoga O, Basáñez M-G, Nouvellet P. Increased mortality attributed to Chagas disease: a systematic review and meta-analysis. Parasites & Vectors. 2016; 9:42. pmid:26813568
  22. 22. Alevi KCC, Moreira FFF, Jurberg J, Azeredo-Oliveira MTV. Description of the diploid chromosome set of Triatoma pintodiasi (Hemiptera, Triatominae). Genetics and Molecular Research. 2016; 15(2):1–10. pmid:27173222
  23. 23. Noireau F, Dujardin J-P. Biology of Triatominae. In: Telleria J, Tibayrenc M, editors. American Trypanosomiasis, Chagas Disease One Hundred Years of Research. Elsevier, London, UK. 2017. Chapter 7.
  24. 24. Balsalobre A. ¿Qué especies de vinchucas modificarán su distribución geográfica en la Argentina? Un análisis de los microhábitats y microclimas de los triatominos vectores de la enfermedad de Chagas. Doctoral Thesis. The National University of La Plata. La Plata, Buenos Aires, Argentina. 2016. Available from: http://naturalis.fcnym.unlp.edu.ar/repositorio/_documentos/tesis/tesis_1425.pdf
  25. 25. Fountain-Jones NM, Machado G, Carver S, Packer C, Recamonde-Mendoza M, Craft ME. How to make more from exposure data? An integrated machine learning pipeline to predict pathogen exposure. Journal of Animal Ecology. 2019; 88(10):1447–1461. pmid:31330063
  26. 26. Lucas TCD. A translucent box: interpretable machine learning in ecology. Ecological Monographs. 2020. 90(4):e01422.
  27. 27. Ryo M, Angelov B, Mammola S, Kass JM, Benito BM, Hartig F. Explainable artificial intelligence enhances the ecological interpretability of black-box species distribution models. Ecography. 2020; pmid:33304029
  28. 28. Fountain-Jones NM, Craft ME, Funk WC, Kozakiewicz C, Trumbo DR, Boydston EE., et al. Urban landscapes can change virus gene flow and evolution in a fragmentation-sensitive carnivore. Molecular Ecology. 2017; 26(22):6487–6498. pmid:28987024
  29. 29. Sverchkov Y, Craven M. A review of active learning approaches to experimental design for uncovering biological networks. PLoS computational biology. 2017; 13(6):e1005466. pmid:28570593
  30. 30. García SL, Rodrígues VLCC, García NL, Ferraz Filho AN, Mello MLS. Survival and Molting Incidence after Heat and Cold Shocks in Panstrongylus megistus Burmeister. Memórias do Instituto Oswaldo Cruz. 1999; 94(1):131–137. pmid:10029925
  31. 31. García S. L., García N. L., Rodrígues V. L. C. C. and Mello M. L. S. Effect of Sequential Cold Shocks on Survival and Molting Incidence in Panstrongylus megistus (Burmeister) (Hemiptera, Reduviidae). Cryobiology. 2001; 41: 74–77. pmid:11336492
  32. 32. García SL, Mello MLS, García NL, Rodrígues VLCC. Experimentally Induced Heat-Shock Tolerance in Panstrongylus megistus (Hemiptera: Reduviidae). Journal of Medical Entomology. 2001; 38(4):510–513. pmid:11476330
  33. 33. Buxton PA. The Thermal Death-Point of Rhodnius (Rhynchota, Heteroptera) Under Controlled Conditions of Humidity. Journal of Experimental Biology. 1931; 8:275–278.
  34. 34. Rodrígues VLCC Mello MLS, Ferraz Filho AN Dantas MM. Sobrevivência e ocorrência de muda em Triatoma infestans Klug (Hemiptera, Reduviidae) após choque de temperatura. Revista de Saúde Pública. 1991; 26(5):461–467.
  35. 35. Gentile AG, Sartini JL Campos MC, Sánchez JF. La aerotermia como alternativa para el control de Triatoma infestans (Hemiptera, Reduviidae) resistentes a deltametrina. Cadernos de Saúde Pública/ 2004; 20(4):1014–1019. pmid:15300294
  36. 36. Campos SGP, Rodrígues VLCC, Wada CY, Mello MLS. Effect of Sequential Cold Shocks on Survival and Molting Rate in Triatoma infestans Klug. Memórias do Instituto Oswaldo Cruz. 2002; 97(4):579–582. pmid:12118295
  37. 37. Goldstein A, Kapelner A, Bleich J, Pitkin E. Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics. 2015; 24(1):44–65. https://doi.org/10.1080/10618600.2014.907095
  38. 38. Kvalseth TO. Cautionary note about R2. The American Statistician. 1985; 39(4):279–285.
  39. 39. Friedman JH, Popescu BE. Predictive learning via rule ensembles. The Annals of Applied Statistics. 2008; 2(3):916–954. https://doi.org/10.1214/07-AOAS148
  40. 40. Molnar C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. 2019. URL https://christophm. github. io/interpretable-ml-book.
  41. 41. Belliard SA, de la Vega GJ, Schilman PE. Thermal tolerance plasticity in Chagas disease´s vectors Rhodnius prolixus (Hemiptera: Reduviidae) and Triatoma infestans. Journal of Medical Entomology. 2019; 56(4):997–1003. pmid:30849174
  42. 42. Fisher A, Rudin C, Dominici F. Model class reliance: Variable importance measures for any machine learning model class, from the “Rashomon” Perspective. 2018. Retrieved from http://arxiv.org/abs/1801.01489
  43. 43. Therneau TM. survival: A Package for Survival Analysis in R. R package version. 2020. Available from https://CRAN.R-project.org/package = survival.
  44. 44. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2020. URL https://www.R-project.org/.
  45. 45. Carey JR. Longevity. The Biology and Demography of Life Span. Princeton and Oxford: Princeton University Press. 2003.
  46. 46. Chown SL, Nicolson SW. Insect Physiological Ecology: mechanisms and patterns. Oxford: Oxford University Press. ISBN: 0 19 851548 0. 2004
  47. 47. Mello MLS, Maria SS, Tavares MCH. Heat shock-induced apoptosis in germ line cells of Triatoma infestans Klug. Genetics and Molecular Biology. 2000; 23(2):301–304. http://dx.doi.org/10.1590/S1415-47572000000200011.
  48. 48. de La Vega GJ, Schilman PE. Ecological and physiological thermal niches to understand distribution of Chagas disease vectors in Latin America. Medical and Veterinary Entomology. 2018; 32(1):1–13 pmid:28857300
  49. 49. Mutamiswa R, Tarusikirwa V, Nyamukondiwa C, Chidawanyika F. Fluctuating environments impact thermal tolerance in an invasive insect species Bactrocera dorsalis (Diptera: Tephritidae). Journal of Applied Entomology. 2020; In press. https://doi.org/10.1111/jen.12795
  50. 50. Rolandi C, Schilman PE. The costs of living in a thermal fluctuating environment for the tropical haematophagous bug, Rhodnius prolixus. Journal of Thermal Biology. 2018; 74:92–99. pmid:29801656
  51. 51. de la Vega GJ, Schilman PE. Using eco-physiological traits to understand the realized niche: the role of desiccation tolerance in Chagas disease vectors. Oecologia. 2017; 185(4):607–618. pmid:29067559