A geospatial analysis of local intermediate snail host distributions provides insight into schistosomiasis risk within under-sampled areas of southern Lake Malawi

Background Along the southern shoreline of Lake Malawi, the incidence of schistosomiasis is increasing, with snails of the genera Bulinus and Biomphalaria transmitting urogenital and intestinal schistosomiasis, respectively. Since the underlying distribution of snails is only partially known, and is often focal, pragmatic spatial models that interpolate snail information across under-sampled regions are required to understand and assess current and future risk of schistosomiasis.
Methods A secondary geospatial analysis of recently collected malacological and environmental survey data was undertaken. Using a Bayesian Poisson latent Gaussian process model, abundance data were fitted for Bulinus and Biomphalaria. Snail abundance was interpolated along the shoreline (given relative distance along the shoreline) by smoothing, using extracted rainfall, land surface temperature (LST), evapotranspiration, normalised difference vegetation index (NDVI) and soil type covariate data for all predicted locations. Our adopted model used a combination of two-dimensional (2D) and one-dimensional (1D) mapping.
Results A significant association between NDVI and abundance of Bulinus spp. was detected (log risk ratio -0.83, 95% CrI -1.57, -0.09). A qualitatively similar association was found between NDVI and Biomphalaria sp. but was not statistically significant (log risk ratio -1.42, 95% CrI -3.09, 0.10). Associations with all other environmental covariates were non-significant.
Conclusions The spatial range over which interpolation of snail distributions is possible appears to be < 10 km, owing to fine-scale biotic and abiotic heterogeneities. The forthcoming challenge is to refine geospatial sampling frameworks, with future opportunities to map schistosomiasis within actual or predicted snail distributions. Doing so would better reveal local environmental transmission possibilities.
Supplementary Information The online version contains supplementary material available at 10.1186/s13071-024-06353-y.


Introduction
Schistosomiasis is a freshwater snail-borne neglected tropical disease (NTD) common across much of sub-Saharan Africa. Two forms of schistosomiasis occur: urogenital and intestinal. Transmission of each form can only occur where intermediate snail host species of the genera Bulinus and Biomphalaria, respectively, are present. While various species of Bulinus are present in Lake Malawi, with Bulinus globosus and Bulinus nyassanus responsible for Schistosoma haematobium transmission, Biomphalaria was first formally noted along its southern shoreline only in 2017. The expanding distribution of Biomphalaria pfeifferi in this area has facilitated the transmission of Schistosoma mansoni, which causes intestinal schistosomiasis and has now transitioned from emergence to outbreak [1]-[3].
Owing to the singular importance of this newly invasive Bi. pfeifferi, subsequent malacological surveys were undertaken to track its presence, alongside concurrent parasitological surveys in local children, in an attempt to define the extent of schistosomiasis, particularly intestinal schistosomiasis (IS). These surveys demonstrated the need for further surveillance of freshwater snails, alongside an emphasis upon updated and tailored interventions and policies for control of schistosomiasis in this lacustrine setting [1]-[3]. However, as snail distributions can be patchy or focal, owing to their dependency on local habitats, many gaps in current cartography and in predictive mapping are exposed [4]. Indeed, variation in such local characteristics creates difficulties in outlining either permissive or refractory areas where snails may or may not be found, confounding control strategies.
Climate change and human behaviour are thought to be primary reasons for Biomphalaria invasion and colonisation into new areas [5]. Characteristics such as vegetation, temperature, rainfall (precipitation), evapotranspiration and soil type have been reported as possible determinants of snail presence and abundance, increasing potential heterogeneity in snail populations over a wide area [5]-[7]. Changes in climate and seasonal patterns are therefore likely to alter transmission of schistosomiasis over both space and time, increasing the need to identify snail habitats in order to target appropriate control interventions [1]. However, although snail distribution within a geographical area can be measured through malacological surveillance, physically collecting freshwater snails is expensive and time-consuming, so it is unfeasible to sample every possible location and sampling remains incomplete.
Lake Malawi dominates the eastern side of Malawi, being 600 km long and 75 km wide. It is known as the second deepest lake in Africa [8], and is vital for those using it for irrigation, agriculture, water supply, fishing industries and tourism [9]. Due to the lack of adequate sanitation in Malawi, human urine and faecal material continuously contaminate the shoreline, facilitating the transmission of schistosomiasis, amongst other water-borne pathogens [10]. In Mangochi District, representing the southern part of Lake Malawi, the eastern side of the lake is mountainous with high elevation (1000-1500 m), whereas the western side is flat with lower elevation (< 500 m) [11], [12]. Lower temperatures and higher winds are reported on the eastern side [13], with low-lying areas such as the upper Shire River margins vulnerable to flooding [14]. More broadly, the climate of the southern shoreline is affected by the migration of the Inter-Tropical Convergence Zone (ITCZ). This leads to a dry season with cooler temperatures between May and August, hotter temperatures between September and November, and a wet season between December and April [15], [16]. Rainfall is dependent on altitude and time of the year [17]. Lake water levels vary over time and are at their highest during the wet season, which also affects evapotranspiration and outflows to the Shire River [2], [14]. Perhaps most importantly, an increasing human and livestock population is leading to more frequent water contact, enhancing opportunities for transmission of schistosomiasis [2], [4].
The World Health Organization (WHO) has supplied new guidelines to target elimination of schistosomiasis by reducing freshwater snail abundance, thus interrupting transmission [18]. Identifying locations where freshwater snails are most abundant therefore aids targeted control methods, preventing initial infection and re-infection, and hence helping to eliminate or reduce transmission [19]-[21].
Here, we undertook a secondary analysis of primary malacological data first reported by Al-Harbi et al. 2019 [1] and Kayuni et al. 2020 [2]. Our study models the snail distributions as a function of environmental and climate data measured along the shoreline, aiming to i) interpolate and predict the distribution of snails along the shoreline of Lake Malawi where snails had not been sampled, and ii) assess the association between environmental data and snail distributions. In turn, we hoped to clarify the extent of environmental heterogeneities for schistosomiasis transmission along the shoreline of Lake Malawi and inform the targeting of control programmes to the most appropriate snail breeding sites.

Data
The data used in this study consist of observations of snail abundance at a small number of discrete locations on the Lake Malawi shoreline, together with remote-sensing data used to describe snail habitat. These are described separately below.

Snail abundance
The primary dataset reported in Al-Harbi et al. 2019 [1] and Kayuni et al. 2020 [2], on which this secondary analysis is based, was originally collected during malacological surveys between 2017 and 2019, as shown in Figure 1 and available in Additional file 1: Dataset. Pilot surveillance data from November 2017 identified Biomphalaria sp. and Bulinus spp. along the shoreline.
Malacological surveys in May/June 2018 and 2019 resampled some of the original locations and added new sites, chosen either from satellite imagery or at random where the surrounding environment appeared suitable for snail breeding, to confirm the emergence and outbreak of IS. The Danish Bilharziasis Laboratory key was used to identify Bulinus and Biomphalaria according to shell morphology.
Figure 1b shows a map of sampling sites, together with their relationship to primary schools in the region, demonstrating the importance of human proximity to the lake shore, and hence potential for exposure to infected snails.
At each recording site (Figure 1b and 1c), we note that snail abundance data were quantised into 0-50, 50+, 100+ and 300+ snails per location. For the purpose of this study, snail abundance was rounded to 0-50, 50, 100, or 300 collected snails per location.
{Figure 1 place around here}

Remote sensing data
Publicly available data from continuously recording satellite sensor systems were used to extract environmental and climatic variables measured adjacent to the shoreline, as shown in Figure 2.

Construction of 200 prediction points along the shoreline
Snail abundance was predicted in a 1-D representation to allow us to interpolate the values along the whole linestring. We made this assumption on the basis that snails live along the shoreline, in habitats that are associated with human water contact and entry, and so correlations between

Extraction of remote sensing data to linestring vertices
The covariate data were created by extracting the value of each remotely sensed covariate layer at each of the 200 linestring vertices. To do this, the mean of the raster pixels within a 1 km buffer around each vertex was computed. Where a vertex had a missing value, it took the mean value calculated for the previous vertex, working away from the origin. Where the missing value occurred at the first sampling point, the next available value was taken.
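The extraction and gap-filling rules described above can be illustrated with a minimal sketch. It assumes, for illustration only, that pixel centres and vertices are given in planar kilometre coordinates and that each covariate layer is a simple list of (x, y, value) tuples; the function names `buffer_mean` and `fill_vertex_gaps` are hypothetical and do not come from the original analysis, which was performed in R.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing covariate values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def buffer_mean(pixels, vertex, radius_km=1.0):
    """Mean of non-missing pixel values whose centres lie within radius_km of vertex.

    pixels: list of (x_km, y_km, value) tuples (assumed planar coordinates).
    Returns None when no usable pixel falls inside the buffer.
    """
    vx, vy = vertex
    inside = [
        value
        for (x, y, value) in pixels
        if not is_missing(value) and math.hypot(x - vx, y - vy) <= radius_km
    ]
    return sum(inside) / len(inside) if inside else None


def fill_vertex_gaps(values):
    """Fill missing vertex values, with vertices ordered by distance from the origin.

    A gap takes the value of the previous vertex; a gap at the start of the
    sequence takes the next available value.
    """
    filled = list(values)
    # The first non-missing value is used for any gap at the start.
    previous = next((v for v in filled if not is_missing(v)), None)
    for i, value in enumerate(filled):
        if is_missing(value):
            filled[i] = previous
        else:
            previous = value
    return filled


# Illustrative values only, not taken from the study data.
pixels = [(0.0, 0.5, 10.0), (0.6, 0.0, 20.0), (3.0, 0.0, 99.0)]
print(buffer_mean(pixels, (0.0, 0.0)))
print(fill_vertex_gaps([None, 2.0, float("nan"), 5.0]))
```

With real raster data the buffer would be computed in a projected coordinate system, and a geospatial library would normally do the pixel selection.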

Bayesian Multilevel Model
A Bayesian Poisson multilevel model (BMLM) with a Gaussian latent process (GP) was developed using the Stan programming language, which uses a Markov Chain Monte Carlo

Observed data
After cross-checking the observation data, we obtained 33 sampling locations for Biomphalaria sp. and 63 locations for Bulinus spp., as shown in Figure 3. The mean number of snails for Biomphalaria sp. was 6.03, ranging from 0 to 50 snails, with the highest number of snails found at 46.17 km along the shoreline from the origin. By contrast, the mean number of snails for Bulinus spp. was 28.20, ranging from 0 to 300 snails, with the highest number of snails found at

Model fit
The Bayesian log-linear Gaussian process model converged well according to the trace plots, and the priors were appropriately selected, as shown in Additional file 6: Fig. S1 and Additional file 7: Fig. S1.

Covariate effects
Figure 5 shows the posterior plot for each species of snail, with mean snail abundance at location i on the x-axis and 95% credible intervals (CrI) filled. As shown in Figure 5 (reduction). No association could be found for LV soil compared to GL soil type.
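Because the model is log-linear, a coefficient on the log scale can be read as a multiplicative change in expected abundance by exponentiating it. The following minimal sketch applies this to the NDVI effect reported for Bulinus spp. in the abstract (log risk ratio -0.83, 95% CrI -1.57 to -0.09). The helper names are hypothetical, and the interpretation assumes a one standard deviation increase in a centred and scaled covariate.

```python
import math


def rate_ratio(log_ratio):
    """Multiplicative change in expected abundance for a unit increase in the covariate."""
    return math.exp(log_ratio)


def percent_change(log_ratio):
    """Percentage change in expected abundance implied by a log-scale coefficient."""
    return 100.0 * (math.exp(log_ratio) - 1.0)


# NDVI coefficient for Bulinus spp. as reported in the abstract.
ndvi_bulinus = {"posterior mean": -0.83, "lower 95% CrI": -1.57, "upper 95% CrI": -0.09}

for label, coefficient in ndvi_bulinus.items():
    print(
        f"{label}: ratio = {rate_ratio(coefficient):.2f}, "
        f"change = {percent_change(coefficient):.0f}%"
    )
```

On this reading, the posterior mean corresponds to expected Bulinus spp. abundance being multiplied by roughly 0.44 per standard deviation increase in NDVI, with the interval spanning a large to a small reduction.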

Model predictions
For Biomphalaria sp., we predict the greatest number of snails to be close to Moet and Koche schools. For Bulinus spp., a higher number of snails is predicted over a wider area, close to Moet, Koche, Mtengeza, Chipeleka and Sungusya schools. However, for both Biomphalaria sp. and Bulinus spp. there is large uncertainty around all locations (2D version: Figure 6; 1D version: Additional file 8: Fig. S1).
{Figure 6 place around here}

Discussion
To our knowledge, our secondary analysis is the first attempt to analyse, interpolate and then predict Biomphalaria and Bulinus snail distributions in unsampled locations in the southern part of Lake Malawi, Mangochi District. Our study found a significant negative association between NDVI and snail abundance for Bulinus spp. Our results are also indicative of a similar association between NDVI and Biomphalaria sp. abundance, although this was not significant given our currently available data. All other covariates considered in the model were non-significant, as reported in Table 1. Despite this uncertainty, the estimates suggested that an increase in rainfall along the shoreline was associated with a reduction in mean Bulinus spp. abundance, whereas increases in evapotranspiration and LST were associated with an increase in mean abundance for both Bulinus spp. and Biomphalaria sp. For soil type, PL or LV soils were associated with a reduction in mean abundance compared with GL. The characteristics of the shoreline of the southern part of Lake Malawi are known to vary drastically over focal areas (Figure 7), which in turn can increase or decrease snail abundance. Below, we discuss our findings in comparison with other studies and how they could help identify Schistosoma risk.

{Figure 7 place around here}
In most previous studies, more vegetation (higher NDVI) has been positively associated with snail abundance, because vegetation provides suitable breeding sites, whereas our study suggests a negative association [31]-[33]. This difference is likely due to our focus on Lake Malawi itself, as opposed to a more general area that takes into account smaller bodies of stagnant water.
The presence of land vegetation around the shoreline may well be descriptive of the land topography and hence the depth of the water in the immediate vicinity; deeper water is likely less conducive to snail habitats due to the absence of aquatic flora. Furthermore, the type of vegetation, and whether it is submerged or non-emergent floating vegetation, is known to be important, as freshwater snails rely on it for protection from wave action, food resources and egg-laying sites; this was not considered in our model [34], [35].
Our model indicates that an increase in rainfall decreases snail abundance, although this estimate is uncertain. Firstly, this result could be due to water flow increasing and spreading to new locations, disrupting freshwater snail habitats [32]. Secondly, an increase in rainfall has been reported to increase the turbidity of water and in turn decrease the presence of snails (through disrupting their habitat) [32], [33]. Lastly, increases in rainfall and water flow have also been reported to cause rapid changes in temperature, causing thermal shock and reduced egg-laying in freshwater snails, and hence an overall reduction in snail abundance [35].
In contrast to our result, some studies have found an increase in snail abundance with increased rainfall. For instance, when excess rainfall leads to flooding, new areas of snail habitat can appear where they were previously absent or had been eliminated. Runoff water can form new pools adjacent to the shoreline or inland, providing more breeding sites to be colonised by the intermediate snail host and hence increasing freshwater snail abundance [5]. This consequently changes the human-snail contact interplay, through an indirect effect on human behaviour and the associated Schistosoma infection risk [36], [37]. However, other studies have suggested that during flooding these newly established pools can lead to humans visiting the new sites instead of Lake Malawi; as snails are possibly less likely to be present already at these new sites, this can lead to a reduction in schistosomiasis transmission [31], [36]. Adding to the complexity, rainfall and water levels are known to oscillate over time, with a general decrease in lake levels reported more recently alongside ongoing localised peaks [2]. This could affect snail abundance and presence spatially and temporally, and indirectly affect human behaviour as mentioned before [2], [36], [37], for example if lake levels are regulated by the needs of hydroelectricity, or because many individuals prefer to make contact with shallower and safer areas of the lake [38], [39].
Our study indicates that an increase in LST increases the abundance of Biomphalaria sp. and Bulinus spp. Many laboratory studies have been carried out to determine the optimal temperature for snail survival. For Biomphalaria sp., the optimum temperature has been found to be between 15 and 30 °C, with snail abundance decreasing between 30 and 35 °C and no snails surviving above 35 °C [37], [38]. For our prediction points along the shoreline, LST ranges between 25 and 32 °C, which suggests that Biomphalaria sp. abundance still increases above 30 °C. This difference could be due to the natural environment, where snails are able to adapt to climate change [40]. It has also been reported that freshwater snails move further into the lake when temperatures increase, which we did not consider in our model because it is constrained to the shoreline and buffer area [41].
Similarly, our study indicates that an increase in evapotranspiration increases the abundance of Biomphalaria sp. and Bulinus spp. Unpublished field studies suggest that an increase in evapotranspiration (that is, increased evaporation of water) affects the pH, salinity (salt concentration), conductivity and temperature of water; these finer physical characteristics need to be further investigated. This suggests that an increase in evapotranspiration shifts these unexplored covariates towards conditions that are more habitable for Schistosoma intermediate snail hosts, increasing snail abundance. How these unexplored covariates interact, and their effect on snail abundance, is not considered in our study but has been investigated in other studies [5].
Our study found that PL soil type was associated with decreased snail abundance compared to GL soil type. PL soils are clay-based plinthic soils with a high concentration of iron, and GL soils are mineral soils composed of a mixture of sand, silt and clay; both become muddy (water-logged) when rainfall occurs [42]. A previous study by Koch et al. 2004 found the opposite result, with muddy soil reported to improve the survivability of Biomphalaria sp. by preventing moisture loss in the hot and dry seasons, compared with sandy or stony soils and decomposing material [7]. The difference between PL and GL soils is that GL is known for its reduction in iron [43]. Kulina et al. 2018 [44] reported an increased risk of snail-borne transmission in ground water with higher iron concentrations. We found a different result, which suggests that another chemical within the soil could be interacting with snail abundance and affecting transmission, although there is uncertainty in our result. Further, the soil types from the SOTER database are mapped at a wide scale, and finer-resolution data are needed to improve localised information on soil types [29]. Other resources have been created, for example SoilGrids for Africa, which could be applied to our study in the future if time permits and finer-resolution data are available for southern Malawi [45].
Our secondary analysis shows substantive heterogeneities in snail distributions along the lake's shoreline, with certain schools being in close proximity to areas of high snail abundance; hence, school-aged children (SAC) attending these schools may be more likely to be exposed to schistosomiasis. Moet and Koche schools were predicted to be nearest to the highest numbers of Biomphalaria sp. along the shoreline, suggesting that more S. mansoni infections are likely to occur at these schools than at the 12 other schools. By contrast, Moet, Koche, Mtengeza, Chipeleka and Sungusya were all predicted to be nearest to the highest numbers of Bulinus spp. However, for both Biomphalaria sp. and Bulinus spp., predicted presence had large uncertainty along the whole shoreline. Further, we cannot be certain of the exposure risk for SAC, as this secondary analysis does not consider their water contact patterns, including where they visit the shoreline (how far they travel), how often, the type of contact and how long they remain at the shoreline. This will need to be further investigated, as previous studies have reported increased snail abundance in localised areas where more water contact occurs [41], [46]. In addition, the ability to measure exposure risk for SAC from our secondary analysis depends on the presence of snails in an area being indicative that the freshwater snails present are shedding cercariae, which is difficult to be certain of [35].
There are many more physical, chemical and environmental factors (abiotic and biotic) that could affect the habitats of Schistosoma intermediate snail hosts and their relative abundance, but which were not considered in our model due to time constraints or inaccessible data. These include pH, salinity, conductivity, flow velocity, turbidity, calcium and bicarbonate concentration, dissolved oxygen, soil density and water capacity [35], [36], [47]. Further, other factors such as food source, pollution, parasitism and even competition for snail habitat with other organisms within an area were not considered in our model [47]. Variation in human movement patterns can make it difficult to identify where infections were acquired; land use and the human influence index could be included in our model if time permitted [48].
One limitation of our study is the restricted study period (November 2017 to June 2019, except for evapotranspiration, which used a 5-year time frame) and its use of mean values for each prediction location. Rabone et al. 2019 [41] reported that seasonality affects snail abundance, with higher snail abundance during the dry season than during the wet season. For instance, seasonality can affect the growth of vegetation where freshwater snails live, owing to variation in sunlight, leading to changes in snail abundance [47]. In the future, we would like to investigate how seasonality affects the snail distribution from our model. We report the seasonal changes in the covariate data in Additional file 3 (Fig. S1, Fig. S2 and Fig. S3); this shows visually how the covariate data change over time, but was not considered in our model.
Another limitation of our study, as mentioned before, is that snails are not only found on the shoreline of Lake Malawi but also in pools adjacent to the lake and in rivers, ponds and streams, which has been reported to affect snail abundance by altering the microhabitat, for instance through changes in temperature [41], [49]. From unpublished field work in 2021, it is known that the southern lake bed slopes in places, with the western side of the shoreline remaining shallow for longer, and that the area near the Upper Shire River has more vegetation and swampy areas than the rest of the shoreline. Bathymetric data for water depth, taken from the GLObal Bathymetric (GLOBathy) dataset, which relies on the HydroLakes dataset [49], were originally considered in our secondary analysis; however, they were excluded from the study due to missing River Shire
An important limitation of our analysis is the resolution of the raster data we use as covariates. Many remotely-sensed metrics are known to be inaccurate over water, and for this reason we positioned our shoreline linestring just inland of the water's edge. Thus, any associations between land-based measurements and habitat conditions in the water are likely to affect snail abundance only indirectly. A repeat study using direct observations of shoreline habitat composition, perhaps using arrays of sensors towed behind a boat sailed close to the water's edge, may be able to provide a more accurate map of predicted snail abundance.

Conclusion
Our study provides a preliminary method of predicting the abundance of Biomphalaria sp. and Bulinus spp. snails along the shoreline of Lake Malawi, given malacological data collected at sparse locations and remotely sensed environmental data. Further, our study shows substantive heterogeneities in snail distributions along the lake, and provides abundance information which may be used to develop further statistically-grounded study designs to improve the identification of likely snail habitats posing a high risk for schistosomiasis transmission.

Tables
Table 1:

snail locations are affected by distance along the shoreline, and not by stretches of deep, open water, e.g., the mouth of a bay. The 1-D shoreline was represented by computing the distances between a sequence of 200 vertices obtained from the 2-D linestring representation. To achieve this, we used the following method:
i) A 2-D linestring was drawn by hand following the shoreline as shown by Google Satellite imagery (Figure 1).
ii) The linestring was re-sampled onto 200 equally spaced points along its length.
iii) Each observed sampling site location was allocated to its closest linestring vertex.
iv) The 1-D representation of the linestring was computed as the distance of each vertex from the origin (the north-western end of the 2-D representation).
v) The 1-D linestring with 200 prediction points was converted back onto the 2-D shoreline using the matched indices.
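The linestring steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' implementation: coordinates are assumed to be planar and in kilometres (real longitude/latitude data would first need projecting), and the function names `resample_linestring` and `nearest_vertex` are hypothetical.

```python
import math


def cumulative_lengths(line):
    """Distance from the first point of the linestring to each of its points."""
    lengths = [0.0]
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        lengths.append(lengths[-1] + math.hypot(x1 - x0, y1 - y0))
    return lengths


def resample_linestring(line, n_points):
    """Return n_points vertices equally spaced along the line, with their 1-D distances."""
    cumulative = cumulative_lengths(line)
    total = cumulative[-1]
    vertices, distances = [], []
    segment = 0
    for k in range(n_points):
        target = total * k / (n_points - 1)
        # Advance to the segment that contains the target distance.
        while segment < len(line) - 2 and cumulative[segment + 1] < target:
            segment += 1
        segment_length = cumulative[segment + 1] - cumulative[segment]
        fraction = 0.0 if segment_length == 0 else min(
            1.0, (target - cumulative[segment]) / segment_length
        )
        (x0, y0), (x1, y1) = line[segment], line[segment + 1]
        vertices.append((x0 + fraction * (x1 - x0), y0 + fraction * (y1 - y0)))
        distances.append(target)
    return vertices, distances


def nearest_vertex(site, vertices):
    """Index of the linestring vertex closest to an observed sampling site."""
    return min(
        range(len(vertices)),
        key=lambda i: math.hypot(vertices[i][0] - site[0], vertices[i][1] - site[1]),
    )


# Toy shoreline of total length 7 km, re-sampled onto 8 vertices (1 km apart).
shoreline = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
vertices, distances = resample_linestring(shoreline, 8)
index = nearest_vertex((2.9, 2.2), vertices)
print(index, distances[index], vertices[index])
```

Because each site keeps the index of its matched vertex, a prediction made at a 1-D distance can be mapped back to the 2-D shoreline simply by looking up `vertices[index]`.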

(MCMC) algorithm to regress snail abundance data onto the remotely-sensed covariate data, accounting for (1-D) spatial correlation along the shoreline. We assume that the number of snails observed at a sampling location was Poisson distributed, with log-mean given by a coefficient-weighted sum of the covariates plus a spatially-correlated error term. Covariance between the error terms was represented as the sum of a spatially-correlated variance (using an exponentiated quadratic, exponential, or Matérn covariance function, the latter with parameter κ) and an uncorrelated (or nugget) variance [30]. Suitably uninformative priors were applied to the model coefficients and variance terms, with MCMC run for 10000 iterations. Posterior summaries (mean and 95% credible intervals) were computed for the fitted model, as well as predictive distributions for each of the linestring vertices conditional on the data. All data processing and analysis was performed in R version 4.1.1. See the supplementary information (Additional file 2: Model Formulation) for a mathematical explanation of the model.
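To make the covariance structure concrete, the sketch below builds a covariance matrix over 1-D shoreline distances as a spatially-correlated term plus a nugget on the diagonal, and evaluates a Poisson log-mean. It is illustrative only: the parameter names (`sigma2`, `phi`, `nugget`), the kernel parameterisations and the inclusion of an intercept are assumptions, and the authors' exact formulation is given in Additional file 2. The analysis itself was carried out in R and Stan, not Python.

```python
import math


def exponentiated_quadratic(d, sigma2, phi):
    """Smoothest of the three kernels: correlation decays with squared distance."""
    return sigma2 * math.exp(-0.5 * (d / phi) ** 2)


def exponential(d, sigma2, phi):
    """Roughest of the three kernels: correlation decays linearly in the exponent."""
    return sigma2 * math.exp(-d / phi)


def matern_1_5(d, sigma2, phi):
    """Matern kernel with smoothness 1.5, intermediate between the other two."""
    scaled = math.sqrt(3.0) * d / phi
    return sigma2 * (1.0 + scaled) * math.exp(-scaled)


def covariance_matrix(distances, kernel, sigma2, phi, nugget):
    """Spatially-correlated covariance plus uncorrelated (nugget) variance on the diagonal."""
    n = len(distances)
    return [
        [
            kernel(abs(distances[i] - distances[j]), sigma2, phi)
            + (nugget if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def poisson_log_mean(intercept, coefficients, covariates, spatial_effect):
    """Log of the expected count: weighted covariate sum plus the spatial error term."""
    linear = sum(b * x for b, x in zip(coefficients, covariates))
    return intercept + linear + spatial_effect


# Illustrative values only: three vertices at 0, 1 and 3 km along the shoreline.
K = covariance_matrix([0.0, 1.0, 3.0], matern_1_5, sigma2=1.0, phi=2.0, nugget=0.1)
for row in K:
    print([round(value, 3) for value in row])
print(math.exp(poisson_log_mean(1.0, [-0.83], [0.5], 0.0)))
```

In a full analysis the spatial effects would be sampled jointly from a multivariate normal with this covariance matrix as part of the MCMC, rather than fixed as in this example.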

{Figure 4 place around here}
14.66 km along the shoreline from the origin. For observed Biomphalaria sp. data, the extracted environmental data ranges were: rainfall with mean 78.8 mm [63.01-89.51 mm], LST with mean 29.68 °C [24.97-32.44 °C]. For observed Bulinus spp. data, the extracted environmental data ranges were: rainfall with mean 80.62 mm [63.01-89.5 mm], LST with mean 30.28 °C [24.97-32.44 °C]. Additional file 3 shows the observed data in 1D (Additional file 3: Fig. S1) and 2D (Additional file 3: Fig. S2 and Fig. S3). A histogram of the centred and scaled covariates is shown in Additional file 4: Fig. S1.
{Figure 3 place around here}

Environmental data prediction points
The extracted environmental data at the prediction points along the shoreline had the following ranges: rainfall [59.82-90.37 mm], LST [24.68-32.46 °C], NDVI [0.29-0.61], evapotranspiration [0.10-0.66]. Evapotranspiration was lowest and NDVI highest along the River Shire, with the east shoreline having the most rainfall and lowest LST compared to the west shoreline. The Luvisols (LV) soil type was absent around the River Shire, in contrast to the Gleysols (GL) soil type, whereas Planosols (PL) were present at the entrance to the River Shire and south of it. A 1D version of Figure 4 can be seen in Additional file 4: Fig. S2.

Covariance function comparison
As shown in Additional file 5: Fig. S1, the exponentiated quadratic covariance function was found to over-fit (smooth too much) the model, the Matérn (κ=1.5) covariance function smoothed the results, and the exponential covariance function gave the roughest fit. Further, there seems to be no difference in the predicted log mean abundance against distance along the shoreline for both Biomphalaria sp. and Bulinus spp., as shown in Additional file 5: Fig. S2. This suggests that the effect of the covariates (environmental data) is more prominent than that of the Gaussian process.

values as shown in Additional file 9: Fig. S1, Fig. S2 and Fig. S3. Therefore, water depth needs to be further investigated; as mentioned before, water levels are known to vary over time, which leads to changing water depth. Bulinus spp. and Biomphalaria sp. have different preferences in water depth and vegetation [1]-[3].

Figure 3: Scatter plot of absolute snail numbers observed at sampling points versus

Figure 6: 2D mean GP prediction of the log number of snails along the shoreline (km) for a) Biomphalaria sp. and b) Bulinus spp. Legend: blue to red represents the mean Gaussian process (GP) prediction.

Figure 7: A collection of location photographs representative of the variation of the Southern

Abbreviations

and Table 1, a significant result was reported for NDVI, where a 1 SD increase in NDVI was associated with a change of -0.83 [CrI: -1.57, -0.09] in log µi, the log of the mean Bulinus spp. snail abundance at location i.