Spatial risk assessment and the protection of cultural heritage in southern Tajikistan

Growing digital documentation of cultural heritage resources yielded from an increasing number of international projects, calls for the development of formal computational approaches to assess the risks that this invaluable material heritage is exposed to. This paper proposes a nuanced case-control approach to the risk assessment developed by the Central Asian Archaeological Landscape (CAAL) project for an area in southern Tajikistan. A number of statistical methods are applied for the spatial modelling of the risk to the cultural heritage across the study area and for the assessment of the local scenarios of potential


Introduction
In recent years, a number of projects have undertaken the systematic digital documentation of cultural heritage across the world, and the creation of online open access platforms that can share this information (e.g. EAMENA projecthttp://eamena.arch.ox.ac.uk/). Moreover, EU-funded projects like ATHENA (https://athena2020.eu/) aim to establish centres of excellence in Remote Sensing applied to the Cultural Heritage documentation and management.
The collection of an ever-increasing volume and variety of data has resulted in the rapid development of nuanced analytical approaches that can aid in the understanding, monitoring, and protection of sizable datasets of cultural heritage resources. Documentation needs to be paired with 'use' of these large collection of data, as this is one viable way of making inventories and databases sustainable within a varied community of users such as heritage professionals, researchers and local stakeholders, to mentions just a few. This paper will propose a computational approach to allow the remote assessment, location, and quantification of the risk posed to potential archaeological features in a region of Central Asia, where database. However, this software is not without its limitations, particularly in regard to data analysis, hence the need to adopt a parallel solution for performing spatial analysis of the large-scale dataset. Both Arches and QGIS are open-source and community supported, making them sustainable tools that can be customised to satisfy the needs of the project.
One of the key benefits of the creation and analysis of such a large inventory, containing data from such a wide a variety of sources, is the potential that this has for the development of novel methodologies for the assessment and monitoring of on-going threats to the cultural heritage across the globe, particular in remote or inaccessible areas.

Research aim
The main focus of this paper is on the satellite remote sensing strand of CAAL and in particular on the potential use of remotely sensed data and related computational methods, to develop a better understanding of on-going threats in a specific area of Tajikistan, South Khatlon. Major disturbances endangering and destroying the archaeological landscape of the region are discussed, and a nuanced case-control approach, borrowed from disciplines such as epidemiology and ecology, is proposed in order to understand the regional distribution of risk levels arising from a variety of natural and human threats.
We will demonstrate both how to model the risk across the territory and how to formally explore the locational settings of archaeological features already affected by natural and human disturbances. In doing so, we will illustrate how this allows for the empirical identification of cultural heritage areas at higher risk.

Study area: South Khatlon
The region of Khatlon is located in the southwestern part of Tajikistan, it borders the Districts of Republican Subordination to the north, Uzbekistan to the west, Afghanistan to the south and southeast, and Gorno-Badakhshan to the east. Its territory, known as the north-eastern Bactria ( [1], 1), covers an area of 24,800 km 2 , characterised by the mountain ridges which divide the river valleys ( [2], 1) (Fig. 1).
The area considered in this study extends approximately 146 km from the border with Uzbekistan in the west to the border with Afghanistan in the east, marked by the course of the river Panj.
At the southern border of the region, the river Amu-Darya marks the country border with Afghanistan. The river is one of the largest in Tajikistan and is formed by the confluence of the Panj and Vakhsh river. The latter, together with the Kofarnihon river (or Kafirnigan), are the two major tributaries of the Amu-Darya in South Khatlon (Fig. 1).
Approximately 13 km from the border with Uzbekistan lies the fertile oasis of the Kofarnihon River, which flows from north to south. The valley is delineated by mountainous reliefs, one of which is located in its centre. Intense agricultural activity extends across the entire plain, to the foot of the mountains (Fig. 1).
To the east, lies the Vakhsh valley. Being a river of significant water flow, in the Soviet era several dams were built for irrigation and electricity production which facilitated the territorial expansion [3]. The Russian authorities have implemented an irrigation campaign in Central Asia to increase cotton production, including the construction of extensive infrastructures on the Amu-Darya basin. In Tajikistan, the Nurek Dam and later the Rogun Dam, both built on the Vakhsh River, rapidly increased the number of irrigated fields in the territory. In 1999, it was estimated that the area of farmlands increased by 200,000 ha ( [4], 118, 119). In 2015, the fields irrigated with water extracted only from the Nurek dam reached 650,000 ha [5].
Overall, Tajikistan and in particular the Khatlon region have seen significant landscape transformations caused by modern economic development (mostly agrarian) and construction, with changes being most prominent in water management [6,7]. This, combined with large infrastructure developments, poses a serious threat to the natural and archaeological landscape in the region [8]. Consequently, a formal assessment of the risk zonation is urgently needed.

Data
In order to enhance the existing record of archaeological data, the CAAL project is undertaking a systematic mapping of the whole of the Central Asia (henceforth CA) region through the use of freely available satellite imagery. A bespoke database has been set-up within QGIS in order to map potential archaeological featuresincluding monuments, sites, and landscapes -and record a preliminary assessment of the threats that each feature currently faces. This information is currently being collected for sample areas across the whole of CA by a number of staff members based at University College London (UCL). In the future, the project aims to transfer this knowledge to the various partner institutions in CA, building local capacity for data collection and analysis which will allow local authorities to assess and monitor risks and threats to archaeological heritage in the territories. Central to the sustained growth of these large inventories, is the ability to illustrate the great potential that such datasets have for improving the understanding and modelling of risks and threats to the cultural heritage of a specific territory. The results of such analytical efforts can assist policy makers in taking informed, data-driven decisions, as well as promoting awareness of the importance of protecting the cultural heritage among local communities. In this respect, CAAL is supporting partner institutions in the joint construction of a central database of archaeological data, along with sustainable capacity for data curation and analysis; all with a view to better understanding and monitoring the risks faced by local archaeological and heritage resources.

Remotely sensed data
Mapping of potential archaeological features is being carried out on very-high resolution (VHR) imagery available via Google Earth and Bing Imagery, streamed within the QGIS platform. Basic information regarding the nature of the mapped features is recorded in a bespoke GeoPackage layer, where specific data fields are constructed in order allow compatibility with the main Arches platform (Table 1). Each identified feature has then been mapped as a polygon that envelops the anomaly, or an area with multiple anomalies that can be construed as a consistent group/cluster. Alongside the above, a broad remote assessment of the threats that could potentially endanger the preservation of archaeological resources has also been carried out, based initially on what is visible in satellite imagery (see Table A1).
A risk level (0-5) has been assigned for each of the threats deemed assessable from the imagery, thus achieving a broad overview of what are the major risks for the cultural heritage in the South Khatlon (see Table A2).
Although there are obvious limitations to the methods used here, the intention of this study is simply to illustrate the significant potential that a remote and regional-scale approach poses for the understanding of risks and threats to the cultural heritage. Computational methods are used here to support decisions and interventions for the monitoring and protection of the archaeolo- Fig. 1. Location of the case study area, the topographical settings of region considered, and the distribution of potential archaeological features mapped from satellite imagery (background satellite image: @DigitalGlobe, via Google). In blue are the urban areas as classified by the Global Land Cover project [42]. gical heritage, and only a holistic approach can guarantee the best management strategy [9].

South Khatlon data
In order to facilitate the exploration of the area through ESRI, Bing Aerial, and Google-provided satellite imagery, the territory has been divided into square grids of 100 km 2 .
Google-provided imagery provides images taken across a number of years, thus enabling the observation of any geomorphological changes, and increases in human activities.
The analysis led to the identification of 2677 potential archaeological features of various natures, mainly located in mountainous or desert territories. A level of certainty was assigned to each recorded feature, allowing the filtering out of records that fell below a particular certainty threshold and thus the identification of records most likely to be of archaeological interest (Fig. 2).
The two main types of features identified were single or groups of tombs consisting of both burials and burial mounds ( 65%), and abandoned enclosures ( 30%). Other, less frequently detected elements ( 5%) were tepe (hills or mounds, usually of archaeological nature), settlements, forts, religious buildings, temples, and other structures that are difficult to interpret by satellite but that were nevertheless characterised by archaeological or ancient evidence (see Fig. A1 for an overview).
The largest group of elements identified consisted of burials of various types ( Fig. 3a and b). As far as the enclosures are concerned, everything that has left a mark on the ground has been embedded under this category. This included livestock fences, foundations of possible settlements' boundaries, or marks left by dwelling basements (Fig. 3c).
In contrast to burials and abandoned fences, ancient structures such as temples, tepe, settlements and forts are found in all the valleys of the study area, often near rivers (Fig. 3d). Since urban development and the increase in farming do not always allow for clear identification of such elements, and in many cases have covered or destroyed archaeological evidence, the acquisition of declassified CORONA satellite imagery for this area -captured in the period between 1964 and 1971 -has made it possible whether or not to confirm their historicity. Moreover, the exploration of the area through these imageries shows further archaeological evidence and consequently the increase in agriculture and urban layout in the area.
Although, the mapping of potential archaeological sites from satellite imagery has been carried out by experienced researchers, we are acutely aware of the limitations and uncertainty associated with such sources of information [10][11][12]. However, the CAAL project is designing a plan for fieldwork that will help assessing the nature of the several classes of remotely sensed objects which at the time of writing had to be postponed due to the global pandemic of COVID-19. For the purpose of this paper though, we propose a dynamic model whose results can change if different data are used as input. This makes the approach more customizable and replicable, thanks to the accompanying material which includes the analytical workflow. Nonetheless, the workflow has also been tested using only the data classified as having a higher certainty level and the results are comparable to the ones derived by the analysis of the full dataset.  String -multi-lingual thesaurus of monument types shared across resource models.

Monument type2
String -multi-lingual thesaurus of monument types shared across resource models.

Monument type3
String -multi-lingual thesaurus of monument types shared across resource models. Interpretation String -free text.

Comments
String -additional notes on the interpretation of the anomaly. Certainty String -certainty of the interpretation. Merit ground-truth String -indicates whether the anomaly is worth of a field visit.

Ground-truthed
String -indicates whether the anomaly has been visited on the ground. Threats Anthropic Urban Encroachment Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria).

Construction
Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria). Looting Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria). Quarrying Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria). Agriculture Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria). Fire Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria). Dumping Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria). Natural Riverine Erosion Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria).

Soil Erosion
Integer (0-5) -indicates the risk level for the threat (see Table A2 for risk criteria).

Condition comments
String -additional comments on the condition of the anomaly. Metadata Date of recording Date -date of recording.

Recorder
String -name of the person who recorded the anomaly.

Table 2
Overview of the risk levels recorded for the 9 threats during the remote sensing mapping in South Khatlon.

Understanding threats to the archaeology in South Khatlon
In South Tajikistan, as with everywhere else in the world, archaeological heritage is endangered by an array of hazards, both natural and anthropogenic. The remote assessment of the region of interest using high resolution satellite imagery as conducted here, provides an overview of the major threats that the archaeology in this area is exposed to. Overall, the archaeology in South Khatlon seems to be in good condition and most of the threats have only recorded low levels of risk across whole area (Table 2). Howe-ver, several of the archaeological features have been recorded as endangered by some of the 9 threats.

The human factor
The area considered is characterised by two major river valleys where people have settled and therefore human activities of various natures are endangering the archaeological remains that have survived within modern settlements. A combination of urban encroachment and infrastructure development is affecting few hundreds of potential archaeological features ( Table 2). Although there are no major cities in the river valleys, the size of built-up areas has expanded considerably over the last decades, leaving little space for long-lived heritage sites such as the fort located to the north of the village of Arabkhana (Fig. 4a). For example, the comparison of the current landscape with that of 40 years ago shows the rate of expansion of the medium sized village that encloses the fort, which is already worn by soil erosion as shown by its typical conical shape.
Alongside urban sprawl and general construction development, the valley bottom on the western side of the study area has seen intense agricultural exploitation, which also represents a major threat to the local archaeological heritage (Fig. 4b). Agricultural activities such as ploughing and tillage are a major threat to the archaeology across the world and in particular in countries (such as ex-Soviet Republics) where extensive agriculture of resourcehungry crops has been vital for the economy over the last century [13,14]. Furthermore, South Khatlon underwent the drainage of large portions of the lowland areas through the construction of new irrigation systems [15] in order to turn those areas into arable land (Fig. 4b).
The settlement located a few kilometres to the north of the village of Murattepa is a clear example of how intensive agriculture has completely destroyed what once used to be a 50-60 ha settlement featuring a quadrangular fort in the centre (Fig. 4c). The CORONA image shows traces of what was a densely built-up area that has now been completely flattened out by agricultural development and that is just recognisable from the ephemeral anomaly left by the once towering fortification. Many other archaeological sites like this have already been partially or completely destroyed by the intense human activities across the whole of Central Asia.

Hydrology
Another major on-going threat recorded during the remote sensing assessment is general erosion of features of interest, mostly due to general hydrological instability, bank erosion and soil movements.
The Vakhsh river is one of the main water resources in Tajikistan because of its water capacity (5th order in the Strahler classification), and therefore represents a major threat to the archaeological landscapes. Bank erosion has been recorded along stretches of tributaries of the Vakhsh, where the high-speed streamflow can severely damage the archaeology (Fig. 5). During the early stage of a river's development (the so called 'youthful age'), rivers are characterised by steep gradients and quick flow, resulting in deep erosion of the banks and everything located on the riverbank, including archaeology.
Conversely, old rivers like Vakhsh with low gradient and slower streamflow, present lower erosive energy, but are characterised by flood plains and generally wider course changes, called 'river channel migration'. This is a geomorphological process of lateral movement of the river course across the floodplain and is the result of the combination of bank erosion and sediment deposition across time. Arguably, the river channel migration creates a very unstable environment that prevented (or at least discouraged) people in the past from settling and in general from building activities within the range of action of the river meandering. Thanks to the historical images CORONA it is possible to visually asses the magnitude of this later movement of the river course and to estimate its range of action (Fig. 4d).
A way of mapping the range of action of the channel migration process is to model it on a digital terrain model (DTM), by calculating the upstream catchment area that is used in the computation of the Topographic Wetness Index. The visual assessment of the distribution of all sites and sites recorded as affected by riverine erosion, shows how the majority of the mapped features are located far from the highly hydrologically unstable areas (see Fig. A2).
Such an unstable environment is also endangers the zones directly adjacent the floodplain, where potential archaeological remains have been mapped. A possible cemeterial area mapped along the riverbank of the Amu Darya in a 2013 satellite image is now completely destroyed by the lateral movement of the river course, causing extensive bank erosion (Fig. 4e). Overall, it seems that potential archaeological features mapped from satellite imagery are mostly distributed on the mid to high lands of South Khatlon (Fig. 2) and very few are located in the valley bottom. This can be explained by a number of factors that either prevented the settlement of past societies or indeed affected the preservation of the few archaeological remains in the lowlands of. A visual assessment indicates that modern urban development, agriculture and hydrological instability can explain the lack of archaeological features in the certain areas. However, a formal quantitative analysis of the data can provide a far more accurate prediction of which areas are at the greatest risk, allowing for more successful preservation of cultural heritage. This will support heritage management from both a legislative and managerial perspective, where decisions should be based on hard evidence of both large-scale and detailed local ground-based data analysis [16,17].

Quantifying the risk
Whilst landscape archaeology approaches and spatial statistics have been widely used within the research domain of the discipline but only few studies have applied spatial computational methods to the monitoring and management of cultural heritage [12,16,18]. Point pattern analysis (PPA), for example, provides a useful tool to explore how the distribution of archaeological sites or features is dependent on a set of independent variables, thus providing a first explanation of different recovery rates in different environs across the territory under study. Moreover, PPA can also help to disentangle the behaviours of internal attraction or repulsion of site locations, thus giving a second possible explanation of why features are clustered (or not) in a specific area [19][20][21]. These two properties (first and second order effects) of a point pattern are extremely difficult to separate and within a research on site location strategies or settlement patterns analysis it would be necessary to understand how they work together [22]. In this study, we aimed to employ computational approaches usually applied to researching past human locational strategies, to the cultural heritage management domain. Specifically, looking at the first order characteristics of a point pattern representing potential archaeological features mapped from the satellite imagery, in relation to spatial covariates representing potential threats to the cultural resource. Once the behaviour of the whole point pattern of archaeological features is explored, we will propose a way of identifying areas of higher risk to the archaeology, on the basis of the condition assessment initially recorded during the mapping exercise. Proposing a way of formally explaining the location of all features and highlighting areas more at risk from the threats considered could help the monitoring of the cultural heritage within the broader region. It would also potentially serve as a 'remote' method for examining other areas in Central Asia with different monument types situated in diverse environmental settings. Strengthening the capability of data analysis, along with the data collection and digital archiving, is one of the objectives of the CAAL project, which aims to support local institutions in the development of nuanced management procedures that rely on modern technologies.

First order effects
As we argued above, the main focus of this study is to understand how the location of the mapped features, potentially of archaeological interest, is dependent on specific spatial covariates which represent threats to the preservation of such features. Therefore, the first order characteristics of a point pattern (the whole dataset of features) is analysed to test whether and how the intensity (=density) of the point pattern, is a function of four spatial covariates. A point pattern represents the result of spatial events occurring within a given region [23]. The investigation of the first order properties entails testing the degree to which the point pattern is the result of underlying process that describe the spatial event locations, namely understanding whether or not the feature location is influenced by external factors than can be modelled as covariates.
In our study, the covariates have been built to represent degrees of risk arising from the range of threats discussed above (Fig. 6).
Modern settled areas have been considered as sources of three of the nine threats: construction, urban expansion and agricultural activities, and therefore a map representing proximity to towns is here assumed to be a proxy for risk levels generated by human activities. We took each individual continuous cluster of buildings (N = 213) and used the centroid of the bounding polygon to calculate the distance map (Fig. 6c). This of course does not take into consideration the size of the settlements, which could reflect the intensity and degree of development, construction and land exploitation. Nonetheless, despite the size difference among the clusters, we assume (for the sake of simplicity) that the intensity of urban sprawl and infrastructure development is relatively homogenous in the whole area.
Furthermore, the Topographic Wetness Index (TWI) has been included as a proxy for areas of water accumulation, which characterise agricultural lands as well as flood-prone areas [24]. This is a particularly accurate proxy in areas that have been drained and turned into arable lands (Fig. 6b). The TWI was calculated using the SAGA GIS algorithm [25] and used to identify lowlands characterised by cultivated fields and hydrologically unstable belts along the watercourses, caused by the river channel migration (Fig. 6a).
Erosion derived from water flow has then been factored in as one of the major natural (although artificially induced) threats to the archaeology.
Bank erosion usually occurs in the so called 'youthful age' of a river, which is characterised by a steep gradient and fast flowing water. Therefore, we used the topographical slope to highlight areas where the risk of loss of soil along the riverbanks is higher (Fig. 6b). As discussed above, this is an on-going threat endangering archaeological features such as clusters of kurgans (see Fig. 5).
Finally, we considered the proximity of potential archaeological features to 'old rivers' located in the floodplains as representative of the risk that archaeology incur in being located next to hydrologically unstable areas (Fig. 4e). To extract 'older rivers' we used the Strahler stream order to isolate the high order (≤3) water stretches featuring a greater number of tributaries and consequently a larger amount of water flow. From those we calculated a buffer distance (Fig. 6d).
A non-parametric estimation ( ) of the univariate relationship between archaeological feature intensity and the four spatial covariates was calculated ( [22], 180).
This shows how the distribution of archaeological features is significantly related to the spatial variation of the covariates (Fig. 7).
Potential archaeological features are more likely to be located in areas with a Wetness Index of around 15, which defines the boundary between uplands and cultivated lowlands, than expected by chance alone (see Fig. A3). Moreover, a drastic decline in point intensity is notable for TWI values greater than 20, which in some areas are indicative of hydrologically unstable belts. The distribution of features is almost 'contained' within zones of little water accumulation (10 < TWI < 15) representing hilly and mountainous environment.
Within the upland environs, archaeological features appear to occur most likely in relatively flat areas (between 5 and 7 degrees of slope), but this does not exclude their proximity to 'young water streams' (Fig. 7b). Indeed, if their location seems to be more likely to be farther from high order rivers than expected from chance alone, a comparison of the non-parametric estimation of feature intensity in relation to all rivers and high Strahler order rivers shows that the general tendency is for features to be situated in close proximity of water courses (Fig. 7d and e). This shows that even though archaeological features are located far from major river-and therefore safe from the threat derived from lateral meandering, they are at risk of being affected by bank erosion, which is more vigorous along fast-flowing, smaller upland streams.
Finally, the intensity of potential features of interest is highly dependent on their distance from modern towns, in that they are more likely to be found in remote areas beyond 25 km from built-up areas (Fig. 7c). This apparently shows how the archaeology seems to be safe from threats like urban sprawl, construction and agriculture; however, this pattern can also be explained with the lack of archaeological features identifiable from the satellite imagery being the result of the very poor state of preservation of these in lowlands because of those threats having affected the area for decades.
Generally, first order effects suggest that potential archaeological features are located in relatively 'safe' areas, even though the overall absence of features in the valley bottom can most likely be attributed to the resulting severe destruction of the cultural heritage over the last century. Based on the univariate estimation of feature intensity in relation to the four covariates, we built a first order logistic model (Table 3) and produced a prediction surface of archaeological features intensity in South Khatlon (Fig. 8).
Investigating the first order effects of the point pattern provides insights as to where there is a higher chance to find archaeological features in relation to the four covariates representing sources of threats to the archaeology. However, this does not take into consideration those archaeological remains that have been already affected or damaged. Therefore, a method is required to explore the distribution of archaeological features marked as being affected by the threats here considered. A case-control approach is therefore considered.

Case-control approach
The predicted surface of feature intensity across the region highlights areas where any potential archaeological remain is more likely to be found than expected from chance alone. This prediction is based on the four covariates/predictors representing levels of risk to the archaeology, but does not consider the features that have already been affected by either riverine erosion or human activities. A binary multivariate logistic regression based on a randomised non-site samples has been useful to test whether the four covariates where good predictors for the location in relatively 'safe' areas of the overall population of mapped features (Table 3). However, in order to understand whether or not the features mapped as already affected by any of the threats are located in 'riskier' areas compared to the non-affected ones, we need to adopt a case-control approach, widely used in disciplines like ecology and epidemiology [26][27][28]. This more robust, evidence-based inference allows to understand whether the whole population of potential archaeological features is located in 'risky' areas and whether a subset of affected features is situated in 'riskier' areas, when compared to the rest of mapped features. Namely, the affected features are considered as cases and the rest of the population of non-affected features as controls. The first step of this two-step process confirmed that the location of all mapped features is strongly dependent on the covariates, with the function indicating that they are located in 'safe' areas. The second step involved the exploration of two subsets of affected features.
During the remote sensing analysis we have been recording the condition of the archaeological features, based on what was observable from the satellite imagery. In this way, it was possible to mark the points that had been recorded as features disturbed by a certain threat. Because the remote assessment is somewhat subjective, features presenting a risk value greater than 0 have been marked with the specific on-going threat (see Table A1 as reference of the threats considered). The first subset (N = 57) considered consisted of the features marked with Riverine erosion. For the second subset (N = 364), consisting of features endangered by human activities, we lumped all the features marked with Urban encroachment, Construction or Agriculture.
A binary multivariate logistic regression model was fitted to the first subset using the other mapped features non affected by riverine erosion as non-site samples (Table 4). This shows how those features already eroded by the action of water flow generally follow similar locational settings to the rest of the population, thus indicating that they sit in nor 'riskier' nor 'safer' areas. TWI is the only predictor that seem to influence the presence of eroded features differently from the rest of the population of mapped features and therefore this will be used in creating a prediction surface (see Fig.  A4 top). A second multivariate logistic regression model was fitted to the second subset, consisting of a point pattern of features marked as affected by human activities (Table 5). Conversely to the previous model, this shows how three of the four predictors are influencing the location of features endangered by human activities in a statistically significant different way compared to the rest of the population, thus meaning that they might be located in 'riskier' areas. Interestingly, the explanatory variable with the highest statistical significance is the distance from a modern town. Proximity to modern inhabited areas is commonly considered as being 'risky' for archaeology, as most human activities are supposed to be carried out in the near surrounding of settlements and for this reason a surface indicating Euclidean distance from built-up areas was  included as a covariate representing a threat in this study. However, the positive z value suggests that the probability of finding a feature affected by human activities increases as the distance from modern settlements increases, which seems to contradict the use of proximity to towns as a covariate representing a risk to the archaeology. From Table 2, it is clear how the majority of the features populating this subset have been affected by the construction activities not related to urban expansion, nor to agriculture, but rather to infrastructure development, most likely aimed at connecting remote areas.
The three independent variables that scored low p-values have been used to build a prediction surface indicating the areas of highest probability of finding features disturbed by human activities (see Fig. A4 bottom).
The two surfaces generated from the logistic regression models constructed following a case-control approach offer the heritage manager with a predictive model that provides (1) a regional zonation of areas where is more likely to find potential archaeological features disturbed by a number of different hazards, and (2) insights on which threats are better predictors to explain the location of specific subset of disturbed features.
A way of highlighting different levels of risk of finding features affected by riverine erosion and human activities within those predicted areas is proposed in the next part of the study. A closer look at the density/intensity of the two subsets is now explained from a case-control perspective via means of relative risk surfaces.

Relative surfaces of risk
Kernel smoothing is a standard spatial statistical and GIS method for exploring the spatial intensity of a point pattern thus, in our case, useful to highlight areas of higher concentration of potential archaeological features. This is a good estimate to describe the intensity of the whole population of points, but when exploring a subset of marked points such as features affected by human activities, a more accurate approach is the ratio-based density estimation known as 'relative risk surface' [29]. This case-control approach is usually adopted in epidemiological studies when investigating areas more at risk of an infectious disease, based on the underlying at risk population [30][31][32][33][34]. In our case, the population at risk is the whole array of archaeological features mapped from the satellite imagery, as they are all potentially at risk of being damaged or affected by the threats considered in this study. The infected cases are features marked as already having been disturbed by, or in clear danger of, human and natural originated threats. In our case, the intensity of features marked as endangered is biased by the underlying intensity of the whole population of mapped archaeological features. In order to control for this bias, we used the 'relative risk' approach as follows. An initial bivariate kernel density is calculated for the features marked as affected by riverine erosion (see Fig. A5 top). The bandwidth has been calculated using the likelihood crossvalidation method proposed by Berman and Diggle ([22], 171; [35]) for the three point patterns (all sites, sites affected by erosion and sites affected by human activities). The coarser bandwidth of the three adjusted via ppl has been chosen and applied to all the probability densities.
The areas with higher intensity of features marked as affected by riverine erosion clearly matches areas of higher concentration of all features, which suggests that the probability density is heavily biased by the distribution of the underlying population at risk. Therefore, by calculating the ratio between the density of marked featured by the density of population at risk, we can derive a surface of 'relative risk' which controls for the whole population bias (see Fig. A5 bottom). In order to avoid misleading values, we then plot a truncated risk surface (> 20%) overlaying the upstream catchment areas, indicating where the major water courses are meandering. We can notice how there is a higher risk of riverine erosion in areas adjacent to the floodplain, along the major rivers Panj and Amu-Darya, characterised by channel migration and where the water course is a slow-flowing 'old' river ( Fig. 9). Arguably, areas of river channel migration along the Vakhsk valley are archaeologically empty either because of the general hydrological instability that hindered people settling in the past as well as destroying the occasional archaeological remain, or because the archaeology has been completely wiped out.
Similarly, the intensity of features marked as in danger from human activities has been calculated (see Fig. A6). The 'raw' intensity estimation of this subset presents a similarly biased pattern, since areas of higher density values match areas of higher concentration of the population at risk (see Fig. A6 top). If the relative risk surface is calculated in order to control for this bias the result shows areas where archaeology is more at risk from human activities (see Fig. A6 bottom).
It is clear how even though the majority of features disturbed by human activities are located in the more elevated and remote parts of the study region, the areas in the heavily settled areas are indicated as most likely to contain archaeological features affected by human activities.
By plotting relative intensity values greater than 20% against the topography and the distribution of modern towns, it is clear how the lowland areas heavily settled and agriculturally exploited have a higher probability (>60%) of endangering and destroying the archaeological heritage (Fig. 10).
Interestingly, proximity to modern settlements proves to be mis-interpreted as a good predictor for locating areas where the archaeology is more endangered by human activities. This might be construed as an over-interpretation of the local peaks and troughs of the risk surface. A way to overcome this misleading interpretation of the results is to calculate a surface of p-values for the estimated risk values and plot the contours representing the significance threshold of 0.05 over the risk surface (Fig. 11).
This shows which peaks are statistically more significant, and therefore where the risk of finding archaeological features being affected by either river erosion (Fig. 11 top) or human activities (Fig. 11 bottom) is statistically high. While most risk surface peaks are confirmed to be of statistical significance, areas close to modern towns (represented by white dots) is considered to be a misleading peak in the relative risk values (Fig. 11 bottom). Arguably, this can be explained by the paucity of archaeological features mapped in the valley bottom, for preservation reasons already mentioned.

Discussion
The destruction of archaeological heritage is an on-going worldwide issue, being accelerated by a wide range of natural and anthropogenic factors. Organisations from national and regional levels to the ones of international standing, all work towards a better protection and conservation of archaeological sites and their work often requires better scientific data. One of the first priorities identified in all regions is the poor condition (or lack) of the documentation regarding archaeological sites and landscapes. Indeed, over the last decade, a lot of effort has been put into building comprehensive inventories and databases of archaeological sites and monuments in areas which are impressively rich in cultural heritage. Large research projects are contributing to better recording of existing sites as well as greater integration of modern records with historical data in various regions (i.e. EAMENA). CAAL, like other initiatives, is populating a complex multi-dimensional database of sites and landscapes that covers the whole of Central Asia, pulling together data from a variety of sources. One of these resources is satellite imagery, that has been at the centre of archaeological research for decades and is now increasingly employed as a resource in large documentation projects. Generally, satellite imagery is used to map and record the condition of the archaeological features, on the basis of what is visible 'from above', both in terms of structural layout of features of interest and in terms of quantifiable threats to the archaeology.
In this study, we develop a way of statistically analysing the mapped data and of modelling threats in order to quantitatively define areas which are 'riskier' for the cultural heritage. We construe the danger coming from the whole array of hazards and threats as a virus that is infecting and putting a valuable population of heritage resources at risk. Consequently, a case-control approach borrowed from disciplines like epidemiology resulted in a strong computational and analytical method with which to understand and model the risk threating the cultural heritage in South Tajikistan.
Modelling these threats as covariates allowed us to map areas that are potentially more dangerous for the archaeology, observing that archaeological features in South Khatlon are mostly located in 'safe' areas. However, the relative risk surfaces calculated on subset distributions of features marked as already being affected by  human activities and natural erosion, suggests that the areas where archaeology is more at risk are actually those where it is less likely to find sites and monuments. Arguably, the more likely of the explanations is that unfortunately the lack of archaeology in the valley bottoms, heavily settled and cultivated, is the result of these disturbances destroying remains of sites and monuments over several decades.
The potential of these formal methods of analysis lies in the fact that all of the input factors considered here can change over time. Thus, if more features are mapped, or ground-truth shows some sites to be of no historical interest, or if the number of modern settlements and cultivated lands increases, the final result of the model could be remarkably different. Moreover, if new threats are to present themselves, they could easily be added to the model, again producing different outputs in terms of risk prediction maps. Therefore, the advantage of the proposed approach stands in it being customizable to different circumstances (e.g. new or updated input factors) and to diverse regions that present distinct priorities. The efforts of this study to formally quantify the risk to the cultural heritage in a specific region should be considered one aspect of a more complete approach to risk management, which should involve an equal qualitative participation of local stakeholder and heritage authorities. We suggest that perhaps Spatial Multi-Criteria Decision Making [36][37][38] is the most obvious setting in which the present work could be of use for the numerous relevant stakeholders including the National Academy of Science of Tajikistan, the Ministry of Culture of Tajikistan and the Museums and Universities of the country. We advocate for the development of similarly formal and customizable methods of approaching the risk to cultural heritage in other regions where the archaeology is severely under threat.