movecost: An R package for calculating accumulated slope-dependent anisotropic cost-surfaces and least-cost paths

Cost-surface and least-cost path analyses are widely used tools to understand the ways in which movement relates and engages with the surrounding space. They are employed in research fields as diverse as the analysis of travel corridors, land accessibility, site locations, maritime pathways, animal seascape connectivity, transportation, search and rescue operations. This work describes the ‘movecost’ package, designed for the free R statistical environment, which provides the facility to produce, in a relatively straightforward way, various accumulated slope-dependent cost surfaces and least-cost outputs from different models of movement across the terrain. The package motivation and significance are described, and the main software characteristics are outlined by means of an illustrative example.


Motivation and significance
Geographic Information System (GIS) is a widely used tool for the analysis, modelling, and interpretation of a wide range of data [1,2].GIS provides facilities to acquire spatial data, to integrate them in a common framework, and to meaningfully organize and interrelate spatial information.GIS and spatial data analysis prove crucial to research fields as diverse as environment and earth science [3,4], ecology [5], agriculture [6], public health [7,8], socio-economic [9] and forensic disciplines [10], crime studies [11,12], archaeology/anthropology [1,[13][14][15].One of the aspects that GIS may help exploring is understanding the ways in which movement relates and engages with the surrounding space.For this purpose, cost-surface and least-cost path analysis have been increasingly used in a variety of contexts and for different aims including (but not limited to) the study of prehistoric travel corridors [15][16][17][18][19], dispersal rate [20,21], human movement and land accessibility [22,23], archaeological site location [24], Roman aqueducts [25] and roads [26], maritime pathways [27][28][29], animal seascape connectivity [30], off-road transportation [31,32], maritime search and rescue operations [33].To put it in a nutshell, cost-surface analysis entails assigning ''a cost to each cell in a raster map, and to accumulate these costs by travelling over the map'' [34] outward from a source location.Cost can be expressed using different measures, for instance fuel consumption [32], energy, speed, or time [35].Least-cost analysis calculates which neighbouring cell on the accumulated cost raster one reaches back to the source location along a least costly path [1,13,36].
The 'movecost' package aims to provide the facility to calculate in R [37] an accumulated slope-dependent cost-surface and leastcost paths pertaining to movement across the landscape.It must be acknowledged at the outset that the package addresses the issue of cost-surface and least-cost path calculation from the specific standpoint that cost is function of the terrain slope.While this can be seen as a limitation, and acknowledging that movement can be also influenced by symbolic elements, type of terrain, weather condition, clothing, loads carried, gender, age, fitness, body characteristics, headwinds, field of view [13,[38][39][40], slope is indeed a significant factor (albeit not the only influential one) affecting the movement across the terrain [17,19,39,41].While GIS software such as GRASS [42] and Esri's ArcMap [43] feature modules for cost-surface and least-cost paths generation, virtually no R package as yet (at the best of this author's knowledge) provides the possibility to produce, in a comparable way, various accumulated slope-dependent cost surfaces and leastcost outputs from different models of movement.An interesting R package, 'fastmaRching' developed by Fabio Silva [44], can indeed output arrival time from a source location to each raster cell.However, unlike 'movecost', the cost of movement must be preliminarily defined by the user and input as a cost surface.Interestingly, 'movecost' and 'fastmaRching' can be conceived as complementary.
The availability of different slope-dependent cost-functions marks a sharp difference from the GRASS and ArcMap modules mentioned above.Unlike the latter, where the cost surface must be input separately or where a table of vertical factors must be pre-computed by the user and fed into the software [43], 'movecost' implements ten cost functions (see later on) which can be easily and readily applied to the analysis of movement across the terrain represented by a Digital Terrain Model raster (hereafter DTM).While literature abounds of functions that model the cost of movement across the terrain [overview in 1,45], the ones implemented in 'movecost' have been chosen both on the basis of the author's preference and in consideration of their frequent use in literature.Future improved versions of the package will widen the body of implemented functions and will also provide the facility to apply a user-defined cost function along the ones that are built-in.A review of the functions' rationale, as well as of their potential and limitations, is beyond the scope of this article; the interested reader is referred to the available literature [e.g., 1,13,36].However, it must be borne in mind that ''any modelling entails a varying degree of generalization'' [29] and risks to simplify extremely complex realities.It has been found, for instance, that the Tobler's hiking function for off-path movement tends to overestimate the time taken to navigate, while his model for on-path movement seems to better predict walking speed [46].
Besides the implementation of slope-dependent cost surface and least-cost path analysis in a free and open source environment like R, and besides the availability of different cost functions, another motivating factor for the development of the package has been the easiness of use.To touch upon what will be elaborated further in the next paragraph, all the user has to do is feeding a DTM into the movecost() function, along with a point representing the start location.An accumulated slope-dependent cost surface raster will be generated and, if destination locations have been also fed into the function, least-cost paths will be displayed over the input DTM.The cost-function on which the analysis is based can be selected with an apposite parameter, and other parameters allow the user to adjust aspects of the graphical outputs and/or to export the output data in anticipation of any other use the user may have in mind.
Finally, the package has been conceived with an eye toward high-quality graphical results.Considerable effort has been put in making the package produce images that are elegantly laid out (of course, by this author's subjective aesthetic standards), with informative main titles and subtitles, and which may prove ready for use in publications without further editing.

Software description
The 'movecost' package (currently in its 0.2 version) is available from CRAN (The Comprehensive R Archive Network) and can be installed using the command:
For the calculation of the cost surface, the package ultimately relies on the 'gdistance' package and follows the procedure described in literature [14,52].Internally, movecost() first calculates the altitudinal difference between the cells of the input DTM (using the transition() function out of the 'gdistance' package) and then divide it by the distance between cell centres (using the geoCorrection() function out of the same package).The result is the slope expressed as rise over run.Next, the cost function is applied to the slope dataset, limiting the calculation to adjacent cells [52].For instance, the Tobler's on path cost function is internally defined as 6*exp(−3.5*abs(x[adj]+0.05)),where x[adj] is the slope as rise/run calculated for adjacent cells [52].
The interested reader may want to refer to the package's help documentation to know how the implemented cost-functions are internally defined.Finally, the geoCorrection() function is applied again to account for the distance between cell centres because the cost involved when moving in diagonal directions is larger than the cost of moving along cardinal connections.
The walking-speed-related cost functions (see later on) are used as they are, while the other implemented functions are reciprocated.This is done since, as stressed in literature, ''gdistance works with conductivity rather than the more usual approach using costs''; in consideration of this, ''we need inverse cost functions'' [14].Therefore, if we want to estimate time, we have to use the walking-speed functions as they are since the final accumulated values will correspond to the reciprocal of speed, i.e. walking time (or pace).In the other cases, we have to use 1/cost to eventually get cost/1 [14].

Illustrative example
Rather than providing a list of all the implemented parameters (which are indeed thoroughly detailed in the package's help documentation), the use of the package is described by means of an illustrative example.A brief outline of the package's main parameters will be nonetheless provided.In our example, we use a DTM (cell size 25 m) representing a portion of land immediately west of the Mount Etna (Sicily, Italy); it is made up of 1660 rows and 2846 columns, totalling 4,724,360 cells.The DTM is part of a larger tile of the European Digital Elevation Model (EU-DEM, v. 1.1) [53].We also have a point ('SpatialPointsDataFrame' class) representing the location from where we want the cost of movement to be accumulated outwards, and eight spots representing destination locations ('SpatialPointsDataFrame' class).The DTM, the start and the end locations have been stored in R in three objects named dtm, start and ends respectively.
For the sake of this example, we want to produce a raster that represents the accumulated cost of moving from the start location outwards, and the least-cost paths from the start to the  end locations.We conceptualize the cost in terms of walking time.Among the different slope-dependent cost functions implemented that express the cost as walking time, we use the well-known Tobler's hiking function (for an on-path walk) [54].
The latter models walking speed as function of the terrain slope and, when reciprocated (for the reasons explained above), can be used to work out walking time.
To accomplish the mentioned task, the user can simply enter the command:
As for the function's output, two maps will be produced in the R graphic panel and some data (described later on) will be returned.The first map (Fig. 1) represents the walking time accumulated from the start location outwards.Walking time is expressed in hours by default, but the user is given the facility to choose minutes using the parameter time=''m''.Black isolines (called isochrones) represent different walking time zones.The user can set the interval at which the isochrones are charted using the parameter breaks; if no value is supplied, the interval is set by default to 1/10 of the range of values of the accumulated cost surface.As far as the accumulated cost map is concerned, the user can select the type of visualization.This is achieved using either outp=''r'' or outp=''c''.The former produces a raster with a colour scale and contour lines representing the accumulated cost surface, like in Fig. 1; the latter option only produces contour lines (i.e., isochrones).
The second map (Fig. 2) represents the start location (black dot), the destination locations (red dots), and the least-cost paths (black solid lines), plotted on top of the input DTM.Numeric labels close to the destination locations report the cost (time, in this example) involved in moving from the start to the destinations.Needless to say, in this example the cost is expressed in hours as set by the apposite parameter mentioned above.The labels can be disabled using the parameter destin.lab=FALSE.In the mentioned Fig. 2, a blue solid line was added for the sake of this illustrative example to show the anisotropy (i.e., directiondependent) [1] of the least-cost path calculation.The least costly path to move from the origin to the destination (solid black line) does not totally overlap with the path along which one moves in the opposite direction (solid blue line) [52].
The two above described maps can be arranged together in a single display using the parameter oneplot=TRUE.
As for the output data to which reference was made earlier (and that in this example are stored in the result object), the function returns a list storing four components: -a raster ('RasterLayer' class) representing the accumulated cost (accumulated.cost.raster); -contour lines ('SpatialLinesDataFrame' class) extracted from the accumulated cost surface (isolines); the estimated least-cost paths ('SpatialLines' class) (LCPs); -a copy of the input destination location(s) dataset with a new variable ('cost') added ('SpatialPointsDataFrame' class) (dest.loc.w.cost).
The length of each least-cost path is stored under a variable of the LCPs component; in the present example the variable can be accessed using results$LCPs$length.The length unit depend on the unit used in the input DTM, metre in our example.Should the user want to use the output data into any GIS software, the parameter export=TRUE will export the accumulated cost surface as a GeoTiff file, while the isochrones and the least-cost path(s) will be exported as shapefile; all the files will bear a suffix corresponding to the selected cost function.
Finally, as for computational demand in terms of processing time, producing the accumulated cost surface around the start location, which in our example entails processing 4,724,360 cells, took about 3.7 min on a macOS-based machine (2.3 GHz Intel Core i5, RAM 16 GB 2133 MHz LPDDR3).Producing the accumulate cost surface and the least-cost paths took about 4.18 min overall.

Impact and conclusions
From the above paragraphs, it is apparent that the 'movecost' package has not been designed to pursue new research questions.Rather, the rationale and motivation lie in the opportunity to provide GIS users with a free and open-source facility for the generation of various accumulated slope-dependent cost surfaces and least-cost outputs from different models of movement.As stressed, this was not as yet available in R nor anything directly comparable is so far available in widely used GIS software such as GRASS or ArcMap.Considering that many are the factors that influence the cost of movement across the landscape, the package's reliance on the slope may be seen as a limitation.However, since slope proves a significant factor influencing the movement, and since the analysis of the way in which movement relates and engages with the surrounding space in terms of cost is a topic central to many disciplines (like for instance the social/anthropological sciences [1,13]), the package will likely meet the favour of many users in more than one research field.Future developments of the package, including widening the number of implemented cost functions and the facility to apply a userdefined function for cost calculation, will definitely expand the potentials of the programme.In any instance, as of its current version, the easiness of installation and use, the transparency of the source code, the availability of a relatively large body of ready-to-use slope-dependent cost functions, and the attention for the quality and flexibility of the graphical outputs, are bound to assure a positive reception by both GIS users and R enthusiasts.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1.Accumulated cost-surface raster produced by the movecost() function.The black solid dot represents the location from which the cost (expressed as walking time, in hour) is accumulated outwards.Isolines (isochrones; iso = equal, chrone = time) connect locations of equal travel time.The cost is based on the Tobler's hiking function for on-path walking (see the text for further details).

Fig. 2 .
Fig. 2. Least-cost paths produced by the movecost() function.Black solid lines represent the least costly paths connecting the source location (black dot) to the destination locations (red dots).Labels indicate the walking distance in terms of travel time (in hour).The solid blue line represents the least-cost path for reaching back the source location from one of the destination locations; the path only partially overlaps the other because of the anisotropy (i.e., direction-dependent) of the least-cost path calculation.Dots and lines are plotted on top of a DTM (25 m cell size) representing a portion of land immediately west of Mount Etna (Sicily, Italy).