Laying bare the landscape: commercial archaeology and the potential of digital spatial data

Archaeology in Britain and elsewhere in the world has undergone considerable change due to the advent of large-scale developer-funded excavation. Huge areas are now routinely excavated and some hot-spots of intensive development have seen considerable portions of the landscape investigated (Thomas 2013). In parallel with larger-scale excavation there has been a digital revolution in which much more information is created and stored in digital form, making it, potentially, much easier to combine and manipulate diverse sources of information. However, the links between the bodies carrying out excavation and those curating the resulting information are not always set up in a manner best suited to creating, storing and integrating large amounts of digital data. A pilot project in the upper Thames valley has shown the potential and the difficulties of data flow, and how this can allow, or impede, a new understanding of rural settlement in all periods. Here we present the project, titled ‘Laying bare the landscape’, discuss our working methods and offer some recommendations for how data flow and curation might be improved.


Introduction
Certain areas in the United Kingdom have seen great concentrations of development-led archaeological investigations, often carried out over several decades and by a multiplicity of different organisations. Realising the pressing need for wider synthesis of these results, in October 2012 we began a pilot project, supported by the John Fell Fund (Oxford University), to test the methodology and its application in part of the upper Thames valley. The project has two linked aims. The first is to determine whether the data produced by development-led archaeology can be put to use for extensive landscape study by integrating it within a Geographical Information System (GIS). We have learned to coordinate data which has come from over 60 different investigating bodies and which is held in three different county Historic Environment Record (HER) systems. We have also developed a more nuanced understanding of the character and significance of the buried archaeological resource of this area, as well as an understanding of the potential and the challenges of conducting this kind of synthesis. Along the way, we have had some indications of potential approaches to digital spatial data generated by large-scale development-led archaeology. The second aim is to produce a new narrative of human history in this landscape from prehistoric times right the way through to modern land use, including a greater understanding of the relationships between areas of intensive human activity and apparent 'negative spaces', as revealed by this large-scale work. These archaeological implications will be considered in more detail in a future paper (Thomas et al. forthcoming).
In this era of abundant development-led work, we are still coping with a vast amount of data from excavation, evaluation, and survey; we are, to some degree, 'drowning in data' (Thomas 1991). Moreover, this data is being generated by a number of commercial archaeological practices. Large open-area excavations are a characteristic of commercial archaeological investigation in the upper Thames region, with some of these developments involving investigation over many tens of hectares, far exceeding any research-led excavation projects in their scale. Exploration of these enormous areas allows for analysis at the landscape level, rather than simply at that of the individual settlement or field system.
The arrangements for carrying out this work, and the financial resources to support it, are guided and limited by specific individual development schemes. Archaeologists in the local planning authorities decide, on the basis of information held in their HERs, what archaeological work is needed on a particular development. The developer will then commission a commercial archaeological practice to carry out the work. Therefore, despite the intensity of work conducted in the area, individual archaeological practices do not have responsibility (or the funding) for undertaking wider synthesis of all the results.

Methodology
We took as our study area an elongated polygon covering 154 km² to the south of Cirencester, stretching between Cricklade and Lechlade, in the uppermost part of the Thames valley in southern England (Figure 1). The massive amount of archaeological excavation and other fieldwork here since the 1970s and, especially, since 1990 has occurred largely as a consequence of extensive gravel quarrying (Figure 2). The study area is archaeologically rich, as seen from the cropmark evidence mapped in detail by English Heritage's National Mapping Programme (NMP; Fenner n.d.). In total, over 630 separate investigations (or archaeological 'events') by more than 60 consultants have occurred (Figure 3). Of these, 78% involve some form of 'intrusive' archaeology. Further large areas have been subjected to geophysical survey.

Figure 1. Location of the study area
Our main starting point for assembling information was Bournemouth University's Archaeological Investigations Project (AIP) database (Darvill and Russell 2002), checked against and supplemented by consultation with the county HERs. The study area is divided between the counties of Gloucestershire (55%) to the north and Wiltshire (43%) to the south, with Oxfordshire making up the easternmost 2%. Data gathering from this multiplicity of authorities highlighted a complication at the outset. Terminology and data classification systems were not standardised between the HERs (one person's evaluation was another's trial trenching and a third's watching brief). As anyone who has searched a large archaeological database will know, differences in terminology are widespread (Evans 2013: 26). Most HERs distinguish between three types of record: 'events' (discrete episodes of archaeological investigation, such as excavation or building survey); 'monuments' (i.e. types of archaeological site); and sources/archives. We were interested in modes and scale of investigation, so we entered relevant events into the GIS, allowing them to be analysed by event type, size and investigating organization. In most cases these records provided a GIS boundary 'shapefile' for the limits of the investigative event, which enabled us to quantify what types of investigations were being conducted and on what scales (Figure 4). However, we wanted to be able to think at a more detailed level, showing actual features as recorded by the excavators or surveyors. Our next step was therefore to visit the investigating organizations to obtain their digital phased site plans. Our objective was to assemble all the relevant data, convert it into GIS-usable form (where it was not already in that form), integrate it within the GIS and then analyse these results, mainly in terms of the overall spatial and chronological patterning of features such as trackways, fields and settlements.
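The harmonisation of event terminology described above can be illustrated with a short sketch. It is not the procedure used in the project, which relied on ArcGIS, Access and Excel; it simply shows one way that differently worded event types might be mapped to a shared vocabulary before being counted. The lookup table, the field names (`type`, `area_ha`) and the example records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup from local HER wording to one shared vocabulary.
# The pairings are illustrative only and are not an agreed thesaurus.
EVENT_TYPE_LOOKUP = {
    "evaluation": "evaluation",
    "trial trenching": "evaluation",
    "excavation": "excavation",
    "open area excavation": "excavation",
    "geophysical survey": "geophysical survey",
    "geophysics": "geophysical survey",
}


def normalise_event_type(raw_term):
    """Map a free-text event type to the shared vocabulary."""
    key = " ".join(raw_term.lower().split())  # ignore case and stray spaces
    return EVENT_TYPE_LOOKUP.get(key, "unclassified")


def summarise_events(events):
    """Count events and sum investigated area for each normalised type."""
    summary = defaultdict(lambda: {"count": 0, "area_ha": 0.0})
    for event in events:
        event_type = normalise_event_type(event["type"])
        summary[event_type]["count"] += 1
        summary[event_type]["area_ha"] += event["area_ha"]
    return dict(summary)


# Invented example records, not project data.
events = [
    {"her": "Gloucestershire", "type": "Trial Trenching", "area_ha": 1.5},
    {"her": "Wiltshire", "type": "Evaluation", "area_ha": 2.0},
    {"her": "Oxfordshire", "type": " excavation", "area_ha": 12.0},
    {"her": "Wiltshire", "type": "Geophysics", "area_ha": 30.0},
    {"her": "Wiltshire", "type": "Fieldwalking", "area_ha": 8.0},
]

summary = summarise_events(events)
for event_type in sorted(summary):
    totals = summary[event_type]
    print(event_type, totals["count"], totals["area_ha"])
```

Terms absent from the lookup are reported as 'unclassified' rather than guessed at, so that unfamiliar wording is flagged for a person to review.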
The GIS software used was ArcGIS 10.1 and data management employed both Microsoft Access and Excel. We were aware that the exact formats and recording methods of this data were likely to vary between investigating organisations and from site to site, but we believed (rather ingenuously as it turned out) that most of the results of this work would be available in digital form; we assumed that GIS-ready phased shapefiles or CAD files of site plans were created routinely during the post-excavation and publication process. This assumption proved to be incorrect.
Visits to commercial practices and searches of their digital archives showed that few sites have GIS-ready data available in an immediately useful format, but that the data existed; when digitised and made consistent, the work of different practices could be fitted together in such a way as to enable views of entire excavated landscapes. In areas of continuing development, such as successive extensions of gravel quarries, it is important that new archaeological work takes full account of what was found previously, possibly by a different organisation, in adjacent or nearby areas (Figure 5). Because of a lack of integration of existing data, this is often not easily done at present. However, by combining digital (or digitised) site plans we can produce detailed views of the whole archaeological landscape. As well as being of interest to researchers, this is potentially of great value to archaeological practices and local authority archaeologists. They are aiming to make the archaeological input to planning decisions as rapid and as soundly-based as possible. Access to integrated GIS information of the kind described should assist greatly with this. Thus, a key outcome of our project is a series of suggestions for how data flow and curation could be improved for the future. A major improvement would be that all future deposition of data to the HER include not only the GIS shapefile of the investigation boundary, which has become standard current practice, but also digital phased feature plans in a format that can be readily integrated into GIS (or easily modified for such use). This should meet some minimum requirements. The first of these is the most basic yet often the most neglected: the georeferencing of any digital plans.
This permits the accurate integration of fieldwork done by different organizations, allowing (for example) features only partially seen in one excavation to be understood over their entire length (Figure 6). Whether this takes the form of CAD .wld files or any real-world coordinate system, plans without this type of referencing are problematic.
Firstly, the organization which has carried out the work is in the best position to know the precise geographical coordinates of the features and interventions, and therefore can produce the most accurate georeferencing if it is done at the same time as the illustrative figures are produced for the client. This might also include linking tabulated phased feature information with appropriate CAD layers. Secondly, the curators of the data (HERs, the National Monuments Record (NMR) and the Archaeology Data Service (ADS)) should not have to bear the burden of georeferencing all incoming digital data in order to place them within a single GIS. Whilst a degree of this will need to occur with the already extant backlog of held digital data, there is no need to add to the backlog when the remedy is relatively easy, and really just a matter of good professional practice. Digital plans showing at least the major excavated features, and linked dating information (or, alternatively, phased digital site plans) are also important for making the data readily useable.
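To make the georeferencing requirement concrete, the sketch below shows how a plan drawn on a local site grid might be moved onto a real-world grid from two known control points. It assumes that the two grids differ only by a rotation, a uniform scale and a shift (a two-point similarity transformation), which will not hold for every plan. The function name and all coordinate values are invented for illustration, and the sketch describes a general principle rather than any particular file format or the workflow of any practice.

```python
def fit_similarity(local_a, world_a, local_b, world_b):
    """Build a local-to-world converter from two control points.

    Each point is an (x, y) pair. Treating points as complex numbers,
    the transformation is w = a * z + b, where a carries the rotation
    and uniform scale and b carries the shift.
    """
    z1, z2 = complex(*local_a), complex(*local_b)
    w1, w2 = complex(*world_a), complex(*world_b)
    if z1 == z2:
        raise ValueError("control points must be distinct")
    a = (w2 - w1) / (z2 - z1)  # rotation and uniform scale
    b = w1 - a * z1            # shift

    def to_world(point):
        w = a * complex(*point) + b
        return (w.real, w.imag)

    return to_world


# Invented control points: the site grid origin and a second peg 100 m
# along the site-grid x axis, with made-up eastings and northings.
to_world = fit_similarity(
    local_a=(0.0, 0.0), world_a=(410000.0, 195000.0),
    local_b=(100.0, 0.0), world_b=(410000.0, 195100.0),
)

# A hypothetical ditch terminal recorded at (0, 50) on the site grid.
easting, northing = to_world((0.0, 50.0))
print(round(easting, 3), round(northing, 3))
```

Two control points are the minimum for this kind of fit; with more points a least-squares fit would also give some measure of survey error, which is one reason the excavating organisation is better placed than the curator to do this work.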
Beyond these basic requirements, we feel that flexibility in data formats is important. This is partly because computer hardware and software change regularly, but it is also because commercial archaeological work is carried out by organizations of varying size, some without the resources to support a specialist geomatics unit, or to have invested in the often costly software required for 'high-end' GIS and CAD manipulation of data. However, some of these limiting factors can be mitigated by the use of open-source programmes and 'free-ware', which should allow for some standardization of geo-referenced site data. As an early result of our project, both Gloucestershire and Wiltshire County Councils have begun requesting that archaeological practices deposit more detailed digital data in a form suitable for GIS. At the time of writing, the level of positive responses has reached nearly 50%. We are confident that, as the benefits of integrating phased digital feature plans become apparent, that figure will rise. A presentation given at the 2013 Institute for Archaeologists (IfA) conference generated a great deal of interest from both HERs and commercial practices in other areas of the country, keen to discuss how they would benefit from having such a GIS resource for their region. It is also important to keep in mind that the digital spatial data created for the grey literature or published reports are a part of the site archive, and as such should form part of the archival deposition as well as the submission to the relevant county authority. The exact method of dissemination of the data still needs to be considered, in light of the responsibilities of the HERs and curatorial bodies; however, this is not an insurmountable hurdle and the channels of communication and spirit of teamwork engendered by this pilot project will likely lead to workable solutions.
When the National Mapping Programme was first suggested in the last century, there was some incredulity that so much data from so many diverse sources could possibly be brought 'under one roof' (R Bradley, pers. comm.). When, however, the benefits of having such a resource were seen, these perceived stumbling blocks were quickly overcome and a tool of great value was created. It is our hope that such will be the case with the compilation of this mapping data in areas of intensive investigation.

Some Implications of Our Work
Data at first sight seems to be about factual information, computer systems and standards. It does indeed involve all of these things, but data also derives from, and lays bare, crucial social and organisational relations. Archaeology has a well-recognised tendency towards fragmentation, with commercial practices, local authority archaeologists and academics respectively being driven by varying motives, different funding streams and different desired outcomes. All are under pressures of time and money, but are also united in an attempt to understand the past. As well as our technical recommendations, an increased sense of collaboration between academic researchers, commercial archaeologists and planning bodies is also needed (Bradley 2006). We have already begun this by hosting a very successful collaborative seminar which drew on the combined expertise of representatives from the relevant HERs, English Heritage, and project and geomatics managers from some of the major practices working in the study area. The local authority archaeologists for Wiltshire and Gloucestershire have been strongly supportive of this project. They have provided assistance both with advice and with information from their HERs. We anticipate that the recommendations made by this project will be of significant value to them in their planning work. Furthermore, since this problem (lack of integration) is a national one, English Heritage has also been extremely supportive in the development of this pilot project.
The integration of cropmark, geophysical, evaluation and excavation evidence within a single GIS database (and, if possible, the addition of results from local voluntary groups and individuals, such as fieldwalking) should allow for more efficient desk-based assessments, a more nuanced understanding of risk-by-period for planning purposes, and better awareness of the overall archaeological potential of an area prior to field evaluations. In some cases, a high level of existing knowledge and understanding of risk could even reduce the need for large-scale evaluation by trenching events; this could instead be replaced by more targeted and in-depth excavation sampling. More freely available digital data can also become the basis for greater public engagement with archaeology and other evidence of the past. The next phase in our work is to use the data we have to better understand the landscapes of this uppermost part of the Thames valley. We are now creating a series of phased landscape plans, using the spatial data we have collected. One aspect of interest which has already emerged is that we can start to recognise and to think about the 'negative spaces' in the landscape (areas in which little or no activity is archaeologically detectable) as well as the areas of intensive activity.
In the overall study area, the Neolithic and Bronze Age appear to be only very lightly represented in the uppermost Thames valley, as compared to the middle and lower Thames. This suggests rather different long-term histories in these areas. Additionally, in some specific regions of the study area, we noted that certain periods were particularly poorly represented compared to neighbouring regions (notably the sub-Roman/Saxon period). Possibly, certain factors, whether social, political, or environmental, led to an under-exploitation of these regions. Alternatively, these 'holes' may in some instances be an artificial by-product of excavation methodologies. If the open area stripping technique is applied without careful sampling of topsoil and subsoil layers, it is possible that some of the shallow archaeological material (perhaps the absent sub-Roman evidence) is being machined away. This is a real concern and one which it may be possible to address by analyses of the site methodology of the investigations within these period-based 'blank spots'. It may be that more appropriate techniques could be devised for large-scale developer-funded work which would answer this question.
In some instances, however, comparison of the excavated information with the NMP cropmark mapping strengthens the argument that these areas are truly lacking in archaeological features. Such comparison can also help to suggest a chronology for some previously undated cropmarks, as we can now link the dating evidence for excavated portions with their fuller extent as linear cropmarks in the wider landscape. These may indicate a series of negative spaces that constitute part of the very organization of the landscape, being apparently blank areas between settlements or fields. Despite there being no obvious difference in topography or geology, there are some locations which appear to have been left 'unused' across immense swathes of time, in some instances from the Iron Age into the Post-Medieval period (Figure 7). These could have been areas of common pasture or woodland, for example.

Conclusion
Our selection of the study area was heavily guided by the fact that the well-established units which did the majority of the work in the study area, and the HERs involved, had the IT resources to assist the project and plan for future extensions of its application; we wanted to run the pilot where it had a greater chance of success. We recognise that nationally not all units and HERs are equally equipped, but we believe the principles of the project can be applied anywhere in Britain. Building on this project, we are developing a larger-scale proposal which will apply what we have learned to other intensively investigated regions of the country. Archaeology world-wide is participating in the so-called 'Big Data Revolution'.
There are huge empirical benefits in combining digital data sources in a seamless and accessible manner. Equally, the creation, movement, curation and use of archaeological information highlight the organizational structure of British archaeology, in the process showing both the need for various sectors to act together and the potential benefits to be had from greater cooperation in our combined attempts to understand the past and to manage its physical traces.