Urbanism in Ancient Peninsular Italy: developing a methodology for a database analysis of higher order settlements (350 BCE to 300 CE)

This article describes the methodology of a two-year research project to create an analytical database and GIS of 583 (proto-)urban centres on the Italian peninsula that existed between 350 BCE and 300 CE. The article is linked to the project's data files, deposited with the ADS, and is essential reading for users of the database. The research design, format and functionality of the database are described in conjunction with the challenges encountered during the methodological development of the project. The relevance of the project to the historical development of urbanism on the Italian peninsula during the period under study is outlined. An overview of the project's results provides an insight into the potential of the research methodology. It is relevant to anyone interested in ancient urbanism, Italian and Roman archaeology, or in the methods and results of combining ancient textual and archaeological legacy data with geospatial data.


Introduction
This article describes the methodology of a project to collate and analyse the evidence for all the archaeologically known major settlements on the Italian peninsula south of the Po River occupied during the period 350 BCE to 300 CE. The overarching research aim of this project is to employ quantitative and geospatial analyses to understand how and why the settlement pattern of (proto-)urban centres changed during the period spanning the Roman conquest and control of the peninsula. At the core of this research has been the creation of an analytical database and GIS. This article documents the processes of data collation and database architecture, and provides indicative results to illustrate the value of the resulting resource both for the research project's wider aims and for other potential users of the project archive.
An important innovation is the collation and analysis of data on a peninsula-wide scale. This is intended to permit the identification of patterns that are not apparent when the evidence is approached on a site-by-site, or even region-by-region, basis. Regional developments may be compared and contrasted, and thus better contextualised. Although there is already general awareness that supra-regional processes affected ancient settlement patterns, this awareness remains impressionistic owing to the regionalism of archaeological studies (e.g. Colivicchi and Zaccagnino 2008; Oakley 1995; Paoletti 2008). The database is therefore intended to highlight the chronological and geographical parameters of the supra-regional settlement processes that characterised the period which encompasses the Roman conquest of Italy through to the eve of Late Antiquity.
The database contains both archaeological and ancient textual information relevant to physical and institutional development of these sites over time. A particular purpose of the analyses is to establish with greater precision whether there are chronological and geographical correlations between known historical events, changes in settlement patterns, and the physical and institutional development of these sites. Moreover, the analyses have the potential to identify and elucidate historical processes that are not documented in ancient textual sources. The most significant of these processes is the sharp rise in the number of archaeologically documented major settlements during the early Hellenistic period (Sewell forthcoming). These settlements are contemporaneous with the most intensive phase of fortification construction on the peninsula during the whole of classical antiquity; they also concentrate in the same areas (see also sections 4.8 and 5). These developments are not clearly documented in Roman textual sources and although they occurred during the period of Roman conquest, they did so both in areas directly conquered by Rome and in areas where no Roman activity is recorded. These and other results are detailed in a series of papers that address settlement processes related to specific historical periods (Sewell forthcoming, in prep a, b, c).
The project's data files are deposited with the Archaeology Data Service (Sewell 2015). For users of these files, this article provides supplementary information on the methodological decisions taken during the collation of the data and the resulting significance for the character and value of the database. Part of the reflexive approach adopted for the study is to publish discussion of the methodological processes undertaken in relation to the research design and the consequent database structure. Data categorisation required particular care, and a discussion of this aspect of the project sheds light on the nature of the archaeological record and some of the problems of using it.
The study is limited to sites that are known archaeologically (a final total of 583), making the adoption of a database and GIS essential for the storage and analysis of the large quantities of data involved. A primary research objective is to adapt the methodology of landscape archaeology for the study of ancient urbanism. The core of the database is formed by archaeological data collated exclusively from published works, which can thus be described as legacy data (Witcher 2008). Owing to the large number of sites included in the study, it was not possible within the limits of the current project to identify and integrate unpublished evidence from disparate archives. In order to research such a large number of sites efficiently, categories of archaeological information were selected for digitisation that are both commonly published and diagnostic of sites' physical development.
The structure of the database and the analysis of the data have been determined by the project's core research questions. Historians consider Rome to have subjugated most of the Italian peninsula in the period c. 350 to 250 BCE. Yet what might be described as characteristically 'Roman towns', with fora and specific types of monumental architecture, such as basilicas and amphitheatres, only became common in peninsular Italy from the later 1st century BCE in a process associated with the creation of veteran colonies and municipia (Figure 1). In fact, the majority of these towns had pre-Roman origins and therefore demonstrate several centuries of development before they reached their familiar Roman form. A key research aim has been to understand more about the processes that shaped the network of major settlements over time, from immediately before the Roman conquest through to the development of Roman towns during the early imperial period. GIS is well suited to the task of identifying changes in the spatial organisation of settlements on regional and supra-regional levels, and a database possessing a robust chronological structure can chart these changes over time.

Figure 1 caption: The map comprises both confidently assigned statuses derived from inscriptions or explicit statements in literary sources and hypothesised statuses assigned by scholars based on archaeological evidence and/or implied references in ancient texts.
Although the well-documented events of the Roman conquest have tended to dominate and influence the interpretation of the archaeological record, the rich evidence for pre-Roman settlement provides the opportunity to set the developments of the Roman period into context. Greek and Roman written sources describe at least 35 distinct peoples inhabiting Italy during the 4th and early 3rd centuries BCE. Apart from some Greek accounts of the Greek city-states of southern Italy (Purcell 1994), these non-Roman peoples have left no surviving accounts of themselves. Hence, we have only glimpses from the ancient texts, and almost exclusively through Roman or Greek eyes. As a result, knowledge of Italy's pre-Roman peoples is largely based on archaeological evidence, published in studies that are predominantly regionally based. A database of their major centres, encompassing the entire peninsula, permits the settlement histories of each region to be compared and contrasted so that broader developments may be identified and better understood.

Context of the Project in Relation to Earlier Work
The best known of the early attempts to catalogue all ancient towns in Italy is William Smith's Dictionary of Greek and Roman Geography (1854). More comprehensive still is the urban-focused second volume of Heinrich Nissen's Italische Landeskunde (1902). In addition to describing historical events and the archaeological remains relevant to each site, both of these great works drew on ancient written sources and contemporary scholarly opinions to discuss the possible location of ancient towns that were yet to be discovered. Since these works were published, a number of these towns have been securely located; the locations of those still missing continue to be debated. During the course of the project a list of 273 names was compiled of settlements referred to in ancient written sources, the locations of which remain either unknown or disputed. These are excluded from the project for two reasons: 1) a lack of archaeological evidence for their settlement histories and physical forms means that it is not possible to determine whether they would have fulfilled the criteria for inclusion in the study, explained below; 2) without certainty of location, no spatial analysis is possible. Roughly 80% of these unlocated sites are broadly situated in those areas of western-central and southern Italy where the higher concentrations of archaeologically known centres are found (see section 4.3).
New sites continue to be discovered and published, however, and the project includes 216 ancient sites that are usually referred to in published works by their modern toponyms because their ancient names are either unknown or disputed. For the purposes of the project, the nomenclature of the Barrington Atlas (Talbert and Bagnall 2000) was the basis for the primary identifier for each site in the database. This was amended on a site-by-site basis only if the Barrington's nomenclature proved to be controversial on the basis of further research. Sites not included in the Atlas were assigned the identifiers most commonly used in the archaeological literature.
As a result, the project has identified 273 ancient toponyms with no known locations and 216 sites with no known name. The fact that there are more ancient toponyms without locations than there are anonymous sites is just one of many indications that numerous major settlements are archaeologically invisible. For example, the literature documents many necropolises with no known associated settlement site. For this project, however, only five of these necropolis sites have been included in the database on the basis that their significant size indicates that they served substantial, but undocumented, urban centres.
Perhaps due to the densely populated character of ancient Italy, plus the high and ever-increasing number of sites found through rural survey, more recent studies of Italy's ancient settlement forms are mostly limited to specific cultural groups, regions or individual sites (e.g. Bourdin and D'Ercole 2014; Quilici and Quilici Gigli 2002; Spanu 2004). Those seminal works dealing with urbanism more generally (e.g. Sommella 1988; Gros and Torelli 2007) focus mainly on the historical development and character of Roman urbanism, illustrated through the presentation of select individual sites. Although Rome's contribution to the pattern of major settlements on the peninsula was not as significant as might be imagined, peninsular colonies founded by Rome still tend to dominate scholarly enquiry (e.g. Lackner 2008; Sewell 2010). This is due in part to the essential roles colonies played in Rome's consolidation of power across the peninsula, and the privileged status colonial towns maintained in the peninsula's urban network well into the imperial period (see the contributions in Dialoghi di Archeologia ser.3, 10.1 (1992)). The bias towards colonies also reflects their treatment in the ancient written sources, which put particular emphasis on the foundation of colonies during the Roman Republican period.
Although the study described here has a mapping component, it has a different purpose to that of the Italy section in the Barrington Atlas and its regularly updated online version, Pleiades. Although an extremely useful resource for locating sites, the georeferenced information provided by Pleiades relates to only three of the 91 data categories used in the current project's database. The specific research questions of the project required much more detailed and specifically structured data, and hence the database is distinct from more generic encyclopaedia and atlas listings.
Italian and international scholarship on the settlement archaeology of the peninsula has been prolific over the last 30 years. Many new sites have been identified and bright new light has been shed on previously known sites. The project consulted more than 1,400 bibliographical references, of which more than half date to the 21st century, and more than 80% to 1990 or later. Thus the project responds to the opportunities and challenges presented by the availability of large amounts of new primary material now in need of synthesis, as well as exploiting the technological solutions for efficiently and effectively dealing with such large quantities of data.

Specific Research Questions
Four research questions determined which data categories were to be incorporated into the database and how they should be structured. The database is designed to reveal patterns related to the following research questions.
1. Are there chronological correlations between the Roman conquest of Italy, subsequent historical events and changes in the pattern of urban and proto-urban centres of the peninsula?
2. Are supra-regional processes (e.g. abandonment, fortification, expansion) apparent that do not seem to be related to events reported in surviving ancient textual sources?
3. Can any geographical or morphological patterns be determined in the urban character of the municipia founded from the 1st century BCE onwards?
4. Can any geographical or morphological patterns be determined in the urban character of the veteran colonies founded from the 1st century BCE onwards?
These questions require qualification. Question 1 responds to a desire to understand whether the consequences of historical processes reported in ancient texts are reflected in changes in settlements and settlement patterns. The Roman conquest of Italy is chief among these, but other major historical events that occurred after the conquest are also thought to have impacted the network of major centres, such as the Pyrrhic War, the Hannibalic War, the Social War and the Civil Wars (see the linked summary, 'A summary of the history and archaeology of urbanism in Italy: 350-30 BCE'). Question 2 aims to discern historical processes that may be identified archaeologically, the chronologies of which cannot be tied in to any specific historical event.
Together, Questions 1 and 2 allow for any potential pattern that might be generated by the database, regardless of specific cause, to be isolated and assessed. Although broadly formulated, the questions reflect the wish to contextualise the results of these analyses with historical information, while acknowledging that more was happening in ancient Italy than is reported in ancient texts.
The types of processes the database can elucidate are best demonstrated through some examples. As noted in the introduction, there was a proliferation of fortified centres in the period 350-300 BCE. Although this increase in settlement activity occurred during the period of the Roman conquest, after a critical evaluation of the evidence there is little to suggest that it occurred because of the conquest (Sewell forthcoming). As a result, although it is a pattern that appears to respond to Question 1, with further analysis it seems more likely to be a pattern that addresses Question 2. This demonstrates that care is required in the interpretation of patterns generated by the database. The chronological correlation of archaeologically attested change and historical events should not automatically imply causation.
An example of a pattern of the post-conquest period, and one that contrasts with the preceding period, is that hardly any new fortified centres greater than 2ha in size were founded by non-Roman cultural groups that did not subsequently go on to become Roman towns (Sewell in prep a). Another pronounced pattern is the reduction in the average size of peninsular settlements between the later 4th and 1st centuries BCE (Sewell in prep a), brought about through two distinct processes. Firstly, many of the older and particularly large pre-conquest centres underwent dramatic reductions in their urban areas, often by way of nucleation, most notably in Etruria, Magna Graecia, Messapia and Daunia. Secondly, many of the new centres founded after the conquest that became Roman towns were relatively small. As a result, by the Augustan period, on a scale of 10ha increments, more towns fell into the 10-20ha category than any other. This can be put into perspective for readers who have visited Pompeii, which comprises 65ha.
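The grouping of settlements into 10ha increments can be illustrated with a minimal sketch. It is not the project's own procedure: the function name, the treatment of boundary values (a site of exactly 10ha is assumed to fall in the 10-20ha class) and all sample areas other than Pompeii's 65ha are hypothetical.

```python
from collections import Counter

def size_bin(area_ha, width=10):
    """Return the (lower, upper) bounds in hectares of the class containing area_ha.

    Assumption: lower bounds are inclusive, so 10ha falls in the 10-20ha class.
    """
    lower = int(area_ha // width) * width
    return (lower, lower + width)

# Hypothetical urban areas in hectares; only Pompeii's 65ha is taken from the text.
sample_areas = {
    "Pompeii": 65,
    "site_a": 12,
    "site_b": 18,
    "site_c": 4,
    "site_d": 33,
}

# Count how many sites fall into each 10ha class and find the most common class.
counts = Counter(size_bin(area) for area in sample_areas.values())
modal_bin = counts.most_common(1)[0][0]

print(size_bin(sample_areas["Pompeii"]))  # class containing Pompeii
print(modal_bin)                          # most populated class in the sample
```

In this invented sample the most populated class is 10-20ha, while Pompeii sits in the 60-70ha class, which mirrors the comparison drawn in the text.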
Another post-conquest phenomenon was the increasing preference for level terrain for the location of new urban centres (Sewell in prep a), especially apparent from the 2nd century BCE onwards, and in stark contrast to a pre-conquest preference for hilltop locations. Although many of the new settlements of the last two centuries BCE were founded in the less hilly northern areas of the peninsula (i.e. there was little option but to locate towns on level terrain), the general new preference for low-lying sites is very apparent in the mountainous Apennine chain. Here possible hilltop locations were rejected in favour of those on level ground. Although archaeologists are aware that this process occurred during the Late Republic (e.g. Cancellieri 1997), the database facilitates an accurate temporal and geographical assessment of its origins. In turn, such observations should prove useful to scholars interested in how the development of urban economies may have related to the growing network of Roman roads (viae publicae), water management and technology (aqueducts), and changing agricultural practices.
Questions 3 and 4 also form an interrelated pairing. The municipia that emerged in the 1st century BCE were communities with pre-conquest origins; analysis of these late republican sites may shed more light on Italy's pre-Roman peoples. Although the municipia were, by definition, communities of Roman citizens, this institutional status did not equate to a uniformity of settlement form. The majority have evidence of canonical Roman civic architecture, but there is otherwise a startling diversity in the physical character of these centres. For example, the 5th-century BCE Greek colony of Neapolis (Naples) became a municipium, and yet there are other municipia which seem never to have developed a fully urban form, such as Angulum or Genusia. A comprehensive study of the municipia created in the 1st century BCE has already been undertaken from an institutional standpoint (Bispham 2007a; cf. Humbert 1978; Laffi 2007), which also explores the consequences of municipalisation for the communities concerned. Yet no peninsula-wide archaeological study of the municipia has ever been undertaken, and the database provides information to fill this lacuna (Sewell in prep b). By establishing and evaluating the archaeologically attested diversity of municipia, a clearer definition of what a municipium was may be achieved.
Similarly, in relation to Question 4, there are institutional and historical studies of veteran colonies (Brunt 1988, 240-80; Keppie 1983), but no peninsula-wide archaeological study; the database contributes towards such an analysis (Sewell in prep c). Like municipia, veteran colonies were self-governing communities of Roman citizens. They differed in as much as municipia were existing communities that were granted citizenship, whereas the colonies were created when contingents of retiring legionary soldiers were given land in the territories of existing settlements, the vast majority of which had pre-Roman origins. The military background of these settlers suggests that the colonies might have had a different type of relationship to Rome than did the citizens of municipia. These socio-political differences might have led to divergences in the physical developments of municipia and veteran colonies. A preliminary observation may be made in this respect: more of the veteran colonies went on to assume a clearly urban form than did municipia. Care is required, however, since dozens of municipia subsequently became veteran colonies, and enormous investment in monumental architecture and infrastructure is apparent in both municipia and colonies, especially during the Augustan and Julio-Claudian periods.
As this discussion of the research questions indicates, the project and its database are focused on the identification of spatial, chronological and morphological patterns across the network of (proto-)urban centres in peninsular Italy. The aim is to look beyond individual sites, and local and regional patterns, to examine wider trends both in relation to known historical events and in terms of broader settlement processes that may relate to issues of economy, demography or social organisation.

Geographical and temporal parameters of the study
The study area comprises peninsular Italy south of the River Po, excluding all islands. The Po was chosen for its convenience as a physical boundary rather than for any historical significance: it defines an area to the south that could be comprehensively studied within the duration of the project. This choice means that more than 40 Roman towns between the Po and the Alps are omitted from the study. The start date of 350 BCE was chosen because the Roman conquest entered a new intensive phase during the second half of the 4th century BCE. This signifies, however, that all settlements abandoned prior to 350 BCE are excluded from the study, unless they were reoccupied after 350 BCE. Otherwise, all known dates of abandonment that pre-date the beginning of the medieval period (nominally 500 CE) have been documented for each settlement, but the dynamic processes affecting urban centres in late antiquity (after 300 CE) have been omitted, such as physical contraction and fortification building. As reflected in the research questions, the project is primarily concerned with the periods of the Roman Republic and the early Roman Empire. The dynamism of Roman urbanism slowed after the early imperial period. Only two new centres are recorded in the database as having been founded after the Augustan period, and by the 3rd century CE, evidence for major urban construction projects is greatly reduced. The database incorporates material relevant to the period 100-300 CE precisely to document this diminution of activity and hence to contextualise and define the floruit of urbanisation during the late Republic and early Empire.
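The temporal rule for inclusion described above can be expressed as a simple test on a site's occupation phases. The following is a minimal sketch only: the function name, the representation of phases as pairs of years and the use of negative numbers for BCE dates are assumptions made for illustration, and they do not describe the structure of the project database. A site abandoned in exactly 350 BCE is assumed here to fall within the study.

```python
STUDY_START = -350  # 350 BCE, written as a negative year (assumed convention)
STUDY_END = 300     # 300 CE

def in_study_period(phases):
    """Return True if any occupation phase overlaps 350 BCE to 300 CE.

    phases: list of (start_year, end_year) pairs; end_year is None when
    no abandonment date is known for that phase.
    """
    for start, end in phases:
        phase_end = float("inf") if end is None else end
        if start <= STUDY_END and phase_end >= STUDY_START:
            return True
    return False

# Hypothetical settlement histories, not taken from the database.
abandoned_early = [(-700, -400)]             # abandoned before 350 BCE
reoccupied = [(-700, -400), (-300, 100)]     # abandoned, then reoccupied
continuous = [(-500, None)]                  # no recorded abandonment

print(in_study_period(abandoned_early))
print(in_study_period(reoccupied))
print(in_study_period(continuous))
```

Treating each occupation phase separately is what allows a site abandoned before 350 BCE to re-enter the study if it was later reoccupied.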

Defining the object of study
The study is concerned with 'urban' and 'proto-urban' centres. The adjective, 'urban', is problematic, however, owing to difficulties with definition (cf. Marcus and Sabloff 2008). Like the nouns 'city' and 'town', archaeologists use the adjective 'urban' to describe a broad spectrum of large nucleated settlements. Cities of various kinds have been a feature of the Old and New Worlds for millennia. With the growing appreciation among archaeologists of just how diverse such sites could be, in terms of their formation processes, size, structure and function, it has become increasingly difficult to find a meaningful universal term. 'Proto-urban' is often used in the context of Italian archaeology to refer to a fortified settlement with little archaeological indication of dense or highly structured habitation within its walls. Because many of these centres nevertheless demonstrate marked development over time, the term 'proto-urban' has come under fire for not sufficiently recognising settlement evolution (Gualtieri 2004, 40, n.45).
To avoid misunderstanding, a term was devised specifically for the project: 'higher-order settlement'. This is defined as a settlement belonging to a community that exercised or might have exercised control over a territorial unit. This definition has the benefit of being equally applicable to urban and proto-urban settlements, but it also presents a challenge: how can one determine whether or not a settlement once controlled a territory? For the Roman period, texts and inscriptions document the legal status of hundreds of individual sites; these statuses indicate that some of these settlements controlled territories (e.g. coloniae and municipia) and others not (e.g. vici and pagi). Hence all archaeologically known colonies founded by Rome are included in the database, regardless of size, because these settlements had legal control over their territories. For the same reason all the municipia (self-governing communities of Roman citizens) which have been located with confidence are included.
Conversely, sites that Roman texts indicate to have been subordinate population centres, regardless of size, have been excluded. These include road-stations (mansiones, stationes, and settlement names with the prefix 'Ad...') and other minor settlements (vici, pagi). The only exceptions to this are related to secondary research questions. All vici (village-like settlements) found to possess urban characteristics (e.g. fortifications, regularised streets, civic architecture) have been included with the aim of establishing whether there was a Cispadane (south of the Po river) precedent for the phenomenon of the urbanised vicus, or 'the small town', known in northern Italy, Gaul, Germany and Britain. Some municipia and coloniae developed from vici and others became vici after periods of urban decline, and this is also recorded in the database. All the archaeologically known fora (i.e. the settlement type, not to be confused with the monumental public space found in most Roman towns) were included with the aim of establishing whether there are any commonalities in their urban forms, even though some were dispersed settlements of indeterminate size.
If ancient textual documentation supports the confident categorisation of particular Roman settlements as higher-order settlements, the absence of this information for earlier periods means that it is impossible to tell how many of the pre-Roman sites administered their own territories. Occasionally, the results of landscape archaeology can provide a good indication if, for example, patterns of rural habitation are found to have gravitated around large nucleated centres. It was thus necessary to determine an archaeological definition of a higher-order settlement for sites that are not described in surviving ancient texts. Size has been chosen as the main attribute upon which to base this judgement, primarily because it allows consistency. All fortified centres with a perimeter enclosing an area of 2ha or more are included in the study, although the very high number of such settlements means it is possible that a few sites have been missed. This particular surface area size has been selected for two reasons. Firstly, it permits the inclusion of all colonies founded by Rome. Most of the urban centres of Rome's peninsular colonies were fortified, and several were under 3ha. Since they possessed territories, they qualify as higher-order settlements. Circeii was an early Latin colony and its walled area actually enclosed 1.5ha (Quilici and Quilici Gigli 2005, 123-46), making it the smallest site and the only exception to the 2ha threshold in the database. The fact that fortified Roman centres of c. 2ha could administer territories naturally has no bearing on whether the same applies to fortified non-Roman settlements of this size. The city-states of Latium, Etruria and Magna Graecia were higher-order settlements by definition, but there are hundreds of settlements for which a similar status is difficult to prove. 
Since much more is known about Roman settlement hierarchies than those of ancient Italy's other peoples, it is important to resist the temptation to imagine that Roman-style systems existed elsewhere on the peninsula prior to conquest (Stek 2009, 107-20). In some cases, it might be overly simplistic to envisage that settlements either did or did not have control of territories. It has been suggested, for example, that some very small fortified hilltop settlements in Etruria controlled territorial units (Becker 2002, 90-92), yet it is possible that they themselves were subordinate to much larger Etruscan cities. Unfortunately, archaeology provides virtually no direct insight into the potential complexity of pre-Roman settlement hierarchies. Because it is not possible to establish with certainty whether many sites had higher-order status, the 2ha threshold was applied to non-Roman as well as Roman fortified sites for consistency, and with the belief that many higher-order settlements are likely to be among them.
The second reason for the 2ha threshold was to restrict the number of sites in the database. With the exception of Circeii, all other fortified sites <2ha were excluded because hundreds of them have been identified (see below), and their inclusion was not feasible within the limits of the project.
Settlement sites with monumental fortifications are particularly visible archaeologically, and thus have drawn the attention of field archaeologists. Of the 583 sites in the database, 451 had or are suspected to have had fortifications. To a degree, this preponderance of fortified sites reflects the reality that they are easy to identify and represent well-defined units for archaeological investigation. As well as protecting those within them, stone fortifications would have impressed both inhabitants and visitors, and extant examples still elicit this response (Sewell forthcoming). In published works, scholars often use the date of a settlement's fortifications to ascribe urban or proto-urban status to it. While it is reasonable to assume that stone defences were a characteristic of important centres, some subordinate settlements were also fortified. For example, 14 published sites in southern Italy are described as Greek forts or towers, dated variously from the Archaic to the Hellenistic period. None of them was greater than 0.25ha in size. Fortification walls and higher-order status were thus not synonymous. This is also confirmed by the existence of large settlements with no fortifications during and following the Roman conquest, many of which were likely to have been higher-order settlements. For example, for the 4th century BCE there are numerous references in Livy and other Roman authors to the names of settlements for which the earliest archaeological evidence dates to a period sometime after the date they are first mentioned in the texts, including sites such as Privernum, Plestia, Fundi and Ricina. How do we explain this? Although the reports might be erroneous, another possibility is that the early settlement forms assumed by these ancient communities are hard to recognise archaeologically: specifically, they possessed no city wall.
Some examples of this situation are known, such as the site assigned the modern toponym of Cupola, a sprawling dispersed Daunian settlement in the south-east of the peninsula, the physical limits of which are unclear because it has no known fortifications (Rocco 2004). The settlement appears to have been the site of pre-Roman Sipontum. Meanwhile, nearby Roman Sipontum has produced no settlement archaeology pre-dating 194 BCE, its reported date of foundation as a Roman colony (Livy 34.45). This also seems to be the moment when adjacent Cupola was abandoned (Greco and Longo 2008, 455-58; Lackner 2008, 185-86). Cupola's settlement archaeology dates back to the start of the first millennium BCE, and it is postulated to have been the original Sipontum because Livy reports that in 335 BCE a city of this name in this general location was captured by Alexander the Molossian, the uncle of Alexander the Great (Livy 8.24). It was likely to have been a higher-order settlement because its capture was of sufficient significance to be recorded historically, and because the settlement that replaced it was a higher-order settlement. Although Cupola probably held sway over a territory, it is otherwise without characteristics that we associate with urbanism. Urban form and higher-order status were thus not synonymous.
The Samnites had similar settlements; Abellinum, Allifae, Caudium, Telesia, Aequum Tuticum and potentially Aeclanum were all relatively low-lying dispersed settlements of the pre-conquest period which went on to develop urban characteristics and several became Roman towns (De Benedittis 2013). There are still, however, major lacunae across the rest of the peninsula. For example, most peninsular municipia developed from pre-Roman communities (Ciancio 2002). The study has documented 12 municipia that developed into urban centres in the 1st century BCE in locations where archaeologists have found no trace of earlier settlement from which they might have been derived. Thus it is possible that at least some of these 12 municipia might have earlier dispersed forms of settlement yet to be discovered. There might have been many polycentric or dispersed higher-order settlements, some surviving long into the Roman period. Those which at some point constructed a city wall and developed into Roman towns are included in the project because their physical remains and stone inscriptions are more easily found. There is an indeterminable (possibly large) number of sites that did not go on to become municipia which remain to be discovered archaeologically, or which might be among the thousands of scatters of ceramic and tile listed in the published carte archeologiche, many of which are 2ha or above in size. The potential presence of higher-order settlements among these scatters poses a challenge for the 2ha threshold adopted for this project. This is because the vast majority of these scatters were likely to have been farms or other forms of rural settlements. Size alone is therefore not an indicator of a higher-order settlement. It was thus important to incorporate all of the non-fortified sites that may have been higher-order settlements while excluding those unlikely candidates.
Firstly, any site described as 'rural' or a 'village' in published works was deemed to be subordinate and was thus excluded, regardless of size, as were all unclassified sites discovered through surface survey (e.g. 'surface scatter'). Six criteria were devised for other types of settlement sites of 2ha and above around which no fortifications have yet been found; for each site, one or more of these criteria needed to be satisfied for it to be included in the study:
1. The site is known or hypothesised to be that of a municipium (including civitas sine suffragio and civitas optimo iure), colony, praefectura, conciliabulum, forum or an urbanised vicus.
2. The site is located in a dominating position over the local landscape (e.g. on a highly pronounced elevation).
3. The site possessed both substantial habitation and facilities for loading/unloading waterborne vessels.
4. The site possessed monumental public architecture and/or a regularised street-grid.
5. The site is interpreted in published works, without major dispute, to be that of an oppidum listed by Pliny the Elder in Naturalis Historia.
6. The site is interpreted in published works to have been a settlement that was replaced by another and the latter is known to have possessed higher-order status (e.g. Cupola and Salapia Vetus).
Based on these criteria, 132 unfortified sites of 2ha or more were included, 73 of which either became or were founded as Roman towns.
Ninety-one sites are included for which no estimate of surface area can be calculated, which includes both fortified and unfortified examples. Inclusion was determined by the same criteria listed above for unfortified sites. If, as a result, a site of unknown size only fulfilled Criterion 2 (dominating position), it was only included in the study if there were strong indications that it was 2ha or above in size. This was determined through an assessment of size extrapolated from published plans and text. In relation to published images, there are numerous plans of hilltop fortifications that have only partially survived, displaying the extant sections of perimeter wall, for example on just one or two sides of a hilltop (e.g. Caiazza 1986, Tav. 20). In such cases, the continued lines of the missing wall-sections were hypothetically reconstructed based on the trajectories of the surviving sections, the absolute height of the terrain on which they are located and the prolongation of topographical features that seem to have dictated their course. Naturally, such hypothetical reconstructions are highly interpretative. They are undertaken purely to determine whether or not a site was likely to have been 2ha or over and should therefore be included in the study. If the assessment was positive in this respect, the hypothetically reconstructed size was neither recorded nor input in the database because of the lack of interpretational certainty. For some sites, it was possible to determine whether a size of over 2ha was likely on the basis of an author's description of the site. As an example, the habitation of the Lucanian hilltop site, Montrone (Oppido Lucano), is simply described as being spread over an axis of 1km (De Gennaro 2005, 68). Because this 1km-axis need only be 20m wide for it to cover 2ha, the site is included though without any estimate of its size in the database. 
If there was insufficient published evidence to be able to evaluate whether a hilltop site was 2ha or above (and which also did not fulfil Criteria 1, 3-6), it was excluded from the study.
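The inclusion rules described above can be summarised, in a hedged way, as a short decision function. The following Python sketch is illustrative only and is not the project's own implementation: the class name, field names and the treatment of fortified sites of known size (assumed here to pass on size alone) are assumptions made for the example, and Figure 2 remains the authoritative visualisation of the decisional process. The helper at the end reproduces the arithmetic used for a site described only by the length of its habitation axis.

```python
from dataclasses import dataclass, field
from typing import Optional

THRESHOLD_HA = 2.0  # size threshold stated in the text
EXCLUDED_CLASSES = {"rural", "village", "surface scatter"}


@dataclass
class Site:
    """Hypothetical record; field names are illustrative, not the database's."""
    name: str
    size_ha: Optional[float]            # None when no size can be calculated
    fortified: bool
    published_class: str = ""           # label used in the publication, if any
    criteria_met: set = field(default_factory=set)  # subset of {1, ..., 6}
    likely_2ha_or_more: bool = False    # judgement from plans or descriptions


def include(site: Site) -> bool:
    """Return True if the site would be included under the sketched rules."""
    # Sites published as rural, village or unclassified scatter are excluded.
    if site.published_class in EXCLUDED_CLASSES:
        return False
    if site.size_ha is not None:
        if site.size_ha < THRESHOLD_HA:
            return False
        # Assumption: fortified sites of known size pass on size alone;
        # unfortified sites must satisfy at least one of the six criteria.
        return site.fortified or bool(site.criteria_met)
    # Unknown size: the six criteria apply whether or not the site is fortified.
    if not site.criteria_met:
        return False
    if site.criteria_met == {2}:
        # Only Criterion 2 met: strong indications of >= 2ha are required.
        return site.likely_2ha_or_more
    return True


def min_width_m(axis_length_m: float, area_ha: float = THRESHOLD_HA) -> float:
    """Width a strip of given length needs in order to cover area_ha."""
    return area_ha * 10_000 / axis_length_m
```

For a habitation axis of 1km, `min_width_m(1000)` returns 20.0, which matches the reasoning applied to Montrone: a strip only 20m wide along that axis would already cover 2ha.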
Literature was consulted in relation to 85 necropolises for which the sites of their associated settlements have not yet been determined. Only five of them are included in the database (Cafaggio, Casone, Corcolle, San Brancato and Santo Mola), based solely on specific statements by authors that they were likely to have served major settlements. The chronologies of these necropolises are assumed to reflect the chronologies of the unknown associated settlement sites. Since burial practices changed at various times and in different areas of the peninsula, the chronologies of these necropolises may not represent the full chronological range of their associated habitations.
Many of the settlement sites in the study have produced artefactual evidence, above all ceramic, that pre-dates their city-walls and other elements of the built environment, that is, the physical elements that contribute to the archaeological impression that they were higher-order settlements. This material only indicates that a site was occupied from a particular period onwards, but it does not provide any clear testimony regarding any potential higher-order status. Thus it is not always possible to establish whether a site was founded as a higher-order settlement or whether it attained this status over time. Where the evidence allows, the earliest and latest dates of occupation at each site are recorded in the database, but it should not be automatically assumed that settlements had primacy within their territories throughout their full history of occupation. All of the decisional processes described in this section are visualised in Figure 2.

The evidential basis
The archaeological literature reveals notable regional diversity in the numbers and forms of settlement sites, the types of evidence, their states of preservation and the degrees to which they have been studied. Most of the sites within the study area that fulfil the criteria for inclusion have suffered spoliation for reusable building materials during and after classical antiquity. Just over half of the sites are covered today with substantial areas of modern habitation (see Figure 3). In many cases, continual occupation and urban renewal over centuries have led to the disappearance of ancient structures, thus posing significant challenges for archaeological investigation. Of the greenfield sites, those on hilltops have been subject to processes of erosion over at least two millennia, and many low-lying sites have been disturbed by deep ploughing from the mid-20th century onwards. In terms of distribution, the numbers of greenfield sites are proportionally greater in the more mountainous areas of the peninsula. This indicates that hilltop sites are much more likely to be free of modern structures than low-lying ones. Thus the type and degree of preservation varies enormously as does the level of access for investigation. For data collection purposes, only published sources have been consulted without recourse to unpublished archival material. After establishing the names of relevant sites from published regional surveys, the most efficient strategy has been to then consult the most recent publications on each site. As well as containing the latest evidence, they often present and assess the more important of the previously published discussions. Older publications have been consulted if and when it was clear that they were still important, but it is likely that many older and superseded interpretations have not been documented. 
For the description of each settlement, archaeological evidence derived from stratigraphic excavation is prioritised, but not all of the sites have been excavated. For those that have not, sometimes the results of surface survey can be drawn upon. The existence and extent of some features, such as the course of fortifications and other standing remains, are often derived from topographical surveys. Occasionally, the presence of public buildings is conjectured from antiquarian reports or place-name evidence. Ancient texts, such as inscriptions and literary topographical descriptions, supplement data derived from modern studies and also provide information on the potential existence of monumental public architecture that is no longer extant.
Italy's strong tradition of regionalism has clearly shaped the nature and comparability of the evidence. In terms of archaeological research, there are disparities in the degree to which different regions have been studied. The Samnites have the highest number of attributed settlements, the majority of which are hilltop fortifications. Only ten of these 51 sites (≥2ha) have been subject to any systematic study, demonstrating our reliance on a few key examples to understand a very large group of sites. Relatively good data are available for Etruscan settlements owing to the long tradition of Etruscan studies, and the same can be said for the Lucanian centres, many of which have been well published. Aequian, Marsic and Vestinian fortified centres are numerous and many have been recognised over the last 30 years, but aside from some observations made about surface material and some structural remains, very little can be said with certainty about their chronologies. Many of the centres in Umbria that became Roman towns have been well studied, but there are numerous pre-conquest Umbrian sites smaller than 2ha, again about which very little is known. In Liguria, many of the pre-Roman sites have been published, but the majority are very small and are thus omitted from the database. Very few sites fulfilling the criteria for inclusion are found in the northern areas of the peninsula where Gallic tribes, the Senones and the Boii, established themselves from the early 4th century BCE onwards. No Picentine sites fitting the criteria for inclusion are known apart from those that went on to develop into Roman towns. The picture that emerges is one of diversity not only in regional settlement forms and numbers, but also in the degree to which the regions have been studied.
At the time of the Roman conquest, and for a considerable time after it, regional differences are reflected in distinct settlement forms associated with broadly defined geographical areas. Many of the areas that ancient writers ascribe to Italy's various pre-Roman peoples are associated with distinct material cultures, ancient languages and settlement forms. Yet the geographical correlation between historically attested peoples and their archaeological assemblages is not always easily discernible (Bradley 2000, 112-113; Stek 2013, 405). On occasion, it has proved impossible for archaeologists to confirm which particular cultural group dominated a site in its pre-Roman phases, such as at San Salvatore (Timmari) (Osanna et al. 2012), Frigento (Ebanista 2009), and Montescaglioso (Canosa 1993 Coarelli and La Regina 1993, 5-17). Even if such an exercise were possible, the boundaries should not be thought of as static features of the landscape (Bradley 2000, 112-113; Suano and Scopacasa 2013, 405). Ancient textual sources and archaeology testify to repeated shifts of territorial control as a result of conflict between various cultural groups. It is therefore inadvisable to define the regions associated with distinct peoples with precise territorial boundaries and, consequently, calculations of settlement densities within them cannot be meaningfully undertaken.
This issue also means that it is difficult to assign sites to the appropriate regional cultural group with any confidence. Each site is tagged with the cultural label that is most often proposed in the archaeological literature, although it is important to note that in most cases this is ultimately derived from Roman texts. If publications expressed uncertainty or conflicting opinions on the identity of the relevant cultural group, this is noted in the database. If no identity is provided by authors, in some cases it is assigned based on published maps showing ancient territorial boundaries (e.g. De Benedittis and Ricci 2007, 8, fig. 2), but only if the site is well within the hypothetical boundaries. These assignments are thus speculative: little is known about how pre-Roman peoples regarded themselves, and many sites may have accommodated populations of varied origins. A pragmatic approach has been taken towards accommodating ancient Italy's regional character in order to permit historically sensitive analysis. Specifically, it permits the grouping and comparison of sites on the basis of commonly used cultural categories, allowing broader patterns of similarity and difference to be identified and assessed.
Under Augustus, Italy was divided into 11 regions for some kind of administrative purpose, though the precise motive is not directly attested in the ancient sources (Nicolet 1991; Laffi 2007, 81-118; Bispham 2007b, 63-66). The boundaries of these regions can be reconstructed with relative confidence and are widely recognised and used by scholars. Each site in the database is assigned to the relevant Augustan region, but this grouping is only historically relevant for the second half of the period under study. All settlements founded after the Romans achieved predominance in the area in which they are located are labelled as 'Roman' in the database. This is not meant to be an indication that they were ethnically composed of Romans or possessed Roman citizenship (although some were and are identified as such by other means). Rather it is a chronological label so that they can be differentiated from the pre-conquest sites during analysis. All of these post-conquest sites have the identity of the pre-Roman people associated with the local area recorded as a separate data category.
Figure 4: Distribution of sites assigned to Italy's pre-Roman peoples in the database (that have >2 sites assigned to them).
There are significant variations in the quantities of sites attributed to each of Italy's pre-Roman peoples (Figure 4). This disparity becomes even more pronounced when the sites that are excluded from the study due to their small or indeterminate size are considered. Although 583 sites fulfil the study's criteria for inclusion, these are only a minority of the 1518 sites that were assessed. A running list of those sites failing to meet the criteria for inclusion ensured individual settlements were investigated only once. Because the list includes a brief description and a bibliographical reference for each of the 935 excluded sites, this has developed into a resource in its own right and is also deposited with the Archaeology Data Service. The list includes 486 fortified sites smaller than 2ha, or of indeterminate size, which either did exist or could have existed at some point during the period 350 BCE to 300 CE. It should be noted that many of them have been linked to specific pre-Roman cultures in publications without knowledge of the sites' periods of occupation. Thus for many of these sites there is no actual proof that they were occupied during or after the conquest period.
With the inclusion of these additional sites, a clearer picture emerges of overall site numbers. The assigned identities in Table 1 reflect the immediate pre-Roman situation in 350 BCE. Many of these reflect the identities of cultural groups who had conquered or assumed control over others' settlements shortly before the Roman conquest. For example, the consequences of Samnite expansion from the second half of the 5th century BCE onwards for Table 1 are that the Samnites are assigned 13 settlements that they did not themselves found (including Pompeii). In summary, the archaeological evidence for higher-order settlements varies in quality and abundance. The database is 'archaeological' in the sense that each primary record represents a physical site, documented by a series of data fields derived from multiple sources: archaeological (excavation, field survey, topographical survey); textual (ancient literary sources, epigraphy, placename evidence) and spatial (geographical coordinates). There are major regional differences in settlement forms and site numbers. Moreover, no two sites, or regions, were found to have been studied with the same intensity using identical investigative methods. The problem of uneven data quality has thus to be addressed, as is the case for any study that aims to understand historical processes affecting settlement patterns at any scale greater than the micro-regional. How can heterogeneous data be compared so that the results of the analysis can be used with greater confidence?
One solution is to tag the various data with an indicator of the degree of confidence which can be placed upon them. This was a fundamental element of the database's research design (see further below). This approach permits analyses to be undertaken on multiple levels according to the robustness of the data and the results are to be handled appropriately. All the patterns identified must be critically evaluated. This is necessary, not only because of the variable quality and quantity of data, but also because the database can only describe, not explain, patterns. Correlations, for example, must be carefully studied before they can be interpreted as causation. In sum, the database is intended to facilitate empirical documentation of settlement forms and trends that have hitherto lacked a robust basis for discussion; critical evaluation of the patterns is then required to understand their significance and how best to interpret them.

Defining settlement sizes
Owing to the importance of size as a criterion for inclusion in the database, sizes needed to be calculated as accurately as possible. All sizes are based on real or reconstructed perimeters of settlements (most often, city walls) rather than the known or hypothetical extent of the habitation within such perimeters. All published size estimates are documented and a further 123 sizes have been calculated from published plans. In total, 492 of the 583 sites in the study possess size data in which differing degrees of confidence can be expressed. Published works were found to contain many errors in relation to settlement sizes, often in the form of incompatibility between sizes listed numerically and sizes depicted in accompanying plans. As a consequence, published estimates were double-checked and corrected if the data were available. Strikingly, the study did not encounter a single example of a publication that explained how the size estimates contained therein were calculated. To address this situation, the current project used a freely downloadable software package, GeoGebra, from which it is possible to calculate the area of any shape on a plan accurately as long as the scale is known or represented. Plans with erroneous scales were calibrated with the measurement of equivalent areas in Google Earth. The complete perimeters of only a small number of settlements are known with absolute confidence owing to missing sections of fortification walls, modern conurbation, or the lack of walls altogether. It is reasonable to assume that most of the larger settlements had unbuilt areas within their perimeters, but they are only flagged as such in the database if explicit published statements to that effect are documented in the literature. The author is grateful for the bibliographical information related to town sizes provided by Luuk de Ligt for the early imperial period, in addition to those he has already published (De Ligt 2012, 289-336).
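Since the publications consulted do not explain how their size estimates were obtained, it may help to show one transparent way in which an area can be derived from a traced perimeter and a known scale. The sketch below uses the standard shoelace formula for a simple polygon; it is offered as an illustration of the general principle and is not a description of how GeoGebra computes areas internally. The function name, the vertex coordinates and the scale are all hypothetical.

```python
def polygon_area_ha(vertices, metres_per_unit):
    """Area in hectares of a simple (non-self-intersecting) polygon.

    vertices        -- list of (x, y) points traced on the plan, in plan units
    metres_per_unit -- real-world metres represented by one plan unit
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap round to close the perimeter
        twice_area += x1 * y2 - x2 * y1  # shoelace cross-product term
    area_plan_units = abs(twice_area) / 2.0
    area_m2 = area_plan_units * metres_per_unit ** 2
    return area_m2 / 10_000.0            # 1ha = 10,000 square metres


# Hypothetical perimeter: a 10 x 5 rectangle on a plan where 1 unit = 20m,
# i.e. 200m x 100m on the ground.
example_perimeter = [(0, 0), (10, 0), (10, 5), (0, 5)]
example_area = polygon_area_ha(example_perimeter, metres_per_unit=20)
```

In this hypothetical case the result is 2.0ha, exactly at the project's threshold. Because the scale enters as a squared factor, a modest error in a plan's scale bar has a disproportionate effect on the calculated area, which illustrates why checking published scales against an independent measurement matters.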

Database design
Establishing the structure of a database and its data-fields at the outset of a project is common practice, but this approach carries methodological problems. This is because one is always wiser after all the data are assembled. There is always the risk that once the process of collating data is underway, it becomes apparent that the database's structure should be amended in order to remove superfluous fields or incorporate new ones (see for example, Cougle 2008). As a result, it can become necessary to re-examine sources already consulted. For this project, particular challenges lay in the regional and chronological diversity of settlement forms and in the qualitative and quantitative variation of the archaeological evidence. At the bottom of the scale there are some higher-order settlements known only from names attested on coins and, at the top of the scale, Pompeii; in between there is vast and fascinating variety. The full spectrum was inevitably only going to become apparent as the research progressed.
In order to address this problem, a two-stage process was implemented. A draft database design was outlined, describing the fields and data types. With this structure in mind, data were initially entered into the text fields of a reference-management programme (Citavi) in the same abbreviated form intended for the analytical database. The Citavi text fields were also structured to mimic the database design: on a site-by-site basis, with the same stipulated data-categories. Any further information considered relevant to the research questions was also recorded in Citavi, initially as notes. This additional information was the key, because it often stimulated reconsideration of the final database design. Yet as the design existed only in draft form, it was easily amended. Through intermittent re-evaluation, the design developed alongside the process of data collation. The analytical database was populated from Citavi only after data collection was complete and the final database format had been determined. Another benefit of using reference management software in this way is that it facilitated the creation of a full bibliography, structured on a site-by-site basis, and also deposited with the Archaeology Data Service.

Metadata: data categories
Based on the completed design, the database contains the following geographical and archaeological information for each settlement (where available or applicable):
• Identifier: ancient and modern toponyms.
• Georeferenced location (exact latitude and longitude of centre of settlement; height above sea level).
• Degree of confidence in georeferenced location.
• Degree to which the settlement is covered by modern conurbation.
• If the settlement is believed to have replaced another or was replaced by another (e.g. Falerii Veteres/Falerii Novi), and how quickly this occurred.
• If the settlement is hypothesised to have been dependent upon another (e.g. port-town serving a larger settlement).
• If the settlement was polycentric and the chronology of the polycentrism.
• Earliest and latest dates of occupation determined by material evidence if available, or by textual evidence if not.
• If the settlement occupied the site of an earlier abandoned settlement; if so, then the chronology of the earlier settlement is noted.
• Archaeologically identified moments of destruction and periods of abandonment and their chronologies.
• If the settlement is considered to have been permanently, seasonally or occasionally occupied.
• Incomplete urbanisation: whether an author specifically states that unbuilt areas existed within the boundary of the settlement.
• Whether the boundary of the settlement is: apparent from the (near) complete survival of fortifications; only partly known; hypothetically reconstructed; unclear or unknown.
• Surface-area size in hectares. Expansions and reductions in size and their chronologies.
• Dispersed settlement: chronology of recognised periods during which settlement was non-urban.
• Post-abandonment: if there are indications of a low level (rural) occupation of the site after it is judged no longer to have functioned as a nucleated settlement.
• Topographical defence (elevated location/water course/combination/none).
• State of preservation of fortifications.
• Chronology of fortifications, including repair/reconstruction phases.
• Character and chronology of orthogonal street system, including replanning actions.
• Character and chronology of monumental architecture and public spaces from archaeological or epigraphic evidence (forum/agora; intramural monumental temple; architectural assembly place; theatre; amphitheatre, basilica, bath complex).
• Proximity to Roman roads in the Barrington Atlas (on; near; far).
• If the settlement possessed facilities for loading/unloading waterborne vessels ('approdo').
• Is the site in Pleiades at the moment accessed? Y/N
• Information derived from ancient textual sources and published interpretations thereof.
• The name(s) of the pre-Roman people(s) associated with the site in 350 BCE or later.
• The name(s) of pre-Roman peoples associated with the site prior to 350 BCE.
• Upper and lower dates of the period during which the people of the local area came under Roman dominion.
• Character and chronology of Roman-assigned legal status (colonia, civitas sine suffragio, civitas optimo iure, municipium, foedus, praefectura, conciliabulum, vicus) and changes to it over time, keeping distinct those supported directly by ancient textual evidence and those that are hypothesised in publications.
• Is it in the list of Italian towns provided by the Roman writer Pliny the Elder? Y/N
• Augustan region in which site is located (I to IX).
Data relevant to each of these categories were documented when encountered in publications and entered into the database. The database should not, however, be considered as a completely comprehensive record of information in relation to these categories for each site. For example, published statements that a settlement had unbuilt areas within its perimeter were documented in relation to 44 sites, but this does not necessarily mean that the other 539 were fully built up. The fragmentary nature of the archaeological and ancient textual record should always be borne in mind when analysing the patterns produced by the database: none of the sites it contains have been completely excavated.
Usually only small areas of archaeological sites are excavated in relation to their overall size, but the data they provide are still generally employed by authors to interpret the history of the entire settlement. As a result, some of the database's categories are more robust than others. Those that record the physical presence of archaeological features are likely to be the most robust, especially as the database documents particularly monumental, and thus more easily recognisable, elements such as fortifications and public buildings. Chronological information is more interpretative (see below). Doubt is often expressed by authors in relation to periods of occupation that produce little archaeological material: does this reflect lower intensity of activity (e.g. decreased population or lower production and consumption of material goods) or abandonment? If such uncertainty is expressed by an author, the database records continued occupation as a default. Only in cases when authors specifically state that a site was abandoned is this entered into the database as such (for the widespread downturn in occupation levels of settlements in many areas of the peninsula during the later 5th and early 4th centuries BCE, see Sewell forthcoming).

Metadata: categories of confidence
The robustness of the entries in each data category reflects the variable quality and quantity of archaeological evidence available from each site. For the majority of interpretative categories of data, a separate indicator records the degree of confidence assigned to that interpretation. Assessing and assigning categories of data robusticity requires consistent reflexivity and, of course, is itself a process of interpretation. This is nonetheless essential so that, for example, the dating of city walls derived from construction techniques can be distinguished from that derived from stratigraphic sequences. Because the bases for determining the degree of confidence assigned to interpretations differ according to the data category, a detailed description of the process is provided for each category in the explanatory notes that accompany the database, available via the Archaeology Data Service.
Some sites have been the subject of scholarly research for centuries, and others have never been investigated systematically. Explicitly acknowledging these differences serves to increase confidence in the results and interpretation of the analysis. It is thus possible to conduct analyses that include or exclude data on the basis of stated confidence levels. If patterns are revealed using only data of the highest confidence level, it is possible to rerun the analysis including data of increasingly lower confidence levels in order to establish whether or not the pattern remains discernible.
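The rerunning of an analysis at successively lower confidence levels can be sketched as follows. This is a minimal illustration under stated assumptions rather than the project's actual workflow: the three labels 'high', 'medium' and 'low' are borrowed from the certainty ratings used for dating, the field names `size_ha` and `size_confidence` are hypothetical, and the mean is used only as a stand-in for whatever statistic or pattern is being examined. The sample records are invented.

```python
CONFIDENCE_ORDER = ["high", "medium", "low"]  # assumed ordering of labels


def tiered_summary(records, value_key, confidence_key):
    """Recompute a mean while admitting progressively less certain data.

    Returns {level: (number_of_values, mean_or_None)}, where each level
    includes all records rated at that level or higher.
    """
    results = {}
    admitted = set()
    for level in CONFIDENCE_ORDER:
        admitted.add(level)
        values = [
            r[value_key]
            for r in records
            if r[confidence_key] in admitted and r[value_key] is not None
        ]
        mean = sum(values) / len(values) if values else None
        results[level] = (len(values), mean)
    return results


# Invented sample records, for illustration only.
sample = [
    {"site": "A", "size_ha": 10.0, "size_confidence": "high"},
    {"site": "B", "size_ha": 20.0, "size_confidence": "high"},
    {"site": "C", "size_ha": 6.0, "size_confidence": "medium"},
    {"site": "D", "size_ha": 4.0, "size_confidence": "low"},
    {"site": "E", "size_ha": None, "size_confidence": "low"},  # no size data
]
summary = tiered_summary(sample, "size_ha", "size_confidence")
```

With these invented values the mean falls from 15.0 (two high-confidence sizes) to 12.0 and then 10.0 as less certain data are admitted. Comparing the tiers in this way shows whether a pattern seen in the most robust data persists, weakens or reverses when the remaining data are included.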
Conflicting published interpretations from multiple scholars are recorded in the database in relation to the chronologies of fortification construction, the sizes of settlements and the cultural groups assigned to them. For the analysis, the interpretation selected was either that which was most widely accepted among published works, or the one supported by new evidence not available to the authors of older publications.

Creating a chronological structure
Data relating to historical chronology require particularly careful assessment of confidence levels. In order to establish how settlement patterns changed over time it is necessary to record not only the duration of each site's occupation, but also how its physical characteristics changed over time. Authors are often justifiably imprecise when dating sites and their features as archaeological data frequently only permit the establishment of broad dating brackets. Conversely, inscriptions sometimes allow monuments to be dated very precisely. Because of the variety in the precision of dating interpretations for each of the chronological categories, they are entered into the database in the form of date ranges using calendar dates. Published dating interpretations are treated consistently so that, for example, 'late 4th/early 3rd century BCE', is always entered into the database as 325-275 BCE, and '4th/3rd century BCE' is always entered as 350-250 BCE. All similarly described overlaps between other centuries are treated identically. Thus published dating interpretations may have been made more precise than their authors intended, but this process is necessary for systematic data entry and analyses. Insufficient information is available to determine the dates of occupation at 25 sites in the database, meaning they are effectively undated. They are nevertheless included because of the expressed belief by scholars that they existed at some point during the period under study. Data are not yet available to establish the start of occupation for 11 sites in the database, and the abandonment for a further 10 sites.
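The consistent conversion of published dating phrases into calendar ranges can be illustrated with a short sketch. It reproduces only the two BCE examples given above and generalises them on stated assumptions: 'late' is read as the final quarter of a century, 'early' as the first quarter, and an unqualified century is entered at its midpoint. The handling of CE centuries mirrors the BCE rule and is an assumption, as are the function name and the decision to reject phrases that qualify only one of the two centuries. The full conventions are those set out in the explanatory notes accompanying the database.

```python
import re

# Matches, e.g., "late 4th/early 3rd century BCE" or "4th/3rd century BCE".
PHRASE = re.compile(
    r"^(?:(late)\s+)?(\d+)(?:st|nd|rd|th)/"
    r"(?:(early)\s+)?(\d+)(?:st|nd|rd|th)\s+century\s+(BCE|CE)$"
)


def overlap_to_range(phrase):
    """Convert a two-century overlap phrase to (start_year, end_year, era)."""
    match = PHRASE.match(phrase.strip())
    if match is None:
        raise ValueError(f"unsupported dating phrase: {phrase!r}")
    late, first, early, second, era = match.groups()
    if bool(late) != bool(early):
        # Mixed forms are not covered by the examples in the text.
        raise ValueError(f"unsupported dating phrase: {phrase!r}")
    first, second = int(first), int(second)
    qualified = bool(late)
    if era == "BCE":
        # The nth century BCE runs from n*100 down to (n-1)*100.
        start = (first - 1) * 100 + (25 if qualified else 50)
        end = second * 100 - (25 if qualified else 50)
    else:
        # Assumption: CE centuries are treated as the mirror image.
        start = (first - 1) * 100 + (75 if qualified else 50)
        end = (second - 1) * 100 + (25 if qualified else 50)
    return start, end, era
```

Under these assumptions `overlap_to_range("late 4th/early 3rd century BCE")` returns `(325, 275, "BCE")` and `overlap_to_range("4th/3rd century BCE")` returns `(350, 250, "BCE")`, in agreement with the two conversions quoted above.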
For the dated sites, chronological information is documented in relation to periods of abandonment, destruction events and identified moments of contraction and expansion in settlement size. All of the elements of the built environment chosen as data categories, such as city walls and monumental architecture, also have construction chronologies linked to them in the database. In total, 30 data categories are for historical chronology, although none of the sites have entries against all of these.
Dating in archaeology is a notoriously interpretational process, even when chronologies are derived from scientific dating techniques. A broad spectrum of dating methods has been encountered during this research, including artefacts recovered from stratified deposits, construction techniques, comparanda from other sites, and the interpretations of ancient written sources. Confidence in each chronological interpretation is expressed in the database as levels of certainty that differ slightly according to what characteristic of the settlement is being dated. Because all of the chronologies are collated from published literature, a reflexive approach is required to assess confidence; a sometimes complex interplay of interpretational elements informed the assessment: the degree of confidence expressed by an author; the quality of the scholarly argument; and the type(s), quantity and quality of presented evidence. Generally, dating interpretations derived from systematic excavation are flagged with the greatest level of confidence (a 'high' certainty rating), although such dates are, of course, still not guaranteed to be accurate owing, for example, to issues such as ceramic chronologies and residuality. Most other methods by which dating interpretations are derived are flagged with 'medium' certainty. A 'low' certainty rating is given if support for a dating interpretation is particularly weak, including cases where no two authors agree, or if extreme uncertainty is expressed by authors. Disagreements on dating among authors are commonplace and alternative dating interpretations are documented in the database. In such cases, the choice of the dates used for the analysis is based on the approach described in the previous section on data confidence. A 'low' certainty rating is also assigned to all incomplete date ranges, for example, when the date of a settlement's abandonment is known but its foundation date is not, or vice versa. In such cases, the missing date is entered as 'undated'. A final category of 'unconfirmed' is assigned in cases where no supporting argument or evidence accompanied the published dating interpretation.
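The broad rules for the four certainty ratings can be summarised in a short sketch. This is a deliberate simplification: the actual assessment was an interpretative judgement that cannot be reduced to a few flags. The class and field names are hypothetical, and the order in which the rules are checked is an assumption, since the text does not state which rule takes precedence when several apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatingInterpretation:
    """One published dating interpretation (hypothetical structure)."""
    start: Optional[int]              # None stands for 'undated'
    end: Optional[int]                # None stands for 'undated'
    has_supporting_evidence: bool     # any argument or evidence published?
    from_systematic_excavation: bool = False
    weak_support: bool = False        # e.g. no two authors agree


def certainty(d):
    """Return 'high', 'medium', 'low' or 'unconfirmed'.

    Assumed precedence: unconfirmed > low > high > medium.
    """
    if not d.has_supporting_evidence:
        return "unconfirmed"
    # Incomplete date ranges are always rated 'low'.
    if d.start is None or d.end is None:
        return "low"
    if d.weak_support:
        return "low"
    if d.from_systematic_excavation:
        return "high"
    # Most other dating methods.
    return "medium"


excavated = DatingInterpretation(-325, -275, True, from_systematic_excavation=True)
open_ended = DatingInterpretation(None, 100, True, from_systematic_excavation=True)
print(certainty(excavated))   # high
print(certainty(open_ended))  # low
```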
For sites that have not been subject to intensive open area excavation, the dating evidence can sometimes be uneven. This means that between the known dates of foundation and abandonment, the available dating evidence does not cover all intervening periods. All such sites are flagged with 'medium' certainty because of the risk that the lacunae might represent one or more periods during which the settlement was abandoned.
Figure 5: Instances of fortification construction/reconstruction on the Italian peninsula: 8th century BCE to 3rd century CE, with a breakdown of the certainty of dating interpretations. This graph has been generated using a method devised by Dan Lawrence (see Lawrence et al. 2012; Wilkinson et al. 2012). (The fortifications of settlements abandoned prior to 350 BCE are not included in this graph, but the older fortifications of multiple settlements still occupied in 350 BCE are included.)
The advantage of including all forms of published dating interpretations, even those without supporting arguments, is that it permits a peninsula-wide survey and regional comparisons of how settlements are currently dated. The resultant patterns indicate that confidence in dating interpretations is not evenly distributed over time (Figure 5). Variability, however, is closely connected with chronological resolution; over periods of multiple centuries, patterns among the confidence categories do not differ significantly, but over periods of decades, much more notable differences are apparent. The full results and implications are presented in forthcoming publications relevant to the specific historical periods.

Database format
All of the data are contained in a spreadsheet, or 'flat file database'. A relational database would have allowed for greater efficiency in storage and the avoidance of data redundancy, but this solution is less well suited to the project's open access objectives. Beyond issues of interoperability, relational databases require greater investment of time to create (which must be balanced against the gains achieved) and can be harder for other users to rebuild and query. Further, the spreadsheet can be directly uploaded into GIS software as an attribute table without the need for complex querying before or after import, although for some GIS software it may need to be saved first as a comma-delimited .CSV file.
The spreadsheet has been structured in such a way that it can be queried to produce answers to the project's broad range of research questions: each row reflects one site, and each column, a category of data or degree of confidence in those data. All categories of information can be cross-referenced and quantified, singly or in groups. It is possible to conduct analyses on every geographical level between individual site and the whole peninsula. It has an absolute chronological structure which allows for settlement patterns of distinct periods to be isolated and compared. Further, through integration with basic geographical data, it is possible to use GIS to visualise and analyse spatial patterns and to present and disseminate the results. In particular, it has been possible to establish geographical connections between contemporary phenomena, stimulating further analyses.
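As an illustration of how a flat file of this kind can be queried, the following minimal sketch isolates the sites occupied during a chosen period and counts them by region. The column names and the four sample rows are invented for the example and do not reproduce the project's actual field names or data; BCE years are again assumed to be stored as negative integers.

```python
import csv
import io

# Invented sample in the one-row-per-site layout; blank cells are 'undated'.
SAMPLE = """site,region,occ_start,occ_end,occ_certainty
Site A,Region 1,-600,-200,high
Site B,Region 1,-340,250,medium
Site C,Region 2,,,unconfirmed
Site D,Region 2,-100,300,medium
"""


def load_sites(text):
    """Parse comma-delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def occupied_during(rows, start, end):
    """Return rows whose occupation range overlaps the window [start, end]."""
    selected = []
    for row in rows:
        # Sites with a missing start or end date cannot be placed in time.
        if not row["occ_start"] or not row["occ_end"]:
            continue
        if int(row["occ_start"]) <= end and int(row["occ_end"]) >= start:
            selected.append(row)
    return selected


def count_by(rows, column):
    """Count rows per distinct value of the given column."""
    counts = {}
    for row in rows:
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


sites = load_sites(SAMPLE)
late_4th = occupied_during(sites, -325, -300)
print(count_by(late_4th, "region"))  # prints {'Region 1': 2}
```

Because every category sits in its own column, the same pattern extends to any combination of attributes, and the filtered rows can be written back out as a comma-delimited file for use as a GIS attribute table.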

Concluding Remarks
The preceding discussion has considered the key methodological problems encountered and the solutions implemented to facilitate the project's wider research aims. Although some of these are specific to the particular dataset at hand, many are generic and therefore likely to be of broader resonance. Explicit discussion of such methodological issues is perhaps not as common as it should be. Such exercises are valuable not only because others can avoid 'reinventing the wheel', but also because these methodological decisions fundamentally shape the resulting data. Too few projects make clear the assumptions and judgements that underlie their digital resources. By way of conclusion, a brief overview of some of the project's results provides insight into the potential of the database.
The later 4th century BCE is well known as a particularly vigorous period for higher-order settlements in central-western and southern Italy. Published explanations of the causes of this activity currently focus on individual regions. The database allows these regional trajectories to be identified and compared so that the full scale of the dramatic rise in settlement numbers in the late 4th century BCE can now be presented, quantitatively and geographically (Sewell forthcoming). Because this sharp increase in settlement activity is apparent in multiple contiguous regions simultaneously, it seems likely that supra-regional processes were at work. The increase in the number of higher-order settlements seems to be related to another well-known phenomenon of the same period: a sudden and dramatic rise in the number of rural sites (Terrenato 2001a, 2-3; 2001b, 63; Attema et al. 2010, 147-70). GIS analysis has revealed that the areas that demonstrate investment in major settlements correspond with those where increased rural infill is most strongly attested, suggesting a link between these processes, possibly related to the consequences of innovations in land exploitation. This period also corresponds with that of the Roman conquest and, for several regions, heightened settlement activity in major centres is regarded by some to be a response to the Roman threat. Yet not all of the areas where major settlement growth is attested were directly affected by the Roman conquest at this time. This indicates that the apparently favourable conditions for the creation and augmentation of settlements during this period were not directly or even indirectly caused by Roman activity. Instead, it raises the intriguing possibility that the particular success of Roman territorial expansion during this period might have been at least partly due to the Romans having been able to exploit these propitious conditions for settlement growth (Terrenato 2001a, 3).
In the post-conquest period the overall number of major settlements decreased, partly because, in stark contrast to the previous period, virtually no new non-Roman fortified hilltop centres were founded. Subsequently, the sharpest decreases in settlement numbers correspond to moments immediately after the dates of historically attested major wars with foreign invaders. Yet this does not seem to have affected the continued, but regionally diverse, expansion of rural settlement. Over time, very large centres generally reduced in size, a trend especially apparent among the Greek colonies of the south. Although there were multiple exceptions, relatively small urban centres (10-20 ha) appear to have become particularly numerous.
The events of the 1st century BCE had a particularly dramatic impact on the physical development of major settlements. Many existing centres became notably urbanised after Roman citizenship was conferred on peninsular communities, with regularised town planning and embellishment with public buildings, described below. Hundreds of municipia emerged. The civil wars and the consequential settling of veteran soldiers in new peninsular colonies seem to have been a catalyst for urbanisation, as were the conditions resulting from the new political regime under the early emperors. Specific types of monumental architecture became widespread. As the database has recorded the presence and general construction dates of theatres, amphitheatres, basilicas, temples, bath complexes and monumental fora, this process can now be quantified and visualised with GIS. Although the partial standardisation of public architecture lends the impression that Roman towns underwent a degree of homogenisation, there is actually a great deal of diversity in the physical manifestations of urban forms. Many towns had pre-Roman origins, and the regional diversity of earlier developmental periods clearly left its mark on later Roman urbanism. As well as the more familiar centres with rectilinear street systems and fortifications, polycentric and even non-urban municipia are also attested. The database has provided an opportunity to undertake the first archaeological analysis of all the archaeologically known municipia of peninsular Italy with the aim of capturing the full spectrum of this settlement form.
Veteran colonies of the period were, with perhaps one exception, founded at existing settlements. The corresponding archaeological sites are all very recognisably urban, and further analysis aims to reveal the degree to which this resulted from the agency of the original colonists and their descendants. These results and others are discussed in much greater depth in forthcoming publications (Sewell forthcoming, in prep a, b, c).