The use of prehistoric ‘big data’ for mapping early human cultural networks

ABSTRACT The archaeological record is one piece of the puzzle in understanding the evolution of humans, helping to trace the cultural connections between different species and their technologies, as well as their expansion in time and space. Here we demonstrate a method for mapping the boundaries, centers, and peripheries of ancient cultures, as well as the technological similarities between different cultures. The proposed workflow includes: a systematic collection of archaeological information in a database; a process to infer the similarities between assemblages and generate a network; and finally, a graphical method for big data visualization, a technique also used in social media analysis. We present the geography of multiple cultural complexes that span several stages of cultural evolution from the Lower to the Upper Paleolithic (Stone Age) and involve several species of the genus Homo. Finally, we discuss some alternative trajectories in which this workflow can be developed further.


Introduction
The search for the origins and the spatial and cognitive expansion of our own species is one of the drivers of paleo-research. Over the course of history scholars have tried to reinterpret the self-image of humans drawn by science and their relationship with nature. Current research addresses several questions such as: (1) whether ancient human populations emerged from a single or multiple populations and regions (Scerri et al., 2018); (2) the number of waves, routes and timing of geographical expansions within and out of Africa (Bergström et al., 2021); (3) the role of connectedness between ancient populations (Miller & Wang, 2022); and (4) what role the environment played (Beyer et al., 2021). To answer these questions, classical sciences such as paleoanthropology, archaeology and paleoenvironmental studies work together in an interdisciplinary way with newer disciplines such as molecular science, computational modelling and paleogenetics. Particularly in recent years, the latter fields have shaped the debate on a global scale with regard to the geography and timing of the dispersal of early humans, their genetic differentiation, and the effects of admixture. In addition to scientific expertise and the constant improvement in technical sequencing methods, the existence of structured research databases, powerful computational analysis methods, and an attitude towards sharing data also play a significant role in this development (Anagnostou et al., 2015). However, the genetic population history is only one piece of the puzzle and needs to be complemented with archaeological data giving insights into early cultures and aspects of ancient lifeways. Furthermore, the archaeological record has advantages in terms of quantity and resolution, which hold great potential, provided they are available in a suitable structure. 
Stone tools and debitage are especially durable and ancient humans produced these in large amounts, as compared to paleontological and genetic materials which are more strongly affected by taphonomic processes. Consequently, the sheer quantity of the archaeological record tends to be much greater, especially in regions with unfavorable preservation conditions, even though newer methods such as environmental DNA have the potential to increase the extent of the paleogenetic record (Massilani et al., 2022;Vernot et al., 2021). Moreover, archaeological data offer a relatively high degree of temporal resolution, because a technological trait can pass from individual to individual through cultural transmission via social learning, as opposed to gene flow that occurs over generations. In order to gain insight into the life and conditions of early cultures on a global scale, and to exploit the specific properties of the archaeological record, data must be encoded in a way that merges diverse sources, structures them consistently, and makes them available in a machine-actionable way.
Our goal is to establish a workflow to illustrate technological (dis-)similarities between archaeological assemblages within and between cultures and cultural periods. Cultures and their parent group, cultural periods, can be viewed as categories used to sort the archaeological record. Archaeologists assign assemblages to cultures based on the artifact types and properties they contain. These entities are not perfectly homogeneous, but rather show internal temporal and regional differences. Nor are they rigidly distinct from each other: there are spatial transitions, e.g. neighboring cultures exchanging technological skills, and temporal transitions between successive cultures. In this study, we analyzed the technological similarities between assemblages of different cultures based on artifact categories and inspected their spatial distribution. We used one of the most extensive databases for paleo-research, the ROCEEH Out of Africa Database (ROAD, www.roceeh.net) to derive a geographical network graph from 15,719 assemblages distributed over Africa and Eurasia ranging from 3,000,000 to 40,000 years ago. This graph is based on nodes that were constructed from dated assemblages and their geocoordinates. The nodes are connected with edges, which are weighted by similarities between the assemblages' individual sets of artifact categories.
We chose a method of big data visualization related to flow maps for the cartographic presentation of the results. Radial flow maps, also called origin-destination flow maps, are cartographic styles to map the movement of phenomena such as people or goods between locations using symbols such as lines or arrows (Peterson, 2020). Typical examples include flight connections between airports to illustrate an airline company's service area, or the routes of bird migration. They can also be used to represent quantitative measures of relationships between places by manipulating line width and color, for example in Minard's famous graph from 1869 showing Napoleon's Russian Campaign 1812/1813 (Minard, 1869; Robinson, 1967). One instance that has received great media publicity worldwide and gained popularity on social media is the Facebook map by Paul Butler (2010), where the connections between locations are represented by the number of Facebook 'friendships' and their distance, visualized by means of a color gradient. By combining a minimalist approach of showing only the connections with a large amount of data, a pattern emerges that shows urban centers and peripheries as well as natural and political boundaries. Field (2018) describes this style as an alternative to classical design principles, according to which information is summarized by means of generalization and abstraction. In this alternative style, an 'overall view of data' instead results from the large amount of detailed information and appropriate cartographic processing. Here, we demonstrate how a 'cultural network' of prehistory can be drawn using this technique by showing examples of the cultural periods known as 'Middle Paleolithic' and 'Middle Stone Age' and detail maps of a variety of cultures reasonably well constrained in space and time, for example: 'Acheulean', 'Mousterian', 'Aterian', 'Howiesons Poort', and 'Initial Upper Paleolithic' cultures.

Data
The Role of Culture in Early Expansions of Humans (ROCEEH) began synthesizing archaeological, paleoanthropological, paleontological and paleobotanical data into the ROCEEH Out of Africa Database (ROAD) in 2009. As of December 2021, the team has entered more than 2100 localities into ROAD, including over 15,000 assemblages in Africa, Europe and Asia dated between 3 million and 20,000 years ago. The data come from over 4200 publications written mainly in English, French, German, Italian, Spanish, Russian and Chinese. The data in ROAD are structured according to localities (archaeological sites) which consist of dated strata (layers) containing assemblages (finds) such as artifacts, human and animal fossils, and plant remains plus their bibliographic citations.
The attributes listed in Table 1 relate to a locality and its assemblages. Each locality has geographic coordinates 'x' and 'y'. The attribute assigned to a culture in ROAD is 'idarchstrat' and includes such entities as Mousterian or Gravettian. Cultures can be grouped into general categories called 'cultural_period'; for example, the Mousterian belongs to the Middle Paleolithic, and the Gravettian to the Upper Paleolithic. Each culture includes an estimate of its age range as well as its geographic distribution, called 'regional_technocomplex'. For example, the Mousterian occurs from 350,000 to 40,000 years ago in Eurasia, while the Gravettian ranges from 35,000 to 25,000 years ago and is present mainly in Europe.
The age of each geological layer at a specific locality is determined through the synthesis of radiometric ages. These data can then be combined with information about the relative age derived from culture, paleomagnetic data, biostratigraphy, and the sequence of the layers. The best age estimate of each layer is given by the attributes 'age_min' and 'age_max'. If a layer is undated, but lies between dated layers, its estimated age can also be entered. The attributes listed in Table 2 describe find categories that recur in assemblages frequently enough to make a statistical analysis worthwhile. The tool groups (tg) comprise different types of stone artifacts, while symbolic artifacts can be interpreted as art, music or ornament. Organic tools can be fashioned from bone, antler, ivory, teeth or shell. Miscellaneous finds include ochre, shell or ostrich eggshell, among other materials. Finally, a feature is a non-portable find, such as a burial, combustion feature or pit, which allows us to draw inferences about past human behavior.

Table 1 (excerpt). Classification of the cultural period assigned to the assemblage; region: Regional specification of a period; age_min: Minimal age; age_max: Maximal age.
We established a static API to retrieve a customized query from the ROAD database, which is available under the URL 'https://www.roceeh.uni-tuebingen.de/api/roceeh/assemblagesForSimilarityCalculation.php'. The API delivers a table with information on the assemblage level (Table 1), which combines the spatiotemporal properties (locality, assemblage name, geocoordinates, dating) and cultural classifications (culture, period and region) with a set of 32 attributes describing the presence or absence of relevant archaeological artifacts contained in the assemblages (Table 2). Since the database is continuously growing and updated, the results accessed through the API may change over time. For the sake of reproducibility, we downloaded a working copy as of 2021-11-16 and used this static dataset for further analysis instead of using one that changed daily. The working dataset contains information on a total of 15,719 assemblages, all of which meet our minimum demands regarding the completeness and integrity of spatial coordinates and age estimates.

Processing
Our analysis is based on the principles of network analysis (Scott, 1988) and features a graph where nodes represent archaeological assemblages, and weighted edges represent the similarity between two assemblages inferred from the find record. In a complete graph, where each node is connected with all other nodes but itself, the maximal number of theoretically possible undirected edges (K_n) between n nodes is given by the formula:

$$K_n = \frac{n(n-1)}{2} \qquad (1)$$

With n = 15,719 assemblages, this results in a total number of K_n = 123,535,621 possible connections. However, not all these edges are relevant for our study of cultural connectivity. Therefore, we developed a procedure that retains only those connections whose assemblages overlap in time and exhibit a minimum of cultural similarity. Our workflow comprises five processing steps to successively reduce the potential number of relations and filter the most meaningful ones (Figure 1).
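As a quick arithmetic check of the edge count quoted above, the following minimal Python sketch evaluates n(n − 1)/2 for the stated number of assemblages. The function name `complete_graph_edges` is hypothetical and chosen for illustration only; it is not part of the workflow described here.

```python
def complete_graph_edges(n: int) -> int:
    """Number of undirected edges in a complete graph with n nodes (no self-loops)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


n_assemblages = 15_719  # number of assemblages in the working dataset
print(complete_graph_edges(n_assemblages))  # 123535621
```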
In the first step, we constructed a full list of all potential combinations of assemblages including self-connections and directed edges (maximum number of links). In the second step (different sites), we excluded all connections whose nodes appear in the same locality for the following three reasons: (a) self-connections, as these are redundant; (b) distinct assemblages at a locality refer to distinct events of deposition, as apparent from the archaeological context, and therefore did not overlap in time; (c) our study focused on cultural relationships over continental to global scales, and on assemblages rather than the microscale relationships within single archaeological profiles sharing the same geographical position (Figure 1).
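The first two steps can be illustrated with a small Python sketch. The three toy records, their identifiers, and the function names `candidate_pairs` and `different_sites` are hypothetical; the sketch only mirrors the logic described above (all ordered pairs first, then removal of pairs that share a locality) and is not the original implementation.

```python
from itertools import product

# Hypothetical toy records: (assemblage_id, locality)
assemblages = [
    ("a1", "Site A"),
    ("a2", "Site A"),
    ("b1", "Site B"),
]


def candidate_pairs(records):
    """Step 1: all ordered pairs of records, including self-connections."""
    return list(product(records, repeat=2))


def different_sites(pairs):
    """Step 2: drop pairs whose endpoints share a locality.

    Because an assemblage shares its locality with itself,
    this also removes all self-connections.
    """
    return [(u, v) for u, v in pairs if u[1] != v[1]]


all_pairs = candidate_pairs(assemblages)
kept = different_sites(all_pairs)
print(len(all_pairs), len(kept))  # 9 4
```

In this toy case, only the four ordered pairs linking 'Site A' with 'Site B' survive the second step.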
In the third step, we filtered for temporally overlapping assemblages, since we are interested in the cultural similarities between contemporaneous events. Over longer periods, the spatial distribution of cultural traits may be affected by geographical processes like expansion, relocation, and more complex models of diffusion. These would lead to an overestimation of the extent of cultural relations at a given point in time, if temporal correlations were not taken into account. In this study, we chose a rather wide definition for the contemporaneity of assemblages, which we defined as an overlap between the age ranges given by the variables 'age_min' and 'age_max'. The age ranges are affected by a number of factors, e.g. whether they originate from cultural or absolute dating methods, the uncertainties and limitations of different dating methods, and of course the quality and plausibility of the individual date. Furthermore, there is an increase of uncertainty with age, so that age ranges tend to be wider for older dates, which likely overestimates the connectivity of older assemblages. If specific research questions require more precise dates, these can be filtered in the ROAD database by setting constraints for dating methods and uncertainties.
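A minimal Python sketch of such an overlap test is given below. It assumes that ages are expressed in years before present with 'age_min' not exceeding 'age_max', and that ranges touching at a single boundary value count as overlapping; both points, as well as the function name `ages_overlap`, are assumptions made for illustration.

```python
def ages_overlap(a_min, a_max, b_min, b_max):
    """True if the closed age ranges [a_min, a_max] and [b_min, b_max] intersect.

    Ages are in years before present; each range must satisfy min <= max.
    Treating the ranges as closed intervals is an assumption of this sketch.
    """
    if a_min > a_max or b_min > b_max:
        raise ValueError("age_min must not exceed age_max")
    return a_min <= b_max and b_min <= a_max


# Age ranges quoted in the text, in years before present
mousterian = (40_000, 350_000)
gravettian = (25_000, 35_000)
initial_upper_paleolithic = (40_000, 47_000)

print(ages_overlap(*mousterian, *gravettian))                 # False
print(ages_overlap(*mousterian, *initial_upper_paleolithic))  # True
```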
In the fourth step, we considered two options to implement the directionality of graphs. A directed graph is applicable, when the direction of transmission of cultural traits from one assemblage to another is clear. This is not the case for our data, because we do not know whether there was direct or indirect transmission between two assemblages. Furthermore, dating uncertainties impair the reconstruction of chronological sequences. Therefore, we implemented the relation between assemblages as an undirected graph, which means that the similarity between assemblages is considered mutual. Thus, we excluded directed edges and retained undirected ones.
In the fifth step, we computed the weights of edges in the form of similarities between the find composition of the assemblages, which is expressed in the attributes presented in Table 2. We transformed the item set of each assemblage into a Boolean vector, where 1 indicates presence of a find type in the respective assemblage and 0 represents absence. Then we compared the similarity of these vectors.
We parameterized the similarity based on the Jaccard distance coefficient implemented in the R package 'philentropy' (Drost, 2018), which estimates the similarity between two assemblages A and B with:

$$J_{Sim}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

Table 2. The dataset contains information on culturally relevant materials, the functional and practical interpretation of artifacts and remains, typology and the use of abstract categories (art, music, ornaments) available from the archaeological record. The dataset is binary encoded and indicates the presence or absence of the artifacts in the respective assemblage.

Figure 1. Analytical steps to reduce the large number of potential cultural connections to condense the most meaningful similarities. The red bands and columns show the number of connections that passed a selection step and served as input for the subsequent step. Gray bands and columns show the connections that were excluded during the respective steps. Of all potential connections between assemblages, we distilled out the 1.4% considered most informative.
The Jaccard Similarity is scaled between 0 and 1, where a value of 1 indicates a complete match between two assemblages, 0 represents no matches, and values in between show partial overlap in the item sets. From these results we excluded all edges that did not show at least partial overlap between the endpoints (J_Sim = 0).
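To make the similarity measure concrete, the following Python sketch computes the Jaccard similarity of two Boolean presence/absence vectors as the number of shared find types divided by the number of find types present in at least one assemblage. It is a plain re-implementation for illustration, not the 'philentropy' routine used in the study; the function name `jaccard_similarity`, the five-element toy vectors, and the handling of two empty item sets are assumptions.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity of two equal-length Boolean (0/1) vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)     # find types present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # find types present in at least one
    if either == 0:
        # Assumption: two empty item sets are treated as having no overlap.
        return 0.0
    return both / either


# Hypothetical assemblages encoded over five find types
assemblage_a = [1, 1, 0, 0, 1]
assemblage_b = [1, 0, 0, 1, 1]
print(jaccard_similarity(assemblage_a, assemblage_b))  # 0.5
```

Edges with a result of 0 would be dropped at this stage, since their endpoints share no find type.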
For this selection of similarity-weighted edges, we also constructed a geographic representation. We estimated the distance between the assemblages' localities as the great circle distance on the WGS84 ellipsoid with the R library 'geosphere' (Hijmans et al., 2021).
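For readers who want to approximate such distances without R, the sketch below uses the haversine formula on a sphere. This is a swapped-in simplification: it assumes a spherical Earth with a mean radius of 6371.0088 km rather than the WGS84 ellipsoid, so its results differ slightly from those of 'geosphere'. The function name `haversine_km` and the reading of 'x' as longitude and 'y' as latitude in decimal degrees are likewise assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the equator: from (0° E, 0° N) to (90° E, 0° N)
print(round(haversine_km(0.0, 0.0, 90.0, 0.0), 1))
```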

Results
In this section, we first describe the spatial distribution of localities and potential biases emerging from the exploitation of a large archaeological database. Then we present further details on the network in general and the cultural periods as well as the nine technocomplexes visualized on the main map and the side maps.

Spatial distribution, spatial autocorrelation and network metrics
An overview of the 15,719 assemblages used in this study is presented in Figure 2. The analysis of our assemblage dataset shows that their spatial distribution is not random, but follows a clustered pattern, as underlined by the Average Nearest Neighbor (ANN) test (Clark & Evans, 1954). According to this, the assemblages are distributed over n = 1374 localities which have an observed ANN distance of D_O = 42.25 km, as opposed to the expected distance D_E = 124.46 km, which was estimated from the study area (A = 85,129,000 km²). This results in an ANN Index of 0.34 with a z-score of z = −28.66 and indicates a significantly clustered spatial pattern. Regions with an accumulation of more than 100 assemblages per hexagon include Southern Africa, East Africa, the Maghreb, the Levant, Caucasus, the Altai Mountains, northern China, and Java. The distribution is the result of multiple spatial processes, including: (a) the habitats and population densities of earlier populations and their preferences for lithic material culture; (b) the taphonomic conditions which are responsible for the preservation of artifacts and are affected by factors like material deposition, geology, and climate; (c) the regional differences in scientific research intensity; and (d) the completeness and selection criteria of the ROAD database itself. We draw attention to the fact that research history and intensity vary across Africa, Europe and Asia due to a variety of factors related to politics, economics, infrastructure and culture. As research continues, we hope to enhance our view of the under-represented regions of the map.
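The relation between the observed and expected distances can be retraced with a short Python sketch. It uses the standard expectation for a random point pattern, 0.5 / sqrt(n / A), and divides the observed mean distance by it. The function names are hypothetical, and the sketch deliberately covers only the expected distance and the index, not the significance test.

```python
import math


def expected_nn_distance(n_points, area):
    """Expected mean nearest neighbor distance for a random pattern: 0.5 / sqrt(n / A)."""
    return 0.5 / math.sqrt(n_points / area)


def ann_index(observed_distance, n_points, area):
    """Ratio of observed to expected mean nearest neighbor distance.

    Values below 1 point to clustering, values above 1 to dispersion.
    """
    return observed_distance / expected_nn_distance(n_points, area)


n_localities = 1374
study_area_km2 = 85_129_000
observed_km = 42.25

print(round(expected_nn_distance(n_localities, study_area_km2), 2))  # 124.46
print(round(ann_index(observed_km, n_localities, study_area_km2), 2))  # 0.34
```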
Spatial autocorrelation refers to a basic principle of geography, which is formulated in Tobler's first law stating, 'Everything is related to everything else. But near things are more related than distant things' (Tobler, 1970; Tobler, 2004). In the context of this study, this would imply that spatio-temporally proximate sites show a higher cultural similarity than distant sites. We tested this assumption by estimating the correlation between the cultural similarity (Jaccard Index) and the distance between localities with Spearman's rank correlation coefficient (Table 3, column r_s). We found no significant evidence to support the assumption of spatial correlation within the cultural record. Furthermore, we visually inspected the potential relationship between the variables with heat maps (Figure 3). There, too, we found no pattern suggesting an effect of distance decay. We note, however, that our results are likely affected by the uneven distribution of localities (see the previous paragraph) and our definition of temporal overlap. Therefore, further studies targeting the issue of spatiotemporal autocorrelation of ancient cultures are needed, which incorporate the temporal dimension into measures of spatial autocorrelation (Getis, 2010).
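The kind of test described here can be sketched in plain Python as a Pearson correlation computed on ranks, with tied values receiving their average rank. The function names and the four toy edges are hypothetical; the toy data are constructed to show perfect distance decay and do not reflect the results of this study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical edges: distance in km and Jaccard similarity
distances = [100, 500, 1500, 4000]
similarities = [0.9, 0.6, 0.4, 0.1]
print(round(spearman_rho(distances, similarities), 6))  # -1.0
```

A coefficient near −1 would indicate strong distance decay, whereas values near 0 correspond to the absence of a monotonic relationship.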
Our network analysis yielded a total number of 3,506,065 edges between 15,719 nodes (see Table 3). Based on this, a node is connected to an average of 233 other nodes in terms of temporal overlap and the coexistence of at least one mutual cultural category. Considering that a complete graph with this number of nodes (K_15,719) may contain 123,535,621 edges, we note that 2.8% of them were constructed as actual edges based on our criteria. We did not attempt to visualize the entire network on a two-dimensional map because of its temporal depth; instead, we present below a selection of distinct cultures and cultural periods queried from the network.
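The share of realized edges can be checked with a few lines of Python. The sketch relates the reported edge count to two denominators: the complete undirected graph, which reproduces the 2.8% given above, and the full list of ordered pairs including self-connections (n²). Using n² as the basis of the 1.4% reported in Figure 1 is an assumption of this sketch, as is the helper name `percent`.

```python
edges = 3_506_065
nodes = 15_719

complete_graph = nodes * (nodes - 1) // 2  # undirected edges, no self-connections
full_list = nodes ** 2                     # ordered pairs incl. self-connections (assumed basis of Figure 1)


def percent(part, whole):
    """Share of part in whole, in percent, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(percent(edges, complete_graph))  # 2.8
print(percent(edges, full_list))       # 1.4
```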

Description of cultures and cultural periods
The main map shows two cultural periods, i.e. supercategories of human cultural evolution that comprise different cultures. Both periods begin at least 300,000 years ago and end about 40,000-30,000 years ago, whereby these temporal limits are regionally differentiated. Due to different research perspectives in the history of archaeological study, finds in Sub-Saharan Africa, and sometimes also North Africa, are assigned to the Middle Stone Age (MSA), whereas finds north of this divide are usually attributed to the Middle Paleolithic (MP). Technologically, however, they are similar despite the regional naming conventions. Both periods are characterized by the use of Levallois technology, a process for manufacturing stone tools using prepared cores (Chazan, 2020).
Considering the MSA, the main map shows close similarities between the Southern and East African assemblages, but there are also occasional technological links with sites in West and North Africa, and even the Arabian Peninsula. The latter refers to two assemblages from the locality 'Jebel Faya FAY-NE1' which were specifically assigned to the MSA because the authors observe considerable technological similarities with African assemblages (Armitage et al., 2011). The Middle Paleolithic is particularly well connected in Europe and Southwest Asia, especially the Levant, where there is also a high concentration of sites. These are strongly linked to the assemblages in North Africa and Asia, although it is apparent that the connections between North Africa and Asia are rather weak. Furthermore, it is striking to note that the assemblages in North Africa, as well as Central and South Asia, are internally connected within their respective regions. In contrast, East Asia appears to be connected to other regions like Europe, while internal links are rare. This result is due in part to the development of the ROAD database over time. Since 2010, the focus of data entry has shifted from Africa to Europe and then Eastern and Central Asia, so that coverage is rather extensive in these regions. However, data entry in South Asia began in 2019 and in East Asia in 2021, so that these regions are less well-represented. Finally, the map also shows the substantial connections between the two cultural periods, which should be expected due to their technological similarity, and in some cases, their geographical overlap. The particularly frequent connections of Southern and West Africa with the regions in North Africa, Europe and the Levant are evident.

Table 3 (caption, partial). [...] are the results of this query. 'K_n' describes the potential number of connections in a complete graph as described in formula 1. The means of distance and Jaccard Index are reported with standard deviations, although it has to be stated that unimodal and multimodal distributions make comparisons with these measures difficult (see Fig. S1 in the Supplementary Material). The correlation r_s between distance and similarity was calculated with Spearman's rank correlation coefficient (rho).
In addition to the main map, we present three overview maps depicting selected cultures relevant to the development of early humans. These cultures are reasonably well constrained in their temporal and spatial extent and are arranged chronologically, with the Acheulean (Lower Paleolithic and Early Stone Age) at the bottom, a selection of three Middle Paleolithic and Middle Stone Age cultures in the middle (Aterian, Micoquian, and Howiesons Poort) and three Upper Paleolithic and Later Stone Age cultures at the top (Initial Upper Paleolithic, Early Later Stone Age, and Gravettian).

Figure 3. Heat maps show no correlation between cultural similarity (y-axis, 0 is low and 1 is high) and distance (x-axis) of network edges. In the case of pronounced spatial autocorrelation, high cultural similarity would be expected at low distances (low x, high y), which then decreases with increasing distance (high x, low y).
One of the oldest technologies is the Acheulean (Sahle, 2020), which is associated with the production of bifacial handaxes. We see evidence of Acheulean technology at sites where Homo erectus was found, and later on, H. heidelbergensis and early H. neanderthalensis. We created regional subsets of the Acheulean to emphasize the temporal and spatial expansion of this technology. The culture originated in Africa about 1.75 million years ago, spread to the Levant by about 1.5 million years ago, and reached Europe around 1 million years ago. The end of the Acheulean is estimated at 300,000 years ago in Africa and 250,000 years ago in Europe and the Levant. However, the dates are the subject of debate because handaxes continued to be used in later periods as well. Whether a true Acheulean technology can be found in East Asia is also still debated.
The three technological traditions displayed in the middle map are associated with H. sapiens in Africa and H. neanderthalensis in Eurasia. Middle Paleolithic and Middle Stone Age (Blackwood & Wilkins, 2022) assemblages are generally characterized by a range of lithic tools. These artifacts are often made using the Levallois technique and include unifacial and bifacial points, scrapers, and backed pieces. In addition, geometric designs, ochre pigments (Hodgskiss, 2020), shell beads, and ornaments demonstrate an increasing use of symbolic artifacts. The Aterian (Garcea, 2021), especially known for its tanged arrowheads, dates broadly between 150,000 and 40,000 years ago and shows connections between assemblages across North Africa, especially in the Maghreb. The Micoquian is a Middle Paleolithic industry found predominantly in Central Europe which dates between about 130,000 and 40,000 years ago. The Micoquian is characterized by asymmetrically formed bifaces and includes the Keilmessergruppe with its bifacially worked knives. The Howiesons Poort (de la Peña, 2020) connects assemblages in Southern Africa between 66,000 and 58,000 years ago. It is known for its blades and backed segments made on a fine-grained rock called silcrete, as well as bone tools and other technologies.
The upper map shows three technocomplexes of the Upper Paleolithic and Later Stone Age (Bicho, 2021), periods associated only with H. sapiens. As its name implies, the Initial Upper Paleolithic is considered as the first industry of the Upper Paleolithic. It is known from Eurasia and ranges between 47,000 and 40,000 years ago. This industry marks the transition from the predominance of archaic Homo to the sole presence of H. sapiens and shows technological similarities among assemblages in the Levant, Eastern Europe, and the Baikal region of Siberia. Like the Initial Upper Paleolithic, the Early Later Stone Age demarcates a transition in Southern Africa starting about 42,000 years ago in which microlithic assemblages, as well as bipolar and ground stone technologies play an increasingly important role. Finally, the Gravettian occurs across much of Europe between about 35,000 and 25,000 years ago and marks a time when ice age hunters made specialized toolkits and a wide variety of art, symbolic artifacts, and personal ornaments.

Discussion/conclusions
Our results show that this approach has the potential to map the spatial distribution and boundaries of prehistoric cultural networks. Compared to point symbology, the network offers the advantage that an internal differentiation of cultures can be made. This helps to distinguish more similar regions from less similar ones and thus draw inferences about connectivity. Furthermore, it allows archaeologists to explore the technological relationships between different cultures, for example, the neighborhood relations between contemporaneous cultures or the inheritance of cultural traits between successive cultures. Alternatively, it is also possible to delineate cultural boundaries and disconnectivity in space and time, as quantified by the grade of dissimilarity between entities.
It should be critically noted that the edges of our network simply represent the material similarity of assemblages. They do not necessarily represent the exchange of individuals or the transmission of cultural traits through social learning directly between the nodes. A technological trait can be transmitted through many intermediaries, which opens the possibility of many alternative paths compared to the direct connection shown in our maps. Furthermore, the inheritance of traits from earlier cultures is possible, so that their occurrence in temporally overlapping assemblages is not an expression of the connectedness of specific individuals. The independent development of traits at multiple locations and times also plays a role, and the necessity of cultural transmission, especially for early artifacts, is questioned (Snyder et al., 2022). These circumstances are an issue in archaeological science in general and must be taken into account in order to interpret the results correctly.
Our approach is particularly well suited to the archaeological record, which is extensive compared to other archives from the paleo-sciences, and thus provides a quantitative basis to construct a network with millions of edges. The cultural network also covers regions and periods in which the processes of site taphonomy prevent the preservation of evidence from disciplines such as paleogenetics or biological anthropology. Combined with a variant of flow mapping, a visualization method for big data, it is possible to obtain a large-scale, overarching picture from a wealth of detailed information.
During the implementation of this study, we considered replacing individual processing steps with alternatives and take the opportunity in the following paragraphs to discuss some of our thoughts.
The study underscores the robustness of the ROAD database as an analytical tool for macroscale research, because it provides a structured, machine-actionable record of thousands of scientific publications with a coherent taxonomy. While this is only possible with a certain degree of abstraction and aggregation, it serves to make the existing record comparable. Researchers who prefer a higher granularity may apply our workflow and dataset to answer more targeted archaeological research questions. Another route would be to focus on specific details, for example 'signbase', a database in which markings on archaeological objects are documented with more detailed attributes (Dutkiewicz et al., 2020).
The results underline that ROAD covers large areas of the Paleolithic world, but some regions are clearly under-represented due to a variety of structural reasons, varying from politics and economics to infrastructure and culture. While these regions are not ignored, they provide fewer possibilities for a fine-grained analysis. Finally, to make data with a high level of detail and global extent available for large-scale studies, the scientific community needs to agree on common standards, ontologies and data sharing with linked databases (McKeague et al., 2019; McKeague et al., 2020).
The process of edge construction may be fine-tuned by an alternative definition of 'temporal overlap' or the selection of narrower absolute dates, replacing cultural dates. We based the query of cultural attributes (Table 2) on their frequency in the database; however, other researchers may find different traits of more interest, depending on their specific questions.
There are also alternatives to the Jaccard similarity measure, such as those included in the R package 'philentropy' (Drost, 2018) and discussed in dos Santos and Zárate (2015). We chose great circles to cartographically represent the edges due to their simplicity. However, considering the many physical and environmental barriers early human populations faced, great circles underestimate the distances actually travelled. Computational methods such as agent-based modelling (ABM) and Least Cost Path analysis hold potential to provide a more accurate representation of actual pathways (Crabtree et al., 2021; Lewis, 2021; Verhagen et al., 2019). In addition to the definitions of cultures developed through the ROCEEH project, the PERIODO project offers alternative spatial and temporal extents of these categories (Rabinowitz et al., 2016).
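For readers unfamiliar with the two building blocks named here, the following Python sketch shows the Jaccard similarity between two sets of attributes and the great-circle (haversine) distance between two locations. It is an illustration under stated assumptions, not the implementation used in this study: the attribute names and coordinates are hypothetical, and the Earth is treated as a sphere with a mean radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # clamp to 1.0 to guard against floating-point rounding near antipodal points
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


# Hypothetical attribute sets of two assemblages, for illustration only
assemblage_a = {"handaxe", "scraper", "core"}
assemblage_b = {"scraper", "core", "blade"}

print(jaccard(assemblage_a, assemblage_b))     # 2 shared of 4 distinct attributes
print(great_circle_km(0.0, 0.0, 0.0, 90.0))    # a quarter of the equator
```

Because the great circle is the shortest path on the sphere, any route that detours around mountains, deserts or water bodies is necessarily longer than the value returned by `great_circle_km`.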
When we first developed this study, our progress was driven by the wish to visualize the ROAD database as a whole. It soon turned out to be more informative to query different categories within this haystack, such as the cultures and cultural periods presented on the map. We therefore propose to develop this idea further into an interactive discovery tool that enables users to customize the selection of cultural attributes used to construct the network and to visualize different categories tailored to their needs. This could help scientists to explore the 11 cultural periods in different ways. It would also allow the exploration of more than 100 other cultures available in ROAD and our network, but not shown on these maps. The technical foundation is provided by open source dashboard software, like 'shiny' or 'Plotly' (Sievert, 2020). Further developments to gain insight from the network comprise: (a) the method of edge bundling (Cui et al., 2008; Graser et al., 2017) for the aggregation of multiple edges, eventually leading to a distributive type of flow map (Peterson, 2020); (b) the computation of network metrics to assess connectivity (Brughmans, 2013); and (c) the visualization of the temporal axis.
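As a simple illustration of point (b), the Python sketch below computes two elementary connectivity metrics, node degree and network density, for an undirected edge list. It is a sketch under assumptions and not part of the published workflow: the node labels are hypothetical, and the edge list is assumed to contain no duplicate edges and no self-loops.

```python
from collections import defaultdict


def degree(edges):
    """Number of edges attached to each node of an undirected network."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def density(edges, n_nodes):
    """Share of all possible undirected edges that are actually present."""
    if n_nodes < 2:
        return 0.0
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


# Hypothetical network of four assemblages, for illustration only
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

print(degree(edges))
print(density(edges, n_nodes=4))
```

In this example, node C has the highest degree and would be read as the best-connected assemblage, while the density describes how tightly the network as a whole is linked.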
We mentioned at the beginning that we see the cultural networks of early populations as a piece of the puzzle of human evolution. For a holistic picture, it is therefore necessary to link our results with global environmental reconstructions and biological anthropological evidence. This allows us to ask further questions. How are cultural traits exchanged and inherited between different populations and species? To what extent are cultural spaces confined by environmental factors and the human niche? And how does this change over time? Concepts such as Niche Construction Theory and Triple Inheritance could provide a suitable theoretical framework for such investigations (Laland et al., 2000; Laland & O'Brien, 2011; Odling-Smee, 2015).

Open Scholarship
This article has earned the Center for Open Science badges for Open Data and Open Materials through Open Practices Disclosure. The data and materials are openly accessible at https://doi.org/10.5281/zenodo.6873723.

Acknowledgements
We thank Zara Kanaeva for the technical development and maintenance of the ROAD database and the development of a static API for this study. Furthermore, we thank ROCEEH's student research assistants for data entry.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This study was developed within the research project 'The Role of Culture in Early Expansions of Humans' (ROCEEH) funded by the Heidelberg Academy of Sciences and Humanities. We acknowledge support from the Open Access Publishing Fund of the University of Tübingen.

Data availability statement
The input data, reproducible code and results of this study are made permanently available under the link https://doi.org/10.5281/zenodo.6873723 under the CC-BY 4.0 license. Furthermore, the Supplementary Material under https://doi.org/10.1080/17445647.2022.2118628 contains Figure S1, the density distribution of the cultures and cultural periods,