A solution to the challenges of interdisciplinary aggregation and use of specimen-level trait data

Summary
Understanding variation of traits within and among species through time and across space is central to many questions in biology. Many resources assemble species-level trait data, but the data and metadata underlying those trait measurements are often not reported. Here, we introduce FuTRES (Functional Trait Resource for Environmental Studies; pronounced few-tress), an online datastore and community resource for individual-level trait reporting that utilizes a semantic framework. FuTRES already stores millions of trait measurements for paleobiological, zooarchaeological, and modern specimens, with a current focus on mammals. We compare dynamically derived extant mammal species' body size measurements in FuTRES with summary values from other compilations, highlighting potential issues with simply reporting a single mean estimate. We then show that individual-level data improve estimates of body mass, including uncertainty, for zooarchaeological specimens. FuTRES facilitates trait data integration and discoverability, accelerating new research agendas, especially scaling from intra- to interspecific trait variability.


INTRODUCTION
Traits are the measurable morphological, physiological, behavioral, and life-history characteristics of organisms that directly interact with the environment and thus determine how organisms respond to changing environmental conditions (Eronen et al., 2010; Polly et al., 2011; Guralnick et al., 2020; Saarinen et al., 2021). Trait-based approaches in ecology are vital as new theoretical and empirical efforts have led to novel insights about linkages between traits and niche overlap at the population and community levels (McGill et al., 2006; Violle et al., 2014; Read et al., 2018), as well as the importance of traits in structuring composition of assemblages (Ackerly and Cornwell, 2007; Holt et al., 2018). These approaches have also been crucial for asking and answering time-extended, macroevolutionary questions, such as relationships between rates of trait evolution and species diversification (Folk et al., 2019; Upham et al., 2020), patterns of functional diversity along gradients at varying scales (Cisneros et al., 2014; Dreiss et al., 2015; de la Sancha et al., 2020), adaptive and plastic responses of traits to past environmental change (Smith and Betancourt, 2006; Saarinen et al., 2021), and human modification of the environment (Tomé et al., 2019; Hill et al., 2008; Guthrie, 2003). Thus, trait-based approaches will continue to connect within and across disciplines, providing a common framework across not only ecology and evolution but also paleontology and environmental archaeology.
Given the centrality of traits in modern biology, it is unsurprising that many trait databases have recently emerged, typically (Smith et al., 2003; Jones et al., 2009; Gallagher et al., 2020), but not always (Kattge et al., 2011; Gonçalves et al., 2018), built by extracting information from existing literature. Focusing here on vertebrates, such compilations typically cover key life-history information such as number of offspring, body size, or even equations for estimating body size (Jones et al., 2009; Smith et al., 2003; Wilman et al., 2014; Damuth and MacFadden, 1990; Myhrvold et al., 2015). While impactful in enabling macroscale research, these compilations usually only report species' mean or maximum values (e.g., Jones et al., 2009; Smith et al., 2003; Wilman et al., 2014; Damuth and MacFadden, 1990), or value ranges (Myers et al., 2020). Emphasis on means or ranges fundamentally limits the utility of these trait databases for many biodiversity-based research studies. These summaries have been built de facto from measurements of traits of individuals, but neither the measurements of the traits themselves nor their provenance are typically maintained, except in unpublished original field and lab records. In particular, critical metadata about the specimens on which they were based, including sample sizes, spatial and temporal scope of the measurements, sex, reproductive condition, and age classes or life stage, are often not reported, providing few mechanisms for error checking and improvement. The outcome is that these species-level trait values become operationally static the moment they are published. A more effective approach would link standardized metadata about specimens, observation and measurement processes, and trait terms explicitly built to apply to individuals. Such an approach not only enhances discoverability and replicability of data but also facilitates research examining variation in traits across scales.
An improved system for communicating and storing traits reported at the individual level is needed, where users can access open trait data and metadata and summaries of trait values can be dynamically generated. Building such a system need not start from scratch; we can learn from, and build upon, the infrastructure of open-access specimen databases and specimen data repositories such as iDigBio (iDigBio: https://www.idigbio.org), VertNet (Constable et al., 2010; VertNet: https://vertnet.org), PaleobioDB (PBDB: https://paleobiodb.org), NOW (NOW database: https://nowdatabase.org/now/database), and Neotoma (Williams et al., 2018; NeotomaDB: https://neotomadb.org). These repositories have shown great success developing a community of data publishers and users built around adherence to community data standards that define key terms about collecting events, occurrences, taxonomies, and, if applicable, ways to define time. Researchers know what these key fields mean because they link to permanent definitions with examples [e.g., Darwin Core (dwc); Wieczorek et al., 2012] or are defined in the database schema. Standards are particularly essential for enabling research across disciplines, time periods, and spatial extents, providing a lingua franca that allows articulations across disciplines (LeFebvre et al., 2019). For example, standards and robust metadata fields are needed to aggregate data that span time: zooarchaeological and paleontological specimens are collected at one date but lived at another and, thus, it is critical to properly report temporal context information.
Despite the enormous growth in specimen-level digital data, the biodiversity informatics community has paid much less attention to standardizing how traits measured from specimens are assembled and reported. This significant infrastructure gap has impeded broader integration and development of the extended specimen concept, where specimens sit at the center of a growing constellation of specimen-derived data (Lendemer et al., 2020). This gap is particularly important to close because trait data are already streaming into repositories, yet remain effectively undiscoverable and unusable (Troudet et al., 2018). Guralnick et al. (2016) showed that a significant amount of trait data, including external measurements such as body length and reproductive state information, is often published along with specimen records. These data, however, remain hidden in notes or "associated data" fields, because existing standards, and the data publication systems constructed on those data standards, are not built for making all trait data types discoverable.
Even when data can be harvested and re-assembled from these "catch-all" fields, the challenge remains to harmonize and standardize trait information in a way that supports the broadest usability. In particular, trait definitions can be ambiguous due to differing homology definitions, uncertainty in specifics of the trait measurements (e.g., at which points on a bone are traits measured), uncertain measurement units, and/or lack of information or illustration of the trait, as well as updates in technology that change protocols for measuring traits (e.g., direct measurement on bone via calipers versus measurement from a photograph or 3D reconstruction). Even after proper standardization, studies investigating traits across time or taxa are still not comparable, because sub-disciplines have different practices about what to measure. Modern ecologists, zooarchaeologists, and paleontologists often do not use overlapping and comparable traits. For instance, modern mammalogists often take soft-tissue measurements, such as ear length and hindfoot length (e.g., Patton et al., 2000; Simmons and Voss, 1998; Voss et al., 2001), whereas zooarchaeologists and paleontologists take skeletal measurements in the absence of any soft tissue. Thus, both aligned trait and measurement definitions as well as analytical approaches for examining allometries are needed to help with linking and scaling across traits. Ontologies that leverage individual-level observations can help alleviate these issues (Walls et al., 2014). By creating ontology terms that are specific and nested, related terms can be mapped together, creating a unified terminology and supporting data integration. Furthermore, these ontological approaches can increase trait discoverability and can complement statistical approaches that quantify scaling relationships.
We have developed the Functional Trait Resource for Environmental Studies (FuTRES; few-tress) in response to the rapidly growing need for individual-level trait data. FuTRES has a back end (maintained by FuTRES) and a front end (interaction with users) for data ingestion and extraction. The back end comprises data validation, triplification, reasoning, and an API (application programming interface). The front end involves user input: interaction with template terms (Data S1), trait terms to add to the ontology, and the input of data to GEOME (Deck et al., 2017; GEOME: https://geome-db.org) before the data are loaded into the FuTRES datastore. Our datastore is based on graph-like relationships among specimens, traits, and data, where new entities can be added without disrupting the model. It is built on new and existing trait ontologies and a data integration workflow that aim to standardize and streamline trait data publication through our template preprocessing toolkits and thereby improve downstream use for paleontologists, zooarchaeologists, and neontologists (see the data life cycle: Michener and Jones, 2012; Griffin et al., 2017). FuTRES seeks to weave together efforts in trait and specimen data management to overcome the limitations of species-level trait data while building critical linkages to existing digital specimen records from which other specimen-related data can be found. These include linkages to existing repositories using occurrence identifiers and future linkages to MorphoSource (Boyer et al., 2016). We further expand the utility of FuTRES by also providing toolkits in beta release (see Supplemental Information) for data standardization and data cleaning (flagged data).
FuTRES is currently focusing on mammalian trait data but will eventually support trait descriptions and measurements across the animal Tree of Life. The data contribution process for FuTRES is enabled via expansion of existing animal anatomy and trait ontologies, and it already provides access to millions of mammalian trait measurements via a data portal and API (Figure 1). We demonstrate how FuTRES facilitates access to specimen trait data and encourages community best practices for collecting and using these data. We showcase a user-requested, best practices-based data cleaning workflow for producing the best possible trait estimates, especially for the millions of neontological trait data measurement records that are already available but lack critical standardization for best use. We further provide two case studies to illustrate the benefit of using FuTRES to dynamically derive trait means and allometric equations for research relevant for modern as well as paleo- and zooarchaeological studies. The case studies showcase two common data uses: proper determination of distribution of body masses within a species, and predictions of body mass using skeletal material to predict potential body mass change over time.

Developing FuTRES
FuTRES is a dynamic datastore connected to a community-available data ingest system, GEOME (Deck et al., 2017), which is an open-source toolkit that simplifies data import and validation for the community. FuTRES uses a specially designed template in GEOME that defines required and optional fields for data uploads (https://github.com/futres/template). GEOME also lets providers apply Creative Commons licensing and embargo data before release. The vast majority of records on FuTRES (>99.9%) are publicly available. A series of detailed help guides is available to support new providers getting started (https://github.com/futres/futres_website/blob/master/content/data_tutorial.md and https://github.com/futres/futres_website/blob/master/content/how_it_works.md).
FuTRES is a dynamic trait datastore, populated by pulling the most recent data loaded into GEOME and VertNet, annotating traits with updates from our FOVT (FuTRES Ontology of Vertebrate Traits; https://obofoundry.org/ontology/fovt.html) application ontology, so that each search retrieves the most up-to-date data available. In static datasets, data collection is paused at the time of publication; with FuTRES, an investigator can develop workflows such that each time analyses are run, the most up-to-date results are produced. Because the datastore is dynamic, we can better leverage the semantic web to link FuTRES trait data with other data sources, especially taxonomic resources to help update changing taxon concepts, but also environmental layers, gene sequences, and stable isotope records. This critical feature of FuTRES showcases how it can be part of the ecosystem of resources needed to implement the extended specimen concept (Lendemer et al., 2020).
The FuTRES datastore can be accessed via a simple web interface (FuTRES Datastore: https://futres-data-interface.netlify.app) or via an R package, rfutres (https://github.com/futres/rfutres). While current functionality of the R package is mostly focused on access to the datastore, it will also have functions for data cleaning, using the methods in this paper. Finally, to support users who may want access to the whole of FuTRES for larger analyses, we also provide a Zenodo archival snapshot that has a citable DOI (https://doi.org/10.5281/zenodo.6569644; Guralnick et al., 2022), and plan to produce these yearly for the community.
The FuTRES community collects data from a variety of sources: the field, the literature, online databases, or museum collections. The users input data formatted to a template accessed through GEOME, which accommodates paleo-, zooarchaeo-, and neontological metadata types. FuTRES works with the user to preprocess the data, but is also building tools, such as an RShinyApp (https://github.com/futres/RShinyFuTRES), that will allow submitters to prepare their own data for GEOME. The trait terms are defined and standardized; if a term does not exist, the user can create an issue to request a term through https://github.com/futres/fovt. The data are then validated and stored in GEOME. The FuTRES workflow then converts the data into RDF triples and reasons over the ontology and terms, resulting in standardized, discoverable data. The FuTRES team provides a cleaning routine for the data, filtering of data, simple metrics about data, mapping and visualization of data, and ultimately the download of data. The user then can access and discover trait data at the specimen level.

OPEN ACCESS
We developed an extendable workflow based on a set of existing tools for taking unstandardized trait data reporting and converting them into formats that best enhance findability and accessibility of individual-level trait data. Using graph-like relationships allows for scalability because new data property terms and trait terms can be used without restructuring the workflow schema. For the first round of data ingest, we added 48 new ontology terms for the 12 traits (Table 1). These terms included anatomical terms, which will be a module in UBERON (Uberon Anatomy Ontology; obophenotype.github.io/uberon; Mungall et al., 2012; Haendel et al., 2014), as well as length terms, which currently are in the FOVT but will become available in OBA (Ontology of Biological Attributes; https://github.com/obophenotype/bio-attribute-ontology). FuTRES works with the community to develop new FOVT terms using a well-established mechanism for such requests (e.g., GitHub issues; https://github.com/futres/fovt/issues). With the workflow and ontology in place, seven datasets were standardized. Standardized data are available through the FuTRES API and data portal (FuTRES Datastore: https://futres-data-interface.netlify.app).
We downloaded the ingested data from the FuTRES datastore, which has 3,958 species and 2,384,293 records. We then developed a cleaning routine to label outliers and potential juvenile records so that the data without known life stage are retained and enhanced (see example in Figure 2). We removed 56,993 records that were obvious outliers (2.5%). This point is highlighted in the example using Otospermophilus beecheyi, the California ground squirrel (Figure 2), and showcases how we were able to use data cleaning approaches to make previously unusable data usable by retaining records with unknown life stages [194 out of 233 records with unknown dwc:lifeStage; retained 222 records (28 known adults, 194 with unknown life stages but within adult body mass limits)]. The data cleaning toolkit checks whether values fall within the known adult distribution and flags the data as "possibly good, possible adult", "outlier", or "possible juvenile" in the "measurementStatus" column, letting the user decide whether they want to use it for downstream analyses. The data cleaning routine is rather liberal and biased toward keeping smaller trait measurements, and thus mean adult values may be slightly smaller than overall species body mass mean. Further cleaning, such as using known adult and juvenile body mass distributions, where warranted, may further help refine known body masses of both life stages, and we encourage community development of new efforts that can be implemented easily and linked to FuTRES. A key aspect of FuTRES is supporting

Table 1. Trait terms are the same as in the ontology (FOVT), with their IRI in parentheses. We also include counts for total number of records and for non-modern records. Synonyms for terms are either synonyms in the ontology or, in the case of the astragalus lateral length, the term we use in the paper to reflect terminology in von den Driesch (1976

Case studies
In our evaluation of overall species means represented in the current FuTRES datastore, we find that reported species means from the literature (specifically PanTHERIA, Jones et al., 2009), while often not wildly far off, are also not generally in close agreement with species means in our dataset (Table 2; Figure 3). Only ~32% of species mean body masses reported from PanTHERIA were within 3 standard errors (se) of mean values or within the 5% and 95% quantiles of body mass by species in this study; ~68% were not (Table 2; see also Table S1). Species means reported in PanTHERIA tended to be larger than the average body masses from our study (Table 2). We tested the relationship between sample size and the mean body mass difference to ensure that sample size did not affect these results, where perhaps smaller sample sizes would result in a larger difference in body mass averages; however, we found no relationship (see also Table S2; Figure S1). Additionally, we tested for a relationship between body mass and difference in mean body mass, with the expectation that perhaps larger-bodied species with a wider body mass range would show a greater difference from mean body mass. We found a slight relationship, seemingly driven by an extreme case (see also Table S2; Figure S2), suggesting that sampling differences due to body size do not markedly affect this analysis.
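The agreement test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the routine used in the study: the function name `compare_to_literature`, its inputs, and the synthetic masses are all hypothetical, and the choice of the "inclusive" quantile method is an assumption.

```python
import math
import statistics


def compare_to_literature(masses_g, literature_mean_g, n_se=3.0):
    """Compare a published species mean with specimen-level body masses.

    masses_g: cleaned adult body masses for one species (hypothetical input).
    literature_mean_g: the single summary value from a compilation.
    """
    n = len(masses_g)
    mean = statistics.fmean(masses_g)
    # Standard error of the mean, from the sample standard deviation.
    se = statistics.stdev(masses_g) / math.sqrt(n)
    # quantiles(n=20) returns 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(masses_g, n=20, method="inclusive")
    q05, q95 = cuts[0], cuts[-1]
    return {
        "n": n,
        "mean": mean,
        "se": se,
        "within_se": abs(literature_mean_g - mean) <= n_se * se,
        "within_quantiles": q05 <= literature_mean_g <= q95,
    }


# Synthetic masses for illustration only; these are not FuTRES records.
example_masses = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
result = compare_to_literature(example_masses, literature_mean_g=29.0)
print(result)
```

Because the two criteria are reported separately, a literature value can fail the standard-error test while still lying inside the observed distribution, which is the kind of distinction a single published mean cannot express.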
We predicted body mass for 27 specimens of Odocoileus virginianus from astragalus length (Figure 4; see also Table S3). The greatest length of the lateral astragalus (GLl; von den Driesch, 1976; astragalus lateral length FOVT:00,000,013) measurements for the modern deer ranged from 29.75 to 37.33 mm (see also Table S4). Modern deer body mass ranged from 21.79 to 59.93 kg (see also Table S4). The allometric relationship is $\log_{10}(y) = 2.04 + 1.45 \cdot \log_{10}(x)$ with $R^2 = 0.26$ and p value = 0.004 (Table 3). The zooarchaeological astragalus measurements fell within the range of the modern deer (31.5-39.8 mm). Likewise, the resulting body mass estimates fell within the range of modern deer (32.3-51.4 kg; see also Table S3). We also estimated body mass using the constants of slope and intercept from the original lab calculations curated in the FM-EAP (see also Table S3). We tested whether the single-value estimated body mass fell within the range of newly calculated body mass within 2 se (95% confidence interval) calculated in this study (see also Table S3). We found that the original body mass estimates did not fall within the range of predicted body mass values from this study, often being underestimates of body mass.

Figure 2. Here, we show Otospermophilus beecheyi as an example of the data cleaning process and success. Many records had unknown life stage (A); purple denotes known adults, yellow unknown life stage, and gray juveniles, which we exclude from subsequent analyses. In this example, Otospermophilus beecheyi had 108 body mass records with no life stage reported. To remedy this, we created a distribution to test whether the unlabeled data were potentially adults.
1. Non-inferred adult measurements were tested for outliers (results in B; gray bars below distributions are outliers).
2. From that set of data, we created ±3σ upper and lower limits.
3. We tested the unlabeled, non-juvenile data against those limits (results in C; gray bars below distributions are outliers). Those within the limits we kept and labeled "possible adult; possibly good"; those outside of the limits were labeled "outliers" or "possible juvenile".
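The three cleaning steps in the caption above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the FuTRES routine: the outlier test in step 1 is not specified here, so a 1.5 × interquartile-range rule stands in for it; the mapping of values below the lower limit to "possible juvenile" and above the upper limit to "outlier" is also assumed; and the function names and synthetic values are hypothetical.

```python
import statistics


def adult_limits(adult_values, n_sigma=3.0):
    """Return (lower, upper) limits derived from known-adult measurements."""
    # Step 1 (assumed rule): drop adult outliers with a 1.5 * IQR filter.
    q1, _, q3 = statistics.quantiles(adult_values, n=4, method="inclusive")
    iqr = q3 - q1
    kept = [v for v in adult_values
            if q1 - 1.5 * iqr <= v <= q3 + 1.5 * iqr]
    # Step 2: build mean +/- n_sigma * standard deviation limits.
    mu = statistics.fmean(kept)
    sigma = statistics.stdev(kept)
    return mu - n_sigma * sigma, mu + n_sigma * sigma


def flag_unlabeled(value, lower, upper):
    """Step 3: label a record that has no reported life stage."""
    if value < lower:
        return "possible juvenile"  # assumed direction for small values
    if value > upper:
        return "outlier"            # assumed direction for large values
    return "possible adult; possibly good"


# Synthetic body masses in grams, for illustration only.
known_adults = [500, 520, 540, 560, 580, 600, 620, 1500]
lower, upper = adult_limits(known_adults)

unlabeled = [{"measurementValue": v} for v in (550, 200, 900)]
for record in unlabeled:
    # The verbatim value is left untouched; only a flag column is added.
    record["measurementStatus"] = flag_unlabeled(
        record["measurementValue"], lower, upper)
print(unlabeled)
```

Writing the result to a separate flag rather than deleting rows mirrors the stated aim of letting users decide which records to carry into downstream analyses.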

iScience 25, 105101, October 21, 2022
principles. Conversely, FuTRES is a community-developed and ontologically robust trait datastore with an initial focus on mammals that is extensible broadly to the Tree of Life.
FuTRES relies on widely used trait ontologies and is synchronized with the existing, well-developed data ingestion pipeline (Stucky et al., 2018). The power of ontology and this workflow is twofold. First, as more trait terms are added, the ontology will become more flexible both in trait specificity and generality, enabling trait discovery. Second, our workflow (Walls et al., 2014; Stucky et al., 2018) connects instances of a specimen occurrence to instances of a specimen measurement. The metadata and ontological terms easily connect with other data repositories that use specimen occurrences, such as the Global Biodiversity Information Facility (GBIF; https://www.gbif.org). This facilitates encoding data values (measurementType hasValue isNumeric) and units (measurementValue hasUnits isString) into the ontology, increasing interoperability. These standardization tools also reduce the need to wrangle data [an estimated 80% of data handling time (Furche et al., 2016)], facilitating research by centralizing standardization of datasets that would otherwise be cumbersome or impossible to accomplish by individual actors.
To further increase data usability, best practices for error checking and data cleaning are incorporated into the FuTRES cleaning routine (Figure 2; see STAR Methods). We emphasize keeping verbatim fields and flagging data so that no information is lost or modified in existing data columns. Our data cleaning routine did reasonably well at removing putative outliers and providing a way to filter for adult body mass records. The cleaning routine we present here is conservative, often retaining the lower (smaller) end of measurements, which may represent erroneous data or unlabeled juveniles. Still, our dynamically derived mean trait values are, generally, close to means reported in the literature. Sometimes, the conservative data cleaning resulted in mean body masses lower than PanTHERIA and the Animal Diversity Web (ADW; Myers et al., 2020; https://animaldiversity.org). In the case of Microtus californicus (n = 3,004), the California vole, our average body mass (38.0 g) was lower than PanTHERIA (57.4 g), yet still within the range provided by ADW (38-108 g). By contrast, for Myodes rutilus (n = 15,334), the northern red-backed vole, we retained lower body mass estimates, and still the mean body mass (22.2 g) was greater than PanTHERIA (19.9 g) but well within the range of reported body mass from ADW (20-40 g). In both cases above, the sample sizes in FuTRES are in the thousands of individuals, even with potential juvenile bias, and so the mean body mass likely reflects the actual species' mean body mass better than the estimates in PanTHERIA and ADW. The benefit of the FuTRES data is that sample size is known, and each report is tied to specimen records and specimens, providing a researcher with the most information possible to make judgments about usability of the data.
These cases of conflict with other trait resources highlight one of the important benefits of FuTRES: factors that influence mean trait values, such as sample size, geographic range, age, and sex, are known and can be explicitly accounted for in downstream analyses. Access to all of the underlying specimen-level data allows researchers to make informed decisions about the quality of summary statistics. For instance, a significant difference in average body masses between PanTHERIA and FuTRES may be seemingly unimportant if the difference is small: for Pteronotus davyi, a small bat, the average body mass differs by 25 se, equating to 2 g. This amount may seem trivial, but it represents ~27% of the species' total body mass. These small differences are therefore non-trivial, with impact for inferences about species life history that may vary across space and time.
FuTRES focuses on multiple traits, often collected from the same organism, providing another significant advantage compared to many trait compilations. Reconstructing body mass is a common step in paleo- and zooarchaeological research because so many other life history traits are known to depend on body mass (Hopkins, 2018; Damuth and MacFadden, 1990; Schmidt-Nielsen, 1975). Having a large sample size of modern skeletal and body mass measurements improves reconstructing body mass of paleo- and zooarchaeological specimens. Access to datasets where skeletal measurements and body mass have both been reported allowed us to show the power of a dynamically derived allometric equation for reconstructing body mass in white-tailed deer (Odocoileus virginianus) from archaeological specimens. Our case study showcases how using multiple data resources from both neontological and zooarchaeological collections can refine body mass estimates and link across temporal scales.
However, it is worth noting that FuTRES, like other datastores, often lacks body mass measurements for large-bodied animals, and when these are present it is not always clear what state the animal was in (e.g., skinned, gutted, and preserved) when measured. Body mass measured before and after viscera are removed is dramatically different, and reconciling body mass data remains challenging when reporting about preparation methods is sparse.
While efforts to continue collecting and reporting large mammal body mass data and metadata are needed, FuTRES provides a useful means to assess data gaps and prioritize needs based on community input. Furthermore, we note the value of directed work with citizen scientists and land managers, who could help alleviate gaps in assembling body mass data for animals taken by legal hunting or culled by government land management programs. In addition to citizen science work and land manager contributions, a best practice for all field biology is to have a procedure to take body masses of animals that are sampled (live or dead) so we can begin building more extensive large mammal datasets. In the interim, Saarinen et al. (2021) suggest an approach to choosing the optimal body mass estimation regression from legacy regressions that are currently available in the literature. The authors compared the percent error of the body mass estimate for each skeletal or dental element of wild Equus to determine the best predictors for body mass. Because of the dearth of body mass data on extant large mammals, these methods are important not only for paleo- and zooarchaeological body mass reconstructions but also for estimating the body masses and body condition indices of modern large mammals, such as zebras, collected over the last two centuries.
FuTRES exists to streamline and automate the process of assembling and integrating biological trait data measured from individuals, facilitating the use of trait data in a similar way to that pioneered for genetic data by GenBank (Benson et al., 2012). FuTRES supports data producers in sharing their data and connecting to data users, with a focus on community development and best practices. The coauthors of this paper, who were active participants in workshops (https://futres.org/workshop) and post-workshop activities, are just the first step of a growing research community with strong interest in understanding the basis of phenotypes and phenotypic variation. As FuTRES matures as a data resource, researchers will be able to collect and share data more easily, and in ways that instill best practices both in data collection and reporting, while also providing a crucial means for data discovery and use that can facilitate testing new hypotheses about trait evolution in novel and unexpected ways. We call out the particular importance of linking from intra- to interspecific trait variation at the broadest scales (Read et al., 2018). Finally, tools in FuTRES allow easy tracking of the use and re-use of trait data, so field researchers can clearly document the impact of their collecting efforts, justifying funding and institutional support for new fieldwork and maintaining collections.
While the initial focus for FuTRES has been on linear measurements of mammals, the ontology structure of the data resource allows it to be expanded to handle any kind of trait for any kind of organism. There are efforts underway, for example, to add non-scalar traits to FuTRES, some of which represent ecological interactions, like shark bites or parasite load. In particular, we are exploring the expansion of measurement data from the current focus on legacy linear morphometrics to the growth area of 3D geometric morphometrics (Hernández et al., 2017) and describing landmark locations using trait semantic approaches. With the use of the FOVT application ontology, the path is already laid to begin adding trait data for other vertebrates beyond mammals, and we hope that larger communities working across the Tree of Life coalesce around individual-based trait repositories. We close by noting that FuTRES is not simply meant to be an archive of trait data, but rather a growing repository where new tools, such as the R package rfutres, and knowledge can grow. As a final example, FuTRES is actively exploring assembling real-time allometry equations that change as new data are assimilated into the datastore and cleaned for use, and providing these outputs such that the links to the data used are persistent. This approach reflects a vision of a knowledge resource that is focused around community-established best practices.
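The idea of an allometric equation that is re-derived as records accumulate can be illustrated with an ordinary least-squares fit on log10-transformed values. The Python sketch below is an assumption-laden illustration, not FuTRES code: the function names are hypothetical, the data are synthetic points generated from the equation reported in the case study (intercept 2.04, slope 1.45), and the units of the predicted mass are left unspecified because this section does not state them.

```python
import math


def fit_allometry(lengths, masses):
    """Least-squares fit of log10(mass) = intercept + slope * log10(length)."""
    xs = [math.log10(v) for v in lengths]
    ys = [math.log10(v) for v in masses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


def predict_mass(length, intercept, slope):
    """Back-transform the log10 prediction to the original mass scale."""
    return 10 ** (intercept + slope * math.log10(length))


# Synthetic reference specimens lying exactly on the reported line.
lengths = [30.0, 32.0, 34.0, 36.0, 38.0]
masses = [predict_mass(x, 2.04, 1.45) for x in lengths]
intercept_1, slope_1, r2_1 = fit_allometry(lengths, masses)

# A newly assimilated (synthetic) record shifts the fitted constants.
lengths.append(40.0)
masses.append(1.2 * predict_mass(40.0, 2.04, 1.45))
intercept_2, slope_2, r2_2 = fit_allometry(lengths, masses)
print(slope_1, slope_2)
```

Storing the fitted constants together with the identifiers of the records used for each fit would be one way to keep the link between an equation and its underlying data persistent.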

Limitations of the study
FuTRES is still in development, and so does not yet accept all types of trait data. We have concentrated on 2D linear measurements of mostly mammal appendicular elements, as reflected in our case studies. We encourage readers and future data users and contributors who wish to suggest linear measurements to submit a term request via a new issue at https://github.com/futres/fovt.
Our case studies showcase both the power and some potential limitations of individual measurements from specimens. For example, lack of reporting of life stage, which is surprisingly common in published specimen records, can make assessment of adults versus juveniles difficult and subjective, limiting use. In general, improved reporting of specimen-level metadata will increase usability downstream for research.
To overcome this challenge, we built reusable cleaning routines that will be made available in the next version of the R package rfutres (https://github.com/futres/rfutres). These can be refined further, as they likely retain some reporting of juvenile trait values. We encourage community development of enhanced methods by submitting issues on GitHub. Still, the routines provide a set of best practice approaches for cleaning datasets, including flagging data so that users can make informed decisions about data quality.
Finally, we note that our body mass comparison case study focused on a single, highly curated source (Jones et al., 2009). We are aware that there are other compilations, potentially of high quality, such as the Animal Diversity Web, whose mean estimates of body mass differ from both this study and Jones et al. (2009). Our goal is not to do a comprehensive comparison of estimates across resources but to show the power of being able to easily assemble body mass distributions built from individual-level reporting, which underlies creating any mean body mass estimates.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

Constants and needed information, such as the standard error (SE) of the slope (b), the intercept (log10(a)), and sample size, are needed to estimate log10(y), which in this case is body mass. We show our revised intercept, slope, r-squared value (R²), and p value with degrees of freedom (df) for estimating body mass compared to those derived in the 1990s in the FM-EAP with a smaller sample size (unpublished data) and used in Reitz (2008

uploaded by the community of researchers generating these data. We also created an outline for best practices in data cleaning, in an effort to preserve data that may otherwise be removed during later data filtering steps. From this cleaned set of data, we compare derived mean trait values to those in species-level databases that have been assembled based on literature. This comparison demonstrates the value of specimen-level data storage and integration efforts.
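The allometric constants named above (slope b, intercept log10(a), the standard error of the slope, R², and degrees of freedom) can be illustrated with a minimal Python sketch. It assumes an ordinary least squares fit of log10(y) = log10(a) + b log10(x), where x is a skeletal measurement and y is body mass. The function names and dictionary keys are hypothetical, and this is not the fitting procedure of the FM-EAP or of rfutres.

```python
import math


def fit_allometry(x_values, y_values):
    """Least squares fit of log10(y) = log10(a) + b * log10(x).

    x_values: skeletal measurements; y_values: body masses (both positive).
    Returns the intercept log10(a), slope b, SE of the slope, R^2, and df.
    """
    log_x = [math.log10(x) for x in x_values]
    log_y = [math.log10(y) for y in y_values]
    n = len(log_x)
    if n < 3:
        raise ValueError("need at least 3 specimens to estimate a standard error")

    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    syy = sum((y - mean_y) ** 2 for y in log_y)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares around the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(log_x, log_y))
    df = n - 2
    se_slope = math.sqrt(ss_res / df / sxx)
    r_squared = 1 - ss_res / syy

    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope": se_slope,
        "r_squared": r_squared,
        "df": df,
        "n": n,
    }


def predict_mass(fit, measurement):
    """Back-transform the fitted line to estimate body mass from one measurement."""
    return 10 ** (fit["intercept"] + fit["slope"] * math.log10(measurement))
```

Refitting with `fit_allometry` each time new cleaned specimens arrive is one simple way to picture equations that update as the datastore grows. A fuller treatment would also propagate prediction uncertainty and correct for back-transformation bias.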

Data collection
An impetus for FuTRES was to make accessible trait data that are already available but lack standardization or are effectively hidden in current published datasets. Through the FuTRES team and our initial FuTRES workshop in summer 2019 (https://futres.org/workshop2019), we amassed and integrated into the FuTRES datastore seven mammalian species metric datasets (Table 1). Besides new VertNet data, the modern data also include smaller datasets of Puma concolor (cougar) weight (intact, skinned, and gutted) and total length from Oregon Department of Fish and Wildlife (2020), Odocoileus virginianus (white-tailed deer; K. Emery) from Georgia and Florida with intact body mass and various post-cranial skeletal measurements from von den Driesch (1976), Otospermophilus beecheyi (California ground squirrel; Blois et al., 2008) from California with soft tissue measurements, body mass and toothrow length, and Aepyceros (impala; A. Villaseñor) from east Africa with various cranio-dental and post-cranial measurements following von den Driesch (1976). White-tailed deer and California ground squirrel datasets contain a mix of whole carcass measurements and skeletal measurements, allowing for linkages between traits. The zooarchaeological datasets include two archaeological datasets on Odocoileus virginianus (white-tailed deer): one from the Florida Museum Environmental Archaeology Program (FM-EAP) collections, which was ingested into FuTRES, and one from Reitz et al. (2010) (Table S3). A key paleontological resource is a database of over 20,000 records of fossil equid specimen-based cranio-dental and post-cranial measurements following Eisenmann (1988; Bernor et al., 1997) from R.L. Bernor with a global distribution spanning 16 mya to recent. The paleo- and zooarchaeological datasets are heavily curated with large numbers of skeletal trait metrics.
Together, these datasets encompass 3,958 species, over two million measurement records, and 12 traits (discussed below; Table 1), with more traits to be added. The original data are stored in the CyVerse Discovery Environment (https://de.cyverse.org). Below we describe how these datasets were ingested into FuTRES and show their value and utility for enabling new research.

Back end Workflow
The FuTRES data processing workflow improves interoperability of datasets by standardizing metadata and trait names to ontologies and data standards (Figure 1). We built upon an existing ingest pipeline (Stucky et al., 2018) by modifying it for vertebrates and for three intersecting disciplines (paleo-, zooarchaeo-, and neontology). The workflow includes four steps: preprocessing, converting the data to RDF-OWL triples, reasoning (inferring additional facts based on the ontology), and exporting to a semantic toolkit, GEOME (Genomic Observations MetaDatabase; Deck et al., 2017; https://geome-db.org), which tracks metadata and validates datasets. Here we focus on preprocessing, because the other steps remain largely unchanged from Stucky et al. (2018). Preprocessing includes identification of the minimum set of metadata terms required for paleo-, zooarchaeo-, and neontology, standardization of column headers, and standardization of trait terms. The preprocessing steps below cover existing datasets requiring conversion and transformation before proceeding to the additional processing steps. Datasets can also be submitted directly to GEOME using the FuTRES Sample Project Template Generator, which automatically creates a datasheet with the required fields and their definitions, thereby reducing the need for preprocessing. We have a tutorial for data uploading available online (https://futres.org/data_tutorial). We additionally made a web application (in beta; https://github.com/futres/RShinyFuTRES) to re-format legacy datasets so that they can be uploaded into GEOME.
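The header and trait-term standardization described above can be pictured with a short Python sketch. It assumes a legacy spreadsheet with one column per trait and converts each row into one standardized record per measurement. The legacy headers, target field names, and trait labels are all hypothetical placeholders; the actual FuTRES template and FOVT terms may differ, and this is not the FuTRES preprocessing code.

```python
# Hypothetical mapping from legacy metadata headers to standardized field names.
METADATA_HEADERS = {
    "Species": "scientificName",
    "Catalog No.": "materialSampleID",
    "Locality": "locality",
}

# Hypothetical mapping from legacy trait columns to (trait term, unit).
TRAIT_HEADERS = {
    "Wt (g)": ("body mass", "g"),
    "TL (mm)": ("body length", "mm"),
}


def standardize(rows):
    """Turn wide legacy rows into long records with standardized headers.

    rows: list of dicts keyed by legacy column header, with string values.
    Empty trait cells are skipped rather than emitted as missing records.
    """
    records = []
    for row in rows:
        # Rename the metadata columns we recognize and trim stray whitespace.
        metadata = {
            METADATA_HEADERS[header]: value.strip()
            for header, value in row.items()
            if header in METADATA_HEADERS
        }
        # Emit one record per non-empty trait measurement.
        for header, (trait, unit) in TRAIT_HEADERS.items():
            raw = row.get(header, "").strip()
            if not raw:
                continue
            records.append(
                {
                    **metadata,
                    "measurementType": trait,
                    "measurementValue": float(raw),
                    "measurementUnit": unit,
                }
            )
    return records
```

Keeping the mappings in plain dictionaries makes them easy to review with data providers before any records are converted.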

Template
All datasets require a minimum amount of metadata (e.g., a title, description, ownership). After capturing these dataset-level metadata, data were mapped to a template that standardized column headings and data types to ensure reproducibility and facilitate creation of RDF triples (Figure 1). We decided which columns (i.e., metadata) to include through consultation with a group of disciplinary experts during the summer 2019 workshop as well as with specific data providers (Data S1). We encourage the use of uniform resource identifiers (URIs) linking to associated data whenever possible. The template requires the

Ontology
To standardize trait terms, we used UBERON (Uber-anatomy ontology; Mungall et al., 2012), the species-neutral ontology for animal anatomy, and OBA (Ontology of Biological Attributes; Dönitz and Wingender, 2012) for traits. Because the timing of their release schedules would delay our addition of new terms to these ontologies, we have created the trait terms we need in an application ontology, the FuTRES Ontology for Vertebrate Traits (FOVT; https://obofoundry.org/ontology/fovt.html). The FOVT trait classes will be replaced by OBA terms as soon as they are released. The hierarchical arrangement of trait ontology terms allows for flexibility and integration across taxa and disciplines that measure traits differently. For example, if "humerus length" is the measurement of interest, the ontology allows for differing degrees of specificity. One could select known specific endpoints for humerus length (trochlea to caput; trochlea to ventral tubercle, etc.). If a data curator does not know which specific term to use, or if the researcher extracting the information is only curious about general measures of "humerus length" across taxa, then the general term "humerus length" can still be used. Because of the nested hierarchy, a search on the general term will return humerus lengths for all the ways it is measured. This allows the data captured to be both precise and flexible for the user and contributor.
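The nested search behavior described above can be sketched in a few lines of Python. The sketch assumes a toy child-to-parent table in place of the real ontology. The term labels are paraphrased from the humerus example and are not actual FOVT identifiers, and in FuTRES this inference is done by ontology reasoning, not by a lookup like this.

```python
# Toy hierarchy: each specific term points to its more general parent term.
# Labels are illustrative only and do not correspond to real FOVT classes.
PARENT = {
    "humerus length (trochlea to caput)": "humerus length",
    "humerus length (trochlea to ventral tubercle)": "humerus length",
}


def is_a(term, ancestor):
    """Return True if term equals ancestor or descends from it."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT.get(term)  # None once we reach a top-level term
    return False


def search(records, query_term):
    """Return every record whose trait is the query term or a subtype of it."""
    return [record for record in records if is_a(record["trait"], query_term)]
```

With this structure, a query on the general label returns records annotated with either specific endpoint as well as those annotated only with the general term, while a query on a specific label returns only the matching subset.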

Data validation
Once the data are processed and standardized, they are uploaded and validated in GEOME. In GEOME, researchers can access the template (described above) and/or uploaded data. Data validation in GEOME reports validation errors to data submitters and helps users fix their data. GEOME and VertNet data are then aggregated and processed using a data processing workflow (https://github.com/futres/fovt-datapipeline), which performs final validation steps, triplifies, reasons, and then loads reasoned data into a document store (ElasticSearch). Data reasoning is computed using the ontology-data-pipeline codebase (https://github.com/biocodellc/ontology-data-pipeline), which is available as a Docker container and draws on FOVT. After data are validated, integrated, and reasoned, the reasoned data are loaded into an ElasticSearch database, where they are made available to researchers through the FuTRES website and an API (application programming interface), and where researchers can visualize taxonomic and trait coverage. FuTRES data resources are also available via a prototype web portal that provides a simple faceted search approach for filtering by species, datasets, and traits of interest.

Data cleaning
We developed a prototype data cleaning toolkit, first applied to body mass, body length, and tail length, but usable for all measurement traits. This cleaning toolkit is especially valuable for cases of automated trait extractions from heterogeneous reporting such as in the VertNet dataset, where trait values may either be misreported in the original record or assembled improperly during automated extraction by Traiter (https://github.com/rafelafrance/traiter). A key goal of the data cleaning effort was to provide a means to help users find and filter the most credible reports of adult trait values. This required both flagging improbable values and determining whether records without life stage reporting could be inferred as adults (see example in Figure 2, panel A). We developed an R-based (R Core Team, 2018) workflow to check for outliers on the full dataset. To accomplish this, we create a column, "measurementStatus", to report if the datum is an "outlier", if there are "too few records" to check, or if it is a "possible juvenile". First, we check whether a species has at least 10 records (otherwise labeled "too few records" in measurementStatus). The workflow starts with a Mahalanobis Distance outlier test using the package OutlierDetection (https://CRAN.R-project.org/package=OutlierDetection) in R for known adults where body mass units were recorded (i.e., non-inferred values), which is used in the case studies (below; Figure 2,