Long-term spring through fall capture data of Eptesicus fuscus in the eastern USA before and after white-nose syndrome

Emerging infectious diseases threaten wildlife populations. Without well monitored wildlife systems, it is challenging to determine accurate population and ecosystem losses following disease emergence. North American temperate bats present a unique opportunity for studying the broad impacts of wildlife disease emergence, as their federal monitoring programs were prioritized in the USA throughout the 20th century and they are currently threatened by the invasive fungal pathogen, Pseudogymnoascus destructans (Pd), which causes white-nose syndrome. Here we provide a long-term dataset for capture records of Eptesicus fuscus (big brown bat) across the eastern USA, spanning 16 years before and 14 years after Pd invasion into North America. These data represent 30,496 E. fuscus captures across 3,567 unique sites. We encourage the use of this dataset for quantifying impacts of wildlife disease and other threats to wildlife (e.g., climate change) with the incorporation of other available data. We welcome additional data contributions for E. fuscus captures across North and Central America as well as the inclusion of other variables into the dataset that contribute to the quantification of wildlife health.


Value of the Data
• Understanding how wildlife populations change following an impact is critical because human-induced disturbances can influence impact frequency and magnitude, as suggested for the growing number of emerging infectious diseases in wildlife [4, 5].
• Emerging infectious diseases can drive wildlife to, or near, extinction, and without sufficient data prior to disease emergence, management strategies to conserve wildlife species may be unsuccessful.
• Without data before disease emergence, calculating true losses to ecosystem services and function is extremely challenging. Thus, well monitored wildlife systems can help clarify how populations change following impacts from emerging infectious disease.
• Here, we pair long-term capture records of Eptesicus fuscus (big brown bats) with the spatiotemporal spread of Pseudogymnoascus destructans (Pd), the invasive fungal pathogen causing white-nose syndrome [6, 7]. These data consist of spring through fall capture records spanning 16 years before and 14 years after the first detection of Pd in New York, USA, in 2006, where it was introduced from Eurasia [6, 8-10].
• This E. fuscus capture dataset provides spatial and temporal data for both wildlife host and pathogen spread, but its use is not limited to disease impact research. These data can be paired with other datasets describing other conservation threats to bats at a county-level spatial resolution (e.g., climate change, agricultural intensification and urbanization, insect declines).
• We welcome additional contributions to this dataset for past and future E. fuscus capture records across the species' range throughout North and Central America. We also welcome the inclusion of additional variables in the dataset, such as Pd intensity and/or any metric representative of bat health upon capture (e.g., heavy metal concentrations, differential white blood cell counts, cortisol concentration, other pathogens present).

Objective
Our objective is to provide accessible, long-term data for a well monitored wildlife system impacted by an emerging infectious disease. Bat host species that are less susceptible to Pd infection (relative to highly susceptible species) persist despite annual winter infections and can thus inform how long-term pathogen exposure impacts persisting host populations. Eptesicus fuscus (big brown bat) is classified as less susceptible to Pd infection, and its populations persist despite annual winter infections [11, 12]. E. fuscus in the eastern USA are relatively well monitored in spring through fall months (compared to other wildlife systems) because they are a common by-catch during summer capture surveys for the federally endangered Myotis sodalis (Indiana bat), setting up the opportunity for a long-term dataset representing the periods before and after the impact of an emerging infectious disease. Here, we provide a description of a dataset representing 30 years of E. fuscus spring through fall capture data across 11 eastern USA states paired with Pd introduction and invasion timing [1]. This dataset was previously used to quantify morphometric trait shifts [3] and capture rate changes across space over Pd invasion time (Simonis et al., in review). We encourage future contributions to this dataset and its use for impact studies requiring long-term wildlife records. Future contributions to this dataset can be made by contacting the corresponding author of this manuscript, and newly integrated data will be made available through Dryad Digital Repository [1].

Data Description
This dataset [1] consists of 30,496 E. fuscus captured across 11 eastern USA states. Overall, this dataset [1] includes E. fuscus capture records representing 14,162 adult females (Fig. 2A), 9,967 adult males (Fig. 2B), 3,000 juvenile females (Fig. 2C) and 3,367 juvenile males (Fig. 2D). The number of E. fuscus capture records within these age and sex demographics varies across the year of capture but, in general, increases over time (Fig. 2). Within adult-aged E. fuscus, the number of capture records within the dataset also varies by reproductive status at the time of capture. For adult females, the dataset consists of 1,928 non-reproductive bats, 2,085 pregnant bats, 5,044 lactating bats and 5,105 post-lactating bats (Fig. 3). For adult males, 5,697 bats were captured as non-reproductive and 4,270 bats were captured with descended testes (Fig. 3). Finally, mass and forearm length varied by E. fuscus age and sex at time of capture, with juveniles generally weighing less than adults (Fig. 4A) and no distinguishable differences in forearm lengths by age across the dataset (Fig. 4B).
The number of E. fuscus capture records also generally increased across Pd invasion time-steps (Fig. 5A); however, the number of capture sites also increased, causing the number of bats captured per site within each time-step (effort) to remain relatively stable for adult and juvenile E. fuscus (Fig. 5B). E. fuscus captures also generally increased over the years since confirmed or suspected Pd invasion within each time-step (Fig. 5C). E. fuscus capture data ("SIMONIS_et_al_BigBrownBatData_Dryad.csv") and variable descriptions ("SIMONIS_et_al_BigBrownBat_README_20220725.txt") can be accessed via Dryad Digital Repository [1]. Code for the descriptive data figures (Figs. 1-5) presented in this manuscript ("DataDescriptor_20221026.Rmd") with its associated description ("README.md") can be accessed via MCS's GitHub page (https://github.com/simonimc/Eptesicus_fuscus_big_brown_bat_data_descriptor_figures) and/or through Zenodo [2].

Experimental Design, Materials and Methods
We gathered and collated historical E. fuscus mist net capture data from 11 USA states between July 2018 and May 2021 (Fig. 6). Data were opportunistically collected from federal wildlife agencies, state wildlife and natural resource agencies, and individual wildlife researchers. Government and academic representatives were contacted via email, and those with available data (listed as coauthors above or acknowledged below) provided state mist net capture data through email communication. Variables of interest within the gathered E. fuscus data included: date of capture (month, day and year), USA state of capture (Georgia, Illinois, Indiana, Kentucky, Mississippi, New York, North Carolina, Ohio, Pennsylvania, Tennessee, or Virginia), site name of capture, county of capture, demographic state of individual bat (adult or juvenile), sex of individual bat (male or female), reproductive status of individual bat (female: non-reproductive, pregnant, lactating, post-lactating; male: non-reproductive, testes-descended), mass of individual bat (g) and forearm length of individual bat (mm).
Once collected and collated, the raw dataset consisted of 40,689 individual E. fuscus captures. If the date of capture, demographic state, sex, reproductive status, mass or forearm length was missing from an individual E. fuscus capture entry (left blank or entered as varying versions of "unknown" or "NA"), the entry was removed from the raw dataset. We also removed entries with unclear reproductive status. For example, if an adult female was marked as "reproductive" without indication of reproductive stage (i.e., pregnant, lactating or post-lactating), the entry was removed. For the "age" (demographic state), "sex" and "repstat" (reproductive status) variables, we grouped varying versions of entries into distinct levels within each variable using the filtering and find-and-replace tools in Microsoft Excel. Within the "age" variable, entries such as "A", "AD" and "Adult" were labeled as "adult" and entries like "J", "Juvi" and "JUV" were labeled as "juvenile". Within the "sex" variable, reported values such as "F", "Female" and "fem" were labeled as "female" and "M" and "Male" as "male". The "repstat" variable underwent a similar process where entries such as "N", "NR", "up" and "non" became "non-reproductive"; entries such as "P", "PR" and "PG" became "pregnant"; entries such as "L", "lact" and "LA" became "lactating"; entries such as "PL", "post lac" and "post" became "post-lactating"; and labels such as "TD", "scrotal", "S" and "down" became "testes-descended". Finally, in instances where entries for "age" and "sex" and/or "repstat" were placed under the wrong variable, we manually corrected the entry. For example, if a bat had an "age" of "post-lactating", a "sex" of "adult" and a "repstat" of "female", we corrected the entry so the bat had an "age" of "adult", a "sex" of "female" and a "repstat" of "post-lactating".
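The label harmonization described above was done by hand in Microsoft Excel. As an illustration only, the same grouping can be sketched in Python. The lookup tables below contain only the example variants quoted in the text, matching is assumed to be case-insensitive, and the function names (harmonize, clean_record) are hypothetical rather than part of the authors' workflow.

```python
def _build_levels(groups):
    """Invert {level: [variants]} into a {variant: level} lookup."""
    return {v: level for level, variants in groups.items() for v in variants}

# Variants are the examples quoted in the text, lower-cased (assumption:
# matching is case-insensitive). Real data would need longer lists.
AGE = _build_levels({"adult": ["a", "ad", "adult"],
                     "juvenile": ["j", "juvi", "juv", "juvenile"]})
SEX = _build_levels({"female": ["f", "female", "fem"],
                     "male": ["m", "male"]})
REPSTAT = _build_levels({
    "non-reproductive": ["n", "nr", "up", "non", "non-reproductive"],
    "pregnant": ["p", "pr", "pg", "pregnant"],
    "lactating": ["l", "lact", "la", "lactating"],
    "post-lactating": ["pl", "post lac", "post", "post-lactating"],
    "testes-descended": ["td", "scrotal", "s", "down", "testes-descended"],
})
MISSING = {"", "na", "unknown"}


def harmonize(value, levels):
    """Return the standard level, or None if missing or unrecognized."""
    key = (value or "").strip().lower()
    if key in MISSING:
        return None
    return levels.get(key)


def clean_record(record):
    """Return a copy with harmonized labels, or None if the entry is dropped.

    An ambiguous status such as "reproductive" is not in the lookup, so the
    entry is dropped, as described in the text. Entries filed under the wrong
    variable were corrected manually by the authors and are not handled here.
    """
    cleaned = dict(record)
    for field, levels in (("age", AGE), ("sex", SEX), ("repstat", REPSTAT)):
        level = harmonize(record.get(field), levels)
        if level is None:
            return None
        cleaned[field] = level
    return cleaned
```

In this sketch an unrecognized label leads to removal of the entry; in practice such entries would more likely be set aside for manual review before any deletion.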
We filtered the dataset to keep only E. fuscus captures that occurred between March and October, which corresponds to records for spring through fall. We kept only spring through fall records because 1) bats in winter months are typically not active when they are captured and, thus, different capture methods are used and 2) winter records were not explicitly requested. If neither the county of capture nor the latitude and longitude of capture were provided, those entries were also removed, except in cases where the site name or site description could be traced to the county level online. When county of capture was provided, we used the reported county name. If latitude and longitude of capture were provided without a county of capture, we determined the county of capture by linking the spatial point provided to its respective county using the sp, maps, and maptools packages in the statistical environment R [13-17]. To do so, we adapted publicly available R code for a function created to match a spatial point to a state so that it instead matched the point to a county (see https://github.com/simonimc).
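The two retention rules in this paragraph (capture month and availability of location information) can be expressed as a single predicate. The following is a minimal sketch, not the authors' code; the function name keep_capture and its parameters are hypothetical, and the manual exception for sites that could be traced to a county online is not modeled.

```python
from datetime import date

# March (3) through October (10), the spring-through-fall window in the text.
SPRING_THROUGH_FALL = range(3, 11)


def keep_capture(capture_date, county=None, latitude=None, longitude=None):
    """Return True if a capture record passes the season and location rules."""
    in_season = capture_date.month in SPRING_THROUGH_FALL
    # A record needs either a county name or a full coordinate pair.
    has_location = bool(county) or (latitude is not None and longitude is not None)
    return in_season and has_location


# Example: a July capture with coordinates only is kept; a January capture is not.
print(keep_capture(date(2010, 7, 15), latitude=39.7, longitude=-84.0))
print(keep_capture(date(2010, 1, 15), county="Greene"))
```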
If the site of capture was not provided, we allowed a site description to take its place if available. For example, for a general location described as "Ash Creek @ Hwy", the site name became "Ash Creek @ Hwy". If neither a site nor a description was provided, we created a site name using the naming pattern "No Site Name <county of capture>". Site names were then manually cleaned in OpenRefine version 3.5 (available at https://openrefine.org/) to ensure variations in site names within each county and state were labeled as a single site. For example, if site names were labeled by capture nets (e.g., "Site 1 Net A", "Site 1 Net B"), we pooled nets under the single site name (e.g., "Site 1 Net A" and "Site 1 Net B" both became "Site 1"). Similarly, if multiple site names were listed under the same general location (e.g., "Mammoth Cave National Park Site 1", "Mammoth Cave National Park Site 2"), we pooled those sites under the general location (e.g., "Mammoth Cave National Park Site 1" and "Mammoth Cave National Park Site 2" both became "Mammoth Cave National Park"). Finally, if spelling errors occurred across a single site name, we corrected the site name to the correct spelling (e.g., "Mamoth Cav National Park" became "Mammoth Cave National Park").
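The site-name fallback and the pooling of nets were carried out manually in OpenRefine. A rough Python sketch of the two simplest rules is given below for illustration. The function names and the regular expression are assumptions; the pattern is deliberately crude (it would also strip a trailing "Net ..." that is genuinely part of a place name), which is one reason manual review of site names is preferable.

```python
import re

# Matches a trailing net label such as " Net A" or " net 2" (assumed pattern).
NET_SUFFIX = re.compile(r"\s+Net\s+\w+$", flags=re.IGNORECASE)


def choose_site_name(site, description, county):
    """Prefer the site name, then the site description, then a placeholder."""
    for candidate in (site, description):
        if candidate and candidate.strip():
            return candidate.strip()
    return f"No Site Name {county}"


def pool_nets(site_name):
    """Collapse net-level labels into their parent site name."""
    return NET_SUFFIX.sub("", site_name)


print(choose_site_name(None, "Ash Creek @ Hwy", "Adair"))
print(pool_nets("Site 1 Net A"))
```

Pooling sub-sites under a general location and correcting misspellings depend on local knowledge of the sites and are not attempted in this sketch.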
Due to the sensitivity of some of these data for disclosing locations of federally listed endangered or threatened bat species, we masked location and site data. Masking location and site data across the entire dataset also upheld data agreement terms and conditions contracted with Indiana, Kentucky and Tennessee. Once the county of each bat captured was identified as described above, we determined a county centroid point for each individual capture using the housingData package [18]. Thus, we created additional variables within the dataset for the latitude and longitude of those county centroid points, setting the spatial resolution of bat captures at the county level. To further ensure sensitive geographic data were not exposed, we also masked site names within the dataset. We masked sites by labeling each site with a unique identifier within the state. For example, "Site 1" in Georgia became "GA_01". Following initial data collation and cleaning, 30,496 individual bat captures across 3,567 unique sites remained in the dataset (Fig. 6).
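Site masking of this kind amounts to assigning each distinct site a sequential identifier within its state. The sketch below shows one way to do so. It assumes that a site is identified by its state, county and cleaned site name, that identifiers are numbered in order of first appearance, and that state prefixes are postal abbreviations (only "GA" is confirmed by the example above); the function name and field names are hypothetical.

```python
def mask_sites(records, state_codes):
    """Replace site names with per-state sequential identifiers (e.g., GA_01).

    records: list of dicts with "state", "county" and "site" keys (assumed names).
    state_codes: mapping from state name to prefix, e.g., {"Georgia": "GA"}.
    """
    next_number = {}  # state -> last number issued
    assigned = {}     # (state, county, site) -> masked identifier
    masked = []
    for record in records:
        state = record["state"]
        key = (state, record["county"], record["site"])
        if key not in assigned:
            next_number[state] = next_number.get(state, 0) + 1
            assigned[key] = f"{state_codes[state]}_{next_number[state]:02d}"
        # Drop the original site name so it is not carried into the output.
        out = {k: v for k, v in record.items() if k != "site"}
        out["site_id"] = assigned[key]
        masked.append(out)
    return masked


example = [
    {"state": "Georgia", "county": "Clarke", "site": "Site 1"},
    {"state": "Georgia", "county": "Clarke", "site": "Site 2"},
    {"state": "Georgia", "county": "Clarke", "site": "Site 1"},
]
print([r["site_id"] for r in mask_sites(example, {"Georgia": "GA"})])
```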
In addition to E. fuscus capture records, we added variables for the geographic spread over time of the invasive fungal pathogen Pseudogymnoascus destructans (Pd), which causes white-nose syndrome in North American temperate bats. Using Pd surveillance information publicly provided by the US Geological Survey in a map application (https://whitenosesyndrome.org/where-is-wns), we determined the year of Pd introduction within each state of capture as the earliest year of confirmed or suspected Pd occurrence. The year of confirmed or suspected Pd detection is indicated by county color within the map application and was confirmed through model predictions from the US Geological Survey [19]. Using the year of Pd introduction for each state as the baseline, we subtracted this year from the year of each individual E. fuscus capture to standardize the timing of pathogen spread across the eastern USA. Therefore, the year of Pd introduction was set at '0', with negative integers representing years prior to Pd introduction and positive integers representing years following Pd introduction within each state of capture. From this variable ("years_Pd"), we created another variable ("disease_time_steps") categorizing pathogen occurrence timing into invasion time-steps [3, 12, 20]. These time-steps included pre-invasion years (<0 years since Pd introduction), invasion years (0-1 years since Pd introduction), epidemic years (2-4 years since Pd introduction) and established years (5+ years since Pd introduction).
We used these time-steps to remain consistent with pathogen occurrence time groups within the white-nose syndrome literature [3, 12], in lieu of unavailable pathogen prevalence data.
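The two derived variables can be written out explicitly. In the sketch below, "years_Pd" is computed as capture year minus the state's Pd introduction year, so that pre-introduction captures are negative as described, and the time-step boundaries follow the definitions given above. The function names and the exact label strings are assumptions; the labels stored in the published file may be spelled differently.

```python
def years_since_pd(capture_year, pd_intro_year):
    """years_Pd: negative before Pd introduction, 0 in the introduction year."""
    return capture_year - pd_intro_year


def disease_time_step(years_pd):
    """Bin years since Pd introduction into the four invasion time-steps."""
    if years_pd < 0:
        return "pre-invasion"
    if years_pd <= 1:
        return "invasion"      # 0-1 years
    if years_pd <= 4:
        return "epidemic"      # 2-4 years
    return "established"       # 5+ years


# Example using New York, where Pd was first detected in 2006.
for year in (2004, 2006, 2009, 2015):
    y = years_since_pd(year, 2006)
    print(year, y, disease_time_step(y))
```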
To validate data (Fig. 6), we removed individual E. fuscus capture entries that had inconsistencies in reporting using the filtering tool in Microsoft Excel. For example, if an E. fuscus record was marked as a male with a female reproductive status (e.g., pregnant male), it was eliminated from the dataset. Additionally, if a juvenile (bat within its first summer of life) female bat was marked with an adult-only reproductive status (i.e., pregnant, lactating or post-lactating), the entry was removed. Juvenile males were allowed an adult reproductive status (testes-descended) because they begin to physically present as reproductive in late summer/early fall. These validation steps removed 29 additional entries from the dataset.
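The consistency rules above were applied with Excel filters. They can also be stated as a small predicate, sketched here with the standardized labels used earlier in this section. The function name is hypothetical, and the rule rejecting females recorded as "testes-descended" is an added assumption that the text does not state explicitly.

```python
FEMALE_ONLY = {"pregnant", "lactating", "post-lactating"}


def is_consistent(age, sex, repstat):
    """Return False for demographic combinations treated as reporting errors."""
    # Males cannot carry a female reproductive status (e.g., "pregnant male").
    if sex == "male" and repstat in FEMALE_ONLY:
        return False
    # Assumption (not stated in the text): the reverse mismatch is also an error.
    if sex == "female" and repstat == "testes-descended":
        return False
    # Juvenile females cannot carry an adult-only reproductive status.
    if age == "juvenile" and sex == "female" and repstat in FEMALE_ONLY:
        return False
    # Juvenile males may be testes-descended in late summer or early fall.
    return True
```

For example, is_consistent("juvenile", "male", "testes-descended") returns True, whereas is_consistent("adult", "male", "pregnant") returns False.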
We also explored the ranges of mass (g) and forearm lengths (mm) within the dataset. Average adult E. fuscus body mass and forearm length have been reported at approximately 17.6 g [21] and 45.8 mm [22]. Juvenile body mass and forearm length have historically averaged 12 g and 45.2 mm [22]. Within the data collated here, adult body mass and forearm length averaged 18.8 g (range: 9.2 g to 34.5 g) and 46.5 mm (range: 32.7 mm to 59.0 mm), and juvenile mass and forearm length averaged 15.7 g (range: 6.0 g to 29.7 g) and 46.0 mm (range: 32.0 mm to 55.0 mm). Because E. fuscus masses and forearm lengths have not been collated with such a large sample size in the past, we kept all remaining entries within the dataset due to their close proximity to historical averages for mass and forearm length. Following data validation, 30,496 individual bat captures across 3,567 unique sites remained within the final dataset (Fig. 6) [1].

Ethics Statements
Collection and collation of the data described here did not involve human subjects, animal experiments, or data collection through social media platforms. All data contracts and distribution policies required by primary data sources were complied with.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Big brown bat (Eptesicus fuscus) capture records before and after white-nose syndrome (Original data) (Dryad).
simonimc/Eptesicus_fuscus_big_brown_bat_data_descriptor_figures: Eptesicus_fuscus_big_brown_bat_data_descriptor_figures (Reference data) (Zenodo).

Acknowledgments
Lock, Taylor Verrett, Kristin Dyer and Meagan Allira for valuable feedback on clarity in various sections of the manuscript. We would also like to acknowledge the Wright State University Department of Biological Sciences' Biology Award for Research Excellence for funding materials needed for Yasmeen Samar's contributions to this manuscript and Wright State University's Environmental Sciences PhD Program for funding Molly C Simonis during the creation of this dataset. Data collection from Virginia for this publication was completed with funds provided by the Virginia Department of Wildlife Resources using resources from the national Wildlife Restoration program provided by the U.S. Fish and Wildlife Service. The findings and conclusions in this article are those of the authors and do not necessarily represent the views of the U.S. Fish and Wildlife Service.