Rapid digitization to reclaim thematic maps of white-tailed deer density from 1982 and 2003 in the conterminous US

Background Despite availability of valuable ecological data in published thematic maps, manual methods to transfer published maps to a more accessible digital format are time-intensive. Application of object-based image analysis makes digitization faster. Methods Using object-based image analysis followed by random forests classification, we rapidly digitized choropleth maps of white-tailed deer (Odocoileus virginianus) densities in the conterminous US during 1982 and 2001 to 2005 (hereafter, 2003), allowing access to deer density information stored in images. Results The digitization process took about one day per deer density map, of which about two hours was computer processing time; processing time will differ due to factors such as resolution and number of objects. Deer were present in 4.75 million km² (60% of the area) and 5.56 million km² (70%) during 1982 and 2003, respectively. Population and density in areas with deer presence were 17.15 million and 3.6 deer/km² during 1982 and 29.93 million and 5.4 deer/km² during 2003. Greatest densities were 7.2 deer/km² in Georgia during 1982 and 14.6 deer/km² in Wisconsin during 2003. Six states had deer densities ≥9.8 deer/km² during 2003. Colorado, Idaho, and Oregon had greatest increases in population and area of deer presence, and deer expansion is likely to continue into western states. Error in these estimates may be similar to error resulting from differential reporting by state agencies. Deer densities likely are within historical levels in most of the US. Discussion This method rapidly reclaimed the informational value of deer density maps and similarly may be applied to digitize a variety of published maps to geographic information system layers, which permit greater analysis.


INTRODUCTION
Researchers increasingly are developing and improving tools that can be applied to a range of topics. Object-based image analysis is a relatively new avenue of research, with few publications referencing use before 2010. Current application and development of object-based image analysis primarily is for remote sensing objectives of land use and cover mapping, but this analysis can be applied to achieve other objectives.

1994-1999 (hereafter, 1996), and 2001-2005 (i.e., 2003) maps of deer density (Adams, Hamilton & Ross, 2009) for the conterminous United States. The maps grouped deer densities into four colored classes: <5.8 deer/km², 5.8-11.6 deer/km², 11.6-17.4 deer/km², and >17.4 deer/km² (Fig. 1). Because the 1996 and 2003 maps were very close in time, we used the 1996 map to fill information in the 2003 map that was not provided by state wildlife agencies, primarily in Nebraska and Colorado (Adams, Hamilton & Ross, 2009). The images were JPG and PNG files, which contained a stack of red, green, and blue bands. We imported these images directly into ArcGIS Pro (v2.2, ESRI, Redlands, CA). We set the projections to Albers equal area conic USGS version. We then georeferenced the images to a United States county layer using a third order polynomial transformation.
Once georeferenced, we imported the layers into eCognition (v9.3.2, Trimble, Westminster, CO) and built image objects (i.e., object-based image analysis delineates homogeneous pixels into shapes or polygons), using multi-resolution segmentation with all bands weighted equally. Because the images had poor resolution, the borders mixed in color, and thus, we applied a small scale factor to minimize mixed color objects and to delineate borders of deer density classes.
To determine the deer density classes of the image objects, we applied random forests classification, which combines many classification trees built from random samples to produce a strong model. We manually assigned deer density classes and no data values to a sample of image objects; this sample, along with the mean and standard deviation of each color band (red, green, and blue), was the training input for a random forests classification (R Core Team, 2018; Supplement 1 provides example R code). Random forests used the relationship between the known deer densities and color variables to automatically assign deer density classes to all image objects. Random forests classification of image objects appeared both to be faster and to produce a better result, requiring less manual correction, than nearest neighbor classification in eCognition, based on classification of deer densities and our experience with remote sensing work, although we did not formally document processing and correction times.
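The classification idea can be illustrated with a minimal sketch. This is not the R code of Supplement 1; it is a toy random forest written with the Python standard library only, assuming each image object is described by six features (mean and standard deviation of the red, green, and blue bands). The palette values, sample sizes, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces impurity, or None."""
    best, best_score, n = None, gini(labels), len(rows)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if left and right:
                score = (len(left) * gini(left) + len(right) * gini(right)) / n
                if score < best_score:
                    best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, n_try, depth=0, max_depth=8):
    """Grow one tree; each node considers a random subset of features."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return majority
    f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in lo], [y for _, y in lo], rng, n_try, depth + 1, max_depth),
            grow_tree([r for r, _ in hi], [y for _, y in hi], rng, n_try, depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root to a leaf; leaves are class labels (strings)."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Fit each tree to a bootstrap sample of the training objects."""
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng, n_try))
    return forest

def predict_forest(forest, row):
    """Majority vote across all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Hypothetical mean RGB per class; NOT values measured from the deer maps.
PALETTE = {"<5.8": (250, 230, 150), "5.8-11.6": (240, 170, 70),
           "11.6-17.4": (220, 100, 40), ">17.4": (160, 30, 30),
           "no data": (255, 255, 255)}

rng = random.Random(42)
train_rows, train_labels = [], []
for label, rgb in PALETTE.items():
    for _ in range(20):  # 20 manually labelled objects per class (assumed)
        means = [c + rng.uniform(-4, 4) for c in rgb]
        sds = [rng.uniform(0, 10) for _ in rgb]
        train_rows.append(means + sds)
        train_labels.append(label)

forest = fit_forest(train_rows, train_labels)
new_object = [238, 172, 72, 4, 5, 3]  # mean R, G, B then SD R, G, B
predicted = predict_forest(forest, new_object)
```

In practice an established implementation, such as the R tooling cited above, would replace this toy forest; the sketch only shows how labelled objects and per-band color statistics feed the classifier.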
To correct issues specific to these images, we used rule-based classifiers. County names and borders were colored black and we reclassified black objects to a colored deer density class using a process in eCognition called 'assign class algorithms'. Black objects ≤ 5 pixels were reclassified to the surrounding majority colored deer density class. We manually corrected any errors that persisted at the border of two colored deer density classes.
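The small-object rule can be sketched outside eCognition as follows. This is a hypothetical Python illustration on a grid of class codes, not the eCognition 'assign class' process itself; the code `0` for black, the 4-neighbor connectivity, and the function name are assumptions.

```python
from collections import Counter, deque

BLACK = 0  # assumed code for black (text or border) pixels

def reclassify_small_black(grid, max_size=5):
    """Replace black patches of <= max_size pixels with the majority
    class found along their edges; larger black patches are left as-is."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = set()
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if grid[r0][c0] != BLACK or (r0, c0) in seen:
                continue
            # Breadth-first search over one 4-connected black patch.
            patch, border = [], Counter()
            queue = deque([(r0, c0)])
            seen.add((r0, c0))
            while queue:
                r, c = queue.popleft()
                patch.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                        continue
                    if grid[nr][nc] == BLACK:
                        if (nr, nc) not in seen:
                            seen.add((nr, nc))
                            queue.append((nr, nc))
                    else:
                        # Count each shared edge with a colored neighbor.
                        border[grid[nr][nc]] += 1
            if len(patch) <= max_size and border:
                fill = border.most_common(1)[0][0]
                for r, c in patch:
                    out[r][c] = fill
    return out

example = [
    [1, 1, 1, 2, 2],
    [1, 0, 1, 2, 2],
    [1, 0, 1, 0, 2],
    [1, 1, 1, 2, 2],
]
cleaned = reclassify_small_black(example)
```

Errors that remain along the boundary between two colored classes would still need the manual correction described above.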
An alternative free option to eCognition software that provides a more serviceable product than multi-purpose GIS software is GNU Image Manipulation Program (GIMP v2.1.0.2; the GIMP team). For this software, after georeferencing the image, we selected each color and continued to select until we reached the full color range. We exported each color into the georeferenced layer, resulting in multiple copies. In ArcMap (v10.6, ESRI, Redlands, CA), we extracted by attribute for the raster value of 255, which was the information selected in GIMP. We filled in holes within colors using the union function and then erased any holes that were of a different color.
We used an approximate conterminous US population estimate of 30 million at year 2000 (Webb, 2014) to calibrate population estimates. The lowest value of each higher density class (i.e., 5.8, 11.6, 17.4 deer/km²) and a value of 1.85 deer/km² for the low density class were necessary to generate this population value from the 2003 map. We then estimated population and density for the 1982 map. The best available estimates for comparison were 15 million deer by 1978 and 26 million deer by 1993 (Miller, Muller & Demarais, 2003) for this map.
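The calibration amounts to solving for the one unknown density. A minimal sketch follows, assuming population is the sum of class area multiplied by class density; the class areas below are hypothetical placeholders (the mapped areas per class are not reported here), so the resulting low-class density differs from the 1.85 deer/km² obtained from the actual map.

```python
TARGET_POPULATION = 30_000_000  # approximate US estimate at year 2000 (Webb, 2014)

# Lower bounds of the three higher density classes, in deer/km^2.
upper_class_density = {"5.8-11.6": 5.8, "11.6-17.4": 11.6, ">17.4": 17.4}

# HYPOTHETICAL class areas in km^2, for illustration only.
area_km2 = {"<5.8": 3_000_000, "5.8-11.6": 1_500_000,
            "11.6-17.4": 800_000, ">17.4": 200_000}

def calibrate_low_density(area, upper_density, target):
    """Density for the low class that makes total population equal target."""
    upper_population = sum(area[k] * d for k, d in upper_density.items())
    return (target - upper_population) / area["<5.8"]

low_density = calibrate_low_density(area_km2, upper_class_density, TARGET_POPULATION)

# Check: population rebuilt from all four classes matches the target.
total = area_km2["<5.8"] * low_density + sum(
    area_km2[k] * d for k, d in upper_class_density.items())
```

The same densities can then be applied to the class areas of the 1982 map to estimate its population.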
In order to compare spatial change at the county level, we determined the percent area of each deer density class by county. We assigned the majority, or greatest percent area, to each county. In order to prevent errors in counties with two or more deer density classes, we retained only counties with a clear majority of >25 percentage points for the majority deer density class compared to the next most abundant deer density class (for example, the majority class of 45% of the area compared to a class of 20%). We also excluded counties with partial information of less than 25% of the county area reported to be in any deer density class.
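The county filter can be expressed as a short function. This is a sketch with hypothetical names; it assumes coverage is the summed percent area of all density classes in the county and treats a margin of exactly 25 points as qualifying, so that the 45% versus 20% example is retained (the text's ">25" leaves that boundary case ambiguous).

```python
def county_majority_class(class_pct, min_margin=25.0, min_coverage=25.0):
    """Return the majority density class for a county, or None if the
    county is excluded.

    class_pct maps a density class label to its percent of county area.
    """
    # Exclude counties with too little mapped information.
    if sum(class_pct.values()) < min_coverage:
        return None
    ranked = sorted(class_pct.items(), key=lambda kv: kv[1], reverse=True)
    top_label, top_pct = ranked[0]
    runner_up_pct = ranked[1][1] if len(ranked) > 1 else 0.0
    # Assumption: a margin equal to min_margin counts as a clear majority.
    if top_pct - runner_up_pct >= min_margin:
        return top_label
    return None

retained = county_majority_class({"<5.8": 45.0, "5.8-11.6": 20.0})
ambiguous = county_majority_class({"<5.8": 40.0, "5.8-11.6": 35.0})
sparse = county_majority_class({"<5.8": 15.0})
```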
To provide an estimate of error, we located archived 2005 deer population estimates from the Quality Deer Management Association (Adams & Ross, 2015) for 25 US states and 2001 to 2005 deer population estimates from the Southeast Deer Study Group proceedings (Southeast Deer Study Group, 2002-2006), which are limited to 16 southeastern states. We compared reported estimates to our values derived from the 2003 map. State agencies provided all estimates.
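One way to summarize such a comparison is mean absolute error across the states present in both sources. The sketch below uses hypothetical state labels and population values only; none of the numbers are reported or map-derived estimates.

```python
def mean_absolute_error(reported, estimated):
    """Mean absolute difference over states present in both mappings."""
    shared = sorted(set(reported) & set(estimated))
    if not shared:
        raise ValueError("no states in common")
    return sum(abs(reported[s] - estimated[s]) for s in shared) / len(shared)

# HYPOTHETICAL populations keyed by placeholder state labels.
reported = {"State A": 1_000_000, "State B": 500_000, "State C": 750_000}
from_map = {"State A": 900_000, "State B": 650_000,
            "State C": 700_000, "State D": 300_000}

mae = mean_absolute_error(reported, from_map)
```

Restricting the calculation to shared states mirrors the situation here, where the two reporting sources cover different sets of states.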

RESULTS
After testing the procedure, digitization took about one day per image to complete the data processing steps of georeferencing, building image objects, random forests classification, and correcting errors, of which about two hours was computer processing time. Computer processing time will differ due to factors such as resolution and number of objects, that is, heterogeneity of the image. Errors that required manual correction resulted from borders, text labels, and poor resolution between colors. More colors and indistinct colors will increase manual correction time, and it may not be possible to obtain an acceptable product by relying on automation for low contrast or complex symbology. In GIMP, processing time was minor compared to eCognition, but more manual correction was required after automation to fill in areas without color.
Deer were present in 4,752,100 km² (about 60% of the conterminous US) and 5,561,500 km² (70% of the area) during 1982 and 2003, respectively. Population and density were 17,148,500 and 2.2 deer/km² during 1982 and (as calibrated) 29,928,700 and 3.8 deer/km² during 2003. The value of 17 million deer for the 1982 map is in line with 15 million deer by 1978 and 26 million deer by 1993 described in Miller, Muller & Demarais (2003) and based on best available population estimates. In areas where deer were present, deer densities were 3.6 deer/km² during 1982 and 5.4 deer/km² during 2003.
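The densities in areas with deer presence follow directly from the population and area totals reported above, as this short check shows (variable names are illustrative).

```python
# Reported totals for areas with deer presence.
area_km2 = {1982: 4_752_100, 2003: 5_561_500}
population = {1982: 17_148_500, 2003: 29_928_700}

# Density where deer were present = population / area of presence.
density_where_present = {yr: population[yr] / area_km2[yr] for yr in area_km2}

# Growth in area of deer presence between the two maps.
expansion_km2 = area_km2[2003] - area_km2[1982]

for yr in sorted(density_where_present):
    print(yr, round(density_where_present[yr], 1), "deer/km^2")
print("expansion:", expansion_km2, "km^2")
```

The results round to 3.6 and 5.4 deer/km², and the difference in area of presence is 809,400 km².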
By state, the greatest density was 7.2 deer/km² in Georgia during 1982 (Fig. 2; Table 1). In contrast, 11 states had densities greater than this value during 2003, ranging up to 14.6 deer/km² in Wisconsin. The greatest population increase (i.e., 2003 population/1982 population >7) occurred in Colorado and Idaho (Fig. 3). A few states had decreased deer populations, which may result from reporting differences rather than population changes due to ecological factors. Greatest area expansion (i.e., 2003 area/1982 area ≥5) occurred in Colorado and Oregon, whereas more eastern states already had complete or nearly complete deer presence by 1982.
We determined changes at a county scale. The low density class was 69% of area during 1982 and 52% of area during 2003. About half of the counties remained in the same density class, 30% of counties increased by one class, and 12% of counties increased by >1 density class (Fig. 4).
Comparison between the reported estimates for 25 US states from the Quality Deer Management Association (Adams & Ross, 2015) and the 2003 map of deer densities was   (Table 2). However, mean absolute error (or difference) was about 235,000. Most reported deer density estimates in 16 southeastern states (Southeast Deer Study Group, 2002-2006) were greater than the 2003 map of deer densities, and overall for the Southeast, the reported estimates were greater by a factor of 1.09 (18,500,000 vs. 17,000,000 deer). The mean absolute error (or difference) between mean deer densities reported by the Southeast Deer Study Group and estimated from the 2003 map of deer densities was about 250,000 (Table 3). Although this value is high, reported mean deer densities also ranged considerably from year to year, for example, changing by 250,000 individuals. For some states, reported deer densities did not change at all. In addition, although state agencies provided all estimates, reported estimates from the Quality Deer Management Association (Adams & Ross, 2015) differed in some states from those of the Southeast Deer Study Group, resulting in a mean absolute error of 175,000 for the 13 states with overlapping information. Thus, error in the 2003 map of deer densities may be little worse than error due to differential reporting by state agencies.

DISCUSSION
This research demonstrated automated methodology for digitization of thematic maps to accessible GIS layers. The methods presented here are effective at reclaiming a variety of thematic maps without associated data or GIS layers. Any application of software, whether eCognition or GIMP, will assist in reducing the amount of time needed to select, draw, and edit features to digitize a map image. In part, we used object-based image analysis to delineate homogeneous objects that otherwise are digitized by hand. Although object-based image analysis is applied for classification of optical imagery, it is not yet a familiar tool for the GIScience community, and automation is not widely known as an option for digitization. Therefore, automated digitization is not being applied to published maps without GIS layers. A published map does not have to be historical to be lacking a GIS layer; for example, the QDMA deer density maps are from the 2000s. Many historical and relatively current publications do not have archived datasets, and the necessity to digitize map images highlights the value of archiving data. It is not possible to fully analyze information in printed maps without access to that information in the form of GIS layers. We were able to estimate and compare white-tailed deer distribution and population densities and spatial change between approximately 1982 and 2003 in the conterminous US by using digitized maps, which we will archive (https://www.fs.usda.gov/rds/archive/). The described method can reduce the time and effort needed to retrieve other valuable datasets stored in thematic maps.
The value of 17 million deer for the 1982 map is in line with 15 million deer by 1978 and 26 million deer by 1993 (Miller, Muller & Demarais, 2003). During 1982, deer densities were 3.6 deer/km² where deer were present and during 2003, mean deer density was 5.4 deer/km² where deer were present, given a population of about 30 million. Six states had deer densities ≥9.8 deer/km² during 2003. White-tailed deer had a distribution of 5.6 million km² during 2003. Deer expanded about 810,000 km² between 1982 and 2003, primarily in the western US because deer already were present throughout most eastern states. Nonetheless, Seton (1909) included part of Washington, Oregon, Colorado, Texas, of deer in the western US, based on still conservative estimates (McCabe & McCabe, 1984). To place a maximum bound using liberal estimates of deer densities before Euro-American settlement, deer abundance may have reached 65 million to 80 million in North America, given moderately low densities of 10 to 12 deer per km² in most of the eastern US, with moderately high to high density landscapes of 15 to 20 deer per km², and low densities of 3 to 4 deer per km² throughout the rest of the deer distribution.
Exploitation of white-tailed deer for commercial markets by Euro-American settlers reduced the deer population to about 12 to 14 million animals by 1800 and 300,000 to 500,000 animals by 1900 (McCabe & McCabe, 1984). Commercial hunting ended due to changed public attitudes, enforced state harvest restrictions, and the ban on interstate shipment of illegally caught animals under the federal Lacey Act of 1900 (McCabe & McCabe, 1984). The population may have recovered to about 6 million by 1948, 15 million by 1978, 26 million by 1993, and 30 million animals by 2000 in the US (McCabe & McCabe, 1984; Miller, Muller & Demarais, 2003; Webb, 2014). Since 2000, the deer population has been relatively stable, with potentially a slight decrease due to habitat loss and degradation, hunting pressure, severe weather events, and disease (Webb, 2014; Adams & Ross, 2015). A white-tailed deer population of 30 million may be at or even below historical levels in most of the US, based on a wide range of historical deer population estimates.
Since Euro-American settlement, comprehensive changes in vegetation have occurred. Generally, fire-tolerant open oak and pine forests with an herbaceous vegetation ground layer have transitioned to closed forests, composed of diverse fire-sensitive tree species and tree layers throughout the vertical profile, typically replacing the herbaceous ground layer (Hanberry, Bragg & Hutchinson, 2018). The concurrence of increased tree recruitment and decreased deer populations during Euro-American settlement may suggest that relief from browsing pressure was influential in releasing tree growth because herbivores are potential drivers of vegetation structure. However, transition back to open forests is not occurring after resumption of deer pressure at or above thresholds of 3 to 9 deer/km² expected to cause change, even where deer densities exceed 10 deer/km² (e.g., in Mississippi, Hanberry et al., 2014; Hanberry, Coursey & Kush, 2018; Hanberry & Abrams, 2019). Overall, research indicates that deer reduce regeneration of tree seedlings (Habeck & Schultz, 2015; Ramirez, Jansen & Poorter, 2018), but most tree seedlings will not survive with or without herbivores due to density-dependent mortality. Indeed, if densities are within or lower than historical densities, then the level of herbivory may be tolerable to plants that co-existed with historical deer browsing.
Based on analysis using these digitized maps, deer densities do not appear to be correlated with tree stocking (i.e., percent occupied growing space accounting for both density and diameter) at landscape scales in the eastern US (Hanberry & Abrams, 2019; to account for time lag of effects, we used 1982 and 1996 deer densities and current tree stocking after about 30 years and 15 years of deer browsing). It may be that when tree regeneration is limited by fire, deer and other herbivores can help maintain open forests, grasslands, and shrublands. In addition, despite variable browse preferences, almost all tree species, including species such as northern white cedar (Thuja occidentalis) identified as preferred browse reduced by deer browsing, have increased in relative abundance of trees (diameter ≥12.7 cm) in the eastern US between the 1800s and the current decade of 2010 (Hanberry, Palik & He, 2013; Hanberry & Abrams, 2019; Hanberry & Dey, 2019). Increases in most tree species probably preceded increases in deer densities, but recent trends during the past decades are similar to historical trends (Hanberry, 2019; northern white cedar increased slightly in northern mixed forests, B Hanberry, pers. obs., 2019). Decreasing tree species include fire-tolerant oak and pines, some wetland species, and species that are affected by novel diseases perhaps in combination with forestry practices (Hanberry, Palik & He, 2013; Hanberry & Nowacki, 2016; Hanberry & Abrams, 2019). Although woody plants have benefitted from conditions during the past century, deer herbivory is an additional stressor on herbaceous plants that have become less abundant due to intense competition with woody species; with limited growing space, abundance of forbs and grasses necessarily decreases as tree densities increase (Hanberry, Bragg & Hutchinson, 2018).

CONCLUSIONS
Ecologically valuable published data may not be accessible except as thematic images. The functionality of thematic maps can be increased by digitizing pictures into geographic information system (GIS) layers, which are computer-readable. Digitized data facilitate access to and analysis of geospatial information that is relevant but relatively inaccessible. We documented a method to digitize maps into GIS layers, while providing valuable information about change in white-tailed deer population, densities, and range for the conterminous United States. These methods can be applied to other thematic maps to increase availability and recapture information stored in images.