Forest cover mask from historical topographic maps based on image processing

This study aimed to obtain accurate binary forest masks which might be directly used in analysis of land cover changes over large areas. A sequence of image processing operations was conceived, parameterized and tested using various topographic maps from mountain areas in Poland and Switzerland. First, the input maps were filtered and binarized by thresholding in Hue‐Saturation‐Value colour space. The second step consisted of a set of morphological image analysis procedures leading to final forest masks. The forest masks were then assessed and compared to manual forest boundary vectorization. The Polish topographical map published in the 1930s showed low accuracy which could be attributed to methods of cartographic presentation used and degradation of original colour prints. For maps published in the 1970s, the automated forest extraction performed very well, with accuracy exceeding 97%, comparable to accuracies of manual vectorization of the same maps performed by nontrained operators. With this method, we obtained a forest cover mask for the entire area of the Polish Carpathians, easily readable in any Geographic Information System software.


Introduction
Historical maps are the only or main reliable source of spatial information on land use and land cover in the past Munteanu et al., 2014;B € urgi et al., 2015;Fuchs et al., 2015). Their comparison to current land use and land cover data gives a possibility to study environmental changes in long time horizon (Yang et al., 2014). In particular, forest boundaries and forest extent have been frequently extracted from old maps (Axelsson and € Ostlund, 2001;Kozak et al., 2007), as they are often clearly marked on maps even at small scales. Although large collections of scanned and georeferenced historical maps of several countries are available via Internet (e.g. www.lib.utexas.edu/ma ps, http://igrek.amzp.pl, www.mapywig.org, http:// publicdomain.nypl.org, www.mapire.eu) the methods of extracting this information, to use in broad scale spatial analysis are still missing (Kaim et al., 2016). Most frequently, maps are interpreted visually and manually vectorized, even when the final data is required in a raster format. This approach is considered as the most accurate one although introduces subjectivity since the person who digitizes needs to make many ad hoc decisions while delineating feature boundaries. Manual vectorization is also highly timeand cost-consuming (Jackson and Woodford, 1991). For example, vectorization of forest boundaries on one map sheet may take (depending on its complexity) up to 1 day (approximate duration for various maps used in this study); for large area investigations another issue is mosaicking the vectorized map sheets into one seamless coverage and error removal, especially if the work is split among several persons (Leyk et al., 2005;Kaim et al., 2014).
A promising solution is an automatic or semiautomatic feature extraction over large areas (Ebi et al., 1994). Maps consist of many simple elements: points, lines, text and complex objects, for instance areas with points, lines and text labels. Frequent attempts of automatic extraction were related to map text (Myers et al., 1996;Luyang et al., 2000;Cao and Tan, 2001;Pezeshk and Tutwiler, 2011;Chiang and Knoblock, 2015) and linear objects, e.g. contour lines that were later processed into Digital Elevation Models, rivers or roads (Kaneko, 1992;Arrighi and Soille, 1999;Wu et al., 2009;Ghircoias and Brad, 2011;Oka et al., 2012). For two-dimensional objects (polygons) the automated approaches are more complex (Wise, 1999(Wise, , 2002Leyk and Boesch, 2010). Twodimensional objects are typically obscured by various other map symbols. Depending on map convention their boundaries may be represented by a distinct line or not, and the colour fill may exhibit several variations. Several recent studies have demonstrated that automated extraction methods may offer significant benefits in massive processing of cartographic data, at least supporting the process of map interpretation (Chiang et al., 2013(Chiang et al., , 2014. For instance, Leyk and Boesch (2009) made a successful attempt to extract forest cover information from three whole map sheets of 1:25 000 19th century Siegfried Map of Switzerland. Iwanowski and Kozak (2012) used morphological image processing to extract forest information from one map sheet of the Polish Military Map from 1930s, Herrault et al. (2013) extracted information on forest cover from the 19th century Map of France in scale 1:40 000 obtaining high overall accuracy. Godfrey and Eveleth (2015) successfully used unsupervised classification to extract normalized ground snow loads for Idaho. Auffret et al. (2017) have recently proposed an R-based tool to semiautomatically digitize land use from historical maps, receiving agreement of 80-90% with manually digitized maps. These examples clearly demonstrate the potential of automated extraction methodologies for different maps available over Europe. It is still challenging, however, to propose an approach allowing to efficiently extract from maps all details of the forested area over large areas.
In this study, we develop and test an approach of semiautomatic forest cover extraction tested for three various sets of topographic maps, created in different conditions and cartographic traditions. We implement the approach operationally for the area of the Polish Carpathians (20 000 km 2 ), receiving a forest mask that is directly readable in standard Geographic Information System (GIS) software for land use and land cover change assessments. Accepting that the automated forest cover extraction would not exceed the skilled manual interpretation, we expected to receive a set of automated procedures enabling to extract forest cover with a sufficient accuracy in areas with good quality map information, allowing a limited level of manual intervention. Our approach was based on image segmentation based on the Hue-Saturation-Value (HSV) colour space and morphological image filtering, capable to quickly process huge amounts of map data of various quality. To assess the overall performance of the method, we estimated its accuracy in various topographical contexts and for maps with various quality, comparing its performance to reference manual vectorizations. As a benchmark we used average accuracy of sample forest masks detected manually from the same maps by medium-skilled operators.

Materials
Our approach focused on forest area representations based on distinct colour (e.g. green), typical for various topographic maps published in the 20th century. Forest area calculations for different points in time can only be done based on very precise extractions including all details of mapped forest patches, in particular, resolving problems related to variety of cartographic symbols obscuring the green background representing forests. Thus, the requirements on robust performance of the method are very high. For the study, the following maps were chosen ( Figure 1 Map 1 was obtained from the Map Library of the Institute of Geography and Spatial Management, Jagiellonian University. Maps sheets were originally printed in either four or six colours. Forested areas were marked in bright green with a dotted black boundary and supplemented with signatures indicating the forest type (coniferous, deciduous or mixed). The colour variation on these maps was high and depended on many factors, e.g.: map edition, paper condition, quality of map storage and intensity of relief shading in mountainous areas. The latter factor contributed to a very high variation in forest colour on steep mountain slopes with contrasting aspects. The scans of 20 tested samples had 1090 dpi resolution, with a pixel dimension equivalent to 2.33 m on the ground.
Map 2 was received as digital image files from Provincial Archives of Geodesy and Cartography in Poland, for three provinces of southern Poland ( Sla z skie, Małopolskie, Podkarpackie). Forested areas were marked bright green with a dotted, black boundary. Due to errors in printing, in some places the dotted boundary did not adhere to green-marked forest areas. Contrary to map 1, colour variations among various sheets were not detectable, and signatures indicating different forest types were used rarely. The tested samples had 508 dpi resolution, with a pixel dimension equivalent to 1.25 m on the ground.
Map 3 was published by the Swiss Federal Office of Topography (swisstopo). Forested areas were symbolized with pale green colour. Depending on the clarity of forest boundaries, they were shown with line or dotted-circled signatures. The scans of 20 tested samples had 508 dpi resolution, with a pixel dimension equivalent to 1.25 m on the ground.
For each map type, 20 randomly selected squares 1000 9 1000 pixels were chosen for tests and analysis. Due to the extent of the study area for forest cover change detection, all squares were sampled in mountain areas of the Swiss Alps and the Polish Carpathians. For various scan resolutions and map scales, selected squares covered different areas (5.44 km 2 for map 1, and 1.56 km 2 for maps 2 and 3). Forest cover proportion varied significantly among map samples and ranged from 25% to 93% for map 1, 20% to 93% for map 2 and 17% to 85% for map 3.

Data Production Methods
The proposed forest extraction approach was a twostage process (Figure 2), building on a generic concept proposed by Iwanowski and Kozak (2012). In the first stage, the rough preliminary forest mask was extracted by means of colour segmentation. The preliminary mask did not fit precisely to the real forest regions as they were obscured by the presence of various map symbols. From the point of view of image processing, these details (e.g. contour lines, text labels, signatures) should be considered as noise that must be removed. The noisy preliminary masks were therefore further processed using morphological image processing (Serra, 1982;Soille, 2004;Iwanowski, 2009) to filter out the disturbances and to correct forest region boundaries, leading to the final forest mask. All processing steps were elaborated and fully implemented into MATLAB (2013), limiting manual interventions of the user to raster extent definition, choice of processing parameters and their values. A similar sequence of image processing operations based on colour segmentation was also successfully tested by Herrault et al. (2013).

Colour segmentation
For all analysed maps, forested areas are represented using various tones of green colour and various types of linear boundaries. This representation is quite legible for human observer, but it mayin some casescause problems in the automated detection process due to the variations of green colour within forest regions and occurrence of various cartographic signs in forested areas. Such problems are common in mountainous regions. The first stage of the proposed scheme (colour segmentation) leading to preliminary forest mask extraction was based exclusively on colour information. Colour segmentation may be applied within any colour space, e.g. Red-Green-Blue (RGB) colour space where three components refer to three primary colours. However, this colour space does not reflect human perception, which operates rather on values like hue (H), saturation (S) and lightness (or value -V). As colours used on maps are rather adapted to their perception by humans, the working colour space was therefore transformed to the perceptual HSV space, in which separation of regions represented with different colours is improved as compared to the RGB space. Due to visible disturbances within the colour regions (caused by scanning imperfections), prefiltering with linear low-pass Gaussian was applied to lower the internal variation in colour values and make colour regions more homogeneous. Filtered images were then subject to colour quantization or direct colour segmentation (Spencer, 1991;Gonzalez and Woods, 2008). Colour quantization reduces colours from true-colour representation to eliminate local variations that appear as a side effect of scanning process or map quality. The quantization method was based on Spencer (1991), requiring single parameter: the number of informative colours on the map. Direct colour segmentation was used to extract regions marked green and obtain the preliminary binary forest mask (1 for forests and 0 for all other areas) based on HSV colour component ( Figure 3). The binary forest mask was obtained as an intersection (logical AND operator) of three binary images created by separate thresholdings of each HSV colour component, using six parameters that referred to upper and lower thresholds of three HSV components. The thresholds were derived from reference colours that had to be indicated manually on screen.

Morphological improvement of forest mask
The preliminary forest masks contained a lot of noise, both false positives (pixels classified as forest in nonforest regions) and false negatives (pixels classified as nonforest within forest regions). False positives are typically related to colour sampling process that may result in single pixels similar to forest green located in nonforest regions, visible as salt-and pepper noise.
Similarly, false negatives may reflect single pixels significantly differing from forest green in the forested regions, more frequently, however, false negatives are related to various cartographic symbols present in forested regions. To remove the disturbances a twostage morphological filtering (Serra, 1982;Soille, 2004;Iwanowski, 2009) of forest mask was used (Figure 4). In the first stage, sequential morphological filters of opening, closing and again opening were used to remove salt-and-pepper noise. The opening filter removes forest pixels outside the forest regions (salt noise) and simplifies the forest mask by removing thin branches outside the mask. The closing filter removes nonforest single pixels located inside the forest mask and simplifies the shape of a mask by removing thin nonforest inclusions inside the mask. All these filters use a structuring element that defines the neighbourhood considered during the filtering process. The larger the structuring element, the stronger filtering result is obtained. In this study we used a disc centred at pixel being processed, with the radius that is determined based on properties of a processed map. The above described filtering does not remove larger nonforest inclusions inside a forest mask and false forest areas outside the forest mask. As applying bigger structuring element for opening and closing may result in too strong simplification of forest regions, removal of large false regions was done in the second stage of processing with opening by reconstruction and closing by reconstruction morphological filters (Vincent, 1993;Soille, 2004;Iwanowski, 2009). These filters, contrary to the previously applied ones, were able to remove false regions of forests (opening by reconstruction) and nonforests (closing by reconstruction) without modifying the contours of the remainder of the forest mask. The chain of (at most) three morphological filters by reconstruction concluded the processing and provided the final forest mask.

Processing and parameterization
The proposed method is a semiautomatic one and requires several input parameters at various stages, depending in particular on a map being analysed and scan resolution. Due to the differences in quality of the three map sets, the processing parameters were set-up separately for each map (Table 1), and once selected, they allowed to effectively process all maps of the same type.
In this work, we assumed that trial-and-error tests provide the most effective way to receive, for a particular map, as close to optimal parameterization as possible, hence the effects of various parameterizations were not systematically addressed in this study. This is in fact hardly (if at all) possible due to a large number of various combinations of parameters used at every processing step and variety of responses of particular map sheets to each combination. The time required to set up the image processing parameters depended in particular on variations of the green colour used to present forests and style of graphic elements on the maps e.g. the width of contour lines, roads and fonts. A visual analysis of HSV histograms performed by a skilled operator could help to reduce the time needed for parameterization to only few minutes.

Accuracy assessment
To assess the accuracy and robustness of the proposed methodology we used manually vectorized forest masks prepared for all sample squares. The vectorized forest masks were converted to raster format, retaining raster origin and pixel size of each tested map. The reference forest masks were then compared to both the preliminary and final forest masks, producing in each case four classes [forest referenceforest test (true positives, TP); forest referencenonforest test (forest underestimation or false negatives, FN); nonforest referencenonforest test (true negatives, TN); nonforest referenceforest test (forest overestimation or false positives, FP)]. On the basis of these classes we calculated various measures commonly used in accuracy assessments of products derived from historical maps (Leyk and Boesch, 2010;Herrault et al., 2013): precision for forest P + = TP/ (TP+FP) and for nonforest P À = TN/(TN+FN), recall for forest R + = TP/(TP+FN) and for nonforest R À = TN/ (TN+FP), overall accuracy A = (TP+TN)/ (TP+FN+TN+FP), Kappa and normalized mutual information criterion (NMI) (www.alanfielding.co.uk/multiva r/accuracy.htm) ( Table 2). The verification of preliminary and final masks allowed to answer what is the value of morphological image analysis in the automated process: to what extent it eliminates false positives and negatives that are due to variations of map    colours and occurrence of text, contour lines and various cartographic symbols. As a benchmark we used results of individual manual vectorizations of the maps carried out by moderately skilled operators. We asked a group of geography students (12 students) to vectorize 10 sample forest masks from each map. Next, we estimated accuracy of their work using the reference vectorization of the same samples and compared received values to the accuracy of automated forest mask extraction.

Forest cover detection accuracy
For map 1, preliminary masks (colour segmentation only) had accuracies for forested areas from 16.4% to 94.2%. Forests were overestimated from 2.8% to 6.4% and underestimated from 17.0% to 82.9%. Total accuracy ranged from 17.1% to 97.2%. The final forest masks for map 1 had forest area accuracies from 24.3% to 98.6%. Forest overestimation ranged from 0.0% to 38.6% and forest underestimation from 1.4% to 75.7%. Total accuracies ranged from 71.3% to 97.2%. Worse results for forested areas than for the map overall were most likely due to the stronger influence of shaded relief on green forest colour than on the nonforest map colours. The best results for map 1 were obtained for a sample with one large and compact forest patch (Figure 5(a)). In all cases morphological filtering improved the total accuracy, from 9.4 to 43.2 percentage points.
For map 2, preliminary masks showed accuracies from 54.6% to 91.8% for forested areas. Forests were underestimated from 8.2% to 45.4%, and overestimated from 0% to 1.3%. Total accuracy ranged from 59.9% to 98.0%. The final forest masks for map 2 had a very high accuracy, from 96.3% to 99.8% of correctly recognized forest pixels ( Figure 5(b)). Forest underestimation ranged from 0.2% to 3.7% and forest overestimation from 0.1% to 2.5%. Total accuracy ranged from 97.9% to 99.8%. Forests were correctly recognized even in case of high forest fragmentation, however, they could be confused with orchards, represented on map 2 with green colour with small round dots inside orchards. For all samples, morphological filtering improved the total accuracy from 6.2 to 44.4 percentage points.
For map 3, preliminary masks showed accuracies from 61.1% to 82.5% for forested areas. Forests were underestimated from 17.5% to 38.9%, and overestimated from 0.1% to 1.8%. Total accuracy ranged from 53.2% to 88.1%. The final forest masks had a high accuracy, yet slightly lower than in case of map 2. For all samples, accuracies ranged from 90.4% to 98.8% of correctly recognized forest pixels (Figure 5(c)). Forest underestimation and overestimation were found to vary from 1.2% to 9.6%, and from 0.4% to 3.9% respectively. Total accuracy ranged from 93.9% to 99.1%. Similarly as in case of maps 1 and 2, morphological filtering improved the total accuracy of forest delineation on map 3 from 17.5 to 39.0 percentage points.

Uncertainty of manual vectorization
For map 1, average total accuracies of students' manual vectorization ranged from 91.2% to 98.7% (Figure 6(a)), and, except for one map sample, they were higher than the total accuracy of automated forest cover extraction. For maps 2 and 3, the accuracies of manual vectorization, though better than in case of map 1, were lower than accuracies of the automated forest cover extraction in 7 (map 2) or 4 (map 3) out of 10 cases (Figure 6(b) and (c)).

Discussion
Our algorithms were tested on various maps, including also maps that were not tested by Iwanowski and Kozak (2012), and extending the scope of the study Figure 5. The worst (upper) and the best (lower) forest classification results and classification accuracy for 10 samples for map 1 (a), map 2 (b) and map 3 (c). Blackforest, whitenonforest, redfalse positives, greenfalse negatives.
beyond the previous ones which were typically confined to testing one particular map edition (Ansoult et al., 1990;Herrault et al., 2013). For tested map samples, HSV colour space provided better results (by 1-2% of correctly recognized pixels) than CIELab space used by Herrault et al. (2013) or RGB space that was employed by Wise (2002).
For two map sets (map 2 and 3) we found satisfactory accuracies of automated forest mask extraction, exceeding 94% for all accuracy measures used, with overall accuracy exceeding 97%, and Kappa values above 90% (Table 2). These results are similar or better than the accuracies of feature extraction received in other studies. For instance Wise (1999Wise ( , 2002 obtained accuracy levels of 80-90% compared with digitising the same maps by hand and Herrault et al. (2013) received similar Kappa values in their study. The accuracies obtained with the semiautomated feature extraction were also comparable to accuracies received if maps were manually vectorized by moderately skilled operators. Maps 2 and 3 are characterized by well-coloured forest regions. All meaningful colours were easily separable in the HSV colour space and stable, with small in-class variance. The colour segmentation of these maps performed well, therefore the resulting preliminary mask usually consisted of correct forest regions that required relatively little improvement with morphological filtering. The Polish topographical map published in the 1930s (map 1) had, in most cases, total accuracy of forest mask extraction below acceptable thresholds. It resulted mostly from lower quality of paper prints due to their age and inadequate preservation (e.g. yellowing of the paper prints). The other reason was the more complex cartographic style of older maps (use of dotted forest boundary and hill shading). Hence, green colour used to designate forests had significantly higher variation for map 1 as compared to maps 2 and 3 produced in 1970s, moreover, yellowish background was sometimes misinterpreted as forests. Notably, map 1 had high overestimation errors (colour similarity of forests and background), whereas maps 2 and 3 had very low forest overestimation except for areas with numerous orchards. Automated feature extraction or maps with similar, or worse than map 1 quality is difficult, though a viable alternative could be a method proposed by Leyk et al. (2006) and Leyk and Boesch (2009) aiming to interpret map information through recognition of context and relations between complex cartographic symbols.
While the study did not attempt to quantify the impact of specific parameters on the accuracy of forest mapping, it provided some useful tips in processing. For high quality 5-colour maps (greenforests, black roads, symbols, texts, redcontour lines, whitebuildings, nonforest area, bluewater), with green colour uniformly printed and standing out from other colour components of the map, colour quantization performed sufficiently well, finding forest areas even with a moderate Gaussian blur and only a rough indication of the sample green colour. If the map quality was low, direct segmentation with parameters determined empirically was a better choice, as the method gives more flexibility in adjusting ranges of for each colour individually. Map quality determined also morphological processing parameters. In general, larger structuring element had to be used for low quality forest preliminary mask. Moreover, size of the structuring element depended on the cartographic style of the processed map, in particular, size of objects in the foreground of the forest layer (letters, symbols). In general, size of the structuring element was not larger than a 5-pixel radius disc. Regardless these findings, at this stage we found it hardly possible and not practical to fully automate the choice of the structuring element, as for all studied maps the empirical trial-and-error approach was the best solution. High accuracies received for the Polish and the Swiss maps published in 1970s prove that automated processing could be a useful substitute for manual interpretation. Time and cost savings were the most important motivation to develop automated processing methods of scanned maps (Ansoult et al., 1990;Wise, 1999). In our approach, once parameters were accepted, processing took approximately 40 s per one map sheet (three-layered raster image with 9600 rows and 14000 columns, workstation with Intel Core7 and 16 GB RAM). Setting up correct parameterization was a manual trial-and-error procedure, yet relatively quick for a skilled operator. Moreover, for coherent map sets, parameterization set up for one map sheet worked for all sheets with a similar accuracy level. For large area processing, time savings were substantiale.g. processing of 170 map sheets over the entire area of the Polish Carpathians (approximately 20,000 km 2 ) took 2-3 days of work without manual corrections. Automated processing required manual corrections in regions with a high share of orchards marked like forests with green colour.

Conclusions
The approach presented here is an application of existing methods helping to speed up the extraction of forest cover from large archives of historical maps stored as paper prints or paper print scans. Our methods focused on forest areas that are shown on maps with distinct colour. We received accuracies similar to results of other studies which aimed to extract twodimensional features from scanned paper maps (Wise, 1999(Wise, , 2002Herrault et al., 2013), proving the effectiveness of our approach. In cases, where graphical, structural or textural forest representation is to be taken into account, potential of other methods might be higher (Leyk et al., 2006;Boesch, 2009, 2010).
Our research showed also some directions of future refinement of the proposed forest mask extraction methodology. First, increased level of automation should be attempted to speed up processing of large collection of maps. Next, we noted a potential to implement additional rules defined with respect to contextual properties of map symbols into map processing (e.g. interpreting colours and shapes of specific map symbols as noise or significant land cover types).