Historical dataset of mills for Galicia in the Austro-Hungarian Empire/southern Poland from 1880 to the 1930s

In this article, we present a dataset of mills from 1880 and the 1920s–1930s in the area of the former Galicia (78,500 km²), now in Ukraine and Poland. The data was obtained by manual vectorisation of 162 map sheets at scales of 1:115,200 and 1:100,000, according to the map legends. We found 4022 mill locations for 1880 and 3588 for the 1920s–1930s. We present them as vector points in shapefile, GML, GeoJSON and KML formats, with attributes for seven types of mills for 1880 and ten types of mills for the 1920s–1930s, and as mill counts in a 10 km grid. The data can be used in economic, demographic and environmental reconstructions, e.g. to estimate historical anthropopressure related to settlement, agriculture and forestry. Mills are often associated with river structures such as floodgates, dams and millraces, and are therefore a good example of human interference in river ecosystems. They can also serve as one criterion for identifying areas where the local population used traditional environmental knowledge. The data can also be useful for a contemporary assessment of the environment's suitability for devices using renewable energy sources. Finally, the data on the remains of former mills is suitable for the protection of cultural heritage sites that are technical monuments related to traditional food processing and industry.



Value of the Data
• Data on the location of mills with different types of propulsion and functions was obtained for a large area in Central Europe from consistent and detailed sets of maps for two time periods.
• The vectorised mill points were verified by comparing their counts with statistical lists. The census data on mills could not easily be localised within the region; with the geometric data, the map-based and statistical sources can be checked against each other and against other sources.
• The data is available in open GIS formats that are easy to visualise and use in spatial analyses.
• The data can be used in economic, demographic and environmental reconstructions, e.g. to estimate past anthropopressure related to settlement, agriculture and forestry. It can also be helpful in assessing human interference with river ecosystems.
• The point locations of mills can be helpful in identifying areas where the local population used traditional knowledge of the natural environment, and thus in the contemporary assessment of the natural environment for renewable energy sources.
• The data on the remains of former mills can be suitable for the protection of cultural heritage sites as technical monuments related to traditional food processing and industry, and for the protection of places of natural value.

Data Description
Mills with various types of propulsion, especially natural ones, have played an important role in the cultural landscape of many regions of the world for many centuries [1–5]. Their presence is associated not only with the processing of agricultural produce, wood, fabrics and paper; it also affects various natural and social processes, such as water retention and changes in the water network (e.g. millraces) [6], changes in the relief (e.g. dikes, ditches) [7], the creation and maintenance of important habitats for aquatic organisms [8], and the activation of local communities [9].
For the northern part of Poland, a cultural landscape typology was prepared based on mills [10], and the presence of mills could be an important manifestation of socio-economic development [11]. Old mills still attract attention due to the high potential of landscape ecosystem services [12–14] and due to the need to preserve their remains as cultural heritage or as naturally valuable areas. Data on mills is usually reconstructed from old maps, but also from other historical sources such as sketches, drawings and inventories, and sometimes from field research [10, 15].
We used two sets of historical maps to identify the locations of the old mills. The first comprised 53 sheets of the 1880 administrative map at a scale of 1:115,200, and the second 109 sheets of the military topographic map at a scale of 1:100,000 from the 1920s–1930s.
Our data contains two point layers and six grid layers (squares with sides of 10 km). All data is available in the open shapefile, GML, GeoJSON and KML/KMZ formats commonly used in Geographic Information Systems [16].
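As a minimal sketch of how one of the point layers could be inspected without GIS software, the Python example below parses a GeoJSON FeatureCollection with the standard library and tallies point features by an attribute. The inline sample, its coordinates, and the "Mill_type" attribute name and values are hypothetical placeholders; only "Map_year" is an attribute named in this description.

```python
import json
from collections import Counter

# Hypothetical two-feature sample standing in for a point layer;
# coordinates and the "Mill_type" attribute are illustrative only.
SAMPLE = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [22.0, 50.0]},
   "properties": {"Mill_type": "gristmill", "Map_year": 1880}},
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [24.0, 49.8]},
   "properties": {"Mill_type": "sawmill", "Map_year": 1880}}
]}
"""

def tally(geojson_text, attribute):
    """Count point features by the value of one attribute."""
    collection = json.loads(geojson_text)
    counts = Counter()
    for feature in collection["features"]:
        if feature["geometry"]["type"] == "Point":
            counts[feature["properties"].get(attribute)] += 1
    return counts

by_type = tally(SAMPLE, "Mill_type")
by_year = tally(SAMPLE, "Map_year")
print(dict(by_type), dict(by_year))
```

To use it on a real file, the text of that file would be passed in place of SAMPLE, with the attribute name taken from the layer's actual attribute table.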
The "Map_year" attribute is the same for the entire 1880 set and is 1880.
According to the legend of these maps and the accompanying explanations, seven types of mills can be distinguished for 1880 (Fig. 1).
For the 1920s–1930s, ten types of mills were distinguished according to the legend of these maps and the accompanying explanations (Fig. 2).
The "Map_year" attribute for mills found on the interwar maps ranges from 1922 to 1939. We used the years in which the map content was checked in the field and, where these were not given, the years of the cartographic document. For most map sheets, these years are at least one year earlier than the release dates.
Files: gristmills_1880_GAL_GRID; gristmills_1920_1930_GAL_GRID; sawmills_1880_GAL_GRID; sawmills_1920_1930_GAL_GRID; windmills_1880_GAL_GRID; windmills_1920_1930_GAL_GRID
A reference grid designed by the European Environment Agency (EEA), consisting of cells with sides of 10 km, was used to create the grid layers. In the set we provide, each grid layer contains the following attributes: an auto-numbered numeric identifier of the cell (FID), the geometry type (Shape), the cell code (CellCode), the east (EofOrigin) and north (NofOrigin) cell start coordinates, and an attribute (Count) holding the number of mills of the aggregated type (gristmills, sawmills or windmills) in each cell (Fig. 3).
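The relation between a point location and the grid attributes can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used to build the published layers: it assumes the coordinates are already in metres in the grid's LAEA system, that EofOrigin and NofOrigin are the lower-left corner of the cell, and that cell codes follow the pattern 10kmE<column>N<row>. The sample coordinates are hypothetical.

```python
from collections import Counter

CELL_SIZE = 10_000  # cell side in metres (10 km)

def cell_attributes(x, y, size=CELL_SIZE):
    """Return (CellCode, EofOrigin, NofOrigin) for a projected point.

    Assumes x, y are metres in the grid's LAEA system and that cell
    codes follow the pattern '10kmE<col>N<row>' (an assumption here).
    """
    e_origin = int(x // size) * size  # easting of the lower-left corner
    n_origin = int(y // size) * size  # northing of the lower-left corner
    code = f"{size // 1000}kmE{e_origin // size}N{n_origin // size}"
    return code, e_origin, n_origin

def count_per_cell(points):
    """Aggregate points into a Count value per cell code."""
    counts = Counter()
    for x, y in points:
        code, _, _ = cell_attributes(x, y)
        counts[code] += 1
    return counts

# Hypothetical mill coordinates in metres.
mills = [
    (5_123_400.0, 3_012_900.0),
    (5_128_000.0, 3_015_500.0),
    (5_131_000.0, 3_015_500.0),
]
counts = count_per_cell(mills)
print(dict(counts))
```

Running the same aggregation separately for gristmills, sawmills and windmills in each period would give six count tables, matching the six grid layers listed above.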

Experimental Design, Materials and Methods
The data collection area is Galicia, 78,500 km², currently in Ukraine and Poland. At the end of the 18th century, the Polish-Lithuanian Commonwealth (the Crown of the Kingdom of Poland and the Grand Duchy of Lithuania) was partitioned among the Russian Empire, the Kingdom of Prussia and the Habsburg Monarchy (from 1867 the Austro-Hungarian Empire). Galicia covers the territories occupied by Austria, and its borders were shaped after the Congress of Vienna in 1815, the inclusion of the Republic of Kraków in 1846 and the exclusion of Bukovina in 1849. Within these borders (Fig. 4), Galicia existed as a crown land until World War I, and after Poland regained independence in 1918, it became part of Poland until World War II. After World War II, most of its territory remained within the borders of the Soviet Union (Ukrainian Soviet Socialist Republic), and since 1991 it has been part of independent Ukraine.
We used two sets of historical maps.
Map 1. The first source is the 1880 administrative map at a scale of 1:115,200 (Militärgeographisches Institut, MGI). Scans with a resolution of 600 dpi were obtained from the University of Vienna in TIFF format, without georeferencing. The original map projection is unknown, but the map was probably based on cadastral maps at a scale of 1:2880 in the Cassini-Soldner projection. We georeferenced the maps to the LAEA projection (EPSG 9820) in ArcMap 10.8 ("Georeferencing" tools). Geometric correction and georeferencing were performed using a 2nd-order polynomial transformation. For map sheets with only small amounts of coverage along the Galicia borderland, a 1st-order polynomial transformation was applied. Maps of the second military survey at a scale of 1:28,800 served as the basis for the georeferencing [17]. For the 53 map sheets covering Galicia, 2498 control points were used and the root mean square error was 87.01 m. During vectorisation, we found another type of mill that is not explained in the legend. At the locations of these mills, the second military survey maps show mill symbols described as ship mills [18].
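The 2nd-order polynomial transformation and the root mean square error reported above can be illustrated with a small Python sketch. It is a generic least-squares fit written for this illustration, not ArcMap's implementation, and the control points are hypothetical: the targets are generated from a known polynomial, so the fit is exact and the error is close to zero. A 2nd-order fit has six coefficients per axis and therefore needs at least six control points.

```python
import math

def terms(x, y):
    """Basis of a 2nd-order polynomial in two variables."""
    return [1.0, x, y, x * x, x * y, y * y]

def solve(a, b):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(src, dst_values):
    """Least-squares coefficients mapping source (x, y) to one target axis."""
    rows = [terms(x, y) for x, y in src]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * v for r, v in zip(rows, dst_values)) for i in range(n)]
    return solve(ata, atb)

def transform(coeffs, x, y):
    """Evaluate the fitted polynomial at a source point."""
    return sum(c * t for c, t in zip(coeffs, terms(x, y)))

def rmse(src, dst, cx, cy):
    """Root mean square of the residual distances at the control points."""
    sq = [(transform(cx, x, y) - tx) ** 2 + (transform(cy, x, y) - ty) ** 2
          for (x, y), (tx, ty) in zip(src, dst)]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical control points on a 3 x 3 grid with exactly polynomial targets.
src = [(float(x), float(y)) for x in range(3) for y in range(3)]
dst = [(100 + 10 * x + 0.5 * x * y, 200 + 10 * y + 0.2 * x * x) for x, y in src]

cx = fit(src, [tx for tx, _ in dst])
cy = fit(src, [ty for _, ty in dst])
error = rmse(src, dst, cx, cy)
print(f"RMSE: {error:.6f}")
```

With real control points the residuals are not zero, and the RMSE summarises how far the transformed points fall from their reference positions. For large projected coordinates, centring and scaling the inputs before fitting would keep the normal equations well conditioned.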
On the 1880 map, five types of mills were classified according to their purpose, raw material or product. Water mills are divided by construction method, and windmills by the type of propulsion. The Gristmill type is defined more broadly than the processing of cereals and refers to agricultural produce in general, e.g. the processing of tobacco (snuff mills) or hops (breweries).
Map 2. The second source for reconstructing the distribution of mills is the topographic map of the Military Geographical Institute (Tactical Map of Poland) at a scale of 1:100,000. The area of Galicia is covered by 109 sheets. The map was made in independent Poland, mainly in the 1930s, but sheets from the 1920s are also available for some areas. These maps were assessed as a reliable source of information in geographic and historical research on the period before World War II [19]. The map sheets were obtained from the website mapywig.org, which identifies the following sources of scans: the Jagiellonian Digital Library (for 103 sheets), the Library of Congress (for five sheets), and the Map Storehouse of the Earth Sciences Library of the University of Silesia (one sheet). The scans of the maps had a resolution of 600 dpi and were in .jpeg format. The original map projection was based on a unified quasi-stereographic projection with a central point at the Borowa Góra observatory. We georeferenced all the sheets based on Polish topographic maps at a scale of 1:25,000 from the 1970s and 1980s and on modern World Imagery high-resolution satellite images, which were especially helpful for the area of Ukraine. Georeferencing was performed using a 2nd-order polynomial transformation. The LAEA (EPSG 9820) was adopted as the reference system. 5167 checkpoints were used, and the RMS error was 25.24 m. According to the legend and technical instructions of this map, ten types of mills can be distinguished. Technical progress and electrification led to the emergence of new types of mills compared to 1880.
Data acquisition. Manual on-screen vectorisation of the mills was carried out in ArcMap 10.8 ("Edit" tool). A dot indicating the mill location was placed at the point of the main cartographic symbol. Where there was only an inscription denoting the mill, the dot was placed on the inscription or, if the inscription clearly referred to a building, on the building. A screen zoom of 1:4000–1:10,000 was used. The mill type attribute was assigned to each point from a previously prepared database domain, in accordance with the legend and map instructions.
We developed the grid layers so that mill locations can be compared between the two periods despite the different RMS errors of the two map sets. The grid layer, designed by the European Environment Agency (EEA) in the LAEA projection (EPSG 9820), was trimmed to the administrative boundaries of Galicia and joined spatially (Join Data based on spatial location) with the point layers of the mill locations. The Count option was used to count the points within each cell (a square with sides of 10 km).
Validation. Point data was verified in ArcMap topology ("Data Reviewer" tools) using the Duplicates rule.
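The idea behind a duplicates check can be sketched in a few lines of Python. This is a simple stand-in for illustration, not the ArcMap Data Reviewer rule itself; the tolerance parameter and the sample coordinates are assumptions introduced here.

```python
import math
from itertools import combinations

def find_duplicates(points, tolerance=0.0):
    """Return index pairs of points lying within `tolerance` of each other."""
    return [(i, j)
            for (i, p), (j, q) in combinations(enumerate(points), 2)
            if math.dist(p, q) <= tolerance]

# Hypothetical projected coordinates in metres.
pts = [(100.0, 200.0), (150.0, 260.0), (100.0, 200.0), (150.4, 260.3)]

exact = find_duplicates(pts)                # identical coordinates only
near = find_duplicates(pts, tolerance=1.0)  # also points within 1 m
print(exact, near)
```

The pairwise comparison is quadratic in the number of points, which is acceptable for a few thousand mill locations; a spatial index would be preferable for much larger layers.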
The number of mills we acquired can be compared with the studies of historians and statisticians, although data for the same period is not always available or the period is generalised. For example, 3439 mills were reported for districts in Galicia for 1886 [20] (Table 1), compared to 3797 mills vectorised by us for 1880. At the end of the 19th century, windmills accounted for 3% of all mills [21], compared to 6% according to our data. For the interwar period [21], windmills accounted for 4.6% of all mills, compared to 20.9% in our data; water mills accounted for 75%, compared to 71% in our data; and motor mills for 15.3%, compared to 3% vectorised from the maps.
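The size of these discrepancies can be expressed with simple arithmetic, sketched here in Python. The helper names are introduced for illustration only, and the input figures are those quoted above.

```python
def relative_difference(ours, reference):
    """Difference of `ours` from `reference`, as a percentage of `reference`."""
    return 100.0 * (ours - reference) / reference

def share_gap(ours_pct, reference_pct):
    """Gap between two percentage shares, in percentage points."""
    return ours_pct - reference_pct

# 3797 vectorised mills (1880) against 3439 mills in the 1886 statistics.
count_gap = relative_difference(3797, 3439)

# Interwar windmill share: 20.9% in the map data against 4.6% in [21].
windmill_gap = share_gap(20.9, 4.6)

print(f"count difference: {count_gap:.1f}%")
print(f"windmill share gap: {windmill_gap:.1f} percentage points")
```

The count comparison is relative to the statistical source, whereas the type shares are compared in percentage points because both values are already proportions of different totals.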