1 Introduction

A map is an abstract representation of the real world (Brewer et al. 2003; Goodchild 1999; Li et al. 2007; Chen et al. 2011; Petrovic 2003). Map symbols are usually regarded as the language of a map (Yamada 1993; Tao et al. 2007; Robinson et al. 2011; Che et al. 2013) and are used to represent real spatial phenomena (Qin et al. 2000; Stefanakis 2002; Tao et al. 2005; Dang et al. 2011; Schlichtmann 2004, 2009). Therefore, the symbolisation of spatial entities is a crucial procedure in cartography (Comentz 2002; Tsoulos et al. 2003). GIS provides powerful mapping capabilities and functions for spatial data management (Frehner and Brandli 2006; Gustavvson et al. 2006; Cheng and Zhang 2012; Zou et al. 2012) and is increasingly used in a wide range of applications. However, the same spatial object may be symbolised differently in different mapping standards. For instance, several map standards are used in the petroleum industry, namely the Dutch Shell standard, the United States Geological Survey (USGS) standard and the Chinese Petroleum standard, and the “oilfield” object is symbolised differently in each (Fig. 1). The coexistence of different mapping standards hinders spatial data exchange and information sharing, and it makes map reading less efficient for users.

Fig. 1 Different expressions of an “oilfield” object under different mapping standards

In practical work, the maps in use come from various sources and follow different mapping standards (Qin et al. 2000; Brewer et al. 2003; Gustavvson et al. 2006) (hereafter, “map” refers to a vector map made on a GIS platform, in *.mxd format). To ensure consistency of the information contained in such maps, it is necessary to convert them to a single specified mapping standard (e.g. from the USGS standard to the Chinese Petroleum standard, or vice versa). Converting a map between standards is in fact converting the symbols used therein. In theory, each symbol in a map has its own specific code or name (Tao et al. 2005; Nass et al. 2011; Fan et al. 2011; Robinson et al. 2011), and it can be converted conveniently if the corresponding codes or names of the symbols are known. In practice, however, only the shape style and rendering information of each symbol are preserved in the map, while the symbol code and name are absent; thus, automatic batch conversion of a vector map cannot rely on symbol encoding information. Therefore, the first question in symbol standard conversion is how to identify the symbols within the map. To answer this question, scholars have carried out a number of studies on symbol recognition and proposed several technical solutions, including statistical and structural methods, template matching, neural networks, line tracing and mathematical morphology (Yamada 1993; Zeng 2003; Llados et al. 2001a, b, 2002; Yang 2005; Liu et al. 2007; Wan and Liu 2007; Guo et al. 2012; Xie and Zhang 2014). However, the principles of these methods are complex, which makes them difficult to implement. In addition, these methods place high demands on the symbols to be recognised; hence, only a few symbols can be identified, and recognition efficiency is too low for real-time use.
Other scholars have focused on symbol data structures and have established symbol description models based on technologies such as XML, GML, SVG and TrueType in order to design universal map symbols (Yin et al. 2004; Tao et al. 2005; Antoniou and Tsoulos 2006; Mihalynuk 2006; Qin et al. 2008; Li et al. 2009; Chen et al. 2014). In theory, these approaches provide favourable conditions and inspiration for symbol recognition; however, that research discusses only the description and design of symbols, and no recognition model or method has yet been established. So far, no research on the automatic recognition and conversion of symbols has been reported. Currently, the symbols within a map can only be converted manually, one by one, which is laborious and time consuming and requires the operator to understand the meaning of each symbol in advance. Otherwise, errors in symbol expression arise, and the map no longer conveys its message faithfully. To accomplish symbol conversion rapidly and accurately, this work proposes an approach based on symbol recognition and automatic matching and develops a conversion system that integrates this approach with the ArcGIS Engine platform; the system aims to realise automatic batch conversion of the symbols within vector maps. In addition, as a case study, a practical map is converted among three standards frequently adopted in the petroleum industry (the Chinese Petroleum, Shell and USGS standards) to illustrate the conversion accuracy and applicability of the system.

2 Basic principles of symbol recognition and automatic conversion

2.1 Recognition of symbols in a map

Because symbols are graphical representations of spatial entities, automatic conversion of symbols between different standards can be realised as long as the symbol codes (names) that correspond to the same spatial entity in the different mapping standards are known (Stefanakis 2002; Schlichtmann 2004). Based on this idea, a correspondence between symbol codes (names) in different mapping standards can be established and then used to convert symbols between the standards. If the symbol for a given spatial entity is known, the corresponding symbol in the target mapping standard can be found through this correspondence. However, the symbol codes (names) are usually absent from maps made on common GIS platforms; only the symbol styles and rendering information are stored. The symbol libraries of GIS platforms, by contrast, store the symbol codes (names) as well as the symbol styles and rendering information. Therefore, the critical problem is to identify the specific symbols of spatial entities in a map and match them with the symbols stored in the symbol library in order to obtain their codes (names). To recognise a symbol, a pixel-by-pixel matching approach is proposed: the symbols in a map and those in the symbol library of the same mapping standard are first rendered as BMP images, and the symbol images from the map are then matched against the images from the library to find the correct symbol.

Table 1 shows an example of the recognition of the “oilfield” symbol in the USGS mapping standard.

Table 1 Symbol matching and recognition

2.2 Automatic conversion of symbols

Based on the mapping standard specifications and symbol libraries of a specific industry, the correspondence for each spatial entity is set up through a bidirectional mapping of the symbol codes (names) in the different mapping standards, so that the symbol codes (names) serve as keys that connect the standards (Table 2). Given the identified symbol codes (names) and this correspondence, the corresponding symbol codes (names) in the other standards can be found, and the symbols that meet the target standard can then be obtained from the target symbol library. The symbols in the map being converted are replaced by the target symbols to accomplish the conversion; accurate symbol recognition is therefore crucial to the success of any conversion.

Table 2 Associated symbol mapping table

3 Approach of symbol recognition and automatic conversion

According to the aforementioned principles, automatic symbol conversion proceeds in three stages: construction of the symbol association tables, symbol matching and recognition, and symbol standard conversion. Firstly, the correspondence between symbols in different mapping standards is constructed according to the symbol codes (names). Secondly, each symbol used in a map is matched with the symbols stored in the symbol library that belong to the same standard as the map and the same category as the symbol, so that the code (name) of each symbol can be acquired. Finally, based on the correspondence of symbols across mapping standards, the obtained codes (names) are used as keys to find the corresponding target symbols, which are then used to symbolise the spatial entities concerned. The “oilfield” symbol used in a USGS standard map is taken as an example to illustrate the implementation (Fig. 2); the other map symbols undergo the same type of conversion.

Fig. 2 Flowcharts for symbol recognition and automatic conversion

3.1 Construction of symbol association tables

Constructing the symbol association tables for the different mapping standards is the basis of symbol standard conversion. Each table is a bidirectional mapping between the symbols of different mapping standards, that is, between the different symbolisations of the same spatial entity, and the symbol codes (names) are the unique keys with which the entries are associated. The tables are stored in the database according to the structure outlined in Table 2. To improve query efficiency, three tables are established, one each for point, polyline and polygon spatial entities.
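As an illustration, the three association tables and a lookup in either direction could be sketched as follows. This is a minimal sketch only: it uses an in-memory SQLite database as a stand-in for the Oracle database, and every table name, column name and symbol code in it is hypothetical, since the actual layout of Table 2 is not reproduced here.

```python
import sqlite3

# All table names, column names and codes below are hypothetical.
CATEGORIES = ("point", "polyline", "polygon")
STANDARDS = ("chinese_petroleum", "shell", "usgs")


def build_association_tables(conn, rows_by_category):
    """Create one association table per symbol category and fill it."""
    for category in CATEGORIES:
        conn.execute(
            f"CREATE TABLE assoc_{category} ("
            "entity TEXT PRIMARY KEY, "
            "chinese_petroleum TEXT UNIQUE, shell TEXT UNIQUE, usgs TEXT UNIQUE)"
        )
        conn.executemany(
            f"INSERT INTO assoc_{category} VALUES (?, ?, ?, ?)",
            rows_by_category.get(category, []),
        )
    conn.commit()


def lookup_code(conn, category, source_std, target_std, source_code):
    """Map a code in source_std to the code of the same entity in target_std."""
    # Identifiers are interpolated into the SQL text, so only known names pass.
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    if source_std not in STANDARDS or target_std not in STANDARDS:
        raise ValueError("unknown mapping standard")
    row = conn.execute(
        f"SELECT {target_std} FROM assoc_{category} WHERE {source_std} = ?",
        (source_code,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
build_association_tables(conn, {
    "point": [("oilfield", "CN_PT_001", "SH_PT_017", "US_PT_042")],
    "polyline": [("oil pipeline", "CN_LN_003", "SH_LN_009", "US_LN_011")],
})

print(lookup_code(conn, "point", "usgs", "chinese_petroleum", "US_PT_042"))
print(lookup_code(conn, "point", "chinese_petroleum", "usgs", "CN_PT_001"))
```

Because each row holds the codes of one spatial entity in all three standards, the same table serves every conversion direction, and keeping one table per category means a query only scans symbols of the relevant geometry type.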

3.2 Automatic matching and recognition of symbols

Automatic symbol matching and recognition is the key to the scheme designed in this work. The purpose of this stage is to obtain the code (name) of the symbol from the USGS standard symbol library. To enhance query and matching efficiency, the symbol is matched only against library symbols of the same category. The procedure is as follows:

(1) Analyse the symbol to obtain its information. The “oilfield” symbol is analysed by a computer program to obtain the symbol style (e.g. shape) and rendering information (e.g. size, fill colour, texture), which vary with the symbol category.

(2) Generate a BMP image of the symbol. Based on the style and rendering information from Step (1), a BMP image of an appropriate size is drawn. The image should be larger than the symbol itself, but not so large as to reduce matching efficiency. In this work, the size is set to 100 × 100 pixels.

(3) Generate BMP images of the symbols in the USGS standard symbol library. Firstly, search the symbol library for the symbols in the same category as the “oilfield” symbol: if it is a point symbol, retrieve all the point symbols from the standard symbol library; if it is a polyline symbol, retrieve all the polyline symbols; and so on. Secondly, generate BMP images in the same way as in Step (2), according to the style and rendering information from Step (1).

(4) Match the images pixel by pixel and recognise the symbol. Match the image generated in Step (2) against each image obtained in Step (3) by pixel matching. Once a satisfactory match is found, the code (name) of the “oilfield” symbol in its standard symbol library is obtained from the matched result.
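The four steps above can be illustrated with a toy example. The sketch below is not the system's implementation: it replaces the GIS platform's drawing routines with a small pure-Python rasteriser that only knows two shapes, it represents an image as a 100 × 100 grid of RGB tuples rather than a BMP file, and it reads Step (3) as drawing each library symbol with the rendering information obtained in Step (1). The `Rendering` class, the shape names and the symbol codes are all hypothetical.

```python
from dataclasses import dataclass

IMAGE_SIZE = 100              # pixels per side, as stated in Step (2)
BACKGROUND = (255, 255, 255)  # assumed white background


@dataclass(frozen=True)
class Rendering:
    """Hypothetical rendering information obtained in Step (1)."""
    size: int    # diameter or side length in pixels
    fill: tuple  # RGB fill colour


def render(shape, rendering):
    """Steps (2) and (3): draw a centred shape on a 100 x 100 pixel grid."""
    if shape not in ("circle", "square"):
        raise ValueError(f"unsupported shape: {shape}")
    centre, half = IMAGE_SIZE / 2, rendering.size / 2
    image = []
    for y in range(IMAGE_SIZE):
        row = []
        for x in range(IMAGE_SIZE):
            # Offset of the pixel centre from the image centre.
            dx, dy = x + 0.5 - centre, y + 0.5 - centre
            if shape == "circle":
                inside = dx * dx + dy * dy <= half * half
            else:
                inside = abs(dx) <= half and abs(dy) <= half
            row.append(rendering.fill if inside else BACKGROUND)
        image.append(row)
    return image


def pixels_match(image_a, image_b):
    """Step (4): two equally sized images match only if every pixel is equal."""
    return all(
        pixel_a == pixel_b
        for row_a, row_b in zip(image_a, image_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognise(map_shape, rendering, library_shapes):
    """Return the codes of all library symbols whose image matches the map symbol."""
    target = render(map_shape, rendering)
    return [
        code
        for code, shape in library_shapes.items()
        if pixels_match(target, render(shape, rendering))
    ]


# Hypothetical USGS point-symbol library: code -> shape style.
usgs_point_library = {"US_PT_042": "circle", "US_PT_050": "square"}
oilfield_rendering = Rendering(size=40, fill=(0, 128, 0))

matches = recognise("circle", oilfield_rendering, usgs_point_library)
print(matches)
```

Returning a list rather than a single code keeps the case visible in which several library symbols produce the same image; only a list of length one identifies the symbol unambiguously.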

3.3 Conversion of symbol standard

According to the symbol code (name) obtained in the automatic matching and recognition stage, the corresponding target symbol code (name) can be found by querying the relationship table. To improve query efficiency, the relationship table is chosen according to the category of the “oilfield” symbol: if the symbol is a point symbol, the point relationship table is searched; if it is a polyline symbol, the polyline relationship table is searched; and so on. Based on the target symbol code (name), the target symbol is found in its standard symbol library and rendered using the relevant information for the “oilfield” symbol. The rendered symbol is then used to visualise the corresponding spatial entity, which completes the symbol conversion. Each symbol used in the map can be converted in this way, so that conversion between different mapping standards is realised.
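The conversion stage can be sketched as a loop over the symbols of a map. The example below is a simplified illustration under several assumptions: plain dictionaries stand in for the relationship tables and the target symbol library, the “relevant information” used for rendering is taken to be the rendering information of the original map symbol (an interpretation, not something the text states), and all layer names, codes and style descriptions are hypothetical.

```python
def convert_symbols(map_symbols, association, target_library):
    """Replace recognised source symbols by their target-standard symbols.

    map_symbols: list of dicts with "layer", "category", "code" (None if the
        symbol was not recognised) and "rendering" keys.
    association: category -> {source code: target code}.
    target_library: target code -> symbol style.
    """
    converted, pending = {}, []
    for symbol in map_symbols:
        # Search only the relationship table of the symbol's own category.
        table = association.get(symbol["category"], {})
        target_code = table.get(symbol["code"])
        if target_code is None or target_code not in target_library:
            pending.append(symbol["layer"])  # left for manual matching
            continue
        converted[symbol["layer"]] = {
            "code": target_code,
            "style": target_library[target_code],
            "rendering": symbol["rendering"],  # carried over from the map symbol
        }
    return converted, pending


# Hypothetical USGS -> Chinese Petroleum relationship tables, one per category.
usgs_to_chinese = {
    "point": {"US_PT_042": "CN_PT_001"},
    "polyline": {"US_LN_011": "CN_LN_003"},
    "polygon": {},
}
chinese_library = {"CN_PT_001": "oilfield marker", "CN_LN_003": "oil pipeline line"}
map_symbols = [
    {"layer": "oilfields", "category": "point", "code": "US_PT_042",
     "rendering": {"size": 12}},
    {"layer": "pipelines", "category": "polyline", "code": "US_LN_011",
     "rendering": {"width": 2}},
    {"layer": "basins", "category": "polygon", "code": None,
     "rendering": {"fill": "grey"}},
]

converted, pending = convert_symbols(map_symbols, usgs_to_chinese, chinese_library)
print(converted["oilfields"]["code"], pending)
```

Symbols that were not recognised, or whose code has no entry in the relationship table, are collected separately instead of being dropped, so that they can be handled by manual matching afterwards.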

4 Verification of the symbol recognition and automatic conversion approach

4.1 Basic data preparation

Three mapping standards are used in the petroleum field: the Chinese Petroleum standard, the Dutch Shell standard and the United States Geological Survey (USGS) standard. Therefore, the conversion of a practical map from the petroleum field among these standards was selected as the case study to test the conversion accuracy and working efficiency of the system developed in this work. The basic data for the conversion consist mainly of the mapping specifications and symbol library files of the three standards, the symbol association tables, and a map based on one of the standards. The mapping specifications, the symbol library files (*.style format) and a map based on the Chinese Petroleum standard were provided by the Research Institute of Petroleum Exploration and Development (China). The symbol library files were converted into *.serverstyle files, which can be read by the ArcGIS Engine platform, and these files were stored in a BLOB field of an Oracle database. According to the three symbol libraries and the different symbol codes (names) of the same spatial entity in the three standards, the symbol association tables were constructed in the Oracle database. With the library files (*.serverstyle) and the association tables in place, the map can be converted among the three standards.

4.2 Verification of results

A map based on the Chinese Petroleum standard was selected for conversion to equivalent maps based on the Shell and USGS standards (Fig. 3). The results show that the conversion is rapid and that most of the symbols on the map are converted accurately, e.g. point symbols such as oilfield, gasfield, provincial capital, capital city and ocean; polyline symbols such as national borders and gas and oil pipelines; and polygon symbols such as oil basins. Only the point symbol for a local city and the polygon symbols for an oil field and a small basin are not converted into their corresponding symbols (Table 3). Compared with the methods reviewed in Section 1, the proposed technical solution greatly improves both conversion accuracy and working efficiency and can consequently satisfy real-time demands. From the analysis of the results and the conversion principle, it can be deduced that a symbol can be matched and recognised automatically if its style is unique, that is, if no other symbol in its symbol library has the same style. An analysis of all the symbols in the three standard symbol libraries shows that most of the point and polyline symbols have distinct styles, while some polygon symbols share the same style and differ only in size or colour. Therefore, most of the point and polyline symbols, and some polygon symbols, can be automatically matched and recognised by the conversion system. After the map has been converted, human intervention is needed to match the few symbols that were not converted automatically and thereby complete the conversion. Because purely manual matching is laborious and error-prone, the library symbols that fully match an unrecognised symbol can be displayed in a list, so that the cartographer can choose the appropriate symbol to complete the matching and conversion.
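The uniqueness condition described above can be expressed as a small decision rule: a symbol is converted automatically only when exactly one library symbol matches it, and otherwise the full list of matching candidates is handed to the cartographer. The following sketch illustrates that rule with tiny 3 × 3 stand-ins for rendered images; the function names, symbol codes and images are hypothetical and are not taken from the actual symbol libraries.

```python
def match_candidates(query_image, library_images):
    """Return every library code whose rendered image equals the query image."""
    return [code for code, image in library_images.items() if image == query_image]


def classify(query_image, library_images):
    """Decide whether a map symbol can be converted without human input."""
    candidates = match_candidates(query_image, library_images)
    if len(candidates) == 1:
        return "automatic", candidates
    if candidates:
        return "manual", candidates  # list shown to the cartographer
    return "unmatched", []


# Tiny 3 x 3 stand-ins for rendered symbol images ("#" marks a filled pixel).
HATCHED = ("#.#", ".#.", "#.#")
SOLID = ("###", "###", "###")
EMPTY = ("...", "...", "...")

# Hypothetical polygon library in which two codes render identically.
polygon_library = {
    "PG_OIL_FIELD": HATCHED,
    "PG_SMALL_BASIN": HATCHED,
    "PG_OIL_BASIN": SOLID,
}

print(classify(SOLID, polygon_library))
print(classify(HATCHED, polygon_library))
```

In this toy library the solid symbol is recognised automatically, whereas the hatched image matches two codes and is therefore passed on for manual selection.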

Fig. 3 Conversion results for the map based on the Chinese Petroleum standard. a Chinese Petroleum standard map. b Shell standard map. c USGS standard map

Table 3 Conversion results for map symbols based on the Chinese Petroleum standard

5 Discussion and conclusions

5.1 Discussion

The conversion system makes it possible to convert vector maps automatically from one mapping standard to another, so that the information contained in the converted map can be obtained rapidly, accurately and efficiently. In addition, practical operation demonstrates the feasibility of the proposed method and the reliability of the conversion system. Nevertheless, during symbol matching and recognition, some symbols (especially polygon symbols) that have the same style but different sizes cannot be identified automatically, and manual intervention is needed. The meanings of such symbols should therefore be clarified in advance, so that the corresponding target symbols can be chosen and the conversion completed. However, in some mapping standards, the meanings of symbols cannot be obtained from their codes (names), and manual intervention is essential. To choose the appropriate match for such symbols quickly and accurately, the following solution is suggested:

For each mapping standard in the field, a symbol table can be established that records the meanings of the symbols in the standard. When manual operation is needed during matching, the thumbnail images, names and meanings of the symbols that fully match the symbol being converted can be obtained by querying the symbol table of the appropriate standard and displayed in a message box (Fig. 4). To enhance query efficiency, three symbol tables can be established for each mapping standard, one per symbol category (point, polyline and polygon).
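A possible shape for such symbol tables is sketched below, purely as an illustration. The `SymbolInfo` record, the per-category dictionary layout, the thumbnail paths and all codes, names and meanings are hypothetical; a real system would query the database tables of the relevant standard instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolInfo:
    """One hypothetical row of a symbol meaning table."""
    name: str
    meaning: str
    thumbnail: str  # path to a thumbnail image (hypothetical)


def describe_candidates(symbol_tables, category, codes):
    """Look up prompt-window entries for the candidate codes of one category."""
    table = symbol_tables[category]
    return [(code, table[code]) for code in codes if code in table]


# Hypothetical meaning tables for one standard, one table per category.
symbol_tables = {
    "point": {},
    "polyline": {},
    "polygon": {
        "PG_OIL_FIELD": SymbolInfo("Oil field", "Extent of an oil field",
                                   "thumbs/pg_oil_field.bmp"),
        "PG_SMALL_BASIN": SymbolInfo("Small basin", "Extent of a small basin",
                                     "thumbs/pg_small_basin.bmp"),
    },
}

entries = describe_candidates(
    symbol_tables, "polygon", ["PG_OIL_FIELD", "PG_SMALL_BASIN"]
)
for code, info in entries:
    print(f"{code}: {info.name} ({info.meaning}) [{info.thumbnail}]")
```

Each returned entry carries the three items the prompt window needs (thumbnail, name and meaning), so the cartographer can compare the candidates without consulting the standard's specification.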

Fig. 4 Symbol information prompt window for manual operation

From the above discussion, the proposed technical solution is best classified as semi-automatic conversion. In future work, research into symbol analysis will be carried out to further improve and optimise the automatic symbol recognition method, especially for polygon symbols, so that the conversion of maps between different mapping standards can be realised more rapidly and accurately.

5.2 Conclusions

A novel approach to symbol recognition and automatic conversion is proposed. Following this approach, a conversion system for vector map standards was developed on the ArcGIS Engine platform in the Visual Studio 2010 integrated development environment, using component object model (COM) technology. The applicability and conversion accuracy of the system were analysed using, as a test case, the conversion of a practical map based on the Chinese Petroleum standard among the three standards frequently adopted in the sector. Based on this analysis, the following conclusions are drawn:

(1) The conversion system has high applicability and universality. Besides the petroleum field, the system can also be applied to other fields that have different mapping standards, such as electricity, architecture, traffic and engineering. As long as each mapping standard specification and each symbol library file are known, the symbol association tables can be constructed in the database. Based on the tables and the symbol libraries, the conversion system can be used to convert vector maps between different mapping standards. Additionally, practical operation shows that the system is well encapsulated and is easily maintained and extended.

(2) Most of the symbols in the map could be converted automatically by the conversion system, which therefore has high conversion accuracy. The quality and efficiency of vector map conversion between different standards have been improved, and the previous challenge facing vector map standard conversion has been overcome.

(3) Some symbols in a symbol library, especially polygon symbols, may have the same style but different sizes or colours. These symbols cannot be automatically matched and identified. In these cases, human intervention is needed to choose the appropriate symbol match manually. This is the main reason for the occasional incomplete conversion when using this system.