A CONCEPTUAL MODEL FOR THE REPRESENTATION OF LANDFORMS USING ONTOLOGY DESIGN PATTERNS

A landform is an area of a terrain with its own recognisable shape. Its definition is often qualitative and inherently vague. Hence landforms are difficult to formalise in view of their extraction from a DTM. This paper presents a two-level framework for the representation of landforms. The objective is to provide a structure where landforms can be conceptually designed according to a common model which can be implemented. It follows the principle that landforms are not defined by geometrical characteristics but by salient features perceived by people. Hence, these salient features define a skeleton around which the landform is built. The first level of our model defines general concepts forming a landform prototype while the second level provides a model for the translation of these concepts and landform extraction on a DTM. The model is still under construction and preliminary results together with current developments are also presented.


INTRODUCTION
Landforms are defined as "any physical feature of the Earth surface having a characteristic, recognisable shape" (MacMillan and Shary, 2009). Although such a definition is quite clear and the meaning expressed by landforms is commonly understood by humans, landforms are difficult to formalise in a logical model that can be implemented. Their description is often qualitative, using fuzzy terms, and is not unique as it depends on people's perception, domain of expertise and cultural background.
According to (Deng, 2007), landform classification falls into two groups: on one hand set theory where components are morphometric points (more often than not pixels) yielding a segmentation of the terrain, and on the other hand category theory where landforms are identified as objects. On topographic maps, landforms are mainly qualitative objects which are not explicitly portrayed on the map but interpreted by the map reader. The reader will look for salient features on the map that characterise these landforms. Currently, the problem is mainly tackled by defining dedicated generalisation methods related to a type of map, such as the spot height selection method of (Palomar-Vázquez and Pardo-Pascual, 2008) for recreational topographic maps and isobath smoothing methods (Guilbert and Saux, 2008, Peters et al., 2014) for nautical charts.
A first step to move towards the automatic classification of landforms is to provide a conceptual description of these landforms for their instantiation on the map. However, landforms do not correspond to crisp areas of the terrain and the uncertainty of their boundaries is still a modelling issue (Smith and Mark, 2003). Their description cannot be quantitative and may be instead qualitative as they can be represented in multiple ways according to the user's understanding and the type of representation. The problem is often tackled by developing a domain ontology formalising landform definitions. But such an ontology would be specific to a given representation. Therefore, we propose a framework where landforms are described at two levels. At the conceptual level, landforms are defined from concepts designed in a common landform prototype. At the representation level, the properties from the conceptual level are translated into geometrical and topological properties that can be implemented according to the required type of representation (e.g. raster or vector map).
This paper contributes to landform classification and multiple representation of terrain models by presenting a framework for qualitative description of landforms which could be used for data enrichment (where landforms can be added as objects in the topographic database) and for spatial qualitative reasoning where landforms can be described and represented according to a purpose or a context. The remainder of the paper is organised as follows. Section 2 reviews recent works on qualitative aspects of landform representation, including landform definitions and ontologies. The new framework is proposed in Section 3. It is divided into a conceptual level, where the landform prototype is introduced, and a representation level where landform concepts are translated following existing data structures. The last two sections present preliminary results and discuss further developments.

Qualitative description of landforms
Following (Strobl, 2008), terrain segmentation methods are traditionally data-centric approaches while the object perception is based on semantic-centred concepts with a strong link between visual perception of landforms and natural language. Landforms are usually associated with salient terrain features and not with their boundaries which are not always well-defined. For example, the presence of a mountain is easily associated with the existence of a peak significantly higher than its surroundings but there is no consensual definition of its spatial extent or of the difference between a hill and a mountain.
Semantic concepts describing landforms are usually fuzzy and difficult to conceptualise although the meaning they express is commonly understood by humans. This gap is addressed by (Mark and Smith, 2004) as the qualitative-quantitative divide. Furthermore, landforms are not perceived in the same way and it is not possible to provide a common set of landforms with universal definitions since the meaning of each term depends on the perception of the readers, which is related to their cultural background, past experience and the current context.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume III-2, 2016 XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic
This problem is illustrated in (Harris et al., 2014) who provide a classification of the ocean geomorphology from a grid model of the ocean floor. For each type of feature considered, different definitions such as for sills and basins may be provided according to the area of interest. As an example, submarine canyons are not defined in the same way on the west coast and the east coast of North America: while a canyon can "extend over a depth range of at least 1000 m and to be incised at least 100 m into the slope at some point along their thalweg" (Harris et al., 2014), in the St-Lawrence estuary, canyons are much shorter, with a depth below 300 m (Normandeau et al., 2015).
The definition can also evolve within a domain. In its standardisation of undersea feature names, the International Hydrographic Organisation defined in 2008 a caldera as "a collapsed or partially-collapsed seamount, commonly of annular shape" (IHO, 2008). In its later definition (IHO, 2013), a caldera is "a roughly circular, cauldron-like depression generally characterized by steep sides and formed by collapse, or partial collapse, during or following a volcanic eruption". While the feature described is the same, the definitions rely on the interpretation of different terms and so on different implicit knowledge. The second definition also gives more importance to genetic implications, requiring some geological or geophysical evidence.
Indeed, a main difficulty is that, as defined in naive geography (Egenhofer and Mark, 1995), "while many spatial inferences may appear trivial to us, they are extremely difficult to formalise so that they could be implemented on a computer system". Among the different elements of naive geography taken from (Egenhofer and Mark, 1995), some require specific attention for the definition of landforms:
• Geographical information is frequently incomplete. People can reason and compensate for missing information. As said above, landforms are perceived from their salient features without a complete spatial description. Landform representation also includes inferences from thematic properties (e.g. geomorphological processes) and implicit knowledge.
• People use multiple conceptualisations of the geographical space. These conceptualisations come from differences between cognitive spaces as perceptions vary with individuals. They may also relate to a context: a submarine canyon is not conceived in the same way by a geomorphologist (who sees it as the result of a geomorphological process) and by a fisherman (who sees it as a potential fishing area). They also depend on the scale at which the observation is carried out.
• Geographical space has different levels of detail; these levels can be levels of granularity or levels of scale at which phenomena are represented. Levels of representation are defined by the user's context and the purpose of the representation. Granularity in landforms is expressed in taxonomies yielding general and specialised landforms usually organised in a lattice. For example, the mountain and hill concepts can be defined as two specialisations of a prominence concept.

Landform ontologies
A solution to address qualitative reasoning and description of landforms is the use of ontologies to provide conceptual definitions tractable by a computer system. Much work focused on domain ontologies characterising specific landforms, for example valleys (Straumann and Purves, 2011), bays (Feng and Bittner, 2010) and prominences (Sinha and Mark, 2010). They define for each landform geometrical variables that can be measured from a map or a terrain model. However, these variables are specific to each landform category where a specific context was identified previously and cannot be generalised into a common framework.
National mapping agencies have worked on the development of ontologies describing cartographic objects (Gómez-Pérez et al., 2008). However, these ontologies focus on data integration from different sources and do not provide a formal description for reasoning. In the hydrographic domain, (Yan et al., 2014) define an ontology of undersea features following the International Hydrographic Organisation terminology (IHO, 2008) according to the framework defined by (Fonseca, 2001). Its purpose is to allow for the automatic classification of undersea features on nautical charts. It is divided into a domain ontology which describes undersea features from the IHO nomenclature by a series of shape properties and topological relationships, and a representation ontology where features are elements of the chart as portrayed by isobaths and soundings. The set of undersea features is organised into a taxonomy providing descriptions at different levels of granularity. (Yan et al., 2014) explicitly separate the representation from the definition, but feature definitions are based on glosses from the IHO with ambiguities from natural language definition and where implicit knowledge is not expressed. Both ontologies are defined for specific contexts and modifying the context requires the definition of new ontologies.
In order to facilitate the development of such ontologies, a framework should be provided so that ontologies can be generated following a common pattern. Therefore, the objective of this paper is to propose a conceptual framework that helps construct landform ontologies from a generic landform prototype that can be categorised according to the context.

Overall view
The proposed framework is based on the fact that landform definitions depend on the context including the user's field of expertise and the purpose of the representation. Therefore, each domain ontology of landforms as observed in the previous section does not provide an absolute description of landforms but a representation associated with a frame of reference within which the description is used.
We propose a framework defined in two levels. First, the conceptual level describes the main concepts structuring landforms and the context. Landforms are derived from a landform prototype which is defined as an Ontology Design Pattern (ODP), a small and easily reusable ontology (Gangemi and Presutti, 2009). Elements specifying the context define a frame of reference that characterises the type of representation in a similar approach to map generalisation where map specifications are inferred from user requirements (Balley et al., 2014). Second, the representation level introduces concepts required to represent the landforms on a DTM relying on the inherent topological structure of the DTM. The objective of this framework is to move towards a model allowing first for the generation of domain ontologies where a lattice of landforms can be designed from a context and second for the instantiation of these ontologies. The framework is summarised in Figure 1 and its main characteristics are discussed in the following section.
Figure 1: The framework with landform description at conceptual and representation levels.

Context and domain knowledge
The objective of the framework at the conceptual level is to define a conceptual model describing all the landforms to be considered. Landforms are obtained by abstracting the knowledge defined in the domain knowledge and the context and are specialisations of the landform prototype from the ODP. The result is a landform lattice forming a concept hierarchy where the landform prototype is the root.
The domain knowledge contains the terminology specific to a community of users. Depending on the level of expertise, it can be a terminology agreed upon and used by domain experts or can be taken from common language definitions related to a purpose. As an example, in the maritime community, the terminology put forward in (IHO, 2013) can be used as a reference as it has been agreed upon by experts. In other domains, specifically those addressing non-expert users, the terminology may come from a corpus of terms collected from users or from definitions from resources such as Geowordnet (Giunchiglia et al., 2010) which provides a series of concepts reducing the ambiguity of natural language definitions.
The context relates to the purpose and user profile. The purpose is related to the task the representation is designed for, narrowing the domain and fixing the level of expertise. The user profile shall include the cultural context such as the language or cultural background of the user. Setting the context defines the frame of reference in which the representation is done, providing knowledge on the information content (Lüscher et al., 2007). It includes the different levels of detail (or scale) at which landforms are described. Scale can refer to different terms describing spatial data characteristics. For (Dungan et al., 2002), scale includes the resolution of the observation, the grain and the cartographic ratio. We think that these terms need to be instantiated from the context. Granularity also relates to the map purpose and depends on the user's level of expertise and on the language used.

The landform ODP
As mentioned in the previous section, although landforms are not clearly delineated, they are characterised by salient features which are perceived by people, and hence reveal their existence. In order to provide a formal description that applies to different contexts, a landform prototype is defined in an ontology design pattern as a reusable ontology. Two kinds of landforms are considered: elementary and complex landforms; hence two prototypes sharing similar properties are defined.
Figure 2: Location of a landform in a regional partition (adapted from (Bittner, 1999)). Labels: vague landform, core region, wide boundary.
Elementary landforms are defined by their salient features only. For example, a mountain is characterised by its summit while a canyon is located on the map by its course line. We consider that these salient features form the skeleton of the landform: they are intrinsic structural components on which landforms lie and whose definition agrees with the principle of naive geography. Skeletons are mainly points or lines which would correspond to topographical features such as summits or ridge lines. They can also be lines defining a break of slope or delineating a homogeneous area as in a cirque or in a plateau where the skeleton can be a ring surrounding the flat upland part.
Complex landforms are not characterised by their saliences but by a specific arrangement observed over a terrain. This is mostly the case for compound groups of landforms such as mountain ranges whose existence depends on the existence of several individual mountains. Mountains are characterised by their summits and connected by their ridge lines. Hence, a skeleton is also defined in these landforms as it provides a topological structure supporting the landform from which shape characteristics can be extracted such as the orientation. Skeletons also provide the support for a topological structure connecting landforms together and allowing for further reasoning based on the spatial configuration they provide.
However useful skeletons can be in landform characterisation, they are not sufficient for a full description of landforms. People think mostly about space in terms of regions rather than points and lines (Hobbs et al., 2006). They would not locate the summit or the course line as a point or a line but as regions built around these elements. Hence these salient features are perceived as salient regions built around the landform skeletons. A salient region does not cover the whole landform but only a part of it. The remainder of the landform belongs to the vague region where the boundary is located. As a way to handle vagueness and indeterminacy of locations, (Bittner, 1999) located vague objects within a partition in three regions: the core, the wide boundary and the exterior. These three regions are used to provide the rough location of a landform (Figure 2). The wide boundary does not correspond to a fuzzy boundary but rather to a region in which the boundary is included but whose location is not known.
The core concepts of our landform ODP are summarised in Figure 3. The skeleton is defined by a geometrical shape and by spatial constraints. Constraints reflect spatial properties and relationships that apply to the skeleton which need to be expressed in a formal language. The core region and wide boundary of a complex region is not the union of its composing elementary [...]
As landforms can be characterised by their spatial relationships with other landforms, a set of elementary relationships needs to be defined in the ODP. Similarly to landforms, new relationships shall be defined by refining the definitions based on the domain requirements. These relationships are defined from the spatial relationships between their components, i.e. relationships between skeletons and between core regions and wide boundaries.
Due to the vagueness of their definition, the basic set of relationships applied to core regions and wide boundaries is limited to the RCC-5 set (Cohn et al., 1997). For a landform $A$, we denote $A^{\circ}$ its core region, $\partial A$ its wide boundary and $\bar{A} = A^{\circ} \cup \partial A$ the area occupied by both its core region and wide boundary. For two landforms $A$ and $B$ we identify the following three relationships:

$$A \text{ is fully contained within } B \iff \bar{A} \text{ is a proper part of } B^{\circ} \qquad (1)$$

[...]

These three relationships provide the core set of relationships defined in our ODP from which further relationships can be derived. Derivation can be done by composing or by specialising the relationships, but also by including contextual elements and renaming the relationships according to the domain.
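As an illustration of how the core region, the wide boundary and relationship (1) might be encoded, the following minimal sketch represents regions as sets of DTM cells. The names `Landform`, `Cell`, `extent` and `fully_contained` are hypothetical and not part of the ODP; the disjointness check between core and wide boundary is an assumption made for the sketch, and only relationship (1) is covered.

```python
from dataclasses import dataclass

Cell = tuple[int, int]  # hypothetical (row, col) index of a DTM cell


@dataclass(frozen=True)
class Landform:
    """Rough location of a landform: a core region plus a wide boundary."""
    name: str
    core: frozenset[Cell]
    wide_boundary: frozenset[Cell]

    def __post_init__(self) -> None:
        # Assumption of this sketch: the two regions never share a cell.
        if self.core & self.wide_boundary:
            raise ValueError("core and wide boundary must be disjoint")

    @property
    def extent(self) -> frozenset[Cell]:
        """Area occupied by both the core region and the wide boundary."""
        return self.core | self.wide_boundary


def fully_contained(a: Landform, b: Landform) -> bool:
    """Relationship (1): the extent of a is a proper part of the core of b."""
    return a.extent < b.core  # '<' on frozensets tests for a proper subset
```

A set-based encoding keeps the test independent of whether the cells come from a raster grid or from the triangles of a TIN; only the meaning of `Cell` would change.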

Representation level
Once the conceptual model is designed, the next step in the framework is to move to the representation level where landforms and their components can be identified from a DTM. Depending on the representation, this DTM can be a raster grid, a TIN or a set of contours and spot heights for example. The main role of the proposed DTM ODP is to provide an interface so that the translation between the conceptual level and the representation level can be conducted similarly for any kind of representation.
The first concern at this level is to translate the skeleton definitions so that they are extractable from the DTM. Skeletons are topological structures joining critical points and lines of the terrain that shall be extracted from the DTM. The most common topological data structures that fit with the definition of the skeleton are the surface network and the Reeb graph. The surface network is a planar graph formed by the critical points (peaks, pits, saddles) and the critical lines (ridge lines and valley lines) of the terrain (Figure 4). The Reeb graph is the dual of the surface network and provides a hierarchical structure of the critical points. The Reeb graph is also topologically equivalent to the contour tree; hence this kind of topological structure can be extracted from any kind of DTM representation.
As a topological structure, the surface network has to observe some constraints related to the meaning carried by its critical points and lines. Hence, a ridge line always connects a saddle and a peak and a valley line always connects a saddle and a pit. Another rule to observe is the Euler-Poincaré rule stating that:

$$n_{peaks} - n_{passes} + n_{pits} = 2 \qquad (4)$$

where $n_{peaks}$, $n_{passes}$ and $n_{pits}$ are the numbers of peaks, passes (saddles) and pits of the network.
Several algorithms were developed to extract the surface network of a terrain model. Most of them apply to TINs. They usually work by extracting first critical points and second critical lines joining the critical points or by growing regions whose boundaries are the critical lines. These different approaches are discussed in (Čomić et al., 2014). (Sinha et al., 2014) provide a surface network ODP where they use description logic to define the ontology concepts and express topological constraints. Such an ontology can be adapted to our framework but hierarchical relationships between landforms need to be considered as they affect the classification. For example, a peak can be the summit of two prominences which correspond to two representations at different scales. The surface network must be extracted while considering the scale defined in the context. However, since the context can include different levels of representation, the model shall be able to handle landforms represented at different levels and relationships between these levels. For that purpose, the ODP shall be extended to include multiple representation and make available simplification tools such as those proposed in (Rana and Morley, 2000) and (Danovaro et al., 2003). Simplification is done by removing points considered not relevant and adjacent critical lines to maintain the topology. For example, if a peak is removed, a pass also has to be removed.
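The constraints on the surface network could be checked along the following lines. This is only a sketch: the function name and data layout are hypothetical, it assumes the usual form of the Euler-Poincaré rule for a closed, sphere-like surface (peaks minus passes plus pits equals 2) and it assumes that every pass is a simple pass counted once.

```python
from collections import Counter


def check_surface_network(point_types, ridge_lines, valley_lines):
    """Return the list of violated constraints (empty if none is violated).

    point_types: dict mapping a point id to "peak", "pit" or "pass".
    ridge_lines, valley_lines: iterables of (pass_id, other_id) pairs.
    """
    errors = []
    # A ridge line must join a pass and a peak.
    for s, p in ridge_lines:
        if point_types[s] != "pass" or point_types[p] != "peak":
            errors.append(f"ridge line {s}-{p} must join a pass and a peak")
    # A valley line must join a pass and a pit.
    for s, p in valley_lines:
        if point_types[s] != "pass" or point_types[p] != "pit":
            errors.append(f"valley line {s}-{p} must join a pass and a pit")
    # Euler-Poincaré rule, assuming simple passes on a closed surface.
    counts = Counter(point_types.values())
    if counts["peak"] - counts["pass"] + counts["pit"] != 2:
        errors.append("Euler-Poincaré rule violated")
    return errors
```

Returning the list of violations rather than a boolean makes it easier to see which line or count breaks the structure after a simplification step.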
For each kind of landform, the definitions of its core region and wide boundary can be refined since they relate to its shape and complexity. For example, in a valley, the core region can be defined by the valley floor and the wide boundary by the sides of the valley, which fits with the fact that the boundary shall be located on its sides without a precise location (Straumann and Purves, 2011). As another example, the core region of a plateau would be the flat horizontal table while the wide boundary would be defined by the areas corresponding to its steep slopes. However, the definition may vary with the representation and the accepted degree of vagueness. The landform limit is contained in this wide boundary. The wider the boundary, the more vague the location of the limit. Definitions of both regions would be based on some terrain descriptor, on contours or on some critical lines. For example, in the case of a prominence, one can directly use the valley lines surrounding the summit to define a crisp boundary (which would be a polygonal line) and a core region delineated by these boundaries as in (Sinha et al., 2014). Boundaries can also be defined by a contour line around the summit related to a given level of detail (Guilbert, 2013) or related to a morphometric classification (Chaudhry and Mackaness, 2008).

PRELIMINARY RESULTS
We are currently working on an application of our conceptual framework to submarine geomorphology. The objective is to characterise submarine canyons and other kinds of valleys in the St-Lawrence estuary on the east coast of Canada using bathymetric data. Data were obtained from multibeam sounding and cleaned to produce a 25 m resolution image of the bathymetry. Current work consists in developing the domain knowledge with the assistance of geomorphologists. The context is mainly set by the area of study since canyon dimensions as well as the scale at which to work are fixed by the estuary dimensions.
Starting from a qualitative description, a conceptual definition was built (Figure 5). It is based on general terms that can characterise the canyon in comparison with other undersea features according to our landform ODP (Figure 3). A canyon is perceived as a long narrow steep-sided valley that we decompose mainly into its bottom marked by a thalweg line and its walls. The skeleton is hence defined by the thalweg line which provides the location and orientation of the canyon. In the St-Lawrence estuary, canyons are short with a height difference of no more than 300 m. The core region corresponds to the narrow band around the thalweg line and shall have a relatively flat cross-section. Finally, the walls around the bottom define the wide boundary which may extend to the next ridge line or to some significant break of slope. This bounding line does not necessarily mark the boundary of the canyon but a noticeable topographic element that marks a limit or is outside the canyon.
Translation to the terrain model is done by identifying the skeleton, core region and wide boundary of the canyons on the DTM. Both conceptual model and representation were obtained through discussion with geomorphologists. Definitions were tested and adapted in order to match with manual classifications they provided. Currently, full extraction has not been completed and only skeleton extraction has been implemented. The process first extracts the surface network of the whole area and second simplifies the network to the appropriate representation level.
Figure 6: Detection of critical points (panels: peak, pit, saddle, regular point). Black points: elevation is lower than the red point. White points: elevation higher than the red point.
The raster image was transformed into a triangulated irregular network by selecting VIP points in order to have a topological vector structure. Algorithms from (Takahashi et al., 1995) were applied to extract critical points and lines. Critical points (peaks, pits and passes) were first identified by classifying neighbouring points as higher or lower (Figure 6). If all neighbours are higher, the point is a pit; if they are all lower, it is a peak. Passes (or saddles) are points whose neighbours can be lower or higher than the point but which are organised so that, when marching around the point, we move at least twice from a lower neighbour to a higher neighbour. A saddle can be of different multiplicities. A simple saddle connects two ridge lines and two valley lines. A double saddle connects three ridge lines and three valley lines, and so on. Where two neighbouring points were at the same elevation, a priority rule was set, always choosing the point further to the right as the higher one. This arbitrary but simple rule guaranteed the robustness of the algorithm so that the Euler-Poincaré rule (equation 4) is satisfied. This rule applies to domains forming a closed 2D manifold, homeomorphic to the surface of a sphere. As our domain of study contains holes, corresponding to islands, virtual pits were added to close the domain.
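The classification of a vertex from its ring of neighbours can be sketched as below. This is an illustration rather than the implementation of (Takahashi et al., 1995): the function name is hypothetical, each point is reduced to an (elevation, x) pair, and the tie-break is one possible reading of the priority rule, in which the point with the larger x coordinate is treated as the higher one when elevations are equal.

```python
def classify_vertex(vertex, ring):
    """Classify a TIN vertex from its neighbours listed in cyclic order.

    vertex: (elevation, x) of the point; ring: list of (elevation, x) of the
    neighbours, in order around the point. Tuples are compared so that, at
    equal elevation, the larger x counts as higher (assumed tie-break).
    Returns (label, multiplicity); multiplicity is 0 except for passes.
    """
    higher = [n > vertex for n in ring]
    if not any(higher):
        return "peak", 0
    if all(higher):
        return "pit", 0
    # Count the switches between lower and higher neighbours around the ring;
    # index -1 closes the cycle between the last and the first neighbour.
    changes = sum(higher[i] != higher[i - 1] for i in range(len(higher)))
    if changes == 2:
        return "regular", 0
    # Four switches give a simple pass, six a double pass, and so on.
    return "pass", changes // 2 - 1
```

Two neighbours sharing both elevation and x with the vertex would still be tied in this sketch; a full implementation would need a further key such as the y coordinate or the vertex index.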
Ridge lines are extracted by starting from a saddle point and moving along the TIN edges from one point to a higher neighbour along the steepest slope until reaching a peak. Similarly, valley lines are extracted by moving downward along the steepest slope until reaching a pit. We also applied the algorithm from (Takahashi et al., 1995) although we did not use the elevation difference as a criterion but the steepest slope as in (Bremer et al., 2003).
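A minimal sketch of this steepest-slope tracing on the TIN edges is given below. The function name and the dictionary-based data layout are hypothetical; the first step away from the pass is passed in explicitly because several ridge lines leave the same pass, and distinct planimetric positions are assumed for adjacent vertices.

```python
import math


def trace_ridge(start, first, xyz, neighbours):
    """Follow TIN edges uphill from a pass until a peak is reached.

    start: id of the pass; first: id of the first higher neighbour to follow
    (one per ridge line leaving the pass).
    xyz: dict id -> (x, y, z); neighbours: dict id -> list of adjacent ids.
    """
    def slope(a, b):
        (xa, ya, za), (xb, yb, zb) = xyz[a], xyz[b]
        return (zb - za) / math.hypot(xb - xa, yb - ya)

    path = [start, first]
    while True:
        here = path[-1]
        uphill = [n for n in neighbours[here] if xyz[n][2] > xyz[here][2]]
        if not uphill:  # no higher neighbour: 'here' is a peak
            return path
        # Move to the higher neighbour reached along the steepest edge.
        path.append(max(uphill, key=lambda n: slope(here, n)))
```

The loop terminates because elevation strictly increases at each step. A valley line could be traced with the same function by negating all elevations, so that the walk ends at a pit.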
Simplification was done by applying the approach of (Rana and Morley, 2000) where non-significant critical points are removed, leading to the removal or the merging of critical lines (both valley and ridge) in order to maintain the topology. Only the network structure was simplified; the triangulation was not modified. Simplification is done by measuring the difference of elevation between each peak (respectively pit) and each saddle connected by a ridge line (respectively valley line). If this difference is too small, the peak (respectively pit) and the saddle are removed. The purpose of this step is to remove elements corresponding to too small variations of the terrain and preserve variations at the appropriate scale.
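The selection criterion of this simplification step might be sketched as follows. Only the identification of candidate pairs is shown; the reconnection of critical lines needed to maintain the topology, as in (Rana and Morley, 2000), is not covered. The function name and the threshold `min_drop` are hypothetical, and the threshold value would have to come from the context.

```python
def removable_pairs(lines, elevation, min_drop):
    """List (extremum, pass) pairs whose height difference is below min_drop.

    lines: iterable of (extremum_id, pass_id) pairs, one per ridge line
    (peak, pass) or valley line (pit, pass).
    elevation: dict id -> z; min_drop: assumed scale-dependent threshold.
    Pairs are returned with the smallest height difference first.
    """
    def drop(pair):
        extremum, saddle = pair
        return abs(elevation[extremum] - elevation[saddle])

    return sorted((pair for pair in lines if drop(pair) < min_drop), key=drop)
```

Sorting by increasing height difference reflects one possible design choice: removing the least significant variations first, then re-evaluating the remaining pairs on the updated network.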
Valley lines that can belong to a canyon are identified by measuring the differences with the surrounding peaks. Skeletons are built by aggregating consecutive valley lines meeting this definition. Starting points and ending points of the skeleton are assigned by checking the slope difference between the valley lines. Consider a pit p1 connected to two saddles s0 and s1 by two valley lines, and the two pits p0 and p2 connected to these saddles (Figure 7). We note d0 and d1 the average slopes between p0 and p1 and between p1 and p2. If the ratio d0/d1 is close to one, the slope is homogeneous on both lines, and the saddle can be removed; otherwise it indicates a change of slope corresponding to the transition between the shelf and the slope or between the slope and the floor of the estuary and the saddle must be kept. Due to the lack of a precise quantitative definition and the influence of several parameters on the definition of valley lines, the approach was iterative: parameters were refined after being evaluated by geomorphologists.
Slopes on the sides of the estuary are relatively steep and become gentler when moving closer to the bottom. As can be seen from the contours, canyons cut through the slope towards the bottom. Figure 9 shows the same area after simplification. Critical points that do not fit with the scale were removed and valley lines merged together to identify thalwegs that fit with the skeleton definition of Figure 5. Indeed, thalwegs identified on Figure 9 correspond to all skeleton lines of canyons, but also to other kinds of channels whose skeleton definition also matches the canyon's definition. Further distinction between them shall be made by extracting the core region and wide boundary, as channel core regions shall be larger and their boundaries shall not be as steep as for canyons. The skeletons extracted in our approach were shorter than those identified manually.
This difference comes from the definition chosen for the conceptual model: canyons being defined by the height difference, only the part located on the continental slope was identified. The extension of the canyon in the flat seabed of the estuary was not considered. This extension is not explicit in the definition; however, it is still a part of the canyon where sediments are transported. Hence, although the current definition seems appropriate for the extraction of canyons in a general context, a more elaborate definition may be required when addressing the needs of experts in submarine geomorphology.
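The slope-ratio test used to decide where a skeleton starts and ends can be illustrated with the sketch below. It is not the implementation used for the results: the function names are hypothetical and the `tolerance` on the ratio is an assumed parameter, of the kind that was refined iteratively with geomorphologists.

```python
def keep_saddle(d0, d1, tolerance=0.25):
    """Tell whether the saddle between two valley lines marks a break of slope.

    d0, d1: average slopes of two consecutive valley lines.
    tolerance: assumed bound on |d0/d1 - 1| below which the slope is
    considered homogeneous and the saddle can be removed.
    """
    if d1 == 0:
        # Flat against non-flat is a break; flat against flat is homogeneous.
        return d0 != 0
    return abs(d0 / d1 - 1.0) > tolerance


def merge_valley_lines(slopes, tolerance=0.25):
    """Group consecutive valley lines into skeleton candidates.

    slopes: average slope of each valley line, ordered along the channel.
    Returns lists of indices; a new group starts wherever a saddle is kept.
    """
    groups = [[0]] if slopes else []
    for i in range(1, len(slopes)):
        if keep_saddle(slopes[i - 1], slopes[i], tolerance):
            groups.append([i])
        else:
            groups[-1].append(i)
    return groups
```

Note that a fixed tolerance on d0/d1 is not symmetric in d0 and d1; comparing the logarithm of the ratio would be one way to treat steepening and flattening alike.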

PERSPECTIVES
Landform classification from a terrain model is still a difficult task because of the subjectivity of landform definitions. Considering that landforms are always described within a given context, this paper proposes an organisational pattern for landform representation where landforms are structured at two levels. At the conceptual level, landforms are abstracted from a domain knowledge according to a context. The main idea is that landforms follow a prototype built upon three components: the skeleton, the core and the wide boundary of the landform. At the representation level, the model is translated into concepts related to a particular kind of representation. The purpose is to provide a logical model that can be implemented in order to extract landforms from a DTM.
Preliminary results show that the surface network is a robust structure to extract salient features forming the skeletons of the considered landforms. Further work is still required for the definition of the core region and wide boundary. The objective here is not only to fix a set of parameters for canyons but also to identify and qualify the list of parameters in the current context in order to use them in the classification of other landforms. The model will then be extended to the general definition of landforms.
For that purpose, a more rigorous description could be provided using a formal description language to avoid current ambiguities. At this point, we only addressed the main concepts. Our next objective is to provide concepts and relationships to formalise a complete terminology. Once such a model is defined, mechanisms translating the conceptual model to the representation level shall be investigated with the objective to automate the process as much as possible.