MODELING AN APPLICATION DOMAIN EXTENSION OF CITYGML IN UML

This paper presents key aspects of the development of a Dutch 3D standard IMGeo as a CityGML ADE. The new ADE is modeled using UML class diagrams. However the OGC CityGML specification does not provide clear rules on modeling an ADE in UML. This paper describes how the extension was built, which provides general insight how CityGML can be extended for a specific applications starting from the UML diagrams. Several alternatives for modeling ADEs in UML have been investigated and compared. The best suited for the 3D standard option is selected and applied. Open issues and challenges are discussed in the conclusions.


INTRODUCTION
Recently a national 3D standard has been established in The Netherlands as a CityGML ADE (Van den Brink et al 2012; Stoter et al 2011).This ADE completely integrates CityGML with the existing national Information Model for large scale topography (called Information Model Geography or 'IMGeo').IMGeo is modeled in UML (Unified Modeling Language), and contains object definitions for large scale representations of roads, water, land use/land cover, bridges, tunnels etc. and their properties, and prescribes 2D point, curve or surface geometry for all objects.As the new version of IMGeo (version 2.0) is completely integrated with CityGML, 2D IMGeo data can be extended into 2.5D (i.e. as height surface representation) and 3D (as volumetric representation) according to geometric and semantic principles of CityGML.
In line with the Dutch practice of modeling geo-information in UML the IMGeo CityGML ADE is modeled using UML class diagrams.However, the CityGML specifications do not provide rules or guidance on modeling an ADE in UML (CityGML ADE, 2012).It describes only how an ADE must be modeled in the XML schemas.
Based on the lessons learned from developing the CityGML-IMGeo ADE, this paper describes how CityGML can be extended for specific applications starting from the UML diagrams.A complete description of the CityGML-IMGeo ADE can be found in Van den Brink et al (2011).This paper summarizes the technical modeling, i.e. how the UML models of CityGML can be extended to also support the concepts defined in a specific domain, and how a GML application schema (OGC 2007) conforming to the CityGML ADE rules can be automatically generated from the UML model.Our experiences can serve as best practice to standardize the developments of domain specific CityGML ADEs in the near future.Firstly, because this process is not standardized yet and, secondly, because our approach was established through intensive discussions on possible alternatives with the OGC CityGML Working Group, members of the Special Interest Group 3D (SIG 3D) and other experts.An important step in this process, has been the decision of SIG 3D to adopt our UML modeling approach for ADE's in March, 2012.
The paper is organized as follows.Section 2 describes the optimal modeling approach, which was selected from several alternatives.Section 3 explains how the selected modeling approach has been applied to model the CityGML ADE 'IMGeo 2.0' and Section 4 concludes on findings and topics for further research.

APPROACH FOR MODELING CITYGML ADES IN UML
This section presents the selection process of the optimal approach for modeling application domain extensions of CityGML in UML.During the development of CityGML_IMGeo, intensive discussions took place on possible alternatives for modeling the ADE in UML, specifically with the OGC CityGML Working Group, members of the Special Interest Group 3D (SIG 3D) and other experts.Some of the alternatives were based on having only generic extension placeholders in the UML model, while the extension would be described informally in some way outside the model.In other alternatives the extension would be described in UML.Most of these alternatives were in some degree in conflict with rules of UML, the ISO19109 General Feature model, or the modular specification standard of OGC (OGC, 2009).
After comparing the advantages and disadvantages of the alternatives, one alternative was selected as the best option for the IMGeo ADE.This approach defines the properties to be added in subclasses in the ADE package but suppresses these subclasses from the generated XML Schema.
An extension subclass is marked as ADE extension in the UML model using a stereotype or tagged value.A stereotype is preferred because it makes clear from the UML diagrams that the ADE subtype is not mapped to an XML Schema component.Tagged values are not always (and usually not) shown in the graphical notation.However, this could be viewed as violating the GML encoding rule that stereotypes are used for conceptual aspects and tagged values for encoding-related aspects.
Several disadvantages to this approach were noted by the participants of the discussions.1) It is confusing to introduce an ADE subclass of a CityGML type although the ADE hooks provide a means to avoid subclassing.The ADE hooks represent a concept of attribute substitution, for which there is no standard expression method in UML.In UML, using a subclass for extending another class is appropriate.2) In UML a subclass inherits all methods and attributes from its super class, but in this case that is not intended.In order to stress this, the generalization relation between subclass and CityGML superclass receives a stereotype <<ADE>>.3) Because in UML the extension is modelled with subclasses, it is not possible to create an instance diagram where an object instance is shown combining properties from different ADE's.The only way to show instance data is on the XML level.
These disadvantages notwithstanding, there are several reasons why we have chosen this approach.Firstly, conceptually IMGeo is an extension of CityGML and therefore defining the IMGeo classes as subclasses of CityGML classes and adding the extra properties to these subclasses is appropriate.Another aspect in favor of this alternative is that the use of subclassing is understandable for people with basic knowledge of UML class diagrams.This is an important requirement of the IMGeo UML model.Finally, this approach conforms to relevant rules of UML, the ISO 19100 series and OGC unlike most of the alternatives.Finally this approach is the most in line with the current geo-information modeling approach in the Netherlands.
The fact that in the XML Schema implementation the subclasses are omitted, is seen as a technical implementation choice to allow the combining of properties from different ADEs.While this is a valid reason on the technical level, it is not taken to mean that in the conceptual UML model subclassing should also be avoided.

MODELING IMGEO AS CITYGML ADE
This section explains how the selected modeling approach was applied to model the CityGML ADE IMGeo 2.0.Although the focus in this section is on realising a CityGML ADE for the Dutch information model "IMGeo", the followed procedure contains generalities that can be used to model other ADEs in UML as well.

Modeling IMGeo classes as subclasses of CityGML classes
Since CityGML is only available as xml schema, the first step is to recreate the UML model in the modeling tool Enterprise Architect, based on (OGC 2008).In the next step all IMGeo classes are modeled as subclasses of CityGML classes.Using the selected modeling approach, these subclasses get the same class name as the CityGML class they are extending.The stereotype <<ADEElement>> is assigned to the subclasses.This stereotype was proposed by Benner and Haefele during the discussion on the selected modeling alternative.This stereotype marks these classes as subtypes that only add properties to the CityGML class, and accordingly no XML component for these classes will be created in the XML Schema.In addition, the specialization relation with the CityGML class is marked with a stereotype <<ADE>>.For documentation purposes, a Dutch translation of the subclass name is added as an alias.
For all CityGML classes which are relevant for IMGeo, a subclass is created, adding at least a 2D geometry property to all classes.Figure 1  The implications of applying this inheritance structure, is that the domain specific information model gets the same structure as defined by the CityGML model.To identify equivalent concepts that can be modeled via this subclassing method, first a conceptual mapping was made between CityGML and the application information model IMGeo.These mappings compared the concepts at semantic level, i.e. independent of which LOD the concept appears in CityGML.
Obviously, not for every IMGeo class a 1-to-1 mapping to an equivalent CityGML class could be found.For these classes, two solutions are possible.The first option, which is preferred and therefore applied as much as possible, remodels the IMGeo concept so that an equivalent CityGML class can be found.For IMGeo this is for example done for Vegetation that models any vegetation-related concept (in IMGeo 1.0 divided over several classes) and AuxiliaryTrafficArea meant for road segments which are not used for traffic, such as verges (in IMGeo modeled under the classes Road or Land Use).
If it is not possible to remodel the concept into a CityGML class, CityGML is extended with a new class, as a subclass of one of the CityGML classes.Figure 2   Both CityGML and GML do not provide a normative way to structure code lists.Prominent choices are GML dictionary and SKOS (Simple Knowledge Organisation System; SKOS, 2012).GML dictionary was considered but not selected, because these are deprecated in GML 3.3, while SKOS adoption is growing in the geo community.SKOS was therefore selected.
The code lists are maintained in the UML model and XML structured code lists can be generated from the UML using a ShapeChange customization which allows generation of SKOSencoded code lists from UML classes with a <<codeList>> stereotype.The disadvantage of maintaining the code lists in the UML model, is that the UML model needs to be updated in case the code lists need an update.For IMGeo the code lists are considered as part of the standard and allowed to change only when a new IMGeo version is published.
The SKOS code lists will be published in a national, publicly available registry, which also contains the IMGeo XML Schema.Each code list and code list value is accessible via its own URL.
The Java tool ShapeChange is used to generate an XML Schema (GML application schema) from the ADE defined in UML (Portele, 2008).As mentioned before, ShapeChange implements the UML to GML encoding rules described in ISO 19136, ISO 10118, and ISO 19109.ShapeChange was modified to add a custom encoding rule for classes with the <<ADEElement>> stereotype.These classes are suppressed from the GML Application schema, while their properties are added to the ADE namespace as substitutes for the CityGML "_GenericApplicationPropertyOf<Featuretypename>" hooks as described in CityGML 10.13.1.
For the national 3D standard it is required to identify the reference system (x, y, z) to be used in the GML files.
IMGeo 2.0 compliant test data has been generated to show how the model works when applied to data, see Figure 3.

CONCLUSION AND FURTHER RESEARCH
This paper presents an approach for automatic generation of a CityGML ADE starting from UML schema.After some investigations of the several possibilities, the most beneficial for the Dutch purposes was applied for the CityGML ADE IMGeo (i.e. the Dutch national 3D standard).The main principle of the selected approach is that all classes of IMGeo are modeled as subclasses of CityGML classes and these subclasses get the same class name as the CityGML class they are extending.The stereotype <<ADEElement>> marks these classes as subtypes that only add properties to the CityGML class, and accordingly no XML component for these classes will be created in the XML Schema.
In These open issues are currently being studied in a follow-up project of the 3D Pilot NL (Stoter et al, 2012).The first phase finished in June, 2011 and has been reported in Stoter et al (2011).Since October 2011 over 100 organizations (Geonovum, 2012) are contributing to the six activities of the second phase of the 3D Pilot NL.The activities related to learn more about the UML modeling approach for ADEs are the generation of 3D IMGeo example data (several levels of detail and several classes) and the design and implementation of a 3D validator that tests whether both the semantics and the geometry of the data are compliant with the standard.
This is the first study on extending the UML diagrams of CityGML for specific domains.Since the OGC CityGML specifications do not provide rules or guidance on correctly modeling an ADE in UML, this paper may serve as best practice for future ADEs to be modeled in UML.

ACKNOWLEDGMENTS
shows an example in the IMGeo ADE of a subclass TunnelPart which contains additional properties compared to the equivalent CityGML class (2D geometry and LOD0 geometry properties).The yellow classes are classes from the CityGML Tunnel package.The <<ADEElement>> TunnelPart is a class defined in the IMGeo ADE package as a subclass of CityGML TunnelPart class.The Dutch alias is shown between brackets on the class diagram.

Figure 3 :
Figure 3: Visualisation of CityGML-IMGeo encoded data: CityGML LOD2 The class has its own properties which are modeled similar to CityGML classes, such as implicit geometry on different LODs as well as 2D and 3D geometry up to LOD3.
Examples are water management constructs such as pumping plants, locks, and weirs but also wharfs, fences, loose-standing walls, high-tension line towers and wind turbines.It is modeled as a <<featureType>> subclass of the CityGML class _Site (with a Dutch class name) which is not suppressed from the XML Schema.
the development of CityGML ADE IMGeo 2.0 a number of topics are identified that requires further research.Firstly, more research is needed to understand how this model works in practice including the consequences of this new modeling method for IMGeo when used for both 2D and 3D datasets, e.g.how to preserve the links between the different LODs and how to upgrade 2D LOD to higher LODs.Secondly, knowledge is required on the ability to use 3D IMGeo data in CityGMLaware software, i.e. whether software systems are compatible with our extensions and which adaption is required.Furthermore research is needed to assess how the IMGeo code lists are best represented and structured in SKOS.Finally, the creation and management of CityGML-IMGeo data needs more research attention.Which methods can be used to generate CityGML-IMGeo data?How should this data be validated and maintained?How can 2.5D topology be created and maintained?