GestUI: A Model-driven Method and Tool for Including Gesture-based Interaction in User Interfaces

Among the technological advances in touch-based devices, gesture-based interaction has become a prevalent feature in many application domains. Information systems are starting to explore this type of interaction. As a result, gesture specifications are now being hard-coded by developers at the source code level, which hinders their reusability and portability. Similarly, defining new gestures that reflect user requirements is a complex process. This paper describes a model-driven approach to include gesture-based interaction in desktop information systems. It incorporates a tool prototype that captures user-sketched multi-stroke gestures and transforms them into a model, automatically generating the gesture catalogue for gesture-based interaction technologies and the source code of gesture-based user interfaces. We demonstrated our approach in several applications, ranging from CASE tools to form-based information systems.


Introduction
New devices are now appearing with new types of user interfaces (e.g., interfaces based on gaze, gesture, voice, haptics, and brain-computer interaction). Although the aim is to increase the naturalness of the interaction [1], the effort is not exempt from risks. Due to the popularity of touch-based devices, gesture-based interaction is steadily increasing in mouse and keyboard applications, as well as in video games and mobile apps. Information systems (IS) are likely to follow the trend, especially in supporting tasks performed outside the office [2].
Several issues may hinder the wider adoption of gesture-based interaction in complex information systems engineering. Gesture-based user interfaces have been reported to be more difficult to implement and test than traditional mouse and pointer interfaces [3]. Gesture-based interaction is supported at the source code level (typically, in third-generation languages) [4]. This involves an extensive coding and maintenance effort when multiple platforms are targeted. Moreover, it has a negative impact on reusability and portability, and also complicates the definition of new gestures. Some of these challenges can be resolved by following a model-driven development (MDD) approach, provided that gestures and gesture-based interaction can be modelled and that it is possible to automatically generate the software components that support them.

Background
In this section general gesture-related concepts are introduced and modelling related definitions are given.

Stroke-based Gestures
Karam et al. [13] describe a gesture taxonomy and consider semaphoric gestures, which are strokes or marks made with a mouse, pen or finger on a touch-sensitive surface.
A stroke-based gesture is defined as the trajectory sketched by a finger or stylus on a touch-sensitive surface. It can be further classified as a single-stroke or multi-stroke gesture, according to the number of strokes required to sketch it (Figure 1). Note that the multi-stroke category includes single-stroke gestures. A stroke-based gesture is defined by a set of points; each point is represented by coordinates (X, Y) and, optionally, a timestamp (t) [14]. In this work we consider stroke gestures represented by coordinates and a timestamp (X, Y, t) that are used to issue commands, which are the names of the executable computing functions issued by the user.
Gesture-based interaction relies on gestures to select and use the functions provided by applications in touch-based devices.
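The representation described above can be pictured with a small data structure. The following Python sketch is only an illustration: the class names, the field names and the use of milliseconds for the timestamp are assumptions and are not taken from the gestUI tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Point:
    """One sampled point of a stroke: coordinates plus a timestamp."""
    x: float
    y: float
    t: int  # timestamp; milliseconds are an assumption of this sketch


@dataclass
class Stroke:
    """The trajectory sketched between touching and lifting the finger or stylus."""
    points: List[Point] = field(default_factory=list)


@dataclass
class StrokeGesture:
    """A named gesture made of one or more strokes; the name identifies the command."""
    name: str
    strokes: List[Stroke] = field(default_factory=list)

    def is_multi_stroke(self) -> bool:
        """True when more than one stroke is needed to sketch the gesture."""
        return len(self.strokes) > 1

    def point_count(self) -> int:
        """Total number of points over all strokes."""
        return sum(len(s.points) for s in self.strokes)


# A single-stroke line and a two-stroke "plus" sign (hypothetical sample data).
line = StrokeGesture("line", [Stroke([Point(0, 0, 0), Point(10, 0, 15)])])
plus = StrokeGesture("plus", [
    Stroke([Point(0, 5, 0), Point(10, 5, 20)]),
    Stroke([Point(5, 0, 120), Point(5, 10, 140)]),
])
```

Under this sketch a single-stroke gesture is simply a gesture whose stroke list has length one, which matches the classification by number of strokes.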

Model-driven Related Concepts
In model-driven development, models are used as the primary source for documenting, analysing, designing, constructing, deploying and maintaining a system [15]. In this paper we focus on software systems.
A model is a formal specification of the function, structure and behaviour of a system within a given context from a specific point of view [15].
A platform is the set of resources on which a system runs. This set of resources is used to implement or support the system [16].
Model-Driven Architecture (MDA) is an architectural framework for model-driven development. One of its fundamental aspects is its ability to address the complete development lifecycle, covering analysis and design, programming, testing, and component assembly, as well as deployment and maintenance [17]. MDA specifies three default models of a system: a computation-independent model (CIM), a platform-independent model (PIM) and a platform-specific model (PSM) [16].
Model transformation is an important activity in Model-Driven Engineering (MDE), a field that provides ways to produce a set of target models from a set of source models. Model transformation is the process of converting one model to another within the same system by employing a model transformation language [15], that is, a language intended specifically for model transformation. Acceleo is a template-based code generator that implements the OMG's model-to-text (M2T) specification [19]; it offers a template-based language for defining code-generation templates [18] and is commonly used to specify M2T transformations. ATL is a language and a toolkit for model-to-model (M2M) transformations. It is a hybrid model transformation language that allows both declarative and imperative constructs to be used in transformation definitions [20].

Gesture Test Frameworks
We considered three existing gesture test frameworks:
$N is a lightweight, concise multi-stroke gesture recogniser that uses only simple geometry and trigonometry [6]. Its goal is to provide a useful, concise, easy-to-incorporate multi-stroke recogniser deployable on almost any platform to support rapid prototyping. $N allows the user to define gestures through demonstration.
Quill supports Rubine, one of the first algorithms to recognise mouse and pen-based gestures [7]. It employs a statistical method of gesture recognition based on a set of 13 geometric features [21]. It has been used for recognising single-stroke gestures like the unistroke or Graffiti alphabets [22]. It also allows the user to define a gesture through demonstration.
iGesture supports the SiGeR algorithm, which classifies gestures based on regular expressions and describes them according to the eight cardinal points and statistical information indicators [9].
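The idea of describing a stroke by means of the eight cardinal points can be illustrated with a short sketch. The Python code below is not the SiGeR implementation; the function names, the 45-degree sectors and the screen-coordinate convention (y grows downwards) are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Sector labels in counter-clockwise order, starting at East.
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def direction(p: Tuple[float, float], q: Tuple[float, float]) -> str:
    """Return the cardinal direction of the segment p -> q.

    Screen coordinates are assumed (y grows downwards), so the vertical
    difference is inverted to make "N" point to the top of the screen.
    """
    dx, dy = q[0] - p[0], p[1] - q[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360
    # Each direction covers a 45-degree sector centred on its axis.
    return DIRECTIONS[int((angle + 22.5) // 45) % 8]


def encode(points: Sequence[Tuple[float, float]]) -> List[str]:
    """Encode a stroke as a direction sequence, collapsing consecutive repeats."""
    codes: List[str] = []
    for p, q in zip(points, points[1:]):
        if p == q:
            continue  # a repeated sample carries no direction
        d = direction(p, q)
        if not codes or codes[-1] != d:
            codes.append(d)
    return codes


# An "L" shape sketched downwards and then to the right.
l_shape = encode([(0, 0), (0, 10), (0, 20), (10, 20)])
```

A direction sequence of this kind could then be matched against a pattern per gesture, which is the general approach attributed to SiGeR in the text.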

Related Work
Below we cite some of the most important studies published on gesture representation and gesture-based user interface development.

Gesture Representation
According to the related literature, there are several types of gesture representation that can be used to incorporate gestures into information systems:
Representation based on regular expressions. A gesture is defined by means of regular expressions formed by elements such as ground terms, operators and symbols. Spano [23] defines a gesture as a declarative and compositional model that presents regular expressions containing ground term elements, as well as composition operators based on Petri Nets that describe the gestures. In [24] the authors present a formal semantic analysis of iconic gestures employing a multidimensional matrix whose rows contain values that describe aspects of a gesture's forms. Proton [25] allows a declarative and customised definition of multi-touch gestures using regular expressions composed of touch event symbols. Proton++ is a declarative multi-touch framework that includes a custom declarative gesture definition system [26] and is based on the Proton framework. GestIT [27] employs a declarative and compositional approach to define gestures using regular expressions. SiGeR (Simple Gesture Recogniser) describes gestures with eight cardinal points (e.g., N, NE, E) and provides statistical information [9].
Representation based on a language specification. Gesture ML [28], or Gesture Markup Language (GML), is an extensible XML-based language used to define multi-touch gestures that describes interactive object behaviour and the relationships between objects and applications. The Gesture Description Language (GDL) [29] enables body postures and gestures to be described under the assumption that gestures can be partitioned into a sequence of postures. The description is contained in a script written in a proprietary language.
Representation based on demonstration. In this case, developers define a gesture, generate the code that represents it, refine it and, once they are satisfied with its definition, include it in an IS. Gesture Coder [30] allows a gesture to be defined by demonstration, tests the generated code, refines it, and, once the developer is satisfied with this definition, incorporates the code into the IS. Other solutions in this group are $1 [31], $N [6], and $P [32], which base the definition of single-stroke ($1) and multi-stroke ($N and $P) gestures on the trajectory of a finger or pen.
Although the gesture representations described above permit developers to include gestures in information systems, none of them addresses model-driven gesture representation. In this work we propose a model-driven approach for representing gestures at a high level of abstraction, thus offering platform independence and reusability. By providing the proper transformations it is possible to target several gesture recognition technologies. In this study we focus on user-defined, multi-stroke, semaphoric gestures [13].

The Role of Gesture-based Interfaces in IS Engineering
Gesture-based interfaces can play two major roles in IS engineering, depending on whether we intend to incorporate this natural interaction into (i) CASE tools or (ii) the IS themselves. In the former case, the interest is to increase the IS developer's efficiency, whereas in the latter the aim is to improve IS usability, especially in operations in the field, where the lack of a comfortable office space hinders the ergonomics of mouse and keyboard. In both cases, methods and tools for developing gesture-based user interfaces are needed. Some examples of methods and tools are described in [33] and [34], in which the authors propose a method of introducing gesture-based interaction into an interface.
Some studies have reported on the definition of methods to generate a user interface. UsiGesture [35] allows a designer to integrate gesture-based interaction into an interface, but it lacks the techniques to model, analyse or recognise gestures. The authors applied the method to developing a restaurant management tool. In [33] the authors propose a method that includes requirements definition, design, implementation and evaluation and apply it to creating a puzzle game. In [34] the authors describe a method with two variants (technology-based and human-based) and provide guidelines for the definition and selection of gestures based on ergonomic principles. GestureBar [36] embeds gesture disclosure information in a familiar toolbar-based user interface. GestureBar's simple design is also general enough for use with any recognition technique and for integration with standard, non-gestural user interface components. The aim of Open Gesture [37] is to facilitate inclusive interface designs that are usable by the elderly and the disabled, as applied to an interactive television project.
In this study we propose a similar flow to that proposed by Guimaraes et al. in [33], but automating the implementation of gesture-based interfaces by means of model transformations.
In future work we plan to provide support to the ergonomic principles proposed in [34].

Introduction to gestUI Method
In this study we extended an existing user interface development method with gestUI in order to include gesture-based interaction in WIMP (windows, icons, menus, pointer) user interfaces. The existing method can be code-centric or model-driven. In this section we describe the process to include gestUI in a code-centric method. Appendix A explains the process to include gestUI in a model-driven method for user interface development. In Figure 2, the activities and products of the existing method are shown in grey, and those of gestUI are shown in white. The process flow of the existing method is shown with blue arrows and that of gestUI with black arrows. The existing method begins with the interaction requirements specification and continues with the design of the user interfaces, which are then implemented in a programming language to obtain the information system user interfaces. The model-driven gestUI method is inserted into the existing method and contributes activities and products that help to define the custom gesture catalogue and to include gesture-based interaction in a user interface.

gestUI
gestUI is a user-driven iterative method that follows the MDD paradigm. It is user-driven because the users participate in all non-automated activities and iterative because it aims to discover the necessary gestures incrementally and provides several loopbacks.
The main artefacts in gestUI are models which conform to MDA, a generic framework of modelling layers that ranges from abstract specifications to the software code.
Figure 2 shows a view of the method from an MDA perspective. The PIM that drives the process is the gesture catalogue metamodel. Using an M2M transformation written in ATL, we obtain the platform-specific gesture specification (model). This PSM is converted into gesture-based user interface source code using M2T transformation rules defined in Acceleo. Moreover, the PSM is converted into the gesture catalogue for the gesture recognition tool (i.e., quill, $N, iGesture) using transformation rules defined in Acceleo.
According to Figure 2, gestUI employs M2T transformations to obtain the information required (i) to define custom gestures and (ii) to include gesture-based interaction in an information system user interface. The gesture catalogue obtained in each execution of the M2T transformation can be stored in a repository for reuse in other similar processes.
The activities and products shown in Figure 2 are as follows: The computation-independent layer is omitted because gestUI already assumes that the IS is going to be computerised.
The platform-independent layer includes Activity A1, "Define gestures", in which the developer specifies the gestures in collaboration with representative IS users. In our proposal, the gestures are defined by sketching on a canvas and are then stored in the 'Gesture catalogue model', which conforms to the metamodel depicted in Figure 3. Each gesture is formed by one or more strokes defined by postures, which in turn are described by means of coordinates (X, Y). The sequence of the strokes of a gesture is specified by means of precedence. Each posture in a gesture is related to a figure (line, rectangle, circle, etc.) with an orientation (up, down, left, right) and a state (initial, executing, final), which qualifies the order of the strokes. The gesture catalogue definition could be part of a larger 'Interaction requirements' specification. The product obtained in this activity is the gesture catalogue model.
In the platform-specific layer, activities A2 and A3 allow the gesture catalogue to be defined from a previously defined gesture repository; that is, the gestures can be reused in other user interfaces or information systems. These activities are described as follows:
Activity A2, "Generate gesture-based interaction". Since the user interface is designed in this layer, the gesture-based interaction is also defined in this layer, in collaboration with the user, by means of a code-centric method. The filename of the user interface source code is inserted as an attribute of the class "Gesture" in the gesture catalogue model so that the source code can be processed to obtain the actions defined in the user interface. In model-based IS user interface development the actions are specified in the interface model (see Appendix A), whereas in code-centric interface development they are implemented in the interface itself. The procedure mainly consists of parsing the source code to obtain the components included in the user interface, after which the correspondence between each gesture and an action/command included in the user interface is established. This correspondence allows a set of statements (action/command) to be defined in the same programming language as the user interface and to be executed by each previously defined gesture. The product obtained in this activity is stored in the "gesture-based interaction model".
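The gesture catalogue model described for Activity A1, together with the source filename attribute added in Activity A2, can be pictured as a small object model. The Python sketch below is an interpretation of the prose description and not a reproduction of the metamodel in Figure 3; all class and attribute names, as well as the sample file name, are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Orientation(Enum):
    UP = "up"
    DOWN = "down"
    LEFT = "left"
    RIGHT = "right"


class State(Enum):
    INITIAL = "initial"
    EXECUTING = "executing"
    FINAL = "final"


@dataclass
class Posture:
    figure: str                      # e.g. "line", "rectangle", "circle"
    orientation: Orientation
    state: State
    points: List[Tuple[float, float]] = field(default_factory=list)


@dataclass
class Stroke:
    precedence: int                  # position of the stroke in the sketching order
    postures: List[Posture] = field(default_factory=list)


@dataclass
class Gesture:
    name: str
    strokes: List[Stroke] = field(default_factory=list)
    ui_source_file: Optional[str] = None  # filename attribute set in Activity A2

    def ordered_strokes(self) -> List[Stroke]:
        """Return the strokes sorted by their precedence."""
        return sorted(self.strokes, key=lambda s: s.precedence)


@dataclass
class GestureCatalogue:
    gestures: List[Gesture] = field(default_factory=list)

    def find(self, name: str) -> Optional[Gesture]:
        """Look up a gesture by name; None when it is not in the catalogue."""
        return next((g for g in self.gestures if g.name == name), None)


# A two-stroke gesture stored out of order; precedence restores the sequence.
plus = Gesture("plus", [
    Stroke(2, [Posture("line", Orientation.DOWN, State.FINAL, [(5, 0), (5, 10)])]),
    Stroke(1, [Posture("line", Orientation.RIGHT, State.INITIAL, [(0, 5), (10, 5)])]),
], ui_source_file="MainWindow.java")
catalogue = GestureCatalogue([plus])
```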
Activity A3, "Generate gesture specification", consists of an M2M transformation using ATL as the model transformation language. Figure 4 shows the M2M transformation, which is executed by means of a transformation definition (script.atl) that contains the transformation rules written in ATL. In Figure 4, Ma is a gesture catalogue model which conforms to the gesture catalogue metamodel, MMa; Mb is the platform-specific gesture specification (model), which conforms to the gesture specification metamodel, MMb. The transformation definition contains the rule to create the class "Gesture" in the target model. Its input is the gesture catalogue model and its output is the platform-specific gesture specification.
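The shape of such a rule-based M2M transformation can be approximated outside ATL. The following Python sketch mimics a single rule that maps each source "Gesture" to a target "Gesture"; it is not the content of script.atl, and the dictionary layout of both models is an assumption made for illustration.

```python
from typing import Any, Dict, List


def gesture_rule(source_gesture: Dict[str, Any], platform: str) -> Dict[str, Any]:
    """Analogue of one transformation rule: one source Gesture yields one target Gesture.

    Strokes are ordered by precedence and renumbered from 1, a layout that is
    assumed here to be convenient for a recogniser-specific catalogue.
    """
    strokes = sorted(source_gesture["strokes"], key=lambda s: s["precedence"])
    return {
        "name": source_gesture["name"],
        "platform": platform,
        "strokes": [
            {"index": i, "points": list(s["points"])}
            for i, s in enumerate(strokes, start=1)
        ],
    }


def transform(catalogue: Dict[str, Any], platform: str) -> Dict[str, Any]:
    """Apply the rule to every gesture of the source model (Ma) to build Mb."""
    gestures: List[Dict[str, Any]] = [
        gesture_rule(g, platform) for g in catalogue["gestures"]
    ]
    return {"platform": platform, "gestures": gestures}


# Hypothetical source model with strokes stored out of order.
source_model = {"gestures": [{
    "name": "plus",
    "strokes": [
        {"precedence": 2, "points": [(5, 0, 120), (5, 10, 140)]},
        {"precedence": 1, "points": [(0, 5, 0), (10, 5, 20)]},
    ],
}]}
target_model = transform(source_model, "$N")
```

The source model is left untouched and a new target model is produced, which mirrors the input/output roles of Ma and Mb in Figure 4.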
In the code layer, we have two activities. In Activity A4, "Generate gesture-based interface", the gesture-based interaction model and the gesture catalogue model are transformed into executable and deployable user interface code, written in the selected programming language. The tool generates components (e.g., Java code) that are embedded in the existing IS interface; the 'Gesture-based interface' is automatically generated from the platform-specific layer artefacts.
In Activity A5, "Test gestures", the gesture catalogue model is transformed into a language supported by the gesture recognition tool (e.g., XML) so that both the developer and the user can test the gestures using the gesture recognition tool (we currently support three gesture testing platforms: quill [8], $N [6] and iGesture [21]). We apply an M2T transformation to generate the platform-specific gesture catalogue for each gesture recognition tool. This transformation is executed via a script containing the transformation rules written in Acceleo, together with a script that specifies information such as the classes and components participating in the generation, output folders, etc. The combination of the components that support the code generation process is depicted in Figure 5. The template definition, which drives code generation, constitutes the most important part of the transformation process. Appropriate templates have been defined for the platforms considered in our work: XML ($N and iGesture), GDT (quill) and Java. The template written in Acceleo for the M2T transformation that produces the gesture catalogue for the $N gesture recognition tool includes a header containing the general information of the gesture (gesture name, date and time when the gesture was sketched, number of strokes, number of points, etc.), the strokes contained in the gesture and the set of points that make up the gesture.
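The structure of the generated catalogue (a header followed by strokes and points) can be illustrated with a short generator. The Python sketch below uses the standard xml.etree.ElementTree module instead of an Acceleo template, and the element and attribute names are hypothetical; they are not claimed to match the XML format expected by $N.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

PointT = Tuple[float, float, int]  # (X, Y, t)


def gesture_to_xml(name: str, strokes: List[List[PointT]], sketched_at: str) -> str:
    """Serialise one gesture: header attributes first, then strokes and points."""
    num_points = sum(len(stroke) for stroke in strokes)
    # Header with the general information of the gesture (names are assumptions).
    root = ET.Element("Gesture", {
        "Name": name,
        "Date": sketched_at,
        "NumStrokes": str(len(strokes)),
        "NumPts": str(num_points),
    })
    for index, stroke in enumerate(strokes, start=1):
        stroke_el = ET.SubElement(root, "Stroke", {"index": str(index)})
        for x, y, t in stroke:
            ET.SubElement(stroke_el, "Point", {"X": str(x), "Y": str(y), "T": str(t)})
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-stroke gesture.
xml_text = gesture_to_xml(
    "plus",
    [[(0, 5, 0), (10, 5, 20)], [(5, 0, 120), (5, 10, 140)]],
    "2016-01-15T10:30:00",
)
```

A separate function per target platform (XML for $N and iGesture, GDT for quill) would play the role that the per-platform templates play in the method.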

The gestUI Tool
In order to demonstrate the applicability of the proposed method, we implemented tool support using the Java programming language and the Eclipse Modelling Framework (Figure 6). In this figure the acronym included in brackets in each subsystem (A1, A3) and component (A2, A4 and A5) corresponds to the number of the activity that it supports (see Section 4). The method's internal products are not shown, but the relationship with the external gesture recogniser is represented. In Figure 6, the components shown with dark grey shapes belong to an existing IS. The light grey shapes belong to our proposal. The subsystems and components are described in this section. The implemented tool support has three options (Figure 7): (i) "New Catalogue" to define the gesture catalogue model, (ii) "Specific Catalogue" associated with the platform-specific gesture specification and (iii) "Gesture-Action" to define the gesture-action correspondence and generate the source code.

Gesture Catalogue Definition Module
This module supports the definition of new multi-stroke gestures by means of an interface implemented in Java containing a canvas on which the user sketches the gestures. Each gesture sketched by the user consists of one or more strokes; each stroke is defined by a set of points described by coordinates (X, Y) and a timestamp (t). When $N is applied as the gesture recogniser and a gesture is sketched on the canvas (Figure 8, left), the following data is captured: the number of strokes specified during the sketching of the gesture, the number of points contained in each stroke and the value of each point (X, Y) together with its timestamp (t). After capturing the data required by $N to analyse each gesture, the data is stored in the 'Gesture catalogue model', which conforms to the metamodel defined in this study (Figure 8, right).
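The capture step can be sketched independently of any user interface toolkit. In the Python code below, the recorder class, its method names and the injectable clock are assumptions made for illustration; the actual module is implemented in Java and its internals are not described in this paper.

```python
import itertools
import time
from typing import Callable, Dict, List, Optional, Tuple

PointT = Tuple[float, float, int]  # (X, Y, t)


class GestureRecorder:
    """Collects strokes from pen-down / pen-move / pen-up events on a canvas."""

    def __init__(self, clock: Optional[Callable[[], int]] = None) -> None:
        # The clock returns a timestamp; wall-clock milliseconds by default.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._strokes: List[List[PointT]] = []
        self._current: Optional[List[PointT]] = None

    def pen_down(self, x: float, y: float) -> None:
        """Start a new stroke at the touched position."""
        self._current = [(x, y, self._clock())]

    def pen_move(self, x: float, y: float) -> None:
        """Add a point to the stroke in progress, if any."""
        if self._current is not None:
            self._current.append((x, y, self._clock()))

    def pen_up(self) -> None:
        """Close the stroke in progress and keep it."""
        if self._current:
            self._strokes.append(self._current)
        self._current = None

    def summary(self) -> Dict[str, object]:
        """Return the captured data: stroke count, points per stroke, points."""
        return {
            "strokes": len(self._strokes),
            "points_per_stroke": [len(s) for s in self._strokes],
            "points": [list(s) for s in self._strokes],
        }


# Deterministic clock for the example: 0, 10, 20, ...
recorder = GestureRecorder(clock=itertools.count(0, 10).__next__)
recorder.pen_down(0, 0)
recorder.pen_move(5, 5)
recorder.pen_up()
recorder.pen_down(5, 0)
recorder.pen_move(0, 5)
recorder.pen_up()
captured = recorder.summary()
```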

Model Transformation Module
This module makes it possible to obtain the platform-specific gesture specification by means of an M2M transformation. The transformation rules are written in ATL. The user must specify the parameters in the interface shown in Figure 9. The parameters required to execute the M2M transformation are the gesture catalogue model (input), the gesture catalogue metamodel to which it must conform, the platform-specific gesture specification model (output) and the gesture specification metamodel to which it must conform.

Gesture-action Correspondence Definition Module
This module allows the developer to specify the action to be executed when the gesture recogniser tool validates a gesture sketched by the user on the user interface. We currently provide automated support for code-centric developments made in Java, i.e. this module parses the source code of the user interface to obtain a list of actions. This module requires two inputs (Figure 10): the previously created 'Gesture catalogue model' and the user interface (e.g., Java source code). The output of this module is the source code of the previously specified user interface, extended with the code that supports the gesture-based interaction. In order to apply the parsing process to the user interface source code, we included methods in the tool support to analyse two types of Java applications: (i) a Java desktop application using SWT, and (ii) a Java desktop RCP application using JFace and SWT. In the former type, SWT provides widgets (controls and composites) to be included in the user interface with the aim of assigning actions [38]. The user interface source code also includes other sections containing event listeners and "action-perform" structures in order to specify the actions to be executed when the user clicks on a widget (canvas, button, text field, etc.) on the user interface (Figure 11). The parsing process then searches for these actions in order to complete the gesture-action correspondence definition. In the second type, in conjunction with SWT, JFace provides actions to allow users to define their own behaviours and to assign them to specific components, such as menu items, toolbar items, buttons, etc. [38]. In this case, the user interface source code includes structures to specify the actions to be executed when the user clicks on a widget in the user interface. These actions are collected during the parsing process in order to determine the gesture-action correspondence (Figure 12).
The parsing process analyses the user interface source code, searching for keywords that correspond to the widgets available in Java for building a user interface (text, buttons, images, etc.). Each widget found in the process is stored in the table containing the gestures selected to define the gesture-action correspondence. When generating the user interface Java source code, several references are included (e.g., to gesture management libraries and to gesture-recognition technology libraries such as $N), and some methods are added (e.g., to execute the gesture-action correspondence and to capture gestures). In addition, the definition of the classes is changed to include some event listeners. Finally, the source code obtained from the completed process must be inserted in the complete source code of the user interface and compiled again.
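A keyword-based scan of this kind can be approximated with regular expressions. The Python sketch below is a deliberately simplified stand-in for the parsing step and not the tool's parser: the list of widget types, the patterns and the sample Java fragment are assumptions, and a real implementation would need a proper Java parser to handle arbitrary code.

```python
import re
from typing import List, Tuple

# Declarations such as "Button saveButton = new Button(shell, SWT.PUSH);"
WIDGET_DECL = re.compile(
    r"\b(Button|Text|Label|Canvas|Combo)\s+(\w+)\s*=\s*new\s+\1\s*\("
)
# Listener registrations such as "saveButton.addSelectionListener(...)"
LISTENER = re.compile(r"\b(\w+)\.add(\w+)Listener\s*\(")


def find_widgets(java_source: str) -> List[Tuple[str, str]]:
    """Return (widget type, variable name) pairs found in the source."""
    return [(m.group(1), m.group(2)) for m in WIDGET_DECL.finditer(java_source)]


def find_listeners(java_source: str) -> List[Tuple[str, str]]:
    """Return (variable name, listener kind) pairs found in the source."""
    return [(m.group(1), m.group(2)) for m in LISTENER.finditer(java_source)]


def candidate_actions(java_source: str) -> List[str]:
    """Widgets that have a listener attached are candidates for a gesture."""
    with_listener = {name for name, _ in find_listeners(java_source)}
    return [name for _, name in find_widgets(java_source) if name in with_listener]


# Hypothetical fragment of an SWT user interface.
SAMPLE = """
Button saveButton = new Button(shell, SWT.PUSH);
saveButton.addSelectionListener(new SelectionAdapter() {
    public void widgetSelected(SelectionEvent e) { save(); }
});
Text nameField = new Text(shell, SWT.BORDER);
"""
widgets = find_widgets(SAMPLE)
actions = candidate_actions(SAMPLE)
```

The resulting list of candidate widgets corresponds to the table from which the developer picks the action for each gesture.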

Demonstration of the Method and Tool Support
We applied gestUI and the tool support in two scenarios: (i) we used gestUI and the tool support to obtain a gesture catalogue to be used in the $N, quill and iGesture frameworks; (ii) we used gestUI and the tool support to integrate gestUI into a code-centric user interface development method.

Applying the Method and Tool to Testing a Gesture Catalogue
Using the tool support, we defined a gesture catalogue containing three gestures to test them in the above frameworks: a triangle, a line and the letter "S" (Figure 13). The gesture representation in each framework is contained in two sections: (i) a header specifying general information on the gesture, and (ii) the points specified by coordinates (X, Y) and a timestamp (t). $N and iGesture employ XML for gesture definition and quill employs GDT 2.0 for this purpose (Figure 14).
To test the gestures we used the M2T transformation described in Section 5.2, considering successively $N, quill and iGesture as the target platform. Our aim was to obtain the gesture catalogue in the structure specified for each framework (Figure 14). We specified the transformation rules with Acceleo and then performed the M2T transformation for each framework. In the next step we used each framework to test the gestures. As an example, Figure 15 shows some quill interfaces: on the left, the quill interface used to import the gesture catalogue obtained in the model transformation; on the right, the gesture catalogue once included in the framework. In the last step the user sketched the gestures contained in the gesture catalogue using the sketch area defined in the interface of each framework. Each framework includes the algorithm (not described here) used to recognise the gestures sketched by the users. Figure 16 shows that the gesture catalogues are effectively recognised when imported into the $N, quill and iGesture frameworks.

Applying the Method and the Tool to integrate GestUI into User Interface Development
For illustration purposes, we used a form-based information system (IS): a fictitious university management system, whose development we narrate as if the project had actually happened. Figure 17 shows the classroom management diagram of the fictitious university. In this section, we considered an IS with WIMP interfaces and, for the sake of brevity, we only considered two interfaces for the demonstration: the main interface and the department management interface. The form-based information system was developed in Java on Microsoft Windows. In the first iteration, the university tells the developers that it would like the gestures to resemble parts of the university logo. Therefore, they use the Gesture catalogue definition module to create the first version of the 'Gesture catalogue model' containing these three gesture indicators:  for departments, || for teachers and  for classrooms. However, when the first user interface design is available (see Figure 18), they realise that other gestures are needed. After defining and testing new gestures, they decide that navigation will be by means of the abovementioned gesture indicators, and that similar actions that appear on different screens will have the same gesture indicators (e.g., the gesture  will be used to create both new departments and teachers). The developers assign the gesture-action correspondence in collaboration with the user, supported by the Gesture-action correspondence definition module. The correspondences are informally shown in Figure 18, next to each action button, and are described in Table 1. The user can employ the model transformation option to apply an M2T transformation and obtain a platform-specific gesture catalogue. We found that if the Java source code of the user interface using traditional keyboard and mouse interactions is available, then the components that support the gesture-based interaction can be generated. In this case, the underlying gesture-recognition technology selected was $N. Since the users felt more comfortable with multi-stroke gestures (especially when tracing certain letters and symbols), quill was discarded. The final IS interface consists of several screens for managing university information. In this scenario, the users can still interact with the IS in the traditional way (i.e. by the keyboard and mouse), but now they can also draw the gestures with one finger on the touch-based screen to execute the actions.
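The gesture-action correspondence used in this scenario can be pictured as a lookup from a recognised gesture on a given screen to an action. The Python sketch below is illustrative only: the generated components are Java code, the dispatcher class is hypothetical, and apart from "S" the gesture and screen names are placeholders chosen for the example.

```python
from typing import Callable, Dict, List, Tuple


class GestureDispatcher:
    """Maps (screen, recognised gesture name) pairs to actions."""

    def __init__(self) -> None:
        self._actions: Dict[Tuple[str, str], Callable[[], None]] = {}

    def bind(self, screen: str, gesture: str, action: Callable[[], None]) -> None:
        """Register the action to run when the gesture is recognised on the screen."""
        self._actions[(screen, gesture)] = action

    def on_recognised(self, screen: str, gesture: str) -> bool:
        """Run the bound action; return False when no correspondence exists."""
        action = self._actions.get((screen, gesture))
        if action is None:
            return False
        action()
        return True


# Placeholder actions that only record what would happen in the IS.
log: List[str] = []
dispatcher = GestureDispatcher()
dispatcher.bind("main", "departments", lambda: log.append("open departments"))
dispatcher.bind("departments", "new", lambda: log.append("open new department form"))
dispatcher.bind("new_department", "S", lambda: log.append("save department"))

# The same gesture name can be bound on several screens, so similar actions
# share one gesture, as decided by the developers in the narrative above.
dispatcher.bind("teachers", "new", lambda: log.append("open new teacher form"))

dispatcher.on_recognised("main", "departments")
dispatcher.on_recognised("departments", "new")
dispatcher.on_recognised("new_department", "S")
```

Because unknown pairs are simply ignored, keyboard and mouse interaction is unaffected when a sketch is not recognised on the current screen.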
Figure 19 represents three interfaces from the IS: the task starts with the main interface (Figure 19, left), where the users can select one of the options on the menu. For the sake of simplicity, the menu is shown as an array of buttons. According to the requirements indicated above, if a user sketches the gesture "" in the main interface of the IS then he/she obtains a second user interface containing the information on the existing departments (Figure 19, centre). In order to create a new department, he/she draws a "" on this second user interface, obtaining a third user interface with the fields for entering information on a new department (Figure 19, right). When the user finishes entering the information, sketching "S" on this third interface saves the information to a database.

This paper describes gestUI, a model-driven method together with its tool support system to specify multi-stroke gestures and automatically generate the information system components that support gesture-based interaction.
We assessed the method and tool support by applying them to a gesture testing case, generating the platform-specific gesture specifications for three existing gesture-recognition technologies in order to verify the tool's multiplatform capability.All the gestures were successfully recognised by the corresponding tools.When the proposed method was applied to a form-based IS, the final gesture-based interface components were automatically generated and successfully integrated into the IS interface.This process was applied in both Microsoft Windows and Ubuntu (Linux) systems to demonstrate its multiplatform capability.
The advantages of the proposed method are: platform independence enabled by the MDD paradigm, the convenience of including user-defined symbols and its iterative and user-driven approach.Its main current limitations are related to the target interface technologies (currently, only Java is used) and the fact that multi-finger gestures are not supported.These limitations will be addressed in future work.
We also plan further validation by applying the approach to the development of an actual IS and to extending a CASE tool with gesture-based interaction (the Capability Development Tool being developed in the FP7 CaaS project). We also plan to integrate gestUI into a fully-fledged model-driven framework capable of automatically generating the presentation layer and extend its application with gesture-based interaction modelling and code generation.

possible to specify actions/commands for execution by the user. Figure A1.2 shows an excerpt of the MARIA AUI metamodel with the gesture catalogue metamodel linked by means of the "event" and "Gesture" classes. The inclusion of the gesture catalogue metamodel in the AUI defined in the MBUID specification is shown in Figure A1.3. MBUID facilitates the interchange of designs through a layered approach that separates out different levels of abstraction in the user interface design. The AUI metamodel contains the "InteractionEvent" class, which defines an interaction event. This metamodel contains a generalization definition with TriggerEvent, SelectionEvent, DeselectionEvent, and InputEvent as classes that permit the specification of event types that can be executed by gestures. Considering gestUI, the class "Action" corresponds to the "InteractionEvent" class in order to define the action to be executed with a gesture sketched by the user.

Figure 3. Metamodel of the gesture catalogue modelling language

Figure 5. The code generation process

Figure 7. Main interface of the tool support

Figure 10. Interface for defining gesture-action correspondence and to generate source code

Figure 11. SWT components to define actions

Figure 12. JFace and SWT components used to define an action in a user interface

Figure 15. Importing the gesture catalogue to the quill framework

Figure 17. UML class diagram of the demonstration case

Figure 18. Screen mockups (gestures are shown in red, next to action buttons)

Figure 19. Using gestures to execute actions on the interfaces

Table 1. Platform-independent gesture catalogue definition