1 Introduction

In the era of mobile computing and internet technology, mobile gaming has grown enormously, reaching almost all population groups. According to SmartInsights (Chaffey 2016), there are about 1.7 billion mobile users worldwide who spend, on average, 3 h per day on their mobile phones. Another market study (Iqbal 2019) showed that 33% of mobile phone users regularly play games on their phones and that 50% of them download apps to their phones.

Besides, with the development of smartphones, fast mobile broadband and platform availability, mobile gaming has moved deeper into the broader culture of individuals and communities. According to Intelligence Blog Sonders (2017), 62% of smartphone users download game applications within a week of purchasing their phones, a higher share than for any other application category. This generated more than $60 billion in revenues, which are expected to exceed $100 billion by 2021 according to estimates by Newzo (2018). On the other hand, the ever-growing popularity of Android, which constantly attracts new developers and businesses, has unfortunately led to a wide discrepancy among the distinct Android devices in use. As such, ensuring efficient and reliable testing of newly developed mobile game applications becomes of paramount importance and one of the toughest challenges faced by game developers, service providers and regulators alike. This is often referred to as quality assurance (QA) testing, which focuses on identifying technical problems with the games. QA techniques appear at almost every stage of the software development lifecycle, from requirements elicitation to product deployment and maintenance, with a special interest attributed to test automation. The latter is an integral part of a continuous integration pipeline (Novak 2008), where simple automated tests are used for basic program elements, such as individual class methods or separate functions. In particular, test automation reduces time, cost and resources, while enhancing reliability through exposure to a large number of test cases that cannot be performed solely by human interaction in practice.

Compared to traditional desktop applications, test automation for mobile applications bears additional challenges. First, through sandboxing, only limited access to internal processes is provided, which challenges developers to optimize resource allocation. Second, general user-interface navigation of mobile apps is vulnerable and hard to control because of the uncertainty pervading the response time of the interface(s); this includes, for instance, grab-and-hold interactions. This problem is also referred to as the fragile test issue pointed out by Meszaros (2007). That is why it is recommended that the application's functional logic should not be tested via the application's user interface, although such rules are often violated by developers themselves. Third, mobile devices are often in constant motion, which can cause the currently executed test case for an app to break down. Fourth, the complexity of the allocation task together with resource limitations often causes a change in the size and resolution of the screen, which, in turn, makes any user-interface-based test likely to fail. Fifth, despite the effort to harmonize software-hardware configurations in mobile platforms, the number of distinct configurations is sharply increasing, which makes the application of a single testbed very difficult. For instance, the number of distinct Android devices is growing rapidly (e.g., more than 24,000 distinct Android devices were reported in 2015, see footnote 1). Therefore, it is nearly impossible to test an application on every distinct device in a real environment and, at the same time, guarantee the best user experience, with the underlying application working flawlessly on all devices. Sixth, mobile games bear additional inherent features that add extra difficulties. For instance, games involve a lot of graphics and other assets, which substantially increase the loading time; this, in turn, challenges the efficiency of the resource allocation policy. Besides, games have inherent hooks that are intended to make the player play the game again and again (Novak 2008), which makes it difficult to automate the process of accessing the various passes of the game. Finally, games bear a psychological factor referred to as the fun factor (Novak 2008). Indeed, even in a bug-free scenario, the game can fail because the players do not feel the fun factor, so that their actions are random and do not comply with the game rules. Because of its inherent subjectivity and variability from one player to another, it is almost impossible to automate the fun factor in testing.

Due to the above challenges and the lack of effective fully automated testing platforms, mobile app testing is still performed mostly manually, costing developers and the industry significant amounts of effort, time and money (Novak 2008; Choudhary et al. 2015; Kochhar et al. 2015). This requires attention from both the research community and practitioners. Although setting up a test automation scheme implies an additional investment, sometimes referred to as the “hump of pain” learning curve, the expected benefits will pay back such an investment sooner or later (Crispin and Gregory 2011).

In this perspective, we present in this paper a new take on mobile game application testing called MAuto. The tool aims to help the tester create tests that work with Android games; the tests can then be re-run on any other Android device. The tool records tests from user interactions and exports them to the Appium framework (Appium 2012) for playback. MAuto belongs to the class of image-based recognition tools, where AKAZE features (Alcantarilla et al. 2013) are used to recognize objects from screenshots. When the user performs the recording, MAuto generates a test script that reproduces the recorded events, and then uses the Appium framework to replay the test script. To validate the developed MAuto, tests were created with the tool for the Clash of Clans mobile game and successfully executed. The rest of this paper is organized as follows. Section 2 reviews the state of the art in the field of mobile testing. The description of the developed MAuto system is reported in Sect. 3, while experimentation and exemplification using the Clash of Clans game are examined in Sect. 4. Section 5 discusses the main findings and limitations, and Sect. 6 concludes the paper and outlines ways forward.

2 State of the Art

2.1 Test Automation Pyramid

The traditional test automation pyramid introduced by Cohen (2006) is highlighted in Fig. 1. It is a three-layer pyramid with unit tests at the base, integration tests in the middle and End-to-End (E2E) tests at the top of the hierarchy. The width of each layer reflects the number of tests to be written in that layer (Knott 2015). Manual testing is usually not part of Cohen's test automation pyramid, so it is drawn as a cloud on top of the pyramid in Fig. 1 for illustration purposes only.

Fig. 1 Traditional test automation pyramid (Knott 2015)

Mobile test automation tools are not yet good enough to support the traditional test automation pyramid. Besides, mobile devices are equipped with a variety of sensors (e.g., camera, accelerometer, gyroscope, infrared, GPS) and other distinguishing features (e.g., memory and CPU resources and various embedded software that accommodates current and future installed apps), which restricts the development of universally accepted testing tools (Knott 2015). In this work, we primarily focus on E2E testing because of its criticality.

2.2 Types of Mobile Test Automation Tools

We distinguish five types of test automation modes: image-based, coordinate-based, OCR/text recognition, native object recognition, and gesture record and replay (Knott 2015).

2.2.1 Image-Based Tools

The key idea in this testing mode is to determine the type and location of icons/objects on the screen and match them against a set of predefined graphical elements of the game, taking into account the user's actions and the status of the game. More specifically, the application objects and controls are stored as images, which are then compared to the objects displayed on the screen to identify potential matches. Once a match is found, the pre-defined step can be re-executed. These frameworks can also associate specific actions, such as clicks and text entry, with the controls. Besides, every user-interface (UI) object, including buttons, text boxes, selection lists and icons, has a set of properties that can be used to identify, define or validate the underlying object (MacKenzie 2012). This provides the tester with useful and powerful tools for GUI testing. As a result, the automation engineer achieves highly reusable, maintainable and low-cost script development. Such a method is widely accepted in the field and recognized as a best practice in test automation.

However, it is acknowledged that image-recognition techniques rely on elaborate and time-consuming pixel-comparison algorithms. Image-recognition automation is also infeasible if application objects are dynamic. On the other hand, such tests can be fragile if the predefined graphical elements are not carefully chosen. For instance, badly chosen algorithms or algorithm parameters can lead to flaky tests (Knott 2015). Therefore, careful analysis of the context is needed before applying image-based techniques.
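To make the pixel-comparison idea concrete, the following minimal sketch compares a stored control image against the current screenshot using OpenCV template matching; the file names and the 0.9 acceptance threshold are illustrative assumptions rather than part of any specific tool:

```python
import cv2

# Load the current screen and a stored image of the control to look for.
screen = cv2.imread("screenshot.png", cv2.IMREAD_GRAYSCALE)
control = cv2.imread("login_button.png", cv2.IMREAD_GRAYSCALE)

# Slide the control image over the screenshot and keep the best similarity score.
result = cv2.matchTemplate(screen, control, cv2.TM_CCOEFF_NORMED)
_, score, _, top_left = cv2.minMaxLoc(result)

if score > 0.9:  # acceptance threshold; must be tuned per application
    x = top_left[0] + control.shape[1] // 2
    y = top_left[1] + control.shape[0] // 2
    print(f"Control found at ({x}, {y}); the recorded action can be replayed there")
else:
    print("Control not found on the current screen")
```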

2.2.2 Coordinate-Based Recognition

In this approach, user actions are captured and automated based on their associated (x, y) coordinates on the screen. This allows interactions with UI elements, such as buttons and images located at specific, pre-defined positions in the application UI, to be reproduced. However, if the screen orientation or object layout changes, the scripts need to be rewritten. Indeed, the test blindly executes a given action at a given coordinate, so whenever the screen size varies between devices under test, the test can easily break (Knott 2015). Therefore, the approach is rarely applied in practice, and only very few tools provide coordinate-based identification.
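For illustration, replaying recorded actions at fixed coordinates with the Appium Python client could look as follows; `driver` is assumed to be an existing Appium session (see the session sketch later in this section) and the pixel values are placeholders that only hold for one specific screen size and orientation:

```python
# Blindly replay a recorded tap and drag at fixed coordinates.
driver.tap([(540, 960)])               # single tap at x=540, y=960
driver.swipe(100, 800, 100, 200, 400)  # 400 ms drag from (100, 800) to (100, 200)
```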

2.2.3 Optical Character/Text Recognition

The key in this approach is to identify the characters displayed on the screen, e.g., a “login” or “logout” button, by matching the text with the correct object on the screen in order to determine the relevant application control(s).

However, OCR technology depends on the ability to identify visible text, so any blurring or change in screen resolution may negatively affect identification accuracy. Also, such tools are not suitable for testing user-interface elements that are not visible or that change continuously. Untestable elements for OCR and text matching include lists of options that are not visible, application controls with hidden text, and dynamic text such as account balances or clocks that live-update. OCR recognition tools also tend to be slower than other types of tools because they need to scan the whole screen for the text (Knott 2015). Therefore, such techniques have significant limits and are commonly used in tandem with image-based recognition tools.
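For illustration, a minimal sketch of the text-matching step using the pytesseract OCR bindings is given below; the target word, file path and helper name are illustrative assumptions:

```python
from PIL import Image
import pytesseract
from pytesseract import Output

def find_text_center(screenshot_path, target="Login"):
    """Scan the whole screenshot for a word and return the centre of its bounding box."""
    data = pytesseract.image_to_data(Image.open(screenshot_path), output_type=Output.DICT)
    for i, word in enumerate(data["text"]):
        if word.strip().lower() == target.lower():
            x = data["left"][i] + data["width"][i] // 2
            y = data["top"][i] + data["height"][i] // 2
            return x, y
    return None  # hidden or dynamic text cannot be located this way
```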

2.2.4 Native Object Recognition

Native object identification is based, first, on recognizing application object properties in the application code, such as the ID or XPath, and, second, on testing those elements. Native object recognition is one of the most widely used types of mobile test automation, where the UI objects are identified using the UI element tree. There are several ways to access the UI elements, such as XML Path Language (XPath), Cascading Style Sheet (CSS) locators or the native object ID of the element. With native object recognition, the developer can define the IDs or locators properly and build very robust tests. The biggest advantage of this approach is that it does not depend on changes in the UI, orientation, resolution or the device itself (Knott 2015). The identification of programmatic objects makes this technique the most resilient to changing code, and hence quite reliable, although it requires more effort and programming knowledge.

2.2.5 Gesture Record and Replay (GRR)

The basis of this approach is to record screen interactions during manual testing, including every mouse movement, keystroke and screenshot, to be replicated later on. This utility usually comes bundled as a record-and-playback tool to enable testers with no programming skills to record and replay the flow of a test case. It is primarily used for repetitive testing across various platforms and device models. Since each recording is unique, this automation technique is only meaningful for stable applications that do not undergo major UI modifications; it mainly targets quick and easy automation of unchanging flows. However, whenever the environment becomes dynamic, with external interruptions such as incoming text and call notifications or changes in orientation/layout, this approach shows serious limitations. Many tools, such as UFT and Perfecto (footnote 2), have capture-and-replay capabilities.

Figure 2 describes the basic principle of the R&R technique (MacKenzie 2012). In the recording stage, the UI is connected to the business logic directly. When the test is recorded, the signals from the UI are intercepted by the Recording Decorator. Once the signals are stored, the decorator forwards them to the business logic and the application under test (AUT) continues as it would without the decorator. During the execution phase of the test, the UI is put into sleep mode and the Playback Driver reads the signals from the container and sends them to the business logic.

Fig. 2 Record and replay using a recording decorator adapted from MacKenzie (2012)
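The decorator idea of Fig. 2 can be sketched as follows; the class and method names are illustrative and do not correspond to MacKenzie's code:

```python
class RecordingDecorator:
    """Sits between the UI and the business logic: stores each signal, then forwards it unchanged."""
    def __init__(self, business_logic, recorded_signals):
        self.business_logic = business_logic
        self.recorded_signals = recorded_signals

    def send(self, signal):
        self.recorded_signals.append(signal)     # record the intercepted signal
        return self.business_logic.send(signal)  # the AUT continues as without the decorator


class PlaybackDriver:
    """Replays the stored signals against the business logic while the UI stays idle."""
    def __init__(self, business_logic, recorded_signals):
        self.business_logic = business_logic
        self.recorded_signals = recorded_signals

    def replay(self):
        for signal in self.recorded_signals:
            self.business_logic.send(signal)
```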

Besides, in practice, many test automation tools combine the aforementioned types and are not usually locked into a single object recognition type. Every type has its pros and cons, so the developer has to choose the approach that best fits his needs and constraints, based on the mobile platform employed and the properties of the mobile game (Knott 2015). Adapted from Linares-Vásquez et al. (2017), Table 1 summarizes the main automation framework APIs, record & replay tools, and automated GUI-input generation tools.

Table 1 Overview of automation frameworks and APIs

On the other hand, one can distinguish notable tools that are of paramount importance for developers. Appium (2012) is an open-source test automation framework that can test native, hybrid and mobile web applications on Android, iOS and Windows platforms. One special feature of Appium is that developers do not have to modify the application binaries to test the application, because Appium uses vendor-provided automation frameworks. Appium uses the WebDriver protocol (footnote 3) to wrap the vendor-provided framework into a single API. WebDriver specifies a client–server protocol, known as the JSON Wire Protocol (footnote 4), for the communication. Clients have been written in most major programming languages, such as Ruby, Python and Java (footnote 5).

In terms of software implementation, Appium runs a server on the host machine. The client, where the test logic is located, connects to the server. If the operating system of the device is Android, the server forwards the commands from the client to the device via the UI Automator framework (see Fig. 3). On older Android devices (Android API level < 17), the server communicates with the device via Selendroid.

Fig. 3 Appium on Android architecture
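For illustration, a typical Appium session could be created from a Python client roughly as follows; the capability values are placeholders, and passing capabilities as a plain dictionary assumes an older version of the Appium Python client:

```python
from appium import webdriver

caps = {
    "platformName": "Android",
    "deviceName": "Android Emulator",
    "appPackage": "com.example.game",   # placeholder package name
    "appActivity": ".MainActivity",     # placeholder launch activity
    "automationName": "UiAutomator2",   # UI Automator backend, as in Fig. 3
}

# The client connects to the Appium server running on the host machine.
driver = webdriver.Remote("http://127.0.0.1:4723/wd/hub", caps)
# ... test steps go here ...
driver.quit()
```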

Close alternative candidates to image-based automation tools are summarized below.

SikuliX (footnote 6) is a tool that automates everything on the screen. It was formerly known as Sikuli (footnote 7). It uses OpenCV image recognition to find the objects to click on the screen. SikuliX does not support mobile devices out of the box, but it can be made to work with simulators, emulators or VNC (virtual network computing) solutions where the mobile device screen can be accessed from the desktop (Yeh et al. 2009; Chang et al. 2010).

JAutomate (Alegroth et al. 2013) is a commercial tool combining image recognition with record-and-replay functionality. Like SikuliX, it does not support mobile devices out of the box, but it can be made to work with simulators, emulators or VNC solutions where the mobile device screen can be accessed from the desktop.

Also, in terms of record-and-replay capability, one should mention Testdroid Recorder (footnote 8), a free plugin for Eclipse (footnote 9). It is a record-and-replay tool that records user actions with the AUT and generates reusable Android JUnit, Robotium (footnote 10) and ExtSolo (footnote 11) tests. The generated tests can be replayed afterwards.

Robotium Recorder (footnote 12) is a commercial plugin for Android Studio, very similar to Testdroid Recorder; it can also record and replay Robotium tests.

Appium GUI (footnote 13) is a project that provides a graphical user interface (GUI) for Appium. It includes an inspector that provides information about the objects on the screen, as well as a recorder that can record and replay Appium tests. Table 2 summarizes the aforementioned main application testing tools.

Table 2 Review of main application testing tools

Nevertheless, despite the multiplicity of mobile automation testing tools, as highlighted in Tables 1 and 2, the effectiveness of such tools remains limited in practice, especially when dealing with complex interactive mobile games, as pointed out in [56–57]; this motivates the proposed MAuto.

3 MAuto

3.1 Motivation and Rationale

Usually, the functionality of a mobile game is executed at runtime inside a graphics container, e.g., OpenGL, to provide better graphics and interaction capabilities to users. Often, the container wraps all the functionality of a game, so it is not possible to access the wrapped functionality in order to test it. Several methods have been developed to overcome this problem. The most common and effective techniques are (1) programming the container in a particular way to expose functionality outside the container and (2) implementing image recognition approaches to identify functionality from the screen, which is then transmitted to the testing process.

Nevertheless, to use an image-recognition-based technique, the user needs the graphical representation of the object to find, e.g., buttons or game characters. Sometimes the user can get those elements directly from the graphics designer, but this is not always the case. Also, the game might change the environment and the context in which the object is presented, e.g., shadows and lighting, which, in turn, affects the success ratio of image recognition. Therefore, it is better to use the actual context of the game and take screenshots while playing it.

Accordingly, screenshots are taken while the game is running on the mobile device and objects are extracted in a real context. Since the screenshots are stored in the memory of the mobile device, the user needs to transfer each image to his/her machine. Once the screenshot is available on the user's machine, the object or partial image can be extracted from it. Finally, when every object required to run the game has been extracted, the user can utilize these objects to write the automation code that replays the sequence he played before.
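As a stand-alone illustration of this capture-and-crop cycle (MAuto itself transfers the images over the VNC connection described in Sect. 3.2), the following sketch pulls a screenshot over adb and crops a small square around the recorded click; the 10-pixel box mirrors MAuto's default, while the function name and file paths are assumptions:

```python
import subprocess
from PIL import Image

def capture_query_image(x, y, half=10, out_path="query.png"):
    """Pull a screenshot from the device and crop a square around the click coordinate."""
    png = subprocess.run(["adb", "exec-out", "screencap", "-p"],
                         check=True, capture_output=True).stdout
    with open("screenshot.png", "wb") as f:
        f.write(png)
    img = Image.open("screenshot.png")
    box = (max(0, x - half), max(0, y - half), x + half, y + half)
    img.crop(box).save(out_path)
    return out_path
```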

To make the cycle above easier and faster for the user, we propose a tool called MAuto. MAuto automatically takes the screenshots and extracts the objects from them while the user is playing the game. Once the sequence is ready, MAuto generates the Appium test code to replay the sequence. The design and architecture of MAuto are detailed in the next section.

3.2 General Architecture

The developed MAuto is a new automatic mobile game testing tool, which targets users without programming skills. Indeed, the tool enables generating an Appium test without writing a single line of code. Of the mobile test automation techniques highlighted in Sect. 2, MAuto makes use of two: image-based recognition and record & replay. The image-based approach uses AKAZE (accelerated KAZE) features (Alcantarilla et al. 2013).

From an input–output perspective, MAuto's overall architecture involves three elements, namely the user, the browser and the mobile device, and it generates a test script that the user can run later on (see Fig. 4). Once the user has launched MAuto, all interactions between the user and the tool take place through the browser only. MAuto takes care of the mobile device, so the user's role is reduced to starting MAuto and interacting with the application via a web browser. When the user has performed the recording task, MAuto generates a test script that can reproduce the recorded events. MAuto itself is not able to replay the test, but the test script can be replayed with Appium.

Fig. 4 System overview highlighting its components: user, mobile and browser

Figure 5 gives a more detailed view of the system. The two physical components are the mobile device and the host machine. In the beginning, MAuto installs and launches the application under test (AUT) and the VNC (Virtual Network Computing) server on the device. Then MAuto initiates the connection between the VNC server and the client. If a user-related event occurs, the VNC client forwards this event together with its coordinates to MAuto. The latter captures a screenshot from the mobile device, saves it in a separate database, and gives the VNC client permission to continue its processing. The VNC client then sends the same event to the VNC server. Next, the UI view from the mobile device is updated in the VNC client. When the user has finished his manipulations, MAuto generates the test script from the screenshots and events.

Fig. 5 Architectural design

In summary, MAuto acts as an R&R tool where the tool records the user interactions and Appium is employed to replay the tests. The recording decorator is a modified VNC viewer in the browser, while the replay driver is an Appium test together with the image recognition module.

Figure 6 summarizes how the recording sequence interacts with the MAuto tool. At first, the user launches MAuto from the command line. MAuto installs the VNC client on the mobile device, then installs and executes the AUT. MAuto runs an accessible web server such that, whenever the recording task is ready, MAuto opens up the associated webpage. The user can then visualize and monitor the various UIs of the device, so that the events associated with the VNC client running in the browser can be monitored. Next, MAuto runs a modified VNC client, which sends the event(s) together with the associated coordinates to MAuto. The latter saves the event, takes a screenshot of the mobile device screen and extracts the query image around the event coordinates.

Fig. 6 MAuto sequence diagram

The API call from the VNC client to MAuto returns only after the corresponding images have been processed. Then the VNC client passes the event to the device via the VNC protocol, and the UI in the VNC client is updated in turn. This continues until the user decides to stop the recording. Finally, when MAuto receives the command to stop the recording, it generates the test script, which can be used with Appium to replay the test on any given device.

MAuto stores the screenshots and query images in the session folder. The latter also contains a CSV file where the events and image names are saved. The test script generator loads the CSV file from disk and transforms it into an Appium-compatible test file.
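As a sketch of this generation step, the transformation could look as follows; the CSV column layout and the `tap_matched_image` helper are assumptions for illustration and do not reproduce MAuto's actual file format:

```python
import csv

# Assumed columns of the session CSV: event type, x, y, query image file name.
STEP = "    tap_matched_image(driver, 'session/{image}', fallback=({x}, {y}))\n"

def generate_appium_test(csv_path, out_path):
    """Turn the recorded events into an Appium-compatible test function."""
    with open(csv_path, newline="") as f, open(out_path, "w") as out:
        out.write("def test_recorded_session(driver):\n")
        for event, x, y, image in csv.reader(f):
            if event == "click":
                out.write(STEP.format(image=image, x=x, y=y))
```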

3.3 Image Recognition in MAuto

MAuto calculates AKAZE features in the query image and the current screenshot. Fast Explicit Diffusion (FED) schemes are used in AKAZE to speed up feature detection in nonlinear scale spaces. AKAZE also introduces the Modified-Local Difference Binary (M-LDB) descriptor to keep computational demand and storage requirements low. Once the features of both images (the query image and the current screenshot) are calculated, thresholding is employed to compare those features and ascertain whether the query image is currently shown on the screen and, if so, its coordinates; see Fig. 7 for a detailed implementation description. Examples of experimental results using these features are reported in Sect. 4 of this paper; see, e.g., Fig. 11, where the green circles are the calculated features and the red lines are the matching features in both images. Once the matching features are identified, we calculate the average coordinate from the inliers to get the coordinate of the query image in the screenshot.

Fig. 7 Pseudo-code implementation of the AKAZE-based image recognition algorithm
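A minimal OpenCV sketch of this matching procedure is given below; the ratio-test value and the minimum-match threshold are assumptions, while averaging the matched coordinates follows the description above:

```python
import cv2
import numpy as np

query = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)
screen = cv2.imread("screenshot.png", cv2.IMREAD_GRAYSCALE)

# Detect AKAZE keypoints and compute binary (M-LDB) descriptors in both images.
akaze = cv2.AKAZE_create()
kp_q, desc_q = akaze.detectAndCompute(query, None)
kp_s, desc_s = akaze.detectAndCompute(screen, None)

# Binary descriptors are compared with the Hamming distance; a ratio test keeps good matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
good = []
for pair in matcher.knnMatch(desc_q, desc_s, k=2):
    if len(pair) == 2 and pair[0].distance < 0.8 * pair[1].distance:  # 0.8 is an assumed ratio
        good.append(pair[0])

if len(good) >= 10:  # assumed minimum number of matches
    # Average the matched screenshot coordinates to estimate where the query image lies.
    pts = np.float32([kp_s[m.trainIdx].pt for m in good])
    x, y = pts.mean(axis=0)
    print(f"Query image located near ({x:.0f}, {y:.0f})")
else:
    print("Query image not found on the current screen")
```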

4 Exemplification and Evaluation

The MAuto tool has been tested and validated using the Clash of Clans Android mobile game, version 8.551.4 (available from Supercell, footnote 14). Clash of Clans (CoC) is a mobile massively multiplayer online game (MMO/MMOG) in which the player builds a community, trains troops and attacks other players to earn assets. The game has a tutorial that the player must pass in order to play the game. The tutorial guides the player to click certain elements to continue the game. Therefore, if the tutorial can be passed without serious bugs, it is likely that the game works properly. Besides, since the variations in the tutorial are quite limited, it makes a good test subject for MAuto. During the recording phase of MAuto, the browser pops up, indicating readiness to start the recording task, as seen in Fig. 8.

Fig. 8 Browser's view in case of Clash of Clans

The first view requiring user interaction in Clash of Clans is the important notice view, see Fig. 9. This view is accessible using native object recognition. The object's resource ID is android:id/button3 and its associated package is com.supercell.clashofclans.

Fig. 9 The first view which requires interaction in Clash of Clans
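With the Appium Python client, the generated test could dismiss this view through its native locator roughly as follows; this is a hedged sketch in which `driver` is an existing Appium session and the import path corresponds to older client versions:

```python
from appium.webdriver.common.mobileby import MobileBy

# Locate the notice view's button by the resource ID reported above and dismiss it.
button = driver.find_element(MobileBy.ID, "android:id/button3")
button.click()

# An equivalent, but more brittle, XPath locator would be:
# driver.find_element(MobileBy.XPATH,
#                     '//android.widget.Button[@resource-id="android:id/button3"]')
```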

Clash of Clans then requires the user to select a Google Play account for the game; this view is accessible using native object recognition as well (see Fig. 10).

Fig. 10 Google Play account selection

The subsequent views are no longer accessible using native object recognition; therefore, image recognition is needed. An example of this task is shown in Fig. 11. The green circles are the calculated features and the red lines are the matching features in both images. Once we have the matching features, we calculate the average coordinate from the inliers to get the coordinate of the query image in the screenshot.

Fig. 11 Example of image recognition using AKAZE features—query image is correctly matched from the screen

An example of a file created by MAuto to save the click coordinates, screenshots and query images is shown in Fig. 12, while an example of an Appium script generated by MAuto is reported in the “Appendix” of this paper.

Fig. 12 Example of MAuto output file

Next, the record and replay phase is carried out using an Appium test script that relies on the image-recognition-based approach. An example of such an Appium test script is shown in the “Appendix”.

5 Discussion

It is very time consuming and error-prone for human testers to take regular screenshots from the device, transfer them to the host machine and crop the appropriate query image. Such a repetitive task can be automated much more efficiently. MAuto is designed to do so and, thereby, decrease the amount of manual work required: it takes the screenshots and transfers the images to the host machine automatically. More specifically, MAuto crops the images and creates reusable tests through the appropriate use of Appium. To demonstrate its feasibility and technical soundness, MAuto was used to create automated test scripts for the Clash of Clans mobile game. Strictly speaking, although MAuto does not automate everything, it can significantly improve the speed of test automation script creation. Nevertheless, the selected query images have a huge impact on test stability on other devices. Indeed, the query image must contain enough structure for the AKAZE features to be matched appropriately on the screen. Figure 13a highlights an example of a query image of a click where MAuto and AKAZE found only 4 features, so that this query image most likely cannot be found on the screen when the test is run (see Fig. 13b).

Fig. 13 Example of bad AKAZE matching

The current version of MAuto uses a predefined box to crop the query image around the click coordinate and, sometimes, the box size is too small to contain a usable number of features. To circumvent this limitation, the user needs to manually crop a better query image from the screenshot in order to obtain enhanced results.

Initially, the user should use native object recognition whenever possible. Indeed, native object recognition is found to be relatively stable; when available, such an approach should therefore be preferred over MAuto's image recognition, which typically takes over after the native object recognition step.

When creating the tests, the screenshot operation is quite slow; it can take a few seconds to capture a screenshot. This means the usability of the application in the browser is not the same as on the device without MAuto. It is also harder to play games through the browser than on the device.

MAuto cannot work with fast-paced mobile games, because it is too slow: transferring the screenshot from the mobile device to the host simply takes too much time. Therefore, it is almost impossible to play fast-paced games with MAuto, because the game can end within a couple of seconds without new inputs.

Many recording tools do far better when native object recognition can be used in the application; when it cannot, MAuto takes over. However, we should note that it is not possible to give inputs to mobile device sensors through MAuto. This means that games that use sensor data cannot be tested directly. Although Appium has some support for sensor inputs, MAuto cannot record those inputs.

6 Conclusion

This paper focused on mobile game testing. We reviewed the motivation, key milestones and challenges pervading the development of automated mobile game testing tools. In particular, we highlighted why mobile game testing and test automation are harder than testing traditional mobile applications. One of the key reasons is that native object recognition is less applicable to games, so additional object recognition methods, like image recognition, are necessary. Besides, the acknowledged fun factor renders traditional sequential approaches quite inefficient. A review of existing technologies revealed five key approaches for mobile game testing: image-based, coordinate-based, OCR/text recognition, native object recognition, and gesture record and replay. Image-based recognition testing has shown increased performance, although with limited scope.

Tools to create image-based recognition test scripts have not yet reached maturity and are still under development. This paper has introduced a testing tool, MAuto, to make it easier to create automated mobile game tests. The approach is based on a fruitful combination of AKAZE features, Appium, record and replay, and native object recognition. Evaluation and testing have been conducted using the Clash of Clans game. It is stressed that a good choice of query images is required to make the test stable for production use.

MAuto is a rather raw approach to solving challenging mobile automatic testing problems. With some polishing, MAuto would work well on slow-paced games, but it does not work with fast-paced games that require rapid user interactions.

MAuto allows the user to create Appium tests with image recognition without writing a single line of code. The main target is to help developers test mobile games, but the tool can be used for other application types as well.

As future work, it would be a good idea to make the cropped image size dynamic. At the moment, it is a static 10-pixel square around the click coordinate. When cropping the query image, we could calculate the number of features in the image and, if there are fewer than, for example, 20 features, increase the query image size and recalculate the features until the image has a sufficient number of features (see the sketch after this paragraph). This would decrease the manual work the user has to do to fix low-quality query images. It should also be quite easy to add iOS support to MAuto. Appium already works for both Android and iOS; the remaining problems are finding a way to take a screenshot from the iOS device and finding a quality VNC client for iOS. The image recognition solution of MAuto can therefore easily be extended to an iOS environment.
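A sketch of the dynamic-crop heuristic proposed above is given below, using the 10-pixel starting box and the 20-feature goal mentioned in the text; the growth step and the upper bound are assumptions:

```python
import cv2

def crop_with_enough_features(screenshot, x, y, min_features=20, half=10, step=10, max_half=80):
    """Grow the crop box around (x, y) until AKAZE detects at least min_features keypoints.

    `screenshot` is a grayscale image loaded with cv2.imread.
    """
    akaze = cv2.AKAZE_create()
    while True:
        crop = screenshot[max(0, y - half):y + half, max(0, x - half):x + half]
        if len(akaze.detect(crop, None)) >= min_features or half >= max_half:
            return crop  # enough features found, or the box hit its upper bound
        half += step
```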

To overcome the slowness of MAuto, especially in the recording phase, one solution could be to compress the image on the mobile device before sending it to the host machine. Another solution consists of tapping into the Android operating system and removing the VNC solution completely. From an input–output perspective, MAuto takes the user inputs from a desktop browser, which is not an ideal way to interact with a mobile device. It would be better, for instance, to capture the inputs directly from the screen of the mobile device and transfer the clicks and images to the host machine after the test has been recorded. Indeed, if the test recording took place on the mobile device, MAuto might be able to capture the sensor inputs and write those inputs into tests as well.