Sustainable mobility persuasion via smartphone apps: Lessons from a Swiss case study on how to design point-based rewarding systems

In the effort to counteract problems associated with the current carbon-intensive transport system, app-based tools persuading mobility behaviour change have emerged worldwide. Most such apps adopt a gamified approach and motivate behaviour change through extrinsic motivational factors such as real-life prizes, which are attributed based on the distance travelled by non-car transport modes. Although this approach might be effective in promoting additional leisure trips by sustainable mobility, it might keep car-based commuting habits unaltered, or even stimulate unfair app behaviour to gain points. In this paper, we focus on the Bellidea persuasive app, which was co-designed with interested citizens in a Swiss-based living lab experiment, and present how we addressed the shortcomings of prize-based rewarding systems, while also dealing with the constraints imposed by current levels of accuracy in automatic transport mode detection. We illustrate and discuss our design choices and the related algorithmic solutions by referring to the following dilemmas: “single transport modes versus modal split”, “trust versus control”, “dynamism versus rigidity”, and “global versus local”. We conclude by analysing real-life mobility data sets collected by the Bellidea app and discussing our design solutions against their capacity to attract its target user group, namely car drivers.


Introduction
Cities worldwide are trying to counteract problems associated with the current intensive use of private cars, such as traffic congestion, energy consumption, carbon emissions and safety, to mention just a few key ones. Nowadays, improving a city's transport system does not solely mean building new infrastructure or repairing ageing infrastructure: transportation does not rely only on concrete and steel, but is also increasingly dependent on information and communications technologies (ICT) (Ezell, 2010; Gössling, 2018), frequently provided by smartphone apps (Shaheen et al., 2017). Among the soft policy measures that were introduced to strengthen traditional urban mobility management and favour the reduction of car use and the adoption of more sustainable mobility patterns (Bamberg et al., 2011; Semenescu et al., 2020), ICT-based soft policy measures are increasingly adopted to support the uptake and effectiveness of cognitive-motivational tools (Steg and Tertoolen, 1999) that promote more sustainable mobility patterns at the individual level.
In particular, in the framework of persuasive technology (Fogg, 2002) and behaviour change support system approaches (Oinas-Kukkonen, 2013), many smartphone-based behaviour change apps for sustainable mobility have recently been developed, such as those listed in Table 1. Even though they are seldom explicitly grounded in a behaviour change theory, depending on their specific contents, they can generally be related to the norm-activation model (Schwartz, 1977), to the theory of planned behaviour (Ajzen, 1991), or to the transtheoretical model of behaviour change (Prochaska and Velicer, 1997). They monitor individual mobility data, provide feedback on the related ecological, energy and climate footprints, and leverage a number of mechanics and elements to promote a reduction of such footprints (Weiser et al., 2015).
Usually they exploit a combination of persuasive techniques, such as providing feedback on the consequences of individual mobility choices (such as the impact on energy consumption or CO2 emissions), inviting users to define personal goals for change or to engage in challenges, and comparing performances within the virtual community of all app users. Their real effectiveness in bringing about tangible impacts in reducing car use at the societal level is however still an open research question, as indicated by a number of review analyses (Shaheen et al., 2016; Sunio and Schmöcker, 2017; Vlahogianni and Barmpounakis, 2017; Anagnostopoulou et al., 2018; Andersson et al., 2018), which tend to support previous analyses on soft policy measures (Graham-Rowe et al., 2011; Arnott et al., 2014; Scheepers et al., 2014). The related interventions in fact tended towards weak research designs (lack of control groups and randomization, poor reporting of both the intervention characteristics and the statistical significance of their outcomes) and insufficient grounding in behaviour change theories.

Persuasive apps and the risk of "preaching to the converted"
One of the most frequently adopted approaches to persuade behaviour change through apps is gamification, which is usually defined as the use of game elements in non-gaming contexts (Deterding et al., 2011). The most exploited mechanics encompass goal setting, challenging, rewarding, competition, and collaboration, which in turn rely on a number of elements, such as quests, achievements, badges, points, and leaderboards. According to the Self-Determination Theory by Deci and Ryan (2004), gamified mechanics and elements act as "extrinsic motivational factors", namely they move individuals towards a change in their behaviour due to separate benefits they might gain. Opposite to them are "intrinsic motivational factors", which instead move individuals to change their behaviour since it is inherently interesting or enjoyable in itself. Not all extrinsic motivational factors, however, are characterized by the same magnitude and type of motivational power. For instance, Ryan and Deci themselves (2000) created a taxonomy of human motivation that further classifies extrinsic motivation into categories, which range from "external regulation" (extrinsic rewards and punishments) to "integrated regulation" (hierarchical synthesis of goals), depending on the level of autonomy (self-determination) of individual actions. Among such extrinsic motivational factors, the external regulation ones are presented as "impoverished forms of motivation", while the integrated regulation ones are seen as "active, agentic states", which tend to have a motivational effect comparable to that of intrinsic factors (Ryan and Deci, 2000, page 55).
Some persuasive apps opted for exploring extrinsic motivational factors closer to "integrated regulation", such as the GoEco! app, which was developed and tested in Southern Switzerland. This app revolves around individual goal-setting features and only offers virtual rewards such as badges. Field interventions aimed at assessing use and effectiveness of the GoEco! app, however, showed it tended to raise the interest of "already converted" individuals (Cellina et al., 2019a), namely people who were already using public transport and active transport modes (walking and cycling), whose mobility patterns were quite different from those of the average "mainstream car driver" population of Southern Switzerland, described by the Swiss Mobility and Transport Census (SMTC, FSO/ARE, 2017).
Failing to engage average "mainstream car driver" citizens, the GoEco! app was therefore seen to produce a limited impact in the transition towards more sustainable mobility patterns. Consistently, previous research at the international level summarised by Habibipour et al. (2016) reports that building a long-term commitment between an ICT-based system and its users requires a proper economic reward. Similarly, Tsirimpa et al. (2019) found that reward-based incentive schemes are effective in favouring multi-modal mobility. These findings, therefore, led us to explore a different approach and to explicitly include tangible, real-life prizes in the persuasive process: we hypothesized that such prizes, acting as "external regulation extrinsic motivation factors", could help to also raise the interest of the less intrinsically motivated "mainstream car drivers". Once users are engaged in app use, and while they put the new behaviour into practice, other motivational factors would hopefully become dominant: individuals would start experiencing the intrinsically interesting properties of the new behaviour, and become intrinsically motivated to keep putting it into practice over time (Ryan and Deci, 2000).

Shortcomings of typical point-based rewarding systems and related open challenges
The most immediate way to include real-life prizes in an app is to anchor them to a point-based rewarding system. Most of the current behaviour change apps in the mobility field in fact automatically attribute points based on the tracked mobility data, according to a given set of rules. Point-based rewarding, however, may have undesired counter-effects. As shown in Table 1, apps usually award points based on the kilometres travelled with given transport modes (usually active transport modes). This implies that app users who go for a bicycle ride during their leisure time are rewarded with points, even though they keep using their car when commuting to work during peak hours. In other words, they are rewarded even though they are not helping to address the current critical impacts of car use, which is why most of those apps were introduced. Additionally, as remarked by Froehlich et al. (2009, 2015), such a point system might encourage people to take more trips simply to earn more points, leading to an increase in energy consumption, emissions and the related environmental and climate impacts: as stated by Froehlich et al. (2009, p. 10), "some green trips could lead to more emissions than no trip at all". This is exactly the opposite of what apps persuading more sustainable behaviour are designed for. Another limitation that can critically affect point-based rewarding systems, when points (and especially real-life prizes) directly depend on the collected mobility data, concerns the possibility for the users to validate the app-detected transport mode. In fact, while the large majority of apps are trying to move towards a totally automatic mobility tracking system, in order to minimize the need for explicit user input (Bothos et al., 2014), trip validation is still required by the current transport mode detection algorithms.
Table 1. Examples of persuasive apps aimed at reducing individual car use. Active modes are walking and cycling.
Earlier apps required users to manually enable tracking before starting a trip, indicate the mode of transport they were going to use, and then manually stop tracking once the trip had ended, thus requiring a strong interaction with the user. More recent apps instead run in the background and automatically detect the start and end of every trip, also recognizing the transport mode used (Söderberg et al., 2021). Despite such progress, however, Jonietz and Bucher (2018) note that to some degree all such apps would still benefit from a manual check of the transport mode by the user, if a good level of accuracy is sought. A recent trial aimed at assessing the effectiveness of state-of-the-art mobile tracking apps also confirmed that automatic detection capability is still limited (Harding, 2019) and there is room for improvement of precision, recall and overall accuracy. Therefore, apps frequently adopt a mixed approach, combining automatic detection with manual validation by the users, who are requested to either confirm the detected transport mode or to indicate the correct one (a procedure that hereafter we will refer to as validation). This is the case, for instance, of the above-mentioned GoEco! app: for every recorded route, GoEco! predicts the transport mode, but it always asks users to manually validate it, thus ensuring high accuracy in the identification of individual mobility patterns (Bucher et al., 2019).
On the one hand, providing users with transport mode validation features is essential in order to avoid disaffection and app churn due to low detection accuracy. On the other hand, however, a risk of cheating appears when real-life prizes are at stake: users should not be allowed to freely modify any tracked data, namely to indicate different transport modes than the ones they actually used, if this would allow them to collect more points (and thus prizes) than they actually deserve. Thus, the problem arises of finding the right trade-off between the two contrasting needs of validating the detected transport mode and of preventing users from cheating. More generally, how to guarantee the overall fairness of the system, when point-based rewarding systems are exploited and real-life prizes are offered, emerges as an open research challenge.

Research goals and content of this paper
Against this background, we aimed at developing a persuasive app that, to overcome the risk of "preaching to the converted", exploited external regulation motivational factors such as real-life prizes, in such a way as to mitigate the sustainability and fairness shortcomings that affect the currently dominant "validation-friendly", point-based rewarding systems. The Bellidea living lab, an initiative aimed at reducing car-based traffic and enhancing sustainable mobility, which took place in the Italian-speaking part of Switzerland (City of Bellinzona), provided us with the opportunity to tackle this challenge. The Bellidea living lab, which was held in the same geographical, socio-economic and cultural context where the GoEco! app had been field tested, aimed in fact at engaging interested citizens in co-designing a persuasive smartphone app, named Bellidea, which targeted a reduction in car use at the local level and was expected to be made freely available to the whole population of Bellinzona among the local policy measures favouring active mobility and the use of public transport.
In this paper, we first present the methodologies and techniques we adopted to perform automatic mobility tracking and transport mode detection in the Bellidea app (Section 2). Then, we present and discuss our choices and strategies, namely how we addressed the research challenge, in terms of the proposed point-based rewarding system and the procedure for the validation of the transport mode (Section 3). We also analyse data collected after the launch of the Bellidea app to the population and discuss its capability to address the "preaching to the converted" phenomenon and to raise the interest of "mainstream car drivers". While this work does not attempt to demonstrate the overall behaviour change effectiveness of the Bellidea app, which would require setting up a controlled intervention (possibly a randomized controlled trial), it can enrich the knowledge base currently available for the development of persuasive apps in the field of sustainable mobility with new experiences, ideas and solutions. Therefore, in Section 4 we conclude by providing recommendations for future persuasive apps aimed at exploiting external regulation motivational factors such as real-life prizes, and by highlighting remaining open challenges for future research activities.

Materials and methods
The Bellidea app was developed within a transition experiment (Hoogma et al., 2002) launched in Spring 2017 by a team composed of the Swiss City of Bellinzona, a non-governmental association advocating for the diffusion of bicycle use (Provelo), and the local University of Applied Sciences. The app's features were in fact co-created in a living lab experiment (Almirall et al., 2012) engaging the project team and a group of volunteer citizens, who regularly met once a month from March 2017 to February 2018, with a break during Summer. Hereafter, we will refer to these people as living lab participants. The users of the app after its official release to any interested citizens in Bellinzona will instead be referred to as standard users.
Details about the Bellidea living lab are provided elsewhere (Cellina et al., 2020). In short, living lab participants were recruited through a communication campaign targeting citizens living or studying in the city of Bellinzona and its surroundings. A press release was published, which was amplified by the local radio and newspapers. Personal contacts by the officers of the city of Bellinzona and the related word-of-mouth process were particularly helpful in spreading the word and made it possible to engage a sufficiently large and diverse group of participants (n = 46). About 40 % of the living lab participants were female and the group included representatives of younger and older generations, normally spread around a dominant group of 40-49 year old individuals (45 % of the participants). A variety of levels of education was included as well, with a tendency towards highly educated people. About 40 % of them in fact had a high school diploma and 8 % of them had a PhD degree, while only 13 % of them had only concluded the compulsory education cycle.
The first Bellidea lab meetings were organized as co-creation workshops, aimed at identifying the key features to be included in the app, as well as designing the rules behind them; later meetings were instead conceived as test-beds for the Bellidea app prototype, while professional software developers were translating the lab proposals into software code. During the workshops, living lab participants were thus initially requested to provide their inputs and comments on possible alternative app features. Later on, they were invited to test the app features and, between one meeting and another, to use prototype versions of the app to collect and validate their mobility data, in order to properly train our transport mode detection algorithms. The number of participants who kept interacting with the living lab decreased over time: the sample of individuals who contributed to the app-based mobility data collection was in fact equal to 28.
The requirements for the Bellidea app features, as they resulted from the co-creation workshops, can be summarized as follows:
• the app performs automatic mobility tracking;
• the app provides users with (eco)-feedback on their individual mobility patterns;
• the app stimulates users with mobility-related challenges at the individual and collective level;
• the app rewards users with points: the higher the amount of collected points, the more sustainable the individual mobility patterns;
• points allow for comparison between app users;
• points are attributed for use of active mobility and public transport (sustainable transport modes);
• points can be redeemed for real-life prizes, such as discounts on energy bills or vouchers for local stores and public transport tickets, to be offered by the City of Bellinzona.
Fig. 1 shows a selection of the screenshots of the Bellidea app (only available in Italian) as the outcome of co-creation. Our research goal in the Bellidea process was then to design such features, and the related algorithms and implementation procedures, in order to address the shortcomings on sustainability and fairness introduced in Section 1. Since the point-based rewarding system as well as possibilities for cheating are strictly dependent on the mobility tracking and transport mode detection algorithms, here we introduce the algorithms we relied upon.

Automatic mobility tracking
Due to time and budget limitations, developing mobility tracking components from scratch is often impossible within city-led projects such as the Bellidea one. Therefore, existing commercial tools are usually exploited, so that software design and development efforts can focus on the app's persuasive features. For Bellidea, following the approach of GoEco! (Bucher et al., 2016; Bucher et al., 2019), we opted for exploiting the activity tracker Moves app, originally developed for fitness purposes (currently no longer accessible, since it was discontinued in Summer 2018). We were well aware of the important limitations associated with the use of an external commercial fitness/activity tracking app such as Moves: particularly, the lack of control over and availability of the raw data collected (such as regular GPS records and accelerometer or gyroscope measurements), the lack of knowledge of the specific procedures used to collect them, and the high dependency on external decisions. Nevertheless, at the time of the app development we made this choice due to the lack of freely available, equally well-performing automatic mobility tracking tools.
This implies that, when installing Bellidea, users were also requested to install the Moves app, accept its "Terms and conditions" and "Privacy policy", and launch it. The basic mobility data collected by Moves were thus automatically imported into the Bellidea datastore via Moves API services, organized in routes and activities (segments of routes travelled with the same transport mode). For each activity, Moves provided the following features: distance, duration, start and arrival time, GPS coordinates of a few tracking points (their number depending on the specific route and activity), and estimated transport mode (walking, running, cycling, or "transport").

Transport mode detection
While being unable to tell the difference between a car, a motorbike, a bus or a train is not a problem for a fitness tracker app, it becomes critical in a mobility tracking app aimed at reducing car use. Therefore, the mobility data of the day before, collected via the Moves API, were sent once a day to the Bellidea classifier. For each activity, the classifier took as input the features provided by Moves, computed a list of features regarding both space and time attributes (distance, speed, distance between first and last track-points and public transport stops, differences with public transport schedule time, changes of direction between track-points), and fed them into a random decision forest algorithm (Breiman, 2001) to infer the probability that the activity had been performed with each transport mode (walking, cycling, train, bus, car, other). Based on such probabilities, the classifier selected a transport mode to be sent back to the Bellidea app. More details about the classifier are presented in Bucher et al. (2016).
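As an illustration of the kind of per-activity features listed above, the following is a minimal sketch in plain Python, not the Bellidea implementation. The `Activity` structure, the function names and the feature names are hypothetical; the timetable-based feature is omitted because it would require public transport schedule data, and the random forest step itself is not shown.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass
class Activity:
    """Hypothetical container for one activity (route segment with one mode)."""
    distance_m: float
    duration_s: float
    track_points: List[Point]


def haversine_m(p: Point, q: Point) -> float:
    """Great-circle distance in metres between two points."""
    earth_radius_m = 6371000.0
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def bearing_deg(p: Point, q: Point) -> float:
    """Initial compass bearing from p to q, in degrees within [0, 360)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    dlon = lon2 - lon1
    x = math.sin(dlon) * math.cos(lat2)
    y = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360


def mean_direction_change_deg(points: List[Point]) -> float:
    """Average absolute change of bearing between consecutive segments."""
    bearings = [bearing_deg(a, b) for a, b in zip(points, points[1:])]
    changes = [min(abs(b2 - b1), 360 - abs(b2 - b1))
               for b1, b2 in zip(bearings, bearings[1:])]
    return sum(changes) / len(changes) if changes else 0.0


def nearest_stop_m(point: Point, stops: List[Point]) -> float:
    """Distance to the closest public transport stop (inf if none is known)."""
    return min((haversine_m(point, s) for s in stops), default=math.inf)


def extract_features(activity: Activity, stops: List[Point]) -> Dict[str, float]:
    """Build an illustrative feature vector for one activity."""
    first, last = activity.track_points[0], activity.track_points[-1]
    speed = (activity.distance_m / activity.duration_s * 3.6
             if activity.duration_s > 0 else 0.0)
    return {
        "distance_m": activity.distance_m,
        "mean_speed_kmh": speed,
        "endpoints_distance_m": haversine_m(first, last),
        "start_to_stop_m": nearest_stop_m(first, stops),
        "end_to_stop_m": nearest_stop_m(last, stops),
        "mean_direction_change_deg": mean_direction_change_deg(activity.track_points),
    }
```

In a pipeline of this kind, the resulting dictionary would be turned into a numeric vector and passed to the classifier, which returns one probability per transport mode.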
Before the launch of the Bellidea app to the population of Bellinzona, the Bellidea classifier was trained using the following sets of validated activities (i.e., activities for which the transport mode actually used was available):
• pseudonymised data collected by Bellidea living lab participants during the Bellidea app design phase: 6'047 validated activities collected by 28 volunteer lab participants between 15 January and 11 March 2018, through a purpose-built version of the Bellidea app;
• pseudonymised data collected through the GoEco! app by participants in the GoEco! project, living in the same region involved in the Bellidea
Moderate differences between the two samples of users were not critical to the learning of the classifier; on the contrary, richer training data were expected to potentially improve the generalization capability of the classifier. On the other hand, excessive differences in the mobility patterns of the two samples of users, or between them and the target of the Bellidea app (Bellidea standard users), namely the people living in Bellinzona, were instead expected to negatively affect the performance of the classifier. We considered such a risk very small, however, for several reasons:
• the sets of data collected in the GoEco! project and in the Bellidea project were obtained through the same app (Moves);
• the GoEco! and Bellidea living lab participants belong to the same geographical area as the potential Bellidea users, and thus we hypothesized that they were characterized by the same transport supply as well as socio-economic and cultural context driving mobility needs;
• recruitment of participants to the Bellidea living lab and to the GoEco! project, as well as of Bellidea standard users, was performed with similar techniques: they were all volunteers, recruited through a press release, interventions on local mass media (radio, TV, newspapers and magazines), social media campaigns, as well as personal contacts and word-of-mouth, less than two years apart.

Results and discussion
In this Section we present the outcomes of the living lab, namely our design and computational choices regarding the Bellidea point-based rewarding system and the related transport mode validation procedure, which were aimed at ensuring the overall fairness and sustainability of the Bellidea persuasive app and at limiting the undesired effects that emerged in previous experiences. We illustrate and discuss our choices by referring to the following dilemmas:
• "single transport modes versus modal split": should points be assigned by accounting for travel by single transport modes or instead by accounting for individual mobility patterns as a whole?
• "trust versus control": should point assignment and transport mode validation procedure mostly rely on control or on trust attitudes by the app owners and managers?
• "dynamism versus rigidity": should points be dynamically updated whenever new travel data is detected or should a more rigid scheme be selected, by updating points of all users at fixed periods in time?
• "global versus local": should points be assigned by accounting for any travel, independently of the region where it occurs, or should points only be attributed to travel in the local region where the app is mostly used?
We then analyse the mobility characteristics of the Bellidea standard users and discuss the capability of our choices to raise the interest of "mainstream car drivers".

Single transport modes versus modal split
The limitations of the most frequently adopted approaches to the attribution of points were clear:
• they reward occasional activities by sustainable transport modes (e.g., trips by bicycle, walking or public transport) without penalizing unsustainable car-based routines;
• the amount of the reward increases with the travelled distance, thus favouring people who have a higher demand for travel, in absolute terms.
The very fact that an individual travels 100 km by bicycle, in fact, is not relevant per se: they might be travelled in a weekend leisure tour, in addition to the weekly car-based commuting and errands. Instead, if such 100 km travelled by bicycle coincide with the totality of the kilometres one individual travels over a given period of time, then they become highly relevant. Additionally, travelling long distances by public transport might be a necessary consequence of a previous choice to live in the suburbs of the city, instead of the city centre: therefore, individuals living in the city centre would be penalized with respect to individuals living in the city suburbs, since in absolute terms they travel less.
To identify a fairer point attribution strategy, we compared the different point attribution strategies by means of illustrative examples, based on "personas", namely fictitious characters whose mobility patterns and needs are nonetheless plausible and realistic in the context of the region of Bellinzona, and who could therefore represent possible Bellidea users. The characteristics and mobility needs of such "personas" were in fact inspired by the mobility patterns that living lab participants described during the living lab meetings. For instance, we considered "personas" and weekly trip schedules such as those presented in Table 2. Fig. 2 shows how the points assigned to the four personas change as the adopted point attribution strategy varies. In all the considered strategies, we regard as "sustainable transport modes" either active mobility (walking or cycling) or use of public transport. When points are attributed based on the travel distance by sustainable transport modes, Luca, who never uses the car, is rewarded with the highest amount of points. However, Paolo, who travels a considerable number of kilometres by car every week, is also rewarded with a high amount of points. And Marta, who travels many kilometres by car every week, is rewarded with the same amount of points as Anna, who instead uses the car four times less than Marta. When the modal split is instead considered, attributing points based on the percentage use of sustainable transport modes, differences between the users regarding their impact on traffic and the local transport system are acknowledged.
Based on similar considerations on a number of "personas", we decided to abandon the dominant approach attributing points based on the amount of use of a given set of transport modes (travel kilometres) and to consider, instead, personal mobility patterns as a whole and the percentage use of each transport mode (modal split approach). In doing so, Bellidea accounted for all routes and activities, regardless of their purpose or of whether they were systematic or not: leisure-time mobility needs, as well as errands, were seen as being as relevant as commuting trips for work or education purposes in determining one's mobility patterns and related impacts. By breaking the direct correlation between the amount of travel and the attribution of points, such a choice was also expected to weaken the incentive to perform additional, carbon-emitting trips just to gain more points. This does not completely remove the latter risk, but we expected such a design choice to reduce the chances of it occurring, since the number of points depends on all the other trips performed during the week (amount of travel and transport mode used).
Then, a decision was made about the specific variable to be used to measure the modal split, namely the percentage use of each transport mode: one could in fact refer to the percentage of trips, of the travel distance, or of the travel time.
To select the variable to use, we again refer to the "personas" represented in Table 2 and Fig. 2. When using the percentage of distance travelled by sustainable transport modes, Paolo is rewarded more than Marta, though the latter travels much less by car in terms of both travel time and travel distance, which means she has a lower impact on traffic and on the transport network. When using the percentage of trips by sustainable travel modes, Marta and Paolo are awarded exactly the same number of points, which is even worse in terms of acknowledging their actual impact on traffic. When instead the percentage of travel time by sustainable transport modes is considered, Marta is rewarded with slightly more points than Paolo. Also, Anna, who in total travels a really limited amount of kilometres, is rewarded with a high amount of points. This better captures her very limited overall impact on the transport network. Based on such considerations, while acknowledging that there is no univocal, always dominant "one-size-fits-all" point attribution strategy, we regarded the percentage of travel time as the best suited approach for a case such as Bellidea, which aimed at reducing car use due to its impact on traffic and on the transport network.
Adopting the modal split approach implied setting a period during which mobility data are collected, at the end of which points are attributed. A weekly time-step was selected, which makes it possible to take into account the variety of mobility needs users usually have, also including leisure and non-systematic trips, and therefore can provide a proper overall assessment of how sustainable one's modal choices are. A shorter period (e.g. one day) would result in too much variability: one day might appear to be more sustainable than another not because of active choices by the users, but simply because of different factors specifically influencing individual mobility needs. A longer period (e.g. one month), instead, would also be appropriate to summarize mobility patterns, though the feedback offered by the app would be too infrequent to have an impact, leading users to soon lose interest in the app.
The procedure we designed to attribute the Bellidea weekly points thus worked as follows. At the end of the week, the weekly total travel time was computed: if the percentage of travel time by sustainable transport modes was equal to 100 %, the user was attributed 100 points; otherwise, she was rewarded with a smaller amount of points, in linear proportion to such a percentage. Both public transport and active transport modes were accounted for as "sustainable transport modes", since both types of transport modes were under-used in the Bellinzona area and their use needed to be increased. However, to favour use of the bicycle in the Autumn and Winter seasons, when it is a less popular choice for comfort reasons, any recorded travelling time by bicycle during the cold season was doubled, thus providing the user with bonus points. Finally, to avoid attributing the maximum amount of weekly points to individuals with a very limited use of the Bellidea app, we opted for attributing no points in weeks with fewer than four tracked activities.
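To make the difference between the three candidate variables concrete, here is a small sketch that computes the share of sustainable travel by trips, by distance and by travel time. The weekly schedule is an invented example and does not reproduce the personas of Table 2; the function and variable names are likewise hypothetical.

```python
from typing import List, Tuple

# Modes counted as sustainable in the text: active mobility and public transport.
SUSTAINABLE_MODES = {"walking", "cycling", "bus", "train"}

# One trip is (transport mode, distance in km, travel time in minutes).
Trip = Tuple[str, float, float]


def sustainable_share(trips: List[Trip], variable: str) -> float:
    """Percentage of weekly travel made by sustainable modes.

    `variable` selects how each trip is weighted:
    "trips" (every trip counts 1), "distance" (km) or "time" (minutes).
    """
    weights = {
        "trips": lambda trip: 1.0,
        "distance": lambda trip: trip[1],
        "time": lambda trip: trip[2],
    }
    weight = weights[variable]
    total = sum(weight(t) for t in trips)
    if total == 0:
        return 0.0
    green = sum(weight(t) for t in trips if t[0] in SUSTAINABLE_MODES)
    return 100.0 * green / total


# Invented week: two long car trips and six short walks.
example_week: List[Trip] = [("car", 30.0, 30.0)] * 2 + [("walking", 1.0, 15.0)] * 6

for name in ("trips", "distance", "time"):
    print(name, round(sustainable_share(example_week, name), 1))
```

For this invented week the share is 75 % by trips, about 9 % by distance and 60 % by travel time, which shows how strongly the choice of variable can change the resulting score.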
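The weekly rule described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how the doubled bicycle time enters the percentage, so the sketch assumes it is doubled both in the sustainable travel time and in the total travel time, which keeps the maximum at 100 points. All names are hypothetical.

```python
from typing import List, Tuple

SUSTAINABLE_MODES = {"walking", "cycling", "bus", "train"}
MIN_WEEKLY_ACTIVITIES = 4   # no points below this number of tracked activities
MAX_WEEKLY_POINTS = 100.0

# One tracked activity is (transport mode, travel time in minutes).
TrackedActivity = Tuple[str, float]


def weekly_points(activities: List[TrackedActivity], cold_season: bool) -> float:
    """Weekly points in linear proportion to the sustainable travel time share."""
    if len(activities) < MIN_WEEKLY_ACTIVITIES:
        return 0.0

    sustainable_minutes = 0.0
    total_minutes = 0.0
    for mode, minutes in activities:
        # Assumption: the cold-season bonus doubles bicycle time in both the
        # numerator and the denominator of the percentage.
        counted = minutes * 2 if (cold_season and mode == "cycling") else minutes
        total_minutes += counted
        if mode in SUSTAINABLE_MODES:
            sustainable_minutes += counted

    if total_minutes == 0:
        return 0.0
    return MAX_WEEKLY_POINTS * sustainable_minutes / total_minutes


week = [("car", 30.0), ("car", 30.0), ("cycling", 20.0), ("walking", 20.0)]
print(weekly_points(week, cold_season=False))  # 40.0
print(weekly_points(week, cold_season=True))   # 50.0
```

With the same invented week, the cold-season bonus raises the score from 40 to 50 points, while a week with only three tracked activities yields no points regardless of the modes used.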

Trust versus control
At the very heart of the Bellidea concept there is an attitude of trust between the City of Bellinzona, owner and provider of the Bellidea app, and its citizens. The mobility data collected by the Bellidea app is in fact supposed to correspond to the overall mobility data of each app user, while this might not always be the case. For instance, smartphones might run out of battery power when travelling, or users might temporarily disable the GPS-based location tracking features of Moves for a few hours, for battery-saving purposes. This implies that parts of trips, or even entire trips, might not be accounted for, even when users interact with the Bellidea app in good faith. However, city authorities acknowledged the risk of app users purposely trying to cheat the system; therefore, a proper control strategy was designed to reduce cheating as much as possible. To this purpose, we first focused on the design of the classifier for the automatic detection of the transport mode, and then on the procedure for the validation of the detected transport mode.
For the design/architecture of the classifier, we considered three alternative solutions:
• a set of individual classifiers, one for each Bellidea user;
• a collective classifier, common to all Bellidea users;
• an intermediate solution, the collective classifier with user ID, which partially accounted for each individual user's characteristics by including the user identifier (ID) among the input parameters.
For the GoEco! app, the authors had opted for a set of individual classifiers, one for each user. This allowed them to achieve high transport mode detection accuracy, by better modelling the characteristics of each user and learning user-specific mobility routines and patterns. However, a set of individual classifiers, each one trained using only the activities of a given user, is very sensitive to non-representative modal choices (if, for instance, training data are collected in a period when the car is not used because it is, e.g., undergoing maintenance) and to incorrect validations. In fact, if through validation a user untruthfully indicates a transport mode which provides more points (e.g., bicycle) for all her activities, and such data is used to train the individual classifier for that user, then the individual classifier attributes that same transport mode to all her future activities. Instead, a collective classifier which is trained on data from all users, without differentiating between them, is only slightly biased by the limited number of validated activities collected for each user. Additionally, the amount of data needed to train an individual classifier for each user and achieve a reasonable accuracy (in the case of an honest user) is much larger than for the collective classifier solution, from which one can expect reasonable predictions even for unseen users. On the other hand, we expected the performance of the collective classifier to be worse than that of the set of individual classifiers. Finally, the intermediate solution of the collective classifier with user ID is more sensitive to cheating than the one without user ID, yet it is more robust than the set of individual classifiers.

Table 2
Weekly trips travelled by four "personas" we considered in order to design our point attribution strategy.
To choose among these three solutions, we assessed their performances by a tenfold cross-validation over the dataset obtained by merging the above-mentioned validated activities collected during both the design phase of Bellidea (data collected by living lab participants) and the GoEco! project. To this purpose, we considered:
1. the accuracy, i.e., the fraction of activities with correct transport mode detection;
2. the precision for each transport mode T, i.e., the fraction of activities classified as T that were actually travelled by T;
3. the recall for each transport mode T, i.e., the fraction of activities travelled by T that were correctly classified.
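The three indicators can be computed directly from pairs of validated and detected transport modes. The following is a minimal sketch of these definitions; the function names and the toy labels are illustrative and are not taken from the Bellidea code base or datasets.

```python
def accuracy(true_modes, predicted_modes):
    """Fraction of activities whose transport mode was detected correctly."""
    correct = sum(t == p for t, p in zip(true_modes, predicted_modes))
    return correct / len(true_modes)


def precision_recall(true_modes, predicted_modes, mode):
    """Precision and recall for one transport mode T (None if undefined)."""
    pairs = list(zip(true_modes, predicted_modes))
    true_positives = sum(t == mode and p == mode for t, p in pairs)
    classified_as_mode = sum(p == mode for _, p in pairs)
    travelled_by_mode = sum(t == mode for t, _ in pairs)
    precision = true_positives / classified_as_mode if classified_as_mode else None
    recall = true_positives / travelled_by_mode if travelled_by_mode else None
    return precision, recall


# Toy example: one bus activity is mistaken for a car activity.
validated = ["car", "car", "bus", "bus", "bicycle", "walk"]
detected = ["car", "car", "car", "bus", "bicycle", "walk"]
print(accuracy(validated, detected))
print(precision_recall(validated, detected, "bus"))
print(precision_recall(validated, detected, "car"))
```

In the toy example the misclassified bus activity lowers the recall of bus and the precision of car, which is the kind of confusion between bus and car discussed in this section.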
The results of this assessment are presented in Table 3. The set of individual classifiers approach outperformed the other ones in almost all indicators. However, performances of the three approaches were very close, except for the recall of bus and cycling, which was significantly lower when the collective classifier was used. This is because activities travelled by bus are particularly difficult to distinguish automatically from those travelled by car. Additionally, in the city centre, where mobility is slow, the features of bicycle-based activities also overlap with those of car-based and bus-based ones. Such a drastic decrease in the recall performance, however, was not observed when the user ID was also considered in the classification. Therefore, we opted for the collective classifier with user ID, which overall produced acceptable values for accuracy, precision and recall, and was less sensitive to cheating than the set of individual classifiers.
For this choice to be effective, however, a sufficiently large number of validated activities for each standard Bellidea user had to be included in the classifier training dataset. Therefore, to maintain and even increase the classifier performance, we decided to periodically re-train the classifier with new data collected from Bellidea standard users through validation. However, allowing for manual validation could significantly increase the risk of cheating, since points were attributed depending on the transport mode. To keep the number of mode detection errors acceptable, validation was admitted, but the number of activities that the users were asked to validate, as well as their weight in the point tally, was limited as much as possible. To this end, the interaction between Bellidea and its standard users was organized in two phases:
1. a short training period just after the download of the app, during which users were requested to validate all the detected activities but did not get any points. This period, which on average lasted a couple of weeks, ended when a minimum number of validated activities, set to 80, was collected;
2. a standard use period following the training one, during which validation was only requested when the classifier could not reliably detect the transport mode used for an activity. In the standard use period, points were assigned every week, according to the rules introduced in Section 3.1.
To decide whether the classifier decision about the transport mode used was reliable or not, we defined the confidence of a classification as the difference between the probabilities attributed by the classifier to the two most probable transport modes. A classification was considered unreliable when the confidence was below a given threshold, i.e., when the probabilities of the two most probable transport modes were too close to one another. In such a case, the user was asked to validate the activity, and the validated activity was included in the training dataset. Otherwise, if the confidence of the classifier was above the threshold, the transport mode with the highest probability was assigned to the activity, with no possibility for the user to modify it.
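As an illustration of this decision rule, the sketch below computes the confidence from a dictionary of class probabilities and flags an activity for validation when the confidence falls below the threshold. The function names and the probability values are hypothetical; the default of 0.25 corresponds to the 25 % threshold discussed in this section, and treating a confidence exactly equal to the threshold as reliable is an assumption.

```python
def classification_confidence(probabilities):
    """Difference between the two largest class probabilities."""
    ranked = sorted(probabilities.values(), reverse=True)
    if len(ranked) < 2:
        return ranked[0]
    return ranked[0] - ranked[1]


def detect_mode(probabilities, threshold=0.25):
    """Return (most probable mode, whether user validation is requested)."""
    best_mode = max(probabilities, key=probabilities.get)
    # Assumption: confidence equal to the threshold counts as reliable.
    needs_validation = classification_confidence(probabilities) < threshold
    return best_mode, needs_validation


# Car and bus are hard to tell apart: the user is asked to validate.
print(detect_mode({"car": 0.50, "bus": 0.40, "bicycle": 0.10}))
# Clear-cut case: the mode is assigned and cannot be modified.
print(detect_mode({"car": 0.80, "bus": 0.10, "walk": 0.10}))
```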
To set the confidence threshold, we considered the probabilities assigned by the classifier in the cross-validation over the merged GoEco! and Bellidea datasets, and, for different values of the validation threshold, we computed:
• the fraction of activities requiring validations;
• the accuracy in transport mode detection over the non-validated activities.
Based on the resulting simulations (Fig. 3), we opted for a trade-off validation threshold of 25 %, which implied asking users to validate 14 % of their routes, and guaranteed an average detection accuracy of 87 % for the remaining 86 % of non-validated routes. Such a configuration was regarded as the optimal trade-off, because:
• it was not critical with respect to cheating, since it would have allowed users to modify only 14 % of the recorded activities, if they wanted to dishonestly gain more points;
• it was not critical in terms of validation effort: according to the data collected by participants in the Bellidea living lab, on average each app user registered 38.8 activities per week, which corresponds to a weekly request for validation of 5.5 activities, namely less than one validation per day;
• it was still acceptable regarding the risk of user abandonment due to a lack of satisfaction in the quality of automatic mobility monitoring, since it only implied a 13 % error in the detection of the transport mode for routes for which validation was not allowed.
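The threshold selection described above amounts to a sweep over candidate thresholds. The sketch below shows one way such a sweep could be organized, assuming that each cross-validated activity is summarized by its confidence and by whether the most probable mode was correct; the record format, the function name and the toy values are illustrative and do not reproduce the Bellidea data or Fig. 3.

```python
def sweep_thresholds(records, thresholds):
    """For each threshold, return (threshold, fraction of activities that
    would require validation, accuracy over the non-validated activities).

    records: list of (confidence, is_correct) pairs from cross-validation.
    """
    results = []
    for threshold in thresholds:
        # Activities at or above the threshold are assigned automatically.
        automatic = [ok for confidence, ok in records if confidence >= threshold]
        validated_fraction = (len(records) - len(automatic)) / len(records)
        accuracy = sum(automatic) / len(automatic) if automatic else None
        results.append((threshold, validated_fraction, accuracy))
    return results


# Toy cross-validation output: (confidence, most probable mode was correct).
toy_records = [(0.9, True), (0.6, True), (0.3, False), (0.2, True), (0.1, False)]
for row in sweep_thresholds(toy_records, [0.0, 0.25, 0.5]):
    print(row)
```

Raising the threshold increases the share of activities sent to the user for validation and, when low-confidence classifications are the error-prone ones, raises the accuracy over the remaining automatically labelled activities, which is the trade-off weighed in the bullet points above.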
By adopting this procedure, we accepted the risk of not attributing points to users who deserved them. In fact, as shown by the recall values reported in Table 3, the 13 % average error in automatic detection of the transport mode mostly affected bus activities, which were under-detected by the classifier. Therefore, the Bellidea app was likely to estimate a lower travelling time by sustainable transport modes than the actual one, and users were at risk of not being attributed the full amount of weekly points they deserved. In this framework, Bellidea still provided its users with the possibility to report errors. For each tracked activity, in fact, users could report any of the following errors: activity not performed, wrong duration, wrong path, wrong mode of transport. Upon notification of an error, Bellidea automatically deleted activities of the first type (not performed). It is not infrequent, in fact, that GPS devices record short activities that have not really been performed, for instance around places where users are standing still for a few hours. The other types of reported errors, instead, did not trigger any automatic action and were manually inspected. In fact, any automatic removal of activities or change of the transport mode would again have paved the way to cheating (even though via a less straightforward procedure than direct validation). If specific users were found to frequently report errors, they were directly contacted by the Bellidea help-desk team to investigate their problem and understand whether the experienced poor performances were actually due to technical problems (for instance related to the phone, to the classifier, to poor GPS signal in frequently visited places, etc.) or were instead an attempt at cheating.

Table 3
Performances of the three considered instances of the Bellidea classifier, based on the GoEco! and Bellidea validated sets of data. Accuracy, precision and recall are expressed on a (0-1) scale.

Dynamism versus rigidity
An additional challenge addressed in developing the Bellidea app concerned the feasibility of developing a dynamic tool, capable of operating as close to real time as possible. This was in fact what users expected: if their phone was tracking their mobility data, they expected to immediately see their routes and activities on the phone, and possibly also to see an instantaneous increase in their available points.
However, the choice to attribute points based on the modal split detected over a weekly time-step required abandoning the real-time framework: Bellidea points were updated on a weekly basis and they referred to the mobility data collected from Monday to Sunday. They could not be updated on Sunday evening, however, due to the possible need for validation of a few activities. The request for validations, in fact, implied that users had to be given enough time to check their activities and validate the transport mode, before points were attributed. For the sake of simplicity, we decided to introduce the same rule for all the users, no matter whether they had requests for validations or not, and set the attribution of points to every Tuesday morning at 10 a.m.:
• during the week users were free to validate their activities whenever they liked, though every Monday evening at 6.30 p.m. they received a push notification reminding them to complete by Tuesday at 10 a.m. any request for validation of activities travelled in the previous week (Monday-Sunday);
• if on Tuesday at 10 a.m. there were still open requests for validation of activities of the previous week, no weekly points were attributed, and validation of those activities was disabled.
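The weekly schedule above can be expressed as a small sketch. The function names are hypothetical and time zones are ignored; the sketch only encodes the rules stated in the text (Monday-Sunday tracking week, reminder on the following Monday at 6.30 p.m., attribution on the following Tuesday at 10 a.m., no points if validation requests are still open).

```python
from datetime import date, datetime, time, timedelta


def _check_monday(week_monday):
    if week_monday.weekday() != 0:  # Monday is 0 in Python's datetime
        raise ValueError("week_monday must be a Monday")


def reminder_time(week_monday):
    """Push reminder: the Monday after the tracked week, at 18:30."""
    _check_monday(week_monday)
    return datetime.combine(week_monday + timedelta(days=7), time(18, 30))


def attribution_time(week_monday):
    """Point attribution: the Tuesday after the tracked week, at 10:00."""
    _check_monday(week_monday)
    return datetime.combine(week_monday + timedelta(days=8), time(10, 0))


def points_at_attribution(computed_points, open_validation_requests):
    """No weekly points if any validation request is still open."""
    return 0.0 if open_validation_requests > 0 else computed_points


week_start = date(2018, 4, 30)  # a Monday during the Bellidea launch period
print(reminder_time(week_start))
print(attribution_time(week_start))
print(points_at_attribution(80.0, open_validation_requests=1))
```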
Thus, the Bellidea system was quite rigid and precluded possibilities for users to validate old activities and correspondingly to get the related points updated. This was in part meant to avoid retroactive management of the point system, which would have been too complex and time-consuming with respect to the available time and budget for the app's development and management. However, the main reason for this choice is that we preferred to force users to have at least one weekly interaction with the Bellidea app, in order to guarantee that they still remembered the routes they travelled throughout the previous week, and were thus able to correctly validate the transport mode, when validation was needed. Moreover, requiring at least one weekly interaction with the Bellidea app was expected to help rekindle the users' interest and to at least partially counteract a possible tendency to app churn due to the lack of real-time dynamism.

Global versus local
The final dilemma we addressed when designing the Bellidea app dealt with the concept of "boundary". Since Bellidea aimed at tackling mobility patterns of a user as a whole, should it have considered any travelled route, wherever it took place, or only focused on the subset of mobility patterns that involved the Bellinzona region? In principle, we were convinced of the need to stimulate global improvements in one's mobility patterns, no matter where they occurred. However, the promoter of the Bellidea initiative, the City of Bellinzona, was specifically interested in a tangible improvement of traffic-related problems in the City and its surroundings. In particular, as the main sponsor of the Bellidea prizes through the municipal budget, the City of Bellinzona was only interested in offering prizes to reward an improvement of individual mobility patterns over its own territory: if improved mobility patterns were also registered outside the City, it was definitely valuable, but the City was not inclined to reward them.
Such a tension, which reflects the well-known tension between local and global costs and benefits that characterizes common-pool resources and ecosystem services whose management depends on cooperation between different administrative areas, such as the climate-related ones (Ostrom et al., 1999), was resolved by introducing boundaries. In the end, Bellidea only considered activities with either a starting or an arrival point in the area of Bellinzona (schematized, for the sake of simplicity, as a rectangular box). Even though it was regularly tracked in Moves, any travel activity completely outside such a "box" was not imported into Bellidea; therefore, it did not play any role in the attribution of Bellidea points and the related prizes.
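A minimal sketch of such a boundary filter is given below. The `Box` type, the function name and, in particular, the coordinates are hypothetical: the box roughly surrounds Bellinzona for illustration only and is not the rectangle actually used by Bellidea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Rectangular area in latitude/longitude (decimal degrees)."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def is_imported(box, start, end):
    """An activity is kept if its start OR its arrival point is in the box."""
    return box.contains(*start) or box.contains(*end)


# Illustrative coordinates only, not the actual Bellidea boundary.
bellinzona_box = Box(min_lat=46.15, max_lat=46.25, min_lon=8.95, max_lon=9.10)

home = (46.19, 9.02)     # inside the box
lugano = (46.00, 8.95)   # outside the box (too far south)
locarno = (46.17, 8.80)  # outside the box (too far west)

print(is_imported(bellinzona_box, home, lugano))     # starts inside: kept
print(is_imported(bellinzona_box, lugano, locarno))  # fully outside: dropped
```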
We accepted such a compromise, even though we were well aware that in some particular cases it would have led to wrong assessments of the level of sustainability of one's individual mobility patterns and to rewarding people who did not deserve it. For example, this would have happened in the case of an individual who always used the bicycle for short routes within Bellinzona but who commuted daily by car to her workplace in the nearby city of Lugano (a distance of about 25 km and a common situation for people living in Bellinzona), always stopping for a coffee at the service station outside Bellinzona. In such cases, Bellidea would have only recorded the short daily commuting time by car from her home to the service station, neglecting the remaining daily travelling time by car, from the service station to the workplace in Lugano. Therefore, the user would have been rewarded with Bellidea points, even though most of her weekly travelling time would have been spent by car, within the region around the City of Bellinzona. The only option to avoid such a paradoxical effect would have been to enlarge the boundaries of the area taken into account, for example by securing collaboration between neighbouring cities, thus enlarging the "box" so as to account for the areas which were most likely to be covered by the daily mobility needs of average citizens. As in many other real-life challenges, overcoming administrative boundaries and focusing on the actual spatial and geographical boundaries of a given phenomenon would thus be an effective strategy to produce a better solution to the challenge.

Engagement of the app's "mainstream car driver" target group
The Bellidea app was launched on April 25, 2018 to the whole population of the Bellinzona area, by means of a press conference by the City of Bellinzona. From that day on, all interested citizens living, studying or working in Bellinzona could download and start using it whenever they liked. Prizes offered by Bellidea, with a value of about 20 Swiss francs each and fully paid by the City, consisted of tickets to enter the local cinema and swimming pool, tickets for a nearby leisure cable-car, and honey jars. The Bellidea app was available until the end of July 2018, for thirteen consecutive weeks. During that period, 721 standard user accounts were registered, 207 of which collected at least two full weeks of data, which was the minimum period required to train the transport mode detection algorithms, before the Bellidea persuasive features were unlocked. The mobility data collected by standard Bellidea users during such a period does not allow us to perform any assessment of the app's effectiveness in persuading users to reduce car use, which would require an experimental design, with a control group of individuals not using the Bellidea app and possibly a random assignment to the treatment (use of the Bellidea app) and control groups. However, the available data allows us to analyse and discuss the effectiveness of the above Bellidea design and computational choices in addressing the problem of "preaching to the converted", which is the main reason why we had decided to introduce real-life prizes and the related point-based rewarding systems.
To check the share of "mainstream car drivers" among the Bellidea app users, we consider the mobility data collected by each user during her first two weeks of app use, namely during the training period introduced in Section 3.2. During such a training period, in fact, no feedback, points or invitations to join challenges were provided, and users were requested to validate the transport mode for all their trips. Therefore, such data can be regarded as individual "baseline mobility patterns", namely a reliable observation of the mobility patterns of the Bellidea app users before Bellidea started to persuade their mobility behaviour.
Such baselines can be compared with the 2015 statistics of the Swiss Mobility and Transport Census (SMTC, FSO/ARE 2021). Aggregate statistics of key mobility variables for the region around the city of Bellinzona (agglomeration of Bellinzona) and the year 2015 are in fact available (daily total travel distance and daily travel distance by car, both expressed in kilometres per day), based on a random sample of n = 564 individuals. We can thus consider the mean values characterizing the mobility baselines of the n = 207 users of the Bellidea app for which baselines are available and compare them with the mean mobility data provided by the 2015 SMTC survey for the agglomeration of Bellinzona. This comparison allows us to check whether Bellidea users can be best labelled as "mainstream car drivers" or as "already converted" users of public transport and active mobility.
Mean values observed by the 2015 SMTC appear to be slightly larger than mean values observed in the Bellidea dataset (Table 4). To compare them, we perform a statistical test. The plotted distributions of the Bellidea sample observations for the two baseline variables do not appear to be Gaussian. Even though a t-test would be asymptotically valid, since the Bellidea sample is sufficiently large (n = 207), we also perform a bootstrapped, one-sample t-test, which makes no assumptions on the normality of the probability distribution of the dataset. For both variables, our null hypothesis H0 is that the mean of the Bellidea baselines coincides with the mean value identified by the 2015 SMTC census. Our alternative hypothesis Ha is that the mean of the Bellidea baselines is larger than the 2015 SMTC mean value. Performing both exact and bootstrapped one-tailed t-tests, the results of Table 4 are obtained: for both variables the null hypothesis fails to be rejected at any significance level. The observed differences between the Bellidea and the 2015 SMTC mean values are thus not statistically significant and therefore we conclude that the Bellidea users have the same mobility patterns as the average population of Canton Ticino. Namely, Bellidea users can be regarded as "mainstream car drivers", since their overall travel distance and travel distance by car cannot be distinguished from those of the (car driver) average population.
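For readers unfamiliar with the technique, the sketch below shows one common way to carry out a bootstrapped, one-sample, one-tailed t-test. The resampling scheme (resampling the data with replacement and centring the resampled t statistic on the observed sample mean), the number of resamples and the function names are assumptions, since the text does not detail the procedure actually used; the data in the usage example are synthetic and unrelated to the Bellidea baselines.

```python
import math
import random


def mean_and_sd(values):
    """Sample mean and sample standard deviation (n - 1 denominator)."""
    n = len(values)
    mean = math.fsum(values) / n
    variance = math.fsum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


def t_statistic(values, mu0):
    mean, sd = mean_and_sd(values)
    return (mean - mu0) / (sd / math.sqrt(len(values)))


def bootstrap_t_test_greater(sample, mu0, n_resamples=2000, seed=0):
    """One-sample test of H0: mean = mu0 against Ha: mean > mu0.

    Returns the observed t statistic and a bootstrap p-value.
    """
    rng = random.Random(seed)
    sample_mean, _ = mean_and_sd(sample)
    t_observed = t_statistic(sample, mu0)
    exceed = 0
    valid = 0
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=len(sample))
        _, sd = mean_and_sd(resample)
        if sd == 0.0:
            continue  # degenerate resample: t statistic undefined
        valid += 1
        # Centring on the sample mean mimics the distribution of t under H0.
        if t_statistic(resample, sample_mean) >= t_observed:
            exceed += 1
    p_value = (exceed + 1) / (valid + 1)
    return t_observed, p_value


# Synthetic, right-skewed daily distances (km); NOT the Bellidea data.
generator = random.Random(1)
synthetic_km = [generator.expovariate(1 / 30) for _ in range(207)]
print(bootstrap_t_test_greater(synthetic_km, mu0=30.0))
```

A large p-value means that the null hypothesis cannot be rejected in favour of a larger mean, which is the situation reported in Table 4 for both baseline variables.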

Conclusions
In this paper, we introduced the Bellidea gamified smartphone app we co-designed with interested citizens in a living lab process launched by the city of Bellinzona, in the Italian-speaking part of Switzerland. Bellidea aimed at persuading a reduction in individual car use, as an attempt to integrate and support already existing urban policies and regulations tackling traffic-related urban problems. In order to raise the interest of "mainstream car drivers", instead of "already converted" users of public transport and active mobility, we opted for including external regulation motivational factors, such as real-life prizes, through a point-based rewarding system. Our key research goal was therefore to find effective ways to exploit point-based rewarding systems, while addressing the limitations that characterized previous gamified apps. Particularly, we aimed at avoiding counter-productive effects that were observed in other apps (such as travelling additional trips just to gain more points, while keeping car-based commuting habits unaltered) and at guaranteeing fairness of the whole process, reducing the chances for app users to cheat the system, through validation of the transport mode.
The insights we gained from the Bellidea process, strengthened by the collected field data, seem to confirm that Bellidea managed to engage "mainstream car drivers" and allow us to develop the following two recommendations. First, we recommend rewarding app users with points by considering their mobility patterns as a whole, instead of just considering the distances they travel by a given set of transport modes considered more sustainable (which might paradoxically lead users to make additional trips, instead of replacing those that they travel by car). To this purpose, we suggest attributing points on a weekly basis, by considering all the travelled routes, and proportionally to the distances or travelling time with a given set of transport modes. Second, we suggest constraining the possibilities for validation of the transport mode, by exploiting the most recent technological advances to enhance the automatic identification of the means of transport and adopting a hybrid solution, which requires validation when the transport mode classification confidence is low (in our case, on average, 14 % of the collected mobility data) and blocks validations in all the other cases. The adopted mechanisms, which were not found in any other currently available app, increase the chances of fair attribution of points, and therefore prizes, while limiting detection errors and largely reducing potential cheating effects.
Achieving high accuracy in automatic detection of the transport mode is essential for effective implementation of the above recommendations. In developing the Bellidea app, however, many compromises were made, in order to balance suggestions by the living lab participants and the City of Bellinzona, current technological limitations in automatic mobility detection and budget constraints. This work has shown that, although an acceptable classification accuracy of the transport mode can be achieved, there is ample room for improvement, especially concerning the classification accuracy of public transport modes (buses, in particular) and, correspondingly, concerning the number of validations still required. In the case of Bellidea, achieving higher accuracy was also limited by the lack of availability of low-level sensor data, such as accelerometer and gyroscope measurements, which were collected and exploited by the external activity tracking app Moves we relied upon, but were not made available to us through its APIs. Such low-level sensor data have in fact largely proven to be effective in distinguishing between different motorized transport modes. Also, the sometimes low quality of the GPS measurements and, more often, the inaccurate segmentation of the recorded positions performed by the external activity tracking app Moves limited the classification performance. Future research will therefore be needed to improve such segmentation and measurement of raw mobility data, for instance by developing new app components capable of directly collecting low-level sensor data, thus also removing the critical dependence on an external activity tracking app.

Availability of data and material
The datasets generated and analysed within the Bellidea project are not publicly available since they contain sensitive personal data. However, they are available from the corresponding author on reasonable request and after proper anonymization.

Funding
The Bellidea project was supported by the Swiss Federal Office of Energy (SFOE) under the ERA-NET scheme and by Innosuisse - Swiss Innovation Agency within the Swiss Competence Center for Energy Research (SCCER) Mobility. The authors bear sole responsibility for the findings and conclusions.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Table 4
Comparison between the baseline mobility patterns of the Bellidea app users and the average mobility patterns by the population in the agglomeration of Bellinzona in 2015 (SMTC). Both an exact and a bootstrapped t-test are performed, leading to the same conclusions.