JMIR mHealth and uHealth

Background: Cancer survivors are at an elevated risk for several negative health outcomes, but physical activity (PA) can decrease those risks. Unfortunately, adherence to PA recommendations among survivors is low. Fitness mobile apps have been shown to facilitate the adoption of PA in the general population, but there are limited apps specifically designed for cancer survivors. This population has unique needs and barriers to PA, and most existing PA apps do not address these issues. Moreover, incorporating user preferences has been identified as an important priority for technology-based PA interventions, but at present there is limited literature that serves to establish these preferences in cancer survivors. This is especially problematic given the high cost of app development and because the majority of downloaded apps fail to engage users over the long term. Objective: The aim of this study was to take a qualitative approach to provide practical insight regarding this population’s preferences for the features and messages of an app to increase PA. Methods: A total of 35 cancer survivors each attended 2 focus groups; a moderator presented slide shows on potential app features and messages and asked open-ended questions to elicit participant preferences. All sessions were audio recorded and transcribed verbatim. Three reviewers independently conducted thematic content analysis on all transcripts, then organized and consolidated findings to identify salient themes. Results: Participants (mean age 63.7, SD 10.8, years) were mostly female (24/35, 69%) and mostly white (25/35, 71%). Participants generally had access to technology and were receptive to engaging with an app to increase PA. Themes identified included preferences for (1) a casual, concise, and positive tone, (2) tools for personal goal attainment, (3) a prescription for PA, and (4) an experience that is tailored to the user. 
Participants reported wanting extensive background data collection with low data entry burden and to have a trustworthy source translate their personal data into individualized PA recommendations. They expressed a desire for app functions that could facilitate goal achievement and articulated a preference for a more private social experience. Finally, results indicated that PA goals might be best established in the context of personally held priorities and values. Conclusions: Many of the desired features identified are compatible with both empirically supported methods of behavior change and the relative strengths of an app as a delivery vehicle for behavioral intervention. Participating cancer survivors’
Robertson et al. JMIR Mhealth Uhealth 2017;5(1):e3. http://mhealth.jmir.org/2017/1/e3/


Introduction
Because of advances in early detection and treatment, the number of cancer survivors in the United States is increasing dramatically. In 2014 this number was an estimated 14.5 million, and by 2024 it is expected to increase to 19 million [1]. Despite advancements regarding cancer-related mortality, cancer survivors still face significant long-term health challenges, including an increased risk of all-cause mortality, obesity, type 2 diabetes, osteoporosis, anxiety, and depression [2]. Cancer survivors also face the risk of cancer recurrence and second cancers, sequelae like lymphedema and fatigue, and decreases in physical functioning that can impede the ability to conduct activities of daily living [2]. Physical activity for this population is generally safe and can play a vital role in ameliorating these physical and psychological challenges [2]. Despite this, most cancer survivors do not meet the minimum level of physical activity recommended by the American Cancer Society [3]. A study that interviewed a nationally representative sample found that only 30% of cancer survivors report meeting recommended levels of aerobic physical activity [4]. Innovative behavior change efforts are needed to increase physical activity in cancer survivors.
Mobile health (mHealth), utilizing mobile devices for health-related applications, has emerged as an important tool for health-related behavioral interventions [5]. The use of mobile devices has many potential advantages for such interventions, including the propensity for widespread dissemination, cost-effectiveness, the potential to minimize participant burden, sophisticated on-board sensors, the ability to provide immediate feedback, and the ability to provide experiences that are inherently enjoyable to users [6]. Importantly, cancer survivors are typically older adults, and technology use in this segment of the population is increasing rapidly [7]. Indeed, an increasing body of evidence indicates that technology-based interventions may be well received by cancer survivors and hold promise for physical activity promotion initiatives [8,9].
While there are many fitness and physical activity apps currently available for download, the majority of these apps are centered on measuring and improving athletic performance [10]. Such apps are generally not well suited for the majority of cancer survivors, who may be less motivated to engage in physical activity [11] and who face unique barriers to engaging in recommended levels of physical activity [12][13][14]. Using theory-based behavior change methods may be a particularly useful approach to increase physical activity in this population; however, most existing apps are not grounded in behavior change theory [15][16][17].
Incorporating users' preferences has been identified as important for delivering technology-based physical activity promotion programs to older adults [18]. However, at present there is limited research to offer insight as to the practical preferences of cancer survivors for an app designed to increase physical activity levels. Puszkiewicz et al [19] conducted individual interviews with 11 cancer survivors regarding their experience with an existing physical activity app designed for the general population. Participants in this study reported that the app was generally well received but did not adequately address a number of factors relevant to understanding physical activity patterns in this population; these included fatigue, receipt of trusted information, cancer-related limitations, and social support. The authors of this study highlight the benefits of addressing such factors, as well as the utility such an app could provide as a means to facilitate physical activity-related communication between health care providers and cancer survivors.
Given the substantial resources required for software development, and the daunting reality that 23% of mobile apps are abandoned by the user after only one use [20], it is important to appropriately address the practical points regarding how an app may be well received and able to provide lasting value to the priority population. The purpose of this study was to use focus groups to generate insight as to cancer survivors' preferences regarding the features and types of messages of an app to increase physical activity. Identified preferences were then applied to established behavior change methods [21][22][23], such as enactive mastery experiences [24] and verbal persuasion [25], to provide recommendations for future app development.

Recruitment
Inclusion criteria were that each participant be a survivor of stage I-III breast, colorectal, prostate, or endometrial cancer; be at least 18 years of age; have completed primary treatment; and have the ability to read and speak English. Participants were recruited (1) from survivorship clinics and support groups at MD Anderson (through a media-based approach that included distributing flyers, in-person presentations, and advertisements in MD Anderson's internal and external publications), (2) in person at an MD Anderson Cancer Survivorship Conference, and (3) by sending a letter and placing a phone call to eligible individuals identified in the MD Anderson patient database.

Focus Group Interviews
Data collection took place from November 2013 to March 2014. Each participant agreed to attend 2 focus group sessions at the MD Anderson Behavioral Research and Treatment Center and was compensated with a US $15 gift card at the completion of each session. Participants provided informed consent before the beginning of the study; this study was approved by the MD Anderson Institutional Review Board (protocol number 2013-0501). All focus group interviews were moderated by a master's level senior research coordinator (female) with more than 3 years' experience, trained in qualitative research methods; she conducted the interviews with the assistance of a semistructured interview guide and a colleague who took field notes (female, master's level senior research coordinator with more than 10 years' experience and trained in qualitative research methods). A demographic questionnaire, a measure of physical activity, and a questionnaire on technology use were administered at the beginning of the first focus group. Physical activity was assessed using a modified short-form version of the International Physical Activity Questionnaire (IPAQ-SF); this is a widely used measure with good test-retest reliability (.80) and acceptable criterion validity when compared with accelerometer data (median Spearman correlation =.30) [26].
Each participant attended 2 focus groups, and each focus group consisted of 2 parts. An outline of the content covered is presented below (Multimedia Appendix 1). The first part of both focus groups was a discussion in which the moderator asked open-ended questions and followed with conversational probes as appropriate. These questions were derived from a combination of Social Cognitive Theory constructs (eg, goal setting) and practical questions (eg, texting preferences) [27]. The second part of both focus groups consisted of a slide show presentation, during which the moderator asked participants to share their thoughts and opinions on the featured content. In the first round of focus groups, the slide show featured various physical activity app features (Multimedia Appendices 2 and 3), such as receiving tailored text messages (Figure 1). In the second round of focus groups, the slide show featured 18 example text messages (Multimedia Appendices 4 and 5).
The researchers held weekly meetings in which they discussed the focus groups and reviewed field notes; additional focus groups were conducted until the researchers were confident that data saturation regarding the study's research questions had been achieved. This was determined to be the case after a total of 13 focus groups had been conducted (8 focus groups in the first round were consolidated into 5 for the second round). Among the participants who attended the first focus group, 7 did not go on to attend the second. On average, each focus group had about 5 participants. All sessions were recorded with a digital audio recorder and professionally transcribed verbatim.
Participants were asked to not use names during the focus groups, and surveys and transcripts were de-identified; all data were stored on encrypted, password-protected computers.

Data Analysis
Transcripts were imported into the qualitative data analysis management program ATLAS.ti (version 7.0, Scientific Software Development GmbH, Berlin, Germany) [28]. Data analysis was conducted by 3 independent reviewers and consisted of both a deductive and an inductive phase. During the deductive phase, conducted first, for all transcripts 2 coders (ET and MCR) independently matched codes determined a priori to each comment that introduced a substantive point germane to this study's topic. For the purposes of this study, this step served to allow the coders to become familiar with the data and screen out content that was not relevant. For the inductive phase, thematic content analysis was performed [29]. Codes were created and assigned to each discrete point made by each participant for all transcripts, and consolidated and organized in an iterative process to identify recurring themes and subthemes. A meeting (KMB-E, EJL, MLB, SS, and MCR) was held to resolve any differences in coding. Preliminary results were then presented to a third coder (SS), who verified the accuracy and exhaustiveness of the findings against all complete focus group transcripts. Finally, illustrative quotes for each subtheme were identified.

Participant Characteristics
The age of the 35 participants ranged from 41 to 84 years, with a mean of 63.7 (SD 10.8) years. Demographic characteristics are presented in Table 1. Participants were well educated, mostly female, and mostly white. Most (21/35, 60%) had been diagnosed with breast cancer. IPAQ-SF scores indicated that 41% (14/34) of participants did not meet recommended physical activity levels (Table 1). Most participants reported being very interested in technology (Table 1), and most (≥69%) reported having ready access to technological devices, a computer, and high-speed Internet (Figure 2).

Themes
We identified 4 themes regarding participants' preferences for an app to increase physical activity in cancer survivors: tone preferences, tools for personal goal attainment, prescription for physical activity, and a tailored experience. Subthemes within each overarching theme are presented along with illustrative quotes in Tables 2-5.

Tone Preferences
The first theme that arose was related to preferences for the tone of messages. Participants indicated that they would prefer messages to be casual, concise, and positive (Table 2). A casual tone was preferred to a clinical one; participants indicated that messages that are familiar, warm, friendly, and even funny would be more agreeable than those that are more formal.
Proposed example messages were criticized as being too long, and participants frequently made comments explicitly stating that short messages are preferable to longer ones. Participants also indicated a preference for messages to exhibit a nurturing and supportive tone; they cautioned that, if not worded carefully, messages could be off-putting or even damaging. Messages that were perceived to be negative in any way were almost uniformly rejected. For example, a message that attempted to highlight the fact that walking is an excellent form of physical activity started with "While running or playing tennis might not be enjoyable..." and, as a result of this negative framing, was not well received. Finally, some participants expressed a preference for using a tablet to interface with an app. This was indicated as an easier way to access app content and manage frequent app messaging.

Tool for Personal Goal Attainment
A second theme was that the app should serve as a tool for personal physical activity-related goal attainment (Table 3). Participants indicated that physical activity goals tended to be manifestations of personally held values. For example, some participants wanted to be physically fit so as to be able to play with their grandchildren. Participants expressed a desire to be able to input personally held goals into the app, then be able to utilize the app as a tool for accomplishing them. Participants indicated they were more likely to engage in action planning if they had made a commitment to their peers, and the ability to use an app to enlist social support was identified as a noteworthy subtheme. Participants also talked about the potential utility of an app to provide periodic reminders to be physically active. They indicated that such periodic cues could help them to get on track for goal attainment, particularly if the reminder messages were delivered at opportune times. Such reminders were usually discussed in the context of a wearable device's ability to detect prolonged bouts of inactivity and automatically send a cue to break up sedentary time.
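To make the inactivity-cue idea concrete, the following is a minimal sketch of how prolonged sedentary bouts might be detected from per-minute step counts. It is an illustration only and not a feature that was built or tested in this study; the function name, the 60-minute window, and the 10-steps-per-minute cutoff are all hypothetical assumptions.

```python
from typing import List, Sequence


def inactivity_cue_minutes(
    steps_per_minute: Sequence[int],
    max_sedentary_minutes: int = 60,  # hypothetical: cue after 60 sedentary minutes
    active_step_threshold: int = 10,  # hypothetical: fewer steps counts as sedentary
) -> List[int]:
    """Return the minute indexes (0-based) at which a cue would be sent."""
    cues: List[int] = []
    sedentary_run = 0  # length of the current unbroken sedentary bout
    for minute, steps in enumerate(steps_per_minute):
        if steps < active_step_threshold:
            sedentary_run += 1
        else:
            sedentary_run = 0  # any active minute breaks the bout
        if sedentary_run == max_sedentary_minutes:
            cues.append(minute)
            sedentary_run = 0  # start counting again after a cue is sent
    return cues
```

In practice the thresholds would need to be tuned, and cue timing would also need to respect the participants' stated preference for messages delivered at opportune times.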
Role model narratives emerged as a potentially powerful feature to empower cancer survivors to live a more physically active lifestyle. The notion of being presented with stories from individuals who had overcome salient obstacles was well received. One oft-cited caveat to featuring role model narratives was that such stories need to be relevant to the user; participants wanted to be matched to stories from individuals who had faced similar barriers. Were such matching not feasible, participants suggested allowing users to browse individually for stories that they feel may be relevant to them. Participants cautioned against unwittingly presenting someone with the success story of a cancer survivor who was too physically active or did not face similar adversity.
Participants generally articulated a preference for either no social connection at all or a more private social experience that would allow them to enlist social support from those they know intimately and trust. Most participants said they did not want a social networking feature that would involve the public posting of information, such as one's step count achievements or calories burned; instead, participants indicated that a social networking feature would be most attractive if such information were shared privately with a small group of user-selected friends or family. Using personal physical activity data as a means to compete with friends was not well received.
Role model narratives:
"I think that's very encouraging because each and every patient has their own story and how and why they have cancer, and how they can succeed and move on and live a healthy life." [P18]
"It's always nice to hear about people who have done it and how they struggled and how they overcame their struggle to get to their successful point." [P6]
"Stories that you can opt in or out of or read or not read...so you read a story and go, 'Oh, that's really nice, except it doesn't apply to me.'" [P34]
Social networking:
"No. That's too personal. I don't post anything personal, really." [P26]
"I agree, I mean I don't like just to put progress on my weight or whatever to everybody, all my friends or whoever. But if there is some group..." [P4]
"No. I don't like to compete with anybody. I mean I like to compete. But I'm always competing against myself. And to put it next to somebody else, that would defeat me." [P17]

Prescription for Physical Activity
Another overarching theme concerned the presentation of a prescription for healthy types and levels of physical activity for cancer survivors (Table 4). Participants indicated a desire to be presented with concrete, short-term goals for physical activity that would ultimately help them realize their more abstract, value-based goals. Participants stressed that recommended goals should be attainable and should come from a trusted source (eg, their cancer hospital or an authoritative health agency). Furthermore, participants reported wanting app features that could help users appreciate progress and visualize incremental improvements related to their recommended goals.
Participants also expressed a desire to be presented with new ways of being physically active and to be educated about how to perform new exercises safely. To this end, participants expressed a strong preference for video demonstrations over text or pictures. Participants also indicated a desire to receive a summary of the relevant literature on physical activity and cancer survivorship presented in layman's terms; they noted that such information could be very motivating and that some confusion exists over what are perceived to be inconsistent recommendations for physical activity in cancer survivors. Many participants voiced surprise at the fact that physical activity can reduce the risk of cancer recurrence for some types of cancer. Again, an important qualifier identified for receiving this kind of information was that it come from a reputable and trusted source.
Goal suggestions:
"So if this machine knows that you've been sitting, maybe it can suggest some exercises you can do when you're sitting, or mention that you've been sitting for an hour, and after an hour you should get up and just walk around for ten minutes or something like that." [P6]
"Or tightening your stomach while you're sitting in a chair in the office kind of thing, or standing instead of sitting when you're doing an activity.

Tailored Experience
A final theme identified from the focus group interviews was a preference that an app provide an experience that is highly tailored to the individual user (Table 5). This emerged as an important parameter of use for many of the subthemes presented above. Frequently cited factors to incorporate for individualization included cancer-related information, age, personal health concerns, physical limitations, physical activity preferences, location, weather, current physical activity levels, and trends over time.
Participants indicated wanting to be recognized and congratulated for activity-related achievements and presented with information about how such achievements translate into physiological processes (eg, calories burned). They talked about wanting to be able to see and track changes in activity levels over time, along with corresponding changes in health indicators (such as waist measurement, body mass index, cholesterol, blood pressure, and heart rate). This was often discussed in the context of incorporating a wearable fitness tracker. Participants emphasized a strong preference for information to flow from the app to the user and not the other way around. They stressed that a burdensome process of inputting data would pose a great threat to sustained use of the app. Participants reported wanting rich, personalized data, especially to inform such features as physical activity goal suggestions and personalized role model narratives. Generally, suggested weekly step count goals should be based on an incremental increase from the user's previous week's step count, and role model narratives from especially active individuals should not be presented to individuals who are less active.
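The two tailoring rules described above (incremental step goals and activity-matched role model narratives) could be sketched as follows. This is a hedged illustration under stated assumptions, not an implementation from this study: the function names, the 10% weekly increase, the narrative record layout, and the 2-fold activity ratio are hypothetical.

```python
from typing import Dict, List


def next_weekly_step_goal(previous_week_steps: int, increase_percent: int = 10) -> int:
    """Suggest a goal slightly above last week's step count.

    The 10% default is a hypothetical choice; integer arithmetic avoids
    floating-point rounding surprises.
    """
    if previous_week_steps < 0:
        raise ValueError("step counts cannot be negative")
    return previous_week_steps + (previous_week_steps * increase_percent) // 100


def matching_narratives(
    narratives: List[Dict],
    user_weekly_steps: int,
    max_activity_ratio: int = 2,  # hypothetical: hide narrators more than 2x as active
) -> List[Dict]:
    """Keep only narratives from people whose activity level is near the user's."""
    ceiling = user_weekly_steps * max_activity_ratio
    return [n for n in narratives if n["weekly_steps"] <= ceiling]
```

Because both functions take the previous week's step count as their only user-specific input, they could draw on data synced from a wearable tracker and would add no data entry burden.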
Participants expressed the importance of receiving information that is relevant for their unique health profile; they suggested that the app offer content that is sensitive to user-identified information, such as cancer diagnosis, personal health considerations, age, and physical limitations. For example, they reported wanting exercise demonstrations that are sensitive to the user's physical limitations and novel ways to perform physical activity that would not aggravate such limitations. Also, participants expressed a preference to be able to interface with the app to indicate health-related changes. If, for example, an injury were to occur, participants indicated that they would like to be able to note this in the app and receive a temporary reduction in message frequency or altered message content.
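As an illustration of the injury example, an app might temporarily lower message frequency after a user reports a health-related change. The sketch below is one possible approach under assumed values; the `MessagePlan` structure, the 14-day reduction window, and the reduced rate of 1 message per week are hypothetical and were not specified by participants.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class MessagePlan:
    messages_per_week: int                     # the user's usual message frequency
    injury_reported_on: Optional[date] = None  # set when the user notes an injury


def messages_for_week(
    plan: MessagePlan,
    today: date,
    reduced_per_week: int = 1,  # hypothetical reduced frequency
    reduction_days: int = 14,   # hypothetical length of the reduction window
) -> int:
    """Return how many messages to send this week given any reported injury."""
    if plan.injury_reported_on is None:
        return plan.messages_per_week
    days_since_report = (today - plan.injury_reported_on).days
    if 0 <= days_since_report < reduction_days:
        # Never raise the frequency above the user's usual level.
        return min(plan.messages_per_week, reduced_per_week)
    return plan.messages_per_week  # window has passed; resume the usual frequency
```

The same pattern could switch message content rather than frequency, which participants also suggested.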
Finally, participants made suggestions for tailoring content based on the user's location. Participants appreciated the idea of being presented with nearby opportunities to engage in physical activity. Walking paths, public parks, outdoor events, and yoga or Pilates studios were identified as some opportunities that an app might bring to the user's attention. Poor weather was repeatedly cited as a barrier to engaging in physical activity, and it was suggested that the app might address this by recommending alternative activities when the weather is poor.

Principal Findings
In this study, we conducted focus groups to ascertain cancer survivors' preferences for the features and types of messages of an app to increase physical activity. We identified 4 overarching themes for desired app content: (1) clear, positive, and concise messages, (2) various tools for personal goal attainment, (3) an appropriate prescription for physical activity, and (4) an experience that is tailored to the individual. Taken together, our results indicate that participants want an informal interface with an app that provides a highly individualized experience to facilitate engagement in healthy levels of physical activity. This can be achieved by an app that provides real-time feedback and personalized content sensitive to the user's unique health concerns and physical activity preferences.

Comparison With Prior Work
In their study, Puszkiewicz and colleagues [19] conducted in-depth interviews and used thematic analysis to identify themes related to cancer survivors' feedback on an app designed to increase physical activity. The 4 themes identified in that study were (1) barriers to physical activity, (2) receiving advice about physical activity from a reliable source, (3) tailoring the app to one's lifestyle, and (4) receiving social support from others. Our study complements these findings. Similarities include the importance of the perceived trustworthiness of a physical activity app and the ability of the app to provide tailored content to the user. Puszkiewicz and colleagues also identified a preference for receiving social support from others. Results from our study qualify this finding by highlighting privacy concerns; one way to address this would be to avoid public social network postings in favor of more carefully matched, private connections. Puszkiewicz and colleagues also point out the potential utility a physical activity app may have for health care providers, who often are unable to adequately discuss physical activity with patients owing to competing demands for time.
In accordance with the findings of this study, in a review of the literature Higgins [30] found that tailored physical activity feedback is associated with apps that are more effective at inducing behavior change, and that decreasing participant burden tends to increase adherence rates. However, qualitative work done by Miyamoto et al [31] found that simply tracking and presenting data may not be sufficient to lead to long-term behavior change maintenance, and that the context of this feedback is critical. Findings from our study provide insight on some contextual issues that may improve acceptability and, ultimately, efficacy of such apps (eg, presenting physical activity feedback alongside the implications of meeting recommended physical activity levels on one's risk of cancer recurrence, or personal health concerns such as lymphedema).
Results of this study are consistent with previous research findings for the preferences of a physical activity app in the general adult population. Similarities found by Dennison and colleagues [32] include preferences for minimal user burden, backing by a trusted source (eg, hospital), inclusion of a goal setting and monitoring component, feedback and advice on how to change behavior, accurate information and tracking features, messages that have a positive tone and are not too frequent, and privacy protection. In formative research for the development of an app to increase physical activity in the sedentary adult population, Rabin and Bock [33] identified participant preferences that included automatic tracking of steps, feedback on physical activity accomplishments, goal setting, and suggestions for how to overcome barriers. Dennison and colleagues [32] found some additional preferences not directly identified in our study: participants expressed a desire for an app that is free, can be easily turned off, does not negatively affect other device uses, has clarity about what it will do, and does not present undue surprises. These additional findings may hold true for cancer survivors.
In their formative development of iCanFit, a Web-based app to increase physical activity in older cancer survivors, Hong and colleagues [34] presented 6 key functions. These were "Locator, Goals, Community, Healthy Tips, Library, and Support" features, which served to provide a tailored experience regarding local resources for physical activity, the ability to input short-term and long-term goals, social networking features, advice providing a prescription for healthy living, access to relevant literature, and technical support, respectively. These features are concordant with the findings of our study. Technical support was not an explicitly identified theme in our study but may be particularly important given that older cancer survivors may not be as tech-savvy as the general population; indeed, this study found that most (21/35, 60%) rated themselves as somewhat, not very, or not at all skillful with technology.
Cancer survivors are generally older adults, so an app to increase physical activity in this population may face challenges due to lower rates of technology use. A study conducted by Martin and colleagues [35] found that cancer survivors' interest in interventions delivered by a mobile phone was relatively low. However, that study used data from 2010, and older adults' use of technology is increasing rapidly [7]. Part of the reported lack of interest in mobile phone use in this population may be due to age-related declines in vision and manual dexterity. Martin and colleagues did find relatively high interest among older adults in interventions delivered via computer, but they did not explore cancer survivors' perceptions of and interest in tablets. Tablets may circumvent some of the physical challenges faced by older adults because their larger screens offer higher visibility and a larger touch surface. There also may be a difference in perception: some evidence indicates that older adults may tend to view smartphones as especially complex phones, while on the other hand viewing tablets as relatively simple computers [36]. Indeed, several comments made by participants in this study corroborate this notion, and a recent survey showed that tablet and e-book reader ownership in older adults is higher than smartphone ownership [7]. Future studies should explore cancer survivors' interest in tablet-delivered interventions.

Implications for Research and Practice
Our findings suggest that presenting goal-setting exercises in the context of participants' personally held priorities and values may be a particularly useful approach to elicit intrinsically motivating goals. Self-determination theory posits that greater internalization of goals is more likely to lead to lasting behavior change [37], and empirical tests in physical activity support this notion [38]. This may be accomplished by a program that has the users reflect on and identify their values and then generate physical activity-related goals in light of this content; this input could then be periodically leveraged in order to maximize participants' physical activity adherence. An app may be especially well suited for this owing to onboard technological components, such as a camera that could capture values (eg, pictures of grandchildren), and an onboard accelerometer or the ability to sync to wearable activity tracking sensors that could responsively register changes in physical activity levels.
Importantly, many of the desired features articulated are compatible with both empirically supported methods of behavior change and the relative strengths of an app as a delivery vehicle for behavioral intervention [21][22][23]. For example, participants talked about being presented with stories from other cancer survivors who have overcome similar obstacles and also being presented with instructional videos demonstrating how to perform various physical exercises. These preferences align well with behavioral change methods (behavioral journalism and demonstration of behavior, respectively) theorized by Bandura's Social Cognitive Theory to influence behavior via observational learning [27]. Table 6 presents our suggestions for how an app might incorporate behavior change methods. We arrived at these suggestions by applying the preferences identified in this study to empirically supported behavior change methods drawn from both the Intervention Mapping approach [21,22] and Michie and colleagues' [23] Behavior Change Technique Taxonomy.
Future formative research for the development of an app to increase physical activity levels in this population might corroborate these findings with quantitative data and provide insight as to the relative rank-ordered preferences of desired app features and messages. It would also be useful to ascertain what qualities of a physical activity-related app are associated with higher rates of participant engagement (eg, messaging or notification frequency, type of content featured, social networking features). Additional studies are needed to determine whether an app-delivered intervention can lead to increased physical activity initiation and maintenance in cancer survivors and, if so, which behavior change methods might be the mechanisms through which these outcomes are achieved.

Recommendations | Behavior change methods [21][22][23]
Have the user start with a physical activity-related goal (eg, step count) that is comfortably accomplished and have goals incrementally increase over time | Enactive mastery experiences [24]; set tasks on a gradient of difficulty [25]
Dispel commonly held misconception regarding barriers to physical activity by offering a digest of relevant literature (eg, address the misconception that physical activity is contraindicated if one is at risk for lymphedema) | Consciousness raising [39]
Encourage users to reflect on personal values during goal setting and the potential outcomes of behavior change from multiple perspectives; encourage users to create value-based goals for physical activity | Goal setting [40]; self and environmental reevaluation [39]
Maximize mHealth program potential to provide specific, personalized information relevant to the user; minimize participant data entry burden | Tailoring [39,41]
Go beyond simply presenting physical activity summary information; provide interpretation of personal physical activity data relevant to users' health concerns and cancer experience | Self-monitoring or feedback on behavior [42]
Feature private sharing outlet with personal friends and family, or match user to others who have experienced a similar cancer journey; avoid sharing indiscriminately to broader social network | Stimulate communication to mobilize social support [43]
Offer role model narratives that demonstrate that others, like the user, can overcome salient barriers and experience real benefits regarding physical activity | Behavioral journalism [25,44]
Provide videos for recommended exercises that demonstrate proper technique and address personal physical limitations and health concerns; provide individualized feedback regarding user's performance | Guided practice [23]
Offer periodic prompts to influence behavior by making it more salient in the mind of the user; allow the frequency of messaging to be determined by the user to minimize perceived burden | Providing cues to action [45]
Assume a casual tone from a trusted source; provide positive reinforcement by celebrating successes, and provide minimal negative content | Verbal persuasion about capability; improving physical and emotional states [25]

Strengths
A strength of this study's focus group qualitative approach is the ability to generate rich data to provide insight that extended beyond the preconceived notions of the researchers. This study's use of 3 coders to analyze the data in a systematic, iterative process was a strength, as was the use of 2 phases of data analysis to strengthen the authors' familiarity and understanding of the content.

Limitations
A potential limitation of this study is the use of recruitment methods that may have introduced self-selection bias; individuals who agreed to participate may have been especially active or interested in technology. However, results indicate that this threat may not be particularly salient, as IPAQ-SF scores categorized nearly 42% (14/34) of participants as exhibiting "low" physical activity levels. Still, our sample's educational level and racial/ethnic diversity do not match those of the larger priority population, which may limit the generalizability of our findings. Furthermore, the generalizability of our findings is limited by the fact that the majority of participants were female breast cancer survivors. Different types of cancer can lead to unique patient experiences regarding physical limitation and psychological challenges [46]. For example, breast cancer survivors may be more likely to suffer from depression than lung cancer survivors but less likely than those diagnosed with brain cancer or females diagnosed with genital cancers [47]. Preferences for forms of mobile or Web-based support may also differ across cancer types, possibly owing to these different experiences [35,48]. Indeed, quantitative and qualitative studies of individuals with different cancer types have found different experiences and different preferences for online support [49][50][51]. Accordingly, our findings may not be applicable for survivors of certain types of cancer. Another limitation was that the focus groups were not homogeneous with respect to participants' physical activity level. This may have systematically influenced the dynamic of the sessions and created a bias in the data. While qualitative research methods, as they were conducted here, can be especially effective at generating a comprehensive breadth of information on a particular topic, they provided little insight into the relative rank of preferences for the many app features identified. 
Given the resources required for app development generally, and the inherent challenge of providing an app that is able to satisfy all identified preferences, narrowing this list in order to focus on priority app features may be necessary.

Conclusions
Given the dramatic uptake in technology use, utilizing an app as a modality for behavioral intervention holds promise for increasing physical activity in cancer survivors. Presenting rich physical activity data and feedback, while minimizing user data entry burden, would be a critical feature of such an app. Results of this study provide preferences that may be used to enrich the context in which an app provides physical activity feedback. Useful approaches may be to capitalize on personally held values during the goal-setting process, to present an individualized prescription for physical activity from a trusted source, and to provide tools that facilitate goal fulfillment. Future studies should incorporate the perspectives of oncologists and other health care providers, as well as test these findings in a pilot version of an app to increase physical activity in cancer survivors.

Acknowledgments
The University of Texas MD Anderson Cancer Center is supported in part by the National Institutes of Health (NIH) through Cancer Center Support Grant CA016672. This study was supported in part by the Center for Energy Balance in Cancer Prevention and Survivorship, which is supported by the Duncan Family Institute for Cancer Prevention and Risk Assessment.

Conflicts of Interest
None declared.

Multimedia Appendix 1
Content covered in focus group sessions.

Multimedia Appendix 2
Slideshow of various app features.

Multimedia Appendix 3
Percentage of positive feedback for various app features.

Multimedia Appendix 4
Slideshow of example text messages.

Multimedia Appendix 5
Ranking by percentage of positive feedback for specific messages.

Introduction
Alcohol use disorder (AUD) contributes to a substantial number of contacts with the treatment system [1,2], given that relapse is the most likely outcome of treatment [3][4][5]. Apart from being a source of suffering for affected individuals and their relatives, AUD places a significant burden on the health care system [1,2,6]. This burden is particularly prominent in the Nordic countries, Denmark being among those with the most liberal alcohol culture, leading to pervasive exposure to alcohol and associated situations. Such pervasive exposure may consequently lead to more individuals developing AUD and induce urges that can increase rates of relapse after the treatment has ended [7][8][9].
Within the Danish treatment system, individuals with AUD are most commonly treated with motivational interviewing, cognitive behavior therapy, and family therapy, all classified as evidence-based treatments [10,11]. In several Danish treatment institutions, additional cue exposure treatment (CET) is often used to reduce urges and prevent relapse in order to prepare individuals with AUD to navigate Danish society. During conventional CET, patients are exposed to alcohol or related stimuli in vivo while their habitual drinking response is prevented, so that conditioned automatic responses can be extinguished [12][13][14]. CET is often combined with the use of urge-specific coping skills (USCS), as there is evidence to suggest that this method provides better treatment outcomes [15][16][17].
When addressing the need for AUD treatment (such as CET), it is evident that the duration of the treatment is decreasing and that it is increasingly being delivered in group rather than individual sessions where this is found appropriate and reasonable [11]. However, more individuals could potentially benefit from individual as well as continued treatment [18,19]. There are also many individuals with AUD who never enter the treatment system [19][20][21], which may, in the future, cause severe collateral damage and exacerbate the burden on the health care system [1,6,9,18,20]. The implementation of eHealth interventions through devices such as computers, tablets, and smartphones represents a new pathway for treatment delivery, one which overcomes some of these issues and assures accessibility to as many patients as possible nationwide [22][23][24]. Yet, very few of the currently available eHealth interventions are based on a theoretical framework and experimental evidence [22][25][26][27].
Less is known about evidence-based mobile devices, such as smartphone apps [26,28,29]. Dedert et al (2015) recently conducted a systematic review on eHealth interventions targeting AUD, revealing a huge gap in experimental evidence; they identified only a single randomized controlled study that investigated a mobile device [26,30].
Mobile eHealth interventions have the potential to play a crucial role in the future provision of continuing care and relapse prevention, helping to lower the socioeconomic burden on the health care system by decreasing the number of contacts it receives, as well as augmenting the reach of relevant treatment. However, there is a need for more transparency regarding the underlying psychological framework of mobile eHealth interventions, their design, and development, as well as the provision of evidence to gain more knowledge about their effectiveness.
In order to add to the evidence base for mobile eHealth interventions, a CET smartphone app that mimics CET with USCS was designed and developed and is currently being tested in a large-scale, randomized controlled trial (ClinicalTrials.gov NCT02298751) [31,32].
The objective of this paper was to describe the design and development of a manual-based smartphone app that mimics CET with USCS, which is currently being delivered in Danish inpatient and outpatient clinics. The CET app has the potential to contribute to the reach of evidence-based psychological treatment for AUD.

Methods
CET features in treatment programs being used in Danish alcohol clinics in both inpatient and outpatient treatment settings.
CET is most commonly used in combination with various urge-specific coping skills (USCS), owing to the promising outcomes shown [15][16][17]. When developing the CET app, we applied Monti and colleagues' (2002) treatment manual for CET with USCS, which emphasizes the importance of "confrontation with alcohol" in diminishing cue reactivity. According to the treatment manual, patients are introduced to a USCS during each CET session and are thereafter required to train the learned strategy while being exposed to alcohol in vivo [33]. Owing to the highly structured nature of this treatment and our clinical experience with using it, we were able to convert it into a smartphone app.
The initial plan for the structure and content of the app was developed by a group of psychiatrists and a psychologist relying on the aforementioned manual. When converting the manual, designing the app as simply, intuitively, and feasibly as possible was of utmost importance, given that the target population may have very different cognitive profiles [34][35][36]. Although patients with severe cognitive impairments are not candidates for this type of treatment, some of our patients might have had mild to moderate cognitive impairments after years of suffering from AUD. In accordance with the plan, programmers and graphic designers developed a preliminary version of the CET app. After several modifications and user tests with the involved psychiatrists, psychologist, and programmers, a more detailed structure of the program was confirmed. Thereafter, the app was presented to 2 patient focus groups (2×n=5) who gave feedback. All patients found the app to be simple, intuitive, and feasible. Suggestions for improvements centered mainly on the terminology used. A final version of the CET app was developed based on the patients' feedback and is currently being tested in the previously mentioned Cue Exposure study [31], which is part of the RESCueH studies [32].
Along with the smartphone app, an online database was designed and developed within the system to monitor patients' data in real time.
The open-source Linux-based operating system Android was selected as the platform for developing the smartphone app. A customized version of Java in Eclipse (Oracle Corporation) was used as the main programming language. An online server was registered to host the database and to allow remote monitoring of the treatment process.

Results
The Structure and Content
Figure 1 illustrates the structure of the app and its main content comprising the following components: introduction, 4 sessions with USCS, 8 alcohol exposure videos featuring guidance for applying one of the USCS, and a results component providing an overview of training activities and potential progress.
The information in the app is presented in text format and read out loud simultaneously.
The software requires patients to train on a regular basis and sends a text reminder for this. As little is known about the effectiveness of intensive CET [15][16][17][37][38][39][40][41][42], patients are allowed to train only once a day for a maximum of 4 weeks in order to prevent overexposure. The specific components of the app are outlined in the following sections.
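The training limit described above can be illustrated with a minimal sketch. This is not the app's actual (Java) implementation; the function name `may_train` is hypothetical, and "a maximum of 4 weeks" is assumed here to mean 28 calendar days counted from the first training session.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "a maximum of 4 weeks" = 28 calendar days from the first session.
MAX_TRAINING_DAYS = 28


def may_train(today: date,
              first_session: Optional[date],
              last_session: Optional[date]) -> bool:
    """Return True if a new training session is allowed today."""
    if first_session is None:
        return True  # the patient has not trained yet
    if today >= first_session + timedelta(days=MAX_TRAINING_DAYS):
        return False  # the 4-week training window has passed
    return last_session != today  # at most one session per day
```

A caller would evaluate `may_train` before opening a session and show an explanatory message when it returns `False`.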

Delivery and Access to the App
A CET therapist provides patients with both oral and written information about the app prior to the commencement of treatment. Patients can download the app directly onto their smartphones if they already have one. Otherwise, they can borrow a smartphone from the alcohol treatment clinic.
As shown on the log-in page (Figure 2, part A), patients are provided with a user ID that is easy to remember and that assures anonymity, permitting them to log in to the CET treatment program. We predefine all user IDs as a combination of digits and letters, for example, 001001aa, 002002bb. Considering that some patients may have mild to moderate cognitive impairments (eg, impaired memory), we designed the login procedure to use a user ID without a password, thus simplifying the login process. User IDs do not resemble one another, so as to prevent patients from logging in with another patient's user ID by mistake. In addition, phone stickers displaying contact numbers for technical and treatment support are given to patients in case they forget their user ID, or if other technical or therapeutic issues arise during treatment.

Introduction to CET
The Introduction to CET component plays automatically the first time the app is activated and a patient logs in. The purpose of the Introduction is to inform patients about the purpose and content of the CET with USCS, technical functions, as well as key safety functions (Figure 2, parts B and C). The Introduction to CET emphasizes that it may indeed be difficult to avoid being exposed to alcohol in Denmark, and that the purpose of using the app is to learn how to cope with cue-induced alcohol urges and associated situations in order to prevent relapses when outside the treatment setting. Hence, the treatment consists of teaching coping strategies to reduce urges, and, by exposing patients to alcohol in vivo, it trains them to tolerate urges by using the USCS.
The technical functions, such as audio/video replay, audio/video pause/start, and continue to next page, are illustrated through arrows explaining how they work (Figure 2, part B). The safety function's main component consists of a call icon (at the upper right-hand side of the screen) connecting to a CET therapist (Figure 2, part C), which becomes available whenever the app is activated. The call icon provides the same contact numbers as displayed on the phone stickers, thereby ensuring that patients can still get in touch with a therapist even if they lose their sticker or for any other reason are not able to use the call icon. The therapist is available Monday to Friday from 9:00 to 18:00, and should be consulted in the event of experiencing uncontrollable urges. For practical and safety reasons, the app is closed for use on weekdays from 18:00 to 9:00 and during the weekend, that is, when the therapist is out of reach.
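The opening-hours rule described above can be sketched as follows. This is an illustration only, not the app's code; the function name `app_is_open` is hypothetical, and treating 18:00 itself as closed is an assumption, since the text does not specify the boundary.

```python
from datetime import datetime, time

# Therapist hours stated in the text: Monday to Friday, 9:00 to 18:00.
OPEN_FROM = time(9, 0)
OPEN_UNTIL = time(18, 0)  # assumption: the app is already closed at 18:00 sharp


def app_is_open(now: datetime) -> bool:
    """Return True if the app may be used at the given local time."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and OPEN_FROM <= now.time() < OPEN_UNTIL
```

Tying app availability to the therapist's hours means the call icon always reaches someone while exposure is possible.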
If patients wish to replay the Introduction, they can click on the icon illustrated in the Main menu (see Figure 2, part D), which is also where they are directed when logging in in the future.
Patients are ready to proceed to the USCS sessions after the Introduction has played.

Sessions With USCS
As shown in Figure 3 (part A), the Strategy icon comprises 4 sessions, each promoting the use of 1 of the 4 USCS recommended in the manual. Each session starts with an introduction to the USCS and an explanation of how to apply it during alcohol exposure (Figure 3, part B). Patients are then required to select an exposure video (Figure 3, part C). At the end of the exposure video, a summary of the USCS training and how to use the USCS in the future is provided (Figure 3, part D).
The recommended strategies are as follows:

Waiting It Out
This is used as a cognitive strategy, whereby it is explained to patients what to expect when waiting out an urge. It is pointed out that they may never have waited for a strong urge to pass naturally, and that urges actually reduce quite quickly to a manageable level. The urge passes if one waits. It can be expected to peak within 4 min, start to decline after approximately 8 min, and subside completely within 10-15 min. Moreover, the urge declines faster each time exposure occurs without resulting in alcohol consumption. This approach stands in contrast to the exclusively behavioral version of CET in which no information is provided about what to expect.

Thinking About the Negative Consequences of Drinking
Patients are encouraged to imagine the most negative future consequences associated with resuming alcohol abuse. In order to systematize and register the negative consequences in the database, patients are required to select between 1 and 3 consequence categories from a list comprising physical health, mental health, family and friends, work and education, economy, offences, and loss of control. To the best of our knowledge, these categories incorporate the vast majority of negative consequences that individuals with AUD may experience [43][44][45][46]. After selecting the consequence categories, patients are guided in rehearsing the USCS. It is emphasized that the USCS has proven to be particularly effective when one is experiencing predominantly positive emotions and feelings, leading to permissive thoughts about alcohol consumption. It may be useful in this situation to think carefully about the future negative consequences associated with reverting to old bad habits.

Thinking About the Positive Consequences of Sobriety
Patients are encouraged to imagine the positive future consequences associated with refraining from alcohol abuse.
As in session 2, patients are required to select between 1 and 3 consequence categories from a list comprising the same domains, and are also guided in rehearsing the USCS.
In contrast to the prior USCS, it is explained that this strategy is effective when negative emotions and feelings prevail, and when one has the urge to drink in order to distance oneself. In such a situation, it may be useful to think of the positive consequences that lie ahead if the urge is resisted.

Alternative Food and Beverage
In this last session, patients are encouraged to consume alternative food and drink during exposure in order to reduce urges. It is explained that people have a tendency to prefer the food and drink that is most readily available in risky situations, and that it is a good idea to distinguish between 2 types of risk situations: (1) Alone or alone at home after work, watching TV, bored, and (2) Social events: after work, with friends, or celebration. Patients are encouraged to choose healthy alternatives that will form the basis for new habits.
In the Summary of every session, it is recommended that patients use the associated coping strategy when confronted with alcohol and risk situations in real life; however, in line with the safety functions featured in the Introduction to CET component, they are also advised against actively seeking out risk situations.
To ensure that each strategy is trained at least once, it is not possible for patients to proceed to the next session until they have completed the previous session. While the strategies are being trained, the Session icons change their colors from red to yellow and then from yellow to green, to indicate not trained, moderately trained, and trained.
The Exposure icon remains locked until all strategies have been trained.
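The sequential unlocking described above can be sketched as follows. This is an illustration, not the app's implementation; the session identifiers and the function name `unlocked_items` are hypothetical, and the red/yellow/green icon states are left out because the text does not state the thresholds between them.

```python
from typing import List, Set

# Hypothetical identifiers for the 4 USCS sessions, in the order presented.
SESSIONS = [
    "waiting_it_out",
    "negative_consequences",
    "positive_consequences",
    "alternative_food_and_beverage",
]


def unlocked_items(completed: Set[str]) -> List[str]:
    """Return the menu items a patient may open, given completed sessions."""
    unlocked = []
    for session in SESSIONS:
        unlocked.append(session)
        if session not in completed:
            break  # later sessions stay locked until this one is completed
    else:
        # Reached only when every session has been completed.
        unlocked.append("exposure")
    return unlocked
```

For a new patient this returns only the first session; once all 4 sessions are completed, the Exposure item is included as well.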

Exposure
As illustrated in Figure 4, exposure to alcohol is simulated by alcohol videos.
The app contains 8 different alcohol videos comprising the following categories: ordinary beer, strong beer, alcopops, red wine, white wine, brown spirits (eg, whiskey and cognac), white spirits (eg, vodka and rum), and hard liquor. One of these should be selected. Patients can select their preferred beverage to feature in the exposure material or vary the beverage used from one session to the next. The alcohol video is selected from the list presented in Figure 3, part C. The alcohol exposure videos imitate sessions with a therapist, and the alcohol in the videos becomes increasingly appetizing during the exposure session so as to induce cue-controlled urges. A variety of the most common brands in Denmark are presented, as individuals with AUD have different alcohol preferences within the alcohol categories. The duration of each exposure video is 15 min. After 4 min of exposure, patients are guided in how to use the learned USCS in order to reduce the cue-induced urge. When the urge decreases to a manageable level (urge level ≤2), the exposure can end, and patients can then proceed to the summary session. However, a minimum of 8 min of exposure is required.
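The rule for ending an exposure session can be expressed as a short sketch. The function name `exposure_may_end` is hypothetical, and the assumption that the session always ends when the 15-min video finishes, regardless of the urge level, is ours rather than something the text states.

```python
MIN_EXPOSURE_MIN = 8   # minimum exposure required
VIDEO_LENGTH_MIN = 15  # duration of each exposure video
MANAGEABLE_URGE = 2    # urge level <= 2 counts as manageable


def exposure_may_end(elapsed_min: float, urge: int) -> bool:
    """Return True if the exposure session may end at this point."""
    if elapsed_min >= VIDEO_LENGTH_MIN:
        return True  # assumption: the session ends when the video is over
    return elapsed_min >= MIN_EXPOSURE_MIN and urge <= MANAGEABLE_URGE
```

For example, a patient reporting an urge of 1 after 5 min must continue, whereas the same rating after 8 min allows the session to end.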
Patients can go directly to the exposure videos after they have trained all the USCS, as there is no need to repeat the abovementioned information to them every time they watch a video. When patients click on the Exposure icon, they must register which USCS they want to train. They can then proceed to the exposure session.

Real-Time Measures of Cue-Induced Urges
Real-time cue-induced urges are measured at 3 time points: (1) at the baseline (before exposure), (2) when the urge is expected to peak (during exposure, at 4 min), and (3) at the endpoint (after exposure). Urges are measured on an 11-point Likert scale, ranging from 0 (no urges) to 10 (severe urges). As can be seen in Figure 5, we chose to use an unconventional glass-shaped Likert scale to make the ratings more vivid. The liquid in the glass changes color in accordance with urge ratings. The cut-offs are 0-2 for Green, 3-6 for Yellow, and 7-10 for Red.
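The color cut-offs above translate directly into a small mapping. The sketch below is illustrative; the function name `urge_color` is hypothetical, and rejecting out-of-range ratings with an error is an assumption.

```python
def urge_color(rating: int) -> str:
    """Map an urge rating (0-10) to the color of the liquid in the glass."""
    if not 0 <= rating <= 10:
        raise ValueError("urge rating must be between 0 and 10")
    if rating <= 2:
        return "green"   # 0-2
    if rating <= 6:
        return "yellow"  # 3-6
    return "red"         # 7-10
```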
Based on these measures, 3-point graphic illustrations of urge development during exposure can be produced. Proxy measures of the intensity of the urge induced by the selected exposure video and the effectiveness of the selected USCS in reducing the urge can also be calculated. The first measure is calculated by subtracting the baseline measure from the peak measure. The effectiveness of the USCS is calculated by subtracting the endpoint measure from the peak measure. These algorithms together with other training activity variables are used to calculate the results in My progress.
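The two proxy measures defined above can be written out as a minimal sketch. The class and property names (`UrgeRatings`, `induced_intensity`, `uscs_effectiveness`) are hypothetical; only the two subtractions come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UrgeRatings:
    """Urge ratings (0-10) from one exposure session."""
    baseline: int  # before exposure
    peak: int      # during exposure, at 4 min
    endpoint: int  # after exposure

    @property
    def induced_intensity(self) -> int:
        """Urge induced by the exposure video: peak minus baseline."""
        return self.peak - self.baseline

    @property
    def uscs_effectiveness(self) -> int:
        """Urge reduction attributed to the USCS: peak minus endpoint."""
        return self.peak - self.endpoint
```

With ratings of 2 (baseline), 7 (peak), and 1 (endpoint), the induced intensity is 7 - 2 = 5 and the USCS effectiveness is 7 - 1 = 6.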

My Progress
As illustrated in Figure 1: Main menu, the final icon is named My progress. My progress displays several measures and graphs allowing patients to keep track of their training activities and potential advances in controlling cue reactivity.
As shown in Figure 6, the menu is similar to that in ... Data from these measures are recorded in the database in order to register and monitor training activities. The data will be used to measure the extent to which each USCS is applied by patients, as well as the effectiveness of each USCS and the CET intervention in general.

Monitoring Use and Urges
Along with the smartphone app, an online database was designed and developed within the system to monitor patients' data in real time. After the patients have used a USCS, the app saves the data package locally on their phones and transmits it automatically to the online database (as long as there is an Internet connection). The data package includes user ID, time of using the app, applied strategy, and the real-time urge data. As already mentioned, the user IDs that we use to identify patients do not contain any personal information and are encrypted when being transferred through the Web domain to the database. The external webhosting provider, DanDomain A/S, DK, is responsible for the server and the database, and has signed an agreement with the hosts of the project to ensure that rules regarding safety and ethics are met. The database cannot be accessed by any members of the research group before all data have been collected; only contracted data managers have access to the database at this time. Any access and changes made to the database are recorded and documented.
The user IDs will be used to merge data from the database with data from other sources (in an internal database) suitable for personal identifiers.
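The save-locally-then-upload behavior of the data package described above can be sketched as follows. This is an illustration under stated assumptions, not the app's implementation: the names `DataPackage` and `LocalOutbox`, the JSON encoding, and the `send` callback are all hypothetical, and the encryption applied during transfer is deliberately left out.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class DataPackage:
    """Fields listed in the text; the field names themselves are assumptions."""
    user_id: str      # predefined anonymous ID, eg, "001001aa"
    used_at: str      # time of using the app (ISO 8601 assumed)
    strategy: str     # the applied USCS
    urges: List[int]  # baseline, peak, and endpoint ratings


class LocalOutbox:
    """Keeps packages on the device until they can be uploaded."""

    def __init__(self) -> None:
        self._pending: List[str] = []

    def save(self, package: DataPackage) -> None:
        # Store locally first so that nothing is lost while offline.
        self._pending.append(json.dumps(asdict(package)))

    def flush(self, is_online: bool, send: Callable[[str], None]) -> int:
        """Upload pending packages if online; return the number sent."""
        if not is_online:
            return 0
        sent = 0
        while self._pending:
            send(self._pending[0])  # a package is removed only after sending
            self._pending.pop(0)
            sent += 1
        return sent
```

Calling `flush` whenever connectivity changes would upload any packages saved while the phone was offline.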

Principal Findings
This paper describes the design and development of a smartphone app that mimics the CET treatment delivered to AUD patients in Danish inpatient and outpatient clinics.
Although CET along with USCS is widely used in Denmark, studies providing evidence for the effectiveness of this treatment are yet to be conducted. If we draw on the evidence from international studies, CET has, in its conventional delivery form, demonstrated superior performance to meditation and relaxation techniques [15][16][17][38], and equivalent or even superior performance to cognitive behavior therapy [37,39,41,42]. Some of the best results for the effectiveness of applying CET with USCS have been reported by Monti and colleagues [15][16][17], and are based on the same manual that is used in most Danish treatment settings, and which this study is built on [33]. CET with USCS has been shown to work in both individual and group sessions [15,17,42].
The critical question, then, is whether this evidence-based AUD intervention demonstrates an effect when converted into a smartphone app. To answer this question, we based the present app on a behavioristic psychological framework and embedded the examination of it in a large-scale, randomized controlled trial. About 300 AUD individuals were randomized into 1 of the following 3 aftercare treatment groups: (1) CET as a mobile phone app (n=100); (2) CET as a group therapy (n=100); and (3) treatment as usual (n=100). The 2 experimental CET conditions were based on the same manual, and the treatment as usual consisted of a single follow-up session to observe how patients were doing and whether further treatment was needed. The real-time urge measures were applied in both experimental CET conditions, and a number of effect measures were conducted for all enrolled participants [31,35].
The experimental design allows for comparison between the experimental groups and the nonactive controls, which adds to the general knowledge base pertaining to the effectiveness of CET targeting AUD. Of more importance is the fact that the study design allows for a comparison between the 2 experimental conditions on USCS, real-time urges and effect measures, which clarifies whether it is beneficial for patients to progress from CET group sessions to using a CET smartphone app.
It is hypothesized that the experimental groups will achieve better outcomes compared with controls on primary and secondary effect measures, including alcohol consumption, urges or cravings, and coping skills. Whether similar or improved results will be found for one of the CET conditions is more of an exploratory research question. Thus, the study context will either validate or falsify the app as an evidence-based treatment form.
Obviously, the app might have a number of disadvantages compared with CET group sessions, which may hinder its effectiveness as a pathway for treatment delivery. First, the alcohol exposure videos aim to suit as many individuals in the study population as possible and therefore include several alcohol brands at the expense of individually tailored exposure. Second, patients cannot smell the alcohol shown in the videos, and we know that smell is the only sense that is linked directly to the frontolimbic reward system [47][48][49]. Third, the time point for using the USCS during the exposure is based on an average for when the urges are expected to peak. Although this may be the best proxy measure, an average is an abstract value and the peak may have a broad range; hence, the real peak is not captured in many cases. Indeed, variations in the average peak have been reported across studies [14][15][16][17]. Nevertheless, both the real peak measure and the average peak measure have been reported in our CET group comparisons. This will give an estimation of the validity of the peak measure within the study population. Finally, although the app was designed to be as simple, intuitive, and feasible as possible (also for patients with mild to moderate cognitive impairments due to drinking) and a contact number for a CET therapist is provided, treatment may be affected if patients have no personal interaction with the therapist [27,50].
However, CET in app form also has several advantages. First, the CET app may facilitate extinction learning as it enables the patient to train in a variety of situations in real life. Compared with the CET treatment currently delivered in Danish alcohol clinics, this approach may increase the likelihood that extinction learning will generalize to various other contexts outside the usual treatment setting [51]. 
Second, the CET app is independent of time and place, and patients do not need to show up at specific times for treatment, but can instead train whenever and wherever they find it convenient. Thus, patients who have busy work schedules and family lives, live in rural areas, or have other challenges that impede them from showing up regularly at the clinic may find this type of treatment more suitable. Third, apart from providing a forum for meeting the needs of AUD patients in a modern society, the application of smartphone app treatments in clinics may also decrease the number of requests made for therapist-based treatment. This may, in turn, lower the burden on the health care system's budget. Finally, in the longer term, when evidence-based apps become more available, more patients could potentially benefit from individual as well as continued treatment [18,19]. Moreover, AUD individuals who never enter the treatment system [19][20][21] could also benefit from these app services.
Although there exists a gap in knowledge about the effectiveness of evidence-based psychological treatment delivered through mobile devices [26], alcohol-related apps are becoming increasingly available in app stores [52][53][54]. Worryingly, the majority of these apps are developed with the purpose of encouraging and facilitating drinking. A review of 384 apps found that only 11.5% (44/384) promoted reducing alcohol consumption to at least a moderate level, either through providing information about detrimental effects or through psychological interventions [54]. Although it was beyond the scope of this review to comment on the specific evidence provided by the interventions, it is doubtful that these apps (eg, those offering hypnosis and motivation messages) are based on theory and empirical data or even guidelines. Similarly, another recent review of 662 apps found that 13.7% (91/662) targeted a moderate level of alcohol consumption. In contrast to the former review, the authors of the latter review assessed whether the promoted behavior change techniques were theory-driven and empirically validated, and found that none of them were based on theory or empirical evidence from randomized controlled trials [52].
Despite the lack of availability of theory-driven and empirically supported apps, many new intervention initiatives targeting both subclinical and clinical AUD populations are seen in research [30,55-58]. These, as well as the present app, may contribute to the reach of more appropriate treatment in the longer term. Indeed, we are most likely witnessing a paradigm shift where delivery pathways for evidence-based treatment are progressing from individual and group sessions to (partial) mobile apps and similar delivery pathways (eg, tablets and computers) [22,24,59-61]. The delivery of these eHealth interventions is independent of time and place and could potentially contribute to reductions in problematic addictive behaviors and associated damage across a broad range of populations. However, answering the question of whether mobile devices are a smarter pathway for delivering psychological treatment when targeting AUD still requires extensive research, which is currently only in its early stages. This question will be further addressed by upcoming research in this fast-growing area of study.

Conclusions
It is our hope that the present CET app will contribute to the availability of evidence-based mental health apps targeting AUD. Future work will customize the CET app according to the findings generated by the longitudinal randomized controlled trial in which the examination of this app is embedded.

Introduction
Medical resident wellness, burnout, and lack of self-care constitute a multifaceted problem complicated by long work hours, demanding work environments, and a multitude of psychosocial stressors [1,2]. The recent suicides of 2 medical residents in New York [3] have refocused the conversation and have motivated leaders of medical training institutions to pilot interventions for improving resident wellness and decreasing burnout.
Medical residents have been found in multiple studies to have low levels of physical activity [4,12,13]. Specifically, internal medicine residents have been shown to have low levels of physical activity, with only 15% of them being above average or excellent [12]. In a national survey, resident physicians met the US Department of Health and Human Services (USDHHS) guidelines for physical activity approximately 73% of the time, but this percentage was lower than that for both attending physicians (84.8%) and medical students (84%). These results suggest that an intrinsic characteristic of life in residency training decreases a person's physical activity levels [13,14]. Physical activity among physicians is not only important for their own health, well-being, and career longevity, but is also correlated with their individual practice of counseling their patients on the benefits of exercise [15][16][17].
The majority of wellness interventions have focused on internal medicine and surgery residents, while few have focused on emergency medicine residents. Emergency medicine physicians experience nearly three times higher rates of career burnout than other physicians [18,19], and emergency medicine residents have demonstrated low levels of overall life satisfaction [20]. Wellness experts have called for a proactive, rather than reactionary, approach to improving the wellness of emergency medicine residents [21]. It is believed that physical activity is an inverse correlate of burnout among physicians, and engagement in physical activity is a modifiable behavior [11,22,23]. To date, there are no studies to our knowledge that have evaluated baseline physical activity among emergency medicine residents, and its effect on wellness is not described. Despite the perceived frenetic nature of the specialty of emergency medicine, the typical emergency medicine resident does not achieve the baseline physical activity recommendations posited by the USDHHS and the Centers for Disease Control and Prevention (CDC) [24,25] during a standard shift in the emergency department. When researchers placed pedometers on residents in a single, urban, academic, emergency medicine training program, only 9.9% of the residents took at least 10,000 steps during a shift [26]. Little is known about the physical activity behaviors in this population outside of the emergency department.
Pedometers have been shown to improve physical activity in different populations [27][28][29]. More recently, newer wearable devices for tracking physical activity have been used in an attempt to improve physical activity in specific populations. These wearable devices use complex proprietary algorithms to collect and provide physical activity data to the wearer, while being interconnected with computers and mobile phones. One study of internal medicine residents who used a Fitbit activity monitor in the clinical setting showed good adoption and adherence [12]. However, this study was not designed to measure the change in physical activity among residents after receiving the device, but rather the effect of the data provided by the device on their physical activity. Prior research has not shown how implementing a wearable exercise tracker will affect the physical activity behaviors of medical residents.
The primary purpose of this study was to measure the effectiveness of using a wearable device for tracking physical activity on the physical activity behaviors of emergency medicine residents. We hypothesized that self-reported physical activity levels would increase after receiving the device.

Study Design
This study was designed as a pre-post cohort study and involved both active data collection and participant-completed questionnaires. This study was approved by the institutional review board. All participants provided written informed consent and research was conducted in accordance with the Declaration of Helsinki. The data collection portion of this study lasted for 6 months, from September 1, 2014, to March 1, 2015.
The study population consisted of the members of a 3-year, accredited, academic, emergency medicine residency in the United States. The residency is composed of 62 total physicians, divided into 3 postgraduate years. Among the residents, 3 were involved as researchers and therefore were not eligible to participate. All other residents in the program were otherwise eligible to participate and all 62 residents were given a device.

Outcome Measures
The primary outcome measure was the change in the self-reported days per week of at least 30 minutes of physical activity, measured by questionnaire at study start and after 1 month of physical activity tracker use.
The secondary outcome was the change in weekly physical activity (defined by the number of days per week with at least 10,000 steps or 30 minutes of active time) as measured by the Fitbit (FitBit Flex; FitBit Inc, San Francisco, CA, USA) wearable activity tracker compared with the baseline self-reported estimate of physical activity. The accuracy of wearable devices for tracking physical activity has been formally assessed and compared with the physical activity monitors in mobile phones [30]. The algorithm used by the Fitbit company products is proprietary; however, it has been previously used and validated in health services research [31,32]. The number of steps recorded by the Fitbit Ultra has been shown to correlate well with the ActiGraph activity monitor, a well-validated and frequently used exercise research tool [33]. The Fitbit device and step counting algorithm also appear to have good validity when compared with multiple other tools while walking in a controlled environment [30] but may underestimate physical activity under certain conditions such as cycling or other physical activity [34]. When applied to a population of cardiac rehabilitation patients, and compared with the ActiGraph research accelerometer as the gold standard, the same activity tracker used in this study was found to overestimate the amount of physical activity performed by participants [35].
Additional measures of interest included subjective characteristics specific to the adoption and continued use of the physical activity tracking device, measures of wellness, changes in physical activity behavior, and change in self-reported physical activity at 6 months. We conducted a stratified analysis of the population for the physical activity specific outcomes based on two predetermined factors: whether or not the participants continued to use their device throughout the entirety of the 6-month study period and whether or not the participants met the CDC recommended guidelines for adult physical activity at the start of the study, based on their self-reported physical activity in the baseline questionnaire. CDC guidelines for adults recommend "150 minutes of moderate-intensity aerobic exercise (ie, brisk walking) per week" [25].
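The stratified summaries described above can be illustrated with a short sketch. It assumes each participant is a record with a boolean stratification flag and a numeric measure; the field names (`met_cdc_at_baseline`, `days_1mo`) and all values are hypothetical and invented purely for illustration, not taken from the study data. The quartile method is also an assumption (Python's "inclusive" definition), and the default in SAS may give slightly different IQR values.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, IQR), where IQR is Q3 minus Q1.

    Uses Python's "inclusive" quartile definition; this is an assumption
    and may differ from the definition used by other software.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


def summarize_by(participants, flag, measure):
    """Split records on a boolean flag and summarize one measure per group."""
    groups = {True: [], False: []}
    for person in participants:
        groups[bool(person[flag])].append(person[measure])
    # Quartiles need at least 2 observations, so smaller groups are skipped.
    return {key: median_iqr(vals) for key, vals in groups.items() if len(vals) >= 2}


# Hypothetical records (not study data): days per week of activity at 1 month.
participants = [
    {"met_cdc_at_baseline": False, "days_1mo": 1},
    {"met_cdc_at_baseline": False, "days_1mo": 2},
    {"met_cdc_at_baseline": False, "days_1mo": 3},
    {"met_cdc_at_baseline": True, "days_1mo": 3},
    {"met_cdc_at_baseline": True, "days_1mo": 4},
    {"met_cdc_at_baseline": True, "days_1mo": 5},
    {"met_cdc_at_baseline": True, "days_1mo": 6},
    {"met_cdc_at_baseline": True, "days_1mo": 7},
]

summary = summarize_by(participants, "met_cdc_at_baseline", "days_1mo")
```

The same helper could be reused for the second predetermined factor (continued device use through 6 months) by passing a different flag name.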

Study Protocol
All residents within this emergency medicine program were given a wearable physical activity tracker to improve their overall physical activity levels. Before receiving their device, the residents were asked if they would like to participate in this research study, advised that there would be no compensation offered to participate, and informed that receipt of the device would not be contingent on participation. At enrollment, all eligible participants were asked to complete a baseline questionnaire regarding demographic characteristics and physical activity habits (see baseline survey instrument in Multimedia Appendix 1). This questionnaire and all further questionnaires were conducted through SurveyMonkey software. Participants then received their devices and were asked to complete a 2-week acclimatization period before the initiation of electronic data collection. During this acclimatization period, participants were encouraged to wear and use the device. The purpose of this acclimatization period was to allow participants to activate their devices and learn how to use them in their regular daily life. Primary data collection from the devices occurred over the following month, September 2014. Participants were aware that their physical activity information would be collected during this period and were asked to wear their devices as directed by the device manufacturer on the packaging insert and on the manufacturer's website. Specifically, participants were asked to wear their device at all times, with the exception of charging. The hospital training environment does not have stated restrictions on wristband or physical activity tracker use and the participants were able to wear the devices in the clinical setting. The choice to actively follow the physical activity data for 1 month, as opposed to a longer duration of time, was made by the investigators for several reasons. 
First, active data tracking time was limited to 1 month to minimize the impact of being a study participant on the daily lives of the emergency medicine residents. Second, there is a paucity of data on the length of time needed to create a lasting change in physical activity behavior among otherwise healthy physician volunteers with a physical activity tracker intervention.
The specific physical activity tracker used in this study allowed for near real-time physical activity information gathering and data downloads. Specifically, the device provides data on both steps and "active minutes" for each participant. "Active minutes" were calculated within the proprietary algorithm of the device; however, the device manufacturer describes "active minutes" as time measured when the device senses movement that correlates with physical activity above 3 metabolic equivalents (METs) for 10 consecutive minutes. This specific time cutoff was based on specific CDC guidelines for physical activity [25]. In order to facilitate regular data collection, all study participants were asked to create an account on the Fitbit Inc website and register their device for data tracking. Participants then shared access to their Web-based data for the duration of the study period. Data collection was conducted through a third-party application programming interface that pulled the physical activity data from the Fitbit.com servers and generated daily physical activity reports for each participant. These reports were collected for 1 month after which data from participants were only gathered to determine if they continued to use the device until the study period ended. Because prior research has shown that device-specific barriers, such as frequent charging, may decrease the number of days during follow-up that the participants can wear their device, active data tracking was limited to those days in which the participants wore their device for at least 100 steps. This limitation did not apply to the primary outcome measure.
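The counting rules described above (a day is eligible only if at least 100 steps were recorded, and an eligible day is active if it has at least 10,000 steps or 30 active minutes) can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the `DailyReport` record and its fields are hypothetical, and scaling active days to a 7-day week by dividing by eligible days is an assumption, since the text does not state how days per week were normalized.

```python
from dataclasses import dataclass

MIN_WEAR_STEPS = 100       # a day counts as "worn" at or above this step count
STEP_GOAL = 10_000         # step threshold for an active day
ACTIVE_MINUTES_GOAL = 30   # active-minute threshold for an active day


@dataclass
class DailyReport:
    """Hypothetical per-day record for one participant."""
    steps: int
    active_minutes: int


def active_days_per_week(reports):
    """Return (eligible_days, active_days, active_days_per_week).

    The weekly rate is None when no day meets the wear threshold.
    Normalizing by eligible days is an assumption of this sketch.
    """
    worn = [r for r in reports if r.steps >= MIN_WEAR_STEPS]
    active = [
        r for r in worn
        if r.steps >= STEP_GOAL or r.active_minutes >= ACTIVE_MINUTES_GOAL
    ]
    if not worn:
        return 0, 0, None
    return len(worn), len(active), 7 * len(active) / len(worn)


# Hypothetical data: the last day falls below the wear threshold and is dropped.
example = [
    DailyReport(steps=12_000, active_minutes=10),
    DailyReport(steps=5_000, active_minutes=45),
    DailyReport(steps=4_000, active_minutes=5),
    DailyReport(steps=50, active_minutes=0),
]
eligible, active, weekly_rate = active_days_per_week(example)
```

Excluding low-wear days before counting keeps a forgotten or uncharged device from being scored as an inactive day.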
Following the month of physical activity data gathering, participants were asked to complete a questionnaire assessing their use of the activity tracker, perceptions of the device, information on their physical activity during the past month, and a self-assessment of the impact of the device on their self-perceived physical activity and overall wellness. At 6 months, participants were asked to complete a final follow-up survey to assess their use and perceptions of the device as well as their current physical activity levels.

Statistical Analysis
Data analysis included descriptive statistics of demographic characteristics, measures of wellness, physical activity, and perceptions about the wearable activity tracker. As the data did not satisfy the assumption of normality, statistical comparisons were conducted with the nonparametric Wilcoxon signed rank test, for the primary and secondary outcomes of interest as well as for the stratified analyses within these outcomes. Statistical significance was defined as a P value of <.05. Given the small population size of this pilot study and lack of research precedent for this type of intervention, no power calculation was conducted. All statistical analyses were conducted in SAS version 9.4 (SAS Institute Inc).
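For readers unfamiliar with the Wilcoxon signed rank test, the following sketch shows how the statistic and an exact two-sided P value can be computed for paired before and after measurements. It is an illustration only: it uses the exact sign-permutation distribution of the observed ranks, which is not necessarily the method SAS applied in this study, and the function name and sample values are hypothetical.

```python
from collections import Counter


def signed_rank_test(before, after):
    """Return (W_plus, W_minus, exact two-sided p) for paired samples."""
    # Pairs with no change carry no sign information and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    if not diffs:
        return 0.0, 0.0, 1.0

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Null distribution of W_plus: each rank is positive with probability 1/2.
    # Ranks are doubled so that half-integer average ranks become integers.
    dist = Counter({0: 1})
    for r in ranks:
        step = int(round(2 * r))
        updated = Counter()
        for total, count in dist.items():
            updated[total] += count         # this rank gets a negative sign
            updated[total + step] += count  # this rank gets a positive sign
        dist = updated

    observed = int(round(2 * min(w_plus, w_minus)))
    lower_tail = sum(c for total, c in dist.items() if total <= observed)
    p_value = min(1.0, 2 * lower_tail / 2 ** len(ranks))
    return w_plus, w_minus, p_value


# Hypothetical days per week before and after (not study data).
w_plus, w_minus, p_value = signed_rank_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

With only five pairs, all changing in the same direction, the smallest attainable two-sided P value is 2/32, which illustrates why small samples limit what a nonparametric paired test can detect.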

Study Population
Of the 59 eligible residents, 30 ultimately participated in the active data tracking portion of this study, where they used the physical activity tracker for at least 1 week during the first month of follow-up and completed 3 questionnaires over the 6-month period. Of the 59 residents who received a device before the start of the study, 46 (78%) were initially willing to participate and completed the baseline questionnaire, but of these participants, 16 (35%, 16/46) did not register or wear their devices and were excluded from the study analysis (Figure 1). The participants who were excluded were similar in demographic characteristics and baseline physical activity behaviors to the study population based on responses to the initial questionnaire.
Among the 30 study participants, the median age was 28 years (interquartile range, IQR, 4.0), approximately half (53%, 16/30) were male, 40% (12/30) were married, and 10% (3/30) had children. In addition, 3 participants (10%, 3/30) had and were still using a physical activity tracker at the start of the study and 1 participant previously had a device but had stopped using it before the start of the study. The overall perception of physical activity trackers at baseline was positive, with 26 (87%, 26/30) participants describing devices as helpful or possibly helpful on a 5-point Likert scale. The participants generally described themselves as moderately healthy (median 2.0, IQR 2.0, on a scale of 0-4 ranging from not at all healthy to extremely healthy; Table 1).
Despite rating exercise as personally "important" (median 3.0, IQR 1.0) on a 5-point scale ranging from not at all important (score of 0) to extremely important (score of 4), participants felt that they exercised less than they would like. The median number of different types of physical activities reported by the cohort was 2.0 (IQR 1.0). With regard to how work influenced their physical activity behaviors, the majority (77%, 23/30) felt that residency training and their work schedule negatively affected their physical activity behaviors. Nearly everyone in the study, that is, 29 of 30 participants (97%), described physical activity in general as having a positive impact on their wellness, and all study participants felt that an increase in physical activity levels would have a positive impact on their wellness (Table 1).

Outcome Measures
The primary outcome measurement, change in self-reported number of days of physical activity per week after 1 month of device use, was not statistically significantly different from the baseline self-reported number of days of physical activity. The median self-reported number of days of exercise per week before receiving the device was 2.5 (IQR 1.9) and after 1 month was 2.8 days (IQR 1.5, P=.36; Table 2).
The stratified analysis of the primary outcome showed that among those participants with physical activity below the CDC recommended amount of weekly physical activity at baseline, there was a statistically significant increase in the number of weekly days of physical activity from 1.5 (IQR 0.9) to 2.4 (IQR 1.2), P=.04, at 1 month, and an increase from baseline to 2.0 (IQR 2.0) days per week at 6 months (P=.04). The population of participants who met or exceeded the CDC recommended guidelines for physical activity at study start did not have a statistically significant change in their physical activity at 1 month (P=.69; Table 2). Among participants who continued to use their device at 6 months (10/30, 33%), there was no statistically significant change in physical activity from their baseline at study start. The same was true of people who stopped using the device before the end of the study period (20/30, 67%; Table 2).
The secondary outcome of interest, change in days per week of physical activity as measured by the physical activity tracker compared with self-reported baseline days per week of physical activity, did not reveal a statistically significant change in physical activity. The median number of days of physical activity as measured by the device was 2.5 (IQR 2.7) compared with the baseline median number of days of exercise per week of 2.5 (IQR 1.9). The median number of eligible days recorded by the device where the participant recorded at least 100 steps was 27.5 (IQR 8) over the course of the 30-day month. There was no statistically significant difference in physical activity levels at 1 month among those who met or did not meet CDC recommended exercise guidelines (P=.69). Nor was there a statistically significant difference among those who continued to use the device for the entirety of the study period when their 1-month measurement was compared with their own baseline (P=.85), or among the group of people who discontinued use before 6 months (P=.34; Table 2).

Continued Use
Barriers to the continued use of the wearable physical activity tracker were addressed in both the 1-month and 6-month follow-up questionnaires. When study participants were asked to list the barriers to continued use of their physical activity tracker at 1 month, half listed forgetfulness (either forgetting to charge or forgetting to wear the device). However, the other half of participants did not note any barriers to continued use. Barriers to continued use are listed in Table 3 and include the following: not wanting to wear the device, boredom, the belief that the device was not accurately measuring physical activity, and the belief that it was not increasing overall physical activity. Fashion and the device breaking were also noted as barriers.
At 1 month, 18 of 30 (60%) participants described a positive impact on their wellness because of physical activity tracker use and 16 of 30 (53%) listed physical activity tracker use as having a positive impact on their physical activity. Of the 30 participants, 20 participants (67%) continued to use their device after 1 month, but only 10 (33%) participants still used their device after 6 months (Table 3). Figure 2 describes in graphical format the number of study participants who continued to use their device, by week, during the 6-month follow-up period.
Among those who stopped using the device by 6 months (20 of 30 participants), the participants listed both subjective and functional device issues as their principal reason for stopping use of the device, which were similar to the reasons for discontinued use at 1 month. Reasons given for discontinued use included the following: the impression that the device was no longer changing their exercise habits, boredom with the device, the impression that it was not accurately recording physical activity, and the impression that the device was a fad. Device-specific reasons for discontinued use at 6 months included loss of the device, wristband breaking, and issues with charging the device frequently (Table 3).
Among participants who continued to use the device for the entire study period (10 of 30), 4 of 10 participants (40%) listed liking the data provided by the device as their reason for continued use. Additionally, 3 of 10 participants (30%) found that the device reminded them to exercise, and 2 of 10 participants (20%) listed peer pressure as their principal reason for continued use. One person listed the device making him or her feel more physically fit as the main reason for continued use (Table 3).

Table 2. Self-reported physical activity among study participants at baseline, 1 month, and 6 months stratified by continued use and by level of physical activity before receiving device.

Principal Findings
The primary objective of this study was to examine the effectiveness of a wearable device for tracking physical activity on self-reported levels of physical activity among a relatively healthy group of emergency medicine residents 1 month after receiving a physical activity tracker. Within this cohort of 30 emergency medicine residents, there was no overall statistically significant change in self-reported average number of days of physical activity per week 1 month after receiving the physical activity tracker. However, within the prespecified subgroup of residents who did not meet the CDC recommended minimum level of physical activity before receiving the device, there was a statistically significant increase in self-reported weekly physical activity from baseline (1.5 days) to 1 month (2.4 days) and 6 months (2.0 days). Despite a lack of measurable change in the primary end point, the majority of study participants felt that receiving and using the physical activity tracker had a positive impact on their physical activity levels and overall wellness. The broad implications of these findings suggest that these devices do not appear to have a negative impact on physical activity, may be beneficial within specific populations, and may improve wellness in ways that are not measurable with self-reported or device-provided data. These findings may help other emergency medicine or medical training programs implement physical activity programs for residents to improve their wellness by targeting interventions to those who are not physically active and by pairing a physical activity tracker intervention with additional behavioral interventions.
There are several potential explanations for why we did not observe a substantial effect of the physical activity tracker on physical activity levels after 1 month for our entire study population. First, the population in our study was young, physically active at enrollment, and presumably healthy, with two-thirds of participants already meeting CDC guidelines for weekly exercise. Thus, the potential effect of the physical activity tracker among an already active population is likely smaller and may require a larger study to find a statistically significant increase in physical activity. This is supported by our finding that the physical activity tracker was only significantly effective among the subgroup of participants who had not met CDC guidelines for exercise at baseline. Another potential explanation for our findings was that a physical activity tracker alone was not enough to encourage a major change in physical activity. Our study did not use a specific external behavioral change technique, such as a study coordinator helping the participants set an exercise goal. Instead, participants had the opportunity to choose to use the device and its built-in tools as a motivator. Nonetheless, the physical activity tracker used in this study, when paired with the website and mobile phone app, uses many behavior change techniques that have been previously described in the literature, including goal-setting behavior, feedback on behavior, social comparison, prompts and cues, social and other nonspecific rewards, and immediate feedback [36]. Finally, one-third of study participants discontinued use of the physical activity tracker before the 1-month period, which may have reduced the potential effectiveness of the device.

Wellness and Physical Activity
Nonetheless, this population of emergency medicine residents, while generally healthy, is still at risk for psychosocial problems such as career burnout and lack of wellness [19][20][21][22]. Even emergency medicine residents who described themselves as moderately healthy at study enrollment felt that they exercised less than they would like, suggesting that before using their physical activity tracker the participants in this cohort were both aware of their own levels of physical activity and placed a value on their own wellness and the effect that physical activity has on it. Study participants described physical activity as personally important and felt that an increase in their physical activity would improve their overall wellness. Residency training, work schedule, and night shifts were all listed as having a negative impact on their physical activity levels, suggesting that a physical activity tracker or other interventions to improve physical activity and resident wellness are important.

Barriers to Adoption and Continued Use
Evidence does suggest that a physical activity tracker may increase physical activity; however, barriers to adoption and continued use may limit the overall effectiveness. In a qualitative analysis of the Pedometer and consultation-UP trial (PACE-UP), which used pedometers and notebooks for participants and nurse follow-up as their intervention, the authors found the process of monitored physical activity to be beneficial to most participants with the caveat that some participants perceived barriers when the equipment failed to accurately record their activity [37]. This mistrust of monitoring devices was also shown in our results, specifically among those who discontinued use of the physical activity tracker. This specific characteristic of physical activity trackers is a barrier that must be addressed in future research. It is difficult to measure the effect that even a single episode of unmeasured or incorrectly measured physical activity might have on adherence, but it has the potential to bias results. Nonetheless, the stratified analysis of participants who either continued to use their physical activity tracker throughout our study or who stopped during the study period yielded no overall change in measured or self-reported physical activity.
With two-thirds of the participants discontinuing use of the physical activity tracker at 6 months, a consideration of the reasons for discontinuation is warranted to help inform future studies that may assess a physical activity tracker intervention among a healthy population. Reasons for discontinued use were varied but broadly included subjective reasons such as not wanting to wear the device on the wrist, the belief that the device was not accurately recording physical activity, and device-specific reasons such as malfunction, loss, comfort, and fashion. In prior research among an internal medicine resident population, compliance and adherence to interventions with an older generation physical activity tracker were better when paired with an ongoing exercise program and with weekly reminder emails [12]; however, we chose not to add these elements to our research protocol in an attempt to focus on the device-specific benefit and create an intervention that would be simple, reproducible, and scalable. Future research on the use of a physical activity tracker for health and wellness promotion will likely continue to be hindered by these elements. However, researchers who choose to use the physical activity tracker for health promotion may see an improvement in continued use among participants who appreciate the data provided by the device and the reminder to exercise that the physical presence of the device on the arm provides. Additionally, using the data provided by this type of device appears to be somewhat limited by the user.
The physical activity tracker used in this study was specifically designed to capture ambulatory activities; however, the company allows for inputting the duration of alternative physical activities such as swimming, cycling, weight lifting, and yoga into the computer and application interface. We did not specifically ask our study participants to input or record nonambulatory activities. This likely would have primarily affected only the secondary outcome of this study, which was device-measured active days. However, had the participants logged their nonambulatory activities, this would have been captured as active time. We did not differentiate between personally logged and device-measured activities. Nonetheless, in the initial survey we screened participants for their preferred physical activities, and the median number of different activities was 2.0 (IQR 1.0). One additional potential reason why participants discontinued use of the physical activity tracker was the limited ability of the physical activity tracker to record accurate and complete information about a participant's physical activity. All participants endorsed performing physical activities that are readily captured by the device, such as walking, running, jogging, or hiking. We did not capture their primary mode of physical activity, and there is therefore the possibility of bias in the effectiveness of the device and the primary outcome, should the participants feel as though their physical activity was not being measured correctly. A total of 2 of the 20 participants who eventually stopped using the device noted that the device was not measuring their physical activity correctly, although it is unclear if this was specific to failure of the device to record nonambulatory physical activities or mismeasurement of activities that the device is supposed to accurately capture, such as walking. 
Other studies have also reported similar barriers to using these devices, including the "novelty effect" wherein continued use declined, lack of adherence among participants, and technical issues with the device or website [34]. Nonetheless, Fitbit devices have been used in studies of cardiac rehabilitation programs with better overall adherence to use [38], and have shown promise for physical activity interventions among obese sedentary adult women [39] and for patients with chronic obstructive pulmonary disease [40]. These findings point to a possible enhanced benefit among less physically active users, which is also suggested by our results. The overall effectiveness of the device among a less healthy study population may be influenced by multiple factors, including regular contact with medical professionals and the variety of non-device-specific behavioral modification techniques used in their research protocols, such as a nurse or study coordinator helping to set goals.

Limitations and Future Directions
Several limitations were identified in this study. This was a single institutional study, albeit a large and diverse residency program. Study data suggest that the baseline physical activity levels were higher than that described in other studies of resident physical activity. The residency leadership's emphasis on well-being and exercise, as demonstrated by the gift of a physical activity tracker, may have biased resident participation, and participants may have been more likely to overreport physical activity or even use the physical activity tracker more than they would normally have had it not been a gift from their employer. Of the study investigators, 3 were emergency medicine trainees during enrollment and data acquisition, and although this poses a potential source of bias in that the study participants frequently interacted with the investigators, implementing this type of intervention in the future will most likely also involve peer-to-peer interaction. It is unclear how this type of interaction can bias the results of this type of study, but it most likely encouraged participants to exercise more frequently and possibly could have led to overreporting of physical activity. The small sample size also limited our ability to conduct subgroup analyses, and future research may be needed to examine the effect of a physical activity tracker among people of different demographic groups.
This study is subject to selection bias. Slightly more than half of the eligible participants in the emergency medicine residency were part of the active data collection and follow-up. However, the 16 residents who initially enrolled in the study and completed the baseline questionnaire, but did not participate in further active data collection, had similar baseline characteristics and self-reported levels of physical activity. This study also involved 3 participant questionnaires and therefore suffers from the inherent biases of research with cross-sectional elements.
To decrease the amount of recall bias, subjective recall periods were kept intentionally short and specific. Furthermore, participants were aware that they would be providing estimates of their physical activity habits before being asked for them and were thus more likely to accurately recall and report these values. Conversely, this study involved a physical activity intervention, which could have caused unintentional inflation of self-reported exercise frequency. To mitigate this possible source of bias, the physical activity data from the device itself were used in addition to the self-reported amount of physical activity from the participants, and results did show high agreement. It is also possible that the physical activity data provided by the website and mobile app associated with this physical activity tracker could have influenced the self-reported amount of physical activity at 1 month. It is unclear if this potential bias could have masked the effect of the intervention. Further research must be performed to determine the degree to which access to a person's physical activity data can influence that person's self-reported physical activity. It also must be noted that the optimal time period during which to observe a sustained change in physical activity for this type of intervention is unknown. The follow-up time of 1 month may have been too short for our primary outcome. Our choice to limit active follow-up to 1 month was made for several reasons. First, as this was a pilot study, we did not want to unduly burden the study participants as they are medical residents with significant demands on their time and they were asked to regularly interface with the mobile app or website and use the device. Second, the only other study of an intervention using a physical activity monitor on a similar population [10] chose a 6-week by 6-week time period as an appropriate length of time for its crossover randomized clinical trial. 
Our study allowed for a 2-week acclimatization period, followed by 1 month of active monitoring. Our study specifically aimed to address feasibility and effectiveness of the Fitbit device over a short time period. Our primary focus was not on maintenance of the health behavior; however, this will be of paramount interest for future investigators who wish to use a physical activity tracker in a similar population. Finally, this study did not use validated physical activity or wellness tools, and thus caution should be used when interpreting these data. Future studies should seek to use validated instruments for their study population to increase the ability to compare results across study populations.
Additional limitations about the physical activity tracker used in this study should be noted. First, the device, even when worn correctly, may have underrepresented [34] or overrepresented [35] the amount of physical activity performed by each participant, a known problem that has previously been described in the literature. Second, the device itself required the user to remember to use it and to keep it charged, both of which allowed for inconsistencies in the number of days eligible for active data tracking. Nonetheless, daily use of the device was generally good and the number of physically active days per week as recorded by the device was similar to, if only slightly lower than, the median number of active days provided on the 1-month questionnaire. Finally, the study was limited with respect to determining the true amount of physical activity performed by each participant during follow-up. The apparent lack of difference between the device-measured and self-reported physical activity observed in this study must be viewed in light of the small sample size. It remains unclear how behavioral change should be measured, either with a questionnaire or with the data provided by the device, when using a physical activity tracker as an intervention. We hope that future research in this area can address the limitations largely due to the relatively small sample size of our pilot study. We are encouraged by portions of the results that suggest an improvement in overall wellness and physical activity within the subset of the population. We suggest that future research address some of the device-specific and adherence concerns voiced by our participants. Research that has paired these devices with behavioral interventions has also shown promise and should be explored in a larger sample of healthy participants as well. We also suggest lengthening the overall study duration to more accurately capture adherence to behavior change.

Conclusions
The implementation of a physical activity tracker among a healthy population of emergency medicine residents did not change the overall self-reported physical activity at 1 month and 6 months. However, there was a significant improvement in the amount of physical activity among the residents with preintervention physical activity levels below the CDC recommended guidelines. Subjective improvements in overall wellness and physical activity were noted among the whole study population. Adherence waned over the study period with only one-third of participants continuing to use the device at 6 months. Our pilot study findings may provide helpful information for residency programs that may be contemplating a wearable physical activity tracker intervention among their residents or others who may be considering a similar intervention among a relatively healthy population of adult participants.

Introduction
The success of a behavioral intervention depends upon participants' active engagement in treatment. Engagement with treatment is a multifaceted state with behavioral, affective, and cognitive components that contribute to maximizing positive treatment outcomes [1]. Treatment engagement is therefore a key component of any evaluation of treatment efficacy. With an increasing interest in developing behavioral interventions in the mobile health (mHealth) space [2], appropriate methods for evaluating engagement in this context are necessary. Indeed, evaluating engagement in mHealth has been identified as critical for improving the impact of technology-based mental health interventions [3,4].
Unlike clinic-based care, mHealth data are often collected much more intensively [5], allowing more detailed patterns to emerge in the outcomes of interest [6]. With mHealth interventions, engagement evaluations usually focus on the behavioral component and examine various measures of mHealth intervention usage [3,7]. Outcome data may be available daily if quantified as app usage, short message service (SMS) messaging, passive sensing data, response to prompts, or use of an online portal, for example. More so than in a single time point, we must consider the nature of missing data in intensively collected engagement outcomes. Furthermore, compared with other clinical outcomes, engagement is particularly likely to have missing data related to the outcome value itself. For example, if a participant is disengaged in treatment and thus unlikely to attend a therapy session, there is an increased likelihood that the participant does not return for a follow-up visit as well. In the mHealth context, the problem is compounded in that mode of follow-up data collection and intervention delivery is often the same. That is, the collection of an intensively collected engagement outcome like app usage is directly tied to engagement itself. The availability of engagement data is likely strongly related to level of engagement with the intervention. Therefore, missingness in engagement outcomes should be considered to be nonrandom and nonignorable [8,9].
Longitudinal models such as mixed effects models and latent growth curve models are robust to random missingness but not to nonrandom missingness like that likely present in longitudinal engagement data [8,10]. That is, failure to take into account the mechanism of missingness results in biased inference about the outcome [11,12]. Time-to-dropout and longitudinal engagement are linked processes, and examining either separately is likely to miss key information. Analyzing intensively collected engagement therefore requires longitudinal methodology that takes into account nonrandom missing data. The model must also accommodate flexible patterns of engagement over time, which can be captured when so many data points are available. Using a joint model enables simultaneous modeling of the longitudinal outcome and the dropout mechanism to accommodate data missing not at random. Models that jointly evaluate the time-to-event and longitudinal processes have previously been shown to reduce bias in estimation of the effects in the longitudinal and time-to-event processes [13-16]. They have been successfully applied in nonintensive, longitudinal studies (as in Henderson et al [14], for example). These models, however, have not previously been applied to intensively collected data in the mHealth context, where they are particularly relevant.
Recent work has highlighted the need to understand engagement with mHealth interventions with the goal of designing effective interventions that meet users' needs [1,7]. Levels of engagement with an mHealth intervention may change over time and have important implications for understanding the success of an intervention. Understanding how engagement changes over time, factors associated with changes in level of engagement, and how engagement is related to changes in behavior targeted by the mHealth intervention could inform intervention tailoring and improvement. Therefore, accurate estimation of behavioral engagement over time is essential.
The objectives of this paper are to discuss the utility of the joint modeling approach in the analysis of longitudinal engagement data in mHealth research and illustrate the application of this approach using data from an mHealth intervention designed to support illness management among people with schizophrenia. We use data from a large implementation study (ClinicalTrials.gov NCT02364544) which involved the use of a smartphone intervention (FOCUS) designed to support illness management among people with schizophrenia. The study data, described in detail in a separate article [17], consist of weekly engagement outcomes. We first introduce both longitudinal and time-to-event submodels that make up the joint model. We then illustrate the need for joint modeling by examining the difference in observed engagement outcome by amount of available data. After performing a naïve analysis of the data that does not take into account nonrandom missingness, we analyze and interpret the engagement data via joint modeling and contrast the results of the 2 approaches.

FOCUS Intervention Analysis
The data for this evaluation are from a multisite implementation project that recruited participants at 10 community mental health centers and outpatient clinics. Eligible participants were individuals between the ages of 18 and 60 years with psychotic disorders who had recently been discharged from a psychiatric hospitalization. Participants were offered a technology-assisted relapse prevention program that could last up to 6 months. Variation in program duration was due to both participant-related (eg, discontinued phone use and/or study follow-ups) and project-related (eg, funding ended) factors. As part of the program, participants were provided with a smartphone with the FOCUS illness self-management program installed. FOCUS consists of both prompted (3 times per day) and self-initiated use where each use starts with a brief self-assessment and is followed by educational/intervention content. Program discontinuation was identified when participants notified study staff of a desire to end participation and/or returned the study phone. In addition, when participants enrolled in the last 5 months of the study, they participated for less than a full 6 months. Finally, when participants stopped generating phone data, stopped attending in-person services, and study staff were unable to contact them after repeated attempts, the study team made the determination of discontinuation.
The evaluation of engagement with the FOCUS intervention assessed the decline of engagement over time for this long-term mHealth intervention as well as factors that may be associated with differing rates of decline. Curvilinear declines were seen in each engagement outcome: Days of mHealth Use, Days Responding to Prompts, Days of On-Demand Use, and Daily On-Demand Use. In addition, several demographic and psychiatric variables were found associated with longitudinal engagement. Models of time to dropout included gender, age, and race as potential predictors [17]. In the current demonstration of joint modeling, we focus on the research question of change in engagement over time using Days of mHealth Use per week as the engagement outcome.

Joint Model Set-Up
Joint models are composed of 2 submodels: a longitudinal model of a continuous outcome and a time-to-event model. Using notation from Rizopoulos [18], the longitudinal outcome for individual $i$, $y_{ij}$, is observed multiple times, $j = 1, \ldots, n_i$. The longitudinal submodel is a linear mixed effects model,

$$y_i(t) = x_i^\top(t)\,\beta + z_i^\top(t)\,b_i + \varepsilon_i(t),$$

where $\beta$ is a vector of fixed effect regression coefficients associated with the predictors $x_i(t)$ and the vector $b_i$ is a set of individual-level random effects associated with predictors $z_i(t)$. We assume a normal distribution for both $b_i$ and $\varepsilon_i(t)$, with $b_i \sim N(0, D)$ and $\varepsilon_i(t) \sim N(0, \sigma^2)$, and also that these 2 random variables are independent of each other. In this application, the outcome, $y_i(t)$, is engagement measured as weekly mHealth intervention usage. The research question is whether engagement changes over the course of the study, so time from randomization, a quadratic effect of time, and a fixed intercept term are included in $x_i(t)$. Other flexible models of time are possible, but for simplicity, we focus on this parametric model, which appears to fit the observed trajectory well. For other research questions, other predictors may be included in $x_i(t)$. Due to the focus on changes over time, we have included only time variables in the longitudinal model in this application, but it is straightforward to include additional variables in this model, including the baseline predictors used in the time-to-event model. In $z_i(t)$, we include a random intercept and slope term. The model of engagement is therefore:

$$y_i(t) = \beta_0 + \beta_1 t + \beta_2 t^2 + b_{0i} + b_{1i} t + \varepsilon_i(t). \tag{1}$$
We rewrite the above equation in a different format in order to introduce the term $m_i(t)$, which represents the true value of the longitudinal outcome for individual $i$ at time $t$, measured without error:

$$y_i(t) = m_i(t) + \varepsilon_i(t), \qquad m_i(t) = \beta_0 + \beta_1 t + \beta_2 t^2 + b_{0i} + b_{1i} t.$$

Time-to-event models are often referred to as survival models, as they are frequently applied to survival data that are only fully observed in some participants (those who die while in the study). In the behavioral sciences, time-to-event models can be applied to model times to any event where the event may not be observed in all individuals (eg, time to relapse or time to recovery). For example, in a study of smoking cessation, when the study ends prior to an individual's relapse to smoking, that participant's time to relapse is only partially observed. That is, it is known that he or she remained abstinent for the duration of the study, but the time of relapse is unknown. These partially observed times are said to be censored. In the context of engagement, the partially observed time-to-event data is the time to dropout. Time-to-dropout data is fully observed among those participants who drop out prior to the end of study. Time to dropout is censored when the study follow-up period ends.
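The distinction between an observed and a censored dropout time can be illustrated with a minimal sketch. The function name and the 26-week follow-up length are assumptions made for illustration (26 weeks stands in for a follow-up of roughly 6 months); this is not the study's data-processing code.

```python
from typing import Tuple

FOLLOW_UP_WEEKS = 26  # assumed follow-up length, roughly 6 months


def to_time_to_event(weeks_observed: int,
                     follow_up: int = FOLLOW_UP_WEEKS) -> Tuple[int, int]:
    """Return (time, event) for one participant.

    event = 1: dropout was observed before the end of follow-up.
    event = 0: the participant was still providing data when follow-up
               ended, so the time to dropout is censored at follow_up.
    """
    if weeks_observed >= follow_up:
        return follow_up, 0
    return weeks_observed, 1


# Hypothetical participants with 4, 26, 13, and 30 weeks of data.
records = [to_time_to_event(w) for w in (4, 26, 13, 30)]
```

Each pair carries the two pieces of information a time-to-event model needs: how long the participant was followed and whether the event was actually seen.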
The time-to-event submodel is given as a proportional hazard model [19]:

$$h_i(t) = h_0(t)\exp\{\gamma^\top w_i + \alpha\, m_i(t)\},$$

where $h_0(t)$ is the baseline hazard. Importantly, the true value of the longitudinal trajectory, $m_i(t)$, is a predictor in this model, representing the assumption that the longitudinal trajectory influences the risk of dropout; the coefficient $\alpha$ quantifies this association. Other baseline covariates in the model are represented by $w_i$. In the current application, we include available baseline predictors that may influence the time to dropout: age, gender, and race (black, Hispanic, and other, with white as the reference group):

$$h_i(t) = h_0(t)\exp\{\gamma_1\,\mathrm{age}_i + \gamma_2\,\mathrm{male}_i + \gamma_3\,\mathrm{black}_i + \gamma_4\,\mathrm{Hispanic}_i + \gamma_5\,\mathrm{other}_i + \alpha\, m_i(t)\}.$$

The semiparametric proportional hazard model does not require an assumption about the distribution of the time to event, and the parameter estimates associated with predictors in the model are conveniently interpreted as hazard ratios. For example, being male is associated with a risk of dropout that is $\exp\{\gamma_2\}$ times the risk of dropout in females.
Estimation of the parameters in each model is performed by maximizing the log likelihood of the joint distribution of the longitudinal and time-to-event outcomes [18]. This joint model is known as a shared parameter model since the parameters that define the individual-level trajectory (random and fixed effects) influence both the longitudinal trajectory and the time-to-event model. Thus, the random effects account for both the association between the longitudinal and time-to-event outcomes and the nonindependence of repeated observations within individual [18].
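For reference, the log likelihood being maximized can be sketched in the standard shared parameter form. The notation here is an assumption introduced for illustration and is not used elsewhere in this paper: $\theta$ collects all model parameters, $T_i$ is the observed time for individual $i$, and $\delta_i$ indicates whether dropout was observed (1) or censored (0). The sketch assumes the longitudinal and time-to-event outcomes are independent given the random effects $b_i$.

```latex
\ell(\theta) = \sum_{i} \log \int
  p(T_i, \delta_i \mid b_i; \theta)
  \left[ \prod_{j=1}^{n_i} p(y_{ij} \mid b_i; \theta) \right]
  p(b_i; \theta) \, db_i
```

The integral over $b_i$ is what ties the two submodels together: the same random effects appear in the density of the repeated measurements and in the density of the dropout time.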

Joint Modeling of Engagement
Nonignorable missingness, or missingness not at random (MNAR), occurs when the probability of missingness depends on unobserved longitudinal responses [8,11]. That is, it occurs if certain values of a variable are more likely to be missing than other values. In the case of engagement, it is very likely that lower levels of engagement are less likely to be observed because a participant who becomes less engaged over time is much more likely to drop out of the study. Longitudinal engagement data is therefore particularly subject to informative missingness. In the current study, engagement, defined as the number of days in a week that the participant used the mHealth intervention, is collected each week for up to 6 months. Participants provided data for differing amounts of time ranging from less than 1 month to 6 months or more. A participant who provided less than 6 months of data is considered to have dropped out for the purpose of the time-to-event analysis. This happened for various reasons. In some cases, the reason is administrative and is likely not informative (ie, value of the unobserved data should not be viewed as related to the data that would have been observed); for example, mobile data collection stopped because the implementation effort came to an end. On the other hand, there are several participants who stopped providing mobile data before the study ended. In the latter case, we should assume that the value of the engagement outcome that would have been observed (ie, if the participant provided data) is lower.
If we knew that all participants who dropped out did so due to disengagement (eg, stopped participating or using the phone due to lack of interest in the intervention), it might be reasonable to impute a 0 value for engagement for all weeks postdropout. This would be considered a worst-case scenario as it is possible that had these participants not dropped out they would have had some engagement even if it were low. However, there are also cases where dropout is unrelated to engagement, including administrative dropout or moving out of the area, lost phone, etc. For these 2 reasons, we should not assume that all missing data represents the worst case scenario of complete disengagement. The joint model allows for a relationship between level of engagement and likelihood of dropout but does not make assumptions that all missing data represents a complete lack of engagement. In this way, the joint model flexibly handles dropout that may or may not be related to engagement.
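The worst-case reasoning above can be made concrete with a small sketch that brackets the mean engagement for a single week. The function name and the values are hypothetical. The zero-imputed mean is a lower bracket because engagement cannot be negative; the available-case mean acts as an upper bracket only under the assumption that missing values tend to be lower than observed ones. Neither number is the joint model estimate.

```python
from typing import List, Optional, Tuple


def engagement_brackets(week_values: List[Optional[float]]) -> Tuple[float, float]:
    """Bracket the mean engagement for one week.

    week_values holds one entry per enrolled participant; None marks a
    participant with no data that week.
    Returns (zero_imputed_mean, available_case_mean).
    """
    observed = [v for v in week_values if v is not None]
    if not observed:
        raise ValueError("no observed values for this week")
    available_case = sum(observed) / len(observed)   # ignores missing data
    zero_imputed = sum(observed) / len(week_values)  # missing treated as 0
    return zero_imputed, available_case


# Hypothetical week: 6 enrolled participants, 3 still providing data.
low, high = engagement_brackets([5, 4, None, 6, None, None])
```

A model that links dropout to engagement without forcing missing values to zero would be expected to land between these two brackets when the stated assumption holds.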
To implement the joint model, we used the JM package in R [18] (R Project). The model estimated is described in equations 1 and 2 above. Naïve models for longitudinal outcome and time-to-dropout were fit via linear mixed effects models and Cox proportional hazard models, respectively, using the lme function in the nlme package [20] and the survfit function in the survival package [21] in R.

Results
Data from 342 participants who used the FOCUS intervention for at least 1 week were included in these analyses. The mean age of this sample was 35 (SD 11) years; 62.3% were male, 50.0% were white, 25.2% were African American, 10.8% were Hispanic, and the remaining 14.0% reported being Asian, American Indian, Native Hawaiian, or more than one race.
Kaplan-Meier estimates of the time-to-dropout are presented in Figure 1. Median time-to-dropout in this study was 22 weeks, but dropout occurred throughout the course of the study. After a participant dropped out, engagement data were no longer available.
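For readers unfamiliar with the Kaplan-Meier estimator, the following is a minimal sketch of the product-limit calculation on made-up data. The function names and the toy records are hypothetical; the analysis in this paper was carried out in R.

```python
from typing import List, Optional, Tuple


def kaplan_meier(data: List[Tuple[float, int]]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the probability of remaining in the study.

    data: (time, event) pairs, event = 1 for observed dropout, 0 for censored.
    Returns (time, S(time)) at each distinct observed dropout time.
    """
    event_times = sorted({t for t, e in data if e == 1})
    surv = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u, _ in data if u >= t)            # still followed at t
        events = sum(1 for u, e in data if u == t and e == 1)  # dropouts at t
        surv *= 1.0 - events / at_risk
        curve.append((t, surv))
    return curve


def median_time(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which the estimated curve falls to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Made-up data: weeks of participation and whether dropout was observed.
toy = [(4, 1), (8, 1), (8, 0), (13, 1), (20, 1), (26, 0), (26, 0), (26, 0)]
curve = kaplan_meier(toy)
toy_median = median_time(curve)
```

Censored participants contribute to the at-risk count up to their censoring time but never count as dropouts, which is how the estimator uses partially observed times.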
To illustrate the relationship between level of engagement and amount of data provided, we grouped participants by duration of mobile data provided. At each time point the available data within each group are used to compute a mean engagement. Figure 2 illustrates that participants who provided the most data for the longest duration had the highest level of engagement. Likewise, participants who discontinued using the intervention after only 1 month had a very low level of engagement during the time they were actually providing data. One of the benefits of mixed effects models is that data are not required at all time points for all participants. This is possible because the model estimates an individual's trend over time based on the data from that individual augmented by the trend of the full sample of participants [22]. However, this is problematic in the context of nonignorable missing data. If during the later months, data are only available from those participants who provided data for several months and those participants tended to be more engaged throughout, estimates from a naïve model during the later months will rely on data provided by highly engaged participants and therefore overestimate the level of engagement at those times.
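The mechanism described above can be reproduced in a small simulation. This is a sketch under stated assumptions, not a reanalysis of the study data: the intercept, slope, noise levels, and baseline hazard are invented, dropout risk is tied to true engagement through an assumed log hazard ratio of -0.26 per day of use, and the available-data mean stands in for what a naive summary would see.

```python
import math
import random


def simulate(n=2000, weeks=26, alpha=-0.26, base_hazard=0.10, seed=1):
    """Simulate weekly engagement with dropout that depends on engagement.

    Returns (true_mean, obs_mean, obs_n): the mean true engagement of all
    enrolled participants, the mean observed engagement among those still
    providing data, and the number still providing data, for each week.
    """
    rng = random.Random(seed)
    true_sum = [0.0] * weeks
    obs_sum = [0.0] * weeks
    obs_n = [0] * weeks
    for _ in range(n):
        b0 = rng.gauss(0.0, 1.5)  # individual-level random intercept
        in_study = True
        for t in range(weeks):
            # True engagement (days per week), kept within the 0-7 range.
            m = min(7.0, max(0.0, 4.5 + b0 - 0.1 * t))
            true_sum[t] += m
            if in_study:
                obs_sum[t] += m + rng.gauss(0.0, 0.5)  # measurement error
                obs_n[t] += 1
                # Weekly dropout probability rises as engagement falls.
                hazard = base_hazard * math.exp(alpha * m)
                if rng.random() < 1.0 - math.exp(-hazard):
                    in_study = False
    true_mean = [s / n for s in true_sum]
    obs_mean = [s / k if k else float("nan") for s, k in zip(obs_sum, obs_n)]
    return true_mean, obs_mean, obs_n


true_mean, obs_mean, obs_n = simulate()
gap_last_week = obs_mean[-1] - true_mean[-1]
```

In the first week everyone is observed and the two means agree up to noise. By the final week the available data come disproportionately from participants with high random intercepts, so the available-data mean sits above the mean for the full enrolled sample.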
The longitudinal engagement outcome is Days of mHealth Use per week (range 0-7). Sometimes count variables can be considered to have a Poisson distribution, but unlike a Poisson random variable, the distribution of this variable was symmetric around the mean (not skewed) and somewhat kurtotic. There is evidence supporting the consideration of Likert scale variables with multiple categories as continuous variables [23], and mixed effects models have been shown to be robust to both non-Gaussian random effects distributions [24,25] and non-Gaussian residual errors [26]. We therefore examined the distribution of the longitudinal engagement variable and the residuals from the mixed effects model to assess the appropriateness of the longitudinal submodel for this engagement outcome. Both indicated that there was not a significant deviation from normality and the model-based estimates fit the raw data means well. Table 1 and Figure 3 show the results of a naïve mixed effects model of engagement not taking into account dropout alongside the results when the joint model is implemented. The longitudinal models are similar with significant linear and quadratic terms showing a significant decline in engagement over time (negative linear time term) that is steeper toward the beginning of the study and levels off as the study progresses (negative quadratic time term). Figure 3, however, shows that the mixed model estimates a higher level of engagement than the joint model and this difference is pronounced toward the end of the study. At baseline, estimated level of engagement in the 2 models differs only by about 0.2 days per week. By 6 months, however, the model-estimated engagement from the naïve model is 2.9 days per week of use, whereas the model-estimated engagement from the joint model is 1.8 days per week, a difference of 1.1 days per week.
Examining the naïve time-to-dropout model versus the time-to-dropout submodel of the joint model that includes longitudinal engagement as a predictor, we see that no baseline covariates have a significant effect on time-to-dropout in either model, but it is clear in the joint model time-to-dropout submodel there is a strong association between engagement level and risk of dropout. Specifically, using the mHealth intervention 1 day more per week is associated with 0.77 (exp(−0.26)) times the risk of dropout at any time (P<.001). That is a 23% decreased risk of dropout associated with greater engagement.

Figure 3. Model-based estimated mean engagement with mHealth intervention (intervention use) over the course of the study. Estimates (and 95% confidence intervals) from the joint model of engagement and time-to-dropout and from the naïve mixed effects model not accounting for dropout are displayed.
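The arithmetic behind the reported hazard ratio can be checked in a few lines. The coefficient of −0.26 is taken from the text; the function name is hypothetical, and extending the calculation to more than 1 additional day relies on the model's assumption that the effect is linear on the log hazard scale.

```python
import math

LOG_HR_PER_DAY = -0.26  # association parameter reported in the text


def hazard_ratio(extra_days: float, log_hr: float = LOG_HR_PER_DAY) -> float:
    """Hazard ratio for dropout associated with extra_days more use per week."""
    return math.exp(log_hr * extra_days)


hr_one_day = hazard_ratio(1)           # rounds to 0.77
risk_reduction = 1.0 - hr_one_day      # rounds to 0.23, ie, 23% lower risk
hr_two_days = hazard_ratio(2)          # equals hr_one_day squared
```

Because hazard ratios multiply, a 2-day difference in weekly use corresponds to the 1-day ratio applied twice rather than to a doubling of the percentage reduction.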

Discussion
Examining intensively collected engagement with the mHealth behavioral intervention made clear that level of engagement varied by amount of available mobile data. Naïve mixed effects models of engagement showed a slight decrease over the 6-month course of the study, but these results weight data from highly engaged participants toward the end of the study period leading to possibly biased results. Joint modeling of the linked processes of engagement and time to dropout allowed for an examination of engagement over time that more appropriately accounted for missing engagement data. These model results indicated a greater decline in engagement with the mobile intervention over time in the population. Furthermore, the time-to-event submodel of the joint model specifically quantifies the association between longitudinal engagement and dropout. The association is seen to be statistically significant, with those who are more engaged significantly less likely to drop out. And conversely, those who are less engaged are much more likely to drop out and therefore much more likely to yield missing engagement outcome data.
The present analysis represents just 1 example of implementing a joint model and comparing it to a naïve mixed model for engagement with an mHealth intervention. However, similar patterns between models would be expected assuming an association between increased likelihood of missingness and lower engagement. That is, the joint model results will likely estimate lower levels of engagement than a naïve mixed model. The magnitude of the difference between results from a mixed model and longitudinal submodel of a joint model depends on the association between engagement and missing data in the particular dataset being analyzed, the level of missing data, and the pattern of missingness over time. Therefore, a comparison of models from a different dataset may produce different results.
While missing data in the context of longitudinal studies is always a concern, often this missingness can be handled with the usual longitudinal modeling techniques such as mixed effects models. Importantly, with engagement data, the assumptions necessary for valid inference from typical models are likely not met since level of engagement may be related to likelihood of missing data. In this case, typical longitudinal models produce biased results. It is therefore especially important to account appropriately for missing data in analyses of engagement outcomes. With mHealth interventions, engagement is collected more intensively and often in the same mode as treatment is delivered so addressing missing engagement data is especially important. In the current investigation, we focus only on the behavioral component of engagement as this is frequently measured intensively via mobile devices and therefore most relevant for the modeling concepts presented.
Joint models are straightforward to implement with the JM package in R and offer flexibility in modeling the longitudinal trajectory over time. While in the current application we only used parametric models of time (quadratic), more flexible patterns of change over time can be accommodated by using spline basis terms in the longitudinal submodel of the joint model. Parametric assumptions on the time-to-event data are also not required.
There are other types of shared parameter models that model the longitudinal and/or time-to-event data differently with respect to specifying the individual-level trends in the longitudinal outcome, specifying the dependence of the time-to-event processes on these individual-level trends, varying the form of the time-to-event model, and approaching the estimation of model parameters [12]. The shared parameter model implemented in the current application is that proposed by Wulfsohn and Tsiatis [16]. Other methods for modeling longitudinal data with dropout, including random coefficient selection models and random coefficient pattern mixture models, are summarized in Little [27]. Pattern mixture models [22,28] estimate separate longitudinal trajectories by groups defined by dropout time and summarize the trajectory for the population by averaging the groups. When a limited number of dropout patterns are present to define the groups or when the goal is to examine trajectories separately by time of dropout, pattern mixture models may be most appropriate and also can be easily implemented. Related to pattern mixture models, the terminal decline model [29] is geared toward examining the longitudinal trajectory just prior to dropout or death. Selecting an appropriate model to accommodate nonignorable missingness is important and should be geared toward the research question. The shared parameter joint model implemented here is especially appropriate for intensively collected longitudinal data because the focus is on examining the longitudinal trajectory of the population over time, the pattern of engagement can be modeled flexibly, grouping individuals by dropout time is unnecessary, and no assumption is made about the distribution of the time to dropout.
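The averaging step of a pattern mixture model, as described above, can be sketched as follows. The dropout groups, their proportions, and their linear trajectories are invented for illustration; a real pattern mixture analysis would estimate these from data and would need identifying assumptions to extrapolate each group's trajectory beyond its dropout time.

```python
from typing import List, Tuple

# (proportion of sample, intercept, slope per week) for each dropout pattern
Pattern = Tuple[float, float, float]


def population_mean(patterns: List[Pattern], week: float) -> float:
    """Average the group-specific trajectories, weighted by group proportion."""
    total = sum(p for p, _, _ in patterns)
    return sum(p * (a + b * week) for p, a, b in patterns) / total


# Hypothetical groups: early dropouts, mid-study dropouts, and completers.
patterns = [
    (0.25, 3.0, -0.20),
    (0.25, 4.0, -0.10),
    (0.50, 5.0, -0.02),
]
mean_week0 = population_mean(patterns, 0)
mean_week10 = population_mean(patterns, 10)
```

This makes the contrast with the shared parameter model visible: here individuals must first be grouped by dropout time, whereas the joint model links dropout to each individual's trajectory directly.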
Assessing engagement with mHealth behavioral interventions is crucial to evaluating their efficacy. Modeling intensively collected engagement should be done via models that appropriately account for the potential of nonignorable missing data. Using the shared parameter joint model implemented in the JM package in R is a straightforward way to flexibly model intensively collected engagement data like that from mHealth interventions and to examine the relationship between engagement and missing data.