Co-designing opportunities for Human-Centred Machine Learning in supporting Type 1 diabetes decision-making

Type 1 Diabetes (T1D) self-management requires hundreds of daily decisions. Diabetes technologies that use machine learning have significant potential to simplify this process and provide better decision support, but often rely on cumbersome data logging and cognitively demanding reflection on collected data. We set out to use co-design to identify opportunities for machine learning to support diabetes self-management in everyday settings. However, over nine months of interviews and design workshops with 15 people with T1D, we had to re-assess our assumptions about user needs. Our participants reported confidence in their personal knowledge and rejected machine learning-based decision support when coping with routine situations, but highlighted the need for technological support in unfamiliar or unexpected situations (holidays, illness, etc.). Yet these are precisely the situations where prior data are often lacking and drawing data-driven conclusions is challenging. Reflecting this tension, we provide suggestions on how machine learning and other artificial intelligence approaches, e.g., expert systems, could enable decision-making support in both routine and unexpected situations.


Introduction
Among major health conditions, diabetes is one of the most common, affecting over 400 million people worldwide (International Diabetes Federation, 2017). Type 1 diabetes (T1D), which affects 5%-10% of those with diabetes, is an autoimmune condition requiring frequent injections of insulin to maintain blood glucose (BG) levels within a safe range. Elevated levels (hyperglycaemia) can lead to long-term complications, such as blindness, kidney failure, or nerve damage, while severely low BG levels (hypoglycaemia) can lead to unconsciousness, seizure, coma, and, in rare cases, death (McGill and Ahmann, 2017). While clinicians can play an important role in supporting diabetes care, effective daily management relies primarily on an individual's habits and management decisions (Funnell and Anderson, 2004). Self-managing diabetes typically involves self-monitoring BG levels and lifestyle factors such as food and physical activity multiple times per day, analysing this information, and dynamically adjusting numerous factors accordingly (Klonoff, 2012). However, maintaining this balance with the demands of daily life is challenging, resulting in many individuals failing to meet clinical guidelines (Miller et al., 2015). It is therefore important to help users make sense of their data. This has often been approached by seeking to provide personalised suggestions about appropriate insulin doses based on a range of manually logged variables such as exercise levels, alcohol consumption and time of meal (e.g. Pesl et al., 2017), or by bringing attention to correlations between physical location and glucose stability (Sebillo et al., 2015). However, existing decision support systems still require burdensome logging, often lack explicit actionable suggestions (Ohlin and Olsson, 2015) and are far too often developed without end-user input (Gillespie and Seaver, 2016; Inkpen et al., 2019).
While HCI and AI research have previously been characterised as having distinct views of the relationship between humans and technology (Winograd, 2006; Grudin, 2009), recent work has sought to integrate human-centred design methodologies and AI approaches (Abdul et al., 2018; Katan et al., 2015; Yang et al., 2020). There is a clear opportunity for reducing the burden of T1D through AI and ML enhanced technologies; however, the integration of these solutions into people's everyday lives is not straightforward. Self-care involves planning, coping with adversity and reflection-both in the long and short term. As such, accounting for lived experience is paramount to the design of any human-centred ML-enhanced intervention. Despite theoretical promises of reduced care burden through the introduction of ML/AI for health care, the lack of domain knowledge, inadequate datasets, and the complexities of everyday life have prevented many of these innovations from achieving real-world viability (Konam, 2022).
Given the lack of interest in long-term manual tracking (Wu et al., 2017; Gouveia et al., 2016; Arsand et al., 2007) and users' difficulties in engaging with reflection and sense-making (Katz et al., 2018a; Mamykina et al., 2015; Mamykina and Mynatt, 2007), our goal was to explore how these issues could be addressed with AI/ML. In this paper, we describe the outcomes of a series of interviews and six co-design workshops conducted over nine months with 15 people with T1D to identify opportunities for using machine learning within diabetes decision support systems that people with T1D would want to use. Research on chronic conditions within HCI focuses on tracking, reflection and sense-making with mobile technologies (Epstein et al., 2020; Desai et al., 2019; Kim et al., 2017; Raj et al., 2019a; Schroeder et al., 2019), and this work influenced the design of our study. However, despite initially framing the co-design workshops around everyday self-tracking needs, we had to shift their focus away from technologies for everyday use. Our participants stated that they did not wish to use additional technology in routine situations, instead preferring to rely on personal 'diabetes rules': heuristics developed through their own lived experiences. The context-dependent nature of these heuristics meant that they were often inadequate in unexpected situations such as illness or holidays, where participants might benefit more from technological support. In the ML/AI context, this highlights a tension between conventional approaches to machine learning, which rely on statistical analysis of large data sets, and people's desire to only engage with these systems within atypical contexts for which there is therefore little retrospective data. Moreover, it suggests that the current dominance in research and industry of approaches focusing on constant self-tracking for ML-enabled T1D decision support may be unwarranted.
Our work extends and supports existing research on T1D self-management. It confirms the use of personal heuristics and the unwillingness to engage in self-tracking, and highlights the need to support non-routine situations. Furthermore, it contributes to recent interdisciplinary Human-Centred ML discussions in HCI (Gillies et al., 2016; Jiang et al., 2021) by examining the distinguishing characteristics of different types of situations (routine vs non-routine, expected vs unexpected) and their specific challenges, and pointing towards appropriate solutions: from ensemble methods and expert systems, to anomaly detection and context-aware reminders.
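To make the anomaly-detection direction mentioned above concrete, the following is a minimal, purely illustrative sketch (not drawn from our study data): a reading is flagged as atypical when it deviates from a person's recent history by more than a z-score threshold. The function name, toy values, and the 2-sigma cut-off are hypothetical choices for illustration, not clinical guidance.

```python
# Purely illustrative sketch of z-score anomaly detection; all values,
# names, and the 2-sigma threshold are hypothetical, not clinical advice.
from statistics import mean, stdev

def is_atypical(history: list[float], reading: float, z: float = 2.0) -> bool:
    """Flag a BG reading that deviates from recent history by more than z SDs."""
    mu, sigma = mean(history), stdev(history)
    return abs(reading - mu) > z * sigma

history = [6.0, 6.5, 5.8, 6.2, 6.1]  # recent readings in mmol/L (toy data)
print(is_atypical(history, 12.0))    # far outside recent range
print(is_atypical(history, 6.3))     # within recent range
```

Such a detector illustrates the tension discussed above: it only works once a stable "typical" history exists, which is exactly what non-routine situations lack.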

Self-tracking and diabetes self-management
Self-tracking is a key practice in self-managing personal health and wellbeing. Prior work has documented self-tracking in a wide range of cases, including diabetes (Danesi et al., 2018; Kooiman et al., 2018; Mamykina et al., 2008), migraine (Schroeder et al., 2019), irritable bowel syndrome (Karkar et al., 2017), Parkinson's disease (Mishra et al., 2019), and multiple sclerosis (Ayobi et al., 2017). People with chronic health and wellbeing conditions typically intertwine documentary, goal-directed, and diagnostic self-tracking styles over time (Karkar et al., 2017; Schroeder et al., 2019) and develop personally meaningful self-care ecologies in creative ways, from using traditional tools, such as pencil and paper (Ayobi et al., 2018), to hacking emerging consumer health technologies (O'Kane et al., 2016c). However, people who engage in self-tracking report not only perceived benefits, such as self-awareness and a sense of control, but also describe adverse effects, experiencing ''pointless pressure'' (Ayobi et al., 2017, p.6) and questioning the purpose of self-tracking tools (Ng et al., 2018). These findings highlight the importance of human-centred and participatory approaches to designing data-driven technology.
People with T1D use a wide range of self-tracking (or self-monitoring) technologies. Since the early 1980s, portable blood glucose meters have supported T1D management through self-measurement of blood glucose, thereby facilitating real-time decisions on insulin dosages, hypoglycaemia treatment, dietary choices, exercise, and other lifestyle factors (Klonoff, 2012). Use and frequency of such monitoring has been correlated with decreased A1c (a measure of BG levels over time) and improved clinical outcomes (Klonoff, 2007). More recently, continuous glucose monitors (CGM) are being increasingly adopted for T1D management, offering alerts, a glanceable record of frequently sampled measurements, as well as predictions of the direction and rate of change (Pettus and Edelman, 2016). Another widely adopted technology is the insulin pump, which better simulates the body's natural production of insulin through near-continuous infusion. CGMs and pumps have demonstrated clinical benefits, including improved glycaemic management and reduction of hypoglycaemic events. However, they require motivated individuals who can maintain the device's operating requirements, interpret data and calculate insulin dosages for meals and corrections based on multiple factors (Pickup and Keen, 2002; Rodbard, 2016; Sun and Costello, 2017). More recently these devices have been combined into the closed-loop artificial pancreas, which offers users significant increases in time in range without increasing hyperglycaemia (Usoh et al., 2022). However, such systems are expensive and still require frequent user input and adjustment for complex lifestyle factors, with benefits mostly occurring during the night when such factors play a smaller role in glycaemic management (Brown et al., 2019).
Given the complexity of diabetes and associated data-driven decision making, diabetes apps are increasingly being developed to support personal care. Common features include tracking and visualising diverse data, contextual tags, digital photographs of meals, self-reflection and identification of trends, communication with a support team, remote monitoring, peer support, and integration with sensors to automate tracking (Klasnja and Pratt, 2012; O'Murchu and Sigfridsson, 2010; Owen et al., 2015; Smith et al., 2007). However, despite the myriad options for tracking and visualising, such apps are primarily tools for reflection and the user is left to make sense of their disparate data (Katz et al., 2018a; Raj et al., 2019b). This focus on tracking and reflection is a common approach in health and behaviour change apps (e.g. Ayobi et al., 2017, 2020; Stawarz et al., 2014, 2015), and in the context of diabetes assumes that supporting users in collecting diverse data will translate to improved diabetes management. Yet, the efficacy of this approach, especially without additional clinical support, remains unclear. Encouraging longer-term adoption of apps remains challenging (Maniam and Dhillon, 2015) and studies of existing T1D apps have shown only modest clinical benefits (Wu et al., 2017). In addition, such apps rely on motivation for continued engagement in intensive data collection, identification of patterns, and intentional modification of behaviours (Gouveia et al., 2016)-even though it is difficult to maintain regular long-term logging practices (Arsand et al., 2007) and interpreting such multivariate data is often challenging (Katz et al., 2018a; Mamykina et al., 2015). Therefore, further research is required to design methods that will better support users in their diabetes decision-making.

Personal nature of diabetes management
While monitoring is an essential aspect of diabetes management, its main benefit is the potential of the resulting data to inform and bring about better treatment decisions (Klonoff, 2007). However, personal diabetes data can be complex and challenging for people with diabetes to interpret and apply (Mamykina and Mynatt, 2007). Mamykina et al. (2015) have devised the Sensemaking theoretical framework, which proposes a dynamic interaction between two modes of daily diabetes management: habitual and sensemaking. Their framework also presents three key stages of feedback loop decision-making that occurs in both modes: perceiving new information related to the condition, understanding this information, and acting based upon this information. Sensemaking behaviours are typically triggered when the individual notes a 'gap', for instance, an unexplained out-of-range BG level. As a result, the new information does not fit into established self-care mental models (or heuristics) and the individual must engage in effortful thinking and then experiment with new behaviours (sensemaking). The sensemaking framework (Mamykina et al., 2015) asserts that these modes are complementary and interdependent: the ability to operate predominantly in the habitual mode is important for sustainable self-care, while cognitively demanding sensemaking is essential for learning the cause and effect relationships which form the basis for new heuristics that support effective habitual action.
Yet, while effective decision-making is essential for diabetes management, the relatively small number of people who meet clinical recommendations (Miller et al., 2015) suggests that individuals face significant barriers in establishing and applying practical and effective care heuristics. Among the challenges is the need to account for shifting contexts and hard-to-predict biological responses related to factors such as exercise, illness or delays in insulin absorption (Peyser et al., 2014). In addition, diabetes is not a static condition, as disease needs are constantly changing and evolving over time (Klasnja et al., 2015), influencing self-management practices (O'Kane et al., 2013). Furthermore, presentations of data can reinforce biases rather than leading to actual insights (Mamykina and Mynatt, 2007). Therefore, while there has been significant progress in diabetes technologies, there is still need for further development, especially for decision support systems that can assist in reducing the cognitive effort of sensemaking, such as pattern recognition and non-obvious correlation discovery (Katz et al., 2018a; Mamykina et al., 2015), in a manner that is compatible with the lived experience and stresses of daily life.

Challenges in developing human-centred AI/ML decision support systems
Artificial intelligence (AI) approaches have become increasingly popular for automated, scalable and affordable knowledge discovery and reasoning. Machine learning is the area of AI that involves the use of algorithms that improve (or learn) through data (or experiences) (Flach, 2012). Machine learning tasks are typically categorised as regression, classification and clustering, which fall under the main paradigms of machine learning techniques: supervised learning, where training data consisting of labelled pairs of input and desired output is used to build a predictive model; unsupervised learning, where the system seeks to discover emergent patterns from unlabelled training data; and reinforcement learning where an agent acts autonomously upon the environment through trial and error while attempting to maximise a reward. Expert systems are an alternative approach for decision support systems. Unlike machine learning, they primarily consist of predefined if-then rules derived from domain knowledge rather than learning models from data sets. Expert systems offer the advantage of increasing explainability-the ability for a human to understand how decisions have been made, which is an ongoing challenge for 'black box' ML applications (Rudin, 2019).
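The if-then character of an expert system, and the explainability it affords, can be illustrated with a toy sketch. All thresholds, rule names, and advice strings below are hypothetical placeholders of our own, not clinical guidance: the point is only that each recommendation carries the rule that produced it, the kind of traceability that 'black box' ML models lack.

```python
# Toy expert-system sketch with hypothetical thresholds and advice
# strings (illustration only, not clinical guidance). Each decision
# returns the rule that fired, making the reasoning inspectable.

def advise(bg_mmol_l: float) -> tuple[str, str]:
    """Map a BG reading (mmol/L) to (advice, rule_fired)."""
    if bg_mmol_l < 4.0:
        return ("treat possible hypoglycaemia", "R1: bg < 4.0")
    if bg_mmol_l > 10.0:
        return ("consider a correction", "R2: bg > 10.0")
    return ("no action suggested", "R3: 4.0 <= bg <= 10.0")

print(advise(3.5))
```

Because the rule base is authored rather than learned, such a system can also operate in situations for which no personal data exist, at the cost of the personalisation that data-driven models provide.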
AI techniques have frequently been applied in healthcare, including diverse diabetes-specific tasks such as blood glucose prediction, hypo-/hyperglycaemia prediction, BG variability detection, controllers for insulin-based diabetes therapy, and lifestyle support (Donsa et al., 2015; Sowah et al., 2020; Tyler and Jacobs, 2020). Other work has focused on developing models for predicting hypoglycaemia (Plis et al., 2014) and using reinforcement learning for BG prediction (Yamagata et al., 2020). There has also been more exploratory work derived from qualitative research looking at how AI methods could provide personal diabetes decision support. For example, Mamykina et al. (2017) proposed a framework for personal discovery of cause-and-effect relationships in diabetes self-management, suggesting automated approaches for supporting feature selection, hypothesis formulation, hypothesis evaluation, and goal specification. However, despite promising attempts to expand AI techniques to support diabetes management, there are few research prototypes and commercial products that use ML to support T1D self-management (Forlenza, 2019), with little evidence of their usefulness and usability.
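As a minimal illustration of how hypoglycaemia prediction can be framed as supervised learning, the sketch below classifies a new situation by its nearest labelled training example. The feature choice (current BG, recent trend), the invented toy data, and the 1-nearest-neighbour rule are our own hypothetical example, not a description of any cited system.

```python
# Hypothetical sketch: hypoglycaemia prediction as 1-nearest-neighbour
# classification over (current BG, recent trend) feature vectors.
# All training examples are invented toy values, not clinical data.

def predict_hypo(train, query):
    """train: list of ((bg, trend), is_hypo_soon) pairs; returns the
    label of the closest training example (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: sq_dist(ex[0], query))[1]

train = [((3.8, -0.5), True), ((7.0, 0.1), False),
         ((4.5, -1.0), True), ((9.0, 0.4), False)]
print(predict_hypo(train, (4.0, -0.8)))
```

Even this toy version makes visible the dependency discussed throughout the paper: prediction quality rests entirely on having labelled examples that resemble the current situation.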
It is important to note that the dynamic, idiosyncratic, and potentially fatal implications of faulty diabetes decision-making provide significant challenges to automating diabetes decision support. Firstly, not infringing on user autonomy is an important principle of medical ethics (Beauchamp et al., 2001) and should therefore be considered a crucial element of decision support systems (Meredith and Arnott, 2003). Respecting user autonomy implies that people should be capable of not only using such systems, but also coping when they fail, either technically or because the situation is novel and there is insufficient data to make a recommendation. This requires methods for both ensuring that this transition can be handled smoothly, and that the user does not lose the ability to self-manage during such events. Secondly, systems can also function as designed and still provide incorrect recommendations. For example, machine learning depends on training data, which can reflect or contain biases, and missing or incomplete data resulting from unequal access to care and technology can reinforce socio-economic disparities (Gianfrancesco et al., 2018). Given that decision support systems should accurately reflect the personal values of the user (Ariail et al., 2015), unknown system biases pose further unresolved barriers. Finally, such systems rely on personal data which depends on user engagement. A recent paper on the use of decision support systems for insulin dose suggestions noted an important caveat: that these systems were reliant on continual app-based data entry, which is uncommon outside research settings (Forlenza, 2019). This poses significant challenges as frequent engagement with personal data is needed to inform actions, although the stressful nature of such interactions can lead to disengagement.
For example, such data entry can challenge established beliefs, demand undesired actions, reveal unsatisfactory progress, or bring about negative emotional effects (Chang et al., 2017a). It can also be boring and burdensome (Arsand et al., 2007). These human-computer interaction challenges in developing diabetes decision support systems demonstrate the need to develop comprehensive human-centred approaches for the integration of AI methods into pervasive digital health systems.

Towards co-designing AI/ML decision support systems with people with T1D
The importance of involving end-users in the development of AI/ML decision support systems is well supported among the HCI community. In order to build systems that people can use effectively, it is critical to understand how they wish to interact with such systems, identify key obstacles, and challenge designer assumptions (Amershi et al., 2014). Diverse user-centred design methodologies have often been applied to diabetes technologies, with researchers reporting that they helped to answer research questions and understand users' mental models (LeRouge and Wickramasinghe, 2013). For example, Kanstrup et al. (2010) used interviews, workshops and explorations to develop software and services to support living with diabetes. The resulting prototype was significantly different from the researchers' initial concept of an AI ''GPS-style'' navigation system. Participants made it clear that they did not want a system that would tell them what to do, but rather a way of making better informed decisions. Arsand et al. (2012) also applied participatory design techniques to develop a mobile phone app that essentially acted as a diabetes diary. Lessons learned from participants included the importance of automating data entry, integration with additional sensors, and contextual sensitivity. Finally, McCarthy et al. (2017) used participatory design techniques in a workshop with people with T1D, who were teamed with designers to explore how BG monitoring devices could be re-designed to address stigma related to public use. Strategies such as disguising monitors to look like non-medical items, increasing brand identity, and personalisation were explored to gain insights into such potential approaches. Overall, engaging users is essential, as it can help to develop solutions that go beyond simply reinforcing existing hierarchical compliance-based relationships in healthcare by allowing the users to adapt technologies to their own needs (Jones et al., 2017). 
However, while people with T1D have been involved in developing technologies (e.g. McCarthy et al., 2017), the research on their involvement in the creation of ML/AI based solutions is limited.
While the initial research streams tended to focus on the feasibility and optimisation of ML models (Maniruzzaman et al., 2018), recent work has demonstrated the potential of human-centred ML systems (Mitchell et al., 2022). It has become clear that a detailed understanding of people's data collection and prediction needs is crucial in informing the design of ML systems: a lack of domain knowledge and end-user needs can cause cascading data issues and amplify potential harmful effects from the outset and throughout ML development lifecycles (Sambasivan et al., 2021). Participatory research methods have been highlighted as particularly promising in informing the design of ML-based systems (Loi et al., 2018), as these inherently human-centred approaches can help embody key principles, such as gaining empathy and fostering shared decision making, mutual learning, and collective creativity. For example, co-design research on ML/AI outside of the diabetes context has successfully investigated concepts such as explainability and trust (e.g. Wang et al., 2019; Zicari et al., 2021). Therefore, through a series of interviews and co-design workshops, we contribute a human-centred and collaboratively-constructed understanding of people's decision support needs in diabetes care and implications for the design of AI/ML-based technologies.

Methods
The overall aim of our project was to involve people with T1D in the design of decision support systems to better understand their data needs. While we were interested in participants' thoughts about ML-based systems, our main goal was to understand their needs first and let them drive the design of a potential decision-support system. As such, we used co-design, as this is an approach to designing with-not for-the participants (McKercher, 2020). This is a practice where people collaborate to connect their knowledge, skills and resources to create potential solutions together (Zamenopoulos and Alexiou, 2018). Design workshops and reflecting on one's own experiences are a common approach (e.g. Harrington et al., 2018; Marent et al., 2018) to facilitating this co-creation.
Furthermore, we decided to focus on self-tracking as a starting point, as this is an approach known and often used by people with T1D (Danesi et al., 2018; Kooiman et al., 2018; Mamykina et al., 2008). Self-tracking is an inherently data-driven practice and offers significant potential for leveraging AI-based approaches, as recent work by Mamykina et al. (2022) on personal informatics and AI suggests. Based on this understanding, our objective was to develop a detailed understanding of people's self-tracking practices and preferences to collaboratively inform the design of desirable ML-based decision support systems.

Participants
Fifteen UK participants with T1D interested in diabetes technology were recruited using social media, word-of-mouth and leaflets in grocery stores, cafes, libraries and other public spaces that allowed us to reach a wide range of potential participants. Inclusion criteria were as follows: using blood glucose meters, being on Multiple Daily Injections (MDI) therapy, regularly self-adjusting analogue insulin dosages for different situations, using one of the common brands of insulin (Humalog, NovoRapid, Apidra, Lantus, Levemir, Tresiba or Fiasp), not taking any other blood glucose-lowering medication, and having an iPhone 5S or later. Participants were 24-69 years old (average = 36.4 years, SD = 11.4); 11 were men. They had been living with T1D for 3-34 years (average = 17.3 years, SD = 9). See Table 1 for more details.

Procedures
Each participant attended the initial interview and later participated in a series of monthly design workshops (or telephone interviews if they could not attend). Similar to Marent et al. (2018), we combined semi-structured interviews with co-design activities to support participants in sharing their personal views, experiences, and needs throughout this research project. The study received ethical approval from the Faculty's Ethics Committee at the last author's institution.

Initial interview
To help build rapport and gather background information that would inform the activities in the first workshop, each participant attended a one-to-one interview at the beginning of the study. The interviews took place at the University and lasted 60-90 min. The session started with an overview of the study and participants were able to ask any questions, after which we collected their consent. Next, participants were interviewed about their experience with diabetes, their daily routines, any health tracking and diabetes technologies they used, as well as their attitudes towards and perceptions of the use of artificial intelligence and machine learning for diabetes support. After the interview, a researcher walked the participant through the Dexcom sign-up process to ensure they had access to a CGM during the study. The participant then received an Apple Watch and signed the lease document that stated they would be able to keep it if they completed the study. Finally, the researcher guided the participant through the registration process for our industry partner's app. Participants were asked to use the Apple Watch and the app as much or as little as they wanted, but to at least try out some of their features as it would inform co-design activities (Harrington et al., 2018).

Notes to Table 1: (a) If participants were unable to attend a workshop, they were invited to participate in a phone interview instead; phone interviews are included in attendance numbers. (b) Seven participants missed the first workshop as they joined the study just before or soon after it took place; the topics covered during the first workshop were included in their initial interview to ensure all participants contributed.

Design workshops
In total, we conducted six workshops between August 2019 and February 2020. There were four 2-hour evening workshops and two 4-hour weekend workshops. Participants were asked to attend as many as possible and the workshop dates were provided in advance and were printed on the information sheet. Participants unable to attend a workshop were contacted over the phone to discuss the topics planned for that session; phone interviews lasted 15-30 minutes.
As we wanted the participants to drive the co-design process, we decided to start with a familiar topic to make it easier for them to engage with the workshops. Therefore, the first workshop (W1) focused on participants' current use of technology for supporting diabetes self-management, related challenges and their data needs, and generating ideas to deal with these challenges. The content of the later workshops (W2-W6) was determined by findings from earlier workshops. As a result, we ended up discussing participants' experiences with self-tracking apps (to encourage participants to think about different types of data and ways of representing it), identifying specific heuristics and situations in which they may not work, and designing decision-making apps. In particular, the two weekend workshops (W3 and W6) focused on hands-on activities and creating paper prototypes. The final workshop focused on dealing with non-routine situations-a topic that was purely driven by our co-designers who in earlier workshops flagged its importance (see Fig. 1).
In addition, Workshop 5 included a presentation from a machine learning expert (one of the co-authors), who then facilitated a brainstorming session on supporting decision-making with algorithms. We had planned from the start to cover ML, but did not know which workshop would focus on it; we wanted to run this session after participants had designed their own decision-support systems, to gain a better understanding of their needs and of what they believed technology could do. The ML presentation covered an overview of machine learning algorithms and how they are currently used to predict BG levels, and an introduction to reinforcement learning with examples of how it is used in games to support decision-making. Reflections on our process and the ML session are available in Ayobi et al. (2021a,b).
We developed this series of co-design workshops according to best practice guidance on participatory research and co-design. For example, Harrington et al. (2018) demonstrated the benefits of experience-based co-design approaches: encouraging older adults to use digital technology helped elicit detailed feedback and inform the design of novel features. In this vein, we provided a set of digital tools to support participants in developing shared experiences and in articulating their personal needs as part of co-design activities. (Participants who were unable to attend workshops were invited to telephone interviews to discuss the topics covered at the sessions; we did not schedule phone calls after the hands-on weekend workshops, W3 and W6.) For example, participants received a 12-month Dexcom G6 CGM sensor subscription, which was not covered by the UK's National Health Service (NHS) at the time of the study. While their active involvement lasted about nine months in total, they were able to use the full Dexcom subscription. In addition, we leased each of them an Apple Watch so that they could monitor in one place their Dexcom data, physical activity, heart rate, sleep, location, and step count during the study period. Participants who completed the study were able to keep the smartwatch. Throughout the study, participants actively and autonomously used both devices, which informed the brainstorming sessions and their designs, but neither was part of the formal data collection. They also used our industry partner's app for a few days to prepare for the first two workshops, but none found it personally useful, although they did refer to it and its functionality throughout the study. The app was used as an example of a decision support tool, although at the time it was an early prototype and did not offer any ML-supported functionality; participants were aware of these limitations. Furthermore, in preparation for Workshop 4, we asked participants to use Trackly, an app inspired by paper journaling practices that enables users to design their own trackers, for at least a week to help them identify things they would like to track and how (Harrington et al., 2018). We used the data shared by participants to create charts (see Fig. 2) that helped to visualise their data using labels and categories they defined. The charts were used as props during the discussion on decision-making and their data needs.

Analysis
The interviews were audio recorded and workshops were video recorded to aid transcription. Any design outputs (e.g. sketches, paper prototypes) produced during the workshops were photographed and archived. They were not analysed per se, but served as starting points for group discussions at the end of each session and were referred back to at later workshops; these discussions were included in the analysed transcripts. Participants' personal information was anonymised in the transcripts used for the analysis. We did not analyse the Dexcom or Apple Watch data; these tools were meant to encourage participants to think about their data and how they use it, but were not part of data collection.
Interviews and workshop group discussions were transcribed verbatim for analysis. We conducted reflexive thematic analysis (RTA) following the guidance from Braun and Clarke (2006) and Clarke and Braun (2021), with a primary coder and ongoing discussions within the research team (c.f. McDonald et al., 2019), acknowledging the subjective and interpretative role of the researcher in this type of analysis (Clarke and Braun, 2021; Gough and Madill, 2012). We focused on an inductive approach, although we kept in mind the overall goal of the study.
Initial interviews and workshops (with their related phone interviews) were analysed separately but followed similar procedures. While there were specific topics we were interested in, like personal heuristics, we did not use a predefined code book. Instead, we followed a bottom-up approach, with the first author reading all transcripts as they became available and creating memos. The research team met once a week for the duration of the project, and the findings from the interviews, workshops and reading of the transcripts were regularly discussed. The transcripts were read by the first author, who then outlined a provisional coding guide for the initial interviews and a separate one for the workshops and telephone interviews to help build an understanding of the data and support the development of themes. These initial codes were discussed with the rest of the research team and amended where necessary. Then, the first author coded all transcripts in NVivo software using the guides as the starting point, while still being open to identifying new codes. Finally, the content of each code was summarised and once again discussed and iterated on by the team, which led to the identification of three major themes: personal heuristics in a changing context (focused on everyday diabetes management), coping with unexpected situations (focused on issues and strategies our participants developed), and collaboration with a decision support system (focused on participants' suggestions on how technology could help them in different contexts). The flexibility of RTA as our chosen approach meant that we were then able to explore further the intricacies of varying information needs after these themes were defined.
Given that participants as co-designers emphasised the importance of routine and non-routine situations (which led to this being the main topic of the final workshop), and talked about expected and unexpected situations, these distinctions were visible within the initial themes. Therefore, we decided to use this lens to reflect on the themes, especially in the context of potential ML-based decision support systems. Further discussions within the research team led to outlining four key types of situations people with T1D deal with that highlight different challenges and require a different technological approach.

Findings
Although the initial co-design focus was on using personal data for ML-enhanced technologies, it was clear that everyday T1D self-care practices did not need a heavy technology intervention. Instead, the workshops confirmed that the personal nature of diabetes self-management and different information needs lead to the development of personal heuristics (Mamykina et al., 2015), and because of changing and complex contexts, these heuristics can be inadequate for guiding situated actions. These findings provided the context in which participants co-designed their solutions and discussed opportunities for decision-support systems. Together with our participants we identified four broad types of situations that require a distinct approach:

• Unexpected routine situations: situations familiar to the participant that happen unexpectedly and were not anticipated, e.g. late lunch;
• Unexpected non-routine situations: unfamiliar and unexpected situations that cannot be addressed by known, routine approaches, e.g. illness;
• Expected non-routine situations: situations that are unusual or unfamiliar, but were anticipated, e.g. travel;
• Expected routine situations: everyday routine situations participants are familiar with.
This classification is neither exhaustive nor prescriptive, given people's unique lived experience and prior experiences in daily life. However, it offers a useful lens, as the different types of situations highlight a range of specific challenges that need to be considered in the context of potential AI/ML decision support systems. Each would require a different level of technological support, including no need for everyday decision support technologies when dealing with expected routine situations. In the following sections we describe these situations together with participants' current strategies and design ideas. They are also summarised in Fig. 3.
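To make the typology concrete, the four situation types and the differing levels of support they call for could be represented in software roughly as follows. This is a speculative sketch: the names (`Situation`, `support_level`) and the mappings are ours, loosely anticipating the AI approaches discussed later, not a system built in the study.

```python
from enum import Enum

class Situation(Enum):
    """The four broad situation types identified with participants."""
    UNEXPECTED_ROUTINE = ("unexpected", "routine")          # e.g. late lunch
    UNEXPECTED_NON_ROUTINE = ("unexpected", "non-routine")  # e.g. illness
    EXPECTED_NON_ROUTINE = ("expected", "non-routine")      # e.g. travel
    EXPECTED_ROUTINE = ("expected", "routine")              # everyday management

def support_level(situation):
    """Hypothetical routing of decision support by situation type."""
    return {
        Situation.UNEXPECTED_ROUTINE: "notify on anomalies; reuse known heuristics",
        Situation.UNEXPECTED_NON_ROUTINE: "collaborative dialogue; retrospective logging",
        Situation.EXPECTED_NON_ROUTINE: "planning aids; recommender/expert systems",
        Situation.EXPECTED_ROUTINE: "no active support needed",
    }[situation]
```

The key design point the mapping captures is that only some quadrants warrant active intervention; expected routine situations deliberately map to no support at all.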

Decision-making in unexpected routine situations
Our participants identified two types of unexpected situations: changes to the routine that led to familiar events taking place at the wrong time or in the wrong order (e.g. having to take the car to work instead of cycling, eating lunch at a different time), and unexpected events that were new (e.g. illness; described in the next section). Addressing such situations was seen as an opportunity for technology to provide unique support that is currently not available.
When brainstorming ideas for future decision-support systems, participants agreed that the most useful solution would be a smart and collaborative system that would help them cope with unexpected situations. While they generally agreed that manual tracking was cumbersome (which was to be expected, as people rarely engage in long-term data tracking; Arsand et al., 2007), they still expressed willingness to track data periodically. They believed that this way they could teach a potential app about their personal heuristics so it could be applied to similar unexpected situations in the future, providing value for their effort.
Participants also reported that, with no technological support, they had developed coping strategies. Literature on resilient strategies (Furniss et al., 2011, 2012, 2014) distinguishes several such strategies, including contingency planning and routine adjustments. Our participants reported such mechanisms, e.g. making sure to have snacks at hand in case they were unable to eat lunch at the usual time:

Where I work, we get to take our lunches at a convenient time. So, one day I might take my lunch [...]

Another way of coping with unexpected situations was limiting variation and trying to make things as similar as possible. For example, participants reported eating the same foods at the same times to avoid dealing with unknown menus, limiting their workouts to the same types of exercises every day to make insulin calculations easier and to avoid surprises in case they accidentally overexerted themselves, and so on. One participant reported significant advance planning. However, participants were not always able to prepare for the uncertainties of everyday life. Eating out was mentioned as an example of a situation where things could go differently than expected. For example, participants reported that it was sometimes difficult to time their pre-meal insulin doses due to variable meal waiting times. They also found it hard to assess the amount of carbs in meals or sugar in drinks, with the latter being potentially dangerous. Participants' accounts highlighted above suggest that the heuristics on which they rely in unexpected routine situations are weak, although individuals can draw on their regular experience to navigate the situation, e.g. by having a backup snack. One challenge for AI/ML systems in this context is that, as these situations are considered routine, users might lack the motivation and awareness to document such events, which could lead to limited data on which to base any predictions.
They may also not have the time during the event itself to collect data needed to train the decision support algorithms. However, supporting decision-making is even more complex in unexpected non-routine situations.

Decision-making in unexpected non-routine situations
A separate type of unexpected situation includes one-off occurrences, errors or mistakes. These non-routine situations are defined by encountering unexpected adversity within novel contexts and may require immediate action. For example, P2 reported going on a trip and forgetting his insulin kit at home, which meant he had to avoid carbohydrate consumption all day. While in the reported case the error led only to annoyance and disappointment (''I was planning on having chips and ice cream''), such situations can be potentially dangerous. For example, he also reported confusing the dosage when injecting insulin, which can have serious consequences. Being ill was frequently mentioned as a unique situation that was hard to cope with. Participants agreed that illness was unpredictable: there were usually too many factors involved to make sense of them. For example, a common cold or flu can result in different symptoms each time, which they said influenced their blood sugar levels to a different extent each time. While seasonal illness could be predictable, its specific effects are not. As a result, some participants would not manage their BG levels when ill; their situational coping mechanism was avoidance:

When I'm ill with a cold my blood glucose level is higher than usual and, as I was saying, you just kind of... I don't... I'm not so hot about keeping on top of my blood glucose level because I spend those three weeks thinking 'oh, it's a bit high but when I'm better I'll get back on top of things'. -P2, W5
Another participant reported ignoring their regular rules during illness. The unpredictable nature of an illness not only made it difficult to act in accordance with their routine heuristics, but could also adversely impact self-management, as this behaviour could influence health even once the illness had passed:

If it's a prolonged illness [you'd change] your background insulin or even being a bit more lax with your rules, like letting your rules kind of just slide a little bit because you're poorly. [...] I've had a cold for three weeks, like two or three weeks, so my new background insulin is like the norm now, but then as I get better [...] I'll start having hypos because I'm having too much background [insulin]. -P4, W5
Participants stated that the most useful decision support technology would help them cope with such unexpected situations. For example, Fig. 4 shows that in case of illness, participants would be keen to report data retrospectively to teach the app what their data looks like when they are unwell, with the hope that it would be useful in the future. They also asserted that technology would have to distinguish between data collected in routine vs non-routine situations, to ensure that they received relevant support. The quote below describes how one group of participants envisaged such a system:

Looking at the specific scenarios, say, illness, you'd be at work, you'd start to feel ill but you're not really sure whether you are getting ill, you won't know until the next morning usually, and then you get to the next morning and you wake up and then yes, you're definitely ill now. Then you would tell the app you're ill, and then you can work back in retrospect, so you can tell it when you started feeling ill and then it would segment that data separately to your regular data, so then it's not collecting that data as a whole, it's collecting it like, 'This is your illness data', 'This is your holiday data', so it won't work the same as your basic data, so it's not drawing information based on when you're ill, and the algorithm learning from that, you're giving it different contexts so it deals with it differently. -P5, W6

Surprisingly, participants also accepted that the technology might offer faulty recommendations when confronted with novel situations. For example, in one of their storyboards (see Fig. 5) presented during Workshop 3, they described a scenario in which such a mistake took place:

P5: So, for this [scenario] you get up in the morning, eat your breakfast. Chuck it in the app and take insulin based on that, and you've already planned to go to the gym. So, then you've cycled to the gym. Then as you're cycling you get an alarm -you get hypo and then you have to re-adjust [because the app gave you a wrong insulin dose suggestion].

P13: It could be seen as not working but working because if this is happening when the app is kind of getting to know you, then it could be taking on this information and if you go through the same scenario again at a later date, then it's kind of learnt and can maybe make suggestions about how many carbs you might want to think about taking before going to the gym to avoid having a hypo. Part of the learning process.
During the same workshop (W3), participants agreed that such learning would be necessary to improve the algorithms and help to build trust:

P13: I guess you've gotta build a bit of trust and confidence in it all the time.

The learning process could also be seen as a collaboration, where the user would input some information along with automatically collected data, and when the technology provided suggestions, the user would be able to approve or reject them, which could address the lack of regular data input and be useful in unexpected or non-routine situations. Similar to participants from the study by Kanstrup et al. (2010), our participants did not want to be simply told what to do. Instead, they believed that this type of dialogue would be more useful as it could also help to verify advice, further build trust, and provide additional information related to unexpected situations:

It might be suggesting doing something and you're saying, 'No because I'm on holiday and I'm going to eat something completely different'. Or, 'No, I'm in a different time zone so things are changing'. There might be more reasons. Or, 'No, because I'm ill, so I'm actually going to do this differently instead. My background insulin is going up because I'm ill and I need more'. You might use the 'no' reasons more than that, I guess. I guess that could then build a profile based on whether you're ill, or on holiday. -P3, W6
As a result, most participants agreed that they would be willing to go through an intensive 'getting to know you' phase where they would be asked to manually track various factors to provide a reliable baseline for the technology, particularly for classifying unexpected situations. However, during a W5 discussion, they acknowledged that there would inevitably be some missing data and therefore the ability to respond to suggestions and agree/disagree with them was key:

P3: Yeah, so the quality of the data you put in. But again, I think at the end of the day almost you could almost sort of say 'oh, I was a bit rubbish today, I didn't tell you much today' or 'I've been really good, I've told you everything today'. It's almost like... so you could almost say 'oh, yeah, this day isn't very good to use as an example' or 'this day is a pretty good day', so you can almost just like say 'just forget about some data' maybe.
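The retrospective context-segmentation and 'forget about some data' ideas voiced by P5 and P3 can be sketched as a simple data structure: readings default to the regular context, can be re-labelled after the fact (e.g. once the user knows they were ill), and can be down-weighted when the user flags a day as unreliable. This is a hypothetical illustration; all names (`Reading`, `ContextLog`, the `quality` field) are our own, not part of any existing app.

```python
from dataclasses import dataclass, field

@dataclass
class Reading:
    """A single logged data point (time in minutes, BG in mmol/L)."""
    time: int
    bg: float
    context: str = "regular"   # e.g. "regular", "illness", "holiday"
    quality: float = 1.0       # user-assigned weight; 0 = "forget this data"

@dataclass
class ContextLog:
    """Stores readings and supports retrospective re-labelling."""
    readings: list = field(default_factory=list)

    def add(self, time, bg):
        self.readings.append(Reading(time, bg))

    def relabel(self, start, end, context):
        # Called when the user reports, e.g., "I started feeling ill
        # yesterday": readings in [start, end] are segmented away from
        # the regular data so the model treats them separately.
        for r in self.readings:
            if start <= r.time <= end:
                r.context = context

    def downweight(self, start, end, quality):
        # "This day isn't very good to use as an example."
        for r in self.readings:
            if start <= r.time <= end:
                r.quality = quality

    def segment(self, context):
        # Only same-context data would feed a given model.
        return [r for r in self.readings if r.context == context]
```

The design choice this encodes is that the algorithm never learns from illness or holiday data as if it were baseline data, which is exactly the separation participants asked for.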
Unexpected situations are potentially stressful and in some cases hazardous. This is especially true when the user is unaware that such an event has occurred. As a result, the individual has limited or no experience to call upon, and the urgency of the situation limits the ability to learn through trial and error. The stress of the situation might also impair rational analysis. As such, these situations pose challenges to the application of stage-based frameworks for self-discovery in data (e.g. Li et al., 2010), or the ability to rationally reflect and then construct and apply hypotheses (Mamykina et al., 2015). In addition, the novelty of the situation suggests that ML methods will be limited by the paucity of personal data to use as a reference or for algorithmic analysis.

Decision-making in expected non-routine situations
Participants distinguished between everyday routines (which we discuss in the next section) and less frequent but expected/planned events, e.g. weekend routines, visiting parents or holidays. Travel was a unique case as it was often planned and anticipated, but at the same time also a source of significant disruption, especially when going to a new destination. Travelling with a chronic illness is challenging (Ramanayake et al., 2019) and indeed, participants agreed that even when routinely going to the same location, changes in time zone, cuisine, access to shops, medical supplies, and weather could impact BG levels, requiring changes in insulin doses and timing. However, as they broadly knew what to expect, they had mental models in place to help them deal with these situations.
[On holidays] I usually tend to make sure that I'm eating the right things which will keep my blood sugar stable. Always making sure that I have bits of food on me that I can eat at any time really. Also maybe adjusting doses of background insulin, if I'm going to be out and about exploring, because I don't tend to go on many holidays where I just sit at the pool or the beach. -P6, phone interview in lieu of W5

In general, participants agreed that while some types of situations were expected and in principle familiar, they could still result in surprises. This included situations where they could anticipate and prepare for variations to their routines, for example planning for travelling abroad, especially to a country with a significant time zone difference and unfamiliar foods. Their sketches and designs reflected the need for context-relevant advice or timely notifications in such cases. For example, when discussing travel, participants thought that location services could be used to automatically recognise if they were in an airport (see Fig. 4) or whether their time zone had changed, which could then adjust any other decision-support features. Events such as Christmas or visiting family were also mentioned as examples of situations that could have unexpected results, often caused by irregular scheduling, unusually large meals, and other activities that might impact diabetes management.
A lot of the stuff you eat [during Christmas] tends to be quite sugary. You're not doing much exercise, you're usually sat down for a large amount of the holidays. So, the way you're talking about routine, Christmas is a bit of a blip in a routine really. For me anyway, it's still trying to look at labels and just dose accordingly and sometimes you get it wrong because there's other things at play like not being active is going to have an effect as well. -P6, phone interview in lieu of W5

Such expected non-routine situations are by definition infrequent. While an individual might know they are coming, they have limited recent experience to guide their actions and therefore will need to develop new heuristics. This infrequency also results in scarce recent retrospective personal data to be analysed.
Furthermore, participants reported habitual processes where they were unable to specify what justified their behaviour or decision-making, and therefore were not sure if technology could help in any way at all. These habits often manifested in situations where participants admitted actions that they knew were not good for them. For example, P11 admitted being ''a chronic over-corrector'' and ''always'' having ''hypo[s] from over-correcting'', which highlighted a specific pattern in her behaviour: eating carbs straight away to increase BG levels, but then realising she had overeaten, which required bringing BG back down with an additional insulin injection, which in turn led to hypoglycaemia. This habitual over-correction frequently led to a 'yo-yo' effect where BG levels repeatedly went up and down. This pattern contrasted with P8, who reported not immediately correcting, waiting to see what would happen, and then acting later. In both cases, participants reported habitual behaviours despite recognising their counter-productive aspects. Systems able to recognise such repetitive patterns in adverse habitual behaviours could not only foster constructive self-reflection, but also potentially recommend strategies for behaviour change.
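As a sketch of what recognising such a 'yo-yo' pattern might involve, a system could count hyper-to-hypo swings in a day's readings and surface the count for reflection. The function name and the thresholds below (3.9 and 10.0 mmol/L are commonly used hypo/hyper bounds) are our illustrative assumptions, not values from the study.

```python
def count_yoyo_cycles(readings, low=3.9, high=10.0):
    """Count hyper-then-hypo swings in a sequence of BG readings
    (mmol/L), a possible signature of habitual over-correction.

    A cycle is counted each time a reading above `high` is later
    followed by a reading below `low`.
    """
    cycles = 0
    seen_high = False
    for bg in readings:
        if bg > high:
            seen_high = True
        elif bg < low and seen_high:
            cycles += 1
            seen_high = False  # reset; look for the next swing
    return cycles
```

A system could flag days where this count is repeatedly high, prompting reflection on the correction habit rather than prescribing a dose.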
In addition, while participants reported that they had specific 'diabetes rules' they followed (e.g. ''if sugar level high, take insulin'', ''if sugar level low, eat carbs'', ''if drinking alcohol, eat in advance'', ''if returning home late, always eat before bed, especially if you were drinking''; see Fig. 6), when probed further at the subsequent workshop it became apparent that these rules and routines were heavily context-dependent. Furthermore, there were always exceptions, showing that their rules were not always true or adequate, as they were dependent on multiple factors. This showed that while in general participants knew what to expect, sometimes there were too many factors to account for, and so the rules had to be simplified. However, participants were still adamant that they would not want technology to help with such routine events. Their attitude was similar in the context of expected non-routine situations. However, as these situations are expected, there are opportunities for expert systems to help with planning. Although there may not be data for traditional approaches to machine learning, recommender systems that people could use to plan self-management activities, for instance for a big trip or holiday, might still be useful, albeit not as a real-time intervention.

Decision-making in expected routine situations
Finally, when talking about potential decision support systems, participants made it clear that they did not need technology to help with routine everyday decisions. They already had clear rules (see Fig. 6) and established routines to guide their behaviour in familiar situations, e.g. calculating insulin dosage for the same breakfast every day, knowing how high their BG level must be before commuting to work or going about their usual day at work.

I don't do anything at work, I literally sit down at a computer for eight hours, very little movement and so I have my lunch, I have a salad for lunch, no carbs in that anyway, so generally don't do an injection with that, so work is pretty simple to control, it's just every other bit of living is awkward. -P14, initial interview
This was particularly true for participants who had lived with T1D for a while. Participants generally dismissed more explicit technology-based decision support and from the very first workshop (see Fig. 7) expressed their lack of interest in regular, manual tracking of complex contextual data. From the technology perspective, this would make it difficult to identify trends within diabetes data, as with scarce data even expert analysis is not always reliable (Mamykina et al., 2016).
Overall, given the presence of specific and trusted heuristics that allowed them to manage their diabetes in most everyday situations, our participants were generally indifferent towards new technologies in the context of expected or routine situations. However, they did express interest in systems that could collaboratively support them in the unexpected situations where they might lack proven heuristics. This highlights new opportunities for diabetes technologies as it suggests the need to shift the focus away from continuous tracking and engagement with a decision support system, which has been the focus of many commercial ventures. We discuss these results and highlight the opportunities for ML and AI systems in the next section.

Discussion
This research investigated how diabetes decision support systems could use AI and machine learning to better support the unmet needs of people with diabetes. To this end, we conducted a series of co-design workshops to understand users' decision-making practices, their informational needs, and expectations from decision support systems. Given the number of T1D-relevant data streams potentially available and the high cognitive burden of self-care (Klonoff, 2007; Mamykina and Mynatt, 2007), we sought to co-design machine learning enhanced technologies to support tracking and interpreting self-management, as per Choe et al.'s findings on its possible use in reducing burden. These kinds of advances have been seen in DIY approaches to T1D innovation (Kaziunas et al., 2018; O'Kane et al., 2016c), as medical device regulation and commercial innovation failed to keep up with user needs and expectations (Vincent et al., 2015). However, despite potential opportunities identified for using ML for decision support (e.g. Donsa et al., 2015), our focus on routine everyday care as a central point for an ML-based intervention turned out to be misplaced.

Fig. 7. Examples of potentially useful information, collected during Workshop 1. The two bottom sheets show that two participants would be interested in tracking exercise, sleep, temperature, travel and location, but only if these factors could be reliably tracked automatically.
The results confirm and expand the current research on T1D self-management. Our participants already had specific and trusted heuristics that allowed them to manage their diabetes in most situations and, similar to Katz et al. (2018b), we found that many of these rules were complex, flexible, and context dependent. These participants were generally indifferent towards new technologies to support routine decision making, trusting their established heuristics and habits to provide adequate self-management. In contrast, they did express interest in decision support systems that could collaboratively support them in unexpected situations where they might lack proven heuristics. This has similarities to work by James et al. on how life transitions can impact self-management of T1D, causing difficulties for AI and ML innovations that rely on constant data streams (James et al., 2023).
Therefore, it appears our participants were not interested in continuous engagement with a decision support system, but rather only within specific contexts. This echoes prior research that highlighted the need for diabetes technologies to account for shifting contexts and environmental factors. With this in mind, we use our findings to analyse the potential of ML to support self-care outside the routine everyday care context.

Machine learning for diabetes
Machine learning is an exciting technology for its ability to extract knowledge from complex data, and researchers have been exploring its potential in supporting diabetes, e.g. Donsa et al. (2015), Forlenza et al., and Yamagata et al. (2020). However, many of the most successful ML techniques rely on large and accurate data sets to extract statistical inferences, which might impede successful implementation in the context of personalised diabetes care. Our participants were resistant to continuous manual data logging, and the opportunity for everyday decision support did not appear to provide a sufficient reward to overcome this barrier. This attitude is not unique (Arsand et al., 2012; Kristensen and Ruckenstein, 2018) and suggests considerable barriers to building sufficient data sets, given that automating data collection, e.g. of carbohydrate intake, is itself challenging (Anthimopoulos et al., 2015). Furthermore, even the personal data that can be collected is dependent on problematic long-term adoption of wearable tracking devices (Harrison et al., 2015). Research in other domains (e.g. home resource tracking) shows that algorithmic predictions in real-world contexts can be difficult due to routine changes or sporadic events (Fuentes et al., 2019). Therefore, designing usable diabetes decision support systems requires a nuanced approach that accounts for non-continuous use and contextual requirements.

Table 3. An overview of different types of situations with information about people's heuristics based on their prior experience, related challenges for decision-support systems, and examples of AI approaches that could provide support in each context.
Like many commercial ventures and large research projects in this domain, e.g. IBM (Latts, 2019; O'Leary, 2022), we made some initial assumptions about the needs of the participants in the set-up of the co-design sessions. In particular, we had assumed their motivation for everyday assistance could potentially overcome manual tracking issues and lead to new insights for everyday engagements with ML-enhanced self-care technologies. The immediate and consistent push-back from the participants meant we had to think about addressing their real needs, i.e. identifying the potential of technological support in unexpected and non-routine situations without the need for constant data collection. It also showed the limitations of focusing on an exclusively ML-based approach to innovation for self-management of T1D. Therefore, we had to think more widely about what a self-care system might mean for people with T1D, and indeed think about less high-tech approaches to deal with people's lived experiences and their real technology needs. We discuss these considerations below.

Opportunities for artificial intelligence and machine learning
Prior work investigating the use of AI/ML in diabetes self-management has focused primarily on algorithms and technical aspects of predicting and detecting various factors, e.g. BG variability or hypo-/hyperglycaemia (Donsa et al., 2015; Sowah et al., 2020; Tyler and Jacobs, 2020; Woldaregay et al., 2019; Yamagata et al., 2020). With the growing focus on human-centred AI (Grudin, 2009; Inkpen et al., 2019; Winograd, 2006; Yang et al., 2020), HCI researchers have also explored how ML could be used to provide personalised predictions (Desai et al., 2019) or support sensemaking based on self-tracking insights (Mamykina et al., 2017). We extend this work by outlining the opportunities for using AI/ML approaches as part of diabetes decision support systems. We focus on potential solutions that could address user needs identified by our participants while taking into account issues with scarce data and aversion to tracking (Arsand et al., 2012; Kristensen and Ruckenstein, 2018). Below we point towards alternative AI approaches to offer specific directions for future research. We argue that rather than trying to change users' behaviour and encourage systematic tracking and engagement with technology, a comprehensive diabetes decision support system needs to account for different types of situations and should thus rely on a mixture of AI approaches that can provide assistance even with scarce data. We use the types of situations identified in our results to structure the discussion. The main points are summarised in Table 3 and described in more detail below.

Supporting unexpected routine situations
Our results suggest that focusing on unexpected situations would provide the most benefits to people with Type 1 Diabetes. The limited data caused by a general dislike of prolonged tracking is a known issue (Arsand et al., 2012; Harrison et al., 2015; Kristensen and Ruckenstein, 2018), and introduces several difficulties, as ML models trained with such data may produce less accurate predictions. The issue of limited data could be mitigated by the use of simple methods such as linear classifiers or shallow decision trees, or ensembles of such models, e.g. random forests. Ensemble methods train multiple models and aggregate their predictions, which helps to assess the uncertainty in the aggregate prediction (Flach, 2012). A chatbot using ensemble methods has been developed to support diabetes diagnosis (Bali et al., 2019), and a similar solution could potentially work for decision support. Another option is using Bayesian methods, which combine prior assumptions about average instances with training data from a target instance to obtain posterior distributions of the model parameters that better represent that instance (Flach, 2012). With the right prior parameters, such methods perform well even with small amounts of training data. In the case of diabetes, the prior can be obtained from a small population of people to train a personalised model with the target individual's data. These approaches can estimate the uncertainty associated with a prediction, which is a key factor when making a decision. For example, our participants suggested that they would be willing to spend some time collecting data to initially train the algorithm; such willingness is not unique to the diabetes context, and prior research shows that patients are willing to engage in self-tracking if they see the benefits of this behaviour (Rooksby and Rost, 2014).
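The population-prior idea can be illustrated with a conjugate Gaussian update, where a cohort-level prior is refined by a handful of personal observations and the remaining uncertainty stays explicit. This is a minimal sketch under our own assumptions (known observation noise, illustrative variable names), not an implementation from the study.

```python
def posterior_mean_var(prior_mean, prior_var, obs, obs_var):
    """Conjugate Bayesian update for a Gaussian mean with known noise.

    prior_mean, prior_var: population-level prior (e.g. the average
    post-meal BG rise estimated from a small cohort).
    obs: the individual's few logged values.
    obs_var: assumed within-person/measurement variance.
    Returns the posterior mean and variance of the personal parameter.
    """
    n = len(obs)
    if n == 0:
        return prior_mean, prior_var  # no personal data: fall back on the prior
    sample_mean = sum(obs) / n
    # Precision-weighted combination of prior and observed data.
    post_precision = 1.0 / prior_var + n / obs_var
    post_mean = (prior_mean / prior_var + n * sample_mean / obs_var) / post_precision
    return post_mean, 1.0 / post_precision
```

With just three observations the posterior mean already moves towards the personal data while the posterior variance shrinks; that variance is exactly the kind of uncertainty a decision support system could surface alongside any suggestion.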
Even though people generally abandon self-tracking (Harrison et al., 2015), this initial data could still be used by Bayesian methods to provide initial suggestions. If more data were collected over time, we could then move away from these models towards more complex solutions (e.g. deep neural networks) to obtain more accurate predictions.
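As a minimal sketch of the Bayesian idea, consider estimating a single personal parameter under a conjugate normal-normal model with known observation noise. The parameter (a hypothetical insulin sensitivity factor), the population prior, and the three logged values are all illustrative assumptions, not clinical data:

```python
# Population prior over a hypothetical insulin sensitivity factor
# (mmol/L drop per unit of insulin); all numbers are illustrative.
prior_mean, prior_var = 2.0, 0.5 ** 2    # from a small reference population
obs_var = 0.4 ** 2                       # assumed measurement noise
personal = [2.6, 2.4, 2.8]               # three logged observations

# Conjugate normal-normal update: precisions add, means are precision-weighted.
n = len(personal)
post_var = 1 / (1 / prior_var + n / obs_var)
post_mean = post_var * (prior_mean / prior_var + sum(personal) / obs_var)

print(f"posterior: {post_mean:.2f} +/- {post_var ** 0.5:.2f}")  # ~2.49 +/- 0.21
```

With only three observations, the estimate already moves from the population mean towards the individual's data, and the posterior variance quantifies how much trust to place in it; further logging shrinks that uncertainty.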
Because unexpected routine situations have analogous prior occurrences, the technology could also highlight situations where the data look anomalous. Given that these situations are familiar and may simply be happening at a different time or in a different order, simple notifications when an anomaly is detected could be enough, as the user would then know what to do. For example, a review of ML solutions for diabetes (Woldaregay et al., 2019) shows that anomaly detection is already frequently used, albeit mainly with the intention of supporting BG predictions. However, these solutions rely on large amounts of data and suffer from the variability of the collected data. Therefore, simply notifying people about anomalies, without trying to predict the outcomes, could be a good solution for situations with scarce data. Furthermore, notifications triggered by anomaly detection algorithms could also serve as a starting point for collaboration between the user and the technology. For example, our participants suggested an app that notifies them when it detects a sudden spike in BG level that does not correlate with registered activity or other data. As participants noted that they would be willing to engage in time-limited data collection, such notifications could initiate the capture of retrospective data. A diabetes support system could therefore initiate a period of collecting such data and take the user through a systematic procedure of documentation, cause-and-effect modelling, and finally heuristic building (Mamykina et al., 2017). Such heuristics could then be drawn upon in future analogous situations.
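A notification-only detector of this kind needs very little data: a robust baseline from a week of readings is enough for a first sketch. The function below, with illustrative readings in mmol/L, flags values that deviate strongly from the person's own recent history (median and MAD are used so a few past outliers do not skew the baseline):

```python
import statistics

def is_anomalous(history, reading, k=3.0):
    """Flag a reading far outside the person's own recent range,
    using a robust z-score (median and median absolute deviation)."""
    med = statistics.median(history)
    mad = statistics.median(abs(x - med) for x in history) or 0.1
    return abs(reading - med) / (1.4826 * mad) > k  # 1.4826: normal consistency

history = [5.8, 6.1, 5.5, 6.4, 5.9, 6.0, 5.7]  # a week of readings, mmol/L
print(is_anomalous(history, 6.2))    # within the usual range
print(is_anomalous(history, 13.5))   # sudden spike -> notify, don't predict
```

Note that the detector only says "this looks unusual"; deciding what to do remains with the user, which matches the collaboration our participants described.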

Supporting unexpected non-routine situations
Unexpected non-routine situations are even trickier to address with ML/AI approaches, as by definition there is no prior data. However, rather than trying to predict the unpredictable, diabetes decision support systems could focus on prevention instead. HCI research on context-aware reminders provides a good template for how such situations could be supported (e.g. Brewer et al., 2017). For example, researchers developed medication adherence systems that used context-aware reminders to warn users when they were about to leave without taking their medications or when they were past their normal medication-taking time (Asai et al., 2011; Singh and Varshney, 2014; Varshney, 2013). A similar approach could be applied to diabetes to notify users when they leave at a different time or do not have their insulin kit with them. For example, this could be achieved through the use of interactive stickers attached to the kit (Williams, 2020) that would trigger an alert when the user's phone moves away from the insulin. Furthermore, a combination of calendar access and location data could trigger notifications reminding users to pack their insulin kit and carb-rich snacks, although this could be perceived as too intrusive. Similar approaches have been explored in the context of supporting physical activity (Haghbin and Kersten-Oertel, 2021), using a combination of location data and activity levels to ensure the notifications were actionable.
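Such a reminder needs no learned model at all. The sketch below shows the basic trigger logic under hypothetical inputs: a calendar event, an estimated travel time, and a proximity flag that could come from a tagged kit.

```python
from datetime import datetime, timedelta

def kit_reminder(now, next_event_start, travel_time, phone_near_kit):
    """Fire a reminder when the user must leave soon (calendar + travel
    estimate) and the phone is not near the tagged insulin kit."""
    leave_by = next_event_start - travel_time
    leaving_soon = timedelta(0) <= leave_by - now <= timedelta(minutes=15)
    if leaving_soon and not phone_near_kit:
        return "Leaving soon: your insulin kit isn't with you."
    return None

now = datetime(2022, 5, 10, 8, 50)
meeting = datetime(2022, 5, 10, 9, 30)
print(kit_reminder(now, meeting, timedelta(minutes=30), phone_near_kit=False))
```

The 15-minute window is an arbitrary placeholder; in practice it would need tuning per user to avoid the intrusiveness noted above.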
In addition to prevention, the system could provide assistance after an unexpected non-routine situation has occurred. If a user needs to find emergency snacks or insulin, the system could combine GPS data with food databases such as MyFitnessPal to help locate the nearest health centres, chemists, or grocery shops, as well as providing opening hours and telephone numbers to reduce the effort of obtaining supplies. Furthermore, if there is some baseline data to serve as a reference point, the anomaly or novelty detection mentioned earlier (Simeone et al., 2017) could be used to notify the user when the data differ significantly from the norm. Even with limited data, this could be enough to alert the user. When a person detects such a situation, regardless of whether they noticed it themselves or were notified by technology, they may panic or worry. A simple checklist could help here. For example, some mental health apps allow users to create a safety plan to refer to in crisis situations or save contact details of trusted individuals (O'Grady et al., 2020). In the case of diabetes, the checklist could provide a simple step-by-step guide (''Stay calm. Check your current blood sugar level. Check how much insulin you have with you'', and so on). It could also be an expert system that acts as a de facto risk assessment tool: by asking a series of questions (e.g. What is your sugar level? Do you have insulin with you? Do you have your snacks? Is there a grocery store nearby?), it could calculate a risk score and provide relevant suggestions whilst helping the user to calm down. Such a system could attempt to reduce the cognitive effort involved in sensemaking and, given the potential urgency, emphasise solution-finding over learning processes.
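A checklist-style expert system of this kind can be prototyped as a small hand-written rule base. The questions, weights, and thresholds below are illustrative placeholders only, not clinical guidance:

```python
# Each rule: (condition on the user's answers, weight, suggestion).
# Weights and wording are illustrative placeholders, not clinical advice.
RULES = [
    (lambda a: a["bg_mmol_l"] < 4.0, 3, "Treat the low now with fast-acting carbs."),
    (lambda a: not a["has_insulin"], 2, "Locate a pharmacy or contact your care team."),
    (lambda a: not a["has_snacks"],  1, "Find a nearby shop for carb-rich snacks."),
    (lambda a: not a["shop_nearby"], 1, "Plan a route to supplies before acting."),
]

def assess(answers):
    """Accumulate a risk score and a checklist of relevant suggestions."""
    score, advice = 0, ["Stay calm. Check your current blood sugar level."]
    for condition, weight, suggestion in RULES:
        if condition(answers):
            score += weight
            advice.append(suggestion)
    level = "high" if score >= 4 else "moderate" if score >= 2 else "low"
    return level, advice

level, advice = assess({"bg_mmol_l": 3.6, "has_insulin": True,
                        "has_snacks": False, "shop_nearby": True})
print(level)    # low BG plus no snacks triggers the highest band
```

Because the rules are explicit, clinicians could inspect and adjust them directly, which is one reason expert systems remain attractive where training data is scarce.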
It is also worth considering that not every unexpected non-routine situation is entirely novel. Except for those with a new diagnosis, people with T1D often have at least some experience of dealing with similar situations (Mamykina et al., 2015; Mamykina and Mynatt, 2007; O'Kane, 2016), so technology could help them draw on these experiences. As such, after recovering from an unexpected non-routine situation, the system could encourage reflection on what happened and the actions the individual decided to take. Kocielnik et al. (2018) developed a system that used a conversational user interface to support reflection on one's physical activity. Using similar 'mini-dialogues', the diabetes support system could help to classify non-routine situations, match them with similar events, and recommend other approaches or heuristics that had proven successful in parallel situations.
Such recommendations could also potentially draw on heuristics mined from social networks or message boards. When there is no medical urgency, tools for supporting structured guidance through the sensemaking process (Mamykina et al., 2016) could be a promising approach. Another option would be a collaborative system that anticipates problems, suggests potential interventions, and then asks for feedback to build models of efficacy (Pejovic and Musolesi, 2015). Moreover, by combining the approaches discussed above, we could create an anticipatory system that uses automatically collected data (e.g. from a mobile phone or a CGM) to facilitate sensemaking and reflection, which could help people manage unexpected situations.
Finally, illness is another example of a situation mentioned by our participants where ML algorithms may not be the right solution. Even if someone catches a cold or flu every year, which can make it seem like a routine occurrence, the symptoms and the body's reactions differ, as different strains of flu are active each year (NHS Choices, 2019). This can be further complicated by novel viruses with flu-like symptoms: several studies conducted during the COVID-19 pandemic have suggested that people with diabetes have an increased risk of more severe outcomes, including death (Barron et al., 2020; Petrilli et al., 2020). As a result, there may not be enough data to support ML algorithms. We discuss in the next section how crowdsourcing could be used for non-routine situations where data is lacking, but illness is too personal and heavily context-dependent, and thus there could be too much variability in the data to extract meaningful conclusions. However, community support may still be useful, not for generating data to feed algorithms but for providing advice and learning from others' experiences. There might therefore be potential for a Q&A system using language models such as GPT-3 (Brown et al., 2020) that could be trained on health advice and community content to provide answers, although reliability could be an issue as, in the context of chronic conditions, wrong advice can be dangerous.

Supporting expected non-routine situations
Both our results and the existing literature show that people with T1D are generally able to cope with expected non-routine situations (Mamykina et al., 2015; Mamykina and Mynatt, 2007; O'Kane, 2016), such as planned travel, although they would still benefit from decision-making support. However, given the lack of personal data, this is difficult to implement. One approach would be to combine a personalised model with contextual and crowdsourced data. Data on the user's patterns could be collected before the trip to establish a baseline for standard physiologic model-based algorithms, i.e. algorithms that learn glucose and insulin metabolism dynamics through personalised parameters (Yamagata et al., 2020). This could then be combined with crowdsourced data, which could be mined for trends and clustered to identify similar profiles. For example, the non-profit group Tidepool (Tidepool Project, 2020) has been assembling large multi-feature personal data sets. However, Tidepool data is not annotated. Therefore, another option would be the creation of publicly available annotation tools that enable the labelling of crowdsourced data, which would expand the opportunity for applying machine learning techniques.
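One simple way to exploit unannotated crowdsourced data is nearest-neighbour profile matching. The sketch below uses entirely hypothetical profiles and features to borrow the observed glucose patterns of the most similar contributors:

```python
import math

# Hypothetical crowdsourced profiles: (age, avg daily carbs g, activity min)
# paired with an observed glucose pattern. Features and patterns are invented.
profiles = [
    ((25, 180, 60), "dawn rise, stable evenings"),
    ((48, 220, 20), "post-lunch spikes"),
    ((31, 200, 45), "dawn rise, stable evenings"),
    ((55, 150, 10), "overnight lows"),
]

def nearest_patterns(me, k=2):
    """Return the patterns of the k profiles closest to the user's own."""
    by_distance = sorted(profiles, key=lambda p: math.dist(me, p[0]))
    return [pattern for _features, pattern in by_distance[:k]]

print(nearest_patterns((28, 190, 50)))
```

A real system would normalise the features (age and carbohydrate grams are on very different scales) and use far richer profiles; the point here is only that crowdsourced data can support matching even without labels.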
A second approach would be to move away from machine learning and towards expert systems (Rudin, 2019). Rather than aiming to provide specific suggestions, such a system could instead support preparation and reflection. Taking the travel example, people often engage in preparation and data gathering beforehand, e.g. by consulting online reviews of destinations (Brown, 2012). This enables them to find specific information, learn about the experiences of others, and even make comparisons with their previous experiences in the same or a similar place. An expert system aimed at people with diabetes could support such research by suggesting articles that match the context, offering checklists to help with planning, asking specific questions (e.g. what leisure activities are you planning?), or providing a modelling tool to explore different what-if scenarios and develop resilience strategies.
Both approaches are based on the premise that people are averse to habitual manual tracking and data labelling. However, there is evidence that some people do engage in tracking (Rooksby and Rost, 2014), and indeed a couple of our participants reported tracking their meals in detail. A decision support system aimed at people with diabetes could take advantage of the existing habits of such trackers, as their data could contribute to crowdsourced models and data sets used by ML and other AI approaches. Crowdsourcing has already been used to support annotations for machine learning, e.g. in the context of privacy policies (Wilson et al., 2016), and tools supporting such annotation acquisition exist, e.g. Revolt (Chang et al., 2017b). In addition, passive sharing of anonymised, automatically collected data such as location, activity, BG levels, weather, etc. could also feed into such systems.

Supporting expected routine situations
Finally, participants were clear that they did not need or want decision-making support for everyday situations. Such everyday management has been studied in HCI in detail (e.g. Danesi et al., 2018; Mamykina et al., 2008; Owen et al., 2015; Raj et al., 2019b; O'Kane, 2016; Mamykina and Mynatt, 2007), and this work highlights the expertise of people with T1D and the role of their lived experiences in informing their decisions. In line with this research, our participants reported relying on specific rules and heuristics for certain routine situations (Raj et al., 2019b). However, similar to Katz et al. (2018b), we found that many of these rules were complex and flexible, and did not always apply in real-world situations. Therefore, there is potential for a decision support system to intervene in situations that may not be as well-known as one thinks. Such systems could help to detect situations which deviate from recommended or customised thresholds, not only in BG levels but also in behaviours correlated with improved outcomes. This could be done by collecting sensor data to provide tailored insights and alerts, which could evidence a need to challenge assumptions, akin to the potential anticipatory system outlined earlier in Section 5.2.2.
Systems could also seek to improve existing heuristics and routines by supporting users in documenting them and guiding reflection, hypothesis testing, and iterative refinement. O'Murchu and Sigfridsson explored the use of a similar system (O'Murchu and Sigfridsson, 2010). They developed an app that allows users to collect data they find meaningful (e.g. BG levels, insulin injections, diet, medications, physical activity) and create their own categories ('tags') for the data to help make associations between them. By allowing users to define the categories, such a system can not only support agency but also facilitate engagement, which leads to improved sensemaking in the future. As such, it could support hypothesis testing and heuristic refinement.

Limitations and future work
Our participants reported being diagnosed with T1D between 3 and 30+ years ago, including two participants who still suspected they were in the ''honeymoon period'' (i.e. when the pancreas still produces some insulin); no participants were newly diagnosed. Therefore, our findings may not reflect these early experiences, although during the workshops our participants reflected on how their information needs changed over time and shared their early experiences. Moreover, the flexibility of the solutions proposed during the workshops means that they could be adapted to the needs of people with differing diabetes experience. Similarly, despite having a mix of people of differing professions, our self-selection recruitment process resulted in a disproportionately male, white, and experienced participant group. While this may have biased the results, mitigating factors include the lack of correlation between age, gender, and self-described diabetes decision-making processes (Katz et al., 2018b) and the fact that, while learning the basics of self-care is essential, it is generally only a short part of a lifetime living with diabetes (Klasnja et al., 2015). Nevertheless, our participants' needs may still differ from those of people from different socio-economic backgrounds and/or those less comfortable with using technology, and future work needs to recruit a more diverse pool of participants.
The format of the study and the fact that our participants could interact with peers might have influenced the results. They often exchanged ideas and discussed their experiences, which sometimes shifted everyone's attention away from the tasks at hand or could have led to groupthink. In addition, we did not build anything tangible, as, similar to Jiang et al. (2021), our goal was to identify specific opportunities for Human-Centred Machine Learning and highlight potential directions for future research in the area of diabetes decision support; thus, the focus of the workshops was on idea generation and discussions to understand how the proposed solutions would affect people's lives. A detailed account of the lessons learned from our approach is published elsewhere (Ayobi et al., 2021a).
Finally, we have focused only on people with T1D, but diabetes care also involves others, including family members and healthcare professionals. Future work should include their perspectives to ensure more comprehensive care, especially as their input and support could be valuable when dealing with unexpected situations.

Conclusions
Diabetes self-management requires making sense of myriad factors in order to manage blood glucose levels and thereby minimise risks. Machine learning has generated interest as a promising method to improve diabetes management. However, popular machine learning approaches often rely on the availability of large multi-featured data sets to identify meaningful patterns based on frequently occurring events. Although our project team initially focused on possibilities for ML-enhanced technology to support everyday self-management practices with personal data, our participants did not want this context of use. Our results suggest that support for familiar, everyday situations, which could be the source of rich data, is needed the least, as people with diabetes already know what to do based on their lived experience. Unfortunately, it is in unexpected and unfamiliar situations that decision support would be most useful, and in these situations training data sets would typically be small and noise would therefore be problematic. We argue that the socio-technical challenges of each type of situation are unique, and systems to support these situations need to be flexible, accounting for different types of situations and contexts. Moreover, technological innovations that assist in these situations, particularly during non-routine times when automated decision support could be most useful, might have to rely on more traditional and less data-intensive approaches to artificial intelligence, such as expert systems, rather than only on currently popular machine learning applications.