A Markov model for inferring event types on diabetes patients data

Gathering diabetes-related data requires effort from the patient, who must log specific events throughout experiments. This is an error-prone task that patients usually handle by following a prescribed protocol. However, patients often deviate from the protocol, causing missing or imperfect data. This study investigates the possibility of generating Markov models from existing logged data and using them for imputation. The models are used to infer information related to missing events (types/activities) in data recorded by diabetes patients, allowing for improvements in the quality and continuity of such data. Our results indicate that this approach can help improve the quality of the collected data.


Introduction
Collection of relevant data is an important activity in different domains, including healthcare and, in particular, diabetes management. In this specific domain, data can be continuous and passively acquired through sensors and smart devices, but also interactively relying on actions from the contributor/patient [1][2][3].
Data gathering in diabetes-related research receives considerable attention, as the use of devices containing sensors - and acquiring data - can be considered a common aspect of diabetic patients' lives nowadays. Many algorithms and techniques that aim at modeling through data require, as input, data acquired from patients [4][5][6][7]. Thus, datasets containing such data are of fundamental importance for advancing the state of the art in diabetes research. During modeling, they can be used for learning and validation, and they are not limited to data coming from a controlled environment (e.g., specialized care during hospitalization). They can also contain data collected during patients' daily routine by making use of wearables and leveraging the tracked/logged event data [8][9][10].
When narrowed to daily actions, patient behavior can be modeled through their own daily events (e.g., sleep, meals, and exercises), but the amount and quality of such data is critical. As a fair amount of these events are logged by the patients themselves, the data collection is error-prone: patients must log specific free-living events by interacting with devices when the events happen, or at specific times of their days. Hence, it is reasonable to assume that they may forget, resulting in incomplete logs, or even provide data at the correct time but in the wrong way. There is thus no guarantee that the patients provide complete and correct logs.
The outline of the paper is as follows. In Section 2 we give a brief summary of previous relevant works, while in Section 3 we present important preliminary concepts and aspects. The focus of Section 4 is to provide details regarding the methods used by the proposed approach. The experimental setup is described in Section 5, and the evaluation in Section 6. In Section 7, we discuss relevant facets as well as limitations encountered. Finally, in Section 8, we conclude the paper and present possibilities for future work.

Related work
Self-reported data have a tendency of carrying errors and low-quality information, and data logged by diabetes patients carry this same issue. For instance, meal-related information is critical in the diabetes field, and hence attempts at improving the quality of the associated reported events have been made. As patients incorrectly reported the amount of carbohydrates for each meal, Rhyner et al. [12] developed a prototype to better estimate the carbohydrate content of meals. They also discussed its usability, as patients could avoid the estimation step without reducing the amount (and quality) of the information generated. Another work, by Zheng et al. [13], developed a meal detection technique using Continuous Glucose Monitoring (CGM) data as input, with the purpose of reminding patients to take bolus insulin after meals. However, this approach can also serve to identify meals that were not logged by patients during experiments performed in the diabetes domain. Finally, Berry et al. [14] evaluated the influence of meal components on the postprandial response, and developed a prediction model for triglyceride and glycemic responses to food intake. Although this work was primarily developed for personalized diet strategies, it allows this information to be added to a dataset and serve other research.
Approaches focusing on data and event recovery, and on failure prediction, exist in the literature; they focus on identifying (imperfect) patterns in logs commonly generated by information systems and/or sensors [15]. Such approaches can rely on stochastic models based on Markov premises and definitions [16].
Markov models are also able to fit non-deterministic behaviors due to their probabilistic nature. This aspect allows such models to be applied to detecting activity patterns in daily living conditions [17,18], as well as in the healthcare and diabetes domain [19][20][21][22].
Our approach extends the idea of identifying activity patterns in event logs by adding external observable information related to daily living conditions for diabetes patients. This allows for a contextualized inference of the activities.

Markov models
Due to the naturally non-deterministic behaviors of human beings, probabilistic models can serve as an adequate framework for modeling such behaviors. Many studies in the literature focus on sequences (e.g., formed by words, phrases, and system or sensor states) to either detect patterns, infer, impute, or predict a value within the sequence [23][24][25][26]. However, they tend to only make use of the values that compose the sequence itself. In order to infer an activity in a sequence, the presented work proposes a Markov model-based inference step that considers the context around such activity. This context includes the neighboring activities in the sequence, as well as external observable factors. As a result, by translating the context into hidden states and emitted observations, the proposed approach permits us to use both layers together as causal factors in the same probabilistic model. In addition, other observable factors could be added to this context, giving opportunity for expanded models.
This section starts by introducing Markov chains, which consider only observable states, and then evolves to hidden Markov models, which make use of external factors as additional observable variables. All concepts and aspects provided here form the basis for a complete understanding of the proposed work.

Markov chain
Consider {X_n, n = 0, 1, 2, …} a discrete-time random process. In a Markov chain, X_{n+1} is conditioned on the current state X_n, but it does not depend on any of the states from previous cycles X_0, X_1, …, X_{n-1}. In other words, the process has no memory: the behavior that follows any cycle depends only on its current state. This constraint is known as the Markovian assumption or property [11]. Let S = {s_1, s_2, …, s_N} be a finite set of possible states a probabilistic system can be in, with state X_n = s_i occurring at discrete time n, and X_{n+1} = s_j at n + 1. In a Markov chain, the probability of going from state s_i to s_j is

a_ij = P(X_{n+1} = s_j | X_n = s_i),   (1)

and it is assumed that this transition probability does not depend on time. By generalizing (1) to every possible state in S, the transition matrix A can be defined, where the i-th row of A contains all transition probabilities of going from state s_i to any other existing state s_j. In addition, an initial probability matrix (π) can be defined as well, which specifies the probabilities of the process starting in each of the existing states.
Markov models provide a convenient way of representing a process through states and transitions, in which a state change occurs for each transition that is triggered. In this sense, a Markov model can be visualized with a state transition diagram. To illustrate, taking an example transition matrix A, the associated system can be depicted as the one in Fig. 1.
In summary, a Markov model is defined by the following:
• Set of states S = {s_1, s_2, …, s_N}.
• Matrix π, for initial probability values.
• Matrix A, defining how likely it is for the process to be in a future state given the current one.
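To make the definition above concrete, the following sketch encodes a small chain over the four reduced-space categories used later in the paper. All probability values here are hypothetical, chosen only for illustration, and the helper `most_likely_next` is our own.

```python
import numpy as np

# A minimal Markov chain over illustrative states (hypothetical values).
states = ["Meal", "Exercise", "Control", "Sleep"]
pi = np.array([0.7, 0.0, 0.1, 0.2])   # initial probabilities (matrix pi)
A = np.array([                        # A[i, j] = P(next = s_j | current = s_i)
    [0.1, 0.3, 0.4, 0.2],
    [0.5, 0.0, 0.3, 0.2],
    [0.4, 0.2, 0.2, 0.2],
    [0.8, 0.1, 0.1, 0.0],
])

# Each row of A (and pi itself) must be a probability distribution.
assert np.allclose(A.sum(axis=1), 1.0) and np.isclose(pi.sum(), 1.0)

def most_likely_next(current: str) -> str:
    """Return the most probable successor of the current state."""
    i = states.index(current)
    return states[int(np.argmax(A[i]))]

print(most_likely_next("Sleep"))  # with these numbers: Meal
```

Note that with these toy numbers, Sleep is followed by Meal with probability 0.8, matching the intuition that a patient's day typically starts with breakfast.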
As the process analyzed in the presented work is a real-life process, it is reasonable to say that the Markovian assumption cannot be strictly followed: given the present state, future states are not entirely independent of all past states. However, for the sake of simplicity, the assumption can be adopted when considering a limited history/memory, which suits our intention to model (and represent) a finite number of events/transitions happening in a limited time window (day(s) of a diabetes patient).

Hidden Markov model
Hidden Markov models (HMMs) [27] can be seen as an augmentation of Markov chains: they allow for an additional layer when another observable random variable exists in the same context as the states. They come as an answer when states are not directly observed but can be inferred through an associated (emitted) observable value, i.e., the emitted observations are used to evaluate the probability of a state occurring [28,29].
Given another discrete-time random process {Y_n, n = 0, 1, 2, …} and a finite set of possible observations V = {v_1, v_2, …, v_M}, each element of V has a probability of being generated/emitted from an element of the state space S. The matrix B is called the emission matrix, and it holds the probabilities of an observation v_k being generated from a state s_j:

b_j(k) = P(Y_n = v_k | X_n = s_j).

In a discrete timeline, Fig. 2 presents both related processes: one covering the hidden (non-observable) states X_0, X_1, …, X_n, and the other, Y_0, Y_1, …, Y_n, producing the observations from {v_1, v_2, …, v_M} emitted by each state.
An essential characteristic to highlight is the output independence of each produced observation, which means that an emitted observation depends only on the state that originates it, and not on any other existing state or observation.
HMMs have the ability to infer a sequence of states given a sequence of emitted observations. This is defined as the decoding problem [28,29], and it is commonly tackled by dynamic programming, typically relying on the Viterbi algorithm [30][31][32].
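The decoding problem mentioned above can be sketched as follows. This is a compact log-space Viterbi implementation with a toy two-state model; all numbers are invented for the example and are not tied to the paper's data.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for an observed sequence (log-space).
    obs: list of observation indices; pi: initial probabilities;
    A[i, j]: transition probability; B[j, k]: P(observation k | state j)."""
    N, T = A.shape[0], len(obs)
    with np.errstate(divide="ignore"):        # allow log(0) -> -inf
        logpi, logA, logB = np.log(pi), np.log(A), np.log(B)
    delta = logpi + logB[:, obs[0]]           # best log-prob ending in each state
    back = np.zeros((T, N), dtype=int)        # backpointers
    for t in range(1, T):
        scores = delta[:, None] + logA        # scores[i, j]: from state i to j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(N)] + logB[:, obs[t]]
    path = [int(np.argmax(delta))]            # best final state, then backtrack
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Toy 2-state model (hypothetical numbers): state 0 mostly emits observation 0,
# state 1 mostly emits observation 1.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(viterbi([0, 0, 1], pi, A, B))  # [0, 0, 1]
```

Working in log space avoids numerical underflow when sequences grow long, which is the standard practice for Viterbi decoding.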
The diagram presented in Fig. 1 can be extended to add hidden states through the use of an emission matrix which adds another stochastic layer per state as depicted in Fig. 3.
An important difference from a chain can be noticed in the extended version: it allows for the addition of a layer able to incorporate external (observable) factors, permitting the inference step to rely less on being aware of the current state. It now incorporates probabilities for transitions from one - now hidden - state to another, and emissions from each state to a possible observed value. In other words, the model is now defined by both transition and emission matrices. In summary, for an HMM, the following input is expected:
• The π, A, and B matrices. The latter defines how likely it is for the process to generate an observation from a specific state.
Hidden Markov models are useful for inferring events that a patient does not record, since additional data from observations regarding the context of the event sequence can be used for improved modeling. An example of such observations would be the glucose values coming from the signal provided continuously by each person -for our specific case, a diabetic patient -when using a CGM sensor.

Modeling patient events sequences through Markov models
In this section, we introduce the requirements regarding the data used in this research. This is followed by an explanation of how Markov chains are used to model event sequences, and then of how HMMs can be modeled to consider specific related external observations. We propose three different modeling approaches applied to each patient: one relying on chains only, and two others making use of HMMs. The section details how each model is defined and how they are used for inference.

Input data
We assume that patients collect data regarding their daily activities as well as disease-related events, such as glucose control measures. In addition, glucose level measurements are collected continuously through CGM, which is becoming a common tool in diabetic patients' daily routines. Ideally, the gathered data would reflect, in a one-to-one manner, all events - from the set defined to be logged - that happened during the days of each participating patient. Usually, the acquired data ends up far from this ideal scenario. Fig. 4 illustrates the event data logged by one example diabetic patient on different days. The figure includes events such as meals (breakfast, lunch, and dinner), hypo-corrections (e.g., sugary drinks like juice or regular soda), snacks, and exercises, all tied to their own timestamps.
In addition, Fig. 4 can be leveraged to strengthen the argument behind the problem tackled by the proposed work. A few points can be highlighted:
• The activities in each of the displayed sequences are very similar; the existing differences mostly appear after Lunch.
• For the day 2021-10-11, there is no Dinner logged, so the patient either skipped that meal or the event was not logged.
• Taking the first three days, one can note that between Lunch and Dinner, either a Snack, an Exercise, or neither can happen.
• On the last depicted day (2021-10-22), only a Dinner was logged, which characterizes a day with very poor quality of logged data. This assumption follows from the fact that a diabetic patient is not likely to have this kind of day.
Let us assume we come to a point where we know an event has happened between Lunch and Dinner on 2021-09-16, and we must infer the activity that took place. Using all information from Fig. 4, it is reasonable to assume it might have been a Snack or an Exercise. However, in such a scenario, not only the sequence of events can play a role, but also external observed factors, such as the glucose level tied to the missing event or the time interval in which it occurred. Figs. 5 and 6 present such observable factors for days that would fit the scenario (2021-09-14 and 2021-10-04).
The Snack and Exercise events for both days differ with respect to both mentioned factors: the times they happened are different, and the glucose levels tied to each of them are also different. Such observations can be of great value in inferring an activity in scenarios such as the one previously described for day 2021-09-16, so that data correction or imputation can be done to keep the coherence of the logged data. This becomes an interesting and tangible approach to solving the issue, and will be detailed in the next sections.
The input data used in our approach can be seen as a collection of patient data following the characteristics of the aforementioned example patient. For each patient, a sequence of events recorded over multiple (and continuous) days is expected, along with a glucose level signal measured over the same set of days.

Modeling from event sequences
According to the described input data, for each patient an event sequence is defined as follows.

Definition 1 (Event). Let E be a set of events, A a set of activities (or event types), and T the time universe. An event e ∈ E is denoted by e = (act(e), time(e)), where
- act: E → A is a surjective function linking events to activities,
- time: E → T is an injective function linking events to time,
- e = (a, t) ∈ E denotes that activity act(e) = a ∈ A happened at time time(e) = t ∈ T.
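A minimal sketch of this event structure is shown below; the class and field names are our own, and the timestamps are invented for illustration.

```python
from dataclasses import dataclass

# A minimal event record following Definition 1 (names are our own choice).
@dataclass(frozen=True)
class Event:
    activity: str   # act(e): the event type, e.g. "Lunch"
    time: float     # time(e): a timestamp, assumed unique per event (injective)

# One (hypothetical) day of logged events, ordered in time.
day = [Event("Breakfast", 8.0), Event("Lunch", 12.5), Event("Dinner", 19.0)]
activities = [e.activity for e in day]  # the activity sequence used for modeling
print(activities)
```

The activity projection (`activities` here) is what feeds the Markov chain construction in the next section; the timestamps become relevant later as observations.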
In the presented work we make use of a subset of event types, and each element of this subset is translated into one state, resulting in the state space given in (8). For illustrative purposes, the elements of our state space are grouped here into four different categories, forming the reduced state space in (9). Both upcoming examples (Examples 1 and 2) consider the reduced state space presented in (9). Further, these categories will be unfolded again into their contained event types (cf. Section 4.4), resulting back in the larger state space in (8) used in practice for modeling.

Example 1. Modeling with Events Data Only
Taking a sequence of logged events as input, first-order Markov models (chains) can be created, i.e., event activities are translated into states and the associated probabilities of one appearing given the other are calculated. One transition matrix is generated for each existing patient, meaning that for each entire sequence of events (from the first event logged by the patient to the last), all transitions from one event to the next are counted, and the associated probabilities are estimated. In the scope of this work, initial probability matrices are not generated, as we assume the first event of each sequence is always known (cf. Section 4.5).
Following the definitions given in Section 3.1, let X be a state transition sequence - or, in our case, an event sequence, as in (10). The frequency of each possible transition in (10) is counted in order to estimate the associated probability. Let T be a multiset containing all transitions in the aforementioned state transition sequence, and m(s_i, s_j) a function giving the multiplicity of a transition from s_i to s_j. The estimated probability of transitioning from s_i to s_j is given by

a_ij = m(s_i, s_j) / Σ_k m(s_i, s_k).   (11)

Applying (11) to the example sequence yields the transition matrix A in (12). The first row of A in (12) holds the probabilities of transitioning from state s_1 (Meal) to the other existing states. Similarly, the other rows contain the probabilities of transitioning from s_2 (Exercise), s_3 (Control), and s_4 (Sleep). This can also be seen through the associated state transition diagram presented in Fig. 7.
Please note that, if needed, the frequency of rare (but not impossible) transitions can be artificially adjusted (e.g., using additive smoothing [33]) to avoid probability values of zero being estimated from the available input data.
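The counting and normalization of Example 1, together with the optional additive smoothing, can be sketched as follows. The function name and the example sequence are our own, not taken from the paper's data.

```python
from collections import Counter

def transition_matrix(sequence, states, alpha=0.0):
    """Estimate a_ij = m(s_i, s_j) / sum_k m(s_i, s_k) by counting consecutive
    pairs. alpha > 0 applies additive (Laplace) smoothing so that rare but
    possible transitions never get probability zero. With alpha == 0, every
    state is assumed to occur at least once as a transition source."""
    counts = Counter(zip(sequence, sequence[1:]))   # multiplicities m(s_i, s_j)
    A = [[counts[(si, sj)] + alpha for sj in states] for si in states]
    return [[c / sum(row) for c in row] for row in A]  # normalize each row

states = ["Meal", "Exercise", "Control", "Sleep"]
seq = ["Sleep", "Meal", "Exercise", "Control", "Meal", "Sleep", "Meal"]
A = transition_matrix(seq, states)
# Both Sleep occurrences are followed by Meal, so the Sleep row is [1, 0, 0, 0].
print(A[states.index("Sleep")])
```

Passing `alpha=1.0` would implement the additive smoothing mentioned above, spreading a small amount of probability mass over unseen transitions.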

Modeling with observed external factors
An HMM requires both transition and emission matrices. The latter is generated similarly to the former, but it relies on the link between states and observable variables to calculate the emission probabilities, rather than the link between previous and next states.
Let V be a set of observations, and O a sequence of such observations. Expanding Example 1, and using the same transitions as before, states and observations can be aligned accordingly, as in (13). Hence, as an observation is generated by a state at each cycle, tuples (s_j, v_k) are formed (e.g., (s_4, v_1), (s_1, v_2), (s_2, v_1), …). The frequency of each tuple is counted, and the associated emission probability is estimated.
Let U be a multiset containing all tuples formed from the aforementioned state-observation sequence, and m(s_j, v_k) a function giving the multiplicity of a tuple (s_j, v_k) ∈ U. The estimated emission probability b_j(k) = P(v_k | s_j) is given by

b_j(k) = m(s_j, v_k) / Σ_l m(s_j, v_l).

For instance, the estimated emission probability P(v_1 | s_1) is given by this ratio for the tuple (s_1, v_1). As emphasized in Example 1, the frequency values here can also be artificially adjusted to avoid probability values equal to zero. Consider that for each state, an observation is made regarding the BG value in mg/dl. To enable a narrower but still representative observation of glucose levels, the BG signal was discretized into six recommended ranges according to clinical diagnosis and safety [34][35][36]. This choice opens opportunities to take into account the risks and consequences of the glucose levels associated with each event. The observed values now become one of the ranges displayed in Table 1; for our example, let the observations be v_1 = 70-180, v_2 = 50-60, v_3 = 60-70, v_4 = 180-250.
For the first cycle of the sequence of observations given in (13), the tuple is (s_4, v_1) = (Sleep, 70-180), meaning that during the state Sleep of that cycle, a BG value within the range of 70-180 mg/dl was observed. As the informed transitions are the same, the transition matrix A is kept as in (12), while the emission matrix B is estimated from the tuple frequencies. The associated - and expanded - state transition diagram is depicted in Fig. 8, including observations and their probabilities of being emitted. It is worth noting that the number of observations in V does not have to match the number of states in S, so the transition and emission matrices can differ in shape. However, for the example used, their sizes match.
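The discretization and the emission-matrix estimation can be sketched together. The range cut-points below are illustrative stand-ins for Table 1 (we do not reproduce the paper's exact six clinical ranges), and the function names are our own.

```python
from collections import Counter

# Hypothetical BG cut-points (mg/dl) standing in for Table 1.
RANGES = [(0, 50), (50, 60), (60, 70), (70, 180), (180, 250), (250, 500)]

def discretize(bg):
    """Map a raw BG value to its range label, e.g. 75 -> '70-180'."""
    for lo, hi in RANGES:
        if lo <= bg < hi:
            return f"{lo}-{hi}"

def emission_matrix(pairs, states, observations, alpha=0.0):
    """Estimate b_j(k) = m(s_j, v_k) / sum_l m(s_j, v_l) from aligned
    (state, observation) tuples; alpha adds optional smoothing."""
    counts = Counter(pairs)
    B = [[counts[(s, v)] + alpha for v in observations] for s in states]
    return [[c / sum(row) for c in row] for row in B]

# Toy aligned state-observation tuples, as formed in the text above.
pairs = [("Sleep", discretize(75)), ("Meal", discretize(190)),
         ("Sleep", discretize(80))]
B = emission_matrix(pairs, ["Meal", "Sleep"], ["70-180", "180-250"])
print(B)
```

Here both Sleep cycles fall in the 70-180 range, so Sleep emits that range with probability 1 in this toy example.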

Modeling approaches
Using the state space previously defined in (8), we defined three different modeling approaches to apply to each patient's data:
• Modeling Approach 1: Similar to Example 1, this approach makes use solely of the transition matrix created from the event sequence.
• Modeling Approach 2: Makes use of both transition and emission matrices, defining an HMM based on the BG ranges (Table 1) tied to each event (state), similarly to Example 2.
• Modeling Approach 3: Also an HMM, but now generating an emission matrix based on observed pre-defined time intervals according to Table 2, instead of BG ranges. The assumption is now that being within certain time windows impacts the probability of an event happening, which can lead to a better probability estimation.
In our work, we use the models created by each described approach to infer one state between two others, i.e., an event activity that happened between two known events. The next section describes how transition and emission matrices are used by the models to perform such inference.

Inferring an event activity
The concepts behind the steps taken for event inference are formalized in this section.
With each patient having their entire event sequence translated into both transition and emission matrices, we use Markov models to infer the event (state) that is most likely to have happened between two others.
In the proposed inference steps, event sequences consisting of three elements (n = 3) are used as input, and will mostly be referred to in the form of triples ⟨e_1, e_2, e_3⟩.

Definition 2 (Inference Condition). For any triple event sequence in the form of ⟨e_1, e_2, e_3⟩, where act(e_1) and act(e_3) are known, the inference condition is met when e_2 ∈ E and act(e_2) is not known.
The evaluation and subsequent inference is done by defining the activity that would occur between two others, i.e., the act(e_2) value for every triple where the condition in Definition 2 holds.

Definition 3 (Inferred Activity). Let ⟨e_1, e_2, e_3⟩ be a triple event sequence where the inference condition holds, and let V be a set of observations. P(act(e_2) = a) defines the probability of activity a ∈ A being linked to e_2 ∈ E, and is estimated by

P(act(e_2) = a) = P(a | act(e_1)) · P(act(e_3) | a) · P(obs(e_2) | a),

where obs: E → V is a function linking events to observations.
The steps covered in Definition 3 make use of the preliminary concepts from Section 3, and the components of the probability value evaluated during the inference are taken from both the transition and emission matrices of the associated (hidden) Markov model. An important aspect to emphasize is that the Markov assumption holds for both P(a | act(e_1)) and P(act(e_3) | a), while output independence is kept in P(obs(e_2) | a), with the observation depending only on the state (activity a) that produces it.

Example 3. Activity Inference for a Given Triple
Taking as an example the sequence ⟨Dinner, e_2, Exercise⟩, following Definition 1 and meeting the condition of Definition 2, we find the inferred activity â according to Definition 3. In summary:
1. For each candidate activity a ∈ A, the probability in Definition 3 is estimated from the transition and emission matrices.
2. The activity with the highest estimated probability (the one most likely to happen) is then taken as the inferred activity â, i.e., the one to be associated with e_2.
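The two steps above can be sketched directly. The transition and emission tables below are toy values invented for illustration (they are not the paper's fitted matrices), and `infer_middle` is our own name for the Definition 3 scoring.

```python
# Candidate scoring per Definition 3: the inferred activity maximizes
# P(a | act(e1)) * P(act(e3) | a) * P(obs(e2) | a). All numbers are toy values.
A = {  # A[i][j]: probability of going from activity i to activity j
    "Dinner":   {"Dinner": 0.0, "Exercise": 0.2, "Snack": 0.5, "Sleep": 0.3},
    "Exercise": {"Dinner": 0.4, "Exercise": 0.0, "Snack": 0.3, "Sleep": 0.3},
    "Snack":    {"Dinner": 0.1, "Exercise": 0.6, "Snack": 0.0, "Sleep": 0.3},
    "Sleep":    {"Dinner": 0.3, "Exercise": 0.3, "Snack": 0.3, "Sleep": 0.1},
}
B = {  # B[s][v]: probability that activity s emits BG-range observation v
    "Dinner":   {"70-180": 0.5, "180-250": 0.5},
    "Exercise": {"70-180": 0.8, "180-250": 0.2},
    "Snack":    {"70-180": 0.3, "180-250": 0.7},
    "Sleep":    {"70-180": 0.9, "180-250": 0.1},
}

def infer_middle(prev_act, next_act, obs):
    """Return the candidate activity with the highest Definition-3 score."""
    scores = {a: A[prev_act][a] * A[a][next_act] * B[a][obs] for a in A}
    return max(scores, key=scores.get)

print(infer_middle("Dinner", "Exercise", "180-250"))  # Snack
```

With these toy numbers, a high BG observation between Dinner and Exercise points to Snack (score 0.5 · 0.6 · 0.7 = 0.21), outscoring the other candidates.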

Experimental setup
In order to assess the predictive performance of our proposed methods, we use a public dataset. In this section, we describe the dataset and explain the experimental design to assess the performance of the methods.

Experimental design
The dataset is provided already split into training and testing subsets, following splitting rules defined by the dataset's authors with the intention of allowing unbiased comparison of models developed based on it. For our experiments, the same rules were respected. Each subset contains different continuous days of glucose and event data collected from the same patients. The dataset includes not only the data most directly associated with the disease, such as glucose level and insulin use, but also self-reported life events. Each person is referred to by a randomly selected ID number, and the amount of logged events varies per person. In summary, the data relevant for the presented work includes: • Blood glucose (BG) level measured every 5 min (CGM).
For each patient of the dataset, the data we summarized here is retrieved from both training and testing subsets, to be used in the modeling and testing steps. During the training step, transition and emission matrices are created following the methods covered in Section 4, thus models for each of the approaches (Approach 1, 2 and 3) are developed per patient. For each model, we create one normalized confusion matrix [38], and calculate both sensitivity and specificity to compare their performances. The results concerning the generated models are presented in the next section (Section 6).

Results
From the data described in Section 5, we create three individual models per patient following the previously described approaches. The generated matrices for a sample patient are presented in Tables 4, 5, and 6. With the models in hand, the associated testing subset is then used for validation.
During the test step, all events from the testing subset for a certain patient are streamlined, resulting in one sequence of events ordered in time. From the beginning to the end of this sequence, every existing triple ⟨e_{i-1}, e_i, e_{i+1}⟩ is tested, for i = 2, …, n - 1, where n is the size of the whole sequence. For each triple, the activity of the middle event is masked and then inferred by the trained model. When the model makes a correct inference, it is counted as a success; otherwise, as a failure. This is done separately for each of the explored modeling approaches per patient.
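The masking loop over triples can be sketched as follows; the function name is ours, and the always-"Snack" baseline stands in for a trained model purely for illustration.

```python
from collections import Counter

def evaluate_triples(sequence, infer):
    """Slide over a test event sequence, mask each middle activity, and
    tally (actual, inferred) pairs, i.e., raw confusion-matrix counts."""
    pairs = Counter()
    for prev_a, actual, next_a in zip(sequence, sequence[1:], sequence[2:]):
        pairs[(actual, infer(prev_a, next_a))] += 1  # middle activity masked
    return pairs

# Trivial baseline "model" that always guesses Snack (illustration only).
seq = ["Lunch", "Snack", "Dinner", "Sleep"]
conf = evaluate_triples(seq, lambda prev_a, next_a: "Snack")
print(conf)
```

Normalizing each row of these counts by the row total yields the normalized confusion matrices reported in Fig. 9.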
To summarize the results obtained, we sum all successes and failures over all tested patients, and present them as a normalized multi-class confusion matrix [38] for each modeling approach (Fig. 9). Our confusion matrix is an M × M matrix, where M is the number of event types; in our case, M = 8. Each case in the test set has an actual class label (row) and an inferred class label (column). As a normalized matrix, each cell shows the number of cases inferred as the column label divided by the total number of cases with the actual label (the number in parentheses for each row). For instance, Fig. 9a shows the confusion matrix for Modeling Approach 1, which infers using a Markov chain, taking only state transitions into account; Figs. 9b and 9c are for Modeling Approaches 2 and 3, using HMMs with observed BG ranges and time windows, respectively. The total number of inferences made is depicted in parentheses for each label on the Actual axis.
For each confusion matrix, the diagonal shows the percentage of correct inferences made for each event type (row). Note that the percentage distribution (gradient coloring) varies for each model. The darker a cell on the diagonal is, the higher the accuracy in inferring the event type associated with the row. For the diagonal cells that are not so dark, it is also important to check the other cells in the same row: they indicate, when trying to infer that event type (row), the percentage of times the inference resulted in another event type (column). Following these results, we use sensitivity and specificity values as measures to compare the models. We also discuss why we believe that certain models perform better when inferring a subset of event types. Table 7 shows the calculated sensitivity and specificity for the inferences made; again, the sum of all results per patient is used. Although we consider both sensitivity and specificity as criteria for model selection, we give more weight to the former. For us, positives are critical, as it is most important to have correct predictions, and considerations over the negatives would not lead directly to activity discovery. Moreover, it is worth noticing that the specificity values for all three models are very similar.
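The per-class sensitivity and specificity used for this comparison can be computed one-vs-rest from the confusion counts; this sketch and its tiny example counts are our own, not the paper's figures.

```python
def sensitivity_specificity(conf, labels):
    """Per-class sensitivity and specificity from a confusion-count mapping
    keyed by (actual, inferred), treating each class one-vs-rest."""
    total = sum(conf.values())
    out = {}
    for c in labels:
        tp = conf.get((c, c), 0)                                    # hits on c
        fn = sum(conf.get((c, o), 0) for o in labels if o != c)     # missed c
        fp = sum(conf.get((a, c), 0) for a in labels if a != c)     # wrongly c
        tn = total - tp - fn - fp                                   # the rest
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        out[c] = (sens, spec)
    return out

# Toy counts: class A is predicted correctly 3 of 4 times, B 2 of 4 times.
conf = {("A", "A"): 3, ("A", "B"): 1, ("B", "B"): 2, ("B", "A"): 2}
print(sensitivity_specificity(conf, ["A", "B"]))
```

Sensitivity here is the per-row diagonal fraction of the confusion matrix, which is why it is the metric the discussion above weights most heavily.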
In general, judging by the sensitivity values, Modeling Approach 3 performed better than the others. Hence, from a macro perspective, the events have a more time-bound characteristic, i.e., the patients' events are largely steered by the time of day at which they most commonly happen. However, events like Exercise, Hypo-correction, and Snack - which, unlike meals, are not strongly tied to time intervals - demand a more in-depth analysis:
• Hypo-correction is commonly triggered by a natural patient-side observation of the BG level, which explains the better accuracy of Modeling Approach 2 for this event. Furthermore, for this same model, the wrong inferences regarding this event are shifted towards the other low-glucose control event considered: Snack.
• By comparing the confusion matrices of the three models, one can notice that Exercise has a certain degree of dependency on event order, time, and BG level, as no model outperformed the others.
Each developed model has a peculiar characteristic as its basis: Modeling Approach 1 relies solely on the event sequence order, while Modeling Approaches 2 and 3 have an expanded context that considers not only the sequence of events, but also the observed BG level and time, respectively. From the results, Meals and Sleep are better inferred by the third modeling approach, while Hypo-correction is better inferred by the second. This makes us believe that these types of events are bound to the characteristics of the observations used by each HMM, and at the same time, the inference made is heavily tied to such characteristics. This suggests that a hybrid or multi-layered observation approach could improve the performance of the models. For instance, taking Modeling Approach 3 as a base and adding BG level observations could improve the inference of BG-bound events (e.g., Hypo-correction). Additionally, events not naturally bound to any specific characteristic (e.g., Snack and Exercise) might also be better inferred by considering a combination of them.

Discussion
This section is dedicated to the discussion of three relevant points, presenting their limitations and possibilities with regard to reducing or surmounting them. First, as emphasized before, each patient has his/her own logged data and his/her own trained model. This can be seen as a particularly welcome feature that leads to personalized models. Nonetheless, it can also be taken as a limitation of the modeling, as it means that we are unable to reuse any existing data/model for new patients. New patients would be required to log events for a certain time period before any model could be trained. On the other hand, individual models can capture the nuances of different non-deterministic behaviors. A possible way of overcoming this limitation is that, if a profile could be set initially and more than one patient associated with this previously created profile, a more general model could be pre-trained per profile and would be able to suit the associated patients. This could be achieved by exploring the concept of patient similarity [39][40][41].

Table 7
Sensitivity and specificity values for the confusion matrices in Fig. 9.

Second, for the evaluation of the proposed approach, all models were trained using a real dataset. That being said, the same - and already exposed - problems regarding missing data are also expected to be present in this dataset. Thus, the imperfections of the training data can also be anticipated in the trained models. However, by analyzing the generated transition and emission matrices, "non-usual" transitions (e.g., from Sleep to Exercise) are associated with a very low probability of happening, allowing the created models to naturally cope with such disturbances.
Finally, our approach relies on knowing synthetically (i.e., by construction) when a missing event occurs between two other events. We narrowed our scope by limiting our approach to one part of a twofold problem; identifying when missing events happen is an outcome that must ideally come from a solution to the second part. A further research direction is to identify the time boundaries of such missing events, giving us the opportunity to recognize a missing event slot. This future step will reinforce the value of the presented approach and can be linked seamlessly to the steps proposed here.

Conclusions
This paper explores the problem of dealing with and mitigating missing event data in data recorded by diabetes patients. This is especially important due to the high effort expected from patients to properly log all required activities, which may not be fulfilled in practice, leading to missing events in the data streams. To address this problem, we presented an approach based on Markov models able to infer missing activities tied to events in an existing data sequence. The approach showed that it is possible to consider not only the order of the events, but also external observations, to improve the inference accuracy. Here, we presented two possible external observations: blood glucose (BG) level ranges, in which the BG signal was discretized based on six recommended ranges according to clinical diagnosis and safety, and time windows.
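Discretizing the BG signal into symbolic ranges can be sketched as a simple cut-point lookup. The boundaries and labels below are assumptions for illustration only; the paper's six clinically recommended ranges may use different cut points:

```python
import bisect

# Illustrative discretization of a BG reading (mg/dL) into six ranges.
# Five assumed cut points yield six bins; the paper's clinical
# boundaries may differ.
CUTS = [54, 70, 140, 180, 250]
LABELS = ["very-low", "low", "in-range-tight", "in-range",
          "high", "very-high"]

def discretize_bg(value_mgdl):
    """Map a raw BG reading to one of six symbolic range labels."""
    return LABELS[bisect.bisect_right(CUTS, value_mgdl)]

symbols = [discretize_bg(v) for v in [48, 65, 110, 205, 320]]
```

Each raw reading is thus reduced to one of six observation symbols before being fed to the models, which is what makes a discrete-emission Markov model applicable to the continuous BG signal.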
The choice of Markov modeling allows for insights into the nondeterministic behavior of a patient, which might contribute to a better and more complete representation of patient behavior. For the context explored in this paper, the use of hidden states creates a link between the BG signal and daily events, allowing other BG attributes to be taken as external observations and added to the models.
The presented approach is one step towards a solution for improving the quality of data from journals of diabetic patients. Models for glucose metabolism [42] and predictive decision support methods based on data from such journals can be improved with better-quality data containing fewer missing events. Such a solution should still take into account the identification of when missing events happen, and to what extent this can reduce the burden of logging multiple (correct) events on the patient side. Detecting missing events is one part of the problem intended to be tackled in future studies. In addition, future work will be devoted to extending the methodology to other possible observations, such as signals collected by accelerometer and/or heart rate monitoring sensors. This must be done not only by expanding the observation universe, but also by identifying useful ways of handling each added signal, i.e., how each additional signal must be discretized (partitioned) in order to be used by the developed models, while taking into account uncertainties in the definition of meaningful intervals and the probabilistic nature of the data [43]. Studies on the impact of the different interval lengths taken for such partitions, and the dynamics between them, are intended to be performed, and the consequences of such changes on the models analyzed.
Another point to be taken as future study is the generalization of the model for cases where patient profiles are a better fit than individual models. These profiles could be created by exploring the concept of patient similarity, which would allow for the identification of useful similarity functions able to link patients. Thus, similar patients could be grouped using similarities found in their data, and the clustered data used as input for the development of a per-cluster model that fits all members of each cluster.
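One hypothetical way to realize such grouping is to compare patients through the relative frequencies of their logged event types and greedily attach each patient to the first sufficiently similar group. The similarity function (cosine), the frequency vectors, and the threshold below are all illustrative assumptions, not a method from the paper:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two event-frequency vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rows: per-patient relative frequencies of [Sleep, Meal, Exercise, Snack]
# (made-up values for the sketch).
profiles = {
    "p1": np.array([0.40, 0.35, 0.05, 0.20]),
    "p2": np.array([0.42, 0.33, 0.06, 0.19]),   # similar routine to p1
    "p3": np.array([0.20, 0.30, 0.35, 0.15]),   # much more active patient
}

def group_patients(profiles, threshold=0.99):
    """Greedy grouping: attach each patient to the first group whose
    representative is similar enough, else start a new group."""
    groups = []   # list of (representative_id, member_ids)
    for pid, vec in profiles.items():
        for rep, members in groups:
            if cosine(profiles[rep], vec) >= threshold:
                members.append(pid)
                break
        else:
            groups.append((pid, [pid]))
    return groups

groups = group_patients(profiles)
```

Each resulting group would then pool its members' logs to pretrain one per-profile model, so that a new patient matching an existing profile could start with a usable model immediately.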

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.