Clustered multi-state models with observation-level random effects, mover-stayer effects and dynamic covariates: Modelling transition intensities and sojourn times in a study of psoriatic arthritis

In psoriatic arthritis, it is important to understand the joint activity (represented by swelling and pain) and damage processes because both are related to severe physical disability. This paper aims to provide a comprehensive investigation in to both processes occurring over time, in particular their relationship, by specifying a joint multi-state model at the individual hand joint-level, which also accounts for many of their important features. As there are multiple hand joints, such an analysis will be based on the use of clustered multi-state models. Here we consider an observation-level random effects structure with dynamic covariates and allow for the possibility that a subpopulation of patients are at minimal risk of damage. Such an analysis is found to provide further understanding of the activity-damage relationship beyond that provided by previous analyses. Consideration is also given to the modelling of mean sojourn times and jump probabilities. In particular, a novel model parameterization which allows easily interpretable covariate effects to act on these quantities is proposed.


Introduction
In psoriatic arthritis, manifestations of the disease typically result in joints becoming swollen and/or painful (active joints), which are reversible through treatment or management strategies or spontaneously, and may lead to permanent joint damage. The interplay between disease activity (as measured by activity in the joint) and damage is believed to be of a causal nature with a previous investigation performed by O'Keeffe et al. (2011) providing an extensive discussion on the topic. In that analysis, among others, individual joint level three-state models consisting of a not-active and not-damaged state, active and not-damaged state and an absorbing damaged state were proposed and produced strong evidence of a greatly increased transition rate to damage when a joint is active (compared with a joint being not active). The three-state models were fitted under a working independence assumption (the three-state processes are independent within an individual) with a robust covariance matrix used to adjust standard errors (Lee and Kim, 1998). The purpose of this paper is to extend the current modelling framework so that greater confidence with regard to the association between activity and damage can be achieved, and also to inform on other important clinical questions.
From a statistical point of view, it is important to adjust for observed and, where possible, unobserved characteristics which are believed to be strongly related to the processes of interest, i.e. confounder variables. An analysis that does not may produce spurious associations between included covariates and the outcome. Therefore, as extensions to O'Keeffe et al. (2011), dynamic covariates (covariates which describe important aspects of previous developments of the process) are included to allow current transition intensities to depend on previous history (relaxing the Markov assumption), random effects are introduced into the transition intensities to account for unobserved heterogeneity and provide a more efficient estimation procedure, and a mover-stayer model (Frydman, 1984) is considered to allow for the possibility that some patients (stayers) have no propensity to develop damaged joints. Whereas much research has focused on the effect of disease activity on joint damage, no research has yet considered the reverse association (i.e. the effect of damage on activity). Specifically, it is of interest to investigate whether the disease activity process changes with onset of damage, and how if so. To inform on the possible association, the absorbing damaged state is further subdivided into an active and damaged state and a not-active and damaged state, thereby allowing the disease activity process to be modelled even after a joint has become damaged. The resulting model then utilizes the entire data set, as opposed to previously where the disease activity process was stopped once a joint had become damaged. By considering a mover-stayer model, it is also possible to investigate whether the activity process is different between movers (those who have the propensity to develop damaged joints) and stayers, and this will also contribute new knowledge towards the relationship between damage and activity.
Clustered progressive multistate models constructed using random effects have previously been proposed in the panel data literature. See, for example, Cook et al. (2004), O'Keeffe et al. (2012) and Sutradhar and Cook (2008). In our context, these models introduce time invariant, possibly multivariate random effects at the patient level to account for the correlation between joints from the same patient, time invariant unobserved heterogeneity and relaxation of the Markov assumption. A novel feature of our work is the use of observation level multivariate random effects in clustered non-progressive multistate models to account for correlation and time varying unobserved heterogeneity and the introduction of dynamic covariates to relax the Markov assumption explicitly. The proposing of this random-effects structure was motivated by the extensive lengths of follow-up and the possibly non-predictable changes in unobserved heterogeneity due to treatment or management strategies employed by the clinic and the spontaneous nature of joint activity. Such observations are less likely to result in unobserved heterogeneity being time invariant or time varying but deterministic (which is enforced by patient level random effects). Along with generalized estimating equations, copulas (Diao and Cook, 2014) and expanded state space models  have also been proposed to handle clustering. Although there are considerable advantages to such models, they are particularly difficult to formulate and implement when more than two intermittently observed non-progressive multistate processes are of interest.
The natural multistate modelling parameterization allows covariates to act on transition intensities in a proportional hazards framework. Therefore easily interpretable covariate effects on these transition intensities can be obtained. Another natural way to view a multistate process is in terms of its sojourn times (the time spent in a state before a transition occurs) and jump probabilities (the probability of transitioning to a state given that a transition occurs). If these quantities are of interest, a model parameterization which allows easily interpretable covariate effects to act on these quantities would be useful, especially as current parameterizations may not enable such interpretation. We consider this issue to motivate a modification of the original three-state model. The next section introduces the psoriatic arthritis data on which this analysis is based.

Psoriatic arthritis data
Psoriatic arthritis is an inflammatory arthritis that is associated with the skin condition psoriasis. At the University of Toronto psoriatic arthritis clinic, over 1000 patients have been followed up longitudinally since it began in 1978 with clinic visits scheduled 6-12 months apart. In particular, at these visits, the active and damaged joint counts are recorded at the individual joint level, among other measurements, and therefore permit statistical modelling at this level of detail. In this investigation, focus will be on the 28 hand joints (14 joints in each hand; see Fig. 1 for more details), which can result in severe physical disability if active and/or damaged. Furthermore, this investigation is based on 743 patients who entered the clinic with no damage in either hand, so that patients are more comparable in their initial state of disease progression, and had more than two clinic visits. A dynamic covariate which requires previous observations will be constructed in the next section. Of this subset of patients, 69% (514 of 743 patients) had no damage at the end of their follow-up, which motivates consideration of a stayer population. The mean follow-up time was 10 years and 8 months with an interquartile range of 11 years and 6 months. The mean and median number of clinic visits were 12.7 and 8 respectively, and this ranged from 2 to 57. The mean and median intervisit times were 10 and 6 months, with a standard deviation of 1 year and 3 months. At clinic entry, the mean age at onset of arthritis was 36 years and 8 months with a standard deviation of 13 years and 4 months, whereas the mean duration of arthritis was 5 years and 2 months with a standard deviation of 7 years and 2 months. Furthermore, 55% of patients were male and 45% female.
In total, there were 264 208 observed transitions over all hand joints in the data. The observed transition matrix is whereĀ and A denote the absence and presence of activity in the joint respectively, andD and D denote whether the joint has been clinically assessed as not damaged and damaged respectively. The next section describes a six-state model which will be useful for jointly investigating the activity and damage processes.

Six-state model for transition intensities
Multistate models provide a convenient framework when the evolution of a stochastic process is of interest (Commenges, 1999;Anderson, 2002). This investigation demonstrates their use for the joint analysis of the disease activity and damage processes occurring in each individual hand joint. Specifically, consider the following four-state representation that is depicted in Fig. 2.
The process characteristics, particularly the reversibility of activity and permanent nature of damage, are reflected in the non-zero transition intensities which describe the instantaneous rate of transitioning between states. It is implicit that this representation describes the possible transitions of movers since λ 13 and λ 24 > 0. If, however, a stayer population exists with regard to developing damaged hand joints, their disease activity process can be described by the multistate diagram in Fig. 3.
Let λ l rsij denote the transition intensity from state r to s for the lth joint of the ith patient at the jth clinic visit (at each clinic visit all 28 hand joints are observed). Initially to investigate  specific clinical aspects described in Section 1 (and to formulate a more parsimonious model), the transition intensities are parameterized as follows: Thus regression coefficients are used to provide a simple representation of the main effect of damage and mover-stayer status on activity transitions as well as the main effect of activity on the damage transition. We note that more complex models involving interaction effects can be developed for estimating differential effects across various subgroups of patients. Furthermore, we let are constant baseline intensities, βĀ A , β AĀ and βD D are vectors of regression coefficients, z l ij is a vector of covariates that are associated with the lth joint from the ith patient at the jth clinic visit, and u ij and v ij are realizations of zero-mean bivariate normal observation level random effects. Here α ∈ R is an unknown parameter to be estimated which allows u ij to act differently across the different transition intensities associated with the activity process. Although not formally stated, we include time-dependent dynamic covariates in z l ij to relax the Markov assumption. Specifically, the observed history of the activity process is summarized through a joint-specific covariate denoted as the adjusted mean activity AMA (Ibañez et al. (2003); Fig. 4 provides a description), whereas a patient's state of disease progression is reflected through the current number of damaged joints attained. On average, AMA was calculated as 0.093 with a standard deviation of 0.19 in our data. Given these dynamic covariates, current transition intensities from the multistate process are then assumed independent of previous process history, i.e. the Markov assumption. The random effects are assumed independent across time (with respect to j) and can be seen as accounting for unobserved heterogeneity not due to previous process history (where adjustments to unobserved heterogeneity related to previous history are provided through the dynamic covariates), which is still unaccounted for in the model. It is worth noting that the explicitly specified regression coefficients in expressions (1) and (2) correspond to covariates with different modelling assumptions. The covariates in z l ij are assumed to remain constant between clinic visits and therefore are relevant when this is true (time invariant covariates) or a reasonable approximation (e.g. when the covariate process is unlikely to be highly fluctuating between clinic visits). For simplicity, such covariates are also usually included if understanding the relationship between these covariates and the outcome is not of primary interest but some form of adjustment for these covariates is necessary. In contrast, the regression coefficients β Damaged and β Active are describing the effect of a binary variable representing a joint being damaged and active respectively while reflecting the stochastic nature of these processes and therefore provide more realistic measures of association. This is especially useful because these are the clinical aspects of primary interest. The regression coefficient β Stayer is similar in nature to α as it describes the effect of a partially observable binary variable ; as x.t/ is intermittently observed and therefore the true path of x.t/ is not known, t 0 x.s/ ds is approximated as the area under the linearly interpolated observations of x.t/; for example, if a joint was observed at t D .0, 1, 2, 3, 5/ such that x.t/ D .0, 1, 0, 0, 1/ respectively, then AMA.5/ is approximated as the shaded area divided by 5, i.e. hence 0.4 (stayer = 1 and mover = 0); it can only be known that patients with damage are movers and that patients with no damage are either movers or stayers.
The motivation behind the use of observation level random effects arose from the potential need to allow for non-predictable changes in unobserved heterogeneity (Unkel et al., 2014) with regard to the damage and activity processes. The current methodology in the literature primarily uses patient level random effects. This forces unobserved heterogeneity to be time invariant and we assess this assumption in our context by fitting this model in addition to the model proposed. Likelihood values can then be used to compare informally the usefulness of the modelling framework proposed (models are non-nested but contain the same number of parameters; hence the same penalty terms are obtained if information criteria such as the Akaike information criterion are used). In general, the estimability of random-effects models, particularly variance components, will be driven by the random-effects structure and the variability of the data. The observation level random-effects structure incorporates fewer shared random effects between transition intensities than does its patient level counterpart. As a consequence, there will be less variability between transition intensities containing the same random effect, thus making it more difficult to estimate the random-effect variance when using the observation level random-effects structure. This would especially be so when a single multistate process is of interest, where substantial heterogeneity in the data (which could be generated by constraining transition intensities) will probably be needed for observation level random-effects models to be estimable, whereas patient level random-effects models are likely to require considerably less heterogeneity (Satten (1999) and Cook (1999) considered a single multistate process with patient level random effects). In a clustered multistate process framework, heterogeneity is also generated through the differences across several multistate processes; thus the observation level random-effects structure is far more likely to be estimable, especially as the number of processes increases.

Maximum likelihood estimation
The model proposed is fitted by constructing and then maximizing the marginal likelihood. Let X l i .t ij / denote the six-state process for the lth joint from the ith patient at time t ij , where {t i1 , : : : , t im i } denotes the times of the jth clinic visit from the ith patient. Let C i be a partially observable binary variable such that C i = 1 with probability 1 − π i if patient i is a mover (transitions are governed by the four-state model in Fig. 1), and C i = 0 with probability π i otherwise (transitions are governed by the two-state model in Fig. 2). Under the assumption that the conditional process (conditional on the piecewise constant dynamic covariates, {U ij = u ij , V ij = v ij } ∀ j and C i = c i ) of the lth joint from the ith patient is Markov and time homogeneous (although the marginal process is not assumed to be Markov and/or time homogeneous), the conditional likelihood contribution from the lth joint of the ith patient is where s l ij represents the state corresponding to the specific combination of .{Ā, A}, {D, D}/ observed at t ij for the lth joint of the ith patient. More details on the likelihood construction for time homogeneous Markov models, particularly the form of the transition probabilities, can be found in Kalbfleisch and Lawless (1985). Appendix A provides the closed form transition probabilities of the six-state process. Here, for simplicity, the likelihood contribution from the process between t i1 and t i2 is excluded because AMA cannot be calculated at t i1 ; it requires previous observations. As a diagnostic check of this simplification, we used activity at baseline to proxy AMA at baseline and this produced comparable results. If the assumption of independence between joints from the same patient is reasonable, conditional on the random effects U ij and V ij , then represents the likelihood contribution from the ith patient, still conditional on the dynamic covariates and C i = c i . Here Θ is a vector containing all the unknown parameters to be estimated (baseline intensities, regression coefficients and random-effects variance-covariance components) apart from the mover-stayer probabilities π i , φ.u ij , v ij ; 0, Σ/ denotes the zeromean bivariate normal density with covariance matrix Σ and z ij = {z 1 ij , : : : , z 28 ij }. The overall marginal likelihood contribution from the ith patient is then with the overall marginal likelihood L.Θ Å |z ij / obtained by taking the product of all likelihood contributions L i .Θ Å |z ij / from each patient. Here Θ Å = {Θ, π i } and c Å i is a binary indicator such that c Å i = 1 if damaged joints are observed from patient i at their last clinic visit and c Å i = 0 otherwise. The bivariate numerical integrations were computed by firstly factorizing the bivariate density function into conditional densities, i.e. φ{v ij ; ρu ij σ v =σ u , σ 2 v .1 − ρ 2 /}φ.u ij ; 0, σ 2 u /, where σ 2 v and σ 2 u denote the respective variance components and ρ the correlation parameter, then using Gauss-Hermite quadrature to evaluate each integral with respect to u ij and v ij separately. The numbers of quadrature points for each integration were chosen to be 15 and 30 for the observation and patient level random-effects models respectively. Weights and nodes from the quadrature rule were then calculated using the R (R Core Team, 2015) package statmod (Smyth et al., 2004). A sensitivity analysis indicated that further quadrature points provided negligible influence on parameter estimates and log-likelihood values. The log-likelihood was maximized using the BFGS (Broyden, 1970) optimization routine and asymptotic standard errors for parameter estimates were obtained by evaluating and then inverting the numerically derived Hessian matrix at the maximum likelihood estimates. The same estimation procedure was used in Yiu et al. (2017) to fit models that share the same essential characteristics with the models proposed in this paper. In their work, it was shown through simulation studies that, with a correctly specified model and sufficient follow-up information, the estimation procedure in this paper can provide reliable estimates of their model parameters.
The next subsection provides results of fitting the proposed model to the data described in Section 2.

Results
In addition to the aforementioned covariates, adjustment covariates for type of joint, presence of opposite or contralateral joint damage (the same joint in the opposite hand is damaged; opposite joint damaged equals 1 and 0 otherwise), sex (male ≡ 1 and female ≡ 0), age at onset of arthritis (in years) and duration of arthritis (in years) are provided through z l ij . Joint type is represented through a five-level categorical variable with levels metacarpophalangeal, proximal interphalangeal, distal interphalangeal, thumb metacarpophalangeal and baseline thumb proximal interphalangeal. This covariate was included in the transition intensities that were associated withĀ → A andD → D with preliminary analysis demonstrating little evidence of differential recovery rates from activity (i.e. A →Ā). The binary variable specifying the presence of opposite joint damage is motivated by previous analyses (Cresswell and Farewell, 2011;O'Keeffe et al., 2011) which indicate evidence of symmetric joint damage; the propensity of damage for a joint in a specific location to become damaged is increased if the contralateral joint in the other hand is earlier damaged. Table 1 presents the results from fitting the model proposed (with dynamic covariates and an observation level random-effects structure) to the 743 psoriatic arthritis patients described in Section 2. For comparative purposes, a model with patient level random effects, i.e. U ij = U i and V ij = V i , was also fitted. The results of this model and a comparison with the results of the model proposed are provided in Appendix B. The larger log-likelihood value of −47 389:11 for the model proposed compared with the log-likelihood value of −52 784:84 for the comparative model would suggest a preference for the model proposed.

Damage process
From Table 1, it is clear that both opposite joint damage and the number of damaged joints are strongly and positively associated with an increased damage progression rate. Thus this Table 1. Parameter estimates and 95% Wald intervals resulting from fitting the six-state model (described in Section 3) to 743 psoriatic arthritis patients

Parameter
Estimates for the following transitions: for a greater number of process features, although not adjusting for the stochastic nature of the opposite joint damage process. Activity, both current (the joint is active while adjusting for its stochastic nature) and history (as described by AMA), is seen to be strongly and positively associated with damage progression. Regarding inference, the confidence interval for the regression coefficient that is associated with current activity is narrower than the corresponding interval that was reported in O' Keeffe et al. (2011). This has probably resulted from using updated data and a more efficient estimation procedure (through dynamic covariates and random effects as opposed to a working independence assumption with a robust covariance matrix adjustment). These results therefore provide greater confidence in the strong positive association between activity and damage, and implicitly strengthens the argument that was made regarding causality.

Activity process
There is evidence that the transition intensities that are associated with entering and leaving the active joint state reduces once a joint has become damaged. However, as the respective (95% Wald intervals) confidence intervals contain or are close to zero, this observation must currently be regarded as suggestive. The presence of opposite joint damage and the number of damaged joints seem to be moderately or weakly associated with the activity transition intensities. In particular little association is seen with transitioning from the active joint state to the not-active joint state. The strong association between history of activity and current activity transition intensities is reassuring, since the interpretation of greater amounts of previous activity increasing the transition intensity to the active state while decreasing the transition intensity to the not-active state is intuitive.

Movers and stayers
The percentage of stayers (100π% where π = π i ∀ i) was estimated to be 14% (11%, 18%). Empirically, when compared with the 69% of patients who did not develop any damage, this estimate may seem to be a considerable underestimate of the true stayer proportion. However, because of the relationship between activity and damage, it is conceivable that many of these patients (those who did not develop damaged joints) did not develop damage because they were in the not-active state for long periods of continuous time, as opposed to being stayers per se. This observation is perhaps supported by Table 1, which suggests that movers have a vastly smaller transition intensity to the active state compared with stayers. A more specific investigation regarding the sojourn times of movers and stayers in the not-active joint state follows in the next section. It is also worth noting that the transition intensity to the not-active state is smaller for movers; however, it is far less pronounced. As a diagnostic check for parameter estimation, we plot the profile log-likelihood for π in of the profile log-likelihood for π. This indicates that the latent stayer proportion was identifiable under the assumptions of the six-state model in Section 3 because the profile log-likelihood has a quadratic shape.

Five-state model for mean sojourn times
In many settings, clinical interest lies in understanding the mean sojourn times from a particular time point (the mean amount of time that a process will spend in a state from a particular time point before a transition occurs) as a function of covariates at that particular time point. When two or more transitions are possible from a state of interest, the current methodology of investigating a covariate effect involves fixing other covariates at specified values (usually at their means or as a description of a particular patient) and then calculating the difference in mean sojourn times from a particular time point for that state by varying the covariate of interest. This methodology is implicit because, under the current multistate modelling parameterization in terms of transition intensities, a direct interpretation of covariate effects on the mean sojourn times is not straightforward when two or more transitions are possible from the state of interest; the mean sojourn time is a non-linear function of covariates from different transition intensities. This section considers the novel approach of modelling the mean sojourn times directly through a model reparameterization to obtain easily interpretable covariate effects on this quantity. In our situation, this approach is possible because there is a smooth bijection from the transition intensities to the mean sojourn times and jump probabilities (see the next paragraph). This implies that more elaborate but computationally intensive techniques such as the use of pseudo-observations (Anderson and Perme, 2010) can be avoided.
The specific context of interest concerns the sojourn times in the active and not-active states before damage. For simplicity, we can therefore revert to a three-state model for the movers by combining the activity process after damage has occurred into a single absorbing state, as depicted in Fig. 6. Similarly, the multistate diagram describing possible transitions for stayers is displayed in Fig. 7.
Although continuous time Markov processes can be viewed in terms of transition intensities specifying risks of transitioning through states (as thought of previously), another natural way to view such processes is in terms of sojourn times and jump probabilities being associated with each state. As mentioned, the sojourn time in a state from a particular time point describes the amount of time that the process will spend in that state from that particular time point before a transition occurs. Then, at the point of transitioning, the jump probabilities inform the multinomial distribution of the possible set of states in which the process will jump to next. For the conditional process that we consider (a continuous time Markov process), the sojourn time in state r from the time of entry into state r can be shown to have an exponential distribution with mean 1=Σ k =r λ rk and jump probabilities given by P.jumps to state k| jumps from state r/ = λ rk =Σ k =r λ rk . See pages 259-260 in Grimmett and Stirzaker (2001) for more details. Thus, by using the lack of memory property of the exponential distribution, we make the simplifying assumption that the sojourn times from a particular time point do not depend on the (unobserved) time that has already elapsed in that state conditional on time-dependent covariates and random effects at that particular time point and the time-independent mover-stayer status. To investigate the difference in the activity process between movers and stayers again more simply, we parameterize as follows: where μ l rij represents the mean sojourn time in state r for the lth joint of the ith patient from time t ij . Regression models for the mean sojourn times and jump probabilities can then be specified as follows: where p lĀ Dij and p l ADij denote the jump probabilities from stateĀ → D and A → D respectively for the lth joint of the ith patient at t ij . The right-hand side of each regression equation in expression (3) contains a baseline (indicated by 0 in the subscript) multiplied by the exponent of the sum of a linear predictor and linear function of realizations of random effects, as before. Dynamic covariates (AMA and the attained number of damaged joints) and random effects are again included to reflect features of the processes that were described in Section 3. The random effects follow a zero-mean bivariate normal distribution and are independent across time.

Results
Along with dynamic covariates, the presence of opposite joint damage, sex, age at onset of arthritis and duration of arthritis were included in the analysis, as before. Table 2 presents the results from fitting the five-state model that was described in Section 4 to the 743 psoriatic arthritis patients described in Section 2. From Table 2, the presence of opposite joint damage provides a slight increase in the mean sojourn time in the not-active joint state and greatly increases the probability of directly transitioning to damage (as opposed to active and not damaged) once a transition occurs. However, when a joint is active, there is little evidence to suggest that opposite joint damage influences the sojourn times nor the next state probability. These results indicate that the presence of opposite joint damage is particularly relevant when a joint is not active. A large number of damaged joints, although substantially increasing the jump probabilities to damage as opposed to active or not active, provides little effect on the mean sojourn times in the active and not-active states. As expected, greater amounts of previous activity (as described by AMA) decrease the mean sojourn time in the not-active state and increase the mean sojourn time in the active state. Table 2 also suggests that AMA is strongly and positively associated with the jump probability to damage when in the active state but not in the not-active state. Thus current activity is strengthened by the history of activity when dictating the next state of the process, but jumping to damage or  active from the not-active state could be unaffected by the history of activity. As hypothesized in the previous section, stayers have a far shorter mean sojourn time in the not-active state, and a slightly shorter sojourn time in the active state. Finally, the estimated stayer proportion in Table 2 is approximately equivalent to the stayer proportion that was reported in Table 1. Under the assumptions of the five-state model, the numerical optimization procedure could identify the parameter π as it converged at the maximum of its profile log-likelihood (Fig. 8). Table 3 provides the results of fitting the five-state model with a patient level random-effects structure (i.e. U ij = U i and V ij = V i ∀ j) to the data that were described in Section 2. The resulting log-likelihood value was −49 256:31, which is far smaller than −44 346:26 obtained from the model proposed. In this context, the observation level random-effects structure, after including dynamic covariates, again seems to be the more appropriate random-effects structure for the data.

Discussion
This research was motivated by reproducing prior results with an updated data set and undertaking new investigations into disease course and progression. For this, a single unifying clustered multistate modelling framework which allows simultaneous investigations of multiple clinical aspects was proposed. The results obtained therefore yield greater confidence when compared with multiple univariate investigations, which were performed previously, since they are based on adjusting for other important process characteristics. From a clinical perspective, the relationship between activity and damage was demonstrated as pronounced since both history and current activity were positively related to damage progression and jumping to damage once a joint immediately leaves the active state. In terms of the reverse relationship, the onset of damage is seen to slow the activity process although the confidence intervals for the relevant regression coefficients indicate that no change is a distinct possibility, maybe because of far fewer observed transitions after damaged has occurred. Interestingly, both models seem to identify a subpopulation of approximately 15% who are rapidly fluctuating in their activity process yet are at minimal risk of damage, perhaps because they have shorter sojourn times in the active joint state. An avenue of future work could involve identifying these patients especially because their treatment strategies should conceivably not consist of potent drugs, which may cause unpleasant side effects, but soft drugs to reduce joint swelling and pain. It is also reassuring that neither model contradicts any well-held clinical beliefs.
From a statistical point of view, the novel aspects of this research include the proposing of an observation level random-effects structure combined with dynamic covariates, a mover-stayer structure whereby movers and stayers can have different effects on transition intensities in which they are not implicitly defined for and a model parameterization which allows easily interpretable covariate effects to act on the sojourn times and jump probabilities. In our context, the usefulness of the methodology proposed was demonstrated through new clinical insights and substantial improvements in likelihood values over the use of standard methodology (patient level randomeffect models). Although the methodology proposed was described in terms of specific, but fairly complex, six-and five-state models for the margins, extensions to general clustered continuous time Markov models are straightforward. In particular, the proposed model parameterization in terms of sojourn times and jump probabilities, and mover-stayer effects on transition intensities are also applicable to univariate Markov multistate processes and therefore can provide a useful framework for inference in many clinical settings.

Results for the following transitions:
A