Abstract
In a Markov model the transition probabilities between states do not depend on the time spent in the current state. The present paper explores two ways of selecting the states of a discrete-time Markov model for a system partitioned into categories where the duration of stay in a category affects the probability of transition to another category. For a set of panel data, we compare the likelihood fits of the Markov models with states based on duration intervals and with states defined by duration values. For hierarchical systems, we show that the model with states based on duration values has a better maximum likelihood fit than the baseline Markov model where the states are the categories. We also prove that this is not the case for the duration-interval model, under conditions on the data that seem realistic in practice. Furthermore, we use the Akaike and Bayesian information criteria to compare these alternative Markov models. The theoretical findings are illustrated by an analysis of a real-world personnel data set.
Appendix: Proofs of theorems and lemmas
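Let us recall the definition of the following functions:
\[\phi (u,v)=\varphi (u)+\varphi (v)-\varphi (u+v), \qquad u,v\ge 0, \tag{12}\]
and
\[\varphi (u)=\begin{cases} u\ln u & \text{if } u>0,\\ 0 & \text{if } u=0. \end{cases} \tag{13}\]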
Lemma 1
\(\ln {\hat{L}_{\mathcal {S}}} = \phi (n_{11},n_{12})+\phi (n_{11}+n_{12},n_{13})+\phi (n_{22},n_{23})\)
Proof
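Using (4), (5) and the fact that \(n_2-n_{23}=n_{22}\), we obtain, with \(n_{1}=n_{11}+n_{12}+n_{13}\) and \(n_{2}=n_{22}+n_{23}\),
\[\begin{aligned} \ln {\hat{L}_{\mathcal {S}}} &= \varphi (n_{11})+\varphi (n_{12})+\varphi (n_{13})-\varphi (n_{1})+\varphi (n_{22})+\varphi (n_{23})-\varphi (n_{2})\\ &= \phi (n_{11},n_{12})+\phi (n_{11}+n_{12},n_{13})+\phi (n_{22},n_{23}). \end{aligned}\]
\(\square\)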
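As a numerical sanity check of Lemma 1, the following sketch compares a row-wise maximised multinomial log-likelihood with the three-term expression in \(\phi\). It assumes that \(\phi (u,v)=\varphi (u)+\varphi (v)-\varphi (u+v)\) with \(\varphi (u)=u\ln u\) (the form consistent with Lemmas 3 and 6) and that the likelihood factorises into one multinomial per state. The helper names and the counts are hypothetical and are not taken from the personnel data set.

```python
import math

def xlogx(u):
    """varphi(u) = u*ln(u), with varphi(0) = 0 (assumed form of (13))."""
    return u * math.log(u) if u > 0 else 0.0

def phi(u, v):
    """phi(u, v) = varphi(u) + varphi(v) - varphi(u + v) (assumed form of (12))."""
    return xlogx(u) + xlogx(v) - xlogx(u + v)

def max_loglik_rows(rows):
    """Maximised log-likelihood: sum of n_ij * ln(n_ij / n_i) over all rows."""
    total = 0.0
    for row in rows:
        n_i = sum(row)
        total += sum(n * math.log(n / n_i) for n in row if n > 0)
    return total

# Hypothetical transition counts, chosen only for illustration.
n11, n12, n13, n22, n23 = 40, 25, 5, 18, 7

direct = max_loglik_rows([[n11, n12, n13], [n22, n23]])
via_phi = phi(n11, n12) + phi(n11 + n12, n13) + phi(n22, n23)
print(direct, via_phi)
```

Under these assumptions the two printed numbers agree up to floating-point rounding.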
Lemma 2
\(\ln {\hat{L}_{\tilde{\mathcal {S}}}} = \phi (n_{11}+n_{12},n_{13})+\phi (n_{22},n_{23})\)
Proof
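It follows from (10) that
\[\ln {\hat{L}_{\tilde{\mathcal {S}}}} = (n_{11}+n_{12})\ln \frac{n_{11}+n_{12}}{n_{1}}+n_{13}\ln \frac{n_{13}}{n_{1}}+n_{22}\ln \frac{n_{22}}{n_{2}}+n_{23}\ln \frac{n_{23}}{n_{2}},\]
using \(n_{1}=n_{11}+n_{12}+n_{13}\) and \(n_{2}=n_{22}+n_{23}\). Hence,
\[\begin{aligned} \ln {\hat{L}_{\tilde{\mathcal {S}}}} &= \varphi (n_{11}+n_{12})+\varphi (n_{13})-\varphi (n_{1})+\varphi (n_{22})+\varphi (n_{23})-\varphi (n_{2})\\ &= \phi (n_{11}+n_{12},n_{13})+\phi (n_{22},n_{23}). \end{aligned}\]
\(\square\)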
Lemma 3
The function \(\phi\), defined by (12), is homogeneous of the first degree, i.e. \(\phi (tu,tv)=t\,\phi (u,v)\) for all \(t,u,v\ge 0\).
Proof
By (13), we have that \(\varphi (tu)=t\,\varphi (u)+u\,\varphi (t)\) for all \(t,u\ge 0\). The result then follows from (12). \(\square\)
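The two identities used in this proof can be checked numerically. The sketch below assumes \(\varphi (u)=u\ln u\) with \(\varphi (0)=0\) and \(\phi (u,v)=\varphi (u)+\varphi (v)-\varphi (u+v)\). The function names and the sample points are illustrative choices only.

```python
import math

def xlogx(u):
    """varphi(u) = u*ln(u), with varphi(0) = 0 (assumed form of (13))."""
    return u * math.log(u) if u > 0 else 0.0

def phi(u, v):
    """phi(u, v) = varphi(u) + varphi(v) - varphi(u + v) (assumed form of (12))."""
    return xlogx(u) + xlogx(v) - xlogx(u + v)

def product_rule_gap(t, u):
    """|varphi(t*u) - (t*varphi(u) + u*varphi(t))|, zero by (13)."""
    return abs(xlogx(t * u) - (t * xlogx(u) + u * xlogx(t)))

def homogeneity_gap(t, u, v):
    """|phi(t*u, t*v) - t*phi(u, v)|, zero by Lemma 3."""
    return abs(phi(t * u, t * v) - t * phi(u, v))

# Arbitrary sample points (t, u, v), including a boundary case u = 0.
samples = [(0.5, 3.0, 7.0), (2.5, 1.0, 1.0), (4.0, 0.0, 6.0)]
max_gap = max(
    max(product_rule_gap(t, u), homogeneity_gap(t, u, v))
    for t, u, v in samples
)
print(max_gap)
```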
Lemma 4
The function \(\phi\), defined by (12), is convex.
Proof
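Let \(u,v>0\). We prove that the Hessian matrix of \(\phi\) at (u, v), denoted H, is positive semi-definite. Using standard calculus,
\[H=\begin{pmatrix} \dfrac{v}{u(u+v)} & -\dfrac{1}{u+v}\\ -\dfrac{1}{u+v} & \dfrac{u}{v(u+v)} \end{pmatrix}. \tag{23}\]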
The eigenvalues of H are 0 and \(\frac{u^2+v^2}{uv(u+v)}\). They are both non-negative, hence H is positive semi-definite. \(\square\)
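The eigenvalue claim can be verified numerically. The sketch below assumes the Hessian entries obtained by differentiating \(\phi (u,v)=\varphi (u)+\varphi (v)-\varphi (u+v)\) with \(\varphi (u)=u\ln u\) twice. The helper `eigenvalues_sym2` is a hypothetical name for the closed-form eigenvalues of a symmetric \(2\times 2\) matrix.

```python
import math

def hessian(u, v):
    """Hessian of phi at (u, v), u, v > 0, under the assumed form of phi."""
    s = u + v
    return ((v / (u * s), -1.0 / s),
            (-1.0 / s, u / (v * s)))

def eigenvalues_sym2(h):
    """Eigenvalues of a symmetric 2x2 matrix: mean of diagonal -/+ radius."""
    (a, b), (_, d) = h
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

u, v = 2.0, 3.0  # arbitrary positive test point
low, high = eigenvalues_sym2(hessian(u, v))
expected_high = (u**2 + v**2) / (u * v * (u + v))
print(low, high, expected_high)
```

The zero eigenvalue reflects the first-degree homogeneity of \(\phi\): the function is linear along rays through the origin.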
Lemma 5
The function \(\phi\), defined by (12), satisfies
\[\phi (a+c,b+d)\le \phi (a,b)+\phi (c,d)\]
for all \(a,b,c,d\ge 0\), with equality if and only if \(ad=bc\).
Proof
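The inequality is an immediate consequence of the convexity and homogeneity properties of the function \(\phi\) (Lemmas 3, 4):
\[\phi (a+c,b+d)=2\,\phi \left( \tfrac{a+c}{2},\tfrac{b+d}{2}\right) \le 2\left[ \tfrac{1}{2}\phi (a,b)+\tfrac{1}{2}\phi (c,d)\right] =\phi (a,b)+\phi (c,d).\]
Suppose \(ad=bc\). If \(ad=0=bc\), then equality holds because \(\phi (u,v)=0\) if \(u=0\) or \(v=0\). If \(ad=bc\ne 0\), then \(c=ta\) and \(d=tb\) for some number \(t>0\). Hence, by the homogeneity of \(\phi\),
\[\phi (a,b)+\phi (c,d)=(1+t)\,\phi (a,b)=\phi ((1+t)a,(1+t)b)=\phi (a+c,b+d).\]
Now suppose equality holds. Then, by homogeneity of \(\phi\),
\[\phi \left( \tfrac{a+c}{2},\tfrac{b+d}{2}\right) =\tfrac{1}{2}\phi (a,b)+\tfrac{1}{2}\phi (c,d).\]
Consequently, the function \(f(t)=\phi (u,v)\) with \(u=a+t(c-a)\) and \(v=b+t(d-b)\) is linear on [0, 1], because \(\phi\) is convex. So, \(f''(t)=0\) for all \(t\in (0,1)\). Using the Hessian matrix of \(\phi\) in (23), we obtain
\[f''(t)=\frac{(ad-bc)^2}{uv(u+v)},\]
and the result \(ad=bc\) thus follows from \(f''(t)=0\). \(\square\)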
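The statement of Lemma 5 and the second-derivative argument can be illustrated numerically. The sketch below assumes \(\phi (u,v)=\varphi (u)+\varphi (v)-\varphi (u+v)\) with \(\varphi (u)=u\ln u\), and compares a central second difference of \(f\) with the closed form \((ad-bc)^2/(uv(u+v))\) that follows from the Hessian of \(\phi\). All numbers are arbitrary test values.

```python
import math

def xlogx(u):
    """varphi(u) = u*ln(u), with varphi(0) = 0 (assumed form of (13))."""
    return u * math.log(u) if u > 0 else 0.0

def phi(u, v):
    """phi(u, v) = varphi(u) + varphi(v) - varphi(u + v) (assumed form of (12))."""
    return xlogx(u) + xlogx(v) - xlogx(u + v)

def gap(a, b, c, d):
    """phi(a,b) + phi(c,d) - phi(a+c,b+d), non-negative by Lemma 5."""
    return phi(a, b) + phi(c, d) - phi(a + c, b + d)

def f(t, a, b, c, d):
    """phi along the segment from (a, b) to (c, d)."""
    return phi(a + t * (c - a), b + t * (d - b))

def second_difference(t, a, b, c, d, step=1e-4):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + step, a, b, c, d) - 2.0 * f(t, a, b, c, d)
            + f(t - step, a, b, c, d)) / step**2

def closed_form(t, a, b, c, d):
    """(ad - bc)^2 / (u v (u + v)) evaluated on the segment."""
    u = a + t * (c - a)
    v = b + t * (d - b)
    return (a * d - b * c) ** 2 / (u * v * (u + v))

proportional = gap(3, 5, 6, 10)       # ad = bc = 30, so equality is expected
non_proportional = gap(3, 5, 6, 2)    # ad = 6, bc = 30, so a strict gap
numeric = second_difference(0.5, 3, 5, 6, 2)
exact = closed_form(0.5, 3, 5, 6, 2)
print(proportional, non_proportional, numeric, exact)
```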
Lemma 6
For every \(y>0\), the function \(\phi _y:t \mapsto \phi (t,y)\), with \(\phi\) as defined in (12), is strictly decreasing and strictly convex on \((0,\infty )\).
Proof
Using standard calculus, \({\phi _y}'(t)=\ln {t}-\ln {(t+y)}<0\) and \({\phi _y}''(t)=\frac{y}{t(t+y)}>0\), if \(t>0\). \(\square\)
Lemma 7
If \(n_{AA}>2n_{AB}\), it holds that \(\psi (n_{AA}-\frac{3}{2}n_{AB},n_{AB})<0\).
Proof
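Let \(\alpha =n_{AA}/n_{AB}\). By (24) and Lemma 3,
\[\psi (n_{AA}-\tfrac{3}{2}n_{AB},n_{AB})=n_{AB}\,h(\alpha ),\]
where
\[h(t)=\phi _1(t-\tfrac{3}{2})-\phi _1(t)+\phi (\tfrac{1}{2},1)\]
and \(\phi _1\) is the function \(u \mapsto \phi (u,1)\). Since \(\phi _1\) is strictly convex (Lemma 6), its first derivative is strictly increasing and therefore h is strictly decreasing. Furthermore,
\[h(2)=2\,\phi (\tfrac{1}{2},1)-\phi (2,1)=\phi (1,2)-\phi (2,1)=0\]
by Lemma 3. Consequently, \(h(\alpha )<0\) since \(\alpha >2\). \(\square\)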
Lemma 8
If \(n_{AA}>2n_{AB}\), it holds that \(\psi (n_{AA}-\tfrac{3}{4}n_{AB},\tfrac{3}{4}n_{AB})<0\).
Proof
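Let \(\alpha =n_{AA}/n_{AB}\) and \(\beta =3/4\). By (24) and Lemma 3,
\[\psi (n_{AA}-\beta n_{AB},\beta n_{AB})=n_{AB}\,h(\alpha ),\]
where
\[h(t)=\phi (t-\beta ,\beta )+\phi (t,1-\beta )-\phi (t,1).\]
With the use of the function \(\phi _1:u \mapsto \phi (u,1)\) and Eqs. (13) and (12) we can rewrite h(t) as
\[h(t)=\phi _1(t-\beta )-\phi _1(t)+\varphi (\beta )+\varphi (1-\beta ).\]
Since \(\phi _1\) is strictly convex (Lemma 6), its first derivative is strictly increasing and therefore h is strictly decreasing. The result now follows from the fact that \(h(2)<0\) (numerically, \(h(2)\approx -0.198\)). \(\square\)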
Lemma 9
If \(n_{AA}\ge 3n_{AB}\), it holds that \(\psi (n_{AB},n_{AB})<0\).
Proof
Let \(\alpha =n_{AA}/n_{AB}\). By (24) and Lemma 3,
\[\psi (n_{AB},n_{AB})=n_{AB}\,h(\alpha ),\]
where
\[h(t)=\phi (1,1)+\phi _1(t-2)-\phi _1(t)\]
and \(\phi _1\) is the function \(u \mapsto \phi (u,1)\). Since \(\phi _1\) is strictly convex (Lemma 6), its first derivative is strictly increasing and therefore h is strictly decreasing. Furthermore, \(h(3)=\ln {\frac{16}{27}}<0\). Hence, since \(\alpha \ge 3\) the monotonicity of h yields \(h(\alpha )<0\) and the result follows. \(\square\)
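The value \(h(3)=\ln \frac{16}{27}\) and the scaling \(\psi (n_{AB},n_{AB})=n_{AB}\,h(\alpha )\) can be checked with a short script. It relies on two assumptions that are reconstructed from the surrounding proofs rather than quoted: \(\psi (x,y)=\phi (x,y)+\phi (x+y,n_{AB}-y)+\phi (n_{AA}-x-y,y)-\phi (n_{AA},n_{AB})\) and \(h(t)=\phi (1,1)+\phi _1(t-2)-\phi _1(t)\). The counts are hypothetical.

```python
import math

def xlogx(u):
    """varphi(u) = u*ln(u), with varphi(0) = 0 (assumed form of (13))."""
    return u * math.log(u) if u > 0 else 0.0

def phi(u, v):
    """phi(u, v) = varphi(u) + varphi(v) - varphi(u + v) (assumed form of (12))."""
    return xlogx(u) + xlogx(v) - xlogx(u + v)

def psi(x, y, n_aa, n_ab):
    """Assumed form of (24) after eliminating n13, n22 and n23."""
    return (phi(x, y) + phi(x + y, n_ab - y)
            + phi(n_aa - x - y, y) - phi(n_aa, n_ab))

def h(t):
    """Assumed auxiliary function of Lemma 9; requires t >= 2."""
    return phi(1.0, 1.0) + phi(t - 2.0, 1.0) - phi(t, 1.0)

n_aa, n_ab = 90.0, 20.0          # hypothetical counts with n_AA >= 3 n_AB
alpha = n_aa / n_ab
h_at_3 = h(3.0)
psi_value = psi(n_ab, n_ab, n_aa, n_ab)
print(h_at_3, math.log(16.0 / 27.0), psi_value, n_ab * h(alpha))
```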
Lemma 10
If \(n_{AA}> 2n_{AB}\), it holds that \(\psi (\tfrac{n_{AA}-n_{AB}}{2},n_{AB})<0\).
Proof
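Let \(\alpha =n_{AA}/n_{AB}\). By (24) and Lemmas 3 and 6,
\[\psi (\tfrac{n_{AA}-n_{AB}}{2},n_{AB})=n_{AB}\,h(\alpha ),\]
where
\[h(t)=2\,\phi _1\left( \tfrac{t-1}{2}\right) -\phi _1(t).\]
Using standard calculus,
\[h'(t)=\ln \frac{t-1}{t}<0 \quad \text{for } t>1,\]
so that the function h is strictly decreasing. Furthermore, \(h(2)=0\) which can be verified by straightforward computation. Hence, since \(\alpha >2\) the monotonicity of h yields \(h(\alpha )<0\) and the result follows. \(\square\)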
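The two facts used in this proof, \(h(2)=0\) and the strict monotonicity of h, can be checked numerically. The sketch assumes \(h(t)=2\phi _1(\frac{t-1}{2})-\phi _1(t)\), which is what (24) and Lemma 3 give under the assumed form \(\phi (u,v)=\varphi (u)+\varphi (v)-\varphi (u+v)\), and compares a central difference with the derivative \(\ln \frac{t-1}{t}\). The grid of evaluation points is arbitrary.

```python
import math

def xlogx(u):
    """varphi(u) = u*ln(u), with varphi(0) = 0 (assumed form of (13))."""
    return u * math.log(u) if u > 0 else 0.0

def phi(u, v):
    """phi(u, v) = varphi(u) + varphi(v) - varphi(u + v) (assumed form of (12))."""
    return xlogx(u) + xlogx(v) - xlogx(u + v)

def h(t):
    """Assumed auxiliary function of Lemma 10; requires t >= 1."""
    return 2.0 * phi((t - 1.0) / 2.0, 1.0) - phi(t, 1.0)

def central_difference(func, t, step=1e-6):
    """Symmetric finite-difference approximation of func'(t)."""
    return (func(t + step) - func(t - step)) / (2.0 * step)

values = [h(t) for t in (2.0, 2.5, 3.0, 5.0, 10.0)]
slope_numeric = central_difference(h, 3.0)
slope_exact = math.log(2.0 / 3.0)   # ln((t - 1) / t) at t = 3
print(values, slope_numeric, slope_exact)
```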
Lemma 11
\(\psi (0,\frac{n_{AA}n_{AB}}{n_{AA}+n_{AB}})=0\).
Proof
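Denote \(\rho =\frac{n_{AA}n_{AB}}{n_{AA}+n_{AB}}\). Then, \(n_{AB}-\rho = \tfrac{n_{AB}}{n_{AA}}\rho\) and \(n_{AA}-\rho =\tfrac{n_{AA}}{n_{AB}}\rho\), so that, using (24) and Lemma 3,
\[\begin{aligned} \psi (0,\rho ) &= \phi (\rho ,n_{AB}-\rho )+\phi (n_{AA}-\rho ,\rho )-\phi (n_{AA},n_{AB})\\ &= \left( \frac{\rho }{n_{AA}}+\frac{\rho }{n_{AB}}-1\right) \phi (n_{AA},n_{AB})=0. \end{aligned}\]
\(\square\)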
Lemma 12
\(\psi (0,n_{AB})=\psi (n_{AA}-n_{AB},n_{AB})>0\).
Proof
By (24), \(\psi (0,n_{AB})\) and \(\psi (n_{AA}-n_{AB},n_{AB})\) are both equal to \(\phi _{n_{AB}}(n_{AA}-n_{AB})-\phi _{n_{AB}}(n_{AA})\), where \(\phi _{n_{AB}}\) is the function \(u \mapsto \phi (u,n_{AB})\). According to Lemma 6, \(\phi _{n_{AB}}\) is strictly decreasing and the result follows. \(\square\)
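Lemmas 11 and 12 fix the sign of \(\psi\) at three boundary points, which can be confirmed numerically. The sketch assumes the form \(\psi (x,y)=\phi (x,y)+\phi (x+y,n_{AB}-y)+\phi (n_{AA}-x-y,y)-\phi (n_{AA},n_{AB})\), reconstructed from the elimination step in the proof of Theorem 1, and uses hypothetical counts.

```python
import math

def xlogx(u):
    """varphi(u) = u*ln(u), with varphi(0) = 0 (assumed form of (13))."""
    return u * math.log(u) if u > 0 else 0.0

def phi(u, v):
    """phi(u, v) = varphi(u) + varphi(v) - varphi(u + v) (assumed form of (12))."""
    return xlogx(u) + xlogx(v) - xlogx(u + v)

def psi(x, y, n_aa, n_ab):
    """Assumed form of (24) after eliminating n13, n22 and n23."""
    return (phi(x, y) + phi(x + y, n_ab - y)
            + phi(n_aa - x - y, y) - phi(n_aa, n_ab))

n_aa, n_ab = 90.0, 20.0                    # hypothetical counts
rho = n_aa * n_ab / (n_aa + n_ab)

at_rho = psi(0.0, rho, n_aa, n_ab)         # Lemma 11: expected to vanish
left = psi(0.0, n_ab, n_aa, n_ab)          # Lemma 12, first point
right = psi(n_aa - n_ab, n_ab, n_aa, n_ab) # Lemma 12, second point
print(at_rho, left, right)
```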
Lemma 13
For an AB-complete data set, we have that \(\hat{\tau }_{23}>\hat{p}_{AB}\) is equivalent to \(g(n_{11},n_{12})>0\), where
\[g(x,y)=n_{AB}\,x+(n_{AA}+n_{AB})\,y-n_{AA}n_{AB}.\]
Hence, if \(\hat{\tau }_{23}>\hat{p}_{AB}\), the point \((n_{11},n_{12})\) in the xy-plane lies above the line through the points \((0,\tfrac{n_{AA}n_{AB}}{n_{AA}+n_{AB}})\) and \((n_{AA},0)\).
Proof
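Because of AB-completeness, it holds that \(n_{23}=n_{12}\). Furthermore, \(n_{2}=n_{22}+n_{23}=n_{22}+n_{12}=n_{AA}-n_{11}\). Consequently, since \(\hat{\tau }_{23}=n_{23}/n_{2}\) and \(\hat{p}_{AB}=n_{AB}/(n_{AA}+n_{AB})\), we have
\[\hat{\tau }_{23}>\hat{p}_{AB} \iff \frac{n_{12}}{n_{AA}-n_{11}}>\frac{n_{AB}}{n_{AA}+n_{AB}} \iff n_{AB}\,n_{11}+(n_{AA}+n_{AB})\,n_{12}-n_{AA}n_{AB}>0,\]
where the left-hand side of the last inequality is \(g(n_{11},n_{12})\). \(\square\)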
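The equivalence in Lemma 13 can be checked exhaustively on a small grid of integer counts. The sketch assumes \(g(x,y)=n_{AB}x+(n_{AA}+n_{AB})y-n_{AA}n_{AB}\), which is determined only up to a positive factor by the line through the two points named in the lemma, and uses exact rational arithmetic so that boundary cases are not affected by rounding. The totals are hypothetical.

```python
from fractions import Fraction

def g(x, y, n_aa, n_ab):
    """Assumed linear function whose zero set is the line of Lemma 13."""
    return n_ab * x + (n_aa + n_ab) * y - n_aa * n_ab

def tau23_exceeds_p(n11, n12, n_aa, n_ab):
    """tau_23 > p_AB with n23 = n12 and n2 = n_AA - n11 (AB-completeness)."""
    n2 = n_aa - n11
    return Fraction(n12, n2) > Fraction(n_ab, n_aa + n_ab)

n_aa, n_ab = 90, 20   # hypothetical totals
mismatches = 0
checked = 0
for n11 in range(0, n_aa):                        # keeps n2 = n_AA - n11 >= 1
    for n12 in range(0, min(n_ab, n_aa - n11) + 1):
        checked += 1
        if tau23_exceeds_p(n11, n12, n_aa, n_ab) != (g(n11, n12, n_aa, n_ab) > 0):
            mismatches += 1

# Both points named in the lemma lie on the line g = 0.
rho = Fraction(n_aa * n_ab, n_aa + n_ab)
on_line = (g(0, rho, n_aa, n_ab), g(n_aa, 0, n_aa, n_ab))
print(checked, mismatches, on_line)
```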
Proof of theorem 1
Proof
In expression (17), we can eliminate the variables \(n_{13}\), \(n_{22}\) and \(n_{23}\), since \(n_{AA}=n_{11}+n_{12}+n_{22}\), and, by the AB-completeness assumption, \(n_{AB}=n_{13}+n_{23}\) and \(n_{12}=n_{23}\). Hence, \(\varDelta \ell \ell _{\mathrm{M}(k)\text{- }\mathrm{M}}=\psi (n_{11},n_{12})\), where
\[\psi (x,y)=\phi (x,y)+\phi (x+y,n_{AB}-y)+\phi (n_{AA}-x-y,y)-\phi (n_{AA},n_{AB}). \tag{24}\]
The function \(\psi\) is convex, since each of the first three terms in (24) is a convex function of (x, y), as a composition of the convex function \(\phi\) (Lemma 4) and an affine function from \(\mathbb {R}^2\) to \(\mathbb {R}^2\).
Let \(\alpha =n_{AA}-n_{AB}\). By assumption, \(\alpha >0\). Let \(\beta =\min \{n_{AB},\frac{\alpha }{2}\}\) and denote the following points in the xy-plane: \(\mathbf {a}(0,\frac{n_{AA}n_{AB}}{n_{AA}+n_{AB}})\), \(\mathbf {b}(\beta ,n_{AB})\), \(\mathbf {c}(\alpha ,n_{AB})\), \(\mathbf {d}(n_{AA},0)\), \(\mathbf {e}(n_{AA}-\tfrac{3}{2}n_{AB},n_{AB})\) and \(\mathbf {f}(n_{AA}-\tfrac{3}{4}n_{AB},\tfrac{3}{4}n_{AB})\). Because \(n_{AA}>2n_{AB}\), the point \(\mathbf {e}\) lies on the segment \(\mathbf {bc}\) and the point \(\mathbf {f}\) lies on the segment \(\mathbf {cd}\), so the triangle \(\mathbf {ecf}\) is a subset of the quadrilateral \(\mathbf {abcd}\), see Fig. 3.
Take \(k\ge 1\) sufficiently large, so that \(\hat{\tau }_{23}>\hat{p}_{AB}\). Then, using Lemma 13 and the fact that \(n_{11}\ge n_{12}\) whenever \(k\ge 1\), the point \(\mathbf {p}(n_{11},n_{12})\) in the xy-plane must be contained in the quadrilateral \(\mathbf {abcd}\).
Suppose \(\varDelta \ell \ell _{\mathrm{M}(k)\text{- }\mathrm{M}}\ge 0\). We prove that \(\mathbf {p}\) must then belong to the triangle \(\mathbf {ecf}\). First, we observe that \(\psi\) is non-positive on the pentagon \(\mathbf {abefd}\), because \(\psi\) is convex and \(\psi\) is non-positive in all vertices of \(\mathbf {abefd}\) (Lemmas 7, 8, 9, 10, 11, and \(\psi (n_{AA},0)=0\) at the vertex \(\mathbf {d}\)). Hence, because \(\mathbf {p}\in \mathbf {abcd}\) and \(\psi (\mathbf {p})=\varDelta \ell \ell _{\mathrm{M}(k)\text{- }\mathrm{M}}\ge 0\), we have \(\mathbf {p}\in \mathbf {ecf}\). Consequently, \(n_{11}>n_{AA}-\tfrac{3}{2}n_{AB}\) and \(n_{12}>\tfrac{3}{4}n_{AB}\). But then, (20) cannot be satisfied, since \(n_{AA}-n_{11}=n_{22}+n_{12}=n_{22}+n_{23}=n_{2}\). \(\square\)
Proof of theorem 2
Proof
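Denote \(n_{11}+n_{12}=a\), \(n_{13}=b\), \(n_{22}=c\) and \(n_{23}=d\). Then, \(n_{AA}=a+c\) and \(n_{AB}=b+d\). Hence, using (18), we have
\[\varDelta \ell \ell _{\widetilde{\mathrm{M}}(k)\text{- }\mathrm{M}}=\phi (a,b)+\phi (c,d)-\phi (a+c,b+d),\]
which is non-negative by Lemma 5.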
Furthermore, \(\varDelta \ell \ell _{\widetilde{\mathrm{M}}(k)\text{- }\mathrm{M}}=0\) is equivalent to \(\phi (a,b)+\phi (c,d)=\phi (a+c,b+d)\), which in turn is equivalent to \(ad=bc\) by Lemma 5. Finally, \(\hat{\tau }_{13}=\hat{\tau }_{23}=\hat{p}_{AB}\) if and only if \(ad=bc\), since \(\hat{\tau }_{13}=\tfrac{b}{a+b}\), \(\hat{\tau }_{23}=\tfrac{d}{c+d}\) and \(\hat{p}_{AB}=\tfrac{b+d}{a+c+b+d}\). \(\square\)
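The conclusion of Theorem 2 can be illustrated directly from maximised log-likelihoods, without going through \(\phi\). The sketch below assumes that each state of \(\widetilde{\mathrm{M}}(k)\) contributes a two-outcome (stay in A or move to B) log-likelihood and that the baseline model M pools both states. The function names and the counts are hypothetical.

```python
import math

def two_outcome_loglik(stay, leave):
    """Maximised log-likelihood stay*ln(stay/n) + leave*ln(leave/n), n = stay + leave."""
    n = stay + leave
    return sum(k * math.log(k / n) for k in (stay, leave) if k > 0)

def delta_loglik(a, b, c, d):
    """Log-likelihood gain of the two-state model over the pooled baseline.

    a = n11 + n12, b = n13, c = n22, d = n23, as in the proof of Theorem 2.
    """
    return (two_outcome_loglik(a, b) + two_outcome_loglik(c, d)
            - two_outcome_loglik(a + c, b + d))

# Hypothetical counts with tau_13 = 5/70 and tau_23 = 7/25 (unequal rates).
unequal = delta_loglik(65, 5, 18, 7)
# Hypothetical counts with tau_13 = tau_23 = p_AB = 1/11, so ad = bc.
equal = delta_loglik(60, 6, 20, 2)
print(unequal, equal)
```

With unequal estimated rates the gain is strictly positive, and it vanishes when both states share the pooled rate, in line with the condition \(ad=bc\).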
Proof of theorem 3
Proof
If \(n_{11}>0\) and \(n_{12}>0\), we have by the binomial theorem that \(n_{11}^{n_{11}}n_{12}^{n_{12}}< (n_{11}+n_{12})^{n_{11}+n_{12}}\), hence \(\phi (n_{11},n_{12})< 0\) using (12). The theorem now follows from (19) and the fact that \(\phi (n_{11},n_{12})=0\) if \(n_{11}=0\) or \(n_{12}=0\). \(\square\)
Cite this article
Carette, P., Guerry, MA. Markov models for duration-dependent transitions: selecting the states using duration values or duration intervals?. Stat Methods Appl 31, 1203–1223 (2022). https://doi.org/10.1007/s10260-022-00637-2
DOI: https://doi.org/10.1007/s10260-022-00637-2