Clustering of longitudinal curves via a penalized method and EM algorithm

Wang, Xin

doi:10.1007/s00180-023-01380-2

Original Paper
Published: 30 June 2023

Volume 39, pages 1485–1512 (2024)

Computational Statistics

Xin Wang ORCID: orcid.org/0000-0001-7801-1728¹


Abstract

This article proposes a new method for clustering longitudinal curves. Clusters of mean functions are identified through a weighted concave pairwise fusion method. The EM algorithm and the alternating direction method of multipliers (ADMM) algorithm are combined to estimate the group structure, mean functions and principal components simultaneously. The proposed method can also incorporate prior neighborhood information, through pairwise weights in the pairwise penalties, to produce more meaningful groups. In the simulation study, the proposed method is compared with some existing clustering methods in terms of accuracy in estimating the number of subgroups and the mean functions. The results suggest that ignoring the covariance structure substantially degrades both the estimate of the number of groups and the estimation accuracy. The effect of including pairwise weights is also explored in a spatial lattice setting to take spatial information into account. The results show that incorporating spatial weights improves performance. A real example is used to illustrate the proposed method.


References

Basu S, Banerjee A, Mooney RJ (2004) Active semi-supervision for pairwise constrained clustering. In: Proceedings of the 2004 SIAM international conference on data mining. SIAM, pp. 333–344
Bouveyron C, Côme E, Jacques J (2015) The discriminative functional mixture model for a comparative analysis of bike sharing systems. Ann Appl Stat 9(4):1726–1760
Bouveyron C, Jacques J (2011) Model-based clustering of time series in group-specific functional subspaces. Adv Data Anal Classif 5(4):281–300
Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Chi EC, Lange K (2015) Splitting methods for convex clustering. J Comput Graph Stat 24(4):994–1013
Chiou JM, Li PL (2007) Functional clustering and identifying substructures of longitudinal data. J R Stat Soc Ser B (Stat Methodol) 69(4):679–699
Chiou JM, Li PL (2008) Correlation-based functional clustering via subspace projection. J Am Stat Assoc 103(484):1684–1692
Coffey N, Hinde J, Holian E (2014) Clustering longitudinal profiles using P-splines and mixed effects models applied to time-course gene expression data. Comput Stat Data Anal 71:14–29
Daawin P, Kim S, Miljkovic T (2019) Predictive modeling of obesity prevalence for the US population. N Am Actuar J 23(1):64–81
de Amorim RC (2012) Constrained clustering with Minkowski weighted k-means. In: 2012 IEEE 13th international symposium on computational intelligence and informatics (CINTI). IEEE, pp. 13–17
De Boor C (2001) A practical guide to splines. Springer, New York, NY
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Fang K, Chen Y, Ma S, Zhang Q (2022) Biclustering analysis of functionals via penalized fusion. J Multivar Anal 189:104874
Foulds J, Kumar S, Getoor L (2015) Latent topic networks: a versatile probabilistic programming framework for topic models. In: International conference on machine learning. PMLR, pp. 777–786
Hales CM, Carroll MD, Fryar CD, Ogden CL (2017) Prevalence of obesity among adults and youth: United States, 2015–2016. NCHS data brief (288)
Huang H, Li Y, Guan Y (2014) Joint modeling and clustering paired generalized longitudinal trajectories with application to cocaine abuse treatment data. J Am Stat Assoc 109(508):1412–1424
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Ibrahim JG, Zhu H, Tang N (2008) Model selection criteria for missing-data problems using the EM algorithm. J Am Stat Assoc 103(484):1648–1658
Jacques J, Preda C (2013) Funclust: a curves clustering method using functional random variables density approximation. Neurocomputing 112:164–171
Jacques J, Preda C (2014) Functional data clustering: a survey. Adv Data Anal Classif 8(3):231–255
Jain AK (2010) Data clustering: 50 years beyond K-means. Pattern Recogn Lett 31(8):651–666
James GM, Hastie TJ, Sugar CA (2000) Principal component models for sparse functional data. Biometrika 87(3):587–602
James GM, Sugar CA (2003) Clustering for sparsely sampled functional data. J Am Stat Assoc 98(462):397–408
Jiang H, Serban N (2012) Clustering random curves under spatial interdependence with application to service accessibility. Technometrics 54(2):108–119
Li T, Song X, Zhang Y, Zhu H, Zhu Z (2021) Clusterwise functional linear regression models. Comput Stat Data Anal 158:107192
Li Y, Hsing T (2010) Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data. Ann Stat 38(6):3321–3351
Li Y, Wang N, Carroll RJ (2013) Selecting the number of principal components in functional data. J Am Stat Assoc 108(504):1284–1294
Luan Y, Li H (2003) Clustering of time-course gene expression data using a mixed-effects model with B-splines. Bioinformatics 19(4):474–482
Lv Y, Zhu X, Zhu Z, Qu A (2020) Nonparametric cluster analysis on multiple outcomes of longitudinal data. Stat Sin 30(4):1829–1856
Ma H, Liu C, Xu S, Yang J (2023) Subgroup analysis for functional partial linear regression model. Can J Stat 51(2):559–579
Ma S, Huang J (2017) A concave pairwise fusion approach to subgroup analysis. J Am Stat Assoc 112(517):410–423
Ma S, Huang J, Zhang Z, Liu M (2020) Exploration of heterogeneous treatment effects via concave fusion. Int J Biostat 16(1):20180026. https://www.degruyter.com/document/doi/10.1515/ijb-2018-0026/html
Miljkovic T, Wang X (2021) Identifying subgroups of age and cohort effects in obesity prevalence. Biom J 63(1):168–186
Ng SK, McLachlan GJ, Wang K, Ben-Tovim Jones L, Ng SW (2006) A mixture model with random-effects components for clustering correlated gene-expression profiles. Bioinformatics 22(14):1745–1752
Peng J, Müller HG (2008) Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions. Ann Appl Stat 2(3):1056–1077
Ramsay JO, Silverman BW (2005) Functional data analysis. Springer, New York
Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850
Redd A (2012) A comment on the orthogonalization of B-spline basis functions and their derivatives. Stat Comput 22(1):251–257
Ren M, Zhang S, Zhang Q, Ma S (2022) Gaussian graphical model-based heterogeneity analysis via penalized fusion. Biometrics 78(2):524–535
Sangalli LM, Secchi P, Vantini S, Vitelli V (2010) K-mean alignment for curve clustering. Comput Stat Data Anal 54(5):1219–1233
Sugar CA, James GM (2003) Finding the number of clusters in a dataset: an information-theoretic approach. J Am Stat Assoc 98(463):750–763
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol) 58(1):267–288
Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11:2837–2854
Wang H, Li R, Tsai CL (2007) Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika 94(3):553–568
Wang X, Zhu Z, Zhang HH (2023) Spatial heterogeneity automatic detection and estimation. Comput Stat Data Anal 180:107667
Xiao P, Wang G (2022) Partial functional linear regression with autoregressive errors. Commun Stat Theory Methods 51(13):4515–4536
Yao F, Müller HG, Wang JL (2005) Functional data analysis for sparse longitudinal data. J Am Stat Assoc 100(470):577–590
Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38(2):894–942
Zhang X, Zhang Q, Ma S, Fang K (2022) Subgroup analysis for high-dimensional functional regression. J Multivar Anal 192:105100
Zhou L, Huang JZ, Carroll RJ (2008) Joint modelling of paired sparse functional data using principal components. Biometrika 95(3):601–619
Zhou L, Sun S, Fu H, Song PXK (2022) Subgroup-effects models for the analysis of personal treatment effects. Ann Appl Stat 16(1):80–103
Zhu X, Qu A (2018) Cluster analysis of longitudinal profiles with subgroups. Electron J Stat 12(1):171–193
Zhu X, Tang X, Qu A (2021) Longitudinal clustering for heterogeneous binary data. Stat Sin 31(2):603–624
Zhu Y, Di C, Chen YQ (2019) Clustering functional data with application to electronic medication adherence monitoring in HIV prevention trials. Stat Biosci 11(2):238–261

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations

Department of Mathematics and Statistics, San Diego State University, 5500 Campanile, San Diego, CA, 92182, USA
Xin Wang

Corresponding author

Correspondence to Xin Wang.

Ethics declarations

Conflict of interest

No financial or non-financial interests are directly or indirectly related to the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This appendix presents the EM algorithm for a known group structure. The procedure is similar to the EM algorithm in James et al. (2000); the main difference is that a new design matrix is constructed from the given group information.

If the group structure is known, suppose there are $\tilde{K}$ groups and define $\tilde{\varvec{W}}$ to be an $n\times \tilde{K}$ matrix with elements $w_{ik}$, where $w_{ik}=1$ if $i$ is in the $k$th group and $w_{ik}=0$ otherwise. Also define $\varvec{W}=\tilde{\varvec{W}}\otimes \varvec{I}_{q}$ and $\varvec{U}=\varvec{B}_{0}\varvec{W}$. The least-squares estimate $\tilde{\varvec{\alpha }}=\left( \tilde{\varvec{\alpha }}_{1}^{T},\dots ,\tilde{\varvec{\alpha }}_{\tilde{K}}^{T}\right) ^{T}=\left( \varvec{U}^{T}\varvec{U}\right) ^{-1}\varvec{U}^{T}\varvec{Y}$ of the group coefficients $\varvec{\alpha } = (\varvec{\alpha }_1^T,\dots , \varvec{\alpha }_{\tilde{K}}^T)^T$ is used as the initial value of $\varvec{\alpha }$. Thus, $\tilde{\varvec{\beta }}_{i}=\tilde{\varvec{\alpha }}_{k}$ if $i$ is in the $k$th group. Define

$$\begin{aligned} \varvec{C}_{n}=\frac{1}{n}\sum _{i=1}^{n}\left( \varvec{\beta }_{i}^{*}-\tilde{\varvec{\beta }}_{i}\right) \left( \varvec{\beta }_{i}^{*}-\tilde{\varvec{\beta }}_{i}\right) ^{T}, \end{aligned}$$

where $\varvec{\beta }_i^*$ is obtained by the same procedure as in Remark 2. The eigendecomposition $\varvec{C}_{n}=\varvec{\Theta }_{0}\varvec{\Lambda }_{0}\varvec{\Theta }_{0}^{T}$ is then computed, where $\varvec{\Theta }_0$ and $\varvec{\Lambda }_0$ are the initial values of $\varvec{\Theta }$ and $\varvec{\Lambda }$, respectively.
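The initialization just described can be sketched in NumPy as follows. This is only an illustrative sketch under stated assumptions, not the author's implementation: it assumes that $\varvec{B}_{0}$ is the block-diagonal matrix of the per-curve basis matrices $\varvec{B}_{i}$, that the per-curve coefficients $\varvec{\beta }_i^*$ of Remark 2 are already available as the rows of an array `beta_star`, and that $\varvec{C}_{n}$ is formed as a $q\times q$ matrix so that it can be eigendecomposed. The function names `block_diag` and `initialize` are hypothetical.

```python
import numpy as np

def block_diag(blocks):
    """Place matrices along the diagonal: B_0 = diag(B_1, ..., B_n) (assumed form)."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def initialize(B_list, Y_list, labels, beta_star, P):
    """Initial alpha, Theta_0 and diag(Lambda_0) for a known group structure.

    B_list    : list of (n_i, q) basis matrices B_i
    Y_list    : list of length-n_i response vectors Y_i
    labels    : length-n integers, labels[i] = k if curve i is in group k
    beta_star : (n, q) array of per-curve coefficients (assumed given)
    P         : number of principal components to keep
    """
    labels = np.asarray(labels)
    n, q = len(B_list), B_list[0].shape[1]
    K = int(labels.max()) + 1
    W_tilde = np.zeros((n, K))
    W_tilde[np.arange(n), labels] = 1.0       # w_ik = 1 iff curve i is in group k
    W = np.kron(W_tilde, np.eye(q))           # shape (n*q, K*q)
    U = block_diag(B_list) @ W                # shape (sum n_i, K*q)
    Y = np.concatenate(Y_list)
    # Least squares; equals (U^T U)^{-1} U^T Y when U has full column rank.
    alpha = np.linalg.lstsq(U, Y, rcond=None)[0].reshape(K, q)
    beta_tilde = alpha[labels]                # beta_i = alpha_k for i in group k
    D = beta_star - beta_tilde
    C_n = D.T @ D / n                         # (q, q) average of outer products
    evals, evecs = np.linalg.eigh(C_n)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:P]       # keep the P largest
    return alpha, evecs[:, order], evals[order], U
```

Row `k` of the returned `alpha` holds $\tilde{\varvec{\alpha }}_{k}$, and the design matrix `U` is returned so that it can be reused in later updates.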

As in the proposed algorithm, the conditional distribution of $\varvec{\xi }_i$ is needed, which has the following form

$$\begin{aligned} \varvec{\xi }_{i}\vert \varvec{\Omega }\sim N\left( \varvec{m}_{i},\varvec{V}_{i}\right) , \end{aligned}$$

where $\varvec{m}_{i}$ and $\varvec{V}_{i}$ are the conditional mean and covariance of $\varvec{\xi }_{i}$, given by

$$\begin{aligned} \varvec{m}_{i} = &{ E\left[ \varvec{\xi }_{i}\vert \varvec{\alpha }, \varvec{\Theta }, \varvec{\lambda }, \sigma ^2 \right] }=\left( \varvec{\Theta }^{T}\varvec{B}_{i}^{T}\varvec{B}_{i}\varvec{\Theta }+\sigma ^{2}\varvec{\Lambda }^{-1}\right) ^{-1}\varvec{\Theta }^{T}\varvec{B}_{i}^{T}\left( \varvec{Y}_{i}-\varvec{U}_{i}\varvec{\alpha }\right) ,\\ \varvec{V}_{i} = &{ V\left[ \varvec{\xi }_{i}\vert \varvec{\alpha }, \varvec{\Theta }, \varvec{\lambda }, \sigma ^2 \right] }=\left( \frac{1}{\sigma ^{2}}\varvec{\Theta }^{T}\varvec{B}_{i}^{T}\varvec{B}_{i}\varvec{\Theta }+\varvec{\Lambda }^{-1}\right) ^{-1}. \end{aligned}$$

The only difference between the conditional distribution here and that in the proposed algorithm is that $\varvec{\alpha }$ is used instead of $\varvec{\beta }$, since the group structure is given.
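A minimal NumPy sketch of these two conditional moments is given below, assuming the residual $\varvec{Y}_{i}-\varvec{U}_{i}\varvec{\alpha }$ has already been formed and that $\varvec{\Lambda }$ is diagonal with entries $\varvec{\lambda }$. The function name `conditional_moments` and its argument names are hypothetical.

```python
import numpy as np

def conditional_moments(B_i, resid_i, Theta, lam, sigma2):
    """Return (m_i, V_i), the conditional mean and covariance of xi_i.

    B_i     : (n_i, q) basis matrix
    resid_i : (n_i,) residual Y_i - U_i alpha
    Theta   : (q, P) principal component coefficient matrix
    lam     : (P,) diagonal entries of Lambda (assumed diagonal)
    sigma2  : error variance sigma^2
    """
    G = Theta.T @ B_i.T @ B_i @ Theta              # (P, P)
    Lam_inv = np.diag(1.0 / lam)
    # m_i = (G + sigma^2 Lambda^{-1})^{-1} Theta^T B_i^T (Y_i - U_i alpha)
    m_i = np.linalg.solve(G + sigma2 * Lam_inv, Theta.T @ B_i.T @ resid_i)
    # V_i = (G / sigma^2 + Lambda^{-1})^{-1}
    V_i = np.linalg.inv(G / sigma2 + Lam_inv)
    return m_i, V_i
```

The two expressions are linked by $\varvec{m}_{i}=\sigma ^{-2}\varvec{V}_{i}\varvec{\Theta }^{T}\varvec{B}_{i}^{T}\left( \varvec{Y}_{i}-\varvec{U}_{i}\varvec{\alpha }\right)$, which gives a convenient numerical check on an implementation.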

Similarly, $\sigma ^2$ is updated by

$$\begin{aligned} \sigma ^{2}&=\frac{1}{\sum _{i=1}^{n}n_{i}}\sum _{i=1}^{n}\left( \varvec{Y}_{i}-\varvec{U}_{i}\varvec{\alpha }-\varvec{B}_{i}\varvec{\Theta }\hat{\varvec{m}}_{i}\right) ^{T}\left( \varvec{Y}_{i}-\varvec{U}_{i}\varvec{\alpha }-\varvec{B}_{i}\varvec{\Theta }\hat{\varvec{m}}_{i}\right) \\& \quad +\frac{1}{\sum _{i=1}^{n}n_{i}}\sum _{i=1}^{n}\operatorname{tr}\left( \varvec{B}_{i}\varvec{\Theta }\hat{\varvec{V}}_{i}\varvec{\Theta }^{T}\varvec{B}_{i}^{T}\right) . \end{aligned}\tag{25}$$
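The update in (25) can be sketched as below. This is an illustrative sketch that assumes the residuals $\varvec{Y}_{i}-\varvec{U}_{i}\varvec{\alpha }$ and the conditional moments $\hat{\varvec{m}}_{i}$, $\hat{\varvec{V}}_{i}$ have been computed beforehand. The function name `update_sigma2` is hypothetical.

```python
import numpy as np

def update_sigma2(B_list, resid_list, Theta, m_list, V_list):
    """Update sigma^2 as in (25).

    B_list     : list of (n_i, q) basis matrices B_i
    resid_list : list of (n_i,) residuals Y_i - U_i alpha
    Theta      : (q, P) principal component coefficient matrix
    m_list     : list of (P,) conditional means m_i
    V_list     : list of (P, P) conditional covariances V_i
    """
    total_obs = sum(B.shape[0] for B in B_list)    # sum of n_i
    acc = 0.0
    for B, r, m, V in zip(B_list, resid_list, m_list, V_list):
        BT = B @ Theta                             # (n_i, P)
        e = r - BT @ m                             # Y_i - U_i alpha - B_i Theta m_i
        acc += e @ e + np.trace(BT @ V @ BT.T)     # squared error + trace term
    return acc / total_obs
```

The trace term accounts for the uncertainty in $\varvec{\xi }_{i}$: it vanishes only when every $\hat{\varvec{V}}_{i}$ is zero, in which case the update reduces to the mean squared residual.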

Also, the same procedure is used to update $\varvec{\Theta }$ and $\varvec{\lambda }$, with

$$\begin{aligned} \tilde{\varvec{\theta }}_{j}&=\left( \sum _{i=1}^{n}\varvec{B}_{i}^{T}\varvec{B}_{i}\left( \hat{m}_{ij}^{2}+\hat{\varvec{V}}_{i}\left( j,j\right) \right) \right) ^{-1}\\& \quad \cdot \sum _{i=1}^{n}\varvec{B}_{i}^{T}\left[ \left( \varvec{Y}_{i}-\varvec{U}_{i}\varvec{\alpha }\right) \hat{m}_{ij}-\sum _{l\ne j}\varvec{B}_{i}\varvec{\theta }_{l}\left( \hat{m}_{il}\hat{m}_{ij}+\hat{\varvec{V}}_{i}\left( l,j\right) \right) \right] . \end{aligned}$$
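The update of $\tilde{\varvec{\theta }}_{j}$ can be sketched as follows, under the assumption that $\varvec{\theta }_{j}$ denotes the $j$th column of $\varvec{\Theta }$ and that the other columns are held at their current values. The function name `update_theta_column` is hypothetical, and the sketch uses zero-based column indices.

```python
import numpy as np

def update_theta_column(j, B_list, resid_list, Theta, m_list, V_list):
    """Update the j-th column of Theta with the other columns held fixed.

    B_list     : list of (n_i, q) basis matrices B_i
    resid_list : list of (n_i,) residuals Y_i - U_i alpha
    Theta      : (q, P) current principal component coefficient matrix
    m_list     : list of (P,) conditional means m_i
    V_list     : list of (P, P) conditional covariances V_i
    """
    q, P = Theta.shape
    lhs = np.zeros((q, q))
    rhs = np.zeros(q)
    for B, r, m, V in zip(B_list, resid_list, m_list, V_list):
        # Weighted Gram matrix: B_i^T B_i (m_ij^2 + V_i(j, j))
        lhs += (B.T @ B) * (m[j] ** 2 + V[j, j])
        # Contribution of the other components l != j
        cross = np.zeros(B.shape[0])
        for l in range(P):
            if l != j:
                cross += (B @ Theta[:, l]) * (m[l] * m[j] + V[l, j])
        rhs += B.T @ (r * m[j] - cross)
    return np.linalg.solve(lhs, rhs)
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse that appears in the formula.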

Finally, $\varvec{\alpha }$ is updated as

$$\begin{aligned} \tilde{\varvec{\alpha }}=\left( \varvec{U}^{T}\varvec{U}\right) ^{-1}\varvec{U}^{T}\left( \varvec{Y}-\varvec{B}_{0}(\varvec{I}_{n}\otimes \varvec{\Theta })\hat{\varvec{m}}\right) . \end{aligned}$$
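A short sketch of this last step is given below. It assumes that $\varvec{B}_{0}$ is block diagonal in the $\varvec{B}_{i}$, so that the stacked vector $\varvec{B}_{0}(\varvec{I}_{n}\otimes \varvec{\Theta })\hat{\varvec{m}}$ can be assembled curve by curve as $\varvec{B}_{i}\varvec{\Theta }\hat{\varvec{m}}_{i}$ without building the Kronecker product. The function name `update_alpha` is hypothetical.

```python
import numpy as np

def update_alpha(U, B_list, Y_list, Theta, m_list):
    """Update the stacked group coefficients alpha.

    U      : (sum n_i, K*q) design matrix B_0 W
    B_list : list of (n_i, q) basis matrices B_i
    Y_list : list of (n_i,) response vectors Y_i
    Theta  : (q, P) principal component coefficient matrix
    m_list : list of (P,) conditional means m_i
    """
    # Stacked B_i Theta m_i, i.e. B_0 (I_n kron Theta) m for block-diagonal B_0
    random_part = np.concatenate([B @ Theta @ m for B, m in zip(B_list, m_list)])
    Y = np.concatenate(Y_list)
    # Least squares; equals (U^T U)^{-1} U^T (Y - random_part) for full-rank U
    return np.linalg.lstsq(U, Y - random_part, rcond=None)[0]
```

One EM iteration, under these assumptions, would cycle through the conditional moments, $\sigma ^{2}$, the columns of $\varvec{\Theta }$, and this update of $\varvec{\alpha }$.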

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, X. Clustering of longitudinal curves via a penalized method and EM algorithm. Comput Stat 39, 1485–1512 (2024). https://doi.org/10.1007/s00180-023-01380-2

Received: 14 October 2022
Accepted: 14 June 2023
Published: 30 June 2023
Issue Date: May 2024
DOI: https://doi.org/10.1007/s00180-023-01380-2
