Application of Penalized Mixed Model in Identification of Most Associated Factors with Hemoglobin A1c Level in Type 2 Diabetes

Background: The effect of controlling blood sugar on reducing diabetes complications and their fatality has been investigated in many cross-sectional studies, but blood sugar and some of the factors that potentially affect it are unstable over time, which renders these studies imprecise and unreliable. Exploring a large number of possible covariates is another challenge, one that renders traditional methods inefficient. Therefore, we aimed to determine the factors most associated with HbA1c level among a large number of potential covariates, using a penalized linear mixed model in a longitudinal study. Method: The participants consisted of diabetic patients referred to the Endocrine and Metabolism Research Center of Isfahan from 2000 to 2012 who were measured 4-12 times. A linear mixed model with LASSO penalty was used to investigate the relationship between HbA1c and the factors which potentially affect HbA1c. SPSS version 18 and the glmmLasso package in R 3.1.3 were used for statistical analysis. Results: Most of the 360 patients (62.5%) were female. Their mean age was 52.2 years (SD=9.24) and the median number of visits was 5, with an inter-quartile range of 4 to 6. The simple mixed model revealed that all of the covariates had significant effects on HbA1c, but using LMMLASSO led to elimination of 8 redundant covariates from the final model. Conclusion: By applying a linear mixed model with LASSO penalty, retinopathy, hypertension, cholesterol, HDL and TG were found to have the most significant association with HbA1c level.


Introduction
Diabetes is the most prevalent metabolic and chronic non-communicable disease in the world; it reduces quality of life and causes complications, morbidity, and even mortality (Hosseini, Tazhibi, Amini, Zaree, & Hashemi, 1389; Sadeghi, Kharazmi, Javanbakht, Heidari, & Bayati, 2012). The prevalence of diabetes in Iran is 8%, which is the highest rate in the world (9.84% in males and 10.68% in females). Also, according to the World Health Organization, this figure will increase to 7 million in Iran by 2030 if appropriate preventive measures are not implemented (Khaledi, Moridi, & Gharibi, 2011; Mahmoudi, 2006). With careful control of blood sugar, many complications such as ischemic heart disease, stroke, retinopathy, neuropathy, and nephropathy can be delayed or even prevented (Mahmoudi, 2006). Traditional measures of blood sugar include fasting blood sugar (FBS) and glycosylated hemoglobin (HbA1c). Keeping HbA1c in the normal range is an important goal of diabetes treatment, although FBS is the more common measurement. A fasting state is not necessary for the HbA1c test, and this test "has greater pre-analytic stability and less biological perturbations compared with FBS", so it is a suitable diagnostic test (Chen, Magliano, & Zimmet, 2012). Additionally, FBS and HbA1c levels are strongly correlated (Kodiatte et al., 2012). Therefore, identifying the factors that affect HbA1c can guide health policy makers in establishing suitable preventive health policies.
Because HbA1c and some of the factors that potentially affect it are not stable over time, relying on a single observation per person is unwise, and a longitudinal design is needed. We therefore conducted a longitudinal study in which each person's HbA1c was measured in different months.
Mixed models are among the best statistical methods for analyzing longitudinal data because they can handle the dependency among observations that belong to the same individual, whereas disregarding this dependency decreases the precision of the model parameters (Fitzmaurice, Laird, & Ware, 2012). Despite all their advantages, mixed models, like other regression models, suffer under high dimensionality (when the number of variables is very large). To resolve this shortcoming, we applied a penalized mixed model in this study. The difference between traditional mixed models and the penalized mixed model is that the latter places no restriction on the number of covariates. One of the important properties of penalized models is that variable selection and estimation are done simultaneously, which yields more accuracy than common methods of variable selection (Groll & Tutz, 2014).
The univariate effect on HbA1c of many potential factors, such as physical activity, cardiometabolic risk, diastolic dysfunction, anxiety, depression and retinopathy, has been investigated in many cross-sectional studies (Beraki, Magnuson, Särnblad, Åman, & Samuelsson, 2014; Camara et al., 2015; Chaudhary, Aneja, Shukla, & Razi, 2015; Millar, Perry, & Phillips, 2015; Sabanayagam et al., 2014), but such studies are not appropriate for the reasons mentioned above. Also, in many longitudinal studies the effect of some potential factors on HbA1c was determined using simple (not mixed) statistical methods. For example, in a study done in 2015, the effect of high saturated fat, healthy diet and physical activity on HbA1c was evaluated through multiple linear regression (Jansen et al., 2015). In another cohort study, the association between HbA1c and microvascular complications was studied by applying life table analysis with the Wilcoxon (Gehan) log-rank test (Nordwall et al., 2015). Additionally, the relationship between HbA1c and heart failure was investigated in 2015 in a longitudinal study using proportional hazard regression (Parry et al., 2015). These methods, however, were not the best choice because they do not account for the correlation between repeated observations. Some longitudinal studies applied a traditional mixed model instead of the usual statistical methods mentioned above. For example, in a study of predictors of glycemic control among patients with type 2 diabetes, a mixed model was applied with seven predictors (Benoit, Fleming, Philis-Tsimikas, & Ji, 2005). Moreover, reduction in Hemoglobin A1c was evaluated through a mixed model (Bailey, Zisser, & Garg, 2007). In another study, a mixed model was applied for modeling HbA1c with five static and four non-static independent variables (Dallal, Hatalski, Trang, & Chernoff, 2011).
To the best of our knowledge, there is no study on the simultaneous effect of a large number of variables (about 20) on HbA1c, so the aim of our study was to find the factors which had the most effect on HbA1c level among a large number of variables by applying a penalized mixed model in a longitudinal study.
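The dependency that mixed models account for can be made concrete with the simplest case, a random-intercept model. The following is a sketch under assumed notation (the symbols $\sigma_b^2$ and $\sigma^2$ for the between-subject and residual variances are introduced here for illustration and do not appear in the original text):

```latex
% Random-intercept model for visit t of subject i (assumed notation)
y_{it} = x_{it}^{T}\beta + b_i + \varepsilon_{it}, \qquad
b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{it} \sim N(0, \sigma^2), \quad
b_i \text{ independent of } \varepsilon_{it}.

% Two visits t \neq s of the same subject share b_i, so they are correlated:
\operatorname{Cov}(y_{it}, y_{is}) = \operatorname{Var}(b_i) = \sigma_b^2,
\qquad
\operatorname{Corr}(y_{it}, y_{is}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.
```

Whenever $\sigma_b^2 > 0$, repeated HbA1c measurements on one patient are positively correlated, and a model that treats them as independent misstates the precision of the estimated coefficients.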

Material and Method
This was a retrospective cohort study, and the participants consisted of diabetic patients who were referred to the Isfahan Endocrine and Metabolism Research Center from 2000 to 2012. After omission of the missing data, 360 out of the 1500 diabetics who were referred to this center in this period were considered eligible for participating in the study, and the total number of records was 1942. The inclusion criteria were type 2 diabetic patients over 30 years old who had attended the training classes of this center, were referred to this clinic at least four times during the study period and at least once a year, and had an HbA1c test at each referral, with a maximum of 12 measurements. The patients' diabetes was diagnosed through ADA criteria: fasting plasma glucose ≥ 126 mg/dl, or oral glucose tolerance test ≥ 200, or random glucose ≥ 200 and symptoms (Association, 2013).

Outcome and Covariates
Independent variables were age, sex, weight, height, body mass index at the first visit; duration of diabetes; history of certain diseases during the follow-up; the type of treatment; systolic blood pressure (SBP) and diastolic blood pressure (DBP); blood lipids including cholesterol, triglyceride (TG), high-density lipoprotein (HDL), low-density lipoprotein (LDL); creatinine; ischemic heart disease (IHD); and complications such as retinopathy and proteinuria. Also, treatment type, comprising insulin, oral anti-diabetic drugs (OADs), OADs plus insulin, and regimen, was another independent variable.

Statistical Methods
The linear mixed model (LMM) was used as follows:

$$y_i = X_i\beta + Z_i b_i + \varepsilon_i, \quad i = 1, \dots, n,$$

where $X_i$ specifies the design matrix of the i-th cluster, $\beta^{T} = (\beta_1, \dots, \beta_p)$ is the fixed effects vector, $Z_i$ is the design matrix of the random effects, and $b_i$ is the random effects vector, which has a normal distribution with covariance matrix $Q(\rho)$. Also, $\phi$ is the dispersion parameter. The corresponding log-likelihood for the LMM is

$$l(\beta, \rho, \phi) = \sum_{i=1}^{n} \log \int f(y_i \mid \beta, \phi, b_i)\, p(b_i \mid \rho)\, db_i .$$

The LASSO (least absolute shrinkage and selection operator) was introduced by Tibshirani in 1996. By adding a penalty term to the log-likelihood of the LMM, the linear mixed model LASSO (LMMLASSO) is represented as follows for given $\lambda$:

$$l^{pen}(\beta, \rho, \phi) = l(\beta, \rho, \phi) - \lambda \sum_{j=1}^{p} |\beta_j| ,$$

where $\lambda$ is a positive constant known as the tuning parameter. The nature of this penalty leads to simultaneous estimation and variable selection. For estimating $\lambda$, the Akaike information criterion (AIC) was used (Groll & Tutz, 2014).
SPSS version 21 and the glmmLasso package in R 3.1.3 were used for statistical analysis. In the linear mixed model LASSO, statistical inferences are not based on P-values, because parameter selection and estimation are done simultaneously. A coefficient estimated as zero therefore implies a lack of statistical significance for that variable, and the non-zero coefficients indicate the significant variables (Groll & Tutz, 2014).
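How an absolute-value penalty performs estimation and selection at once can be illustrated with a minimal sketch. It assumes a deliberately simplified setting that is not the model fitted in this study: no random effects, a Gaussian response, and an orthonormal design, in which the LASSO solution reduces to soft-thresholding the unpenalized estimates. The function name `soft_threshold` and all numeric values are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam; entries within lam of zero become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Hypothetical unpenalized estimates for five standardized covariates
# (illustrative numbers only, not results from the study).
beta_unpenalized = np.array([0.9, -0.4, 0.15, -0.05, 0.3])

# Under an orthonormal design, minimizing
#   0.5 * ||y - X b||^2 + lam * ||b||_1
# gives b_j = soft_threshold(beta_unpenalized_j, lam) coordinate by coordinate.
for lam in (0.0, 0.1, 0.2, 0.5):
    beta_lasso = soft_threshold(beta_unpenalized, lam)
    n_selected = int(np.count_nonzero(beta_lasso))
    print(f"lambda={lam:.1f}  coefficients={np.round(beta_lasso, 2)}  selected={n_selected}")
```

As the tuning parameter grows, small coefficients are set exactly to zero while larger ones are only shrunk, which is why a single fit both estimates and selects.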
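The workflow of choosing the tuning parameter by AIC and then reading the non-zero coefficients as the selected variables can be sketched as follows. This is a simplified illustration under stated assumptions, not the glmmLasso procedure used in the study: it drops the random effects, uses simulated data, fits the LASSO by cyclic coordinate descent, and takes the number of non-zero coefficients as the degrees of freedom in a Gaussian AIC. All names (`lasso_cd`, `aic`, `true_beta`) are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam, setting small values exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.
    Assumes the columns of X are standardized and y is centered."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()                       # residual for the current beta (all zeros)
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]     # add back covariate j: partial residual
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]     # restore the full residual
    return beta

def aic(X, y, beta):
    """Gaussian AIC up to a constant, with df = number of non-zero coefficients (an assumption)."""
    n = len(y)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * int(np.count_nonzero(beta))

# Simulated data: 10 standardized covariates, only the first three truly matter.
rng = np.random.default_rng(0)
n, p = 300, 10
X = rng.standard_normal((n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)
true_beta = np.array([0.8, -0.5, 0.3, 0, 0, 0, 0, 0, 0, 0], dtype=float)
y = X @ true_beta + rng.standard_normal(n)
y = y - y.mean()

# Fit over a grid of tuning parameters and keep the fit with the smallest AIC.
lambdas = np.linspace(0.0, 0.5, 26)
fits = [lasso_cd(X, y, lam) for lam in lambdas]
scores = [aic(X, y, b) for b in fits]
best = int(np.argmin(scores))
selected = np.flatnonzero(fits[best])

print("lambda chosen by AIC:", round(float(lambdas[best]), 2))
print("indices of non-zero coefficients:", selected.tolist())
```

In the study itself the same idea is applied with random effects included, so the sketch shows only the logic of the selection step, not the estimation of the variance components.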

Results
Most of the patients (62.5%) were female. Their mean age (SD) was 52.2 (9.24) years, and the median number of visits was 5, with an inter-quartile range of 4 to 6. Descriptive statistics are presented in Table 1. A simple linear mixed model with random intercept was fitted to find the effect of each covariate on HbA1c. The simple mixed model revealed that all of the covariates had significant effects on HbA1c (the P-value for season was 0.006, for month it was 0.043, and for the other covariates it was less than 0.001).
In order to estimate the simultaneous effects of the covariates on the response, a linear mixed model with LASSO penalty (LMMLASSO) was fitted, and the AIC criterion was used to find the optimum value of lambda. Using LMMLASSO, 8 redundant covariates were eliminated from the final model as their estimated coefficients were zero. The presented model introduced retinopathy, hypertension, cholesterol, HDL and TG as the covariates that have the most significant effect on HbA1c (Table 2). The HbA1c of patients with retinopathy was 0.35 higher than that of patients without retinopathy. Also, HbA1c was 0.17 lower in patients with hypertension than in those without hypertension. In addition, a one-unit increase in TG and CHOL was associated with a 0.12 and 0.16 increase in HbA1c, respectively. For SBP, DBP and creatinine, the association was inverse, although it was very small (-0.06, -0.02 and -0.05, respectively). The effect of month was significant, such that HbA1c in each month was lower than in the first month of the year.

Discussion
We identified the factors that had the most effect on HbA1c among a large number of covariates in a longitudinal study. The linear mixed model with LASSO penalty introduced retinopathy, hypertension, cholesterol, HDL and TG as the factors most significantly associated with HbA1c, while in the univariate mixed models all the variables were significant. In this study, treatment was not an effective factor on HbA1c. The non-significant association is likely due to the fact that most of the patients with type 2 diabetes had undergone OAD treatment. The same might be true of the non-significant effect of proteinuria.
Our findings confirmed the results of previous studies. For example, in a study of the association between glycaemic control and serum lipid profile in type 2 diabetic patients, the correlations of HbA1c with cholesterol, TG and LDL were direct, and there was an inverse correlation with HDL. This relationship shows that HbA1c can be used as a good predictor of lipid profile in addition to being an indicator of long-term glycaemic control. In addition, the diabetic patients with poor glycaemic control (HbA1c >6%-9%) showed a significant increase in cholesterol and TG and a decrease in HDL, but no significant change in LDL was observed. This is in agreement with our non-significant association between HbA1c and LDL (Khan, Sobki, & Khan, 2007). Also, in a cross-sectional study by Vinod Mahato et al., positive and significant correlations were found between HbA1c and TG, LDL and HDL. They found a significant association between HbA1c and TG (Vinod Mahato et al., 2011).
Another finding of this study was the positive relationship between increasing levels of HbA1c and retinopathy. This was consistent with previous studies. In a study by Tam and colleagues, baseline HbA1c levels were higher in those suffering from retinopathy than in those without this complication (Tam, Lam, Chu, Tse, & Fung, 2009). In another study conducted in 2012, an association was observed between HbA1c and retinopathy (Kostev & Rathmann, 2013). Similarly, in an observational study conducted in 2000, the lowest risk for diabetic complications such as retinopathy was found in those with a normal range of HbA1c, and the incidence of these complications was significantly related to the levels of HbA1c (Stratton et al., 2000).
The results of our study showed a reduction of HbA1c in each month in comparison with the first month of the year. One explanation might be that the first month of the New Year includes the Nowruz holidays in Iran; people visit their families and, according to tradition, eat many nuts and cookies during these days. Another explanation might be fewer visits to doctors during these days, owing to the celebrations and the traditional belief that "if you visit doctors in the first month of the New Year, you will be sick in the other months of the year." Reduction in the risk of developing microvascular complications through standard glycemic control has been demonstrated in many studies; in contrast, a reduction in macrovascular disease risks such as ischemic heart disease through strict glycemic control has been difficult to demonstrate. This association was observed in neither the DCCT nor the Kumamoto study. This may explain the non-significant effect of CHD in our study (Kirkman et al., 2006).
One of the limitations of this study was the patients' irregular visits. In addition, roughly half of the observations were excluded from the analysis because the penalized mixed model cannot handle missing observations.
Many cross-sectional and longitudinal studies have investigated the association between one or a limited number of factors and HbA1c through simple statistical methods or a traditional mixed model. A key feature of this study was considering a large number of variables simultaneously using a penalized mixed model, which is more precise than the usual mixed models and can handle a very large number of variables even when the sample size is small. Also, data were gathered with high precision and according to reliable standards, so this study is of good quality in terms of both data gathering and statistical method.

Conclusion
In conclusion, the simple mixed model revealed that all of the covariates had significant effects on HbA1c, but using LMMLASSO led to elimination of 8 redundant covariates from the final model.

Table 1. Descriptive statistics of the studied variables

Table 2. Coefficients and standard errors of the penalized mixed model
*Non-significant variables; **standard error is not estimated for non-significant variables.