Prediction model for thyrotoxic atrial fibrillation: a retrospective study

Background: Thyrotoxic atrial fibrillation (TAF) is a recognized significant complication of hyperthyroidism. Early identification of individuals predisposed to TAF would improve the management of thyrotoxic patients. However, to our knowledge, no instrument is available that estimates an individual's risk of the condition. Therefore, the aim of this study was to build a TAF prediction model and rank TAF predictors in order of importance using machine learning techniques.
Methods: In this retrospective study, we investigated 36 demographic and clinical features in 420 patients with overt hyperthyroidism, 30% of whom had TAF. First, the association of these features with TAF was evaluated by classical statistical methods. Then, we developed several TAF prediction models with eight different machine learning classifiers and compared them by performance metrics. The models included ten features that were selected based on their clinical usefulness and importance for model output. Finally, we ranked the TAF predictors elicited from the final model using machine learning techniques.
Results: The prediction model with the best performance metrics was built with the extreme gradient boosting classifier. It had a reasonable accuracy of 84% and an AUROC of 0.89 on the test set. The model confirmed such well-known TAF risk factors as age, sex, hyperthyroidism duration, heart rate and some concomitant cardiovascular diseases (arterial hypertension and congestive heart failure). We also identified premature atrial contraction (PAC) and premature ventricular contraction (PVC) as new TAF predictors. The top five TAF predictors elicited from the model were, in order of importance, PAC, PVC, hyperthyroidism duration, heart rate during hyperthyroidism and age.
Conclusions: We developed a machine learning model for TAF prediction. It seems to be the first available analytical tool for TAF risk assessment. In addition, we defined the five most important TAF predictors, including premature atrial contraction and premature ventricular contraction as new ones. These results contribute to the investigation of TAF prediction and may serve as a basis for further research focused on improving TAF prediction and facilitating the management of thyrotoxic patients.


Background
Hyperthyroidism is associated with an increase in both total and cardiovascular mortality [1]. The majority of patients with hyperthyroidism are working-age individuals. Consequently, its negative social impact is highly significant [2].
Atrial fibrillation (AF) is the most common severe complication of hyperthyroidism. It is known to provoke both thromboembolic events and heart failure and to increase mortality [3]. The incidence of thyrotoxic AF (TAF) is 7-8% among middle-aged patients, 10-20% in seniors and 20-35% in those with coronary heart disease or valvular disease [4][5][6]. Hence, TAF prevention is a crucial problem.
Moreover, a few studies have mentioned the new TAF risk factors listed below. They are, undoubtedly, less explored and need to be confirmed. Obesity, chronic kidney disease, proteinuria, and increased levels of hepatic transaminases and C-reactive protein have been shown to raise TAF risk [12,18]. Conversely, the use of beta-blockers, angiotensin-converting enzyme inhibitors or antiarrhythmic drugs before hyperthyroidism is associated with a lower TAF frequency [9,12,18]. The findings regarding thyroid hormone levels have been controversial. Generally, studies of overt hyperthyroidism have not revealed an association between free triiodothyronine (fT3) or free thyroxine (fT4) levels and TAF frequency [4,13,14,19]. By contrast, some researchers have demonstrated that fT3 and fT4 [9], or fT4 exclusively [18], were higher among patients with TAF.
Therefore, to date, many TAF predictors are known, but the information appears to be insufficient and controversial. In addition, no TAF prediction tool has been developed. To the best of our knowledge, we are the first to have done so. Earlier, we published an article in Lecture Notes in Computer Science, within the framework of the International Conference on Computational Science, where the mathematical aspects of developing TAF prediction instruments are discussed in detail [20].
The purpose of this study was to build a TAF prediction model and rank TAF predictors in order of importance.
A TAF prediction model is an indispensable tool for the early identification of individuals at high risk of TAF. It would give practitioners the means to determine indications for more intensive medical care or early radical treatment of hyperthyroidism (total thyroidectomy, radioiodine therapy) [21][22][23]. This would ultimately lead to a decrease in TAF frequency. The practical implications of the current study are TAF prevention and, as a result, a decrease in health-care costs.
Since machine learning can improve the accuracy of prediction, and its application in the medical field has yielded promising results [24][25][26][27], we used it to develop our model. Machine learning is a data-driven approach that can identify nonlinear associations and complex interactions between variables without the need to specify these relationships a priori [28]. Thus, in modeling risk, machine learning does more than merely approximate physician skills: it finds novel relationships not readily apparent to human beings [25]. Starting with patient-level observations, algorithms sift through vast numbers of variables, looking for combinations that reliably predict outcomes [24]. All this makes machine learning an excellent method for constructing prediction instruments.

The data are presented as a mean ± standard deviation for normal distribution and as a median (interquartile range (IQR)) for abnormal distribution.

.2 Derivation of a thyrotoxic atrial fibrillation prediction model
We used machine learning techniques and Python 3.6 to develop a TAF prediction model.
The steps of the model development are described below. We removed the features of low importance for model output. We also eliminated the features of low clinical effectuality, such as serum potassium and lipids, since their concentrations are highly variable and strongly depend on the drugs taken and the diet. As a result, the ten most important and clinically feasible features were selected for the final model.

Preprocessing of the data:
Preprocessing of the data comprised the following steps: normalization (module sklearn.preprocessing.normalize), scaling (module sklearn.preprocessing.scale), resampling to balance the classes, and filling the data gaps.
3. Splitting the data: To evaluate the models' quality, we randomly divided the study sample into two parts: 70% (n=294) were used for the estimation of the models (training) and 30% (n=126) for the validation (testing).
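The scaling and splitting steps above can be illustrated with a minimal, library-free sketch. It is not the pipeline used in the study: the data are synthetic, the split is stratified by outcome (the text only says the division was random), and the scaler is fitted on the training part only, which is a common precaution rather than something the text states. All function and variable names are hypothetical.

```python
import random
from statistics import mean, pstdev

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)

def fit_scaler(column):
    """Learn the mean and standard deviation from training values only."""
    mu, sigma = mean(column), pstdev(column)
    return mu, (sigma if sigma > 0 else 1.0)

def apply_scaler(column, mu, sigma):
    """Standardize values with previously learned parameters."""
    return [(x - mu) / sigma for x in column]

# Synthetic stand-in for the cohort (assumption): 420 patients, about 30% with TAF.
labels = [1] * 127 + [0] * 293
rng = random.Random(42)
heart_rate = [rng.gauss(94, 14) for _ in labels]  # made-up values, not study data

train_idx, test_idx = stratified_split(labels)
mu, sigma = fit_scaler([heart_rate[i] for i in train_idx])
train_hr = apply_scaler([heart_rate[i] for i in train_idx], mu, sigma)
test_hr = apply_scaler([heart_rate[i] for i in test_idx], mu, sigma)
print(len(train_idx), len(test_idx))
```

With these synthetic class counts the split reproduces the 294/126 partition sizes reported above.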

Used classification machine learning algorithms:
We investigated the performance of the following machine learning methods: logistic
Partial dependence plot. It shows the marginal effect that one or two features have on the predicted outcome of a machine learning model [37]. To construct a partial dependence plot, a variable is selected and its value is varied continuously, while the resulting change in the predicted value is observed and recorded.
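The construction of a partial dependence curve can be sketched in a few lines. The example below is only an illustration under stated assumptions: `toy_model` is a hypothetical logistic risk function with invented coefficients, and the four patients are made up. It is not the study's model or data.

```python
import math

def toy_model(age, duration_months):
    """Hypothetical risk model; the coefficients are invented for illustration."""
    z = -3.0 + 0.04 * age + 0.03 * duration_months
    return 1.0 / (1.0 + math.exp(-z))

def partial_dependence(model, rows, feature, grid):
    """Average prediction over all rows when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = dict(row)       # copy, so the original row is untouched
            modified[feature] = value  # overwrite only the selected feature
            preds.append(model(**modified))
        curve.append(sum(preds) / len(preds))
    return curve

# Made-up patients (assumption), used only to average over.
patients = [
    {"age": 25, "duration_months": 6},
    {"age": 40, "duration_months": 12},
    {"age": 55, "duration_months": 30},
    {"age": 68, "duration_months": 48},
]
age_grid = [20, 30, 40, 50, 60, 70]
pd_age = partial_dependence(toy_model, patients, "age", age_grid)
for age, risk in zip(age_grid, pd_age):
    print(f"age={age}: mean predicted risk={risk:.3f}")
```

A two-feature plot follows the same idea, with a grid over pairs of values instead of single values.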

.3 Investigation of the TAF predictors elicited from the model
We used feature importance and SHAP values methods to rank and select the most important TAF predictors elicited from the model.
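To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values by enumerating all feature coalitions for a tiny hypothetical scoring function. This brute-force calculation is an illustration only: it is not the algorithm of the SHAP library, it represents an "absent" feature by a single baseline value (a simplification), and the function `toy_predict`, its coefficients and the feature coding are invented.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition take their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        point = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(point)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

def toy_predict(p):
    """Hypothetical risk score with one interaction term (invented numbers)."""
    return 0.05 + 0.004 * p["duration"] + 0.2 * p["pac"] + 0.1 * p["pac"] * p["pvc"]

patient = {"duration": 30, "pac": 1, "pvc": 1}   # made-up patient
baseline = {"duration": 10, "pac": 0, "pvc": 0}  # made-up reference point
phi = shapley_values(toy_predict, patient, baseline)
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
print(ranking, phi)
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the property that allows features to be ranked by the absolute size of their contribution.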

Characteristics of the study group
The study cohort consisted of 420 subjects with a history of overt hyperthyroidism, 79.3% women and 20.7% men, whose mean age at the onset of hyperthyroidism was 44.3±12.1 years. 94% of patients had GD, others had nonimmune thyroid pathology: TA or MNG.
Detailed characteristics of the study population are shown in table 2.
TSH level was lower than the detection limit of 0.01 μIU/l in the majority of cases. When calculating the median for the group, it was considered that these individuals had TSH level of 0.01 µIU/l. The median, thereby, was presented as <0.014 µIU/l (table 2).
The lipid panel assessment showed that the mean TC, LDL and TG levels were within the target range (for low or moderate cardiovascular risk). The mean HDL level for both men and women was at the lower limit of the target range.
The proportion of diabetes cases was high due to the large number of patients with diabetes at Almazov centre and Pavlov University. They were enrolled in the study because they had hyperthyroidism as a secondary diagnosis. The median heart rate during hyperthyroidism in the study cohort was 94 bpm (IQR 85; 103.5 bpm). Sinus tachycardia (heart rate ≥90 bpm) was found in 64.3% of participants.
Regarding TAF, we intentionally enrolled TAF subjects in the study cohort, which explains the abnormally high percentage (30.2%) of these patients in our sample.

3. Thyrotoxic atrial fibrillation prediction models
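The two performance metrics reported for the prediction models, accuracy and AUROC, can be computed as in the minimal sketch below. The labels and predicted probabilities are made up for illustration, the 0.5 decision threshold is an assumption, and the function names are hypothetical. The sketch does not reproduce the study's results.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of patients whose thresholded prediction matches the true label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

def auroc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Made-up test set (assumption): 1 = TAF, 0 = no TAF.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

acc = accuracy(y_true, y_prob)
auc = auroc(y_true, y_prob)
print(f"accuracy={acc:.2f}, AUROC={auc:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUROC summarizes how well the model orders patients across all thresholds, which is why both are usually reported together.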

Interpretation of the prediction models
In this section we present the results of applying three interpretability techniques for our TAF prediction model. They are as follows: Feature Importance, Shapley Values and Partial Dependence Plot.
Feature importance method
Figure 1 shows the ranking of the input features by importance. As shown in the figure, the feature other heart rhythm disorders during hyperthyroidism is the most important one, followed by PAC and PVC during hyperthyroidism. The variable relapses of hyperthyroidism is the least significant feature.
Figure 3 provides the interpretation of the model prediction for one random patient. We highlighted the variables that had a strong impact on the model prediction for the patient. The influence values of the features were calculated by the SHAP method. Features increasing TAF probability were marked in red, those reducing it in blue. Heart rate during hyperthyroidism of 98 bpm and PAC during hyperthyroidism increased the probability of TAF most strongly. Features reducing the probability of TAF for this particular patient were as follows: short duration of hyperthyroidism (Duration of HT = 9), absence of PVC (PVC during HT = 1), absence of arterial hypertension during hyperthyroidism (AH during HT = 1) and heart rate-reducing therapy during hyperthyroidism (HRRT during HT = 2). The duration of hyperthyroidism had the strongest absolute influence on the resulting value. As a result, a TAF development probability of 7% was calculated for this patient.
3. Partial Dependence Plot method
Figure 4 shows the combined effect of two predictors. This effect was calculated by the Partial dependence plot method. The scale shows how changes in age and hyperthyroidism duration alter TAF probability, provided the values of the other features are fixed. If a patient was older than 33 and hyperthyroidism duration was more than 20 months, the patient's risk of developing TAF exceeded 0.5. The probability of TAF increased as the values of these two features increased. The minimal risk value was 0.16, for patients younger than 20 with a short period of hyperthyroidism. The maximal risk value was 0.7, for patients older than 60 with a period of hyperthyroidism of over 40 months.

Top thyrotoxic atrial fibrillation risk factors elicited from the prediction model
The next aim of the study was to rank TAF predictors by the importance value and identify

Discussion
High TAF prevalence among hyperthyroid patients [4][38][39][40] and the lack of any TAF prediction system motivated this research. To the best of our knowledge, we developed the first TAF prediction model. The top five risk factors emerging from our model include age, hyperthyroidism duration, PAC, PVC and heart rate during hyperthyroidism.
We believe a TAF prediction tool would be of great use. It would help determine indications for early radical hyperthyroidism treatment and improve patient counselling and management [21][22][23].
We used machine learning methods to build the model. Among the eight machine learning classifiers evaluated, the XGB classifier achieved the best performance metrics. Our final XGB model had a reasonable accuracy of 84% and good discrimination ability, with an AUROC of 0.89 on the test set. Among 36 investigated potential TAF predictors, ten were selected as input variables for the model. The variables were ranked by the feature importance and SHAP methods (figures 1 and 2). These methods calculate the importance value in different ways and, therefore, could produce differing results [36]. The prediction model takes into account the characteristics of the variables displayed by both methods. Hereafter, we discuss the predictors inferred from the model in comparison with previous findings in the field.
To begin with, we consider our findings on rhythm disorders as TAF predictors. Both the feature importance and SHAP methods showed that PAC and PVC during hyperthyroidism are among the five most important TAF risk factors (figures 1 and 2). It seems that the impact of PAC and PVC on TAF had not been investigated before, and in our study they were defined as novel TAF predictors.
We would like to emphasize that our model confirms such widely acknowledged TAF risk factors as age, sex and hyperthyroidism duration. According to the SHAP method, hyperthyroidism duration had the highest impact on model output, while age and sex ranked fourth and seventh out of ten factors, respectively (figure 2). In contrast, the feature importance method shows that hyperthyroidism duration had an intermediate importance value among the ten input variables, and age and sex were almost the least important factors (table 1).
The next known TAF risk factor is heart rate. Earlier, a heart rate above 80 bpm was mentioned as a TAF predictor [12]. However, our findings were dissimilar. Machine learning methods showed a nonlinear relationship between heart rate and TAF. Figure 2 shows that a low heart rate reduces TAF risk and medium values mostly increase it, but the highest values have a minimal impact on model output. The latter phenomenon could be due to the scarcity of the information obtained. We had only several heart rate measurements from medical records, which may not reflect the actual heart rate.
Concomitant cardiovascular disease is another TAF predictor. As early as 1959, G. Sandler and G.M. Wilson showed that TAF frequency was significantly higher in patients with cardiovascular diseases preceding hyperthyroidism [15]. According to more recent studies, coronary heart disease, congestive heart failure and high blood pressure significantly increase TAF risk [4,7,12,17]. Our findings on cardiovascular diseases were mixed. On the one hand, we showed that the presence of hypertension and congestive heart failure (both before and during hyperthyroidism) raises TAF risk. Moreover, among them, arterial hypertension during hyperthyroidism was the only variable sufficiently important to be included in the model. On the other hand, contrary to the majority of studies [4,12,17], we did not find coronary heart disease or a history of myocardial infarction to predict TAF.
Next, we would like to consider a less investigated TAF predictor, namely the use of heart rate-lowering drugs. It is worth noting that we explored heart rate-reducing therapy before and during hyperthyroidism as two separate variables. All patients treated before hyperthyroidism and 97% of those treated during hyperthyroidism received beta-blockers as this therapy. The heart rate-reducing therapy before hyperthyroidism had a minimal impact on TAF prediction according to machine learning methods and, based on that, was excluded from the prediction model. It might be of interest to note that the classical statistical methods showed that the patients receiving beta-blockers before hyperthyroidism were more prone to TAF.
However, there is some evidence that beta-blockers could decrease TAF incidence [12,18]. The divergent results could be explained by the following fact. In our study, almost all the participants who received beta-blockers before hyperthyroidism had concomitant cardiovascular diseases. All of them had arterial hypertension and 78.8% had coronary heart disease. These cardiovascular diseases are known to contribute significantly to TAF. The heart rate-reducing therapy during hyperthyroidism was included in the model. We found this therapy to decrease TAF risk. Therefore, beta-blocker use during hyperthyroidism might be an effective TAF preventive measure.
Lastly, we consider the number of hyperthyroidism relapses, which was not previously mentioned in the literature as a TAF predictor. It was the least important variable of the model (figures 1 and 2); therefore, we did not single it out as a new TAF risk factor.
The study comes with some limitations. Firstly, being retrospective, the research lacks randomization, which would offset unknown or unrecorded effects on the studied features. Secondly, the sample size is smaller than the optimal one for machine learning methods. In addition, the study participants were recruited from two healthcare organizations. Consequently, the model's accuracy may change when tested in different cohorts. For this reason, the tool needs to be validated in other studies. As for the new TAF predictors, PAC and PVC, since there are no other studies testing their predictive value, these results also need to be confirmed. Another limitation is the fact that three input variables (PAC, PVC and other rhythm disorders during hyperthyroidism) required ECG results. These variables complicate the collection of the information necessary for TAF risk calculation. The next limitation regards gathering information on rhythm disorders. Holter monitoring, performed at the onset of hyperthyroidism, would be the most appropriate method for detecting rhythm disorders. In our study, the data were ascertained either from ECG and Holter monitoring, or from the anamneses and diagnoses in the medical records. Finally, our prediction model has been developed without determining the period the forecast is intended to cover. This makes the model less convenient for practical use, because preventive measures for a definite period are more effective.

Conclusions
We have developed a machine learning model which predicts TAF with 84% accuracy. It seems to be the first available TAF prediction tool.
In addition, we have identified that the TAF risk factors with the highest predictive ability include PAC, PVC, age, heart rate during hyperthyroidism and hyperthyroidism duration. Both arrhythmias listed above (PAC and PVC) seem to be new TAF predictors. Further studies have to confirm these new TAF risk factors, as well as validate the usefulness and appropriateness of our model in independent cohorts. The study could serve as a basis for further research focused on improving TAF prediction and facilitating the management of thyrotoxic patients. Our results could be considered in the development of TAF risk scales, the introduction of which into clinical practice has the potential to reduce TAF incidence.

Migration of the pacemaker through the atria (Red). HT = hyperthyroidism. AH = arterial hypertension.