Electrocardiogram Monitoring Wearable Devices and Artificial-Intelligence-Enabled Diagnostic Capabilities: A Review

Worldwide, population aging and unhealthy lifestyles have increased the incidence of high-risk health conditions such as cardiovascular diseases, sleep apnea, and other conditions. Recently, to facilitate early identification and diagnosis, efforts have been made in the research and development of new wearable devices to make them smaller, more comfortable, more accurate, and increasingly compatible with artificial intelligence technologies. These efforts can pave the way to the longer and continuous health monitoring of different biosignals, including the real-time detection of diseases, thus providing more timely and accurate predictions of health events that can drastically improve the healthcare management of patients. Most recent reviews focus on a specific category of disease, the use of artificial intelligence in 12-lead electrocardiograms, or on wearable technology. However, we present recent advances in the use of electrocardiogram signals acquired with wearable devices or from publicly available databases and the analysis of such signals with artificial intelligence methods to detect and predict diseases. As expected, most of the available research focuses on heart diseases, sleep apnea, and other emerging areas, such as mental stress. From a methodological point of view, although traditional statistical methods and machine learning are still widely used, we observe an increasing use of more advanced deep learning methods, specifically architectures that can handle the complexity of biosignal data. These deep learning methods typically include convolutional and recurrent neural networks. Moreover, when proposing new artificial intelligence methods, we observe that the prevalent choice is to use publicly available databases rather than collecting new data.


Introduction
The electrocardiogram (ECG) is among the most commonly utilized clinical tests for patient monitoring and assessment because it is easy to acquire and provides extensive information about patients' cardiac health [1]. Continuous, real-time, remote monitoring allows for more rigorous oversight of patients' conditions, even compared to in-hospital observation. Wearable devices that address this monitoring need are now a prominent focus of industry [1][2][3][4][5][6], which in turn provides strong motivation for applying artificial intelligence (AI) algorithms to ECG signals for automated disease detection and prediction [7][8][9][10][11].
Therefore, this review focuses on wearable medical devices for ECG acquisition followed by AI analysis (ECG-AI) to predict and detect specific diseases (Figure 1). We mainly focused on the published results obtained with single-lead ECG systems, which are widely used in ambulatory monitoring but are not comfortable to wear for long periods. The use of single-lead ECG has the potential to give important diagnostic information on the user's health [1,5] but also has some limitations compared to the standard 12-lead ECG [6].
We examined publications on ECG signals and AI technology applied to wearable and mobile devices for predicting and detecting diseases. Most of the included papers are related to cardiovascular disease (CVD), followed by, in order of number of published studies, three other groups: (1) sleep apnea, (2) mental health and epilepsy, and (3) other applications such as hyperglycemia and hypoglycemia (Figure 2). While other diseases such as hyperkalemia, hypokalemia, and acute pulmonary embolism are addressed in the literature related to ECG-AI, these studies were not included here because they generally use 12-lead ECGs and do not focus on wearable applications.

Arrhythmias
Cardiac arrhythmia is an abnormal rhythm of the heartbeat [19]. The electrical pathway of a normal cardiac contraction has a characteristic electrical pattern on an ECG recording, comprised of a "P" wave (indicating atrial depolarization), followed by a "QRS" complex (indicating ventricular depolarization), and a "T" wave (indicating ventricular repolarization). A typical ECG is shown in Figure 3. Perturbations in the ECG may indicate underlying pathophysiologic changes. Common conditions that can be discerned from ECG changes include various arrhythmias. The most common type of irregular arrhythmia is atrial fibrillation (AF), which is characterized by disorganized electrical impulses of the atrium. AF increases the risk of stroke by up to 17% annually in high-risk individuals [20]. In addition, AF with sustained ventricular rates greater than 110 beats per minute can lead to cardiomyopathy, heart failure (HF), and sudden cardiac death if not adequately treated [21]. The worldwide prevalence of AF was estimated at approximately 46 million individuals in 2016 [22], with up to one-third of these individuals being asymptomatic and thus unaware they have AF while also being at increased risk of stroke.
In addition to AF, other arrhythmias are amenable to detection with wearable ECG devices, including premature atrial contraction, premature ventricular contraction (PVC), atrial flutter, atrioventricular reentrant tachycardia, atrioventricular nodal reentrant tachycardia, and first-, second-, or third-degree heart block. Several recent papers demonstrated wearable technology capable of identifying premature atrial contractions or PVCs with over 97% accuracy [17,18,23,24]. A class of malignant arrhythmias carries a high risk of progression to cardiac arrest or even death [25]; examples include ventricular tachycardia and ventricular fibrillation.

Coronary Artery Disease
Coronary artery disease is the insidious buildup of cholesterol plaques within the walls of the arteries of the heart, eventually leading to a narrowing of the blood vessels [26]. When the narrowing of blood vessels surpasses a critical threshold (often described as a narrowing of greater than 70% of the inner lumen of the artery), symptoms such as exertional chest pain (angina), exertional shortness of breath, and decreased exercise tolerance can occur. Coronary artery disease accounts for the vast majority of cardiac-related deaths [27]. A diagnosis of coronary heart disease generally requires a history and physical exam, a stress test, and an observation of ECG changes suggestive of cardiac ischemia.
Various ECG changes are associated with acute and chronic ischemia. For instance, the presence of Q waves in any lead other than the right-sided leads (i.e., aVR and V1, occasionally in III) is often pathognomonic for prior infarction and non-viable myocardium [28]. On the other hand, chronically inverted T-waves and ST depressions are generally described as non-specific ECG patterns and are difficult to interpret on their own, requiring additional context. However, in the correct clinical setting, these changes can be dynamic: they appear while the patient has active symptoms and normalize when the symptoms resolve. Such dynamic changes indicate significant coronary artery disease that needs to be aggressively investigated because the sudden development of ST-segment elevation associated with symptoms suggests an evolving coronary artery occlusion and subsequent myocardial impairment. Such patients need to be examined and then treated immediately. Future work to develop ECG-AI wearables for real-time detection of acute ischemia will likely improve outcomes.

Wearables
ECG-AI has been combined with wearable devices to investigate various cardiac pathologies, including AF, stroke, cardiac arrest, and heart failure. In fact, arrhythmia monitoring is among the most popular applications of wearable devices in medicine. However, wearable devices are limited in their ability to detect arrhythmias other than AF [6,29], particularly ventricular tachycardia and ventricular fibrillation, and few reports of wearable technologies capable of accurately detecting either rhythm were found in the literature.
Overall, there are a limited number of studies involving wearables. Some studies use commercially available wearables to explore the implementation of ECG-AI. For example, devices such as the Amazfit Band 1S (PPG and single-lead ECG) [30], the HealthyPiV3 biosensors [31], or Polar H7 HR monitor [32] have been utilized. A few research groups have even built their own wearable ECG recording prototypes [33][34][35].
The Food and Drug Administration (FDA) recently approved a single-lead ECG smartwatch proven to detect AF in the general population [36]. Another device developed for AF monitoring and detection is a single-lead wireless ECG patch worn over the chest, which provides real-time ECG monitoring using cloud-based data analysis and data sharing with medical providers [13]. Similarly, a custom wrist-based wearable ECG recorder was compared to the standard 12-lead configuration via a prospective, registration-only, single-center study for the detection of AF [37]. Although a small dataset based on a relatively low number of patients was used, a sensitivity and specificity of 99.4% and 99.8%, respectively, were reported. The convenience and ease of use of the wrist-based device were highlighted as making it an attractive modality for arrhythmia detection in the general population. Lastly, a single-lead ECG chest belt that transmits data to a cloud service for analysis was described, and a sensitivity and specificity of 100% and 95.4%, respectively, were reported [38]. The study included a user experience questionnaire, showing that 77% of participants preferred the chest belt to a standard 3-lead Holter monitor. Additional studies detecting AF have been performed using commercially available heart rate monitors and ECG systems [30][31][32][39][40][41].

Arrhythmia
Due to their ubiquitous availability, most ECG-AI research has been performed using public databases such as the PhysioNet [42] MIT-BIH Arrhythmia database [43,44] while only a few research groups have independently acquired data from patients. Curated and publicly available datasets include physician annotations that provide a reference for ECG-AI algorithm training (Table 1).
Machine learning (ML) and deep learning (DL) have both been extensively applied to ECG data to detect arrhythmias. Although ML generally performs somewhat worse, it is still utilized for arrhythmia detection because of some limitations of DL, including the resource-intensive hyperparameter tuning needed to find the optimal network configuration and the challenges in understanding the rules underlying trained prediction models [45]. DL, in turn, has shown modest improvements over ML for arrhythmia detection. Varying sampling resolutions could pose a challenge for these techniques, but it was shown that arrhythmias can be accurately detected using downsampled ECG data [46].
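As an illustration of what working with downsampled ECG data can look like, the following minimal sketch reduces the sampling rate by averaging consecutive blocks of samples. The function name `downsample` and the block-averaging scheme are assumptions chosen for simplicity; the resampling method used in [46] is not described here and may differ.

```python
def downsample(signal, factor):
    """Reduce the sampling rate by averaging consecutive blocks of `factor` samples.

    Hypothetical helper for illustration only. Averaging acts as a crude
    low-pass filter before decimation; trailing samples that do not fill
    a complete block are dropped.
    """
    n_blocks = len(signal) // factor
    return [
        sum(signal[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


# Invented sample values standing in for an ECG trace.
ecg = [1, 3, 5, 7, 9, 11]
print(downsample(ecg, 2))
```

In practice a dedicated anti-aliasing filter would usually precede decimation; block averaging is shown only because it keeps the sketch dependency-free.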
ML approaches often include the use of decision tree ensembles such as Random Forest [13,47] or support vector machines (SVMs) [40,48] for arrhythmia classification. Multi-stage and multi-level classification systems derive local features of atrial and ventricular activity through a combination of SVMs and decision trees and global features from the raw ECG recording, ultimately leading to classification through linear SVMs. Furthermore, a rotated linear-kernel SVM has been proposed in which two SVM classifiers are trained, one on the global dataset and the other on a patient-dependent dataset, obtaining two different discriminant hyperplanes. The final hyperplane, obtained by rotating the first hyperplane by a specific amount towards the second hyperplane, resulted in an improved sensitivity [49]. Similarly, this ML method has been used with a classifier of de-correlated Lorenz plots of inter-beat intervals [32], and with another classifier built on features extracted through pre-processing methods from density Poincaré plots that represented the ECG segments [23]. Alternatively, the use of SVMs through a semi-supervised learning method was demonstrated [50], while a hybrid framework combined the advantages of ensemble learning and evolutionary computation to maximize arrhythmia classification accuracy [51].
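The hyperplane rotation described for the rotated linear-kernel SVM can be illustrated with a minimal sketch. It assumes each linear SVM is summarized by its normal vector and interpolates the direction between the global and the patient-specific normals; the bias term is left out. The function name `rotate_hyperplane` and the `fraction` parameter are hypothetical, and the exact rotation rule used in [49] may differ.

```python
import math


def rotate_hyperplane(w_global, w_patient, fraction):
    """Rotate the global normal toward the patient-specific normal.

    fraction=0 keeps the global direction, fraction=1 reaches the
    patient-specific direction (spherical linear interpolation).
    Hypothetical helper, not the implementation from the cited study.
    """
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    u, v = unit(w_global), unit(w_patient)
    # Angle between the two unit normals, clamped for numerical safety.
    cos_theta = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < 1e-12:
        # Parallel or exactly opposite normals: the rotation direction
        # is undefined, so the global normal is returned unchanged.
        return u
    a = math.sin((1.0 - fraction) * theta) / sin_theta
    b = math.sin(fraction * theta) / sin_theta
    return [a * x + b * y for x, y in zip(u, v)]


# Toy normals (invented values): rotate halfway from one to the other.
print(rotate_hyperplane([1.0, 0.0], [0.0, 1.0], 0.5))
```

The rotation amount would in practice be chosen on validation data, for example to trade sensitivity against specificity for a given patient.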
With regard to DL approaches, convolutional neural network (CNN) architectures have been applied to arrhythmia [52][53][54] and AF classification [24,55]. Other architectures of interest for AF classification include a deep densely connected neural network based on 12-lead ECG [15], a feedforward neural network based on features encompassing R-R intervals [56], and another based on the Lightweight Fusing Transformer [17]. Hybrid constructions have also been presented, frequently involving an architecture based on a CNN and long short-term memory (LSTM) [57][58][59][60], as well as an extension to SVM with predictions from a CNN [41]. With a similar premise to the rotated linear-kernel SVM [49], one study proposed a generic CNN suitable for all individuals and a dedicated CNN obtained by fine-tuning the generic model with respect to a specific individual [61]. Another approach of interest is the use of multi-scale (MS) CNNs to improve feature extraction and classification from ECG data [62]. Additionally, a global hybrid multi-scale convolutional neural network (Acc 99.84%) was proposed as an advanced alternative to other MS-based approaches through its hybrid multi-scale convolution module [63].
Previous research has also designed lightweight DL models using cloud-based applications to efficiently classify ECG data. These approaches utilize fused recurrent neural network (RNN) layers instead of standard RNN layers [39]. The application of compression [44,64] and conversion techniques (Acc 99.60%) [65], and model-hardware co-optimization [66] to reduce the model's size in terms of computational parameters, resulted in lower memory consumption and inference time. Other techniques to accelerate arrhythmia detection include real-time data compression, signal processing, and data transmission [67][68][69]. Alternatively, ECG data may be compressed to enable real-time AF classification [70,71].
In addition to directly processing ECG data, some studies focused on its two-dimensional representation, which can be used for feature extraction and/or classification. Examples of these representations include spectrograms [31] and iris spectrograms [72]. Alternatively, the ECG signal may be transformed into an electrocardiomatrix, which is a two-dimensional representation that includes the rhythm and shape of the QRS complex [73]. A beat-interval-texture CNN was then used to process the electrocardiomatrix. In this architecture, there are four different layers: the first two layers perform low-level feature extraction, and the two subsequent layers perform high-level feature extraction using three types of convolution filters (beat, interval, and texture). Next, a feature attention layer weighs the identified features concerning the arrhythmia classes and uses such weighted features for classification.
Deep metric learning for PVC detection has also been demonstrated [18]. Such learning methods combine the mechanisms of metric learning for effective feature extraction in which the features are processed with k-nearest neighbors for binary classification. In comparing ML and DL, the former may use the ECG to define summary features that provide physiologic insight, whereas the latter automatically extracts discriminating information from complete waveforms [74]. ML and DL may complement one another, as demonstrated by the multiview fusion classification model in which both summary and deep features from ECG signals were fused [57]. However, DL may independently offer some physiologic information via gradient-weighted class activation mapping, which can highlight the relative contributions of the temporal regions of the ECG signal that most contribute to the AI-obtained classification [73].
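The k-nearest-neighbors step of the deep metric learning approach can be sketched as follows. The sketch assumes that a trained network has already mapped each heartbeat to an embedding vector; the toy two-dimensional embeddings, the labels, and the function name `knn_predict` are invented for illustration and do not come from [18].

```python
import math
from collections import Counter


def knn_predict(train_embeddings, train_labels, query, k=3):
    """Label a query embedding by majority vote among its k nearest
    training embeddings (Euclidean distance)."""
    distances = sorted(
        (math.dist(e, query), label)
        for e, label in zip(train_embeddings, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D embeddings standing in for the output of a trained
# metric-learning network (invented values).
embeddings = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
              [1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]
labels = ["normal", "normal", "normal", "PVC", "PVC", "PVC"]
print(knn_predict(embeddings, labels, [0.95, 1.0]))
```

Because metric learning aims to pull beats of the same class together in the embedding space, a simple distance-based vote can be sufficient for the final binary decision.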

Other Cardiovascular Diseases
Other cardiovascular conditions amenable to ECG-AI include myocardial infarction and heart failure (Table 2). Particularly with myocardial infarction detection, there has been a shift from ML techniques towards DL techniques [16,35,78] due to their higher performance and the fact that no handcrafted feature extraction is required. DL techniques for myocardial infarction detection include the application of both simple and complex models. Examples of simple DL models include an artificial neural network with only three layers (Acc 99.10%) [79] and CNN [12,16] and LSTM [80] algorithms. More complex DL models include a deep belief network for unsupervised heart rate variability (HRV) feature extraction and selection with LSTM for classification [76], a multi-channel lightweight model for the simultaneous analysis and classification of four ECG leads [81], and a two-dimensional CNN for the classification of ECG waveform snapshots [34]. It is important to note that the ECG-AI determination of myocardial infarction commonly involves 12-lead data because the different leads represent different projections of the heart's electrical activity, which is necessary to capture region-specific ischemia [12,16,78,79,80,81]. However, some algorithms were assessed based on data recorded from wearable single-lead devices [34,35].
The analysis of 12-lead data also enabled the screening of heart failure with reduced ejection fraction (Acc 82.50%) [82]. Following a short-time Fourier transform in combination with a CNN, an interpretable model highlighted the essential regions in the various ECG leads associated with the final classification. In particular, the lateral (aVL, I, −aVR, V5, V6) and anterior leads (V3, V4) greatly impacted the detection of heart failure with reduced ejection fraction. In contrast, the performance of the inferior leads (II, aVF, III) was relatively poor. The findings also confirmed that a rightward T-wave axis, prolonged QT duration, and prolonged QTc are associated with heart failure and that the T-wave axis is an independent and strong risk factor for cardiac events in the elderly.

Sleep Apnea
Sleep apnea is a sleep disorder characterized by the interruption of breath during sleep [83]. It is divided into two subtypes: central sleep apnea (CSA) and obstructive sleep apnea (OSA) (Figure 4). CSA is less prevalent and results from the abnormal regulation of breathing in the brainstem respiratory centers, which leads to an absence of or diminution in involuntary respiratory effort while asleep [84]. OSA is a highly prevalent sleep-related disorder characterized by the repetitive complete obstruction (apnea) or partial obstruction (hypopnea) of the upper airway that results from loss of muscle tone in anatomically susceptible persons [85]. It is estimated that OSA affects almost 1 billion people globally [86], with 425 million adults aged 30-69 years having moderate to severe OSA [87]. CSA is associated with heart failure, renal failure, and the acute phases of stroke, while OSA can lead to excessive daytime sleepiness, chronic fatigue, hypertension, stroke, and other cardiovascular disorders. Thus, early and accurate diagnosis of sleep apnea is essential. Laboratory-based polysomnography has been used as a reference standard for diagnosing OSA. Polysomnography involves the overnight recording of: the bilateral occipital, central, and frontal electroencephalogram; chin, leg, and surface electromyogram; left and right eye electro-oculogram; and ECG, pulse-oximetry, airflow, and respiratory effort. Yet, polysomnography is time-consuming, expensive, and uncomfortable for the patient and requires a trained technician. Therefore, an ECG-AI approach to sleep apnea diagnosis is a potentially convenient and cost-effective alternative [88].

Wearables
To our knowledge, no studies have investigated the use of wearable ECG-AI devices for sleep apnea detection. Instead, sleep apnea ECG data analysis has relied either on existing datasets such as the PhysioNet Apnea-ECG database [89] or on new data collected with polysomnography.

Algorithms
When automatically identifying OSA from ECG recordings, DL is preferable over traditional ML because of its ability to automatically learn discriminating features from raw data (Table 3). For instance, a CNN using a modified LeNet-5 architecture was compared against five conventional approaches [90]. The superior performance of the CNN (Acc 96.00%) for OSA classification was further reinforced by a study in which short-term (30 s) ECG segments were classified into four categories (normal, mild, moderate, and severe) rather than two (normal and OSA) [91].
An OSA detection framework that combines a multiscale dilation attention CNN for feature extraction with a weighted-loss time-dependent classification model was proposed to fully exploit ECG information via DL [92]. The novelty of the multiscale dilation attention one-dimensional CNN lies in the parallel multi-branch structure and dilation operations, which allow the model to explore the feature space efficiently by assigning feature weights with the efficient channel attention module. The classifier accounts for the temporal dependence between ECG segments and uses a weighted loss function to mitigate class imbalance.
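To make the idea of a weighted loss for imbalanced classes concrete, the sketch below computes a class-weighted binary cross-entropy with inverse-frequency weights. This is a generic formulation given under assumptions: the function names are hypothetical, both classes are assumed to be present, and the loss actually used in [92], including its handling of temporal dependence, is not reproduced here.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights so that each class contributes equally.

    Assumes binary labels (0 = normal, 1 = apnea) with both classes present.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return {0: n / (2.0 * n_neg), 1: n / (2.0 * n_pos)}


def weighted_bce(labels, probs, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight on each segment."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
    return total / len(labels)


# Invented example: three normal segments and one apnea segment.
y_true = [0, 0, 0, 1]
y_prob = [0.1, 0.2, 0.1, 0.6]
w = class_weights(y_true)
print(w, weighted_bce(y_true, y_prob, w))
```

With these weights, an error on the rare apnea segment costs three times as much as an error on a normal segment, which counteracts the tendency of a classifier to favor the majority class.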
Hybrid DL methods have also been proposed in which different methods are combined. Examples are the CNN and LSTM combination with SVM [93], a hybrid three-dimensional CNN-LSTM combination where 20 successive single segments were analyzed simultaneously to include the time evolution pattern of the ECG [94], and a CNN representation learning model for feature extraction combined with a temporal dependence model for classification [95]. To address the limited ability of classic network architectures in feature extraction, the use of a one-dimensional squeeze-and-excitation residual group network to detect OSA using inter-beat intervals and R-wave and Q-wave amplitude from two-minute ECG signal segments was proposed [96]. The network architecture is a CNN in which the residual group convolutions are included to alleviate the computational burden whereas the squeeze-and-excitation mechanism manages the importance of the three inputs.

Mental Health and Epilepsy
Another field of ECG-AI application is clinical psychophysiology, which has used cardiovascular indicators for decades as proxies of cognitive and emotional processes [97]. The stress response is the most investigated of such processes and is characterized by a set of physiologic changes, including increased heart and respiratory rates, skin conductance, cortisol secretion, and muscular and pupillary dilation [98]. The individual tendency to be either hyper- or hypo-reactive is associated with an increased risk of cardiovascular disease and other somatic and mental health conditions [99][100][101]. Consequently, clinical psychophysiology aims to identify objective signs and early biomarkers of somatic and mental illness [102] with applications ranging from cardiovascular rehabilitation to clinical monitoring and work-related health and safety [103][104][105].
The data-gathering approach in this research field commonly entails a psychophysiological assessment, during which study participants are exposed to stressful tasks (e.g., mental arithmetic, cold pressor test, public speech) preceded by a baseline phase and followed by a recovery phase [106] (Figure 5). Such an evaluation is most widely implemented in a laboratory setting; however, several variants have been proposed to improve its everyday validity, including virtual-reality-based studies [107] and ambulatory assessments [108]. Regardless of the specific focus on stress or emotions, most of the reviewed studies (see Table 4) focused on HRV features. HRV is an index of cardiovascular flexibility and adaptability with higher HRV being associated with more effective responsivity to stressors and recovery in stress-free conditions [109]. Moreover, vagal tone is a main determinant of resting-state HRV levels, and it is also associated with a network of structures involved in emotion regulation (e.g., the amygdala) and executive functions (e.g., the prefrontal cortex) [109,110]. Therefore, HRV is among the physiologic indicators of stress, emotions, and other self-regulatory processes [111]. HRV indices in both the time and the frequency domains are widely used for ECG-AI stress detection and emotion recognition [104].
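As a concrete example of time-domain HRV indices, the following sketch computes three commonly used ones (SDNN, RMSSD, and pNN50) from a list of R-R intervals. The choice of these three indices and the function name `hrv_time_domain` are assumptions for illustration; the reviewed studies may rely on different or larger feature sets, and R-peak detection and artifact removal are assumed to have been done beforehand.

```python
import math


def hrv_time_domain(rr_ms):
    """Return SDNN and RMSSD (both in ms) and pNN50 (as a fraction)
    for a list of R-R intervals given in milliseconds."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the R-R intervals.
    sdnn = math.sqrt(sum((rr - mean_rr) ** 2 for rr in rr_ms) / (n - 1))
    # Differences between successive R-R intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # RMSSD: root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # pNN50: share of successive differences larger than 50 ms.
    pnn50 = sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return {"sdnn": sdnn, "rmssd": rmssd, "pnn50": pnn50}


# Invented R-R intervals (ms) for illustration.
print(hrv_time_domain([800, 810, 790, 800]))
```

Features of this kind are typically computed per signal segment and then passed to a classifier as a fixed-length feature vector.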
HRV is also implicated in other neuropsychologic conditions such as epilepsy and epileptic seizures, the prediction of which has profound clinical utility [112]. For instance, epileptic patients are characterized by lower high-frequency HRV and overall sympathovagal imbalance [113], and cardioacceleration (tachycardia) accompanied by HRV reductions is a typical peripheral concomitant of epileptiform electroencephalography (EEG) activity [113,114].

Wearables
Several commercial wearable devices were used to collect ECG data for research involving stress detection and emotion recognition, including the Zephyr BioHarness 3.0 [74,115], T-REX TR100A [116], and "LaPatch" [117]. However, the continued development of public, disease-dedicated databases, such as the PhysioNet Driver stress dataset [118], allows for algorithm development and evaluation without collecting data [119]. Such an approach is mainly used for epilepsy applications, where the condition is monitored and not induced and where the biosignals are directly evaluated from patients to detect and predict event occurrence. In these studies, ECG and EEG data are analyzed together. An additional two studies were reported in which the ECG signal was collected with ad-hoc wearable prototypes alongside other biosignals such as the EEG [120,121].

Algorithms
Various ML and DL approaches are used in psychophysiological research. Conventional ML techniques were adopted for mental fatigue detection and emotion classification [117,119]. In particular, a wavelet scattering algorithm was successfully applied to extract more complex ECG features than the standard time- and frequency-based features [119].
ML and DL are mainly used for stress detection, as in a study where stress level was estimated through a combination of principal component analysis for feature extraction and SVM for classification [115]. Moreover, a two-branched deep neural network (DNN) based on the deep ECG net structure was proposed [74]. The two branches are devoted to the extraction of ECG and respiratory features, respectively, after which their outputs are concatenated for classification. Of interest here are the authors' visualizations of the network's learning process, which provide insight into the network's decision-making. In a second DNN, two training methods were investigated: training from scratch and transfer learning [116]. In the latter method, the pre-trained model parameters were determined following training on one database, after which they were adjusted using a second database. Classification performance analyses indicated that the transfer learning application improved the scores of all metrics (Acc 90.19%). ECG-AI algorithms for mental stress and emotion detection are typically trained on signal segments classified as "stressed" vs. "unstressed" based on the experimental phase of the psychophysiological assessment (i.e., stressor versus baseline/recovery) [74,115,116,117], whereas one study labeled the segments based on self-report measures [117], and other studies related ECG activity to changes in criterion variables such as salivary cortisol [115] and an expert rating of participants' facial expressions [119] (see Table 4).
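The phase-based labeling of signal segments described above can be sketched as follows. The protocol timings, the sampling rate, the segment length, and the function name `label_segments` are all hypothetical; the sketch only illustrates how "stressed" and "unstressed" labels can be derived from the experimental phase rather than from self-report or criterion variables.

```python
def label_segments(n_samples, fs, segment_s, phases):
    """Split a recording into fixed-length segments and label each one.

    phases: list of (start_s, end_s, name) tuples describing the protocol.
    A segment is labeled "stressed" only if it lies entirely inside a
    phase named "stressor"; segments entirely inside any other phase are
    "unstressed"; segments that straddle a phase boundary are skipped.
    """
    seg_len = int(segment_s * fs)
    labeled = []
    for start in range(0, n_samples - seg_len + 1, seg_len):
        t0, t1 = start / fs, (start + seg_len) / fs
        for p_start, p_end, name in phases:
            if p_start <= t0 and t1 <= p_end:
                label = "stressed" if name == "stressor" else "unstressed"
                labeled.append((start, start + seg_len, label))
                break
    return labeled


# Hypothetical protocol: 60 s baseline, 60 s stressor, 60 s recovery,
# recorded at an assumed sampling rate of 250 Hz.
protocol = [(0, 60, "baseline"), (60, 120, "stressor"), (120, 180, "recovery")]
segments = label_segments(n_samples=180 * 250, fs=250, segment_s=30,
                          phases=protocol)
print([label for _, _, label in segments])
```

Each tuple holds the start and end sample index of a segment together with its label, so the corresponding ECG samples can be sliced out and passed to feature extraction.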
For seizure detection, two different ML approaches were reported. The first applied multivariate statistical process control to search for changes in HRV indices that could indicate seizures [120]. The system reached a sensitivity of 85.7% with a false alarm rate of 0.62 per hour, implying a need for improvement. The second evaluated two single-modality models, one based on SVM to classify EEG signals and one based on random forest to classify ECG signals [121], and compared them against a multimodal model that integrates the predictions of the two models for seizure detection. Performance evaluation showed that integrating the prediction results of both physiologic signals in the multimodal model increased sensitivity while maintaining the same false alarm rate for two out of three databases. These studies typically used data from long-term pre-surgical monitoring [120][121][122]. The AI algorithms were then trained and tested against expert annotation of video-recorded EEG segments, which were categorized as during, after, or between seizures. Overall, these studies were characterized by lower heterogeneity in terms of research protocols and reported algorithm performance metrics compared to stress and emotion recognition studies due to the higher availability of research standards for clinical validation [123].
Studies involving ECG-AI wearables to detect mental health conditions are limited in several ways. Firstly, many studies used small samples with poorly specified or even unspecified inclusion criteria. Such low statistical power limits algorithm performance, reproducibility, and the generalizability of results. Secondly, signal pre-processing steps, including the detection of ECG components, artifact identification, and computation of the ECG features, are substantially different among the reviewed studies.
Some studies used ECG tracings of 20 s or less, which precludes the use of HRV features such as low-frequency power: the lower bound of the low-frequency band is 0.04 Hz, corresponding to oscillations with periods as long as 25 s, and signal segments lasting at least 10 times the period of the lower frequency bound (about 4 min) have been recommended to provide proper estimates [124].
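The arithmetic behind this recommendation can be written out in a few lines. The function name `min_segment_seconds` is hypothetical; the factor of 10 and the 0.04 Hz bound are taken from the text above.

```python
def min_segment_seconds(lowest_freq_hz, n_cycles=10):
    """Shortest recording (in seconds) that spans n_cycles of the
    slowest oscillation in the frequency band of interest."""
    period_s = 1.0 / lowest_freq_hz
    return n_cycles * period_s


lf_lower_bound_hz = 0.04  # lower bound of the low-frequency HRV band
needed = min_segment_seconds(lf_lower_bound_hz)
print(f"{needed:.0f} s (~{needed / 60:.1f} min)")
```

A 20 s tracing therefore covers less than one full cycle of the slowest low-frequency oscillation, which is why low-frequency power cannot be estimated reliably from it.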

Other Applications
Examples of ECG-AI applied to other areas are reported in Table 5. Applications include the evaluation of blood sugar and sports medicine.

Wearables
Public databases relating to ECG and outcome data are currently available for the most common cardiac conditions, such as AF, and only a minority are available for other diseases. Therefore, consumer devices such as the Medtronic Zephyr BioPatch™ HP80 [125] and single-lead ECG prototypes [126,127] have been used to collect patient-specific data related to other medical conditions for subsequent AI analysis. However, the number of publicly available datasets for tailored medical applications is increasing [42].

Algorithms
ECG-AI has been successfully used to detect hyperglycemia and hypoglycemia [125,126]. A novel feature extraction method and a ten-layer artificial neural network classifier were proposed for the detection of hyperglycemia [126], achieving an improvement of 53% versus the previous models. A person-specific system, including a DL model for each participant, was proposed for the detection of hypoglycemia [125]. Specifically, the data recorded during the first few days were used for training, while the rest were used for system evaluation. Two models were investigated: a CNN and a CNN-RNN combination. In the latter, the CNN module produced a fixed-length representation of the ECG to be further processed by the subsequent RNN module.
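The person-specific protocol described above amounts to a chronological split of each participant's recordings. The following is a minimal sketch of that idea only; the function name, the `(timestamp, sample)` record layout, and the two-day training window are assumptions made for illustration, since the source states only that "the first few days" were used for training.

```python
from datetime import datetime, timedelta


def split_by_days(records, train_days):
    """Split one participant's (timestamp, sample) records chronologically.

    Records from the first `train_days` days after the earliest timestamp go
    to the training set; all later records go to the evaluation set.
    """
    if not records:
        return [], []
    ordered = sorted(records, key=lambda r: r[0])
    cutoff = ordered[0][0] + timedelta(days=train_days)
    train = [r for r in ordered if r[0] < cutoff]
    test = [r for r in ordered if r[0] >= cutoff]
    return train, test


# Hypothetical data: one ECG window every 12 hours over five days.
start = datetime(2024, 1, 1, 8, 0)
records = [(start + timedelta(hours=12 * i), f"ecg_window_{i}") for i in range(10)]

train, test = split_by_days(records, train_days=2)  # assumed window length
print(len(train), "training windows;", len(test), "evaluation windows")
```

Splitting by time rather than at random keeps every evaluation sample later than every training sample, which mirrors how a per-participant model would be used after deployment.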
Another application of ECG-AI is in sports medicine, where it is used to evaluate fatigue and abnormal health events in real time. The effectiveness of this approach was demonstrated via a weighted one-class SVM using signals recorded from volunteers performing specific tasks [127].

General Challenges and Limitations
The clinical reliability of wearable devices is challenged by several factors, including the fact that mobile versions collect less data than their clinical analogs. For example, the ECGs of wearable devices typically have one to three leads, while those utilized clinically feature twelve leads. Wearable technologies are also intended to be worn throughout the activities of daily living, which results in an increased likelihood of collecting intermittent or noisy data. Furthermore, the real-time effectiveness of the corresponding AI algorithms is potentially compromised by processing demands relative to battery capacity or, when the processing is to be carried out on the cloud, by limited connection to wireless networks in rural areas.
Once recorded via wearable devices, data are commonly reviewed by physicians when such information would be valuable in order to better understand a patient's history [128]. However, diagnoses and predictions provided by AI algorithms are less readily accepted by clinicians [129] because the basis for these decisions is a black box. That is, an AI algorithm may decide on a particular medical condition, but the inherent lack of physiologic insight makes the reliability of such decisions uncertain by clinical standards. Determinations made by supervised AI algorithms are therefore more likely to be clinically acceptable if more insight into the physiologic mechanism by which they make their predictions can be provided.
Two limitations result from the need to provide physiologic detail. Firstly, defining summary domain-aware features to enable supervised AI reduces the dimensionality of the data and may thus limit the prediction potential in exchange for a better physiologic understanding. Indeed, to perform supervised learning and therefore satisfy clinical standards for physiologic understanding, data should be processed to obtain translatable summary features. Regarding ECG analysis, such characteristics may include the R-R interval, QRS width and magnitude, and ST-segment elevation or depression, among others. Nonetheless, this approach relies on knowing what summary features to define and doing so comprehensively. Unfortunately, the definition of translatable characteristics relies on those that are already known via traditional medicine. These characteristics are the most obvious to human interpretation, which thus undermines the main advantage of using AI: the ability to make determinations beyond the threshold of human elucidation. Secondly, it may also be desirable to perform processing steps such as truncating, filtering, or downsampling data to make physiologic detail more obvious or to optimize input before initiating an AI algorithm. However, these steps also potentially remove valuable information beyond the level of human interpretation. In moving towards a compromise of deeper knowledge with some physiologic insight, heat maps that highlight the temporal segment of the ECG most influential in making a classification are valuable [73].
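To illustrate the dimensionality reduction discussed above, the following minimal sketch condenses a sequence of R-peak times into a handful of R-R interval statistics. It assumes that R-peak timestamps (in seconds) have already been detected by some upstream step; the function names and the particular choice of features (mean R-R, heart rate, SDNN, RMSSD) are illustrative and are not taken from any specific reviewed study.

```python
import math


def rr_intervals_ms(r_peak_times_s):
    """R-R intervals in milliseconds from consecutive R-peak times in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]


def summary_features(r_peak_times_s):
    """Reduce an R-peak sequence to four interpretable summary features."""
    rr = rr_intervals_ms(r_peak_times_s)
    if len(rr) < 2:
        raise ValueError("at least three R peaks are required")
    mean_rr = sum(rr) / len(rr)
    # SDNN: sample standard deviation of the R-R intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr) / (len(rr) - 1))
    # RMSSD: root mean square of successive R-R differences.
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {
        "mean_rr_ms": mean_rr,
        "heart_rate_bpm": 60000.0 / mean_rr,
        "sdnn_ms": sdnn,
        "rmssd_ms": rmssd,
    }


# Hypothetical R-peak times (s) from a short single-lead recording.
peaks = [0.0, 1.0, 1.8, 2.8]
for name, value in summary_features(peaks).items():
    print(f"{name}: {value:.1f}")
```

Whatever the length of the raw waveform, the classifier in this setup sees only four numbers, which is what makes the features easy to interpret and also what can discard information that a model operating on the raw signal might have used.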
Another challenge facing the clinical adoption of AI diagnoses is that there are no standards for defining what level of correctness is sufficient to replace a physician as the primary assessor. Such a threshold is particularly important to consider in the context of AI algorithms being trained by physician specialists because AI diagnoses are then relative to the most expert clinical standard rather than the average [128]. Additionally, this standard assumes that all patients have access to the best physician specialist with whom the AI algorithm is being compared. In fact, many individuals may not have any access at all, especially in real time. Thus, wearable devices in conjunction with AI algorithms offer far greater monitoring of patients but are held to a higher standard for diagnostic reliability.
An additional limitation of current AI methods is that algorithm training requires the availability of quality data. In most cases, such datasets need to be large enough to be divided into training and testing sets while also being curated so that most fields are complete and are purged of erroneous information. As shown in this review, many publicly available, condition-specific datasets are emerging. However, developers should keep in mind that each database has its limitations (e.g., not socioeconomically or racially diverse enough) that narrow the database's scope of use. After the acceptance of an algorithm, ongoing post-application clinical validation is essential to maintaining confidence in diagnostic or predictive correctness but is more challenging because these data are not curated and may thus be noisy, discontinuous, or otherwise incomplete.
As demonstrated in the tables of this review, there are no standards for defining correctness, and therefore, the direct comparison of various AI methods is often not possible. However, all measures of correctness (total error rate, positive predictive value, accuracy, sensitivity, specificity, AUC, and F1) rely on base variables including true positives, true negatives, false positives, and false negatives [130,131]. The consistent reporting of all base variable values or all measures of correctness would overcome this current limitation.
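The dependence of these measures on the four base counts can be sketched as follows. The function name and the example counts are hypothetical. AUC is omitted because it cannot be computed from a single set of counts: it requires the counts obtained at multiple decision thresholds.

```python
def metrics_from_counts(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Derive common measures of correctness from the four base counts."""
    total = tp + tn + fp + fn

    def ratio(numerator, denominator):
        # Return NaN instead of failing when a measure is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),           # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),           # true negative rate
        "positive_predictive_value": ratio(tp, tp + fp),  # precision
        "accuracy": ratio(tp + tn, total),
        "total_error_rate": ratio(fp + fn, total),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical counts for a binary detector evaluated on 200 recordings.
for name, value in metrics_from_counts(tp=80, tn=90, fp=10, fn=20).items():
    print(f"{name}: {value:.3f}")
```

Because every measure above is a function of the same four counts, a study that reports all four allows readers to recompute whichever measure another study chose to report.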
In general, the methods proposed in the literature are not easy to compare due to the different datasets used in the experiments and the different research targets. The most promising algorithm for ECG applications is the deep learning CNN architecture. However, in arrhythmia detection and classification, it is possible to gain a clearer understanding of, and insight into, the algorithms' performance. In fact, arrhythmia detection is a common outcome for ECG-AI technology because arrhythmias can be identified relatively easily using one-lead ECG without the need for the standard twelve-lead ECG, making these detection techniques easier to transfer and deploy on wearable devices. Unsurprisingly, the most popular application of wearable devices in medicine is arrhythmia detection and monitoring. For the other applications, more research is needed and more datasets need to be analyzed. Based on our work, we expect an increase in interest in the applications of wearables and ECG-AI in sleep apnea and mental stress. Moreover, new applications of ECG-AI for other conditions, such as hyper/hypoglycemia, will likely see an increase in data and research work as well. Promoting challenges between research groups seems to be the best way to boost the development of the best AI solutions. Examples of such shared benchmarks and competitions include the MIT-BIH Arrhythmia Database and the 2017 PhysioNet/Computing in Cardiology Challenge.

Towards the Future
Wearable devices will continue to have an increasing role in personalized healthcare because they enhance accessibility, reliability, and cost effectiveness. Technology advancements that enable this expansion will include devices that acquire more reliable and higher quality signals and those that obtain more signals simultaneously, increasingly approximating clinical diagnostics. In terms of the wearable ECG, high-quality data will be continuously obtained from more reliable and improved sensors with multiple leads [1,5].
In the future, AI algorithms will be trained using an increasing number of larger, curated, condition-specific datasets. Future datasets that are more generalized to include more covariates to capture additional peripheral information are also likely to emerge. Data collected by wearable ECG devices will increasingly be transferred to a cloud for AI processing because the algorithms will be too computationally intensive to be executed locally [2,132].
Prospective wearable ECG-AI devices will normalize the near-instantaneous assessment and treatment of certain acute conditions, improving outcomes. These devices and algorithms will also more comprehensively consider whole-body physiology and health by integrating a variety of data sources simultaneously. Ongoing successes will increase confidence in automated decision making and reinforce its role in personalized healthcare [9,129].

Conclusions
The ECG contains highly valuable information. The diagnosis and prediction of specific clinical conditions, including arrhythmias, coronary artery disease, sleep apnea, mental health conditions, and epilepsy, are increasingly enabled via wearable devices that record ECG data and continuously analyze them in real time using AI algorithms. In this review, we highlighted the current applications, with their performance and limitations, of ECG-AI applied to wearable devices for disease detection and prediction. As reported by several other authors, the ongoing development of large, curated datasets targeting specific clinical conditions is essential for developing and validating various AI approaches. Since ECG-AI is tailored to specific medical applications, the methods that are most effective for one clinical condition are not necessarily appropriate for application to others. Advancements in this field require a combination of knowledge domains that create a unique expertise. Such technology is leading to a paradigm shift in personalized medicine that is making the diagnosis of many conditions more accessible, reliable, and cost effective.

Conflicts of Interest:
The co-author A.B. is employed by AccYouRate Group, which is a company that is producing wearable technology that analyzes ECG signals on a mobile platform.