A dataset on 24-h electrocardiograph, sleep and metabolic function of male type 2 diabetes mellitus

This dataset provides 24-h electrocardiograph (ECG) signals, ECG analysis results based on circadian rhythm and R-peak detection, sleep quality assessment results, and clinical indicators of metabolic function acquired from 60 male inpatients with type 2 diabetes mellitus (T2DM). Fasting blood and urine samples were obtained on the morning after admission for routine glucose, lipid and renal panels. Subjects were also assessed for diabetic complications. On the second day of hospitalization, subjects underwent 24-h in-hospital ECG monitoring starting at 10 pm. Subjective sleep quality was assessed with the Pittsburgh Sleep Quality Index, and a brief sleep log was used to record sleep duration for the studied night. Objective sleep quality and sleep staging were assessed by cardiopulmonary coupling analysis. This dataset can be used for joint research on the relationships among sleep, metabolic function, and the cardiovascular and autonomic nervous system function derived from ECG analysis in T2DM, and to further investigate the information in ECG signals based on circadian rhythm and physiological status, providing new insights into long-term physiological signal processing.



Value of the Data
• This dataset provides 24-h ECG signals of male T2DM patients during hospitalization, and ECG analysis results based on circadian rhythm and the R-peak detection method proposed in our previous study [2]. All ECG signals provided were carefully checked for noise level, artifacts, and ectopic beats. The dataset also includes results of objective and subjective sleep quality assessment and clinical indicators of metabolic function acquired from subjects during hospitalization.
• Researchers interested in the relationships among sleep, metabolic function, and the cardiovascular and autonomic nervous system (ANS) function derived from ECG analysis in T2DM can use this dataset to conduct joint research.
• This dataset can be used to explore the interactions among sleep, metabolic function, the cardiovascular system, and the ANS in T2DM patients, and to improve the early diagnosis and treatment of T2DM and related complications with new analytical methods for long-term physiological signals. Examples include: 1) the association between physiological function and sleep quality, 2) the impact of circadian rhythm on cardiovascular and ANS function derived from ECG signals, and 3) the alteration of cardiovascular function and HRV-derived ANS function, and their correlation with metabolic function, during sleep cycles.

Objective
Type 2 diabetes mellitus (T2DM) causes dysfunction of the autonomic nervous system (ANS), which brings about significant morbidity and mortality [3]. The ANS plays a crucial role in metabolic regulation and displays a clear circadian rhythm in healthy individuals [4]. Sleep, a vital brain phenomenon, significantly impacts both the ANS and metabolic function [5, 6]. In addition to worsening metabolic regulation, ANS dysfunction in T2DM patients may also affect circadian rhythms and cause sleep disturbances, further worsening metabolic status and accelerating the progression of T2DM [7]. Heart rate variability (HRV) is recognized as an effective measure of heart-brain interaction and autonomic activity [8]. However, current research primarily focuses on linear analysis of 24-hour or 5-15 minute electrocardiogram (ECG) recordings obtained in different physiological states, without inspection of the underlying autonomic pathways [8]. This approach fails to adequately extract information from long-term signals and overlooks the complex spatial and temporal dynamics and fractal properties of heart rate time series. This dataset provides 24-h ECG signals, ECG analysis results based on circadian rhythm and R-peak detection, and indicators of sleep and metabolic function in T2DM. It can be used to conduct joint research on the relationships among sleep, metabolic function, and the cardiovascular and ANS function derived from ECG analysis in T2DM, and to further investigate the information in ECG signals based on circadian rhythm and physiological status, providing new insights into long-term physiological signal processing.

ECG Signals
ECG recordings were collected with an FDA (U.S. Food and Drug Administration) approved ambulatory single-lead Holter electrocardiogram monitor (DynaDx Corporation, Mountain View, CA, United States) capable of recording ECG for over 24 h. The sampling frequency was set to 250 Hz. The ECG signals were saved in .mat format and stored in the ECG folder. The variables in the MATLAB structure, with their corresponding fieldnames, are as follows:
• all: ECG during the full 24 h;
• sleep: ECG during sleep;
• day: ECG during wakefulness.

RR-Interval Time Series
Based on R-peak detection and the sleep staging results derived from cardiopulmonary coupling (CPC) analysis, the RR-interval time series during the sleep period were segmented according to sleep stage: unstable sleep, stable sleep and REM sleep. RR-interval sequences from the same sleep stage were stitched together and saved in .mat format. The RR-interval time series files were stored in the RR-interval folder, with the variables held in a MATLAB structure under corresponding fieldnames.
Diabetic complications are coded as binary variables:
• Coronary artery disease and cardiac insufficiency: 0/1 means the subject doesn't have/has the complication.
• Lower extremity atherosclerosis or stenosis: 0/1 means the subject doesn't have/has the complication.
• Carotid plaque: 0/1 means the subject doesn't have/has the complication.
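The stage-wise stitching of RR intervals described in this section can be illustrated with a minimal Python sketch. It is not the authors' code: the epoch layout `(start_s, end_s, stage_label)`, the function name, and the rule that an RR interval is kept only when both of its R peaks fall inside the same epoch are assumptions made for illustration.

```python
from collections import defaultdict


def stitch_rr_by_stage(r_peak_times, stage_epochs):
    """Group RR intervals by sleep stage and concatenate them per stage.

    r_peak_times: sorted R-peak times in seconds.
    stage_epochs: list of (start_s, end_s, stage_label) tuples in time order
                  (hypothetical layout, not the dataset's actual format).
    An RR interval is kept only if both of its R peaks lie in the same epoch,
    so no interval spans a stage transition.
    """
    stitched = defaultdict(list)
    for start, end, stage in stage_epochs:
        # R peaks that fall inside this epoch
        peaks = [t for t in r_peak_times if start <= t < end]
        # Successive differences give the RR intervals of this epoch
        for prev, curr in zip(peaks, peaks[1:]):
            stitched[stage].append(curr - prev)
    return dict(stitched)


if __name__ == "__main__":
    peaks = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    epochs = [(0, 2, "stable"), (2, 4, "unstable"), (4, 6, "stable")]
    print(stitch_rr_by_stage(peaks, epochs))
```

Intervals from the two separate "stable" epochs end up in one list, which mirrors the idea of stitching sequences from the same sleep stage into a single series.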

Objective Sleep Quality
Sleep quality analysis was performed on the ECG signals recorded during the sleep period of the 24-h ECG monitoring. The results for the 38 subjects with qualified signals were saved in the objective sleep quality.xlsx Excel file. Objective sleep quality was obtained with a cardiopulmonary coupling analysis algorithm [9]. The sleep quality metrics are as follows:

Subjective Sleep Quality
Subjective sleep quality was assessed with the Pittsburgh Sleep Quality Index (PSQI). The PSQI covers multiple sleep-related variables over the preceding month, using Likert and open-ended response formats [10]. It yields seven component scores: subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, sleep medication, and daytime dysfunction [11]. During the 24-h ECG monitoring, 53 subjects completed the PSQI questionnaire, and the results were saved in the Subjective sleep quality.xlsx Excel file.
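In the standard PSQI scoring scheme, each of the seven component scores ranges from 0 to 3 and their sum gives a global score from 0 to 21, with a global score above 5 conventionally taken to indicate poor sleep. The Python sketch below illustrates that arithmetic only. The dictionary keys are hypothetical names chosen for this example and are not the column names of the Excel file.

```python
# Hypothetical key names for the seven PSQI components listed in the text
PSQI_COMPONENTS = (
    "subjective_sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "habitual_sleep_efficiency",
    "sleep_disturbances",
    "sleep_medication",
    "daytime_dysfunction",
)


def psqi_global_score(components):
    """Sum the seven component scores (each 0-3) into the global score (0-21)."""
    missing = [name for name in PSQI_COMPONENTS if name not in components]
    if missing:
        raise ValueError(f"missing PSQI components: {missing}")
    for name in PSQI_COMPONENTS:
        if components[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be an integer from 0 to 3")
    return sum(components[name] for name in PSQI_COMPONENTS)


def is_poor_sleeper(global_score, cutoff=5):
    """Conventional cutoff: a global score above 5 suggests poor sleep quality."""
    return global_score > cutoff


if __name__ == "__main__":
    example = dict.fromkeys(PSQI_COMPONENTS, 1)
    score = psqi_global_score(example)
    print(score, is_poor_sleeper(score))
```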

Volunteer inpatients with T2DM from Suzhou Science & Technology Town Hospital took part in this study. Because of the small number of female inpatients in the study, and to eliminate the interference of perimenopausal syndrome with the metabolic status of women in this age group, only data from 60 male inpatients (50 ± 16 years) were included in the follow-up.

Clinical Indicators of Metabolic Function
During hospitalization, subjects underwent clinical examinations to assess their metabolic function and health status. Fasting blood and urine samples were taken on the morning after admission for routine glucose, lipid and kidney tests. Subjects were also assessed for diabetic complications.
Subsequently, box plots were used to identify outliers in the clinical indicators. Unlike methods that assume normally distributed data, a box plot relies only on the actual data and requires no prior assumption about its distribution, so it represents the original shape of the data more faithfully and intuitively. In addition, the box plot judges outliers from the quartiles and the interquartile range (IQR). Quartiles are robust: because outliers make up only a small proportion of the data, they have little effect on the quartiles. The box plot is therefore suitable for eliminating outliers in the clinical indicators. The IQR was defined as the difference between the upper quartile U and the lower quartile L of the data. The upper bound was set to U + 1.5 IQR and the lower bound to L - 1.5 IQR. Data beyond these bounds can be considered outliers. However, to preserve information in the pathological data, we retained values that did not appear to be clearly erroneous, that is, values outside the normal range that may nevertheless occur in patients with dysfunctional states.
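The box-plot rule above can be written out as a short Python sketch. It uses only the standard library and is an illustration under stated assumptions, not the authors' script: the quartile convention (`method="inclusive"`) is one of several in common use and may differ from the one used for the dataset, and the function names are chosen here for illustration. Values are only flagged, since the text notes that some flagged values were deliberately retained.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) = (L - k*IQR, U + k*IQR) for a list of numbers."""
    # Lower quartile L and upper quartile U; the quartile convention is an assumption
    lower_q, _, upper_q = statistics.quantiles(values, n=4, method="inclusive")
    iqr = upper_q - lower_q
    return lower_q - k * iqr, upper_q + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values lying outside the box-plot bounds, without removing them."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    indicator = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(iqr_bounds(indicator))
    print(flag_outliers(indicator))
```

Flagged values can then be reviewed manually, so that physiologically plausible extremes in patients are kept rather than discarded automatically.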

ECG Data Recording
ECG recordings were collected with an FDA (U.S. Food and Drug Administration) approved ambulatory single-lead Holter electrocardiogram monitor (DynaDx Corporation, Mountain View, CA, USA). The sampling frequency was set to 250 Hz. ECG monitoring of all subjects started at 10 pm on the second day of hospitalization and lasted for 24 h. All ECG recordings were checked for noise levels, artifacts, and ectopic rhythms. Seventeen ECG recordings were discarded because of low data quality, insufficient recording time, or the presence of atrial fibrillation or severe arrhythmias.
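Given the 250 Hz sampling frequency and the 10 pm start time, a sample index can be mapped to clock time, which is useful when relating ECG segments to circadian phase. The following Python sketch shows the conversion. The calendar date is a placeholder, since the actual recording dates are not stated, and the function names are illustrative.

```python
from datetime import datetime, timedelta

FS_HZ = 250  # sampling frequency stated in the text


def expected_samples(hours=24, fs_hz=FS_HZ):
    """Number of samples in a recording of the given length."""
    return hours * 3600 * fs_hz


def sample_to_clock(sample_index, start, fs_hz=FS_HZ):
    """Clock time of a sample, given the recording start time."""
    return start + timedelta(seconds=sample_index / fs_hz)


if __name__ == "__main__":
    # Placeholder date: only the 10 pm start time comes from the text
    start = datetime(2020, 1, 1, 22, 0)
    print(expected_samples())
    print(sample_to_clock(8 * 3600 * FS_HZ, start))
```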

ECG Data Preprocessing
Accurate R-peak detection is important for heart rate variability analysis, beat segmentation and arrhythmia recognition. Detection is hampered by several factors, including noise and the similarity in slope, amplitude and frequency between the T wave and the R wave. Numerous R-peak detection methods have been proposed for analyzing ECG signals. However, when these methods are applied to long-term ECG recordings, particularly those from wearable single-lead ECG devices, the accuracy of R-peak detection is often unsatisfactory. To improve the accuracy of R-peak detection, we employed the method for extracting high-quality RR intervals proposed in our previous study [2], which combines five common R-peak detection methods [7][8][9][10][11] through template matching and weighted fusion.
The specific steps for calculating the initial R-peak labeling scores are as follows:
(a) Five sets of initial R-peak detection results were obtained with the five detection methods [7][8][9][10][11].
(b) The ECG samples were classified by template matching, and weights were selected according to the category. There are five categories in total, corresponding to five sets of weights, each of which contains five weights, one for each of the five detection methods [7][8][9][10][11]. A method that performs better on a specific type of ECG is assigned a larger weight.
(c) Based on the weights in (b), a "score" is calculated for each R-peak labeled in each of the five sets of results in (a). A higher score indicates higher accuracy of the R-peak detection.
(d) The five sets of R-peak detection results are combined into a vector, and the scores of the initial R-peak annotations obtained in step (a) are calculated one by one.
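The scoring idea in steps (a) to (d) can be illustrated with a simplified Python sketch. It is not the implementation from the previous study [2]: the matching tolerance, the clustering rule, the choice of the middle candidate as the peak position, and the example weights are all assumptions made for illustration. Here the score of a candidate R-peak is the sum of the weights of the detectors that marked a peak within the same cluster of nearby samples.

```python
def score_r_peaks(detections, weights, tolerance=10):
    """Score candidate R-peaks by weighted agreement among detectors.

    detections: one list of R-peak sample indices per detector.
    weights:    one weight per detector, for the ECG category chosen by
                template matching (values here are hypothetical).
    tolerance:  maximum gap in samples between neighbouring candidates that
                are treated as the same beat (assumed: 10 samples, i.e. 40 ms
                at 250 Hz).
    Returns a list of (peak_position, score) tuples in time order.
    """
    if len(detections) != len(weights):
        raise ValueError("need exactly one weight per detector")

    # Pool all candidates from all detectors into one sorted vector
    candidates = sorted({p for det in detections for p in det})

    # Group neighbouring candidates into clusters, one cluster per beat
    clusters = []
    for p in candidates:
        if clusters and p - clusters[-1][-1] <= tolerance:
            clusters[-1].append(p)
        else:
            clusters.append([p])

    scored = []
    for cluster in clusters:
        first, last = cluster[0], cluster[-1]
        # Add the weight of every detector that marked a peak in this cluster
        score = sum(
            w for det, w in zip(detections, weights)
            if any(first <= p <= last for p in det)
        )
        position = cluster[len(cluster) // 2]  # middle candidate of the cluster
        scored.append((position, score))
    return scored


if __name__ == "__main__":
    detections = [[100, 350], [102, 348], [101], [99, 352, 600], [100, 350]]
    weights = [3, 2, 2, 1, 2]  # hypothetical weights for one ECG category
    print(score_r_peaks(detections, weights))
```

In this toy input, the candidate at sample 600 is marked by a single low-weight detector and so receives a low score, while beats confirmed by all detectors receive the maximum score. A threshold on the score could then be used to accept or reject candidates.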

Sleep Quality Assessment
Subjective sleep quality was evaluated using the Pittsburgh Sleep Quality Index (PSQI), which uses Likert and open-ended response formats [5]. During the 24-h ECG monitoring, 53 subjects completed the PSQI questionnaire.
Objective sleep quality and sleep staging were determined through ECG-based cardiopulmonary coupling (CPC) analysis [4]. This analysis mathematically quantifies the coupling between heart rate variability and the beat-to-beat respiratory modulation of the QRS waveform. The CPC analysis was applied to the ECG segments recorded during sleep. The sleep period was determined from a brief sleep log collected from the subjects, which recorded sleep duration on the night of ECG recording. The CPC analysis categorizes the major physiological sleep states into stable sleep, unstable sleep, and rapid eye movement (REM) sleep [4]. In total, 38 CPC sleep reports were available for analysis in this study.

Ethics Statements
All participants were informed of the experimental protocol and relevant precautions, and signed written informed consent before participating in the study. This study was approved by the Institutional Review Board of Suzhou Science & Technology Town Hospital (No. IRB2019045).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Dataset on electrocardiograph, sleep and metabolic function of male type 2 diabetes mellitus (Original data) (Mendeley Data).