Collection of human reaction times and supporting health related data for analysis of cognitive and physical performance

Smoking, excessive drinking, overeating and physical inactivity are well-established risk factors that decrease human physical performance. Moreover, epidemiological work has identified modifiable lifestyle factors, such as poor diet and physical and cognitive inactivity, that are associated with the risk of reduced cognitive performance. The definition, collection and annotation of human reaction times and suitable health related data and metadata provide researchers with a necessary source for further analysis of human physical and cognitive performance. The collection of human reaction times and supporting health related data was obtained from two groups comprising together 349 people of all ages: the visitors of the Days of Science and Technology 2016 held on the Pilsen central square and members of Mensa Czech Republic visiting the neuroinformatics lab at the University of West Bohemia. Each provided dataset contains a complete or partial set of data obtained from the following measurements: hands and legs reaction times, color vision, spirometry, electrocardiography, blood pressure, blood glucose, body proportions and flexibility. It also provides a sufficient set of metadata (age, gender and a summary of the participant's current lifestyle and health) to allow researchers to perform further analysis. This article has two main aims. The first aim is to provide a well annotated collection of human reaction times and health related data that is suitable for further analysis of lifestyle and human cognitive and physical performance. This data collection is complemented with a preliminary statistical evaluation. The second aim is to present a procedure for efficient acquisition of human reaction times and supporting health related data in non-lab and lab conditions.

© 2018 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Specifications table

Subject area
Informatics, Biology

More specific subject area
Health informatics, human health related data, infrastructure for health related data collection

Type of data
A custom hardware device was developed for measuring hands reaction times [1]. A custom software tool was developed for measuring legs reaction times [2]. A software infrastructure for rapid collection of health related data was developed [3]. Prior to measurements all participants were familiarized with the goal of the project, the overall experimental procedure and the related legal conditions.

Experimental features
Two groups of participants, 349 people of all ages, provided health related data and metadata during the following measurements: hands and legs reaction times, color vision, spirometry, electrocardiography, blood pressure, blood glucose, body proportions, and flexibility. The collected metadata set included age, gender and a summary of the participant's current lifestyle and health.

Data source location

Value of data
The human reaction times and other health related data and metadata were collected from 349 people of all ages.
There are projects utilizing reaction time as a physiological measure (e.g. [4][5][6]), but to the authors' best knowledge there are no publicly available datasets that contain reaction time data together with other supporting health related data and metadata.
The resulting data collection allows other researchers to perform further analysis, e.g. to detect early symptoms of chronic diseases [7,8].
The effects of a healthy lifestyle on cognitive functions are of interest to many people.

Data
The purpose of this article is to provide interested researchers with a well annotated and sufficiently large collection of human reaction times and health related data and metadata that could be suitable for further analysis of lifestyle and human cognitive and physical performance. The second aim is to present a procedure of efficient acquisition of human reaction times and supporting health related data in non-lab and lab conditions.
Each provided dataset contains a complete or partial set of data obtained from the following measurements: hands and legs reaction times, color vision, spirometry, electrocardiography, blood pressure, blood glucose, body proportions and flexibility. It also provides a sufficient set of metadata (age, gender and summary of the participant's current life style and health) to allow researchers to perform further analysis.

Participants and environment
Two groups of participants took part in the project. The first group included 293 people (98 males, 136 females, 59 with no record of their gender in the registration form) visiting the regional event 'Days of Science and Technology 2016' held on the central square in Pilsen, Czech Republic in September 2016. The participants were measured in a large textile tent (see Fig. 1). The weather was sunny with an average outside temperature of about 30 °C.
The second group of participants included 56 people from the organization Mensa Czech Republic, which brings together people with an IQ greater than 130 (30 males, 23 females, 3 with no record of their gender in the registration form). In this case the experiments were performed in the air-conditioned neuroinformatics laboratory, University of West Bohemia, Czech Republic (see Fig. 2) with an average temperature of 21 °C.
Age and gender distributions of all participants are listed in Tables 1 and 2.

Table 2. Age and gender distribution of the participants who took part in the project as members of Mensa Czech Republic.

Data collection procedure
Prior to measurements all participants were familiarized with the goal of the project, the overall experimental procedure and the related legal conditions. Then they were registered in a software application for rapid collection, storage, processing and visualization of heterogeneous health related data (described in [3]), signed the informed consent and filled in a short motivational questionnaire (described in more detail in Section 2.3). Immediately after that they took part in individual measurements organized at nine physical sites (the number of physical measurement sites was limited for the participants from Mensa Czech Republic). Each physical site was equipped with the hardware and software tools appropriate to the type of measurement and was staffed by at least one human expert, who also informed the participant about the measurement at that site. The last physical site, the information desk, served both for the registration of the participants and for the provision of measurement results. It was staffed by three people.
Although there was a recommended route between individual measurement sites, the participants could visit them in any order (see the schema of measurement sites and the recommended route in Fig. 3). They were also not required to complete all the measurements and could interrupt the measurement cycle at any time. Only in the best case did they visit all the measurement sites and answer all questions in the questionnaire. The complete data collection procedure took approximately 15 minutes.
When a single measurement was completed, the obtained data were entered via a user interface into the software application. When the participant finished his/her last measurement, he/she was provided with the results (measured values) from all the visited measurement sites, organized on one A4 page.

Motivational questionnaire
After registering and signing the informed consent, each participant proceeded to fill in a motivational questionnaire containing a set of 13 single-choice questions that provide a basic overview of the participant's current lifestyle and health condition. The following questions were asked: Q10: How often do you drink alcoholic beverages?

Measurement sites
The number of measurement sites was different for the participants visiting the Days of Science and Technology 2016 and the participants from Mensa Czech Republic visiting the neuroinformatics laboratory at the University of West Bohemia. The restriction of measurement sites for the participants from Mensa Czech Republic (only the information desk, hands reaction time, legs reaction time, and color vision sites were available to them) was primarily caused by the limited time they had during their visit to the laboratory and their interest in other kinds of measurements related to brain functioning.

Hands reaction times
The first measurement site was focused on the measurement of the participant's hands reaction times to external visual stimuli.
A custom cognitive research device was used [1]. It consisted of a wooden desk with four buttons and LED panels placed in a square formation (as shown in Fig. 4), together with related hardware and embedded control software for generating visual stimuli (lighting up the LED panels) and recording the participant's responses. The task of the participant was to press the button near the LED panel that was turned on, using the right or left hand, as quickly as possible. Only one LED panel could be active at a time. The order of lighting up the LED panels was random and controlled by the embedded software. In total the participant completed 16 trials in which he/she pressed one of the four buttons placed on the wooden plate according to the LED panel turned on.
The results given to the participant contained the following values (all these values were computed by the embedded control software and manually inputted by the site experimenter to the software application for rapid collection of health related data):
Average hands reaction time [ms] - calculated from 16 trials in which the participant pressed one of the four buttons placed on the wooden plate according to the LED panel turned on.
Number of missed reactions - a reaction was considered missed when no button was pressed within the time limit during which one of the LED panels was turned on.
Number of incorrect reactions - a reaction was considered incorrect when a wrong button was pressed within the time limit during which one of the LED panels was turned on.
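As a minimal sketch of how the three reported values could be derived from per-trial records, the following Python example summarizes a list of trials. The `Trial` structure, the `TIME_LIMIT_MS` value and the choice to average only correct reactions are assumptions made for illustration; the embedded control software described in [1] may compute these values differently.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical time limit; the actual limit used by the device is not stated here.
TIME_LIMIT_MS = 2000


@dataclass
class Trial:
    lit_panel: int                 # index (0-3) of the LED panel that was turned on
    pressed_button: Optional[int]  # index of the pressed button, None if none was pressed
    reaction_ms: Optional[float]   # time from stimulus to press, None if none was pressed


def summarize_trials(trials: List[Trial]) -> dict:
    """Count missed and incorrect reactions and average the correct ones."""
    missed = 0
    incorrect = 0
    correct_times = []
    for t in trials:
        no_press = t.pressed_button is None or t.reaction_ms is None
        if no_press or t.reaction_ms > TIME_LIMIT_MS:
            missed += 1        # no button pressed within the time limit
        elif t.pressed_button != t.lit_panel:
            incorrect += 1     # wrong button pressed within the time limit
        else:
            correct_times.append(t.reaction_ms)
    # Assumption: the average is taken over correct reactions only.
    average = sum(correct_times) / len(correct_times) if correct_times else None
    return {"average_ms": average, "missed": missed, "incorrect": incorrect}
```

In this sketch a full session would pass 16 `Trial` records to `summarize_trials`; a session with no correct reaction yields `None` for the average instead of raising an error.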

Legs reaction times
The second measurement site was focused on the measurement of the legs reaction time using an impact dance pad (see Fig. 5). This dance pad was divided into nine areas (a central area serving as a base point and eight side areas capturing touches of the participant's leg) and connected to a laptop where these areas were represented by corresponding patterns that were highlighted in random order. Only one area could be active at a time. The task of the participant was to stand in the central part of the dance pad, step aside once the corresponding pattern on the laptop was highlighted and return quickly to the central part of the dance pad [2]. In total the participant completed 16 trials in which he/she stepped aside and returned to the central position.
The results given to the participant contained the following values (all these values were computed by the custom application and manually inputted by the site experimenter to the software application for rapid collection of health related data):

Color vision
The third measurement site was focused on the measurement of color vision. The participant was tested using a total of eight pseudochromatic pictures. His/her task was to recognize the number hidden in these pictures.
The results given to the participant contained the following values (all these values were inputted manually by the site experimenter to the software application for rapid collection of health related data): List of incorrectly recognized pictures.

Spirometry
The fourth measurement site was focused on the measurement of lung capacity, forced expiratory volume, and expiratory flow of the participant. All measurements were performed using the SP10W spirometer.
The results given to the participant contained the following values (all these values were inputted manually by the site experimenter to the software application for rapid collection of health related data):

Electrocardiography (ECG)
The fifth measurement site was focused on the electrocardiography measurements. It included heart rate (HR) together with measurement of the ST segment and QRS interval. QRS representing ventricle depolarization was measured from the start of the Q wave to the end of the S wave. The ST segment was measured from the end of the S wave, J point, to the start of the T wave. All measurements were performed using the ReadMyHeart Handheld ECG device.
The results given to the participant contained the following values (all these values were inputted manually by the site experimenter to the software application for rapid collection of health related data):
Heart rate (HR) [puls/min].
ST Segment [mm] - the length of the ST segment that represents the interval between ventricular depolarization and repolarization.
QRS Interval [s] - duration of the QRS complex that represents ventricle depolarization.

Blood pressure
The sixth measurement site was focused on the measurement of blood pressure in the traditional way, as systolic and diastolic blood pressure. This measurement was complemented by the measurement of heart rate, here denoted as puls. All measurements were performed using the Omron M6 Comfort IT device.
The results given to the participant contained the following values (all these values were inputted manually by the site experimenter to the software application for rapid collection of health related data):

Blood glucose
The seventh measurement site was focused on the measurement of glucose concentration in blood. All measurements were performed using the FORA Diamond Mini blood glucose monitoring system.
The result given to the participant contained the following value (this value was inputted manually by the site experimenter to the software application for rapid collection of health related data): Glucose [mmol/l] - concentration of glucose in blood.

Body proportions
The eighth measurement site was focused on the measurement of body proportions. The participant's height was measured manually; weight, body mass index (BMI), and the muscle mass, water and fat content of the participant's body were measured and calculated using the Medisana BS 440 Connect device.
The results given to the participant contained the following values (these values were inputted manually by the site experimenter to the software application for rapid collection of health related data):

Flexibility
The ninth measurement site was focused on the measurement of human body flexibility, which was measured using a 13 cm high portable podium. The participant standing on the podium was asked to touch his/her own feet. If the participant was not able to do so, the result was a negative number. On the other hand, when the participant managed to bend even further, the result was a positive number.
The result given to the participant contained the following value (this value was inputted manually by the site experimenter to the software application for rapid collection of health related data): Flexibility [cm] - difference between the position of the fingers and the foot during a deep forward bend (limited by the +13 cm podium height).

Used hardware
The following table (Table 3) summarizes devices used during the measurements.

Data collections
The data collected during the Days of Science and Technology 2016 are available in Table 6 (hands and legs reaction times), Table 7 (color vision and spirometry data), Table 8 (ECG, blood pressure and blood glucose data) and Table 9 (body proportions and flexibility data). The data collected in the neuroinformatics laboratory from the members of Mensa Czech Republic are available in Table 11 (hands and legs reaction times). The questionnaire data collected during the Days of Science and Technology 2016 are available in Table 10. The questionnaire data collected in the neuroinformatics laboratory are available in Table 12.

Description and processing of datasets
Each record in the data collection corresponds to one participant. The first step of the preliminary statistical analysis was to distribute the obtained data into three categories. The first category, Basic data, contains the following information about each participant: 1. the group the participant belongs to (0 - a member of Mensa Czech Republic, 1 - a person visiting the Days of Science and Technology 2016), 2. gender (0 - male, 1 - female), 3. age.
The second category, Measured data, includes the subcategories that are related to the data obtained from individual measurement sites (in the case of pseudochromatic pictures the value 0/1 means that the participant did not recognize/recognized the hidden number):
The last category, Questionnaire data, includes the data obtained from the questionnaires completed by the participants. These data are divided into the following subcategories:
1. Sport - the participant: 0 - does not do any sport and does not want to do any sport; 1 - does not do any sport and wants to do some sport; 2 - does some sport, but has no friends to do sport with; 3 - does some sport and has friends to do sport with.
2. Food - the participant: 0 - eats irregularly and unhealthily; 1 - eats irregularly and healthily; 2 - eats regularly, but unhealthily; 3 - eats regularly and healthily.
3. Drinking habits - the participant: 0 - drinks enough water; 1 - does not drink enough water.
4. Supplements - the participant: 0 - does not use any supplements; 1 - uses supplements.
5. Smoking - the participant: 0 - does not smoke; 1 - smokes up to 10 cigarettes per month; 2 - smokes up to 10 cigarettes per week; 3 - smokes up to 10 cigarettes per day; 4 - smokes up to 20 cigarettes per day; 5 - smokes 20 or more cigarettes per day.
6. Alcoholic beverages - the participant: 0 - does not drink any alcoholic beverages; 1 - drinks alcoholic beverages occasionally; 2 - drinks alcoholic beverages once a week; 3 - drinks alcoholic beverages several times per week.
7. Medical checks - the participant: 0 - undergoes medical checks periodically; 1 - undergoes medical checks irregularly.
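The numeric coding of the questionnaire subcategories can be turned back into readable labels with a small lookup, sketched below in Python for three of the subcategories. The field names (`smoking`, `alcohol`, `medical_checks`), the `CODEBOOK` dictionary and the `decode` function are hypothetical and are not part of the published data files; only the code-to-label pairs are taken from the list above.

```python
# Code-to-label pairs follow the questionnaire coding described in the text.
SMOKING = {
    0: "does not smoke",
    1: "smokes up to 10 cigarettes per month",
    2: "smokes up to 10 cigarettes per week",
    3: "smokes up to 10 cigarettes per day",
    4: "smokes up to 20 cigarettes per day",
    5: "smokes 20 or more cigarettes per day",
}
ALCOHOL = {
    0: "does not drink any alcoholic beverages",
    1: "drinks alcoholic beverages occasionally",
    2: "drinks alcoholic beverages once a week",
    3: "drinks alcoholic beverages several times per week",
}
MEDICAL_CHECKS = {
    0: "undergoes medical checks periodically",
    1: "undergoes medical checks irregularly",
}

# Hypothetical field names; the real column names may differ.
CODEBOOK = {
    "smoking": SMOKING,
    "alcohol": ALCOHOL,
    "medical_checks": MEDICAL_CHECKS,
}


def decode(record: dict) -> dict:
    """Replace numeric codes with labels; unanswered (None) items stay None."""
    decoded = {}
    for field, code in record.items():
        labels = CODEBOOK.get(field)
        # Fields without a codebook entry (e.g. age) are passed through unchanged.
        decoded[field] = labels.get(code) if labels is not None else code
    return decoded
```

Keeping unanswered items as `None` rather than mapping them to a default label preserves the distinction between a missing answer and the code 0.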
The dataset is partly incomplete because the whole set of health related data was not obtained from every participant. Moreover, the members of Mensa Czech Republic did not participate in all measurements. The motivational questionnaire was also not fully completed in all cases. Since missing or inaccurate data can influence the variability of the dataset, statistical methods that can cope with the expected errors were used. A significance level of 0.05 was used for all tests. All statistical methods were performed in MATLAB.

Basic statistical characteristics of dataset
Box plot graphs used to visualize the basic statistical characteristics of the data were created separately for the members of Mensa Czech Republic and for the visitors of the Days of Science and Technology 2016. Figs. 6 and 7 show the box plot graphs depicting the range of participants' age, legs reaction times and BMI.
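The quantities a box plot displays can be sketched as a five-number summary computed over the recorded values only. The Python example below is an illustration, not the MATLAB procedure used for Figs. 6 and 7: the function name is hypothetical, missing measurements are assumed to be represented as `None`, and the quartile convention (`method="inclusive"`) is an assumption that may differ from the one MATLAB applies.

```python
import statistics
from typing import Iterable, Optional


def five_number_summary(values: Iterable[Optional[float]]) -> dict:
    """Summarize one variable, skipping participants with no recorded value."""
    present = sorted(v for v in values if v is not None)
    if len(present) < 2:
        raise ValueError("at least two recorded values are needed")
    # Quartiles by linear interpolation between the sorted data points.
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "n": len(present),
        "min": present[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": present[-1],
    }
```

Reporting `n` alongside the quartiles makes it visible how many participants actually contributed to each box, which matters here because the two groups did not complete the same measurements.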

Dependency relationships of measured and questionnaire data
To find dependencies between the subcategories of measured and questionnaire data, linear regression in the general matrix form $Y = \beta X + \epsilon$ was used, where $Y$ is the vector of response variables (measured data), $X$ is the vector of explanatory variables (questionnaire data) and $\epsilon$ is the random component of the linear model. To construct linear regression models the following methods were used.
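A least-squares fit of such a linear model can be sketched in plain Python by solving the normal equations. This is an illustration only: the analysis in this article was performed in MATLAB, the function names below are hypothetical, and the inclusion of an intercept column is an assumption that the model equation above does not state explicitly.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; explanatory variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(x_rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return [intercept, b1, ..., bk] minimizing the sum of squared residuals."""
    design = [[1.0] + [float(v) for v in row] for row in x_rows]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

Each row of `x_rows` would hold the questionnaire codes of one participant and `y` the corresponding measured value; records with missing entries would have to be excluded before the fit.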
The first method was based on the gradual selection of questionnaire data subcategories. On the basis of the estimated coefficients and their p-values it was decided which questionnaire data subcategory significantly affects the selected subcategory of the measured data.
The questionnaire data were regarded as one package in the second method. The multivariate regression analysis for all subcategories of the questionnaire data processed simultaneously was implemented.
In the third method stepwise regression was used. This method is useful when there is a large number of explanatory regressors (the questionnaire data subcategories in our case). Gradual removal of individual explanatory variables for which $H_0: \beta_i = 0$ cannot be rejected simplifies the regression model and the identification of statistically significant explanatory regressors.
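For reference, the hypothesis $H_0: \beta_i = 0$ is commonly assessed with a t-statistic of the following standard form. This is the textbook ordinary least squares formulation, written with $X$ as the $n \times p$ design matrix for $n$ complete records and $p$ estimated coefficients; the exact entry and removal criteria used in the MATLAB analysis are not stated in this article, so this is a sketch and not a description of the authors' implementation.

```latex
t_i = \frac{\hat{\beta}_i}{\sqrt{\hat{\sigma}^2 \left[ (X^\top X)^{-1} \right]_{ii}}},
\qquad
\hat{\sigma}^2 = \frac{\lVert Y - X\hat{\beta} \rVert^2}{n - p}
```

Under $H_0$ and the usual model assumptions, $t_i$ follows a Student t distribution with $n - p$ degrees of freedom. With the significance level of 0.05 used for all tests, an explanatory variable whose p-value exceeds 0.05 would be a candidate for removal.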
Stepwise regression was chosen as the most suitable method for the dataset. The results of this regression are summarized in Tables 4 and 5 (a value of 1 indicates that the explanatory variable significantly affects the response variable; a value of 0 indicates that no significant dependency between the two variables was found).