Can breathing gases be analyzed without a mouth mask? Proof-of-concept and concurrent validity of a newly developed design with a mask-less headset
Applied Ergonomics

A portable headset has been developed to analyze breathing gases and establish the energetic workload of physically active workers. This proof-of-concept study aimed to investigate the following: (1) the validity of the headset compared to indirect calorimetry using a mouth mask; (2) the validity of the headset compared to the validity of oxygen consumption (V˙O2) estimated on the basis of heart rate; (3) the influence of wind on validity; and (4) user experiences of the headset. Fifteen subjects performed a submaximal cycling test twice, once with the headset, and once with a mouth mask and heart rate monitor. Concurrent validity of the headset was analyzed using an intraclass correlation coefficient (ICC). Across all phases, a good correlation between the headset and mouth mask was observed for V˙O2, carbon dioxide production (V˙CO2) and exhaled volume (V˙E) (ICC ≥ 0.72). The headset tended to underestimate V˙O2, V˙CO2 and V˙E at low intensities and to overestimate them at higher intensities. The headset was more valid for estimating V˙O2 (ICC = 0.39) than estimates based on heart rate (ICC = 0.11) (n = 7). Wind flow caused an overestimation (MD ≥ 18.4 ± 16.9%) and lowered the correlation of V˙O2 between the headset and the mouth mask to a moderate level (ICC = 0.48). The subjects preferred the headset over the mouth mask because it was more comfortable, did not hinder communication and had lower breathing resistance. The headset appears to be useable for monitoring development of the energetic workloads of physically active workers, being more valid than heart rate monitoring and more practical than indirect calorimetry with a mouth mask. Proof-of-concept was confirmed. Another design step and further validation studies are needed before implementation in the workplace.


Introduction
Workers' work capacity is dependent upon factors such as health, age, lifestyle and physical fitness (Heerkens et al., 2004;Costa-Black et al., 2013;Schultz et al., 2007). Energetic capacity (one of the aspects of physical fitness) depends on the condition of the respiratory system. The functioning of the human respiratory system declines with age, starting at about the age of 30 (Ilmarinen, 2001;Chan et al., 2000;Bellew et al., 2005) and resulting in a declining energetic (work) capacity among older, physically active workers (Kenny et al., 2008). When the workload exceeds the energetic (work) capacity, overload occurs (Kenny et al., 2008) resulting in concentration problems, lowered well-being, fatigue, health problems and absenteeism (Costa-Black et al., 2013;Ilmarinen, 2001;Kenny et al., 2008;Weerding et al., 2005;Bos et al., 2004). For a sustainable workforce, it is important to maintain the balance between workload and individuals' work capacity (Soer et al., 2014;Catal and Akbulut, 2018).
To determine this balance, there is a need for objective measurement tools that monitor the energetic workload and capacity of individuals at work in their natural working environment (Catal and Akbulut, 2018;Faria et al., 2018;Alberto et al., 2017). The energetic workload and capacity of physically active workers can be measured in various ways. Direct calorimetry and doubly labeled water techniques are the most accurate methods, followed by indirect calorimetry using a mouth mask (Catal and Akbulut, 2018;Kenny et al., 2017;Borges et al., 2019;Hoehn et al., 2018;Rexhepi and Brestovci, 2011;Bini et al., 2019). These methods need to be performed in controlled laboratory conditions (Kenny et al., 2017). Moreover, they are expensive, interfere with workability, have high breathing resistance and are uncomfortable (Hoehn et al., 2018). Indirect calorimetry is not feasible in the workplace because the mouth mask is impractical and hinders communication, which is often crucial for safety during work (Catal and Akbulut, 2018;Hoehn et al., 2018;Borges et al., 2019;Rexhepi and Brestovci, 2011). Alternatively, heart rate (HR) measurements can be taken to estimate oxygen consumption (V˙O2) (Astrand and Ryhming, 1954). HR measurement is more feasible in the workplace and is generally accepted (Hiiloskorpi et al., 2003;Keytel et al., 2005). However, prediction of energy expenditure from HR is far less valid than direct and indirect calorimetry (Bos et al., 2004;Catal and Akbulut, 2018;Bernmark et al., 2012;Butte et al., 2012;Livingstone et al., 1992;Rennie et al., 2005;Ceesay et al., 1989;Leonard, 2003) and can only be applied if the HR is between 125 and 170 beats/min (Astrand and Ryhming, 1954). A wearable breathing-gas analyzer without a mouth mask would fill the validity and feasibility gaps between these measurement strategies.
With this headset, we aim to introduce a system whose measurement validity and usability are positioned between indirect calorimetry and HR monitoring.
A new wearable breathing-gas measurement and analyzing system has been developed (Fig. 1). This newly developed system collects breathing gases via a headset close to the mouth and nose, where a sample of breathing gases is taken and transported to the rear part of the headset for analysis. Because work activities can change during a working day and over several days, it is necessary to measure energy consumption over prolonged periods of time (Bos et al., 2004). With this wearable system, it may be possible to gain a complete overview of the energetic workload and capacity of workers during performance of different types of work. Moreover, the system is developed to monitor individual physiologic responses to these work activities. It can be used in various conditions, and the headset enables communication during work.
The validity of this breathing-gas analyzing headset and the influence of wind when applied in outdoor workplaces have not yet been evaluated. This proof-of-concept study aimed to investigate the following:
1. The validity of V˙O2, carbon dioxide production (V˙CO2) and respiratory exchange ratio (RER) measurements produced by the developed breathing-gas analyzing headset compared to a mouth mask (reference system);
2. The validity of V˙O2 measurements produced by the developed breathing-gas analyzing headset compared to estimated V˙O2 based on HR;
3. The influence of wind on the validity of the system;
4. The user experience of the developed headset system.

Subjects
The 15 subjects in this study were healthy volunteers, recruited by distributing flyers at a university and hospital. Inclusion criteria were an age between 18 and 67 years and being an experienced daily cyclist (at least 30 min at 15 km/h). Exclusion criteria were use of pacemakers or other vital electronic devices; lung, heart and/or vascular diseases; and injuries to or dysfunction of the lower extremities.
The Medical Ethics Committee of the University Medical Center Groningen, the Netherlands, issued a waiver for this study (stating that it does not involve medical research under Dutch law), and the study was approved (M16.190,947).

Study design and procedures
In this concurrent validity study, the validity of the breathing-gas analyzing headset was investigated and compared to the reference system, an indirect calorimetry breathing-gas system using a mouth mask, and heart rate monitoring. Healthy subjects performed a submaximal cycling test on an ergometer (Ergoline GmbH, Bitz, Germany; Hoehn et al., 2018;Rexhepi and Brestovci, 2011) twice in the same setup: once with the headset and once with the mouth mask and HR measurement. The submaximal cycling test contained four phases: (I) resting, (II) cycling without resistance, (III) cycling with a 75 W load and (IV) cycling with a 125 W load (Table 1). Cycling loads were selected based on the workload of different types of workers. Phases I (resting) and II (0 W) represented the workload of office workers, and phases III (75 W) and IV (125 W) represented the workload of physically active workers. To examine the influence of wind on the validity of the headset, each phase was structured as follows: (a) a steady state period of 2 min; (b) a 45-s measurement period; (c) simulated wind flow (using a fan at a distance of 1 m from the subject, with a wind speed of 10 ± 1 m/s) from the side for 45 s; (d) simulated wind flow from the front for 45 s; and (e) fluctuating wind flow in front for 45 s (Table 1). The total duration of the cycling test was 20 min (5 min resting and 15 min cycling) with a constant cycling speed of 65-70 rpm. To eliminate the effects of fatigue, the order of cycling with the headset or with the mouth mask and HR was alternated between subjects. The two cycling tests were performed on the same day with a break of at least 15 min in between to allow for recovery.
Fig. 1. The head part of the breathing gas analyzing headset. The head part with the air shield with flow sensor, temperature and wind sensor and a sampling tube to transport the breathing gases to the back part.
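The phase structure described above can be checked with a short sketch. It only re-adds the durations stated in the text (a 2-min steady state plus four 45-s segments per phase, four phases); the names `PHASES`, `SEGMENTS` and `build_schedule` are illustrative assumptions and not part of the study software.

```python
# Durations taken from the protocol description; all names are illustrative.
STEADY_STATE_S = 2 * 60   # (a) steady state period, in seconds
SEGMENT_S = 45            # (b)-(e) each last 45 s
SEGMENTS = [
    "measurement without wind",
    "wind from the side",
    "wind from the front",
    "fluctuating wind from the front",
]
PHASES = ["I: resting", "II: 0 W", "III: 75 W", "IV: 125 W"]


def phase_duration_s():
    """Length of one phase in seconds: steady state plus all 45-s segments."""
    return STEADY_STATE_S + SEGMENT_S * len(SEGMENTS)


def build_schedule():
    """Return (phase, segment, start_s, end_s) tuples for the whole test."""
    schedule = []
    t = 0
    for phase in PHASES:
        parts = [("steady state", STEADY_STATE_S)]
        parts += [(name, SEGMENT_S) for name in SEGMENTS]
        for name, duration in parts:
            schedule.append((phase, name, t, t + duration))
            t += duration
    return schedule


schedule = build_schedule()
print(phase_duration_s() / 60, "min per phase")
print(schedule[-1][3] / 60, "min in total")
```

Each phase comes to 5 min and the four phases to 20 min, which matches the stated total of 5 min resting and 15 min cycling.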
To position the headset, the subjects were asked to breathe at rest for about 20 s and then to blow briefly (about 3 s) harder through the nose and mouth. The headset was considered correctly positioned when the subject felt the air flow bounce lightly off the part in front of the mouth and when the air flow was clearly visible in the recording. The mean distance between the mouth and the sensor in the headset was 4 cm (range 3-5 cm), and between the nose and the headset 5 cm (range 3-6 cm).
Subjects' experiences with the headset were explored using the user interface design method AEIOU (activities, environments, interactions, objectives and users). In this descriptive observational study, the subjects (users) provided feedback (interactions) by thinking aloud. The user interface was explored through researcher (AEIOU) observations. Subjects were asked to think aloud as they were putting on and taking off the headset and mouth mask (objects), and during and after the cycling test (activities). At the end of the study, the subjects were asked questions about their experiences in terms of comfort, functionality, adjustability and positioning, and usability (Likert, 1932;Finstad, 2010). They were also given the opportunity to provide additional feedback or comments. The study was performed in an exercise laboratory at a medical rehabilitation center under constant ambient conditions (an ambient temperature of 21.0 ± 2.0 °C) (environment).

Breathing gas analyzing headset
The indirect calorimetry breathing-gas analyzer without a mouth mask is a wearable headset. It contains oxygen (O2), carbon dioxide (CO2), flow, wind, temperature and humidity sensors. Flow was measured in front of the mouth, and wind speed and temperature were measured at the side of the headset. Breathing gases (from the nose and mouth) were collected by an air shield on the headset in front of the mouth. This air shield (height 35 mm, width 45 mm and depth 12 mm) is designed to collect breathing gases from the nose and mouth as illustrated in Fig. 2. The air shield houses the flow sensor and the sampling tube, which transports the breathing gases to the rear box (located at the back of the worker) for analysis. This box contained the O2, CO2, temperature and humidity sensors, two pumps with an extraction speed of 3.2 L/min and a battery pack (see Fig. 3). The O2 sensor (Oxygen Sensor OOM109-LF2; EnviteC-Wismar GmbH, Wismar, Germany) has a measurement range of 0-100% with an accuracy of <1%, an operating temperature range of 0-50 °C and a response time of <300 ms (T90) (Envitec by Honeywell, 2008). The TreyMed Comet II CO2 sensor (TreyMed, Inc., Pewaukee, Wisconsin, USA) has a measurement range of 0-13% with an accuracy of ±0.2 mmHg or 5% of the actual concentration, an operating temperature range of 5-55 °C and a response time of <28 ms (TreyMed, 2007). The thermal mass-flow sensor FS5 has a measurement range of 0-100 m/s with an accuracy of <3%, an operating temperature range of -20 to 150 °C and a response time of 160 ms (Innovative Sensor Technology, Unknown). To measure and correct for environmental wind, the Rev. P hot-wire anemometer (Modern Device, Providence, Rhode Island, USA) was used, which has a measurement range of 0-67 m/s and an accuracy of 1% (Modern Device, 2019; Prohasky and Watkins, 2014). The head part weighed 190 g and the back part 1255 g, resulting in a total weight of the breathing gas analyzing headset of 1445 g.
The headset had a sampling interval of <1 s. V˙O2, V˙CO2 and respiratory exchange ratio (RER = ratio of V˙CO2 over V˙O2) were calculated per 5 s (in line with the reference system). The sensors were calibrated according to the instructions in their manuals using a standard certified commercial gas preparation (range O2 from 16 to 21%, CO2 from 0.05 to 5%, breathing frequency from 5 to 45 breaths per minute, V˙E from 10 to 60 L/min). In-vitro and in-vivo studies explored the sensitivity of the O2, CO2 and flow sensors to capture nose and mouth breathing. To standardize for metabolic calibration, the headset data were trained with machine learning (computational learning theory) on learning data in a two-step approach. These learning data were gathered in two pilot studies including, in total, 26 subjects. During these pilot studies, the subjects were measured at rest and when cycling with up to 125 W resistance, according to the protocol described in the study design. These data contained an O2 range of 13-21%, CO2 from 0.05 up to 6% and a breathing frequency range of 5-45 breaths per minute. Additionally, the sensitivity of the flow sensor to capture nose and mouth breathing was explored with a breathing frequency ranging from 5 to 20 breaths per minute (by mouth breathing and by nose breathing). An algorithm was developed based on the best fitting model (Pearson correlation coefficient), resulting in a linear regression model using the gradient descent methodology per parameter, i.e., V˙O2, V˙CO2, RER and its parameters fraction inhaled (FiO2) and exhaled O2 (FeO2), fraction inhaled (FiCO2) and exhaled CO2 (FeCO2) and exhaled volume (V˙E). Different (adaptive) filtering techniques (including (extended) Kalman and (normalized and recursive) least mean squares) were examined. Based on the outcomes of the learning data, no filter was applied during this study.
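To illustrate what a linear regression fitted by gradient descent looks like for a single parameter, the following is a minimal sketch. It is not the algorithm used in the headset: the single-feature model, the learning rate, the iteration count and the synthetic data are all assumptions made for illustration only.

```python
def fit_linear_gd(x, y, learning_rate=0.1, iterations=5000):
    """Fit y ~ slope * x + intercept by batch gradient descent on the mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(x)
    for _ in range(iterations):
        errors = [slope * xi + intercept - yi for xi, yi in zip(x, y)]
        # Gradients of mean((slope * x + intercept - y)^2) w.r.t. slope and intercept.
        grad_slope = 2.0 / n * sum(e * xi for e, xi in zip(errors, x))
        grad_intercept = 2.0 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Synthetic example (not study data): raw headset values versus reference values
# that follow an exact linear relation, reference = 1.5 * raw + 0.1.
raw_headset = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
reference = [1.5 * v + 0.1 for v in raw_headset]

slope, intercept = fit_linear_gd(raw_headset, reference)
print(round(slope, 4), round(intercept, 4))
```

In this sketch the fitted slope plays the role of a magnifying factor that maps the diluted headset signal onto the reference scale. A real calibration would use the pilot-study learning data and would need to be checked on separate subjects.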

Indirect calorimetry with mouth mask
As a reference, respiratory breath-by-breath gas analysis was measured with CareFusion's JAEGER™ Vyntus™ CPX (CareFusion Germany 234 GmbH, Hoechberg, Germany). This indirect calorimetry breathing-gas analyzing system is an accurate and reliable method (Perez-Suarez et al., 2018;Carlomagno et al., 2015) that is used for (medical) diagnostics (Skrgat et al., 2018;Rokkedal-Lausch et al., 2019). The mouth-mask system has a ventilation measurement range of 0-300 L with an accuracy of 2% or 0.5 L/min, a volume range of 0-10 L with an accuracy of 2% or 50 mL, and a V˙O2 and V˙CO2 measurement range of 0-7 L/min with an accuracy of 3% or 0.05 L/min (CareFusion, 2016). The resolution is 0.01 vol% with a response time of (T10-90) = 75 ms (CareFusion, 2016). The flow range is 0-15 L/s with an accuracy of 3% or 70 mL/s, and the calculated RER has a measurement range of 0.6-2.0 with an accuracy of 4% or 0.04 (CareFusion, 2016).

Heart rate monitor
HR was measured using the Cardiac Acquisition Module (CAM-14) of the CardioSoft™ Diagnostic System Exercise Stress Testing ECG application (GE Healthcare, Wauwatosa, US). This 15-lead ECG has a sampling rate of 16,000 samples/second per lead with an analyzing frequency of 500 samples/second. The dynamic range is 320 ± 10 mV with a resolution of 4.88 μV/LSB at 500 Hz and <15 μV noise (GE Healthcare, 2017). HR was registered using automatic arrhythmia detection (GE Healthcare, 2017).

Data analysis
The main measurement parameter for studying the headset's validity (aims 1-3) was V˙O2, and the secondary parameters were V˙CO2 and RER. The mean value of the last 30 s of every measurement was used for data analysis. Validity (aims 1-3) was tested using paired t-tests (normally distributed data) or the Wilcoxon signed-rank test (non-normally distributed data) and by the intraclass correlation coefficient (ICC, two-way mixed model, absolute agreement) per parameter. The mean difference (MD) was shown per result with the standard deviation (SD). The ICC was considered excellent when ICC ≥ 0.80; good when 0.60 ≤ ICC ≤ 0.79; moderate when 0.40 ≤ ICC ≤ 0.59; and low when ICC < 0.40 (Cicchetti, 1994), and was presented with limits of agreement (LoA; MD ± 1.96 × SD of the differences; Bland and Altman, 1999). Bland-Altman plots were used to analyze individual differences between two measurement methods (headset, mouth mask and/or HR) against the individual mean of the headset (Bland and Altman, 1999). Level of significance was set at p ≤ 0.05. The following outcomes were interpreted as acceptable for proof-of-concept: accuracy of ±5%; moderate, good or excellent ICC compared to the mouth mask; and an ICC for the headset higher than the ICC for HR compared to the mouth mask. The scores for user experiences (aim 4) were presented as median and interquartile range. A good or excellent user rating (Likert score ≥4) was defined as acceptable for this proof-of-concept.
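The agreement statistics described above can be sketched as follows. This is an illustration under stated assumptions, not the statistical software used in the study: the ICC is computed as the single-measure, absolute-agreement form (the text does not state whether single or average measures were used), the classification bands are written as continuous cut-offs, and the example values are invented.

```python
from statistics import mean, stdev


def icc_absolute_agreement(x, y):
    """Single-measure, absolute-agreement ICC for two methods measured on the same subjects."""
    n, k = len(x), 2
    values = list(x) + list(y)
    grand = mean(values)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # per-subject means
    col_means = [mean(x), mean(y)]                    # per-method means
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k / n * (ms_cols - ms_error)
    )


def classify_icc(icc):
    """Bands from the text (Cicchetti, 1994), written as continuous cut-offs."""
    if icc >= 0.80:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "moderate"
    return "low"


def limits_of_agreement(headset, mouth_mask):
    """Mean difference (headset minus mouth mask) and the Bland-Altman limits of agreement."""
    diffs = [h - m for h, m in zip(headset, mouth_mask)]
    md = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return md, md - 1.96 * sd, md + 1.96 * sd


# Invented example values in L/min, for illustration only.
headset = [1.0, 2.0, 3.0]
mouth_mask = [1.5, 1.5, 3.0]
icc = icc_absolute_agreement(headset, mouth_mask)
print(classify_icc(icc), limits_of_agreement(headset, mouth_mask))
```

A constant offset between the two methods lowers this ICC even when the values are perfectly correlated, which is why absolute agreement is a stricter criterion than a correlation coefficient.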

Results
The 15 subjects (eight male and seven female) had an age (mean ± SD) of 31.0 ± 14.4 years, a height of 180.9 ± 9.1 cm and a weight of 79.6 ± 11.2 kg. Of these 15 subjects, eight started with the headset, followed by the mouth mask, and seven started with the mouth mask, followed by a measurement with the headset. Table 2 shows the MD and ICC of V˙O2, V˙CO2, V˙E and RER between the measurements with the headset compared with the mouth mask. For V˙O2 across all phases, the difference between the headset and mouth mask was acceptable (MD = 1.93%), and an excellent ICC was observed (ICC = 0.86). For V˙CO2 and V˙E over all phases and while cycling at 75 W, differences between the headset and mouth mask were acceptable (MD ≤ 3.96%) with a moderate to good ICC (ICC ≥ 0.48). In other phases, the differences between the headset and mouth mask exceeded the acceptable level of 10% (MD ≥ 9.96%), and a low ICC with the mouth mask was observed (ICC ≤ 0.40). The Bland-Altman plot (Fig. 4) shows a proportional error for V˙O2 and V˙CO2 at 125 W. However, this error was not present across all phases. The headset tended to overestimate V˙O2, V˙CO2 and V˙E at low intensities (resting and 0 W) and underestimate them at higher intensities (75 W and 125 W).

Validity of the headset compared to the mouth mask
For RER, across all phases and at rest, differences between the headset and mouth mask were acceptable (MD ≤ 2.34%), and a moderate ICC was observed across all phases (ICC = 0.50). When cycling, the acceptable levels were exceeded (MD ≥ 5.37%), and within the different phases, low correlations were observed (ICC ≤ 0.33). Appendix A presents the validity of the parameters behind calculation of V˙O2 and V˙CO2.

Validity of the headset compared to validity of V˙O2 estimated according to heart rate
Since V˙O2 based on HR can be calculated only if HR is between 125 and 170 beats/min (Astrand and Ryhming, 1954), the validity of V˙O2 estimated according to HR could only be determined during cycling at 125 W. In this phase, seven subjects had an HR between 125 and 170 beats/min (149.14 ± 17.18 beats/min); seven subjects had an HR below the lower limit (103.7 ± 30.1 beats/min); and one subject had an HR above the upper limit (180.1 beats/min). Table 3 shows the validity of V˙O2 measured with the headset compared to the mouth mask, and the validity of estimations based on HR compared to the mouth mask for these seven subjects.
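The selection step described above, keeping only subjects whose HR falls inside the 125-170 beats/min range, can be sketched as below. The sketch deliberately does not implement the Åstrand-Ryhming nomogram itself; the subject identifiers and HR values are hypothetical, and the function names are assumptions.

```python
HR_MIN_BPM = 125  # lower limit of the applicable HR range stated in the text
HR_MAX_BPM = 170  # upper limit of the applicable HR range stated in the text


def nomogram_applicability(hr_bpm):
    """Say whether an HR value lies below, within or above the applicable range."""
    if hr_bpm < HR_MIN_BPM:
        return "below range"
    if hr_bpm > HR_MAX_BPM:
        return "above range"
    return "within range"


def usable_subjects(hr_by_subject):
    """Return the subject IDs whose HR allows an HR-based V'O2 estimate."""
    return [
        subject
        for subject, hr in hr_by_subject.items()
        if nomogram_applicability(hr) == "within range"
    ]


# Hypothetical subject IDs and HR values at 125 W, for illustration only.
example_hr = {"S01": 149.1, "S02": 103.7, "S03": 180.1}
print(usable_subjects(example_hr))
```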
V˙O2 estimated with HR compared to V˙O2 measured with the mouth mask was within the acceptable level (MD = 9.54%). V˙O2 measured with the headset exceeded the acceptable level (MD = 13.05%). The correlations fulfilled the proof-of-concept criteria; the headset had a higher ICC (ICC = 0.39) than V˙O2 calculated with HR (ICC = 0.11). The Bland-Altman plots are shown in Fig. 5. Table 4 presents the influence of wind for both the mouth mask and the headset in terms of V˙O2, V˙CO2, V˙E and RER. Appendix C presents the comparison between the headset with and without wind, and between the mouth mask with and without wind. Appendix C also presents the effects of wind per phase.

Validity of results under the influence of wind
The mean difference between the headset with wind and the mouth mask without wind across all phases was increased due to the wind. Except for V˙O2 with fluctuating wind (MD = 5.55%) and V˙E with wind from the front (MD = 16.14%), the differences were still acceptable (MD ≤ 4.75%). All ICCs between the headset and the mouth mask were lowered from good to moderate and low (ICC ≤ 0.51). At low intensities, the headset tended to overestimate V˙O2, V˙CO2 and V˙E, indicating a major influence of wind on the validity of the headset and resulting in ICCs varying from good to low. This proportional error was also visible in the Bland-Altman plot of V˙O2, V˙CO2 and V˙E (Fig. 6) in all three wind directions.

User experience
All subjects preferred the headset to the mouth-mask system. One subject stated: "After the test with the system with mouth mask, I understand why the headset is being designed. The mouth-mask system is not comfortable, causes irritation and impedes communication." The results of the questionnaire are presented in Fig. 7. All medians were 4.
The headset was considered to be more practical due to the fact that it does not hinder communication and does not cause breathing resistance or the "trapped feeling" that the mouth mask does. Subjects felt that absence of the mouth mask made it more comfortable, less obtrusive and therefore preferable to the system with a mouth mask, for use during the working day. Two subjects mentioned that they experienced much greater breathing resistance with the mouth mask, which influenced their breathing pattern. Four subjects were not able to comment on the potential usability of this version in the workplace. One subject mentioned that for long-term wear (throughout the day), the headset could be heavy, and a lighter model would be preferable. Moreover, two subjects (and the researchers themselves) felt that hygiene of the headset needed to be improved. The fabric layer on the forehead could be cleaned, but it absorbed sweat which would be undesirable if the headset was shared by different users. Furthermore, the flow sensor in the air shield catching gases inhaled through the nose and mouth was complex and time-consuming to clean, also requiring optimization before implementation in the workplace. All but one subject felt that the headset was stable once in position during cycling and head movements. One subject mentioned that the arm of the headset was constantly visible at the corners of the visual field, which was disturbing.
Table caption: Mean differences (MD: headset − mouth mask) and p-value and the intraclass correlations (ICC) with a confidence interval (CI) of 95%, p-value and Limits of Agreement (LoA) of oxygen consumption (V˙O2), carbon dioxide production (V˙CO2) and Respiratory Exchange Ratio (RER) measured with the reference system indirect calorimetry with mouth mask against the headset. a p-value of paired t-test. b p-value of ICC.
According to all subjects, the headset did not interfere during the cycling test and seemed to be feasible during performance of physically active jobs. A final comment made by one of the subjects was that the design of the headset and back part is unattractive and too big and heavy. Subjects indicated the need for a professionalized prototype.

Discussion
The aim of this study was to evaluate the validity and usability of a newly developed headset to measure energetic workload. The headset showed moderate to good validity when compared to the mouth mask for V˙O2, V˙CO2 and RER. The headset had better concurrent validity with the mouth mask (gold standard) than V˙O2 estimated on the basis of HR. Users considered the headset to be more practical than the mouth mask. However, at low air flows (e.g., when the user was resting), lower values for V˙O2 and V˙CO2 were measured, and magnifying algorithms were required. This led to a tendency to overestimate gas concentrations at high air flows (as when cycling at 125 W). The disturbing wind flow caused overestimation of the volume (mainly when the user was resting) and decreased the validity from good without wind to moderate or low with wind flow. However, the mean difference for V˙O2 across all phases was still acceptable. More extensive training and machine learning are required to optimize the algorithm of the headset, especially at high and low breathing volumes. Moreover, large intra-individual differences of V˙O2, V˙CO2 and RER were observed. In line with the comments of two subjects, the researchers noticed that the breathing pattern differed between cycling with the headset versus the mouth mask, which could have affected the outcomes. Before the headset can be implemented in the workplace, its design needs to be professionalized. Overall, the usability of the headset is promising. Because most of the quality criteria were fulfilled, proof-of-concept of the present version of the headset was supported.
Table caption: Mean differences (MD: headset − mouth mask) and p-value and the intraclass correlations (ICC) with a confidence interval (CI) of 95%, p-value and Limits of Agreement (LoA) of oxygen consumption (V˙O2) measured with the reference system indirect calorimetry with mouth mask against the headset and heart rate (HR) using the Åstrand-Ryhming nomogram. a p-value of paired t-test. b p-value of ICC.
Table caption: Mean differences (MD: headset − mouth mask) and p-value and the intraclass correlations (ICC) with a confidence interval (CI) of 95%, p-value and Limits of Agreement (LoA) of oxygen consumption (V˙O2), carbon dioxide production (V˙CO2), exhaled volume (V˙E) and Respiratory Exchange Ratio (RER) measured with the reference system indirect calorimetry with mouth mask (mouth mask) against the headset. The mouth mask and headset are compared with each other while a wind flow (10 m/s) was blowing at the systems from the side, front and fluctuating in front. a p-value of paired t-test. b p-value of ICC.
A key strength of this study is the use of this completely new and innovative breathing-gas analyzing system and comparison with current (medical and practical) standards in order to explore its concurrent validity and usability. This is the first study in which such a device has been presented and validated. The strength of the system is the lack of mouth mask, making it a more wearable design (headset), which provides the opportunity for users to communicate while wearing it. The system is comfortable and non-obtrusive and can potentially be used for prolonged periods of time. Individual physiological responses during activities can be investigated with this system. Our study included a broad range of subjects varying widely in age, physical fitness and smoking habits. While all subjects passed the minimum cycling requirement of 30 min a day at 15 km/h, some far exceeded this lower limit. Some subjects smoked, and others did not. The variation in V˙O2 and V˙CO2 between subjects suggests that physical fitness varied; however, measurement of the subjects' maximum physical capacity was not needed for the objective of this study. Another strength of this study is the simulation of environmental conditions during outdoor work by correcting the system for wind.
A limitation of this study is the method used to calibrate the sensors on the headset. An optimal calibration procedure and suitable equipment are not yet available for this device. This could have caused deviations in the measurements within and between subjects. It is expected that these deviations were small and irrelevant due to the constant ambient laboratory conditions. However, the development of a calibration procedure requires further attention. In future studies, it would be preferable to validate the system against direct calorimetry methods. Moreover, the sample size in this study was limited for comparison with the validity of the V˙O2 estimated on the basis of heart rate (n = 7), so the results only provide first indications. The limited amount of data that could be used for this comparison was caused by the limiting heart rate range (125-170 beats/min) of the Åstrand-Ryhming nomogram. However, this also indicates that the headset would be applicable in a much wider activity range compared to the estimation based on heart rate. The reliability and accuracy of the headset will always be lower than with systems using a mouth mask. A mouth mask creates a closed environment in which all breathing gases pass one opening, which makes it easier to precisely measure all concentrations. The headset creates more open conditions in which breathing gases are directly mixed with environmental gases, resulting in lower measured O2 and CO2 concentrations. Magnifying algorithms and prediction models can in part compensate for this limitation.
Now that proof-of-concept has been established, further development and research need to be undertaken to increase the system's accuracy, validity and reliability. A professional prototype needs to be developed (TRL 5) based on this proof-of-concept, and the filtering methods, algorithms and models need to be optimized using more learning datasets. The design and mechanical features of the proof-of-concept version could be improved by designing a smaller headset with fewer fixation points around the head; a wind shield around the flow sensor; and a more professionalized and attractive design. Additionally, individual fitting, the optimal distance in front of the nose and mouth, and the ability to sustain that position need further attention. With this improved prototype, more validation research and research on test-retest reliability should be performed with a larger sample, including the target group of physically active workers. Moreover, the influence of working activities (e.g. noise, vibration and body movements) on the validity of the system should be studied. In addition, the validity and usability of this headset need to be investigated in real-life working conditions (Havenith and Heus, 2004) associated with different kinds of physically demanding occupations. Furthermore, the future system should provide the user with personalized real-time feedback.
There is increasing interest in the potential of this breathing-gas analyzing headset to monitor physiological responses of individuals during different kinds of activities (Tamura, 2019;Liu et al., 2019). The headset can be used to monitor individual responses to activity over full working days. Aside from the importance of such a system for monitoring the V˙O2 of physically active workers, there is also the potential for this system to be used in (occupational) healthcare and sports settings. The headset system could be an objective measurement tool for monitoring the energetic workload and capacity of individuals during various activities (working, rehabilitation and sports) carried out in users' actual environment over longer periods of time. This system is likely to be of interest as a low-level, comfortable and easy-to-use device for monitoring the physical fitness of subjects during their training in multiple settings. As the system can be used over longer periods of time in a comfortable manner, more complete information about the subject can be gathered rather than snapshots. This headset could fill a gap in the existing range of instruments for measuring energy consumption by being more valid than heart rate measurements and more useable than indirect calorimetry measurements with a mouth mask.

Conclusion
The gas-exchange measuring headset has shown moderate to good validity compared to indirect calorimetry using a mouth mask for measuring V˙O2, V˙CO2, V˙E and RER. The headset is more valid than V˙O2 estimated on the basis of HR. Wind disturbances hampered the validity of the headset, but even with wind, the validity of the system remained acceptable. Users experienced the headset as more comfortable and useable than the mouth-mask system. The present version is not yet completely valid, but the results support its potential and indicate opportunities for further professionalization.

Funding
This work, part of SPRINT@Work, was supported by the European Regional Development Fund, the province and municipality of Groningen, and the province of Drenthe [grant number T-3036, 2013].

Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: C.C. Roossien, G.J. Verkerke and M.F. Reneman declare a pending patent application of the University of Groningen and University Medical Center Groningen on the development of the breathing-gas analyzing headset [application number EP19189792.5]. All authors declare that they have no known financial interests related to this patent application or personal relationships that could have appeared to influence the work reported in this paper.

Mean with standard deviation (SD) (mean ± SD) for oxygen consumption (V˙O2) and carbon dioxide production (V˙CO2) in L/min and respiratory exchange ratio (RER) measured with the mouth mask and headset.
Mean differences (MD: headset − mouth mask) with p-value, intraclass correlations (ICC) with 95% confidence interval (CI) and p-value, and Limits of Agreement (LoA) of inhaled (FiO2) and exhaled (FeO2) oxygen concentration, inhaled (FiCO2) and exhaled (FeCO2) carbon dioxide concentration, and ventilatory efficiency (V˙E/V˙CO2) measured with the reference system (indirect calorimetry with mouth mask) against the headset. * Non-parametric data. a: p-value of paired t-test. b: p-value of ICC.
Mean with standard deviation (SD) (mean ± SD) for oxygen consumption (V˙O2) measured with the mouth mask and headset, and heart rate (HR) in beats/min, per phase.
Mean with standard deviation (SD) (mean ± SD) for oxygen consumption (V˙O2) and carbon dioxide production (V˙CO2) in L/min and respiratory exchange ratio (RER) measured with the mouth mask per phase under the influence of wind flow from the side, from the front and fluctuating from the front.
Mean differences (MD) with p-value, intraclass correlations (ICC) with 95% confidence interval (CI) and p-value, and Limits of Agreement (LoA) of oxygen consumption (V˙O2), carbon dioxide production (V˙CO2), exhaled volume (V˙E) and respiratory exchange ratio (RER) measured with the reference system (indirect calorimetry with mouth mask) and the headset. The mouth mask and headset are compared with themselves while a wind flow (10 m/s) was blowing at the systems from the side, from the front and fluctuating from the front. a: p-value of paired t-test. b: p-value of ICC.
Mean differences (MD: headset − mouth mask) with p-value, intraclass correlations (ICC) with 95% confidence interval (CI) and p-value, and Limits of Agreement (LoA) of oxygen consumption (V˙O2) measured with the reference system (indirect calorimetry with mouth mask) against the headset. The mouth mask and headset are compared with each other and with themselves, per phase, while a wind flow (10 m/s) was blowing at the systems from the side, from the front and fluctuating from the front. a: p-value of paired t-test. b: p-value of ICC.
Mean differences (MD: headset − mouth mask) with p-value, intraclass correlations (ICC) with 95% confidence interval (CI) and p-value, and Limits of Agreement (LoA) of carbon dioxide production (V˙CO2) measured with the reference system (indirect calorimetry with mouth mask) against the headset. The mouth mask and headset are compared with each other and with themselves, per phase, while a wind flow (10 m/s) was blowing at the systems from the side, from the front and fluctuating from the front. a: p-value of paired t-test. b: p-value of ICC.
Mean differences (MD: headset − mouth mask) with p-value, intraclass correlations (ICC) with 95% confidence interval (CI) and p-value, and Limits of Agreement (LoA) of exhaled volume (V˙E) measured with the reference system (indirect calorimetry with mouth mask) against the headset. The mouth mask and headset are compared with each other and with themselves, per phase, while a wind flow (10 m/s) was blowing at the systems from the side, from the front and fluctuating from the front. a: p-value of paired t-test. b: p-value of ICC.
Mean differences (MD: headset − mouth mask) with p-value, intraclass correlations (ICC) with 95% confidence interval (CI) and p-value, and Limits of Agreement (LoA) of respiratory exchange ratio (RER) measured with the reference system (indirect calorimetry with mouth mask) against the headset. The mouth mask and headset are compared with each other and with themselves, per phase, while a wind flow (10 m/s) was blowing at the systems from the side, from the front and fluctuating from the front.
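
For readers unfamiliar with the agreement statistics named in these captions, the following is a minimal sketch of how a mean difference (MD: headset − mouth mask) and 95% Limits of Agreement might be computed from paired measurements. It assumes the conventional Bland-Altman form (MD ± 1.96 × SD of the paired differences); the function name `agreement` and the example values are hypothetical and are not taken from the study data or from the authors' analysis.

```python
from statistics import mean, stdev


def agreement(headset, mask):
    """Return (MD, lower LoA, upper LoA) for paired measurements.

    MD is the mean of the paired differences (headset - mask).
    The limits of agreement are assumed to be MD +/- 1.96 * SD,
    where SD is the sample standard deviation of those differences.
    """
    if len(headset) != len(mask) or len(headset) < 2:
        raise ValueError("need two paired series of equal length (n >= 2)")
    diffs = [h - m for h, m in zip(headset, mask)]
    md = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 in the denominator)
    return md, md - 1.96 * sd, md + 1.96 * sd


# Hypothetical V'O2 values in L/min, for illustration only (not study data).
headset_vo2 = [1.0, 1.5, 2.0, 2.5]
mask_vo2 = [1.1, 1.5, 1.9, 2.3]

md, lower, upper = agreement(headset_vo2, mask_vo2)
print(f"MD = {md:.3f} L/min, LoA = [{lower:.3f}, {upper:.3f}] L/min")
```

A positive MD in this convention means the headset reads higher than the mouth mask on average; the width of the interval between the two limits indicates how far individual paired readings may be expected to disagree.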