Performance of Microsoft Azure Kinect DK as a tool for estimating human body segment lengths

The Microsoft Kinect depth sensor, with its built-in software that automatically captures joint coordinates without markers, could be a potential tool for ergonomic studies. This study investigates the performance of Kinect in estimating limb segment lengths, using dual-energy X-ray absorptiometry (DXA) as a reference. Healthy children and adults (n = 76) were recruited for limb length measurements by Kinect and DXA. The results showed consistent ratios of arm, forearm, thigh, and leg lengths to height, which were 0.16, 0.14, 0.23, and 0.22 respectively, for both age groups and methods. Kinect exhibited perfect correlation among all limb lengths, indicating that its algorithm assumes fixed proportions. Comparing the two methods, there was a strong correlation (R = 0.850–0.985) and good to excellent agreement (ICC = 0.829–0.977), except for the right leg in adults, where agreement was slightly lower but still moderate (ICC = 0.712). The measurement bias between the methods ranged from −1.455 to 0.536 cm. In conclusion, Kinect yields outcomes similar to DXA, indicating its potential utility as a tool for ergonomic studies. However, the built-in algorithm of Kinect assumes fixed limb proportions for individuals, which may not be ideal for studies focusing on limb discrepancies or anatomical differences.


Participants
Eligible participants were healthy Taiwanese individuals aged 6-60 years. Exclusion criteria were pregnancy, known chronic disease, limb defect, and pacemaker implant. Anthropometric and Kinect measurements were performed by trained research assistants. DXA examinations were conducted by a certified DXA technician. All measurements were conducted at the Radiology Department of Chang Gung Memorial Hospital at Chiayi. Prior to the study, participants were instructed to fast for four hours and to empty their bladders.

Anthropometric measure
Body height and weight were measured using a digital scale (Super-View, HW-3050, Taipei, Taiwan) with participants wearing lightweight clothing and no shoes. Weight was recorded to the nearest 0.1 kg and height to the nearest 0.1 cm. A single measurement was taken for each participant.

Azure Kinect setup and data acquisition
The Kinect measurements were conducted in a windowless examination room illuminated primarily by uniform artificial lighting. The Azure Kinect DK (Microsoft Inc., Redmond, WA, USA) was positioned in front of the participant on a tripod at a height of 1.1 m, approximately 1.7 m away from the individual being recorded (Fig. 1A). For individuals whose height exceeded the field of view of the Kinect sensor, the tripod was positioned 1.9 m away. Raw sensor data were captured at a sampling rate of 30 Hz in the narrow field-of-view mode without binning, at a resolution of 640 × 576. Acquisition used the Azure Kinect SDK v1.4.0 and the Azure Kinect Body Tracking SDK v1.1.0, run within the Visual Studio Code 2019 compiler environment using the C/C# programming languages. The software ran on a Gigabyte laptop computer equipped with an 11th Generation Intel® Core™ i7-11800H processor with a boost clock speed of 4.6 GHz and an NVIDIA GeForce RTX 3080 mobile graphics card, under the Windows 10 operating system.
Participants were instructed to move their limbs slowly to allow the Kinect to capture the joints and then to stand still with their legs apart and hands held away from their torso (Fig. 1B). Ten static depth images per participant were captured after repositioning and used to obtain joint coordinates for measurement. The entire Kinect measurement process lasted approximately 15 min. The Azure Kinect Body Tracking SDK automatically provided 32 joint coordinates for the human body. This investigation focused on eight major limb segments, specifically the upper arm, forearm, thigh, and leg on both sides of the body.

DXA setup and data acquisition
Whole-body images were acquired using a fan-beam DXA system (Horizon W, Hologic, Inc.) equipped with Hologic Apex version 5.6. According to the manufacturer's product specifications, a whole-body DXA scan by the Hologic Horizon W scanner requires 272 s with 15 µSv of radiation exposure, and the scan length and width are 195.5 cm and 65.5 cm, respectively. Prior to image analysis, DXA images in JPEG format were downloaded, resized to their actual dimensions, and rescaled to the known scan length using Image J 1.54 (National Institutes of Health, Bethesda, MD, USA) by a trained research assistant. A radiologist with more than 20 years of experience in DXA then measured, once per image, the biomechanical lengths of the upper arm, forearm, thigh, and leg on both sides of the body 13,14. The intraclass correlation coefficient (ICC) among a single rater's three measurements on the same DXA images of the first 15 participants ranged from 0.960 to 0.996, indicating almost perfect agreement.
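The rescaling step above amounts to a pixel-to-centimetre calibration against the known scan length. A minimal sketch of that idea follows; it assumes the image height spans the full 195.5 cm scan length and that pixels are square, and the image size and landmark positions are hypothetical values chosen for illustration, not data from this study or a description of the Image J workflow itself.

```python
import math

SCAN_LENGTH_CM = 195.5  # whole-body scan length quoted for the Horizon W


def cm_per_pixel(image_height_px, scan_length_cm=SCAN_LENGTH_CM):
    """Return the scale factor, assuming the image height spans the full scan length."""
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    return scan_length_cm / image_height_px


def landmark_distance_cm(p1, p2, scale):
    """Distance in cm between two (x, y) pixel landmarks, assuming square pixels."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1]) * scale


# Hypothetical image height and landmark positions, for illustration only.
scale = cm_per_pixel(1955)
humeral_head = (420, 380)
humeroradial_mid = (400, 690)
upper_arm_cm = landmark_distance_cm(humeral_head, humeroradial_mid, scale)
print(f"{scale:.3f} cm/px, upper arm = {upper_arm_cm:.1f} cm")
```

Calibrating against the scan length rather than the image's nominal resolution is what makes the measurement independent of how the exported JPEG was resized.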

Theory/calculation
To quantify the body segment lengths by Kinect, measurements were taken from the shoulder to the elbow for the upper arm, from the elbow to the wrist for the forearm, from the hip to the knee for the thigh, and from the knee to the ankle for the leg (Supplementary Table 1). The length of each body segment was calculated as the Euclidean distance d between two joints, defined by their coordinates (x_1, y_1, z_1) and (x_2, y_2, z_2), using Eq. (1):

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \qquad (1)$$

To quantify the body segment lengths by DXA, measurements were taken from the center of the humeral head to the midpoint of the humeroradial joint for the upper arm, from the midpoint of the humeroradial joint to the midpoint of the radiocarpal joint for the forearm, from the femoral head center to the midpoint of the tibial condyles for the thigh, and from the midpoint of the tibial condyles to the mid-width of the talus for the leg (Fig. 2).
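The Kinect segment-length calculation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's code: joint positions are taken to be in millimetres (the unit the Azure Kinect Body Tracking SDK reports), the dictionary keys are simplified stand-ins for the SDK's joint identifiers, and the coordinates are invented.

```python
import math

# Segment definitions from the text: proximal joint -> distal joint.
SEGMENTS = {
    "upper_arm": ("shoulder", "elbow"),
    "forearm": ("elbow", "wrist"),
    "thigh": ("hip", "knee"),
    "leg": ("knee", "ankle"),
}


def segment_lengths_cm(joints_mm):
    """Euclidean distance between paired joints, converted from mm to cm."""
    return {
        name: math.dist(joints_mm[proximal], joints_mm[distal]) / 10.0
        for name, (proximal, distal) in SEGMENTS.items()
    }


# Hypothetical one-sided joint coordinates (x, y, z) in mm, for illustration only.
joints = {
    "shoulder": (200.0, -400.0, 1700.0),
    "elbow": (200.0, -100.0, 1700.0),
    "wrist": (350.0, 100.0, 1700.0),
    "hip": (100.0, 100.0, 1700.0),
    "knee": (100.0, 500.0, 1700.0),
    "ankle": (100.0, 860.0, 1850.0),
}

lengths = segment_lengths_cm(joints)
for name, value in lengths.items():
    print(f"{name}: {value:.1f} cm")
```

Because the distance uses all three coordinates, a limb that is tilted toward or away from the sensor is not foreshortened, unlike a length read from a 2D projection.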

Statistical analysis
Statistical analyses were performed using MedCalc for Windows (MedCalc Software, Ostend, Belgium). The ICC for absolute agreement was used to evaluate intra-rater agreement among the initial 15 participants. The correlation coefficient (R) and R-squared (R²) were used to represent the correlation between the Kinect and DXA measurement methods. The ICC for absolute agreement, the concordance correlation coefficient (CCC), and Bland-Altman plots were used to assess agreement between the two tools.
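To make the distinction between correlation and agreement concrete, the sketch below computes Pearson's R, Lin's CCC, and the Bland-Altman bias with 95% limits of agreement using only the Python standard library. It is not the MedCalc implementation: the CCC here uses population (1/n) moments as in Lin's original definition, differences are taken as DXA minus Kinect in cm (an assumption), and the paired values are invented.

```python
import math
import statistics as st


def pearson_r(x, y):
    """Pearson correlation coefficient between paired samples."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient using population (1/n) moments."""
    n = len(x)
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (vx + vy + (mx - my) ** 2)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = st.fmean(diffs)
    sd = st.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented paired lengths in cm: the second method reads exactly 1 cm longer.
dxa = [30.0, 32.0, 34.0, 36.0, 38.0]
kinect = [31.0, 33.0, 35.0, 37.0, 39.0]

r = pearson_r(dxa, kinect)
ccc = lin_ccc(dxa, kinect)
bias, lower, upper = bland_altman(dxa, kinect)
print(f"R = {r:.3f}, CCC = {ccc:.3f}, bias = {bias:.2f} cm")
```

In this toy case the two series are perfectly correlated, yet the constant 1 cm offset lowers the CCC and appears directly as the Bland-Altman bias, which is why correlation alone is not sufficient to establish agreement.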

Results
This study enrolled 22 children (11 boys and 11 girls) and 54 adults (25 males and 29 females) as participants. The average age of children was 13.18 years, while adults had an average age of 35.12 years. Participant characteristics are detailed in Table 1. Two children were found to be obese, with BMI z-scores over 2, while seventeen adults were classified as obese because their BMI exceeded 25 kg/m². The average ratios of arm length to height, forearm length to height, thigh length to height, and leg length to height were consistent at 0.16, 0.14, 0.23, and 0.22, respectively, across both children and adults, regardless of whether DXA or Azure Kinect was used.
In terms of the correlation among all eight limb segments measured by DXA, the correlation coefficient was higher in children (R = 0.950-0.997, Table 2) than in adults (R = 0.783-0.982, Table 3). When Azure Kinect was used for evaluation, it showed a perfect correlation among all limb segments (R = 1, Table 4). This suggests that the Azure Kinect Body Tracking SDK assumes fixed limb proportions beforehand.
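The link between R = 1 and fixed proportions can be illustrated numerically: if every segment is a constant fraction of a single per-person scale, any two segments are exactly proportional and therefore perfectly correlated. The sketch below is only one way such a result could arise and makes no claim about the SDK's internals; the heights and the "varied" thigh lengths are invented, while the ratios 0.16 and 0.23 are the average arm and thigh ratios reported above.

```python
import math
import statistics as st


def pearson_r(x, y):
    """Pearson correlation coefficient between paired samples."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical body heights in cm.
heights = [120.0, 150.0, 165.0, 180.0]

# Fixed-proportion model: each segment is a constant fraction of height.
arm_fixed = [0.16 * h for h in heights]
thigh_fixed = [0.23 * h for h in heights]

# Invented thigh lengths with individual variation around the same ratio.
thigh_varied = [28.5, 33.0, 39.5, 40.5]

r_fixed = pearson_r(arm_fixed, thigh_fixed)
r_varied = pearson_r(arm_fixed, thigh_varied)
print(f"fixed proportions: R = {r_fixed:.3f}")
print(f"individual variation: R = {r_varied:.3f}")
```

Any departure from a single shared scale factor, as in real anatomy, pulls the correlation below 1, which is consistent with the DXA correlations being high but not perfect.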
Table 5 presents the results of the correlation and agreement analysis for the eight major limb lengths when comparing Azure Kinect with DXA in both adults and children. Overall, measurements in children showed stronger correlation and agreement between the two methods. A subgroup analysis of obese and non-obese adults showed similar results (Supplementary Table 2). In the correlation analysis, the two methods showed a very strong correlation (R = 0.850-0.918) and excellent linearity (R² = 0.723-0.843) in the adult group. In the child group, measurements by the two tools showed a very strong correlation (R = 0.944-0.985) and almost perfect linearity in linear regression (R² = 0.927-0.971), except for the right forearm, which demonstrated an excellent fit (R² = 0.891).
In the agreement analysis, limb length measurements obtained through the two methods exhibited good agreement (ICC = 0.829-0.896) in adults. However, for the right leg in adults, the agreement was slightly lower but still considered moderate (ICC = 0.712). In contrast, measurements taken from children using the two methods demonstrated excellent agreement (ICC = 0.937-0.977). The concordance correlation coefficient (CCC) was used to evaluate the precision and accuracy of measurements obtained from the DXA and Azure Kinect methods. In the adult group, there was poor agreement between the measurements by the two methods (CCC = 0.708-0.894). Conversely, in the child group, the measurements showed moderate to substantial agreement (CCC = 0.934-0.976).
Bland-Altman analysis was used to test the difference between DXA and Kinect measurements. It showed that the measurement bias between the two methods for each major limb segment ranged from −0.589 to 0.536 cm in children and from −1.455 to 0.420 cm in adults.

Discussion
This research explores the application of the Azure Kinect for estimating limb length in healthy subjects, using DXA, a low-dose X-ray method, as the reference. The length of a limb can be measured along the mechanical, anatomic, or kinematic axis 13. In this study, limb lengths by DXA were estimated along the mechanical axis because it is more compatible with the Azure Kinect method. This study showed a reliable correlation and agreement in limb length estimates between the two methods across both children and adults. These findings indicate the potential of the Azure Kinect as a valuable tool for anthropometric assessments. However, the Azure Kinect assumes that all limb lengths have the same fixed proportion for each person. This feature might not be suitable for certain studies, such as those that aim to assess limb discrepancies or anatomical differences between people. In clinical settings, the DXA method is commonly used to estimate bone mineral density and body composition. This study is distinctive in using DXA, rather than conventional X-ray, as a radiographic anthropometric technique. DXA scans and conventional X-ray methods both create images by projecting the 3D structure onto a 2D film, which can introduce measurement errors in body segment lengths. However, the DXA scanner exhibits less magnification and distortion error along the body's vertical axis (from head to toe) than conventional X-ray methods 12,15. This is because a DXA scan uses a movable C-arm gantry containing an X-ray tube and a linear detector array, which moves in a line-by-line pattern along the vertical axis of the body. The DXA method also has the benefit of a low radiation dose, comparable to daily background radiation. This results in images that may have less detail but are still adequate for observing bones and joints, making it a more suitable option for healthy subjects. It is worth noting that the whole-body image in a DXA report is not at actual size, so the image needs to be calibrated against a known dimension before measurements are taken 11,12,16.
In the human body, there is minor variation in body proportions among individuals due to age, genetic factors, nutrition, disease, and other factors 17-19. However, we observed that limb length data obtained from Azure Kinect showed identical limb length proportions for every subject. A potential explanation is a pre-existing assumption built into the Azure Kinect model. Previous studies 20-22 have employed a priori constraints on limb length proportions in 3D human pose estimation. This approach generalizes well and avoids estimation errors. Hence, caution should be taken when using the Kinect sensor to measure limb length discrepancies and body proportions.
In this study, the DXA method serves as the reference method for Azure Kinect. Thus, it is important to recognize the inherent differences in imaging technology between DXA and Kinect. The DXA and Azure Kinect measurements were obtained with participants in different positions: the DXA scan was conducted with participants in a supine position, whereas the Azure Kinect imaging involved participants in a standing position. Moreover, DXA images constitute a 2D radiographic measurement, whereas Kinect employs a 3D measurement approach. As participants lay on the examination table during a DXA scan, their limbs were aligned parallel to the table, reducing perspective errors. In this study, participants were positioned on the DXA table with their joints fully extended to further prevent errors caused by rotational displacement of the limbs. Despite the inherent differences between the DXA and Azure Kinect methods, the limb length estimates were compatible, indicating that Azure Kinect could serve as an alternative tool for limb length measurement.
In this study, children and adults had similar limb-to-height ratios, whether measured by DXA or Azure Kinect. In general, the lengths of the arm, forearm, thigh, and leg constituted 16%, 14%, 23%, and 22% of body height, respectively, across both age groups. These findings regarding body proportions align with earlier research using clinical anthropometric measurement 23,24. This study also found that body proportions varied less in children than in adults. This might be because lifestyle differences are more apparent in adults. However, no relevant references were found.
While Azure Kinect shows promise in measuring limb length, it also has limitations. Firstly, external factors such as environmental lighting and object color can cause noise in the Azure Kinect readings 25. To minimize the noise, we conducted our study in a room without windows, ensuring consistent artificial lighting. Participants were also instructed to wear light-colored clothing during the examination. Secondly, the viewing angle and position of the object might affect the measurement errors of the Kinect sensor 26,27. To reduce this error, the Azure Kinect was placed directly in front of the participant and as close as possible to the subject. Thirdly, a previous study suggested that obesity might affect the accuracy of 3D mesh reconstruction 5. However, our findings indicated that obesity does not appear to restrict the Kinect sensor's ability to track body joint locations. Finally, the Kinect software's built-in assumptions and algorithms for skeletal tracking may restrict the accurate placement of joint locations, leading to measurement errors 28. As we lack knowledge about these assumptions and algorithms, we cannot prevent these errors.
In conclusion, Azure Kinect, equipped with its integrated software, can automatically capture joint coordinates without the need for body markers. Our research shows that Azure Kinect yields outcomes similar to DXA, indicating its potential utility as a tool for ergonomic studies. However, the built-in algorithm of Azure Kinect assumes fixed limb proportions for individuals, which may not be ideal for studies focused on limb discrepancies or anatomical differences.

Figure 1. Experimental setup for Azure Kinect placement. (a) The Azure Kinect was placed on a tripod 1.1 m high, 1.7 m from the subject. If the subject's height exceeded the sensor's field of view, they stood 1.9 m in front of it. (b) Participant skeletal tracking illustration. The image depicts a participant with skeletal tracking overlaid, showing the spatial tracking of 32 body points.

Figure 2. Image processing workflow for the dual-energy X-ray absorptiometry image. A whole-body DXA scan is resized and rescaled to its actual dimensions of 195.5 cm in length and 65.5 cm in width using Image J. Subsequently, limb length measurements were obtained from the processed image.

Table 5. Correlation and agreement between DXA and Kinect measurements. Bias in the Bland-Altman plot is calculated as (DXA − Kinect)/mean.