Real-time inverse kinematics for the upper limb: a model-based algorithm using segment orientations

Model based analysis of human upper limb movements has key importance in understanding the motor control processes of our nervous system. Various simulation software packages have been developed over the years to perform model based analysis. These packages provide computationally intensive—and therefore off-line—solutions to calculate the anatomical joint angles from motion captured raw measurement data (also referred as inverse kinematics). In addition, recent developments in inertial motion sensing technology show that it may replace large, immobile and expensive optical systems with small, mobile and cheaper solutions in cases when a laboratory-free measurement setup is needed. The objective of the presented work is to extend the workflow of measurement and analysis of human arm movements with an algorithm that allows accurate and real-time estimation of anatomical joint angles for a widely used OpenSim upper limb kinematic model when inertial sensors are used for movement recording. The internal structure of the selected upper limb model is analyzed and used as the underlying platform for the development of the proposed algorithm. Based on this structure, a prototype marker set is constructed that facilitates the reconstruction of model-based joint angles using orientation data directly available from inertial measurement systems. The mathematical formulation of the reconstruction algorithm is presented along with the validation of the algorithm on various platforms, including embedded environments. Execution performance tables of the proposed algorithm show significant improvement on all tested platforms. Compared to OpenSim’s Inverse Kinematics tool 50–15,000x speedup is achieved while maintaining numerical accuracy. The proposed algorithm is capable of real-time reconstruction of standardized anatomical joint angles even in embedded environments, establishing a new way for complex applications to take advantage of accurate and fast model-based inverse kinematics calculations.

model includes 15 degrees of freedom and 50 muscle compartments and in addition to movement kinematics it enables the evaluation of muscle-tendon lengths, moment arms, muscle forces and joint moments in an anatomically reasonable setup (also conforming to the ISB recommendation [27] as those presented in [23] and [24]). Our motivation of choosing this approach instead of using direct joint kinematics estimates from inertial sensor data lies in the strong belief that the utilization of this upper limb model leads to better understanding of the additional inherent properties of arm movements (e.g. muscle activation patterns) compared to a pure kinematic investigation. By using the model from [26], two main differences from the direct approach should be considered: (1) because the model was developed mainly for optical motion capture technology, joint angle reconstruction is less coupled with raw measurement data in the sense that it is performed as an inverse task based on locations of arbitrarily placed virtual markers (instead of relying on sensor orientations directly) and (2) as a consequence of the inverse kinematics approach, the model raises more computational demand for joint angle reconstruction. While the first point contributes to a possibly wider adoption of the model (i.e. measurement data of any reasonable source-being it optical, ultrasound based, inertial, etc.-only needs to be expressed in terms of virtual marker locations), the second can be considered as a drawback in certain situations when real-time joint angle reconstruction would be beneficial.
To analyze model-defined parameters of recorded movements, various software packages have been developed for biomechanical analysis over the past decades. Some of them are tightly integrated into their corresponding measurement system's ecosystem with only kinematic reconstruction capabilities [28][29][30] while others are independent of a particular hardware setup (SIMM [31] and OpenSim [32]), even including analysis options for muscle activations and movement dynamics if the corresponding models are available. While these tools may address the joint angle reconstruction problem involving inverse kinematics in different ways resulting in varying performance, OpenSim [32] was chosen as the reference model-based analysis software for the purposes of the current study, because (as to the best of the authors' knowledge) it is the only free and open source (thus widely accessible) option in the biomechanical modeling field.
Because the upper limb model used in this study was originally developed for the SIMM software, an other reason for using OpenSim was it's ability to import and handle SIMM models natively, so combined with the model it provides a complete package for open source arm movement analysis. For kinematic evaluation, OpenSim uses the "standard" offline measurement-scaling-inverse kinematics pipeline where the actual biomechanical model (single limb to full body) is fitted to measurement data. During this process, positions of virtual markers placed on specific model segments are fitted to experimentally recorded marker positions of the subject with the same arrangement. Scaling is important to generate subject-specific model instances while inverse kinematics (IK) is performed to extract model-defined anatomical joint angles that produced the movement. OpenSim uses a text based structured XML model format that contains all information needed for the biomechanical description of the human body [bodies, kinematic constraints and forces (i.e. muscles)] that are accessible through API calls, too.
Complex measurement and analysis of upper limb movements including kinematics and muscle activities is an exciting and growing subfield of human movement analysis [33][34][35][36][37] that promises better understanding of control patterns during specific movements, and as an example benefit may-on the longer term-advance control techniques currently applied to arm and hand prostheses. This process however needs tighter integration of kinematic measurement and reconstruction (from raw data to anatomical joint angles) as the time and computational overhead of the offline measurement-scaling-inverse kinematics scheme gives a bottleneck in applications where real-time analysis of the control patterns with respect to the actual kinematics would be beneficial.
As a specific example, let us consider a supervised learning based muscle activity classification framework like the one proposed in [38] that is built around the increasingly popular concept of deep neural networks [39]. The goal of this setup is to assess the classification capability of various network architectures in estimating upper limb movement kinematics based only on muscle activity recordings. The large amount of labeled data that is needed to train these networks (even iteratively after initial deployment) should be presented by a method that performs muscle activity data labeling based on the actual (anatomically relevant) kinematic state of the limb automatically as the measurements occur. By using the approach proposed in this paper, labeled measurement data could be collected in real-time and simultaneously from multiple recording devices and a high level of automation might be reached in the process of measurement, network training and deployment.

Purpose of the study
The main goal of this study is to extend the measurement and analysis workflow of human arm movements with a method that allows accurate and real-time calculation of anatomical joint angles for a widely used upper limb model in cases when inertial sensors are used for movement recording. For this purpose a custom kinematic algorithm is introduced that utilizes orientation information of arm segments to perform joint angle reconstruction. Accuracy and execution times of the proposed algorithm are validated against the most widely available biomechanic simulation software's inverse kinematics algorithm on various platforms.

Upper limb model
To analyze arm kinematics with OpenSim the most complete model available was chosen known as the Stanford VA Upper Limb Model [26]. It is freely available as part of the Simtk project [40] in SIMM model format [41] that can be imported directly into OpenSim. The model is based on experimental data, includes 15 degrees of freedom and 50 muscle compartments and enables the evaluation of kinematics, muscle-tendon lengths, moment arms, muscle forces and joint moments in an anatomically reasonable setup. After importing, the structure of the model follows OpenSim's convention including bodies connected with joints, rotational and translational kinematic constraints and forces defining muscle paths and attributes (for details, see Additional file 1). The 15 degrees of freedom define the kinematics of the shoulder (3), elbow (2), wrist (2), index finger (4) and thumb (4). As the current work focuses on kinematics of the shoulder, elbow and wrist joints only, any muscles and kinematics of the index finger and the thumb will not be taken into account further in this study. The seven degrees of freedom that define the kinematic state of the whole arm excluding the fingers are elevation plane, thoracohumeral (elevation) angle and axial rotation for the shoulder, elbow flexion and forearm rotation for the elbow and deviation and flexion for the wrist.
The model represents the upper limb as a linked kinematic chain of bodies, each having a parent body, a location in the parent's frame and a joint describing the possible relative motion of the child with respect to the parent. The three-dimensional posture of the arm is generated by consecutive rotations of bodies determined by the actual angle values (joint coordinates) in proximal to distal order. As the movement of the shoulder girdle (clavicle, scapula and humerus) is complex and cannot be measured directly in most cases, the model implements regression equations that vary only with the thoracohumeral angle to determine the position of the shoulder joint with respect to the thorax.

Orientation from joint coordinates
The reference orientation of the model (all joint coordinate values equal 0°) occurs when all of the following conditions are true [26] (for a visual reference, see Fig. 1a): 1. The shaft of the humerus is parallel to the vertical axis of the thorax. 2. In case of shoulder elevation, the humerus moves in the plane of shoulder abduction. 3. In case of elbow flexion, the forearm moves in the sagittal plane. 4. The hand is in the sagittal plane. 5. The third metacarpal bone in aligned with the long axis of the forearm.
During the calculation of arm orientation determined by the actual joint coordinates the position of the shoulder joint is calculated first from the thoracohumeral angle. This is followed by four consecutive rotations in the shoulder joint in the order of elevation plane, elevation angle, elevation plane and axial rotation, where the rotation axes of elevation plane and axial rotation overlap in the reference arm orientation. Elbow flexion occurs in the humeroulnar joint while forearm rotation takes place in the radioulnar joint. Motion of the wrist is distributed among the proximal and distal rows of carpal bones by having two rotations for each row (four in total) both depending on flexion and deviation values.

Markers, scaling and inverse kinematics
To evaluate subject motion with OpenSim, model parameters have to be adjusted to experimental data. As of OpenSim's latest version at the time of writing (version 3.3), this can be achieved by using marker based motion capture data and virtual markers located on the model at approximately the same places as the experimental markers are located on the subject. This setup allows automatic subject specific scaling of the model [42] and calculation of anatomical joint coordinates (inverse kinematics) during the measured movement using weighted least squares optimization [43]. In the inverse kinematics tool, individual marker weights can be user specified and the least squares problem is solved with a quadratic programming solver (convergence criterion: 0.0001, iteration limit: 1000). As the efficiency of both scaling and inverse kinematics is highly dependent on the accuracy of virtual marker locations, marker placement is usually an iterative process until the best fit to experimental data is found.

Prototype markers
To enable the utilization of the upper limb model with inertial measurements, a prototype marker set was defined (see Additional file 2). For this purpose, orthonormal bases were formed for each anatomical joint (shoulder, elbow and wrist) and markers were placed at specific locations in these bases to reflect the actual compound rotations among the respective degrees of freedom (for the corresponding mathematical definitions, see Appendix 1.

Orthonormal bases
Shoulder The three independent model axes for the shoulder (defined in model bodies humphant, humphant1 and humerus that have the same position), collectively denoted as B sh_orig , were good candidates to form a basis because they are unit length vectors (like all axes in the model) and almost orthogonal to each other (pairwise deviations from right angle are 0.00064°, 0.0029° and 0.0002°). For proper operation of the proposed algorithm however, these axes were orthogonalized using QR decomposition (see Appendix 2) to prevent error accumulation during the calculations. This resulted in the orthonormal basis B sh_orth . As a result, rotations in the shoulder can be expressed as elemental rotations of B sh_orth with acceptable angle errors due to the pairwise deviations between the original and new basis vectors after orthogonalization (0.000019°, 0.000655° and 0.002925°, respectively).
Elbow As relative orientation of the two rotation axes in the elbow is not close enough to orthogonal and the axes are defined in different parent bodies (r el_flex → ulna and r pro_sup → radius), the orthonormal basis B pro_sup and the rotation matrix R B pro_sup r el_flex (θ el_flex ) were formed to properly express the compound rotation as the product of an axis-angle and an elementary rotation about the main axis in B pro_sup . Again, some angle errors are expected while calculating the elbow flexion angle in this basis because r el_flex is threated as it would belong to the radius body of the model.
Wrist Rotations in the wrist are the most complex among the three anatomical joints. Effects of the two active joint coordinates (deviation and flexion) are distributed among two model bodies (lunate and capitate), each having two nonorthogonal rotations (r dev , r flex → lunate and r pdr1 , r pdr3 → capitate) depending on both joint coordinates. To deal with the complexity of this structure, the orthonormal basis B pdr3 and rotation axes r B pdr3 dev , r B pdr3 flex and r B pdr3 pdr1 were constructed to prepare the calculation of θ dev and θ flex . Using this approach, r dev and r flex are threated as if they would belong to the capitate body of the model.

Marker placement
In order to add virtual markers to any OpenSim model, the parent body and the location within the parent's frame have to be defined for each marker. Having the orthonormal bases from the previous section (B sh_orth , B pro_sup and B pdr3 ), 12 prototype markers were placed on the model as follows (for reference, see where PM refers to prototype marker, x is the serial number of the basis in which the marker is located (1-3), [SH|EL|WR] refers to the anatomical joint in which the marker is located and [O|X|Y|Z] refers to the marker's location within the actual basis (origin or any of the axes). For example the name of the wrist's origin marker is

PM3_WR_O.
Because all markers follow their parent bodies' orientation during analyzed movements, coordinates of the difference vectors between the origin markers and the same basis' axis markers reflect the compound rotation matrix in each anatomical joint (in the corresponding basis) at any time instant. To utilize this feature it is crucial that the structure of each joint's marker subset remains consistent during measurements (by keeping the formation of an orthonormal basis), because any deviation in relative marker positions renders the derived compound rotation matrix inaccurate. As a consequence, it is recommended to use arm segment orientations to calculate the actual positions of prototype markers instead of measuring them directly. This can be achieved when using optical motion capture devices as segment orientations can be reconstructed with most systems by having at least three markers on each segment, however this is still an offline process. More importantly, utilizing orientation information makes the application of inertial sensors possible and beneficial in this setup as they are used to determine orientation directly. As an additional benefit, the offset-independent nature of orientation information enables subject-independent joint angle reconstruction, rendering the scaling step of the standard inverse kinematics approach unnecessary in the process. Using this feature a real-time inverse kinematics algorithm is introduced in the next section that provides joint coordinate outputs coherent with OpenSim's inverse kinematics tool.

Algorithm description
The key point in accelerating the selected upper limb model's inverse kinematics calculation is the model specific determination of prototype marker locations. By constructing representative orthonormal bases in each anatomical joint of interest (B sh_orth in the shoulder, B pro_sup in the elbow and B pdr3 in the wrist) joint specific rotations can be addressed as elementary or axis-angle rotations in the corresponding bases. Having prototype markers in locations that reflect the actual orientations of these bases gives the possibility to express joint coordinates (rotation angles) in an efficient way, even in closed algebraic form in the shoulder and the elbow. As there was no closed form solution found to calculate angles in the wrist, a numerical algorithm is given for this part of the problem. MATLAB R2015b (Mathworks, Natick, MA, USA) was used for algorithm prototyping and development.

Shoulder
Because shoulder prototype markers are placed on the model in a way that they show the actual orientations of the main axes of B sh_orth , an experimental (numerical) compound rotation matrix can be constructed from their spatial coordinates as shown in (1), where each marker position should be considered as a row vector.
By utilizing the kinematic structure of the shoulder joint (and keeping the assumption that R shoulder = R shoulder as detailed in Appendix 3), estimations of rotation angle values can be calculated as follows: Although the formulations in (2a) and (2c) could be susceptible to modulo π and sign errors in general, the allowed angle ranges defined in the model (θ sh_elv : 0 • → 180 • , θ sh_rot : −90 • → 20 • ) keep these equations safe to use until the experimental data does not force the model outside of these ranges.

Elbow
Similarly to the shoulder, the experimental compound rotation matrix can be constructed from the actual spatial positions of the elbow's prototype markers. Because the model implements rotations in an incremental way, a reverse rotation of the extracted frame have to be performed in the shoulder's basis to get the correct experimental rotation matrix for the elbow as shown in (3).
Having R elbow = R elbow estimations of joint angle values in the elbow can be calculated as (for further details, see Appendix 5): where r As in the case of the shoulder, (4a) and (4b) should be used with care because of modulo π and sign errors, but again having sufficient joint angle limits in the model (θ el_flex : 0 • → 130 • , θ pro_sup : −90 • → 90 • ) application of these formulas is safe until experimental data does not force the model outside of these ranges.

Wrist
The experimental compound rotation matrix for the wrist can be constructed from the actual spatial positions of its prototype markers. Because of incremental rotations in the model, reverse rotations of the extracted frame have to be performed in the shoulder's and elbow's bases to get the correct experimental rotation matrix as shown in (5).
Although there is no closed form solution to calculate joint angle rotations in the wrist, the flexion angle can be determined as the solution of the following root finding problem (further details and definitions of a, b, c, x, y and z can be found in Appendix 7): Based on this definition, the following properties hold for F (θ flex , σ ): 1. (−θ flex + η) defines a baseline with constant negative slope for the two possible solutions F (θ flex , −1) and F (θ flex , 1). 2. Because of the definition of the atan2(y,x) function, the value of atan2 ξ − c 2 , c will always be positive if ξ − c 2 is real (i.e. c 2 ≤ ξ). This implies that the two solutions to F (θ flex , σ ) do not cross the baseline but remain "below" (σ = −1) and "above" (σ = 1) of it for all values of θ flex .
As c depends on the actual compound rotation matrix R wrist , its value is influenced by both θ dev_c and θ flex_c . As a consequence, there may be wrist configurations where c 2 > ξ for some regions of θ flex , driving F (θ flex , σ ) into a singular state within these regions. To prevent problems arising from this situation during the root finding process, singularity border points for θ flex can be determined as follows. Let (7) as defined in (28), only θ flex changed to ϑ to denote specific singularity border points.
Considering (6), singularity borders occur at locations where c 2 = ξ, resulting in c 1,2 = ± √ ξ . Using these equalities and Euler's formula, c can be rewritten into an exponential form that can be solved for ϑ resulting in the formulas shown below.
As a result, four separate complex-valued singularity border points can be determined for all wrist configurations. To get a better understanding of the structure of F (θ flex , σ ) , the function was visually inspected with an interactive MATLAB script developed for this purpose. The tool allows the simulation of different user-defined wrist configurations through separate sliders for θ dev_c and θ flex_c while plotting all relevant information about the problem (a representative screenshot is shown in Fig. 2). Based on manual testing throughout the model-defined ranges for θ dev_c and θ flex_c , the following observations were made: 1. The condition in (23)    is sufficiently small but not zero), two separate roots may appear, but only one being valid. 8. F (θ flex , σ ) will have a valid root at θ flex = µ if and only if σ = sign(µ − η).
Based on these observations, (6) can be solved with the following algorithm: Algorithm 1: Numerical algorithm to calculate θ flex Data: x, y, z, η and ξ from (A.7.7) and (6) Although the computational demand of wrist angle calculations is higher than of the shoulder and the elbow, the algorithm has still higher overall time efficiency than the optimization approach used by OpenSim's inverse kinematics tool, as it is shown in the "Results" section.

Algorithm validation
Testing and validation of the described algorithm was automated using OpenSim with its Python API and MATLAB. To make direct comparison possible between OpenSim's optimization method and the proposed algorithm, eight additional virtual markers were placed on the model at locations that are suitable for optical motion capture (e.g. using Vicon) simulating an environment where OpenSim is generally applied. The virtual marker locations are the following (for visual reference, see Fig. 1d): • VM1_TH : Thorax marker at the upper end of the sternum.
• VM2_AC : Acromio-clavicular joint of the shoulder girdle. The structure of the upper limb model (including marker positions) was extracted using OpenSim's Python API and saved into a .mat file for further processing with MATLAB. A forward kinematics function (functionally equivalent to OpenSim's implementation) was developed in MATLAB to calculate body and marker positions for specific joint coordinate vectors of [θ elv , θ sh_elv , θ sh_rot , θ el_flex , θ pro_sup , θ dev_c , θ flex_c ] in the model, enabling the analysis of trajectories for both PMx and VMx markers from artificially generated movement patterns (Fig. 1b-d).

Simulated movement patterns
To avoid possible problems accompanying experimental measurements, simulated movement patterns were generated to test the performance and validity of the proposed algorithm. 100 separate pseudo-random (random seed = 10) joint coordinate trajectories were constructed in MATLAB having a duration of 5 s and a sampling frequency of 100 Hz. The trajectories were generated as 5th order Bézier-curves as shown in (11) using six uniformly distributed control points (0, 20,...,100%) with randomly chosen values for each joint coordinate from their valid intervals defined in the model. A representative movement pattern is shown in Fig. 3.
Following this step, forward kinematics was performed for each of the simulated patterns to calculate PMx and VMx marker trajectories yielding simulated "measurement" data as it would have been recorded during a real movement. The resulted trajectories were then used as inputs to inverse kinematics calculations with OpenSim (VMx) and (11) 4 (1 − t)P 4 + t 5 P 5 where t ∈ [0, 1] and P i , i ∈ {0, . . . , 5} are the control points. our algorithm (PMx) while the corresponding movement patterns served as reference for the outputs of each of the tested methods.

Inverse kinematics with OpenSim
To speed up the validation process, OpenSim (v3.3) was compiled from source on a Supermicro server having two Intel ® Xeon ® E5-2695 v3 CPUs (with a total of 56 execution threads) and 64 GB RAM, running Ubuntu Server 14.04.2 LTS operating system. Although the inverse kinematics (IK) algorithm in the used OpenSim version do not utilize multicore architectures natively, each IK task can be divided into separate subtasks that can run in parallel thanks to the applied optimization method (there is no data dependency between time frames). To utilize this property, a pipeline was developed using MATLAB and Bash to prepare VMx marker data and the required files for OpenSim and manage file transfers, multi-threaded IK execution, results collection and evaluation. One important step before performing the IK calculation in OpenSim is subject-specific scaling of the used model and relative weighting of the markers. As only simulated data were used in the current study on an unmodified upper limb model, the scaled model file was identical to the original file during IK execution, while all marker weights were equal.

Algorithm implementation
The prototype of the proposed algorithm was implemented in MATLAB and tested with the simulated PMx marker trajectories. Calculation of (6) was performed using Fig. 3 Representative simulated movement pattern used for algorithm validation. Simulated movement patterns were generated to validate the proposed kinematic algorithm. 100 separate pseudo-random joint coordinate trajectories were constructed as 5th order Bézier-curves having 5 s duration and 100 Hz sampling frequency. PMx and VMx marker trajectories were calculated with our forward kinematics MATLAB function to generate simulated "measurement" data for the proposed algorithm and OpenSim MATLAB's built-in fzero() function. Based on the MATLAB version, the algorithm was implemented in ANSI C to target practical applications. In this case (6) was solved with Brent's root finding algorithm from [44]. Furthermore, compilation options were included to assess the effects of different data precisions (float or double) on the accuracy and execution time of the algorithm. This was not an option with MATLAB because fzero() cannot be used with float input.
To address possible accuracy problems arising from the lower precision of float data, an additional test case with a simple output continuity check for wrist angles was included, namely when the absolute difference between two successive θ flex_c values is larger than 5°, the actual θ flex_c will be the previous θ flex_c + 0.5°. This modified version of the algorithm is denoted with mod. suffix among the results.

Evaluation platforms
MATLAB and C implementations of the proposed algorithm were tested on a system with an Intel ® Core ® i5-540M processor running Ubuntu Desktop 14.04.4 LTS. In addition, the C implementation was evaluated on the following microcontroller units (MCUs) that are capable of targeting resource constrained environments (e.g. wearable measurement devices) with high performance: For proper comparison, both MCUs were clocked at 168 MHz and the source codes differed only in device specific details (e.g. hardware initialization). Algorithm evaluation on the MCUs was controlled with MATLAB via a UART link including data preparation, transmission and storage.

Performance metrics
To evaluate the overall performance of the algorithm compared to OpenSim's IK method, accuracy and execution times were analyzed in all cases (OpenSim, MATLAB and C implementations). To assess accuracy, RMS values were computed for the differences between the calculated and simulated joint coordinate trajectories for each trial. Means and standard deviations of these RMS values were then calculated across trials for each platform and precision (where this was applicable). Running times of OpenSim's IK evaluation were calculated as a sum of subtask execution times from the IK log output directly. Algorithm execution times were measured by the tic and toc methods in MATLAB, the clock() function from the <time.h> library for the C implementation on PC and on-chip hardware timers clocked at 1 MHz for both MCUs.

Data exclusion from OpenSim trials
Although inverse kinematics in OpenSim was calculated using an unmodified and unscaled model in each trial, there were cases when large step errors occurred at seemingly random locations in the IK output (independently of subtask borders mentioned in "Inverse kinematics with OpenSim" section). This phenomenon may be handled by marker placement adjustment or error checking in measurement data in general. As IK input was strictly controlled by using simulated trajectories and the markers remained intact in the model between trials, further troubleshooting would have been needed to find a solution to this issue. Because the main emphasis of the study is the proposed algorithm and not OpenSim's internal workings and IK troubleshooting, all OpenSim trials were excluded from final accuracy assessment where any of the resulted joint coordinate RMS errors exceeded 5° to not bias the results with incorrect data. As a result, only 59 trials out of 100 were used to calculate the accuracy of OpenSim's IK algorithm. This however did not have any effect on the other measurements, so MATLAB and all C results were calculated across 100 trials. Table 1. The results show that considering the mean of all valid trials (59 for OpenSim, 100 for all others), all platforms performed reasonably well producing errors below 3° for all joint coordinates (for a trialwise visual comparison between OpenSim's IK method and the proposed algorithm, see Additional file 3).

RMS errors from algorithm evaluation are shown in
Regarding OpenSim it can be seen that errors for each joint coordinate are larger than those provided by our algorithm. The reason for this can lie in the optimization approach of OpenSim that in fact contains hard-coded convergence (0.0001) and iteration (1000) limits. However these limits prevent OpenSim's IK algorithm to match the simulated movement patterns perfectly, they provide a practical solution to the accuracy ↔ running time trade-off for the software's general usage.
MATLAB and C implementations of the proposed algorithm performed equally well for shoulder and elbow angles independent of the used data precision (double / float). This could occur because of the relatively low number of operations needed by these joint coordinates shown in equation groups () and () that prevented considerable error accumulation due to the lower precision of float. Regarding wrist angles a clear distinction can be made between double and float (MATLAB uses double as default). The two main reasons for this phenomenon are (1) the significantly larger computational demand of θ dev_c and θ flex_c involving iterative processes that can lead to precision error accumulations and (2) rounding error based mismatch in the root finding process involved in the calculation of θ flex in rare cases when two roots are present in (6). A deeper analysis among the trial-wise results revealed that the second reason was more significant as roughly 70% of the trials ended up in no more than 0.1° maximum error with float precision. The rest of the trials contained 1-5 "wrong" samples showing 15°-20° impulse-like errors while the remaining samples within the trial did not have this problem. Investigation of the erroneous samples revealed that indeed a wrong root for (6) was found in these cases. To deal with this issue, an output continuity checking step was implemented for float precision in cases denoted with the mod. suffix. This turned out to be a simple yet effective solution to the problem as the corresponding results show the disappearance of the impulse-like errors.

Table 1 RMS errors
Each row represents a separate test environment for the reference (OpenSim) and proposed inverse kinematics algorithm. The columns show mean Borbély and Szolgay BioMed Eng OnLine (2017) 16:21

Execution time
To assess overall performance, execution times were compared between OpenSim's IK method and our proposed algorithm on different platforms and are shown in Table 2.
Measurement results show that the optimization approach of OpenSim performed the calculation of a single iteration in 145 ms on average. Because of the application specific nature of the proposed algorithm, its running times considering different implementations (MATLAB/C), data precisions (double/float) and platforms (PC/ARM Cortex-M) all showed a significant increase in execution performance compared to OpenSim, the worst result being about 5 ms on average for a single iteration.
As expected, the C implementation is more than two orders of magnitude faster than the MATLAB version on the PC, yielding execution times per iteration about 10 μs with all precision variants (double, float and float mod.). Opposed to this, running times on embedded platforms showed more scattered results. The difference between double and float is more expressed in these cases while application of the FPU accelerates float computations even further (hard float entries in Table 2). Regarding the modified algorithm variant it can be seen that even the extra continuity check adds some amount to the execution time per iteration, the possibility to use float precision brings more speed advantage, especially with the FPU enabled. These findings are true for both tested MCUs with the observation that ARM's M7 architecture is about twice as fast as M4 when running the presented algorithm with the same core clock.

Discussion
Evaluation results of the tested algorithms show that each approach provides proper accuracy for most common arm movement analysis scenarios. One important aspect however is that while OpenSim provides a useful general tool for biomechanical analysis including fields beyond inverse kinematics (e.g. inverse and forward dynamics), the calculation of joint angles from the actual experimental data is rather demanding computationally. As the output of this step gives the basis for all other analysis methods in the software, the amount of time needed for the overall processing pipeline highly depends on the efficiency of this algorithm. As Table 2 shows, the average amount of time needed for OpenSim's IK algorithm to perform a single iteration would allow about 7 Hz operation that falls behind the generally accepted practice in human movement recording of at least 50 Hz. This property excludes OpenSim from tight integration with systems requiring real-time movement kinematics, however that is not the software's original target application anyway (up to version 3.3 at least).
Considering the algorithm proposed in the study Tables 1 and 2 show a significant improvement in performance in both accuracy and execution time when compared to OpenSim's IK method. The main reason for this difference is the algorithm's application specific nature with the utilization of both the internal structure of the used upper limb model and inertial sensing of movement to determine limb segment orientations directly. As the MATLAB version showed proper accuracy and sufficiently short execution time on the PC, implementation of the algorithm in ANSI C was reasonable to assess its "real" performance without the overhead of a general prototyping tool that MATLAB essentially has. Because accuracy results are the same or very similar across specific variants of the C implementation (i.e. using double/float precision), only execution time differences are discussed later in the text.
Running times of the algorithm's C implementation showed more than four orders of magnitude speedup on the tested Intel ® Core ® i5-540M processor compared to Open-Sim's IK algorithm on a more recent and higher performance server CPU with Xeon ® architecture, yielding about 10 μs execution time per iteration for all variants. However this is an impressive improvement, running the algorithm on PC would still pose problems from practical aspects of possible applications (e.g. total size and mobility of the measurement system or communication overhead between the measuring and processing device), so the real benefit of this speed increase lies in the "spare" performance that opened the way to testing the algorithm in resource constrained embedded environments. Evaluation of the proposed method on high performance MCUs showed that all implementation variants that provided good accuracy (double and [soft/hard] float mod.) had acceptable execution times on both architectures (M4 and M7) for real-time operation, considering 100 Hz as sufficient sampling frequency for human movement analysis. Based on these results, the specific implementation variant should be chosen taking the overall design requirements of the actual practical application into account (i.e. wearable measurement devices like the one presented in [45]) as in most cases the algorithm should fit into a system containing other computationally demanding processes (like sensor fusion algorithms) with power consumption being a critical part of the design for example.
An other practical advantage of the described algorithm is that it enables subject-independent joint angle reconstruction during the measurements. This means that by taking advantage of the offset-independent nature of orientation sensing, no scaling is required for the proper calculation of inverse kinematics (opposed to OpenSim) as long as the IMUs are capable to produce good approximations of limb segment orientations.
It needs to be emphasized however that the application specific nature of the algorithm and its dependency on the used upper limb model induce some practical considerations, because having a method that works in a strictly controlled simulation environment does not guarantee its applicability in a real situation. A fundamental thing to consider is whether data provided by real sensors reflect arm segment orientations needed by the algorithm accurately. As this was a key requirement from the beginning of algorithm design, the prototype markers were defined in local bases of the joints that can be directly expressed in terms of sensor orientations (for details, see Additional files 1, 2). As an additional benefit, the definition of an anatomical calibration procedure-often needed when inertial sensors are used for human movement recordingcan be avoided as the proposed algorithm does not use segment length information for joint angle reconstruction. What cannot be avoided however is the accurate estimate of sensor orientations, as the whole process depends highly on the precision of this step. Although there is no ultimate solution to the problem of inertial sensor fusion yet, there are continuous algorithmic efforts to reach higher accuracy and reliability (for different highlighted approaches, see "Background" section). But even in cases when the sensors provide accurate orientation information of the measured limb, care must be taken when determining the limb's reference orientation based on the measurements. The reason for this is mainly inter-subject variability in the sense that even the model defines the reference posture clearly, it cannot be assumed that any actual subject will reproduce the same posture very accurately that can lead to offset errors during the measurement. Furthermore, the assumption was made during algorithm development that the measured movement always remains within the valid joint angle ranges defined in the model. As long as this assumption holds (as in the case of simulated movement patterns presented in this study), the algorithm should not have problems with proper joint angle reconstruction. However, if outliers are present in the experimental data (e.g. reference posture errors, inaccuracies in the measurement or the sensor fusion algorithm or extreme anatomical ranges of a subject) undefined output states can occur. This may be handled with a simple saturation technique on the algorithm level but rather should be prevented by applying proper experimental design and calibration methods. In a practical setup this involves proper sensor placement and various steps before the measurements including zero motion offset compensation, hard and soft iron error compensation in the magnetometer and determining relative sensor orientations with respect to the measured segments [46,47] for example.

Conclusions
With keeping the upper mentioned considerations in mind the proposed algorithm is capable for real-time reconstruction of standardized anatomical joint angles even in embedded environments, opening the way to complex applications requiring accurate and fast calculation of model-based movement kinematics. Although the presented algorithm is special to the selected upper limb model, the introduced approach by strategically placing the prototype markers can further be applied to other biomechanical models in the future. As a result, the proposed method brings the possibility to widen the application areas of OpenSim with complex models and making its overall analysis pipeline more efficient by accelerating the calculation of inverse kinematics and (12c) r 3 = r axis_id_0 × r 2 r axis_id_0 × r 2 (13) r B axis_id_0 axis_id = r axis_id B axis_id_0 (14) exp θ r = I 3 + sin (θ ) r + (1 − cos (θ )) r 2 where r : The orientation of the humerus is determined by four consecutive rotations in the shoulder in the order of elevation plane, elevation angle, -elevation plane and axial rotation degrees of freedom (for axis and angle notations, see Table 3). Based on axis definitions in the model, rotations about r elv and r sh_rot can be estimated with an elementary rotation about the second axis of B sh_orth while the estimation of the rotation about r sh_elv can be done with an elementary rotation about B sh_orth 's third axis as shown in ().
Using these definitions, the symbolic expression of the compound rotation matrix that represents the actual orientation in the shoulder can be calculated as shown in (20). For the convenience of calculation, this step was performed with MATLAB's Symbolic Math Toolbox using the script shown in Appendix 4.