Automated Implementation of the Edinburgh Visual Gait Score (EVGS) Using OpenPose and Handheld Smartphone Video

Recent advancements in computing and artificial intelligence (AI) make it possible to quantitatively evaluate human movement using digital video, thereby opening the possibility of more accessible gait analysis. The Edinburgh Visual Gait Score (EVGS) is an effective tool for observational gait analysis, but human scoring of videos can take over 20 min and requires experienced observers. This research developed an algorithmic implementation of the EVGS from handheld smartphone video to enable automatic scoring. Participant walking was video recorded at 60 Hz using a smartphone, and body keypoints were identified using the OpenPose BODY25 pose estimation model. An algorithm was developed to identify foot events and strides, and EVGS parameters were determined at relevant gait events. Stride detection was accurate within two to five frames. The level of agreement between the algorithmic and human reviewer EVGS results was strong for 14 of 17 parameters, and the algorithmic EVGS results were highly correlated (Pearson correlation coefficient r > 0.80) with the ground truth values for 8 of the 17 parameters. This approach could make gait analysis more accessible and cost-effective, particularly in areas without gait assessment expertise. These findings pave the way for future studies to explore the use of smartphone video and AI algorithms in remote gait analysis.


Introduction
Gait analysis is used to evaluate functional status and neurological health, particularly for those with mobility issues [1,2]. Instrumented gait analysis (IGA) is the gold standard that uses three-dimensional body pose data to diagnose gait abnormalities [3][4][5]. However, IGA is resource-intensive and not available in most clinical settings [4]. Observational gait analysis or visual gait analysis (VGA) is an alternative that provides a simple and easy-to-use procedure for clinicians to observe and characterize a patient's gait. In VGA, recorded video is evaluated using a variety of scales that concentrate on specific joints, planes, and events in the gait cycle. Despite its subjective nature, VGA is widely used to evaluate gait issues in children and adults, and several computer-based image analysis methods have been designed to support clinicians. However, VGA may result in low sensitivity, specificity, validity, and reliability when compared to IGA [6].
One of the oldest VGA scoring scales is the Physician Rating Scale (PRS), which was specifically developed for children with cerebral palsy (CP) [7]. Its development, reliability, and validity were not well documented when the scale was first published.

• Provides viable methods that use handheld smartphone video, thereby making this approach applicable to the remote video capture of patient movements by clinicians or parents/caregivers;
• Creates an automatic system that uses OpenPose keypoints to detect foot events and strides and automatically calculate the EVGS, potentially reducing clinician workload because the EVGS is calculated quickly and automatically, without manual calculations or human intervention;
• Saves substantial time and resources by eliminating the need for specialized equipment or extensive preparation for gait analysis.

The research presents a novel and time-efficient method for calculating EVGS results from smartphone video data, advancing the understanding of automatic gait analysis methods using the EVGS and providing a basis for future research in this area. Overall, this research provides a promising avenue for the integration of automatic EVGS analysis and scoring using smartphones, paving the way for remote visual gait analysis technology.

Materials and Methods
To achieve the automated EVGS analysis objective, the proposed system must be able to identify the appropriate video frames for analysis and then apply a series of rules to generate scores for each parameter. This involves two main tasks: creating a method for identifying the required gait events in a walking video and creating an algorithm for scoring the EVGS parameters. The proposed methodology sequentially implements pose estimation, view detection, direction of motion detection, gait event identification, stride allocation, and finally, algorithmic calculation of the EVGS.
A preliminary set of videos was needed to develop the algorithm. To obtain these videos, an able-bodied individual walking in both sagittal and coronal views was recorded at 60 Hz using a handheld smartphone camera.

Pose Estimation
Sensors 2023, 23, x FOR PEER REVIEW 3 of 35
For each video frame, the OpenPose BODY25 model provided joint coordinates as keypoints (Figure 1). Coordinate trajectories can be noisy and sometimes contain outliers; therefore, keypoint data were filtered using a zero-phase, dual-pass, second-order Butterworth filter with a 12 Hz cut-off frequency. A 2D (two-dimensional) keypoint processing strategy was adapted from earlier work on a markerless AI motion analysis method for hallways [32], where keypoints with scores below a 10% threshold were removed, and then cubic spline interpolation was used to fill gaps of five frames (0.083 s) or fewer.
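The described post-processing can be sketched as follows. This is a minimal illustration assuming each keypoint trajectory is a per-frame coordinate array with matching OpenPose confidence scores; function and variable names are ours, while the 10% confidence cut-off, five-frame gap limit, and 12 Hz zero-phase filter come from the text.

```python
import numpy as np
from scipy.signal import butter, filtfilt
from scipy.interpolate import CubicSpline

FPS = 60
CUTOFF_HZ = 12.0
CONF_THRESHOLD = 0.10   # keypoints below 10% confidence are discarded
MAX_GAP = 5             # fill gaps of <= 5 frames (0.083 s at 60 Hz)

def clean_trajectory(coords, scores):
    """Drop low-confidence samples, spline-fill short gaps, low-pass filter."""
    coords = np.asarray(coords, dtype=float).copy()
    coords[np.asarray(scores) < CONF_THRESHOLD] = np.nan

    valid = ~np.isnan(coords)
    idx = np.arange(len(coords))
    if valid.sum() < 2:
        return coords
    spline = CubicSpline(idx[valid], coords[valid])

    # Fill only interior gaps of MAX_GAP frames or fewer; longer gaps stay NaN.
    filled = coords.copy()
    gap_start = None
    for i in range(len(coords)):
        if not valid[i]:
            if gap_start is None:
                gap_start = i
        elif gap_start is not None:
            if i - gap_start <= MAX_GAP and gap_start > 0:
                filled[gap_start:i] = spline(idx[gap_start:i])
            gap_start = None

    # Zero-phase (dual-pass) second-order Butterworth low-pass at 12 Hz.
    b, a = butter(2, CUTOFF_HZ / (FPS / 2), btype="low")
    if not np.isnan(filled).any():
        filled = filtfilt(b, a, filled)
    return filled
```

A 1 Hz walking-frequency signal passes this 12 Hz filter essentially unchanged, while frame-to-frame detection jitter is attenuated.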


Gait Event and Stride Detection
The process involved determining if the video is in the sagittal or coronal plane; determining whether the motion is left to right, right to left, anterior to posterior, or posterior to anterior; detecting the four distinct gait events; and assigning values to a particular stride. The required events are foot strike, foot off, mid-midstance (midpoint of midstance) and mid-midswing (midpoint of midswing).

Detection of Sagittal/Coronal Views
Trunk length fluctuates very little when walking in the sagittal plane, staying within a narrow range. The algorithm considers a video to be sagittal when the absolute difference in trunk length (the distance between the midshoulder (KP1) and midhip (KP8) keypoints) between the first and last frames is less than a threshold. To identify the threshold, 37 videos of able-bodied people walking in both coronal and sagittal views were analyzed. A difference in trunk length of less than 99 pixels produced the best results at a video resolution of 1920 × 1080; therefore, 99 pixels was set as the threshold for classifying sagittal plane walking.
In the front (coronal) view, trunk length is maximum when the person is close to the camera and decreases from maximum when the person walks away from the camera.
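A minimal sketch of this trunk-length rule, assuming per-frame keypoints are stored as a mapping from BODY25 index to (x, y) pixel coordinates; function names are illustrative, and the 99-pixel threshold is the one derived in the text for 1920 × 1080 video.

```python
import math

SAGITTAL_THRESHOLD_PX = 99  # derived for 1920 x 1080 video

def trunk_length(frame):
    """Distance between midshoulder (KP1) and midhip (KP8) for one frame.
    `frame` maps BODY25 keypoint index -> (x, y) in pixels."""
    (x1, y1), (x8, y8) = frame[1], frame[8]
    return math.hypot(x8 - x1, y8 - y1)

def detect_view(first_frame, last_frame):
    """Classify as sagittal when trunk length barely changes, else coronal."""
    diff = abs(trunk_length(last_frame) - trunk_length(first_frame))
    return "sagittal" if diff < SAGITTAL_THRESHOLD_PX else "coronal"
```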

Direction of Motion Detection
Following the coronal or sagittal classification of the video, the next step determines front or rear view in the coronal case, and left-to-right or right-to-left motion in the sagittal case.
An algorithm was employed that replicated the sagittal/coronal view detection approach, monitoring trunk length and comparing the difference between the first and final frames. If the difference is negative, the participant is walking toward the camera (front view), and if the difference is positive, the person is moving away from the camera (rear view).
In the sagittal view, the algorithm tracks the nose (KP0) location on the x-axis and calculates the difference between nose positions in the first and final frames. X-coordinates are maximal at the right border and minimal at the left border. Using this concept, if the difference is negative, the person is walking from left to right. If the difference is positive, the person is walking from right to left. This method produced consistent results.
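Both direction rules can be sketched as below, assuming per-frame keypoints are stored as a mapping from BODY25 index to (x, y) pixel coordinates (KP0 = nose, KP1 = midshoulder, KP8 = midhip). The sign conventions follow the first-minus-last differences described in the text; names are ours.

```python
import math

def _trunk_len(frame):
    (x1, y1), (x8, y8) = frame[1], frame[8]
    return math.hypot(x8 - x1, y8 - y1)

def coronal_direction(first_frame, last_frame):
    """First-minus-last trunk length: negative => walking toward the camera."""
    diff = _trunk_len(first_frame) - _trunk_len(last_frame)
    return "toward_camera" if diff < 0 else "away_from_camera"

def sagittal_direction(first_frame, last_frame):
    """First-minus-last nose x (KP0): negative => walking left to right."""
    diff = first_frame[0][0] - last_frame[0][0]
    return "left_to_right" if diff < 0 else "right_to_left"
```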

Gait Event Identification
Foot strike and foot-off events were determined using the Zeni et al. [33] method. According to this approach, heel strike is when the sacral marker's forward distance from the heel marker is at its maximum, and toe off is when the toe marker is furthest posterior from the sacral marker. Since sacrum keypoints are not provided in OpenPose, the midhip keypoint (KP8) was used to replace the sacral marker location. Figure 2 provides an example of identified foot strikes and foot off using the heel positions (KP24 and KP21) and midhip positions (KP8).
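A sketch of this event detector in the style of Zeni et al., using peak finding on the horizontal heel-to-midhip distance; the peak-spacing parameter and array names are illustrative, not from the paper.

```python
import numpy as np
from scipy.signal import find_peaks

def detect_foot_events(heel_x, midhip_x, direction="left_to_right"):
    """Foot strikes at maxima of the forward heel-minus-midhip distance,
    foot offs at maxima of the backward distance (i.e., its minima)."""
    rel = np.asarray(heel_x, dtype=float) - np.asarray(midhip_x, dtype=float)
    if direction == "right_to_left":
        rel = -rel                       # forward is -x in this case
    strikes, _ = find_peaks(rel, distance=30)   # >= 0.5 s apart at 60 fps
    offs, _ = find_peaks(-rel, distance=30)
    return strikes, offs
```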
The distance between the big toes (KP19 and KP22) is used to determine mid-midstance and mid-midswing. The legs are closest together in the mid-midstance posture; hence, the distance between the two toes is at a minimum. Assuming that the right big toe (KP22) is (x1, y1) and the left big toe (KP19) is (x2, y2), the distance between the toes (d) was calculated using Equation (1), and mid-midstance was at the minimum distance (Figure 3):

d = √((x2 − x1)² + (y2 − y1)²) (1)
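The toe-distance rule can be sketched as follows; Equation (1) is the Euclidean distance, and the array names are illustrative.

```python
import numpy as np

def toe_distance(right_toe, left_toe):
    """Per-frame Euclidean distance between KP22 and KP19, Equation (1).
    Inputs are (n_frames, 2) arrays of (x, y) coordinates."""
    r = np.asarray(right_toe, dtype=float)
    l = np.asarray(left_toe, dtype=float)
    return np.sqrt(((r - l) ** 2).sum(axis=1))

def mid_midstance_frame(right_toe, left_toe):
    """Mid-midstance is the frame of minimum toe separation."""
    return int(np.argmin(toe_distance(right_toe, left_toe)))
```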

For the coronal view, the beginning and end of the stride and mid-midstance must be located. The same procedure as in the sagittal view was used, measuring the distance between the toes (Figure 4).
Allocating Foot Events to Specific Strides

Figure 5 provides a flowchart for identifying strides. The method involves determining whether the stride begins on the left or the right; acquiring at least one stride by verifying that the number of right/left foot strikes equals or exceeds two; and assigning events to particular strides by searching for foot strikes, foot offs, or mid-midstance events that occur between the start and end of the stride.
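The allocation step can be sketched as below, assuming event detection has already produced sorted frame indices; the flowchart's checks are reduced to a minimal form, and the names are ours.

```python
def allocate_strides(strikes, other_events):
    """Assign events to strides. A stride on one side runs between two
    consecutive foot strikes of that side.

    strikes: sorted frame indices of one foot's strikes.
    other_events: dict of event name -> sorted frame indices.
    Requires at least two strikes (one full stride)."""
    if len(strikes) < 2:
        return []
    strides = []
    for start, end in zip(strikes, strikes[1:]):
        stride = {"start": start, "end": end}
        for name, frames in other_events.items():
            # Keep only events falling inside this stride's interval.
            stride[name] = [f for f in frames if start <= f < end]
        strides.append(stride)
    return strides
```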

Algorithmic Implementation of EVGS
The Edinburgh Visual Gait Score (EVGS) is a clinical tool used to assess gait deviations in patients with various neurological and musculoskeletal conditions. The EVGS consists of 17 parameters for each lower extremity, totaling 34 parameters. In practice, a person's gait is video recorded in the sagittal and coronal views while walking, using a handheld camera or a camera on a tripod. In typical clinical practice, the video is scored by a human according to the EVGS guidelines. The person scoring the video can use video editing software to pause the recording at specific gait events for analysis, and software can also be used to determine joint angles and other EVGS parameters. A 3-point ordinal scale is used to score each parameter: 0 = normal (within ±1.5 standard deviations of the mean), 1 = moderate deviation (between 1.5 and 4.5 standard deviations of the mean), and 2 = significant/severe deviation (more than 4.5 standard deviations of the mean). A lower overall score indicates less gait deviation [20]. In this research, to enhance understanding and maintain consistency, the 17 EVGS parameters were grouped based on foot events and gait phases. Table 2 below lists the parameters and how they are organized according to these events.

Table 2. EVGS parameters organized by foot events and gait phases.
Sagittal View

Most EVGS parameters require angle calculations, including at the hip, knee, and ankle. Since OpenPose does not provide keypoints for a pelvis segment, the hip angle in the sagittal plane was computed as the angle between the thigh axis and the axis perpendicular to the trunk axis (Figure 6). The trunk axis was the axis between the midshoulder (KP1) and midhip (KP8) keypoints, and the line connecting the hip (KP9 and KP12) and knee (KP10 and KP13) keypoints was the thigh axis. Note that this differs from how hip angle is usually visualized, as the thigh angle relative to pelvis orientation. Equation (2) was used to calculate the slope (angular coefficient) of the trunk axis, where (x1, y1) are the midshoulder (KP1) coordinates and (x2, y2) are the midhip (KP8) coordinates:

m1 = (y2 − y1)/(x2 − x1) (2)

The axes used to compute joint angles are shown in Figure 7. The axis perpendicular to the trunk axis thus had the slope given by Equation (3):

m1′ = −1/m1 (3)

A similar procedure was followed for the thigh axis, where m2 was calculated using Equation (4) from (x3, y3), the hip coordinates (KP9 and KP12), and (x4, y4), the knee keypoints (KP10 and KP13):

m2 = (y4 − y3)/(x4 − x3) (4)

Therefore, the angle (in degrees) is given by Equation (5):

θ = arctan((m2 − m1′)/(1 + m1′ m2)) × 180/π (5)

When the knee is in front of the body, the angle value is positive (flexion). This method was used to calculate peak hip flexion in swing (#13) and peak hip extension in stance (#12).
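A sketch of these slope-based angle computations; the example assumes non-vertical axes so the slopes stay finite, and the function names are ours.

```python
import math

def slope(p1, p2):
    """Angular coefficient of the line through p1 and p2 in image coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

def angle_between(m_a, m_b):
    """Angle (degrees) between two axes with slopes m_a and m_b."""
    return math.degrees(math.atan((m_b - m_a) / (1 + m_a * m_b)))

def hip_angle(midshoulder, midhip, hip, knee):
    """Thigh axis vs. the axis perpendicular to the trunk axis."""
    m_trunk = slope(midshoulder, midhip)
    m_perp = -1.0 / m_trunk          # axis perpendicular to the trunk
    m_thigh = slope(hip, knee)       # thigh axis
    return angle_between(m_perp, m_thigh)
```

The same `angle_between` helper applies to the knee (thigh vs. shank slopes) and ankle (shank vs. foot slopes).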
Knee extension in terminal swing was determined using the knee angle, the angle between the thigh and shank sagittal axes (Figure 8). When the knee is flexed, the angle is positive. For the thigh, the hip coordinates (KP9 and KP12) were (x3, y3) and the knee coordinates (KP10 and KP13) were (x4, y4), and Equation (4) was used to compute the angular coefficient (slope) of the thigh axis, m2. The ankle coordinates (KP11 and KP14) were (x5, y5), and the slope m3 (shank axis) was calculated using Equation (6):

m3 = (y5 − y4)/(x5 − x4) (6)

The knee angle was then calculated using Equation (7):

θ = arctan((m3 − m2)/(1 + m2 m3)) × 180/π (7)

This method was used to calculate knee extension in terminal swing (#10), peak knee extension in stance (#9), and peak knee flexion in swing (#11).

The dorsiflexion/plantarflexion angle was calculated using the foot and shank sagittal axes (Figure 9). Equations (6) and (8) are used for the tibia and foot segments. The knee coordinates (KP10 and KP13) are (x4, y4), and the ankle coordinates (KP11 and KP14) are (x5, y5); using Equation (6), the angular coefficient is m3 (shank axis). The heel coordinates (KP21 and KP24) are (x6, y6), and the mean positions of the big toes (KP19 and KP22) and small toes (KP20 and KP23) are (x7, y7). Equation (8) determines the slope m4 (foot axis):

m4 = (y7 − y6)/(x7 − x6) (8)

The ankle angle was calculated using Equation (9):

θ = arctan((m4 − m3)/(1 + m3 m4)) × 180/π (9)

Initial Contact (#1)

To determine whether the heel, toe, or flat foot makes initial contact with the ground, a line (foot axis) between the heel (KP21 and KP24) and midtoe (midpoint of the big toe (KP19 and KP22) and small toe (KP20 and KP23)) keypoints was used. The angle between the foot axis and the x-axis of the image coordinate system in the sagittal plane was measured, and the development dataset was used to calculate angle thresholds for scoring. The thresholds were determined by evaluating sagittal videos showing each of the three conditions (heel contact, toe contact, flat foot contact). After analysis, the flat foot contact range was set to between 0° and 20°, since 95% of the development dataset showed heel contact at more than 20° and toe contact at less than 0°.
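The initial contact rule can be sketched as below. The 0° to 20° flat-foot band follows the text, while the image-coordinate sign handling (y grows downward in pixel coordinates) and the function names are our assumptions.

```python
import math

def foot_axis_angle(heel, bigtoe, smalltoe):
    """Angle (degrees) of the heel-to-midtoe axis above the image x-axis."""
    midtoe = ((bigtoe[0] + smalltoe[0]) / 2, (bigtoe[1] + smalltoe[1]) / 2)
    # Image y grows downward, so negate dy to get a conventional angle.
    return math.degrees(math.atan2(-(midtoe[1] - heel[1]), midtoe[0] - heel[0]))

def initial_contact(heel, bigtoe, smalltoe):
    """EVGS #1: heel contact > 20 deg, toe contact < 0 deg, else flat foot."""
    angle = foot_axis_angle(heel, bigtoe, smalltoe)
    if angle > 20:
        return "heel"
    if angle < 0:
        return "toe"
    return "flat"
```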

Peak Sagittal Trunk Position (#16)
Similar to the foot position approach, a line from the midshoulder (KP1) to the midhip (KP8) keypoint was used for the trunk angle. The angle between this trunk line and the x-axis of the image coordinate system in the sagittal plane was calculated. Then, in accordance with the EVGS handbook [20], normal (vertical to 5° forward or backward), abnormal (greater than 5° backward or between 6° and 15° forward), and highly abnormal (more than 15° forward) conditions were classified.

Pelvic Rotation in Midstance (#15) and Maximum Pelvic Obliquity in Stance (#14)
OpenPose only provides hip keypoints, but the pelvis requires at least 3 keypoints to define segment orientation. As a surrogate measure, pelvic rotation angle was determined from a line connecting the right (KP9) and left hip (KP12) joints and the image coordinate system's sagittal plane y-axis.

Heel Lift (#2)
Heel lift has five EVGS criteria: normal, early, delayed, no heel contact, and no forefoot contact. Using the same methodology as for detecting foot position at initial contact, no forefoot contact and no heel contact were detected at midstance. "No forefoot contact" refers to when the forefoot (the front part of the foot, including the toes) does not contact the ground during midstance; "no heel contact" refers to when the heel does not contact the ground during midstance.
Three foot events (stance foot heel lift, other foot level with the stance foot, opposite foot contact with the ground) were recorded if the foot was in a flat posture during midstance. A normal EVGS result occurs if heel lift occurs on the stance leg between the other foot being at the same level as the stance foot and the other foot making contact with the ground. Heel lift is early if the stance leg heel lift occurs before the opposing foot levels with the stance foot. Heel lift is delayed if lifting occurs after the opposing foot reaches the ground.
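The timing comparison can be sketched as a simple ordering check on the three event frames; event and function names are illustrative.

```python
def classify_heel_lift(heel_lift_frame, other_foot_level_frame,
                       other_foot_contact_frame):
    """EVGS heel-lift result for a flat-foot midstance.
    Normal: heel lift between the other foot levelling with the stance
    foot and the other foot contacting the ground."""
    if heel_lift_frame < other_foot_level_frame:
        return "early"
    if heel_lift_frame > other_foot_contact_frame:
        return "delayed"
    return "normal"
```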
Foot Clearance in Swing (#6)

Foot clearance has four criteria: full clearance, reduced clearance, no clearance, and high steps. Toe, heel, and ankle keypoints were considered for this parameter. Full clearance occurs when the big toe and heel involved in foot clearance cross the big toe and heel of the opposing leg, respectively. Reduced clearance occurs when the heel crosses the opposing leg's heel, but the toe remains below or on the same level as the opposite leg's toe. No clearance is when both the big toe and heel are below the opposite big toe and heel, respectively. A high step occurs if the toe passes the opposite leg's midpoint between the ankle and knee.

Maximum Lateral Trunk Shift (#17)
The EVGS instructions for maximum lateral trunk shift are not machine-friendly because the scale's description only refers to "marked," "reduced," or "moderate" trunk shift, with no threshold values provided to distinguish between the conditions. The proposed approach for determining lateral trunk shift computes the angle between the trunk axis and the image coordinate system's y-axis in the coronal plane. The angle thresholds at which the trunk shift is classified as normal, moderate, or marked were determined using the development dataset: videos from the coronal view that displayed any of the three conditions (normal, mild trunk shift, marked trunk shift) were analyzed. After analysis, maximum lateral trunk shift was considered normal if the angle was between 0° and 5°, reduced if the angle was less than zero, moderate if the trunk angle was between 6° and 15°, and marked (severely inclined) if the trunk angle was greater than 15°.
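The proposed thresholds map directly onto a small classification function; this is a sketch, and the label strings are ours.

```python
def classify_lateral_trunk_shift(trunk_angle_deg):
    """Score maximum lateral trunk shift from the coronal trunk angle.
    Thresholds follow the development-dataset values in the text."""
    if trunk_angle_deg < 0:
        return "reduced"
    if trunk_angle_deg <= 5:
        return "normal"
    if trunk_angle_deg <= 15:
        return "moderate"
    return "marked"
```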

Knee Progression Angle (#8)
Knee progression angle is highly subjective and depends on visual cues, since this parameter is scored by observing whether the kneecap is visible. The EVGS instructions are insufficient to determine the knee progression angle using only OpenPose keypoints. The proposed method instead uses the ankle angle, with threshold values established using the development dataset. Following video analysis, normal was defined as an ankle angle between −25° and 25°, internal rotation as an ankle angle greater than 25°, and external rotation as an ankle angle less than −25°.
The algorithmic implementation of the Edinburgh Visual Gait Score is presented in Pseudocode in Appendix B.

Algorithm Evaluation Methodology
In clinical implementation, the EVGS is human-scored. To evaluate the proposed algorithmic approach, the algorithmic scoring was compared with scoring by a panel of human reviewers.
To evaluate the automated EVGS analysis system, a set of videos was collected that represented gait conditions for each EVGS parameter and result. This also allowed for a thorough evaluation of system performance, because the videos could be used to test various gait patterns.
Three healthy people with a good understanding of gait volunteered for this study. Participants were instructed to wear running shoes; tightly fitting, brightly colored clothes; and shorts with the patellae (kneecaps) visible. Participants were female, aged 20 to 25 years, with no issues affecting walking. All participants provided informed consent, and the study was approved by the University of Ottawa Research Ethics Board.
Before video recording, gait characteristics mentioned in the EVGS scale were explained to participants so that they could recreate these conditions. Nine gait sets were collected for the sagittal view: healthy, hip, knee, trunk, ankle, pelvis, foot position, heel lift, and foot clearance (Table 3). Five gait sets were collected for the coronal view: healthy, knee, trunk, foot, and pelvis (Table 4). Thirty-seven walking trials were collected, one for each condition and two for normal gait. This approach allowed for a comprehensive examination of different gait conditions and helped in the evaluation of algorithms for automatically calculating EVGS.
Data were collected using an iPhone 13 Pro with a 6.1-inch display and 2532 × 1170 rear camera pixel resolution. The smartphone was handheld to replicate situations without a tripod or other support. Three individuals were involved in each recording: the volunteer being recorded, the smartphone operator, and an assistant monitoring the recording. The operator was instructed to maintain a steady hand and hold the camera in portrait orientation at approximately neck level during the entire trial to ensure that the participant's entire body was captured in the video. The camera was oriented parallel to the plane being captured. For a coronal view, the operator stood in front of the participant. A satisfactory video had the participant visible, balanced lighting, and no images, objects, or people in the background that could be mistaken for the participant. A brief description, diagram, or video of each gait pattern was provided before the video trial to ensure that the participants understood and interpreted the walking pattern correctly. Additionally, an EVGS manual was provided for the participants to read before the video session to help reduce bias that may be introduced by any demonstration of the gait pattern.
For each trial, participants walked 10 m at a comfortable self-selected speed in an obstacle-free and flat hallway. Following the initial 10 m walk, individuals took a brief two-second break, turned 180° in their chosen direction, took a brief two-second break, and then resumed straight walking. Each participant completed 37 walking trials, resulting in 111 videos being collected.
Each video was trimmed at the start and end to eliminate frames where the person was not on camera. Sagittal view videos were divided into left-to-right and right-to-left directions, while coronal view videos were divided into front view and rear view. Video quality was checked to ensure that it met the required standards. Videos that were not taken correctly were removed and recaptured. After preprocessing, 216 videos were available for analysis. Five University of Ottawa students volunteered to score videos, each person with a sufficient understanding of gait events. Before evaluating the videos, all reviewers were briefed on normal gait kinematics, gait phases, and methodologies for recording and reviewing gait recordings. Additionally, the purpose and background of the EVGS were discussed, and gait analysis using the EVGS was demonstrated.
For training, the reviewers manually scored an example video using the EVGS and then compared their results to gain a better understanding. The reviewers then applied EVGS to evaluate the videos independently, without consulting one another. Slow motion and freeze-frame playback were used with video player software. Reviewers did not have time restrictions and could work at their own pace while reviewing.

Algorithm Evaluation
To validate the approach to automatically calculating EVGS results, independent evaluations were completed for stride detection and EVGS results.
For foot event ground truth, one reviewer manually identified foot events in each video. The ground truth was then manually labeled for coronal/sagittal view, motion direction, foot strike and foot off, mid-midstance, and the total number of strides in each video. For EVGS algorithm evaluation, each video was scored by at least two reviewers. The reviewers used the EVGS to evaluate the videos independently, without consulting one another. Each reviewer was unaware of the other reviewer's scoring.
Each reviewer completed an EVGS form containing all the EVGS parameters for each leg. The reviewers evaluated the entire gait cycle, including multiple strides, for each leg in each video. Only the most frequent score for each parameter for each leg in each video was considered as the final score. Using the most frequent score also helped reduce potential biases introduced by evaluating only a single stride. Each parameter was scored between 0 and 2, with 0 denoting normal performance and 2 denoting highly abnormal, as specified in the EVGS. Reviewers were instructed to assess joint angles using their eyes alone, without using any software or other tools, to replicate the instructions described in the EVGS.
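The most-frequent-score aggregation described above can be sketched as follows (a minimal illustration, not the study's code; the tie-breaking behavior is an assumption, since the paper does not state one):

```python
from collections import Counter

def final_score(per_stride_scores: list[int]) -> int:
    """Return the most frequent EVGS score (0 = normal, 2 = highly
    abnormal) across all strides of one leg in one video.

    Ties resolve toward the score first encountered (Counter ordering);
    this tie-breaking rule is an assumption for illustration.
    """
    return Counter(per_stride_scores).most_common(1)[0][0]
```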
Once the ground truth for coronal/sagittal view, motion direction, foot strike and foot off, mid-midstance, the total number of strides, and the EVGS results were provided by the reviewers for each video, the algorithmic results and EVGS results were compared with the reviewers' results. When using a human-scored scale, such as the EVGS, it is important to assess the degree of agreement or disagreement between different reviewers. Therefore, Pearson correlations were calculated for scores assigned by different reviewers, to determine the degree of consistency. Correlations were also used as a metric to compare algorithmic and reviewer results, with correlations interpreted as very high (0.9-1.0), high (0.7-0.9), moderate (0.5-0.7), low (0.3-0.5), and negligible (0.0-0.3).
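A minimal sketch of the correlation metric and the interpretation bands quoted above (plain-Python Pearson coefficient; not the study's analysis code):

```python
import math

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def interpret(r: float) -> str:
    """Interpretation bands used in the text, applied to |r|."""
    r = abs(r)
    if r >= 0.9:
        return "very high"
    if r >= 0.7:
        return "high"
    if r >= 0.5:
        return "moderate"
    if r >= 0.3:
        return "low"
    return "negligible"
```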

Coronal/Sagittal View Detection
To assess sagittal or coronal view identification, algorithmic findings and ground truth were compared. Table 5 displays the sagittal or coronal view (number of videos) classification confusion matrices. Accuracy was 96.3%, sensitivity was 93.1%, specificity was 98.4%, precision was 97.6%, and the F1-score was 0.95. Algorithm accuracy was acceptable. The algorithm failed when multiple people were in the video or when a person unexpectedly turned towards the camera while walking in the sagittal view. Tables 6 and 7 show confusion matrices for detecting the direction of motion (number of videos). For the coronal view, accuracy was 92.8%, sensitivity was 90.9%, specificity was 95.0%, precision was 95.2%, and the F1-score was 0.93. For the sagittal view, accuracy was 92.4%, sensitivity was 93.7%, specificity was 91.1%, precision was 90.9%, and the F1-score was 0.92.
Table 6. Confusion matrix for detecting direction of motion (coronal view).

Ground Truth    Coronal    Sagittal
Coronal         60         6
Sagittal        4          62

The algorithm correctly determined direction in the majority of videos but failed when a person's reflection was visible on the wall, when multiple people were in the video, when the camera moved as the person moved (both small and large camera movements), or when the participant was not in the frame the entire time.
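For reference, the reported metrics follow from a 2×2 confusion matrix as sketched below; the mapping of table cells to true/false positives is an assumption for illustration:

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int):
    """Accuracy, sensitivity, specificity, precision, and F1-score for a
    2x2 confusion matrix, as reported for view and direction detection.

    The assignment of matrix cells to tp/fn/fp/tn is an assumption here.
    """
    total = tp + fn + fp + tn
    acc = (tp + tn) / total
    sens = tp / (tp + fn)          # recall of the positive class
    spec = tn / (tn + fp)
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    return acc, sens, spec, prec, f1
```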

Foot Strike, Foot off, and Mid-Midstance Detection
Algorithmically determined foot strike, foot off, and mid-midstance were compared to manually labelled ground truth. Each video was classified into three categories based on the discrepancy between the identified frame number and the actual frame number: two frames or less discrepancy, two to five frames discrepancy, and more than five frames discrepancy. Each stride in a video was categorized, and then the most frequently occurring category was assigned to the video.
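The per-video categorization described above can be sketched as follows (a minimal illustration under the stated banding; not the study's code):

```python
from collections import Counter

def discrepancy_category(frames_off: int) -> str:
    """Bin a per-stride frame discrepancy into the three bands in the text."""
    if frames_off <= 2:
        return "<=2 frames"
    if frames_off <= 5:
        return "2-5 frames"
    return ">5 frames"

def video_category(per_stride_discrepancies: list[int]) -> str:
    """Assign each video the most frequently occurring stride category."""
    cats = [discrepancy_category(d) for d in per_stride_discrepancies]
    return Counter(cats).most_common(1)[0][0]
```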
Additional foot events may occasionally be detected by the algorithm, while at other times, the algorithm may not detect an event (Figure 10). These circumstances occurred rarely (5 of 100 videos), when the OpenPose keypoints abruptly swapped between legs.

Figure 10. Example where the algorithm detects an additional mid-midstance (a) but fails to detect a mid-midstance (b) (shown by the ellipse region). The red dashed line is the algorithm's estimated mid-midstance; solid black lines are the ground truth mid-midstances.
Table 8 shows accuracy results for foot strike, foot off, and mid-midstance.

Number of Strides
The differences between the number of strides detected by the algorithm and the number of strides detected by the reviewer are displayed in Table 9. Each video was classified into three categories based on the difference between the identified number of strides and the actual number of strides: two strides or fewer, two to five strides, and more than five strides.

EVGS Coronal Videos
The EVGS results were computed for five coronal view parameters. Pearson correlation analyses assessed the relationship between reviewer 1 and reviewer 2, reviewer 1 with the algorithm, and reviewer 2 with the algorithm, and for both legs and for each gait set. The EVGS result correlations between reviewers (R1, R2) and the algorithm for right and left legs for different gait sets are shown in Figure 11. The tables of results are in Appendix A.

EVGS Sagittal Videos
The sagittal view had nine gait sets. The EVGS result correlations between reviewers (R1, R2) and the algorithm for right and left legs for different gait sets are shown in Figure 12. The tables of results are in Appendix A.
Figure 11. Reviewer correlation and correlation between the algorithm and the reviewers for coronal plane parameters.

Discussion
The discussion is divided into subsections for coronal view parameters and sagittal view parameters. The results for each EVGS parameter are summarized in Table 10, using the overall average correlation between reviewers and the algorithm.
Table 10. Correlation interpretation between reviewers (R1 and R2) and algorithm.

Coronal View Parameters
Based on coronal view analysis across different gait sets, all five coronal parameters had high correlations among reviewers (r = 0.7 to 0.9). The literature on the EVGS interrater reliability found that the level of agreement between reviewers varied for each parameter [34]. The knee progression angle showed a high level of agreement, with an 81% agreement rate among reviewers, while the other parameters had a moderate level of agreement (r = 0.5-0.7). Although there were differences between the correlation coefficients in this research and the level of agreement in the literature, the overall range of agreement among reviewers for each gait parameter was consistent between this study and the literature.
Foot rotation and lateral trunk shift showed high correlations between the algorithm and reviewers, while the knee progression angle had moderate correlations. Pelvic obliquity had low correlations, and the hindfoot valgus/varus parameter showed negligible correlations between the algorithm and reviewers. Keypoints on the ankle and heel that are near each other in a narrow area and far from the camera are more likely to be occluded during walking, which helps to explain this result. Figure 13 depicts foot keypoints as the person walks away from the camera. Since the two legs are close together, the foot keypoints for the left leg are not accurately detected. Possible improvements to address these issues could include using the knowledge that the stance foot should not move (i.e., marker location fixed) when not "toe walking", thereby helping to avoid confusion when the swing foot passes close to the stance foot. Additional model training for these situations could also help improve OpenPose keypoint determination.

Pelvic obliquity involves movement of the entire pelvis segment. Hip keypoints were used as a surrogate measure of pelvic movement, because they are a part of the pelvic region but may not represent the entire pelvic segment; therefore, lower correlations between the algorithm and reviewers were anticipated. The difference was also attributed to the reviewer looking at the entire pelvis when making their assessment, which differs from the algorithm only using hip keypoints. In future, improved AI models that estimate depth (i.e., 3D coordinates for each keypoint) could provide additional information to improve outcomes when using the hips as a surrogate marker.
With the exception of hindfoot valgus and varus, all parameters demonstrated strong correlations between reviewers and the algorithm for the healthy gait set. The knee progression angle had a stronger correlation within the knee gait set, while the foot progression angle and lateral trunk shift had the strongest correlations within the foot gait set. All correlations between the algorithm and the reviewers were lower than those between reviewers. As mentioned previously, this difference can be attributed to surrogate measures from the OpenPose BODY25 keypoints. In addition, if the camera is not parallel to the coronal plane and oriented vertically, parallax effects may distort the body keypoints, and phone orientation can affect measures in which the phone axis is assumed to be aligned with the ground. Keypoint detection errors can also occur for body parts that are closer to the camera. Future research could include phone sensor data to enable video frame transformation to a gravity reference, thereby helping to provide a consistent ground plane.

Sagittal View Parameters
Based on sagittal view analysis, the agreement between reviewers was very high for initial contact, followed by six parameters with high correlations, namely peak sagittal trunk position (#16), ankle dorsiflexion in swing (#7), peak hip flexion (#13), heel lift (#2), knee extension in terminal swing (#10), and peak hip extension (#12). Four parameters had moderate correlations (max. ankle dorsiflexion in stance (#3), pelvic rotation in midstance (#15), foot clearance (#6), and peak knee flexion in swing (#11)). The literature on the interrater reliability of sagittal view gait parameters reported that initial contact had an agreement rate of 90%, while foot clearance and heel lift had agreement rates of 82-83% [34]. Knee extension in terminal swing had an agreement rate of 62%, and knee peak flexion in swing had an agreement rate of 69%. For both the research results and the literature, initial contact was the most consistent parameter between raters for sagittal view gait parameters. Other parameters, such as trunk position, ankle dorsiflexion in swing, foot clearance, and heel lift, also had high levels of agreement between reviewers. However, knee flexion, knee extension in terminal swing, and knee peak flexion in swing had lower levels of agreement.
The foot position parameter had a very high correlation between the algorithm and the reviewer. Five parameters had a high correlation: peak sagittal trunk position (#16), ankle dorsiflexion in swing (#7), knee extension in terminal swing (#10), peak knee extension in stance (#9), and peak hip extension (#12). Three parameters had a moderate correlation: max. ankle dorsiflexion in stance (#3), peak knee flexion in swing (#11), and peak hip flexion (#13). Finally, three parameters had a negligible correlation: pelvic rotation in midstance (#15), heel lift (#2), and foot clearance (#6). Foot parameter correlations were close to zero, indicating no relationship between the algorithm and reviewer scores. Foot parameters are the keypoints farthest from the cameras, can be occluded during walking, can rely on the floor plane being correct, and as a terminal segment, can have poorer OpenPose performance. Pelvic parameters had lower correlations (r = 0.23), since the hip keypoints were used as a surrogate measure of pelvic movement. The difference could also relate to the reviewer's assessment of the complete pelvis as opposed to the algorithm's use of only the hip keypoints. In future research, estimating the keypoint depth coordinate could help improve the pelvic rotation scoring.
Knee flexion and ankle dorsiflexion in swing had stronger correlations between the algorithm and reviewers when using the foot position gait set. Trunk position had a better correlation for the knee gait set. In contrast, peak hip flexion and ankle dorsiflexion in swing had the strongest correlations when using the ankle gait set. The results of our research showed that ankle dorsiflexion had a stronger correlation with the foot and ankle gait dataset, which was specifically designed to focus on foot movements during gait. This may be due to the fact that the foot movements are more prominent in this gait dataset, making it easier for both reviewers and the algorithm to accurately detect and agree on the presence and extent of ankle dorsiflexion.
For peak hip flexion, the correlations between reviewers were higher than the correlations between the reviewers and the algorithm. This difference can be attributed to small differences in angle measurements leading to different scores. For example, a score of 0 corresponds to a peak hip flexion angle between 25° and 45°, while a score of 1 corresponds to angles between 45° and 60°. Reviewers may not accurately detect small angle changes, such as differences between 44° and 46°, leading to discrepancies in their assessments. In contrast, the algorithm assigns scores using the measured angle. Therefore, the algorithm provides a more objective and consistent assessment of the peak hip flexion parameter compared to the subjective assessments of human reviewers.
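The boundary sensitivity described above can be illustrated with a small sketch; only the 25-45° (score 0) and 45-60° (score 1) bands come from the text, and the treatment of angles outside these bands is an assumption:

```python
def peak_hip_flexion_score(angle_deg: float) -> int:
    """Assign an EVGS-style score from a measured peak hip flexion angle.

    Only the 25-45 deg (score 0) and 45-60 deg (score 1) bands are
    quoted in the text; scoring outside these bands is an assumption
    for illustration.
    """
    if 25 <= angle_deg < 45:
        return 0
    if 45 <= angle_deg < 60:
        return 1
    return 2  # assumed for angles outside the quoted bands

# A 2 degree measurement difference near the 45 degree boundary flips
# the score: peak_hip_flexion_score(44) != peak_hip_flexion_score(46).
```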
Foot strike and foot-off detection are crucial for the heel lift score, and even a small frame change affects this parameter. Consider, for instance, a scenario in which the opposite-leg foot strike occurs at frame 53 and foot off occurs at frame 52. In this case, the patient would have a normal heel lift score (score 0). The stride detection algorithm might instead identify the foot strike at frame 53 and the foot off at frame 54; both foot event frames would be within the permitted two-frame tolerance used for stride detection. The condition would, therefore, be considered delayed, and a score of 1 would be assigned. This demonstrates how even a small variation of one frame can substantially affect the heel lift parameter.
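The worked example above can be sketched as a simple timing predicate (an illustrative simplification of the full EVGS heel lift rule, not the study's implementation):

```python
def heel_lift_delayed(foot_off_frame: int, opposite_strike_frame: int) -> bool:
    """Illustrative timing check for the heel lift example in the text:
    the lift is treated as delayed when foot off occurs after the
    opposite-leg foot strike. This predicate is a simplification of the
    full EVGS rule, shown only to demonstrate frame sensitivity.
    """
    return foot_off_frame > opposite_strike_frame

# Ground truth: foot off at 52, opposite strike at 53 -> normal (score 0).
# Algorithm (foot off shifted to 54): foot off 54, strike 53 -> delayed.
```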
For foot clearance, reviewers scored Figure 14a as 2, "No clearance". The person in Figure 14b was scored as 1, "Reduced clearance". However, the algorithm categorized Figure 14a as "Reduced clearance" and Figure 14b as "No clearance", as the keypoints were clustered in a small area and there was no data for the floor axis. Considering that OpenPose does not locate foot keypoints at the shoe insole level and handheld smartphone video can lead to variable floor plane estimation, accurate foot clearance and heel lift parameters will be difficult to achieve without further research. Potential improvements could involve using phone orientation data to correct the floor plane and continuing to develop the foot keypoint identification aspect of pose estimation models.
Algorithm performance varied across the different parameters. For the foot progression angle (#5), both the correlation between reviewers and the correlation between reviewers and the algorithm were high, indicating that the algorithm performed well in identifying this parameter. The algorithm performed reasonably well for the knee progression angle (#8) and lateral trunk shift (#17), with moderate and high agreement between reviewers and the algorithm, respectively. The algorithm's performance was poor and negligible for pelvic obliquity (#14) and hindfoot valgus/varus (#4), respectively, in the coronal view compared to human reviewers.
The correlation between the algorithm and each reviewer was moderate for pelvic obliquity and negligible for hindfoot valgus/varus, indicating the algorithm struggled to accurately identify these parameters in a way that aligned with human judgment. The correlation between the two reviewers for hindfoot valgus/varus was also negligible compared to other coronal parameters, which may have contributed to the algorithm's lower performance.

Conclusions
This research proposes an automated approach for using the Edinburgh Visual Gait Score (EVGS) with handheld smartphone video. The results show that the algorithmic stride detection component works better for sagittal videos than coronal videos, and the EVGS performance varied across different parameters. Of the seventeen EVGS parameters, correlations between human reviewers and the algorithm were very high for one parameter, high for seven parameters, moderate for four parameters, low for one parameter, and negligible for four parameters. Based on these results, automated EVGS processing could now be used for 12 parameters, but more research is needed to properly classify pelvic rotation in midstance (#15), heel lift (#2), foot clearance in swing (#6), maximum pelvic obliquity in midstance (#14), and hindfoot valgus/varus (#4).
This research demonstrates the feasibility of using videos recorded using a handheld smartphone camera, processed by pose estimation models, and scored by rule-based algorithms. The proposed approach has the potential to enhance gait analysis accessibility and facilitate continuous monitoring of patients outside the clinic.
Future research should involve capturing smartphone orientation, accelerometer, and gyroscope data while recording videos and incorporating this information into the stride identification and EVGS algorithms, especially for foot-related parameters. To improve the accuracy of patient gait pattern capture, future research could gather data from more strides to recognize more foot events and minimize the effect of outliers. Additionally, machine learning algorithms could be implemented to potentially improve the automatic identification of foot event and stride frames. An app for smartphone-based gait analysis could be developed, enabling automatic acquisition of the Edinburgh Visual Gait Score (EVGS) after capturing patient video with the app and thereby enabling evidence-based decision-making during a patient encounter.