EchoGest: Soft Ultrasonic Waveguides Based Sensing Skin for Subject-Independent Hand Gesture Recognition

Gesture recognition is crucial for enhancing human-computer interaction and is particularly pivotal in rehabilitation contexts, aiding individuals recovering from physical impairments and significantly improving their mobility and interactive capabilities. However, current wearable hand gesture recognition approaches are often limited in detection performance, wearability, and generalization. We thus introduce EchoGest, a novel hand gesture recognition system based on soft, stretchable, transparent artificial skin with integrated ultrasonic waveguides. Our presented system is the first to use soft ultrasonic waveguides for hand gesture recognition. Ecoflex™ 00-31 and Ecoflex™ 00-45 Near Clear™ silicone elastomers were employed to fabricate the artificial skin and ultrasonic waveguides, while 0.1 mm diameter silver-plated copper wires connected the transducers in the waveguides to the electrical system. The wires are enclosed within an additional elastomer layer, yielding a sensing skin with a total thickness of around 500 µm. Ten participants wore the EchoGest system and performed static hand gestures from two gesture sets: 8 daily life gestures and 10 American Sign Language (ASL) digits 0-9. Leave-One-Subject-Out Cross-Validation analysis demonstrated accuracies of 91.13% for daily life gestures and 88.5% for ASL gestures. The EchoGest system has significant potential in rehabilitation, particularly for tracking and evaluating hand mobility, which could substantially reduce the workload of therapists in both clinical and home-based settings. Integrating this technology could revolutionize hand gesture recognition applications, from real-time sign language translation to innovative rehabilitation techniques.

for affected individuals such as stroke patients [1]. This recognition becomes essential as hands are vital for performing a wide range of daily activities such as writing, eating, and communicating, allowing us to continually interact with our surroundings [2]. Furthermore, hand gestures are universally used for communication [4] and play a pivotal role in human-computer interaction [5]. In rehabilitation contexts, technology-assisted exercises that enhance hand mobility through gesture recognition can significantly benefit patients. For instance, automatic hand gesture recognition can be integrated with smartphones or computer games to monitor rehabilitation progress while actively engaging the user [6]. Such interfaces not only promote independent living but also alleviate the workload of medical professionals in clinical settings [2]. Additionally, the fundamental means of communication for hearing-impaired people is their hands [7], and hand gesture recognition systems can bridge the communication gap between hearing and hearing-impaired individuals via automatic sign language translation [8]. Beyond medical applications, hand gesture recognition enables more intuitive communication for various human-computer interaction applications [9], such as virtual reality (VR) / augmented reality (AR) [10], hand gesture interaction with smartphones [11], and in-vehicle menu control to mitigate visual distractions while driving [12].
Recent technological advances have enabled the miniaturization of sensors, circuit boards, microcontrollers, and batteries [13]. Compared to vision-based systems, sensors are less susceptible to environmental factors [14]. Portable systems offer extensive sensor data storage capacities, rendering them apt for mobile and wearable applications. The major drawback of sensor-based methods is the likelihood of user discomfort or mobility limitations due to the sensor configuration. To mitigate this problem, sensors can be incorporated into wearable devices such as gloves, wristbands, or rings.
Wearable sensor-based systems can be applied to a broader spectrum of applications. Ergonomic and accurate wearable technologies are more promising for daily life activities. To this end, various band-like wearable devices incorporating conventional rigid sensors have been devised for hand gesture recognition [9], [15], [16]. These methods demonstrate reduced performance as the number and complexity of gestures increase. Additionally, they require intricate configurations and a long calibration period. Another method for accurately identifying and reconstructing finger movement is using data gloves [17], [18]. However, they are mostly bulky and costly. In addition to being expensive, glove-inspired designs often cover the entire hand, thereby restricting customizability and limiting the user's natural tactile interactions with the surroundings [7], [19]. Most commercial wearable devices rely on well-established sensing technologies that cannot stretch, such as resistive strain gauges and optical goniometers [20]. This constraint makes them inherently ill-suited for wear on our relatively delicate bodies. Moreover, piezoelectric sensors [21] and inertial measurement units (IMUs) [22] measure movement instead of displacement, making it challenging to estimate static finger configurations over time. Therefore, developing stretchable, lightweight, and comfortable wearable sensors for gesture recognition is of great significance.
Epidermal electronics [23] and artificial skin sensors [24] offer flexible and stretchable wearable solutions for assessing finger postures. In this regard, there have been several noteworthy research publications on sensing skins equipped with soft sensors for hand gesture recognition [25], [26], [27], [28], [29]. A soft glove was developed to estimate finger joint angles [25]; however, it was relatively bulky. Gu et al. [26] developed a transparent wearable soft ionotronic skin system with high stretchability, comprising a 10-channel hydrogel/elastomer hybrid ion sensor and a wireless electronic control module. Nonetheless, ionotronic skins are vulnerable to physical degradation and pose operational challenges under extreme conditions such as high or low temperatures [30].
Previous notable studies have utilized Ecoflex elastomer as a substrate material owing to its high stretchability [27], [28], [29]. However, the cured Ecoflex substrate exhibits a cloudy white appearance, posing difficulties in achieving the desired objective of transparency. Park et al. [27] developed a wearable soft artificial skin using the elastomer Ecoflex and liquid metal by incorporating microchannels into the elastomer base to inject conductive materials and measure finger joint movements. Nevertheless, the manufacturing process can be relatively complex and may not be precise enough to prevent the liquid metal from leaking. Li et al. developed an artificial skin integrating thin-film stretchable strain sensors in a tri-layered configuration, using carbon conductive grease as the resistance electrode layer in the middle and soft elastomer as the two sealing layers [28]. Even though this system offers substantial functionality, its performance varies upon prolonged use. Jiang et al. [29] introduced a novel approach to recognizing hand gestures by estimating skin strain using a stretchable e-skin patch with multiple soft sensors optimally placed across the back of the hand. Although this method exhibited impressive performance results, the experiments did not incorporate a subject-independent validation approach, limiting its generalizability to different users. Despite promising findings in previous research regarding the application of artificial sensing skins for hand gesture recognition, developing optimal sensing skins that possess desired characteristics such as transparency, stretchability, lightness, and ease of wearing on human hands remains challenging. This challenge stems from the limited availability of suitable fabrication materials and the complexity of the manufacturing processes involved. Ideally, an artificial sensing skin should be thin, transparent, highly stretchable, and lightweight.
Physiologically, the hand is instrumental in grasping and sensory perception, with the fingers and palm working together for grasping, whereas the fingertips' mechanoreceptors perceive various tactile stimuli, such as texture, temperature, pressure, vibration, and pain [31]. Conversely, the back of the hand plays a comparatively minor role in these functions. As such, placing sensors on the dorsum of the hand incurs minimal interference with grasping and sensory functions, making it more feasible for real-world applications than attaching sensors to the whole hand. Moreover, this approach resolves users' or patients' difficulty in donning bulky data gloves.
This paper presents a hand gesture recognition system based on a soft, transparent, and stretchable artificial skin integrated with soft ultrasonic waveguides (Fig. 1). The developed system pioneers the use of soft acoustic waveguides for static hand gesture recognition and addresses inherent challenges in design and implementation through a holistic approach, encompassing detailed prototype design and fabrication, algorithm formulation, and experimental validation. Our hand gesture recognition system shows great promise in the fields of rehabilitation and sign language translation. It supports rehabilitation by accurately tracking and interpreting hand gestures, aiding in exercise monitoring and providing immediate feedback to patients and therapists. For sign language, it offers an innovative approach to effective communication translation. The significant contributions of this work are: (1) The use of soft ultrasonic waveguides to develop a sensing skin that is soft, transparent, lightweight, and highly stretchable. This process involved the design and fabrication of soft ultrasonic waveguides and artificial skin, which were then integrated into a soft sensing skin to be worn on the back of the hand. These integrated attributes collectively enhance the developed system's wearability and user experience.
(2) The development of a subject-independent hand gesture recognition system. A static hand gesture experiment was performed to validate the system's efficacy and suitability for hand gesture recognition. Leave-One-Subject-Out Cross-Validation (LOSOCV) was employed to assess how well different machine learning models generalize to new (unseen) users. The experimental results indicate that the novel sensing skin developed in this study generalizes effectively to unseen subjects, showcasing its potential for real-world hand gesture recognition applications.

II. FABRICATION AND ASSEMBLY
Recently, soft polymer acoustic waveguides have demonstrated promising capabilities in the decoupled measurement of strain and contact location by guiding acoustic waves within the waveguide material [32]. However, these waveguides lack transparency and are not easily adaptable for direct use on the human body, limiting their wearability. Additionally, the manufacturing process presents challenges, leading to reduced acoustic signal quality and increased acoustic loss. In this study, we address these limitations by integrating these acoustic waveguides with artificial skin, thereby developing a sensing skin that enhances wearability. Furthermore, our novel manufacturing method significantly improves acoustic signal quality, making the sensing skin more efficient and practical for real-world applications. The materials and fabrication process employed for the proposed sensing skin are simple and cost-effective.

A. Fabrication
The fabrication process encompasses multiple steps, including the manufacturing of artificial skin, hand frame, and soft ultrasonic waveguides.
1) Artificial Skin Manufacture: The fabrication procedure for the artificial skin commences with a design sketch generated on the user interface of a laser cutter. An acrylic plate with a thickness of 1 mm is then cut based on the design using a laser cutting machine (GD Han's Yueming Laser Group Co., Ltd, China). The plate's dimensions are tailored to the average human hand size, which measures 16 cm in length and 9 cm in width [33], [34]. We chose a length shorter than the average hand length to specifically target the MCP joints of the fingers rather than covering the entire finger. Following this, the chosen elastomer is thoroughly mixed, degassed, and poured onto the plate. A spin coater (KW-4C, Beijing Saidecase Electronics Co., Ltd, China) is used to spread the elastomer evenly across the plate using predetermined parameters to achieve the desired thickness of the artificial skin. For the Ecoflex™ 00-31 Near Clear™ elastomer, the parameters were a speed of 200 revolutions per minute (rpm), an acceleration setting of 500, and a duration of 30 seconds (s). The coated elastomer layer is degassed once again and cured on a hot plate at 60 °C for approximately 25 minutes. Once fully cured, the artificial skin is carefully unmolded and detached from the plate.
Material selection for the artificial skin necessitates rigorous scrutiny to ensure it meets the criteria of being soft, lightweight, transparent, and compatible with the human hand. After testing different materials, we selected Ecoflex™ 00-31 Near Clear™ because, when cured, this material exhibits transparency, softness, thinness, and sufficient stretchability, making it comfortable to wear on hands of varying sizes. The thickness of the fabricated artificial skin was measured to be around 200 µm.
2) Hand Frame Manufacture: To ensure proper attachment of the artificial skin to the dorsal part of the hand, a specialized hand frame was designed. The hand frame comprises finger rings and a wristband, which were 3D printed using a soft and transparent thermoplastic polyurethane (TPU) material (WeNext Technology Co., Ltd., China). The finger rings boast excellent stretchability, enabling them to accommodate hands of different sizes. Additionally, Velcro strips are affixed to the ends of the wristband, facilitating easy adjustment of the artificial skin to fit various hand sizes.
3) Soft Ultrasonic Waveguide Manufacture: The design of the soft ultrasonic waveguides aimed to fulfill the following requirements: 1) complete embedding of the piezoelectric transducer (AM1.2 × 1.2 × 1.7D-1F, Tokin, Japan) within the waveguide; 2) sufficient length of the soft ultrasonic waveguides to cover the MCP finger joints [33] while ensuring minimal acoustic losses; and 3) softness, lightness, and transparency of the ultrasonic waveguides as desired attributes.
Choosing the appropriate material for the soft ultrasonic waveguide is paramount for attaining nuanced responsiveness to minor joint movements. In a prior investigation, the Ecoflex™ 00-45 Near Clear™ silicone elastomer was found to exhibit superior signal properties, characterized by stronger echo signals and reduced noise. Additionally, it becomes transparent upon curing; this aligns with our design requirement of transparency, hence we opted for the Ecoflex™ 00-45 Near Clear™ elastomer for waveguide fabrication.
The manufacturing process of soft acoustic waveguides profoundly influences the final quality of the acoustic signals. One of the major design requirements is the complete embedding of the piezoelectric transducer within the waveguide. This maximizes the transducer-waveguide contact surfaces, thereby improving acoustic transmission and mechanical adhesion. Previous works utilized 3D-printed molds to fabricate these waveguides [32]. The molds feature a square cross-sectional groove, measuring 1.7 mm by 1.7 mm in height and width and 50 mm in length. The molds are terminated by a flared end on one side, facilitating easy and safe demolding of the waveguide. To ensure accurate placement and alignment of the transducer, a placement platform with dimensions of 0.25 × 1.2 × 1.5 mm is incorporated within the mold. This platform is located at the juncture of the square groove and the flared end. The molds are coated with a release agent (Mann, Ease Release 200, USA), followed by manual transducer placement on the designated platform. Next, the chosen polymer is mixed, degassed, and poured into the mold, after which a 4-hour curing period at room temperature ensues. Subsequently, the cured waveguides are demolded by loosening the waveguide at the flared end with tweezers and slowly pulling it out of the mold. After demolding, the flared ends are excised, leaving a 50 mm long square cross-sectional waveguide.
However, the demolding process often results in breakage at the transducer-polymer boundary. The breakage mainly occurs at the bottom of the transducer-polymer interface, especially where the transducer comes in contact with the elevated placement platform. This is due to the formation of a delicate thin polymer skin at the transducer-placement platform contact area, which can be easily broken when the waveguide is pulled out of the mold during demolding. Consequently, an undesired gap is created between the transducer and waveguide, leading to transducer misalignment. This phenomenon exacerbates noise due to reflected signals at the transducer-polymer boundary and results in unstable, low-amplitude echoes, eventually impairing the application of the soft acoustic waveguides as strain sensors. In response, we present a new ultrasonic waveguide mold that enhances acoustic signal quality by preventing breakage at the transducer-polymer boundary.
The newly developed mold maintains structural congruence with its predecessor but has a hollowed bottom and thicker edges. The mold model was designed using CATIA software and fabricated by 3D printing using resin as the material. This design facilitates safer demolding of the waveguide by allowing for gentle upward or downward pushing instead of pulling, particularly at the sensitive transducer-polymer boundary. The mold minimizes transducer-polymer disjunctions and ensures that the transducer is fully encapsulated within the waveguide. Furthermore, this design improves the acoustic signal quality by reducing the noise from reflected signals at the transducer-polymer boundary, resulting in a stable echo signal.
The new soft ultrasonic waveguide manufacturing process is described as follows (Fig. 2(a)). Initially, a plate is laminated with a plastic film, ensuring a smooth surface without bubbles or wrinkles to avert geometric inconsistencies that would compromise signal quality. Subsequently, the mold is coated with a release agent, and the transducer is manually placed and aligned. In this modified mold model, the transducer placement platform is removed, and the transducer is placed directly into the mold at the intersection between the square groove and the flared end, allowing it to come into direct contact with the plastic-covered plate. Next, the chosen polymer is mixed, degassed, and poured into the mold, followed by a curing period of 4 hours at room temperature or approximately 40 minutes in an oven at 50 °C. After curing, the waveguides are demolded by peeling off the plastic film from the plate, cutting out a section containing a single waveguide, and cleaning spill-over material at the mold's periphery. Finally, the waveguide is gently loosened at the flared ends using tweezers and slowly pushed out of the mold. The flared ends and other remaining residuals are removed, yielding an elongated cuboid-shaped waveguide.

B. Assembly
The system assembly process consists of three primary steps: wire embedding, hand frame attachment, and waveguide integration (Fig. 2(b)). After testing various wiring techniques, we adopted thin silver-plated copper wires (Kunshan Lvchuang Electronic Technology Co., Ltd., China) with a diameter of 0.1 mm to connect the transducer wires on the waveguides to the micro coaxial cables. Once the artificial skin is fabricated, the chosen wires are positioned atop it with double-sided tape (YZ202 0316, TianTian Factory, China). Thereafter, Ecoflex™ 00-31 Near Clear™ elastomer is dispensed over the wired artificial skin and uniformly coated using an adjustable film applicator (KTQ-II1, Guangzhou Xinyi Laboratory Equipment Co., Ltd, China). This process seamlessly encapsulates the wires while leaving a small section exposed at both ends to establish electrical connections with the rest of the system. The embedded structure is then cured on a hot plate at 60 °C for about 25 minutes. The resulting thickness of the wire-embedded artificial skin is around 500 µm.
Following this, the artificial skin is attached to the finger rings and wristband of the hand frame using super glue. The finger rings anchor one side of the artificial skin to the user's fingers, whereas the wristband, which is equipped with attached Velcro, secures the other side of the artificial skin to the user's wrist. Finally, the ultrasonic waveguides are strategically aligned over the MCP joints, on top of the artificial skin covering the back of the hand. The free ends of the waveguides are affixed to each finger ring of the hand frame using the same double-sided tape mentioned earlier. The ends of the embedded wires proximal to the fingers are soldered to the transducer wires on the waveguides, whereas the ends proximal to the wrist are soldered to the micro coaxial cables. This wiring approach streamlines the manufacturing process while maintaining a relatively thin profile for the wire-embedded artificial skin.

III. SYSTEM PERFORMANCE
This section examines the system's performance through two key aspects: the operating principle, encompassing the system setup and signal acquisition process, and the subsequent signal processing techniques employed for data extraction.

A. Operating Principle
The overall system comprises five soft ultrasonic waveguides, a multiplexer (CD74HC4067, Texas Instruments, USA), a MAX14808 acoustic pulser evaluation board (Maxim Integrated Products, USA), an Analog Discovery 2 digital oscilloscope (AD2, Digilent, USA), power supplies, and a PC. Fig. 3 illustrates the system architecture and signal transmission flow. The transducer within each ultrasonic waveguide is excited using the MAX14808 board, which generates an acoustic pulse train consisting of four consecutive square pulse waves. Precise control over the timing of the excitation pulses is achieved through the AD2 digital output channels. These pulses have a frequency of 980 kHz and a voltage of 15 V, thus generating acoustic waves within the waveguide. As the transmitted acoustic wave is reflected, the transducers function as receivers, capturing the echo vibrations. This captured signal is channeled through an AD2 oscilloscope channel. To facilitate simultaneous data collection from all five sensors using a single oscilloscope channel on the AD2 (which has only two oscilloscope channels), a multiplexer (here acting as a demultiplexer) is utilized to expand the available output channels. The multiplexer is also controlled through digital signals transmitted via the AD2 digital output channels. When the waveguide is stretched, the distance between the transducer and the end of the waveguide increases, causing a change in the Time of Flight (TOF) of the acoustic waves. The TOF represents the time interval between the emission of the acoustic pulse train and the reception of the echo signal's primary peak (Fig. 4). By measuring this time interval, the change in the length of the waveguide can be estimated.
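The relation between TOF and waveguide elongation follows from the round-trip geometry: the wave travels from the transducer to the far end of the waveguide and back, so a TOF shift maps to half the round-trip path change. The sketch below illustrates this calculation; the function name and the wave speed value are illustrative assumptions, not values from this work (the longitudinal wave speed in the elastomer must be calibrated in practice).

```python
# Illustrative sketch: converting a change in echo Time of Flight (TOF)
# to an estimated waveguide elongation. The echo covers the waveguide
# length twice (out and back), hence the factor of 2.

def elongation_from_tof(tof_s: float, tof_rest_s: float,
                        c_m_per_s: float = 1000.0) -> float:
    """Estimate waveguide elongation (m) from the TOF shift.

    c_m_per_s is an ASSUMED longitudinal wave speed in the silicone
    elastomer, used here only for illustration.
    """
    return c_m_per_s * (tof_s - tof_rest_s) / 2.0

# Example: a 10 us TOF increase over the resting TOF corresponds to
# 5 mm of stretch at the assumed 1000 m/s wave speed.
delta_l = elongation_from_tof(tof_s=60e-6, tof_rest_s=50e-6)
```

In practice the resting TOF and effective wave speed would be obtained per waveguide during a calibration step with the hand fully extended.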

B. Signal Processing
The AD2 oscilloscope channel captures analog voltage signal data during a 1040 µs long window. This window is subdivided into five 208 µs window segments, each corresponding to the signal from a specific waveguide positioned on the MCP joints. The acquired digital data is transmitted to the PC at a sampling frequency of 6.67 MHz, where a Python program performs real-time signal filtering using an 8th-order Chebyshev bandpass filter with a range of 500 kHz to 3 MHz. Echo signal peaks are extracted via signal enveloping with the Hilbert transform. The TOF of the signal is determined by subtracting the time of the acoustic pulse train's generation from the time of the echo signal's main peak reception. Each waveguide's signal from the five fingers' MCP joints is color-coded in the filtered signal illustration (Fig. 4). It is important to note that the first wave packet in each 208 µs window segment corresponds to noise resulting from reflected signals at the transducer-elastomer junction, while the second wave packet represents the first received echo, which is the desired target.
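The processing chain described above (Chebyshev band-pass filtering, Hilbert-transform enveloping, peak-based TOF extraction) can be sketched with SciPy as follows. This is not the authors' code: the 1 dB pass-band ripple and the zero-phase `sosfiltfilt` choice are assumptions, and the `echo_tof` helper simply takes the strongest envelope peak rather than isolating the second wave packet.

```python
# Illustrative sketch of the described pipeline: band-pass filter a
# 6.67 MHz-sampled trace between 500 kHz and 3 MHz, envelope it with the
# Hilbert transform, and read the TOF off the main envelope peak.
import numpy as np
from scipy.signal import cheby1, sosfiltfilt, hilbert

FS = 6.67e6  # sampling frequency (Hz), as stated in the text

def echo_tof(trace: np.ndarray, t_emit_s: float = 0.0) -> float:
    """Return the TOF (s) of the strongest envelope peak in a raw trace."""
    # Order-4 Chebyshev type-I prototype yields an 8th-order band-pass;
    # the 1 dB ripple value is an assumed parameter.
    sos = cheby1(4, 1, [500e3, 3e6], btype="bandpass", output="sos", fs=FS)
    filtered = sosfiltfilt(sos, trace)          # zero-phase filtering
    envelope = np.abs(hilbert(filtered))        # signal envelope
    peak_idx = int(np.argmax(envelope))         # main echo peak
    return peak_idx / FS - t_emit_s

# Synthetic check: a 1 MHz Gaussian-windowed burst centered 100 us into
# one 208 us window segment.
t = np.arange(int(208e-6 * FS)) / FS
trace = np.sin(2 * np.pi * 1e6 * t) * np.exp(-((t - 100e-6) / 5e-6) ** 2)
tof = echo_tof(trace)
```

A real implementation would additionally gate out the first wave packet (the transducer-junction reflection noted above) before searching for the echo peak.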

IV. GESTURE RECOGNITION

A. Experimental Protocol
In this study, we conducted a static hand gesture experiment to evaluate the capability of the ultrasonic-waveguide-integrated artificial skin to recognize different hand gestures. Ten participants (five males, five females; average age: 24.8 ± 2.8 years; average hand length: 17.8 ± 1.2 cm; average palm breadth: 8.4 ± 0.9 cm) participated in the experiment. Written informed consent was obtained from each participant enrolled in this study. Research protocols were conducted in accordance with the Declaration of Helsinki. Personal information and samples were de-identified and analyzed anonymously. The experiment involved participants wearing the developed system and performing static hand gestures. Two groups of gestures were selected, considering their relevance and frequent use in specific contexts. The first group, which we termed the daily life gesture set (Fig. 5(a)), included eight gestures primarily chosen for their widespread application in rehabilitation exercises. The second group consisted of the digits 0-9 from American Sign Language (ASL) (Fig. 5(b)), the most commonly used sign language in the world and one frequently employed in human-computer interaction research for sign language translation studies [35], [36]. Each participant was instructed to perform the corresponding gestures for ten trials, with each gesture lasting 5 s per trial. Pictographic cues were provided, displaying the target gesture's image on the screen for 5 s, with 5 s of rest between trials (if desired), adhering to a protocol commonly used in prior studies [9], [37]. During each gesture's execution, a signal snapshot was captured and appended to a CSV file through a Python program. The data collection duration for each subject ranged from 1 to 1.5 hours.
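The snapshot-logging step mentioned above can be sketched as follows. The row layout (gesture label, trial index, then one value per waveguide) and the function name are hypothetical illustrations, not the authors' actual file schema.

```python
# Hypothetical sketch of appending one labeled snapshot per trial to a
# CSV file; the column layout is an illustrative assumption.
import csv

def log_snapshot(path: str, gesture: str, trial: int,
                 values: list[float]) -> None:
    """Append one row: gesture label, trial index, per-waveguide values."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([gesture, trial, *values])

# Example: five per-finger values recorded for one trial of one gesture.
log_snapshot("session.csv", "fist", 1, [52.1, 55.3, 56.0, 54.8, 53.2])
```

Appending row by row keeps partial data intact if a session is interrupted, which suits the trial-by-trial protocol described above.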

B. Data Analysis
Upon completing the data collection process, we obtained 800 samples for the daily life gesture set, which consisted of eight different gestures, each performed 10 times by 10 participants. Additionally, for the ASL gesture dataset, comprising ten different gestures, we acquired 1000 samples, with each gesture performed 10 times by 10 participants. The collected data were stored on a computer for offline data analysis. These data were preprocessed by checking for missing values and outliers and annotating the data. During feature extraction, the TOF was computed for each sample, producing five distinct values per sample corresponding to the TOF values of each ultrasonic waveguide attached to the five MCP finger joints of the human hand.
Different machine learning algorithms, including Multilayer Perceptron (MLP), Decision Trees (DT), K-Nearest Neighbour (KNN), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGB), were applied to classify the extracted TOF features from the preprocessed data. LOSOCV was employed to evaluate the classification performance, and the two groups of gestures were processed separately. LOSOCV involves reserving the data from one subject for evaluation while training the model on the remaining subjects' data [38]. By training the model in a subject-independent manner, the model can accurately predict hand gestures from new users. Classification accuracy was calculated as the ratio of correctly classified samples to total samples, providing a suitable metric for balanced or nearly balanced datasets, which is the case in this study.
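The LOSOCV scheme described above can be sketched with scikit-learn, whose `LeaveOneGroupOut` splitter implements leave-one-subject-out validation when each group is one subject. The data below are synthetic stand-ins (the real features are the five per-joint TOF values per sample), and the MLP settings are illustrative, not the authors' hyperparameters.

```python
# Illustrative LOSOCV sketch: one group per subject, one 5-element TOF
# feature vector per sample, gesture labels as targets.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_subjects, n_gestures, n_trials = 10, 8, 10  # daily life set: 800 samples

# Synthetic stand-in data with gesture-dependent feature means.
y = np.tile(np.repeat(np.arange(n_gestures), n_trials), n_subjects)
groups = np.repeat(np.arange(n_subjects), n_gestures * n_trials)
X = y[:, None] + rng.normal(scale=0.3, size=(y.size, 5))

# Scaling plus MLP; hidden-layer size and iteration cap are assumptions.
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(32,),
                                  max_iter=500, random_state=0))
scores = cross_val_score(clf, X, y, cv=LeaveOneGroupOut(), groups=groups)
print(f"LOSOCV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Each of the ten folds trains on nine subjects and tests on the held-out one, so the reported mean estimates performance on an unseen user, as in the paper.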

C. Results
Soft ultrasonic waveguides, strategically positioned on the dorsal surface of the hand at each finger's MCP joint, serve as the core sensing element of the proposed system. These waveguides function by transmitting ultrasonic waves along the finger joints and receiving their echoes. When a finger joint is flexed, the waveguides stretch, thereby increasing the path length for the ultrasonic waves and, consequently, the TOF of the received echo signal. TOF values, a key feature extracted from the acoustic signals, are plotted for a representative subject's data (Fig. 5), demonstrating a clear relationship with finger flexion and highlighting their importance for detecting distinct static hand gestures. Notably, TOF values remain low when none of the MCP joints are flexed, and increase upon the flexion of any MCP joint, reflecting the corresponding waveguide's elongation. This increase persists until the joint returns to its extended state, causing the TOF values to drop back down. Each gesture exhibits a specific range of TOF values, highlighting the system's excellent repeatability across trials. In Fig. 5, the signals acquired from the ultrasonic waveguides on the thumb, index, middle, ring, and pinky fingers' MCP joints are distinguished by different colors.
Fig. 6 compares the average accuracy obtained by different machine learning models for each gesture set using LOSOCV. The MLP model outperformed the others, yielding accuracy values of 91.13% and 88.5% for the daily life and ASL gesture sets, respectively. In contrast, the DT model recorded the lowest accuracy for the daily life set at 86.5%, while the KNN model was least effective for the ASL set at 80.4%. In general, all models performed better on the daily life gesture set than on the ASL gesture set. It is worth noting that the accuracy values reported in this study appear lower compared to those in other studies [35], [39]. However, this discrepancy is not attributed to the model itself but rather to the evaluation approach. Earlier research has also corroborated that performance results obtained with LOSOCV are notably lower compared to other evaluation methods [40]. Although the classification accuracy values are slightly lower, LOSOCV provides a subject-independent analysis that estimates the model's performance for new subjects. Consequently, for real-world applications where it is not feasible to include data from every potential user in the training set, LOSOCV is preferred to ensure reliable model evaluation.
The model's performance could be influenced by how closely the features of the validation subject's data align with those of the training subjects. Hence, there could be significant variability in model accuracy across different subjects, which accentuates the necessity of employing LOSOCV to capture subject-specific variations effectively. Moreover, the variance in standard deviations across models underscores differences in model consistency. Consequently, when contemplating the adoption of a single generic model for all users, it is imperative to consider the standard deviation during the model selection phase.

V. DISCUSSION
This paper presents a novel, soft, and stretchable wearable sensing skin based on an acoustic sensing approach for gesture recognition. Ultrasonic waveguides serve as the sensing element and are integrated into soft artificial skin to form the sensing skin. This system offers enhanced stretchability and comfort compared to traditional rigid sensors and data gloves, while its use of near-transparent Ecoflex™ elastomers ensures the device is aesthetically pleasing and less noticeable, making it ideal for continuous wear. Soft ultrasonic waveguides, unlike vision-based and optical systems, offer greater reliability across various settings due to their reduced susceptibility to environmental changes such as lighting, temperature, and humidity, ensuring consistent performance under diverse conditions.
A streamlined manufacturing technique for soft acoustic waveguides has been introduced to address the issue of breakage at the transducer-polymer boundary and to enhance the quality of acoustic signals. This method ensures a safe demolding process for the waveguide and also enables the complete embedding of the transducer within the waveguide, further enhancing the acoustic signal quality. Any imperfections in the current manual integration of piezoelectric transducers into the waveguides can lead to inconsistent acoustic transmission and reception, underscoring the need for precise manufacturing to enhance gesture recognition accuracy. The inclusion of a hand frame in the sensing skin allows for secure attachment to the human hand while facilitating easy wearing and removal. In terms of wearable comfort, the materials utilized in fabricating the artificial skin and ultrasonic waveguide are carefully selected to ensure that the developed system exhibits attributes such as high softness, stretchability, compliance with finger flexion, and adaptability to various hand sizes, all while minimizing potential interference. Material choices were also optimized to minimize acoustic signal loss, considering that properties such as elasticity, density, and acoustic impedance significantly affect ultrasonic wave propagation and can lead to measurement errors.
The fabrication of these sensing skins does not necessitate the use of clean room facilities or intricate manufacturing procedures, making their production simple and cost-effective. Although the current lab-based fabrication relies on manual steps, the designed measurement process and the electronics used are compatible with industry-standard automated processes, enabling future large-scale production. This advantage positions our method for broader adoption within soft wearable technology compared to previously reported soft sensors [41], [42].
Though prior works [25], [26], [27], [28], [29] leverage data from all finger joints, potentially offering a wider information range, we focus on the MCP joint to reduce the system's footprint, prioritizing wearability for long-term rehabilitation use, as many common rehabilitation gestures rely primarily on MCP movement. Moreover, the sensing skin's placement exclusively on the hand's dorsal part, with the fingertips left exposed, guarantees that the palm and finger sensations remain unaffected. This approach, while potentially introducing minor limitations in range of motion (ROM) and function, minimizes these drawbacks due to the reduced sensor footprint, making it significantly more suitable for long-term wear and real-world applications than bulky data gloves that can significantly restrict dexterity and grasping capabilities.
Variability in gesture performance among subjects can lead to inconsistent acoustic signatures. This necessitates subject-independent models trained on diverse data from multiple subjects, enhancing the system's ability to accurately classify a wide range of gesture variations from unseen users. The experimental validation of this study involves subject-independent hand gesture recognition employing various machine learning models, assessed through LOSOCV. TOF was leveraged as the key feature for the machine learning classification task, given its demonstrated distinction as the most prominent feature of the ultrasonic waveguides (Fig. 5). This system directly measures the movement of finger joints by capturing the TOF of acoustic waves, offering a more precise and direct measurement of hand gestures than methods that infer movement from secondary data, which can introduce errors or necessitate complex algorithms.
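TOF estimation from the transmitted and received waveforms can be sketched with a standard cross-correlation approach. This is an illustrative sketch only, not the paper's signal-processing pipeline: the sampling rate, burst frequency, and delay are made-up values, and the paper does not specify how it extracts TOF from the filtered signal (Fig. 4).

```python
import numpy as np

def estimate_tof(tx: np.ndarray, rx: np.ndarray, fs: float) -> float:
    """Estimate time of flight (seconds) as the lag that maximizes the
    cross-correlation between transmitted and received signals."""
    corr = np.correlate(rx, tx, mode="full")
    lag = np.argmax(corr) - (len(tx) - 1)  # lag in samples
    return max(lag, 0) / fs

# Synthetic demo: a windowed ultrasonic burst delayed by 50 samples.
fs = 1_000_000.0  # 1 MHz sampling rate (illustrative, not from the paper)
t = np.arange(200) / fs
burst = np.sin(2 * np.pi * 40_000 * t) * np.hanning(len(t))  # assumed 40 kHz burst
delay = 50
rx = np.zeros(1000)
rx[delay:delay + len(burst)] += 0.5 * burst  # attenuated, delayed echo
tof = estimate_tof(burst, rx, fs)  # close to delay / fs
```

A stretch of the waveguide lengthens the acoustic path, increasing the measured TOF; this monotonic relation is what lets TOF serve as a proxy for MCP joint flexion.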
Among the machine learning models evaluated for gesture recognition, the MLP classifier achieved the highest performance for both gesture sets (Fig. 6). Considering the inherent sensitivity of machine learning models to the choice of hyperparameters, there is an opportunity for improved accuracy through further hyperparameter tuning. Our study employs LOSOCV to ensure that estimates are representative of the performance expected from new users, as it uses distinct subjects' data for model training and testing. Such an approach not only allows the system to be used effectively by new users without personalized calibration, enhancing scalability and ease of use, but also facilitates straightforward integration into real-time applications such as rehabilitation monitoring and sign language translation. This situation closely resembles real-world applications where it is difficult or impractical to require new users to train the model with their data before use.
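The hyperparameter tuning opportunity mentioned above could be pursued with a subject-aware grid search, sketched below under assumptions: the data are random placeholders, and the search grid and MLP settings are illustrative rather than taken from the paper. Using a group-based splitter inside the search keeps each subject's data out of its own tuning fold, so the chosen hyperparameters do not exploit subject-specific patterns.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneGroupOut
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.standard_normal((150, 5))       # placeholder TOF features
y = rng.integers(0, 3, size=150)        # placeholder gesture labels
groups = np.repeat(np.arange(5), 30)    # 5 subjects, 30 samples each

pipe = Pipeline([("scale", StandardScaler()),
                 ("mlp", MLPClassifier(max_iter=500, random_state=0))])
param_grid = {
    "mlp__hidden_layer_sizes": [(32,), (64,), (64, 32)],
    "mlp__alpha": [1e-4, 1e-3],
}
# Leave-one-subject-out folds inside the search mirror the LOSOCV evaluation.
search = GridSearchCV(pipe, param_grid, cv=LeaveOneGroupOut())
search.fit(X, y, groups=groups)
best_params, best_score = search.best_params_, search.best_score_
```

In practice a nested scheme (an inner group-wise search within each outer LOSOCV fold) would be needed to keep the final accuracy estimate unbiased.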
The relative dip in classification accuracy observed with the LOSOCV evaluation method underscores the necessity of devising models tailored for higher accuracy for new users. One potential approach is model personalization that leverages user similarities [43]. Overall, the experiments emphasize the significance of employing LOSOCV to assess a machine learning model's performance for new users in human-computer interaction research.
The classification accuracy of the proposed system was compared to that of other subject-independent wearable hand gesture recognition systems (Table I). It is important to note that achieving completely objective comparisons between systems is unattainable due to the following factors: (i) variations in the number, types, technology, and placement of sensors among the systems; (ii) researchers building their systems on their own unique datasets, which may differ in data acquisition methods (e.g., sample count per gesture, sampling rates) and the number of gesture classes; (iii) differences in preprocessing techniques; (iv) differences in feature extraction methods employed; and (v) differences in classifiers adopted with varying parameter settings. Even though a completely objective comparison is challenging, we can highlight some observations regarding the classification performance when comparing other works to our proposed system. Notably, our results surpassed those of wrist-based or forearm-based techniques, such as surface electromyography (sEMG)-based methods [44], [45], [46]. This observation could be attributed to the proximity of our sensors to the finger movements, which results in larger signal variations. The classification outcomes presented in this study exhibited lower performance than previously suggested systems that employed finger joint motion sensors [47], [48]; however, these systems are rigid and relatively bulky. Although the proposed system shows promise for practical embedded applications, a potential limitation is that it has not been integrated into a stand-alone setup encompassing a microcontroller, power supply, and wireless communication. Future research should concentrate on developing battery-free and low-power systems [49]. Additionally, efforts could be made to integrate the EchoGest system with commercially available wristbands and smartwatches, and to adopt an experimental protocol that resembles real-life scenarios.
While the current system effectively addresses the needs of many common rehabilitation exercises through MCP joint data analysis, the potential benefits of multi-joint assessments for more intricate gesture recognition tasks are acknowledged. Hence, another area of exploration focuses on optimal placement strategies for a limited number of soft ultrasonic waveguides. This optimization would aim to capture a broader range of hand motions while maintaining the critical aspects of user comfort and wearability for extended use. Future research directions could also investigate the potential benefits of incorporating anatomical normalization techniques or alternative feature extraction methods to account for variations in hand anthropometry across a wider and more diverse participant pool. Furthermore, to expand the developed model's applicability beyond rehabilitation and sign language gestures, future work can explore incorporating domain-independent feature extraction and transfer learning techniques. Additionally, continual learning approaches hold promise for updating the model with new gestures over time, further enhancing its gesture recognition capabilities.

VI. CONCLUSION
In this paper, we present EchoGest, a soft, transparent, lightweight, and stretchable sensing skin integrated with ultrasonic waveguides for subject-independent hand gesture recognition. The work delves into the design, fabrication, and integration of a sensing skin based on ultrasonic waveguides. A new method to manufacture soft ultrasonic waveguides was introduced, aimed at resolving the breakage at the transducer-elastomer interface, thereby improving the quality of acoustic signals. Experiments were conducted to validate the proposed system's performance for hand gesture recognition through Leave-One-Subject-Out Cross-Validation (LOSOCV). The experimental results highlight the system's capability to generalize to new users and its significant potential for practical implementation. The proposed EchoGest system holds promise for advancing human-computer interaction, demonstrating substantial potential in rehabilitation, assisting in hand mobility and therapy exercises, and in sign language translation, enabling effective communication for the hearing impaired.

Fig. 3. The developed system's architecture and signal transmission flow of the electronic circuit.

Fig. 4. Filtered analog signal collected by the oscilloscope channel of the AD2.

Fig. 6. Classification accuracy of different classifiers for the two gesture sets.