EMG gesture signal analysis towards diagnosis of upper limb using dual-pathway convolutional neural network

Abstract: This research introduces a novel dual-pathway convolutional neural network


Introduction
Classifying upper limb gestures using multichannel surface electromyography (sEMG) poses a formidable challenge with implications for both diagnostic and therapeutic applications [1,2]. The inherent non-linear and stochastic nature of sEMG signals complicates the precise categorization of upper limb gestures [3,4]. The interplay among various muscles, coupled with their intricate signaling patterns, makes it difficult to extract meaningful information from sEMG recordings [5,6]. This challenge is particularly pronounced when sEMG signals are used to control electromechanical hand prostheses, where precision and reliability are paramount [7,8]. The variability of sEMG signals, influenced by factors such as muscle fatigue, electrode placement, and individual anatomical differences, further complicates accurate and robust gesture classification [9-11]. Effective sEMG classification across diverse applications therefore requires methods that address these characteristics of upper limb gesture signals.
Machine learning and deep learning techniques have emerged as transformative tools in information processing, providing strong capabilities for tasks such as pattern recognition [12,13], feature extraction [14,15], and classification [16-18]. In the context of sEMG for gesture classification, numerous studies have explored the applicability of these techniques [19,20]. For instance, Saeed et al. [21] applied machine learning techniques to raw signals from the DB1 dataset, achieving accuracies of 85.41% and 91.14% using an artificial neural network (ANN) and linear discriminant analysis (LDA), respectively. Karnam et al. [22] achieved a classification accuracy of 88.8% on DB1 using K-nearest neighbours (KNN). Akmal et al. [23] explored training strategies for ANNs, a pivotal aspect of sEMG signal classification. Their study analyzed twelve different training strategies and evaluated their performance on multiday EMG data. The results highlight the resilience of backpropagation and scaled conjugate gradient methods, providing insight into training approaches for efficient prosthetic control. SVM-based classification of prosthetic finger movements was investigated for real-time implementation in [24]; running an SVM on a Raspberry Pi, that study achieved 78% classification accuracy. Inam et al. [25] explored gender-specific considerations in sEMG for upper-limb prosthetics, evaluating EMG differences between males and females with an ANN classifier. While overall similarities were observed, certain features exhibited gender-specific variations, which points to the importance of tailored approaches for diverse user populations.
In the domain of deep learning, Hu et al. [26] implemented a recurrent neural network (RNN) and a convolutional neural network (CNN) on sEMG signals from NinaPro DB1, attaining an accuracy of 87%. Pancholi et al. [27] applied a CNN model to various NinaPro datasets, achieving classification accuracies ranging from 81.67% to 99.11%. Cheng et al. [28] utilized the NinaPro DB1 dataset, applying a CNN to sEMG-feature-images extracted from raw signals and achieving a classification accuracy of 82.54%. Additionally, Tong et al. [29] applied a hybrid classifier based on a CNN and long short-term memory (LSTM) to the NinaPro DB1 dataset, yielding an accuracy of 78.31%. The study in [30] proposed a concatenate feature fusion recurrent convolutional neural network (CFF-RCNN) built on a concatenate feature fusion (CFF) strategy; it reported 88.87% on DB1, 99.51% on DB2, and 99.29% on DB4 with over 50 gestures. Qureshi et al. [31] introduced the efficient concatenated convolutional neural network (E2CNN) as a robust solution for real-time sEMG classification. By converting raw sEMG signals into Log-Mel spectrograms (LMS) and employing concatenation layers, E2CNN achieves high accuracy and short response times for both non-disabled and amputee subjects, positioning it as a potential candidate for prosthetic control in real-world scenarios. Another study [32] also explored improving myoelectric control in wearable prostheses using CNNs. Comparing multiday sEMG recordings, the proposed CNN exhibited superior accuracy for able-bodied and amputee subjects in within-day and between-day analyses. This research underscores the CNN's efficacy and computational efficiency, presenting a promising avenue for enhancing prosthetic hand control.
In this study, we introduce a dual-pathway convolutional neural network (DP-CNN) to classify sEMG signals from both healthy and amputee subjects. The novelty of this architecture resides in its ability to process Log-Mel spectrogram (LMS) images derived from raw multichannel EMG signals. Utilizing spectrograms instead of raw EMG signals has demonstrated enhanced performance across various studies in the field [27,31-33], in which LMS improved the efficiency of the classification system. LMS effectively capture the time-frequency characteristics of EMG signals [34], and they are commonly used in applications including driver fatigue detection and EMG classification [32]. For instance, in a study focused on driver fatigue detection, a model based on LMS and a convolution recurrent neural network (CRNN) was proposed and demonstrated high accuracy in distinguishing between alert and fatigued states [34]. LMS also improve the performance of deep learning models: a CNN achieved a classification accuracy of over 90% on healthy and amputee subjects when applied to LMS-based images [31]. Furthermore, LMS images derived from EMG signals have been employed successfully in hybrid deep-learning methods to classify four different EMG-signal patterns [35]. Moreover, LMS also offers a data augmentation method, which has been shown to significantly enhance the accuracy of deep learning models in classifying EMG signals [33].
The rationale behind this preference for spectrograms lies in their ability to capture the time-frequency characteristics inherent in EMG signals. Spectrograms, particularly LMS, provide a more comprehensive representation of the signal, highlighting patterns that might be obscured in raw EMG data. This richer representation aids in extracting information from the signals, contributing to increased effectiveness in classification tasks. EMG signals exhibit complex and dynamic frequency patterns that convey information about muscle contractions. The Mel-frequency bands in LMS provide an effective means of representing these patterns, offering a more compact and discriminative representation of the signal.
The Mel scale in LMS compresses the higher frequencies and gives finer resolution to the lower-frequency components, while the logarithmic transformation compresses the dynamic range of the spectral amplitudes. This is beneficial for EMG analysis, as the lower frequencies often contain valuable information related to muscle activities and gestures.
In alignment with these findings, the current study adopts spectrograms, specifically LMS, as the primary input for the proposed DP-CNN. The choice is grounded in the efficacy of spectrograms, particularly LMS, demonstrated in previous literature. The LMS technique transforms raw EMG signals into a visual representation that emphasizes the spectral features relevant for gesture classification. By utilizing Mel-frequency bands, which are perceptually spaced to mimic human hearing, and applying a logarithmic scale, the LMS efficiently captures the frequency patterns within EMG signals.
By implementing this methodology, we aim to validate the performance of the DP-CNN on LMS images extracted from sEMG signals, thereby offering a robust, reliable, and real-time solution for prosthetic control applications. The proposed DP-CNN architecture has been implemented on sEMG recordings of healthy subjects taken from NinaPro DB1 and DB2 and of amputee subjects taken from NinaPro DB3. The DP-CNN is based on the CNN, and its input is provided as spectrogram images: the LMS images are obtained from the multichannel EMG signals available in NinaPro Databases 1 (DB1), 2 (DB2), and 3 (DB3).
The main contributions of this paper are:
• Preprocessing raw EMG signals into Log-Mel spectrogram images, enhancing feature extraction.
• Development of a dual-pathway convolutional neural network (DP-CNN) combining convolutional and dense pathways for robust EMG signal classification.
• Extensive assessment of the proposed DP-CNN's effectiveness and generalizability across diverse datasets, including both healthy and amputee subjects.
• Thorough comparison of the DP-CNN's performance against prior works on the same dataset, providing insights into advancements.
• Benchmarking against pre-trained transfer learning models (AlexNet, MobileNet, VGG19, DenseNet121, ResNet50), showcasing the uniqueness and efficacy of the proposed approach.
The paper is organized as follows: In Section 2, we provide a detailed description of the methodology used in this study. We first discuss the datasets used and the pre-processing steps undertaken to prepare the data for analysis. We then provide the details of the proposed deep neural network (DNN) used for classification. Section 3 presents the results obtained from the proposed methodology and discusses the implications of our findings and their potential applications. Section 4 concludes the paper.

Experimental datasets and setup description
In this study, we use three datasets from a publicly available database: NinaPro Database 1 (DB1) [36], Database 2 (DB2) [37], and Database 3 (DB3) [38] for investigation and validation of the proposed methodology. The details of each dataset are given below:
1) The first NinaPro database contains 10 repetitions of 52 different movements carried out by 27 intact subjects and serves as a standard dataset for the classification of myoelectric motions. The DB1 dataset includes three exercises categorized into (A) basic movements of the fingers, (B) isometric, isotonic hand configurations and basic wrist movements, and (C) grasping and functional movements. Ten Otto Bock MyoBock 13E200 electrodes are used to collect sEMG data; a Cyberglove 2 data glove is used to collect kinematic data. Each subject and exercise has a corresponding MATLAB file in the database that contains information about the subject, the exercise, the electrodes' sEMG signal, the 22 cyberglove sensors' uncalibrated signal, the subject's repeated movement, and the stimulus' repeated occurrence.
2) The DB2 dataset comprises three exercises: (A) basic movements of fingers and wrist, (B) grasping and functional movements, and (C) force patterns. To collect kinematic data, a data glove (Cyberglove 2) and an accelerometer on the wrist were used, while a Delsys Trigno Wireless EMG system with 12 active double-differential wireless electrodes was used to record muscular activity. The sampling rate for sEMG signals is 2 kHz. Each exercise and subject has a synchronized MATLAB file that contains various variables, such as subject and exercise number, sEMG signal, kinematic information, inclinometer signal, movement repeated by the subject, and force recorded during the third exercise. Additionally, force sensor calibration values for the least and highest force are included in each file.
3) The third NinaPro database is a comprehensive resource for the development and evaluation of naturally controlled non-invasive robotic hand prostheses. The experiment consists of the same three exercises as DB2: basic finger and wrist motions, grasping and functional movements, and force patterns. The dataset includes 49 different movements (including rest) performed by 11 amputee participants, with each movement repeated 6 times; the movements were chosen from the hand taxonomy as well as literature on hand robotics. A Delsys Trigno Wireless EMG system with 12 active double-differential wireless electrodes was used to collect the muscular activity. The database provides one MATLAB file with synchronized variables for each exercise and subject, including subject and exercise number, sEMG signal, kinematic information, inclinometer signal, movement repeated by the subject, and force recorded during the third exercise. The collection also contains force sensor calibration information for the minimum and maximum force.
The number of repetitions of each gesture is ten in DB1, and six in DB2 and DB3. For this study, we randomly selected ten subjects from each of DB1 and DB2, five male and five female in each case, and ten subjects from DB3 for validation of the proposed technique. We utilized the movements common to the three databases; exercise C of DB1 and exercise B of DB2 and DB3 are the only common gestures among the datasets. The exercise is composed of 23 hand gestures and is illustrated in Figure 1.
Figure 1. Gestures utilized in this study [36].

Preprocessing technique
The data in this study were recorded using different numbers of channels in the three databases: 10 channels in DB1, and 12 channels in DB2 and DB3. The sampling rate was 100 Hz for DB1 and 2 kHz for DB2 and DB3. The DB1 data were already shielded from power line noise, whereas the DB2 and DB3 data were not [36]. Therefore, a 50 Hz second-order Butterworth notch filter was applied to DB2 and DB3. Furthermore, the DB1 data were low-pass filtered at 1 Hz using a second-order Butterworth filter [39]. In DB1, the available signals are root mean square (RMS) values of the raw signals, while in DB2 and DB3 the raw EMG signals are available. For EMG-based prosthetics, signal segments with durations between 150 ms and 250 ms are recommended [40,41]. Therefore, each EMG signal was divided into smaller segments with a duration of 200 ms and an overlapping increment of 50 ms. Since the number of repetitions differs between the datasets, we obtained different numbers of segmented signals. Specifically, for each subject we extracted 5750 × 10 segmented signals from DB1 and 14,950 × 12 segmented signals from DB2 and DB3, where 10 and 12 are the numbers of channels in each signal.
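The segmentation step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function name `segment_signal` is hypothetical, and it assumes that the 50 ms increment is the step between the starts of successive 200 ms windows.

```python
import numpy as np

def segment_signal(emg, fs, win_ms=200, step_ms=50):
    """Split a (samples, channels) recording into overlapping windows.

    Assumption: step_ms is the distance between consecutive window starts.
    Returns an array of shape (n_windows, win_len, channels).
    """
    win_len = int(round(fs * win_ms / 1000))   # samples per window
    step = int(round(fs * step_ms / 1000))     # samples between window starts
    if emg.shape[0] < win_len:
        return np.empty((0, win_len, emg.shape[1]), dtype=emg.dtype)
    starts = range(0, emg.shape[0] - win_len + 1, step)
    return np.stack([emg[s:s + win_len] for s in starts])

# Hypothetical 2-second, 12-channel recording at 2 kHz (DB2/DB3-like layout).
rng = np.random.default_rng(0)
demo = rng.standard_normal((4000, 12))
windows = segment_signal(demo, fs=2000)
print(windows.shape)  # (37, 400, 12)
```

At the 100 Hz rate of DB1 the same call yields 20-sample windows with a 5-sample step.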
After segmentation, each signal was converted into a Log-Mel spectrogram (LMS) image using the librosa library in Python. The LMS is a representation of the spectral content of a signal on a logarithmic frequency scale. By using LMS, we were able to analyze the frequency content of the segmented signals, which can provide more information for classification.
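Let us define a segmented signal $s_w(t)$ with length $L$ and sampling frequency $f_{s_w}$ in hertz. Its short-term Fourier transform (STFT) $S_w$ is then given by

$$S_w(\tilde{x}, \tilde{y}) = \sum_{t=0}^{\tau-1} s_w(t + \tilde{x}H)\, w(t)\, e^{-2\pi i \tilde{y} t / N}. \tag{2.1}$$

Here, $H \in \mathbb{N}$ represents the hop length, $w : [0 : \tau-1] \to \mathbb{R}$ is the Hann window defined as $w(t) = 0.5 - 0.5\cos\left(\frac{2\pi t}{\tau-1}\right)$, where $\tau \in \mathbb{N}$ is the length of $w$, $\tilde{x} \in [0 : \frac{L-\tau}{H}]$ denotes the time index, and $\tilde{y} \in [0 : \frac{N}{2}]$ denotes the frequency index, with $N$ the number of DFT points. The STFT spectrogram of $S_w$ can be obtained as

$$S_{STFT}(\tilde{x}, \tilde{y}) = \left|S_w(\tilde{x}, \tilde{y})\right|^2. \tag{2.2}$$

The Mel spectrum and linear frequency are related by $f_{mel} = 2595 \times \log_{10}\left(1 + \frac{f}{700}\right)$. We can estimate the LMS using

$$\log_{10}\left(M_{FB}(\tilde{x}, \tilde{y}) \cdot S_{STFT}(\tilde{x}, \tilde{y})\right). \tag{2.3}$$

Here, $M_{FB}(\tilde{x}, \tilde{y})$ is the Mel filter bank, which can be estimated from the linear frequency $f(\tilde{y})$ and the center frequencies $f_c(\tilde{x}) = \tilde{x} \cdot \delta f_{mel}$ on the Mel scale.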
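The chain from windowed STFT to Log-Mel values can be made concrete with a small NumPy-only sketch. It is not the librosa pipeline used in the study: the triangular filter shape, the window and hop lengths, the number of Mel bands, and the small `eps` added before the logarithm are all assumptions chosen for illustration, and the filter bank is applied as a weighted sum over frequency bins, which is the usual convention.

```python
import numpy as np

def hz_to_mel(f):
    # Mel scale: 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filter_bank(n_mels, n_fft, fs):
    """Triangular filters (an assumed shape), array of (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, fs / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge points equally spaced on the Mel scale
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, centre, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (centre - lo)
        falling = (hi - fft_freqs) / (hi - centre)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_spectrogram(signal, fs, win_len, hop, n_mels, eps=1e-10):
    """Return an (n_mels, n_frames) Log-Mel spectrogram of a 1-D signal."""
    t = np.arange(win_len)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * t / (win_len - 1))  # Hann window
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=win_len, axis=1)) ** 2   # |STFT|^2
    mel_power = mel_filter_bank(n_mels, win_len, fs) @ power.T
    return np.log10(mel_power + eps)  # eps avoids log10(0); an assumption

# Hypothetical 200 ms segment at 2 kHz containing a 120 Hz tone.
fs = 2000
segment = np.sin(2 * np.pi * 120 * np.arange(400) / fs)
lms = log_mel_spectrogram(segment, fs, win_len=64, hop=16, n_mels=16)
print(lms.shape)  # (16, 22)
```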
For each windowed signal in DB1, we convert it into an LMS individually. This process is repeated for all ten channels, providing us with ten LMS images. We then combine these ten LMS vertically to form an input image, as shown in Figure 2(a). This process yields 5750 EMG images as LMS, which serve as the input dataset to the DP-CNN model for each subject in DB1. We apply the same technique to each windowed signal in DB2 and DB3, resulting in twelve LMS for each signal. These twelve LMS are combined vertically to form an image, as illustrated in Figure 2(b),(c). This process resulted in 14,950 EMG images as LMS for each subject in DB2 and DB3.
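The vertical combination of per-channel spectrograms amounts to a row-wise stack. The sketch below uses a hypothetical helper `stack_channels`; the min-max scaling to 8-bit grey levels is an assumption, since the text does not say how the stacked array is rendered as an image.

```python
import numpy as np

def stack_channels(lms_per_channel):
    """Stack per-channel LMS arrays (each n_mels x n_frames) top to bottom.

    Assumption: the result is min-max scaled to 8-bit grey levels.
    """
    image = np.vstack(lms_per_channel)
    lo, hi = image.min(), image.max()
    scaled = (image - lo) / (hi - lo) if hi > lo else np.zeros_like(image)
    return np.round(255 * scaled).astype(np.uint8)

# Hypothetical example: ten channels (DB1-like), each a 16 x 22 LMS.
rng = np.random.default_rng(0)
channels = [rng.standard_normal((16, 22)) for _ in range(10)]
image = stack_channels(channels)
print(image.shape)  # (160, 22)
```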

Dual pathway convolutional neural network architecture
We propose a dual-pathway convolutional neural network (DP-CNN) with batch normalization and max-pooling functions along with dropout layers for classification of electromyogram (EMG) signals.
The proposed model is designed to operate on LMS as input, with a fixed feature size of 224 × 224. The DP-CNN comprises two pathways, a traditional convolutional pathway and a dense pathway, which are combined using a concatenation layer. The input to the DP-CNN is denoted as $S_{LM}(\tilde{x}, \tilde{y})$, and the convolution operation, following [42], is given as

$$f(S_{LM} \mid \theta) = f_N\left(\cdots f_2\left(f_1(S_{LM} \mid \theta_1) \mid \theta_2\right) \cdots \mid \theta_N\right). \tag{2.5}$$

Here, $f_n$ represents the $n$-th layer of the DP-CNN, and $N$ is the total number of layers used, which in this study is set to 8. The parameter for the $n$-th layer is denoted as $\theta_n = [X, b]$. We can express the convolutional layer operations in the following way:

$$f_n(I_n \mid \theta_n) = h(X * I_n + b). \tag{2.6}$$

Here, $I_n$ represents the input of the $n$-th layer, $X$ is the corresponding filter, $*$ denotes the valid convolution operation, $h(\cdot)$ denotes the pointwise activation function, and $b$ denotes the vector bias term.
In the proposed DP-CNN, the convolutional layers of the first pathway are estimated using

$$C_{i,j} = \sum_{k=1}^{n} \sum_{l=1}^{m} W_{k,l}\, I_{i+k-1,\, j+l-1} + b, \tag{2.7}$$

where $C_{i,j}$ is the value of the feature map for the $i$-th row and $j$-th column, $W_{k,l}$ is the weight for the $k$-th row and $l$-th column of the filter, $I_{i+k-1,\, j+l-1}$ is the value for the corresponding position in the input image, $b$ is the bias, and $n$ and $m$ are the dimensions of the filter. The activation function, rectified linear unit (ReLU), can be represented as

$$\mathrm{ReLU}(x) = \max(0, x). \tag{2.8}$$

The dense layer computes a weighted sum of its inputs,

$$y = \sum_{i=1}^{n} w_i x_i, \tag{2.9}$$

where $y$ is the output of the dense layer, $w_i$ is the weight for the $i$-th unit, $x_i$ is the input for the $i$-th unit, and $n$ is the number of units in the dense layer; the dropout layer $D$ is a binary mask that sets each corresponding value to 0 with a probability of 0.2.
The concatenation function combines the outputs of the two pathways into the final output $O$, where $[\cdot, \cdot]$ represents the concatenation operation of two arrays.
Overall, the proposed DP-CNN can be represented by where and and x j is the output of the previous dense layer.
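The feature-map sum and the ReLU activation described above can be written out directly for a single filter. This is a didactic sketch with explicit loops, not an efficient or complete layer; the function names and the tiny 4 × 4 input are illustrative only.

```python
import numpy as np

def conv2d_valid(image, kernel, bias=0.0):
    """C[i, j] = sum_k sum_l W[k, l] * I[i + k, j + l] + b (0-based indices).

    'Valid' means the filter never leaves the image, so the output shrinks
    by (filter size - 1) along each axis.
    """
    n, m = kernel.shape
    rows = image.shape[0] - n + 1
    cols = image.shape[1] - m + 1
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(kernel * image[i:i + n, j:j + m]) + bias
    return out

def relu(x):
    return np.maximum(0.0, x)

# Illustrative input: a 4 x 4 ramp and a 2 x 2 diagonal-difference filter.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])
feature_map = relu(conv2d_valid(image, kernel, bias=6.0))
print(feature_map.shape)  # (3, 3)
```

Every diagonal difference in the ramp equals -5, so with a bias of 6 each entry of the feature map is 1, and with a zero bias ReLU clips the whole map to 0.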
The architecture of the DP-CNN model is shown in Figure 3. The primary advantage of the convolutional pathway is its ability to automatically learn and extract relevant features from input images without the need for manual feature engineering. This is accomplished through the use of convolutional and pooling layers, which learn local and global patterns in the input data. The model is able to learn increasingly complex representations of the input images by stacking multiple convolutional layers on top of each other.
On the other hand, the advantage of the dense pathway is its ability to detect global patterns and relationships in the input data. This is accomplished through the use of fully connected layers that learn to combine features from all parts of the input data. The model is able to learn increasingly complex representations of the input data by stacking multiple fully connected layers on top of each other. Another benefit of the dense pathway is its ability to handle input data of any size and shape, as long as it can be converted to a one-dimensional format. This makes the model suitable for a wide range of input data types, such as text or time series data, which may not have the same spatial structure as images. Furthermore, the use of dropout layers in the dense pathway helps to regularize the model and prevent overfitting. Dropout layers randomly set a fraction of the activations in the previous layer to zero during training, which helps to reduce co-adaptation between neurons and forces the model to learn more robust features. The DP-CNN model can capture different types of information from the input images because it has two separate pathways for processing the input data. The convolutional pathway can extract spatial features from images by using convolutional and pooling layers, whereas the dense pathway can capture global patterns and relationships in input data by using fully connected layers. The model can make more accurate predictions by combining the outputs of both pathways.
The use of multiple pathways allows for greater regularization and reduces the risk of overfitting. By having two distinct pathways that process the input data in different ways, the model is less likely to memorize the training data and more likely to generalize well to new, previously unseen data. Furthermore, the use of dropout layers in the dense pathway provides additional regularization and helps to prevent overfitting.
For training the proposed DP-CNN model, 50 epochs and a batch size of 32 are used. The Adam optimizer is employed with a learning rate of 0.001. The input to the DP-CNN model is an image with a size of 224 × 224 × 3. Prior to feeding the images to the DP-CNN model, they are passed through a rescaling layer (RL), which scales the pixel values to the range of 0 to 1. This is done to ensure that the input data is within a suitable range for the neural network model. The rescaled input images are then provided as input to the DP-CNN model for training.
Convolutional pathway: The input images processed by the rescaling layer are fed to the convolutional pathway. Details of each layer are given below:
• Layer 1: The first layer is composed of a convolutional layer of 32 filters with a filter size of 9 × 9, with ReLU activation and L2 regularization of 0.001. The convolutional layer is followed by a batch normalization layer and then a max-pooling layer with a pool size of 4 × 4.
• Layer 2: The second layer is similar to the first layer, with the same batch normalization and max-pooling layers, except that the number of filters is 48 and a filter size of 5 × 5 is used.
• Layer 3: The third layer is made up of a convolutional layer of 48 filters with a filter size of 3 × 3 and ReLU activation. A batch normalization layer follows the convolutional layer, and then a max-pooling layer is used with a pool size of 2 × 2.
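The spatial size of the feature maps after each convolutional block follows from simple arithmetic. The sketch below assumes stride-1 "valid" convolutions and non-overlapping pooling with the pooling stride equal to the pool size; neither the stride nor the padding is stated for these layers, so the resulting numbers are indicative only.

```python
def conv_out(size, kernel):
    """Output width of a stride-1 'valid' convolution (assumed setting)."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output width of non-overlapping max pooling (floor division)."""
    return size // pool

size = 224
for kernel, pool in [(9, 4), (5, 4), (3, 2)]:   # Layers 1 to 3
    size = pool_out(conv_out(size, kernel), pool)
    print(size)                                  # 54, then 12, then 5

features = size * size * 48                      # 48 filters in Layer 3
print(features)                                  # 1200
```

Under these assumptions the convolutional pathway would hand 1200 values to the concatenation layer once its output is flattened.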

Dense pathway:
The same input images from the rescaling layer are converted to one-dimensional data using a flatten layer and are then fed to the dense pathway:
• Layers 1 & 2: The first layer in the second pathway is a fully connected dense layer composed of 24 units, with ReLU as the activation function. The second layer is a dropout layer with a rate of 0.2.
• Layers 3 & 4: The third and fourth layers are similar to the first and second layers, except that the fully connected dense layer has 32 units.
• Layers 5 & 6: The fifth and sixth layers are similar to the above two layers, except that the fully connected dense layer has 48 units.

Concatenation and classification layer:
The data from the first and second pathways are combined using a concatenation layer. This layer allows the outputs of both pathways within a neural network to be combined into a single output. The last layer is composed of 23 units.
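The layer descriptions above can be assembled into a Keras model roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the function name `build_dp_cnn` is hypothetical, and the 1/255 rescaling factor, the flattening of the convolutional output before concatenation, the L2 term on the second convolutional layer, the softmax output, and the loss function are all assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_dp_cnn(input_shape=(224, 224, 3), n_classes=23):
    inputs = layers.Input(shape=input_shape)
    x0 = layers.Rescaling(1.0 / 255)(inputs)   # assumes 8-bit pixel values

    # Convolutional pathway (Layers 1 to 3)
    c = layers.Conv2D(32, 9, activation="relu",
                      kernel_regularizer=regularizers.l2(0.001))(x0)
    c = layers.BatchNormalization()(c)
    c = layers.MaxPooling2D(4)(c)
    c = layers.Conv2D(48, 5, activation="relu",
                      kernel_regularizer=regularizers.l2(0.001))(c)  # L2 assumed
    c = layers.BatchNormalization()(c)
    c = layers.MaxPooling2D(4)(c)
    c = layers.Conv2D(48, 3, activation="relu")(c)
    c = layers.BatchNormalization()(c)
    c = layers.MaxPooling2D(2)(c)
    c = layers.Flatten()(c)                     # assumed before concatenation

    # Dense pathway (Layers 1 to 6): Dense + Dropout pairs
    d = layers.Flatten()(x0)
    for units in (24, 32, 48):
        d = layers.Dense(units, activation="relu")(d)
        d = layers.Dropout(0.2)(d)

    # Concatenation and classification layer
    merged = layers.Concatenate()([c, d])
    outputs = layers.Dense(n_classes, activation="softmax")(merged)  # softmax assumed

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="sparse_categorical_crossentropy",  # assumed loss
                  metrics=["accuracy"])
    return model

model = build_dp_cnn()
```

Training with the stated settings would then be a call such as `model.fit(train_images, train_labels, epochs=50, batch_size=32)`, where `train_images` and `train_labels` are placeholder names for the per-subject LMS images and integer gesture labels.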

Performance evaluation metrics
The performance of the proposed DP-CNN is evaluated using multi-class classification metrics. For this study, we use mean accuracy, mean precision, mean recall, and mean F1-score along with their standard deviations [43].
Let $C_{TP}$ be true positives and $C_{TN}$ true negatives, while $C_{FP}$ are false positives and $C_{FN}$ false negatives. Multi-class classification accuracy can be estimated using

$$\text{Accuracy} = \frac{C_{TP} + C_{TN}}{C_{TP} + C_{TN} + C_{FP} + C_{FN}}. \tag{2.14}$$

For binary-class classification, $n = 1$, precision $P^{k}_{n=1}$ and recall $R^{k}_{n=1}$ for a particular class $k$ can be calculated as follows:

$$P^{k}_{n=1} = \frac{C_{TP}}{C_{TP} + C_{FP}}, \tag{2.15}$$

$$R^{k}_{n=1} = \frac{C_{TP}}{C_{TP} + C_{FN}}. \tag{2.16}$$

For multi-class classification, $n = N$, precision $P^{k}_{N}$ and recall $R^{k}_{N}$ are obtained by applying these definitions across all $N$ classes. The F1-score for multi-class classification is given as

$$F1 = \frac{2\, P^{k}_{N}\, R^{k}_{N}}{P^{k}_{N} + R^{k}_{N}}. \tag{2.19}$$
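One common way to turn these definitions into numbers is to read the per-class counts off a confusion matrix and average over classes. The sketch below uses macro-averaging and computes accuracy as the fraction of correctly classified samples; both choices are assumptions, since the averaging scheme is not recoverable from the text, and the small confusion matrix is made up for illustration.

```python
import numpy as np

def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, and F1-score.

    Rows of `confusion` are true classes, columns are predicted classes.
    Macro-averaging is an assumed choice, not taken from the paper.
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)
    fp = confusion.sum(axis=0) - tp     # predicted as k but not k
    fn = confusion.sum(axis=1) - tp     # truly k but predicted otherwise
    precision = tp / np.maximum(tp + fp, 1e-12)
    recall = tp / np.maximum(tp + fn, 1e-12)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    accuracy = tp.sum() / confusion.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()

# Made-up 3-class confusion matrix, for illustration only.
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 1, 9]]
acc, prec, rec, f1 = macro_metrics(cm)
print(round(acc, 4), round(prec, 4), round(rec, 4))
```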

Results and discussion
Table 1 illustrates the developmental setup employed in this study. We first evaluated the individual pathways of the proposed dual-pathway CNN and compared each of them with the complete DP-CNN. For this purpose, we tested the performance on a single subject from each dataset. In the first phase, the dense pathway and the concatenation layer were removed and the performance of the convolutional pathway alone was validated on all three datasets. In the second phase, the convolutional pathway and the concatenation layer were removed and the performance of the dense pathway alone was validated. Finally, the complete DP-CNN was validated on the same subjects. Figure 4 illustrates the accuracies achieved by each pathway and by the DP-CNN on one subject from each of the three datasets. Neither pathway alone reached the accuracy of the complete DP-CNN.

Performance assessment of the proposed method
We have evaluated the proposed DP-CNN on 30 subjects: 20 able-bodied subjects from DB1 and DB2, and 10 amputee subjects from DB3. To ensure diversity and balance, we randomly selected ten subjects from each of the two databases DB1 and DB2, five males and five females in each. Regarding DB3, we selected ten subjects out of the eleven available; subject one could not be included due to the unavailability of the desired gestures in the dataset [36]. Additionally, subjects 7 and 8 had fewer electrodes, resulting in ten channels instead of twelve, but we still processed their data for our analysis. This approach allowed us to gather a representative sample from which to draw conclusions. Training and testing are done separately for each subject from each dataset. Each subject's data is divided into a 70-30 split, where 70% is used for training and 30% for validation, and the performance metrics are determined for each subject. Using the proposed DP-CNN, DB1, DB2, and DB3 are classified individually, and then the mean accuracy, mean precision, mean recall, and mean F1-score are determined over all subjects of each dataset. Table 2 shows the accuracy, precision, recall, and F1-score for DB1 subjects. The proposed DP-CNN achieved a mean classification accuracy of 94.93 ± 1.71%, mean precision of 94.93 ± 1.71%, mean recall of 94.93 ± 1.71%, and mean F1-score of 94.93 ± 1.71% on the subjects of the DB1 dataset. Table 3 shows the accuracy, precision, recall, and F1-score for DB2 subjects, on which the proposed DP-CNN achieved a mean classification accuracy of 94.00 ± 3.56%, mean precision of 94.00 ± 3.56%, mean recall of 94.00 ± 3.56%, and mean F1-score of 94.00 ± 3.56%. Finally, the proposed DP-CNN achieved a mean classification accuracy of 85.36 ± 0.82%, mean precision of 85.35 ± 0.86%, mean recall of 85.34 ± 0.81%, and mean F1-score of 85.36 ± 0.82% when applied to DB3 subjects. Table 4 shows the accuracy, precision, recall, and F1-score for DB3 subjects.
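The per-subject protocol, a 70-30 split followed by a mean and standard deviation over subjects, can be sketched with the standard library alone. The random shuffle, the fixed seed, and the use of the sample standard deviation are assumptions, and the three accuracies are placeholders made up for the example, not results from the study.

```python
import random
import statistics

def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation sets.

    Assumption: a plain random split; the paper does not say whether the
    split is stratified by gesture.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n_samples))
    return indices[:cut], indices[cut:]

# 5750 LMS images per DB1 subject, as stated in the preprocessing section.
train_idx, val_idx = split_indices(5750)
print(len(train_idx), len(val_idx))  # 4025 1725

# Placeholder per-subject accuracies (made up for illustration, not results).
subject_acc = [0.90, 0.92, 0.94]
mean_acc = statistics.mean(subject_acc)
std_acc = statistics.stdev(subject_acc)   # sample standard deviation (assumed)
print(f"{100 * mean_acc:.2f} ± {100 * std_acc:.2f}%")
```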

Comparison of results with previous studies
In this study, we evaluated the performance of the DP-CNN on LMS extracted from raw EMG signals of the publicly available NinaPro datasets. To assess the effectiveness of our approach, we compared the results with previous studies that utilized deep learning-based techniques on the same datasets. To the best of our knowledge, only one experiment has been conducted on the NinaPro database using Mel or Log-Mel spectrograms; we have therefore also compared our results with other studies that employed any deep learning-based technique on this database.
The proposed DP-CNN model was compared to earlier studies on the three databases: DB1, DB2, and DB3. Table 5 reveals that the DP-CNN model attained an accuracy of 94.93% on DB1, exceeding earlier research that ranged from 66.60% to 91.27%. On DB2, the DP-CNN model achieved a similar accuracy of 94.00%, as shown in Table 6, where previous investigations achieved accuracies ranging from 60.27% to 89.45%. Finally, on DB3 the DP-CNN model attained an accuracy of 85.36%, as shown in Table 7, which is higher than earlier research, which ranged from 46.27% to 81.67%. These findings show that the proposed DP-CNN model is effective for gesture classification across all three datasets, with higher accuracy than the earlier studies listed in Tables 5 to 7.
Table 8 shows the computational cost of our proposed model compared to other studies. Although the DP-CNN achieves lower accuracy on DB2 than CFF-RCNN [30] (94.00% versus 99.51%), it has notable advantages in terms of computational efficiency. The DP-CNN takes only 462.82 seconds to train compared to CFF-RCNN's 542.245 seconds, a 14.65% reduction in training time. Our proposed model also has a lower prediction time, further emphasizing its computational advantages.
Since the sEMG signals are converted to images, we have also compared the performance of our proposed DP-CNN with pre-trained transfer learning models that have been trained on millions of images. The transfer learning models utilized for comparison in this study include AlexNet, MobileNet, VGG19, DenseNet121, and ResNet50. The performance of the DP-CNN on each database (DB1, DB2, and DB3) was compared to these transfer learning models. As shown in Figure 5, the DP-CNN outperformed all other models on DB1 and DB2, achieving accuracies of 94.93% and 94.00%, respectively. However, on DB3, the DP-CNN achieved an accuracy of 85.36%, which was lower than the accuracy achieved by AlexNet (87.29%) and VGG19 (87.96%). The other models achieved lower accuracies than the DP-CNN on all three databases.
In summary, relative to the earlier studies listed in Tables 5 to 7, the DP-CNN attained the highest accuracy on DB1 (94.93%) and on DB3 (85.36%). On DB2 its accuracy of 94.00% exceeds the studies in Table 6 but not the 99.51% reported for CFF-RCNN [30]. These findings show that the DP-CNN model performs well for gesture classification across all three datasets.
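The training-time comparison with CFF-RCNN can be checked directly from the two reported times:

$$\frac{542.245 - 462.82}{542.245} = \frac{79.425}{542.245} \approx 0.146,$$

that is, the DP-CNN needs roughly 14.6% less training time than CFF-RCNN on the reported figures.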

Conclusions
In this study, we addressed challenges associated with machine and deep learning algorithms, especially their decline in performance as the number of classes increases, as data are collected over multiple days, and as subject populations differ. Recognizing the need for a robust learning system, we proposed a dual-pathway convolutional neural network (DP-CNN) and applied it to diverse datasets featuring both able-bodied and amputee subjects. The DP-CNN operated on Log-Mel spectrogram-based images derived from surface electromyography signals from NinaPro DB1, DB2, and DB3. The results were benchmarked against other CNN models implemented on the same datasets, revealing the superior performance of the proposed DP-CNN. On DB1, the DP-CNN achieved a mean classification accuracy of 94.93%, which is 28.33 percentage points above the lowest previously reported accuracy (66.60%) and 3.66 percentage points above the previous highest (91.27%). Similar improvements were observed on DB2 and DB3, indicating consistent performance across datasets. The architecture of the DP-CNN, featuring convolutional and dense pathways, played a pivotal role in capturing both local and global patterns within EMG signals.
Integrating the outputs of these pathways enhanced predictive accuracy and classification capability. The incorporation of batch normalization and dropout layers in both pathways further contributed to model regularization and mitigated overfitting. Comparisons with the prior studies on NinaPro DB1, DB2, and DB3 listed in Tables 5 to 7 showed that the DP-CNN outperformed those models in accuracy, reaching 94.93% on DB1, 94.00% on DB2, and 85.36% on DB3. Additionally, a comparative analysis against pre-trained transfer learning models, including AlexNet, MobileNet, VGG19, DenseNet121, and ResNet50, showed that the DP-CNN was the most accurate on DB1 and DB2. Although it lagged slightly behind AlexNet and VGG19 on DB3, the overall improvement in sEMG-based gesture detection was significant. The DP-CNN model, with its dual pathways, proved to be an effective solution for improving the accuracy and robustness of sEMG-based gesture classification. This study contributes valuable insights into advancing machine learning techniques for prosthetic control applications and emphasizes the practical significance of employing architectures such as the DP-CNN in real-world scenarios.
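To make the dual-pathway idea concrete, the following is a minimal NumPy sketch of a forward pass in which a convolutional pathway and a dense pathway process the same spectrogram image and their outputs are concatenated before a softmax classifier. It is not the authors' implementation: the layer sizes, the single 3x3 convolution, the global average pooling, the random weights, and all function names are assumptions chosen for illustration, and batch normalization and dropout are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def conv2d_valid(image, kernels):
    """Valid cross-correlation of a 2-D image with a bank of k x k kernels."""
    n_k, k, _ = kernels.shape
    h, w = image.shape
    out = np.empty((n_k, h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = image[i:i + k, j:j + k]
            # Contract the spatial axes of every kernel with the patch.
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return out


def conv_pathway(image, kernels):
    """Local patterns: convolution -> ReLU -> global average pooling."""
    fmap = relu(conv2d_valid(image, kernels))
    return fmap.mean(axis=(1, 2))  # one feature per kernel


def dense_pathway(image, w, b):
    """Global patterns: flatten -> fully connected layer -> ReLU."""
    return relu(image.ravel() @ w + b)


def dp_forward(image, params):
    """Concatenate both pathway outputs and classify with softmax."""
    local_feat = conv_pathway(image, params["kernels"])
    global_feat = dense_pathway(image, params["w_dense"], params["b_dense"])
    merged = np.concatenate([local_feat, global_feat])
    return softmax(merged @ params["w_out"] + params["b_out"])


# Hypothetical sizes: 16x16 image, 4 kernels, 8 dense units, 5 gesture classes.
H = W = 16
N_KERNELS, N_DENSE, N_CLASSES = 4, 8, 5
params = {
    "kernels": rng.normal(scale=0.1, size=(N_KERNELS, 3, 3)),
    "w_dense": rng.normal(scale=0.1, size=(H * W, N_DENSE)),
    "b_dense": np.zeros(N_DENSE),
    "w_out": rng.normal(scale=0.1, size=(N_KERNELS + N_DENSE, N_CLASSES)),
    "b_out": np.zeros(N_CLASSES),
}

image = rng.random((H, W))  # stand-in for a Log-Mel spectrogram image
probs = dp_forward(image, params)
print(probs.shape, probs.sum())
```

The concatenation step is where the two views of the input meet: the pooled convolutional features summarize local time-frequency patterns, while the dense features depend on the whole image at once.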

Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

Figure 3. General architecture of the proposed dual-pathway convolutional neural network.

Figure 4. Performance of each individual pathway and the complete DP-CNN on a single subject from each database.

Figure 5. Performance of DP-CNN on each database.

Table 2. Performance of the proposed DP-CNN on DB1 subjects.

Table 3. Performance of the proposed DP-CNN on DB2 subjects.

Table 4. Performance of the proposed DP-CNN on DB3 subjects.

Table 5. Comparison of the proposed DP-CNN with previous methods applied on the NinaPro DB1.

Table 6. Comparison of the proposed DP-CNN with previous methods applied on the NinaPro DB2.

Table 7. Comparison of the proposed DP-CNN with previous methods applied on the NinaPro DB3.

Table 8. Comparison of the proposed DP-CNN with previous methods in terms of computational cost.