
Research data supporting [Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency]



Xu, Muzi 
Yi, Wentian 
Occhipinti, Luigi 


This work comprises five related datasets, openly accessible via the link provided at the end of the manuscript:

  1. Dataset1_20 Frequently Used Words: This dataset contains signals of the 20 most frequently used words (10 nouns and 10 verbs) collected from participants, with 100 samples per class. Each row holds one sample of a word, and the last number in the row is the class label for that word (the same applies to the following datasets).
  2. Dataset2_Confusing Words: This dataset includes 10 words, forming 5 pairs with similar pronunciations that are easily confused, with 100 samples per class.
  3. Dataset3_Different Reading Speeds: This dataset comprises signals of 5 long words read at three different speeds: fast, medium, and slow, with approximately 33 samples for each word at each reading speed.
  4. New User Generalization Test: This dataset contains signals of 5 commonly used words (included in Dataset1) collected from three new users, with 50 samples per class.
  5. Noise Injection Data: This dataset includes around five minutes of noise signals recorded in the absence of speech, containing physiological noise such as breathing and swallowing.
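
The row layout described above (signal values followed by a class label in the last position) can be read with a few lines of Python. This is a minimal sketch, not the authors' loading code: the comma-delimited text format, the function name load_labelled_rows, and the demo values are all assumptions made for illustration, and the delimiter may need adjusting to match the actual files.

```python
import csv
import io
from collections import Counter


def load_labelled_rows(handle, delimiter=","):
    """Split each row into (signal values, integer class label).

    Assumes a delimited text file; the delimiter is a guess, not a
    documented property of the datasets.
    """
    signals, labels = [], []
    for row in csv.reader(handle, delimiter=delimiter):
        if not row:  # skip blank lines
            continue
        values = [float(v) for v in row]
        signals.append(values[:-1])     # everything except the last value
        labels.append(int(values[-1]))  # last value in the row is the label
    return signals, labels


# Tiny synthetic stand-in for a dataset file (not real sensor data).
demo = io.StringIO("0.1,0.2,0.3,0\n0.4,0.5,0.6,1\n0.7,0.8,0.9,1\n")
signals, labels = load_labelled_rows(demo)
class_counts = Counter(labels)  # samples per class, e.g. to check class balance
print(len(signals), len(signals[0]), dict(class_counts))
```

With a real file, an open file object can be passed in place of the in-memory demo, and class_counts gives a quick check against the per-class sample counts listed above.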


Software / Usage instructions

Data processing and network training were carried out in a Python 3.8.13 environment (Miniconda 3) with PyTorch 2.0.1, with training accelerated by Apple’s Metal Performance Shaders (MPS). During the noise injection phase, each original sample was augmented with real-world noise taken from four different random windows of the noise recording, creating four new samples per original sample.
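
The noise injection step can be sketched as follows. This is an illustrative reading of the description, not the authors' implementation: it assumes the noise is added element-wise to the signal, that each window has the same length as the sample, and that window start positions are drawn uniformly without repetition. The function name inject_noise and the synthetic noise trace are hypothetical.

```python
import random


def inject_noise(sample, noise, n_windows=4, rng=None):
    """Return n_windows noisy copies of sample.

    Each copy adds a different, randomly placed window of the noise
    recording to the sample (additive mixing is an assumption).
    """
    rng = rng or random.Random()
    length = len(sample)
    n_starts = len(noise) - length + 1  # number of valid window positions
    if n_starts < n_windows:
        raise ValueError("noise recording too short for the requested windows")
    # Sample distinct start offsets so the windows differ from each other.
    starts = rng.sample(range(n_starts), n_windows)
    augmented = []
    for start in starts:
        window = noise[start:start + length]
        augmented.append([s + n for s, n in zip(sample, window)])
    return augmented


rng = random.Random(0)
noise = [rng.gauss(0.0, 0.05) for _ in range(1000)]  # synthetic stand-in for the noise recording
sample = [0.0, 0.5, 1.0, 0.5, 0.0]                   # synthetic stand-in for one word sample
noisy = inject_noise(sample, noise, n_windows=4, rng=rng)
print(len(noisy), len(noisy[0]))
```

Each augmented copy would keep the class label of the sample it was derived from.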


Machine Learning, Silent Speech Recognition, Textile Sensor


EPSRC (EP/W024284/1)