Subvocalization EMG

Project for recording and training subvocalization EMG data with the Cyton Board.
By Mateus de Aquino Batista, as a Bachelor's degree final project.

📄 Abstract

Dysarthria is a change in the normal pronunciation of words, usually caused by neurological disturbances. In advanced cases, when speech therapy is not enough to enable practical communication, a silent speech interface can be used to carry out a conversation with a smaller vocabulary by identifying the subtle speech movements captured by a surface electromyography device.

The present study covers the instrumentation of a low-cost silent speech interface, the development of a neural network to identify subvocalized words, and an experimental case study to validate the equipment. An OpenBCI modular electromyography board will be used to read muscle activity data from the face. These data will then be processed and classified by a convolutional neural network to identify subvocalized words.

For validation, the success rates of the silent speech interface's neural network will be used to evaluate both the general performance and the individual performance of each participant. A hit rate between 90% and 95% is expected from the general validation.

If the interface yields positive results, more words can be added in the future to make it more useful in the daily lives of patients with communication difficulties.

🔧 Hardware Requirements

Cyton Biosensing Board

This project requires setting up a Cyton Biosensing Board (8-Channels), which is a neural interface board developed by OpenBCI with a 32-bit processor that can be used to sample EEG, EMG, and ECG activity. Follow the starter guide to make sure you get it right.

This project also allows using the Synthetic board as a mock for the Cyton board. However, because it generates random data, training and validating the neural networks with the Synthetic option will not work.

⚠️ Important

This study is currently underway, and as such, the findings outlined in this article are preliminary and subject to change. The ongoing development phase of the project means that its present iteration is intended solely for research purposes. If you're interested, you can explore the progress in the Papers section (Portuguese).

🚀 Setup

After setting up your Cyton Board, you'll need to install the package dependencies:

python -m venv .venv # optional: install requirements into a virtual env
source .venv/bin/activate # optional: activate virtual env
pip install -r requirements.txt

Once you're done, simply run:

python ./start.py

The application should then be accessible at localhost:8000. If the Cyton dongle is not available, you might need to run it with administrator privileges.

🧠 How to use

The main page shows the filtered Time Series for all 8 channels. At the top of the page you can also see some logging information and access the board session. Once the session is started, you'll have access to the Recording tab, where you set up the words and the amount of data you want to record for later training. Note that the default words are currently hardcoded in the HTML file, but they can be changed at any time:

EMG Tab Recording Tab

After you record your first session (automatically saved as a CSV file), the Neural Network tab becomes available for training. This is where you select the recordings to include and set up the training configuration. Once training has started, you can follow its progress in real time. When training is complete, you'll have access to the Evaluation tab, where you can test the predictive capability of the models you've trained.

Neural Network Tab Evaluator Tab

📌 Electrodes Placement

The montage is based on MIT's preliminary study of the top muscle regions, evaluated in a pilot user study:

Electrodes placement schema on Cyton Board:

| Region              | Color  | Pin         |
|---------------------|--------|-------------|
| Earlobe             | Black  | BIAS        |
| Mental              | Yellow | N3P (upper) |
| Inner laryngeal     | Blue   | N1P (upper) |
| Outer laryngeal     | Red    | N1P (lower) |
| Hyoid               | Green  | N3P (lower) |
| Inner infra-orbital | Purple | N2P (upper) |
| Outer infra-orbital | Brown  | N4P         |
| Buccal              | Orange | N2P (lower) |

Selection of the final electrode target areas through feature selection on muscular areas of interest
Arnav Kapur, S. Kapur, P. Maes, et al. (2018)
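
For scripts that need to refer to the montage, the placement table can be written down as data. The following is only a minimal sketch: the `Electrode` type, the `MONTAGE` list, and the helpers `check_montage` and `find_by_color` are hypothetical names that are not part of this repository. The values are copied from the table, and the `row` field is `None` where the table gives no pin row.

```python
from typing import NamedTuple, Optional


class Electrode(NamedTuple):
    region: str         # facial/neck region, as named in the table
    color: str          # wire color
    pin: str            # pin label on the Cyton board
    row: Optional[str]  # "upper" or "lower"; None when the table gives none


# Values copied from the placement table; the structure itself is an assumption.
MONTAGE = [
    Electrode("Earlobe", "Black", "BIAS", None),
    Electrode("Mental", "Yellow", "N3P", "upper"),
    Electrode("Inner laryngeal", "Blue", "N1P", "upper"),
    Electrode("Outer laryngeal", "Red", "N1P", "lower"),
    Electrode("Hyoid", "Green", "N3P", "lower"),
    Electrode("Inner infra-orbital", "Purple", "N2P", "upper"),
    Electrode("Outer infra-orbital", "Brown", "N4P", None),
    Electrode("Buccal", "Orange", "N2P", "lower"),
]


def check_montage(montage):
    """Raise ValueError if two electrodes share the same pin and row."""
    seen = {}
    for electrode in montage:
        key = (electrode.pin, electrode.row)
        if key in seen:
            raise ValueError(
                f"{electrode.region} and {seen[key]} both use {key}"
            )
        seen[key] = electrode.region


def find_by_color(montage, color):
    """Return the electrode with the given wire color (case-insensitive)."""
    for electrode in montage:
        if electrode.color.lower() == color.lower():
            return electrode
    raise KeyError(color)
```

Calling `check_montage(MONTAGE)` before a recording session is a cheap way to catch a wiring table that was edited by mistake.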

📻 Proof of Concept (PoC)

The PoC, which contains the processing and training steps, is available in Complete processing.ipynb. This Jupyter Notebook has all the important pieces of code needed to reproduce the experiment, along with some graphs for a better understanding.

Synthetic 8-Channels Input Words visualization

If you want to see the PoC with public EMG data instead, you can check Public data.ipynb, processing a public EMG hand gesture dataset.

Also, if you want to run the notebooks in a virtualenv, make sure you set up Jupyter correctly:

source .venv/bin/activate
python -m pip install ipykernel # install ipykernel / jupyter in the venv if not present
python -m ipykernel install --user --name=venv # register the venv as a Jupyter kernel named "venv"
# > Then, open Jupyter Notebook and select venv in "Switch kernel" option

📚 Papers

This marks the concluding phase of our research. In this publicly available repository, we included data from human participants and shared the electrode placement, as we have obtained approval from Brazil's Ethics Committee (CAAE: 65587722.5.0000.5503, Parecer 6112574).

You can view the processing and results obtained from the human sEMG in the third PoC file: Subvocalization.ipynb.

📊 EMG Folder Structure

The dataset folder /saves is where the application stores all EMG data by default. In this repository, the folder was organized to include the data from all 10 participants of the study. All recorded sessions are grouped by: Participant code -> Speech style -> Word.

The Speech style could be any of:

  • F: Normal speech ("Fala")
  • A: Lip articulation ("Articulação labial")
  • S: Subvocalization ("Subvocalização")

The four possible words selected in this study were: Yes ("Sim"), No ("Não"), Maybe ("Talvez") and Silence. Silence has no recordings of its own: it is stored inside the folders, between the words.
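
The grouping described above can be walked with a few lines of Python. This is a sketch under stated assumptions, not the repository's own loader: the function name `list_recordings` is hypothetical, recordings are assumed to be `.csv` files (as saved by the application), and the word level is accepted either as a folder containing CSV files or as a CSV file named after the word, since the exact file naming is not described here.

```python
from pathlib import Path

# Speech style codes, as listed above.
SPEECH_STYLES = {
    "F": "Normal speech",
    "A": "Lip articulation",
    "S": "Subvocalization",
}


def list_recordings(root):
    """Yield (participant, style, word, path) for each CSV under root.

    Assumed layout: <root>/<participant code>/<speech style>/<word>[/...].csv
    Files that do not fit this layout are skipped.
    """
    root = Path(root)
    for path in sorted(root.rglob("*.csv")):
        parts = path.relative_to(root).parts
        if len(parts) < 3 or parts[1] not in SPEECH_STYLES:
            continue  # not under participant/style/word
        participant, style = parts[0], parts[1]
        # Works whether the word level is a folder or the CSV file itself.
        word = Path(parts[2]).stem
        yield participant, style, word, path
```

For example, `[r for r in list_recordings("saves") if r[1] == "S"]` would keep only the subvocalization recordings.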

📝 Conclusion

The results demonstrate that this processing and training method is effective for detecting strong gestures, such as speech and articulation. However, for more subtle muscle movements, such as subvocalization, further improvements in noise reduction and data augmentation are necessary.

It is important to note that the same methods, montage, and algorithm yielded varying results across different participants. Some exhibited lower accuracy in detecting easily recognizable speech, while others achieved higher accuracy in identifying less discernible subvocalizations.
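
To compare participants in this way, the general hit rate and the per-participant hit rates can be computed from a list of predictions. The sketch below assumes each result is a `(participant, true_word, predicted_word)` tuple. The function name `hit_rates` and the participant codes in the usage note are hypothetical, and the sample values are placeholders, not results from the study.

```python
from collections import defaultdict


def hit_rates(results):
    """Compute the general and per-participant hit rates.

    results: iterable of (participant, true_word, predicted_word) tuples.
    Returns (general_rate, {participant: rate}), with rates in [0, 1].
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for participant, truth, predicted in results:
        totals[participant] += 1
        hits[participant] += int(truth == predicted)
    if not totals:
        raise ValueError("no results given")
    general = sum(hits.values()) / sum(totals.values())
    per_participant = {p: hits[p] / totals[p] for p in totals}
    return general, per_participant
```

With placeholder data such as `[("P01", "Sim", "Sim"), ("P01", "Talvez", "Sim"), ("P02", "Não", "Não")]`, the function returns a general rate of 2/3 and per-participant rates of 0.5 for `P01` and 1.0 for `P02`. The general rate weights every prediction equally, so participants with more recordings influence it more.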

📜 License

All source code is made available under a BSD 3-clause license. You can freely use and modify the code, without warranty, so long as you provide attribution to the author. See LICENSE for the full license text.

The author reserves the rights to the article content.