# DeepLearning4UTI
This repository aims to provide a comprehensive, curated list of resources for ultrasound tongue image (UTI) analysis, with a focus on deep learning-based approaches.
## Contents
## Scientific Papers
### Motion Tracking
- Extraction and tracking of the tongue surface from ultrasound image sequences. Akgul, Yusuf Sinan, Chandra Kambhamettu, and Maureen Stone. (1998) IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
- Automatic extraction and tracking of the tongue contours. Akgul, Yusuf Sinan, Chandra Kambhamettu, and Maureen Stone. (1999) IEEE Transactions on Medical Imaging.
- A guide to analysing tongue motion from ultrasound images. Stone, Maureen. (2005) Clinical linguistics & phonetics.
- Automatic contour tracking in ultrasound images. Li, Min, Chandra Kambhamettu, and Maureen Stone. (2005) Clinical linguistics & phonetics.
- Tongue tracking in ultrasound images with active appearance models. Roussos, Anastasios, Athanassios Katsamanis, and Petros Maragos. (2009) IEEE International Conference on Image Processing (ICIP).
- Deep belief networks for real-time extraction of tongue contours from ultrasound during speech. Fasel, Ian, and Jeff Berry. (2010) 20th International Conference on Pattern Recognition. IEEE.
- Tongue contour tracking in dynamic ultrasound via higher-order MRFs and efficient fusion moves. Tang, Lisa, Tim Bressmann, and Ghassan Hamarneh. (2012) Medical image analysis.
- Tongue contour extraction from ultrasound images based on deep neural network. Jaumard-Hakoun, Aurore, et al. (2016) ICPhS.
- A comparative study on the contour tracking algorithms in ultrasound tongue images with automatic re-initialization. Xu, Kele, et al. (2016) The Journal of the Acoustical Society of America Express Letters.
- Robust contour tracking in ultrasound tongue image sequences. Xu, Kele, et al. (2016) Clinical linguistics & phonetics.
- BowNet: Dilated Convolution Neural Network for Ultrasound Tongue Contour Extraction. Mozaffari, M. Hamed, and Won-Sook Lee (2019)
- Transfer Learning for Ultrasound Tongue Contour Extraction with Different Domains. Mozaffari, M. Hamed, and Won-Sook Lee (2019)
- A CNN-based tool for automatic tongue contour tracking in ultrasound images. Zhu, Jian, Will Styler, and Ian Calloway. (2019).
- Encoder-decoder CNN models for automatic tracking of tongue contours in real-time ultrasound data. Mozaffari, M. Hamed, and Won-Sook Lee. (2020).
- Tongue Contour Tracking and Segmentation in Lingual Ultrasound for Speech Recognition: A Review. Al-Hammuri, Khalid, et al. (2022).
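Many of the CNN-based trackers listed above (e.g., the encoder-decoder models) output a per-pixel probability map rather than a contour, so a common post-processing step is to reduce that map to one contour point per image column. A minimal sketch of that step in NumPy, assuming an `(H, W)` probability map; the function name and threshold are illustrative, not taken from any of the papers:

```python
import numpy as np

def contour_from_probmap(prob, threshold=0.5):
    """Extract one (row, col) contour point per column of a probability
    map by taking the most confident row in each column, keeping only
    columns whose peak confidence exceeds `threshold`."""
    rows = prob.argmax(axis=0)             # most likely row per column
    peaks = prob.max(axis=0)               # confidence of that row
    cols = np.nonzero(peaks > threshold)[0]
    return np.stack([rows[cols], cols], axis=1)

# Toy example: a synthetic "tongue surface" along row = 10 + col // 4.
prob = np.zeros((64, 64))
for c in range(64):
    prob[10 + c // 4, c] = 0.9
contour = contour_from_probmap(prob)       # shape (64, 2)
```

Real pipelines typically add smoothing (e.g., spline fitting) over these raw per-column points, since speckle noise makes individual column peaks jumpy.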
### 3D Ultrasound
### Ultrasound-Based Silent Speech Interface
- Speech synthesis from real time ultrasound images of the tongue. Bruce Denby and Maureen Stone (2004) IEEE International Conference on Acoustics, Speech, and Signal Processing.
- Prospects for a silent speech interface using ultrasound imaging. Denby, Bruce, et al. (2006) IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
- Eigentongue feature extraction for an ultrasound-based silent speech interface. Hueber, Thomas, et al. (2007) IEEE International Conference on Acoustics, Speech and Signal Processing.
- Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips. Hueber, Thomas, et al. (2010) Speech Communication.
- Towards a Practical Silent Speech Interface Based on Vocal Tract Imaging. Denby, Bruce, et al. (2011) Speech Communication.
- Updating the silent speech challenge benchmark with deep learning. Ji, Yan, et al. (2018) Speech Communication.
- Ultrasound-based Silent Speech Interface Built on a Continuous Vocoder. Csapó, T. G. et al. (2019)
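The "eigentongue" approach (Hueber et al., 2007, listed above) represents each ultrasound frame by its projection onto the principal components of a set of training frames. A minimal PCA sketch in NumPy, assuming frames have already been flattened to vectors; this is an illustration of the idea, not the authors' implementation:

```python
import numpy as np

def fit_eigentongues(frames, n_components=8):
    """frames: (N, H*W) array of flattened ultrasound images.
    Returns the mean frame and the top principal components
    ("eigentongues"), computed via SVD of the centered data."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are orthonormal principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def encode(frame, mean, components):
    """Project one flattened frame onto the eigentongue basis,
    yielding a low-dimensional feature vector."""
    return components @ (frame - mean)

# Toy usage with random stand-in "frames" (real input would be
# 8-bit ultrasound images flattened to H*W vectors).
rng = np.random.default_rng(0)
frames = rng.standard_normal((100, 32 * 32))
mean, comps = fit_eigentongues(frames, n_components=8)
codes = encode(frames[0], mean, comps)
```

The resulting low-dimensional codes are what downstream recognition or synthesis models consume, which is why this feature step mattered before end-to-end CNNs became practical.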
## Tutorials
## Code
- Predicting tongue motion in unlabeled ultrasound videos using convolutional LSTM neural network - Chaojie Zhao, Peng Zhang, Jian Zhu, Chengrui Wu, Huaimin Wang, Kele Xu. ICASSP 2019.
- Synchronising audio and ultrasound by learning cross-modal embeddings - Eshky, Aciel, et al. arXiv preprint arXiv:1907.00758.
- Improving the Classification of Phonetic Segments from Raw Ultrasound Using Self-Supervised Learning and Hard Example Mining. (Xiong, Yunsheng, et al). ICASSP 2022.
## Data
## Related Labs
- Vocal Tract Visualization Laboratory, University of Maryland, Baltimore, USA
- Haskins Laboratories, Yale University, New Haven, USA
- Speech Disorders & Technology Lab, UT Dallas, USA
- School of Electrical Engineering and Computer Science, University of Ottawa, Canada
- Laboratoire Signaux, Modèles, Apprentissage statistique (SIGMA), Paris, France
- The Langevin Institute, Paris, France
- GIPSA-lab, Grenoble, France
- Centre for Speech Technology Research, University of Edinburgh, UK
- Psychological Sciences and Health, University of Strathclyde, UK
- Clinical Audiology, Speech and Language Research Centre, Queen Margaret University, UK
- Articulate Instruments Ltd., UK
- Speech Technology and Smart Interactions Laboratory, Budapest University of Technology and Economics, Hungary
- The Key Laboratory of Cognitive Computing and Applications, Tianjin University, China
## Contributing
Your contributions are always welcome! Please read the contribution guidelines first, then feel free to open a pull request, send an email, or join our chats to add links. If I am unsure whether a resource belongs on the list, I will keep its pull request open so that others can vote for it by adding 👍.