2019 语音识别
[论文]
一、 Listen, Attend and Spell 代码如下:
1、 https://github.com/thomasschmied/Speech_Recognition_with_Tensorflow
二、 TACOTRON: TOWARDS END-TO-END SPEECH SYNTHESIS 代码如下:
1、 https://github.com/audier/DeepSpeechRecognition/tree/master/tutorial/self-attention_tutorial.ipynb https://blog.csdn.net/chinatelecom08/article/details/85048019
[数据集]
1、 CSTRVCTK Corpus: http://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html
2、 LibriSpeech ASR corpus: http://www.openslr.org/12/
3、 Switchboard-1 Telephone Speech Corpus: https://catalog.ldc.upenn.edu/ldc97s62
4、 TED-LIUM Corpus: http://www-lium.univ-lemans.fr/en/content/ted-lium-corpus
5、 AIShell-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline:
a、 进入到项目\datasets\AishellSpeech 【Terminal运行如下命令】
```bash
$ wget http://www.openslr.org/resources/33/data_aishell.tgz
```
b、 运行aishell_speech.py Extract data_aishell.tgz:
```bash
$ python aishell_speech.py
```
c、 Extract wav files into train/dev/test folders:
```bash
$ cd datasets/AishellSpeech/data_aishell/wav
$ find . -name '*.tar.gz' -execdir tar -xzvf '{}' \;
```
d、 Scan transcript data, generate features:
```bash
$ cd demos/utils
$ python pre_process.py
```
Now the folder structure under data folder is sth. like:
<pre>
AishellSpeech/
data_aishell.tgz
data_aishell/
transcript/
aishell_transcript_v0.8.txt
wav/
train/
dev/
test/
audio_index.pkl
</pre>
[参考]