luchuanze/luasr

luasr is an end-to-end ASR project

Python · Apache-2.0

LuASR is an end-to-end ASR toolkit

Overview

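LuASR is an end-to-end speech recognition project. Built on the PyTorch framework, it aims to provide currently popular multi-task end-to-end recognition architectures such as CTC, Transducer, and Transformer, with support for different encoder modules such as TDNN, LSTM, MHA, and Conformer, as a learning resource for people working on speech recognition. It will also provide a C/C++ runtime decoder (x86, ARM) that can be used to take projects into production.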

The project draws on several popular open-source speech recognition projects, such as WeNet, next-gen Kaldi, and ESPnet.

Updates

Release is planned for the end of the year. Stay tuned!

Installation and Usage

Environment Setup

Use a Linux system. Recommended: Ubuntu 20.04 or later, CUDA 11.3, and a GPU with at least 8 GB of memory.

Install a Python development environment; version 3.8 or later is recommended.

Install the PyTorch framework; version 1.10 or later is recommended.

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
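After installation, it can help to confirm that the environment matches the recommendations above. The following is a minimal sketch, not part of LuASR: the names `parse_version` and `check_environment` are hypothetical helpers introduced here for illustration, and the thresholds are simply the recommended versions listed in this section. If PyTorch is not installed, the PyTorch and CUDA checks are reported as failed instead of raising an error.

```python
import re
import sys

# Recommended minimums taken from the environment notes above.
MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 10)


def parse_version(text):
    """Return (major, minor) from a version string such as '1.11.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment():
    """Collect basic facts about the Python / PyTorch / CUDA setup."""
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_ok": False,
        "cuda_ok": False,
        "gpu_memory_gib": None,  # filled in only when a CUDA device is visible
    }
    try:
        import torch
    except ImportError:
        # PyTorch is missing, so the remaining checks cannot run.
        return report

    report["torch_ok"] = parse_version(torch.__version__) >= MIN_TORCH
    report["cuda_ok"] = torch.cuda.is_available()
    if report["cuda_ok"]:
        # Total memory of device 0 in GiB; the driver usually reports
        # slightly less than the nominal size printed on the card.
        total_bytes = torch.cuda.get_device_properties(0).total_memory
        report["gpu_memory_gib"] = round(total_bytes / 2**30, 2)
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

Running the script prints one line per check. The GPU memory is reported as a number rather than as pass or fail, so it can be compared against the 8 GB recommendation by eye.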

Getting the Source Code

git clone https://github.com/luchuanze/luasr.git

Model Training