/on-device-speech-translation

CMU 11767: Pretending to be resource-constrained on your gaming laptop


CMU 11-767 On-Device Machine Learning. Group Project: Efficient Speech Translation

@tjysdsg @VincieSlytherin @kurisujhin @SandyLuXY

Setup

  1. Init all git submodules
    git submodule update --init --recursive
  2. Enter pretrained/ and pull the Git LFS files
    cd pretrained
    git lfs install
    git lfs pull
  3. Install ESPnet: https://espnet.github.io/espnet/installation.html
  4. Prepare data
    cd must_c_test_subset
    python ../prepare_data.py --out_dir ../data --sample_rate 16000
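After step 4, it can be worth sanity-checking that every wav under data/ really came out at the requested 16 kHz. A minimal stdlib sketch (the check_wavs helper is ours for illustration, not part of prepare_data.py):

```python
import pathlib
import wave


def check_wavs(data_dir, expect_rate=16000):
    """Return the wav files under data_dir whose sample rate != expect_rate."""
    bad = []
    for p in pathlib.Path(data_dir).rglob("*.wav"):
        with wave.open(str(p)) as w:
            if w.getframerate() != expect_rate:
                bad.append(p)
    return bad
```

An empty return value means the whole directory is at the expected rate.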

Lab2: Baseline

cd pretrained
python plot_lab2.py --result_dir output --out_dir output/plot_all

Lab3: Quantization

lab3.py
plot_all.py
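For orientation, the core idea lab3.py builds on is post-training quantization: mapping float weights to int8 via a scale and zero point. A toy sketch of affine (asymmetric) quantization, illustrative only and not ESPnet's actual quantization pipeline:

```python
def quantize_int8(weights):
    """Affine post-training quantization of a list of floats to int8 values."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # avoid zero scale for constant weights
    zero_point = round(-lo / scale) - 128
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(qi - zero_point) * scale for qi in q]
```

The round trip loses at most half a quantization step per weight, which is what makes int8 inference a good accuracy/size trade-off.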

Lab4: Pruning

lab4.py
lab4_export_onnx.py
lab4_benchmark_onnx.py
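The technique behind lab4.py, magnitude pruning, zeroes out the smallest-magnitude weights before the model is exported and benchmarked. A few-line sketch of the idea (illustrative, not the project's implementation):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:  # k indices with the smallest |w|
        pruned[i] = 0.0
    return pruned
```

Unstructured pruning like this shrinks the model only after sparse storage (or a sparse-aware runtime) exploits the zeros, which is why the lab benchmarks the exported ONNX model rather than the raw PyTorch one.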

Lab5: Energy

  1. Export ONNX models, with or without the other optimizations (adjust the --pruned/--quantized flags as needed)

    cd pretrained/
    python export_onnx.py --pruned --quantized
  2. Run run_lab5.sh and run_lab5_gpu.sh; read the comments in each script carefully before running them.
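On Linux, CPU package energy is commonly read from the RAPL counters under /sys/class/powercap; the counter reports microjoules and wraps around at max_energy_range_uj. A hedged sketch of that bookkeeping (helper names are ours; the scripts above may measure energy differently, e.g. with an external meter):

```python
def read_energy_uj(path="/sys/class/powercap/intel-rapl:0/energy_uj"):
    """Read the cumulative RAPL energy counter in microjoules (requires root on many systems)."""
    with open(path) as f:
        return int(f.read())


def rapl_energy_joules(start_uj, end_uj, max_range_uj):
    """Energy in joules between two counter readings, handling one wraparound."""
    delta = end_uj - start_uj
    if delta < 0:  # counter wrapped past max_range_uj during the measurement
        delta += max_range_uj
    return delta / 1e6
```

Typical use: read the counter, run inference, read again, and divide the joules by the number of utterances to get energy per example.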

Final

Under pretrained/:

  1. Export ONNX models, with or without the other optimizations (adjust the --pruned/--quantized flags as needed)

    cd pretrained/
    python ../export_onnx.py --pruned --quantized  # adjust these flags as needed
  2. Run benchmarks

    cd pretrained/
    python ../final_benchmark.py --data_dir ../data --out_dir ../output
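Latency benchmarks generally follow a warmup-then-measure pattern so that one-time costs (model load, JIT, caches) don't pollute the numbers. A generic sketch of that pattern (function and field names are illustrative, not final_benchmark.py's actual interface):

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, runs=20):
    """Time fn(*args) over several runs after warmup; return mean and p90 latency in seconds."""
    for _ in range(warmup):
        fn(*args)  # discard warmup runs
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
    return {
        "mean_s": statistics.mean(times),
        "p90_s": sorted(times)[int(0.9 * len(times))],
    }
```

Reporting a tail percentile alongside the mean matters on-device, where occasional slow runs (thermal throttling, background load) dominate user-perceived latency.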

Ablation study

output_onnx/ collects the results of final_benchmark.py and contains the ablation study over ONNX export, pruning, and quantization. Run plot_all.py on this folder to generate the plots.
