/on-device-speech-translation

CMU 11767: Pretending to be resource-constrained on your gaming laptop


CMU 11-767 On-Device Machine Learning. Group Project: Efficient Speech Translation

@tjysdsg @VincieSlytherin @kurisujhin @SandyLuXY

Setup

  1. Init all git submodules
    git submodule update --init --recursive
  2. Enter pretrained/ and pull the Git LFS files
    cd pretrained
    git lfs install
    git lfs pull
  3. Install ESPnet: https://espnet.github.io/espnet/installation.html
  4. Prepare data
    cd must_c_test_subset
    python ../prepare_data.py --out_dir ../data --sample_rate 16000
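After step 4, it can be worth sanity-checking that every wav under data/ really came out at the requested 16 kHz. A minimal stdlib sketch (the check_wavs helper is ours for illustration, not part of prepare_data.py):

```python
import pathlib
import wave


def check_wavs(data_dir, expect_rate=16000):
    """Return the wav files under data_dir whose sample rate != expect_rate."""
    bad = []
    for p in pathlib.Path(data_dir).rglob("*.wav"):
        with wave.open(str(p)) as w:
            if w.getframerate() != expect_rate:
                bad.append(p)
    return bad
```

An empty return value means the whole directory is at the expected rate.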

Lab2: Baseline

cd pretrained
python plot_lab2.py --result_dir output --out_dir output/plot_all

Lab3: Quantization

lab3.py
plot_all.py
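For orientation, the core idea lab3.py builds on is post-training quantization: mapping float weights to int8 via a scale and zero point. A toy sketch of affine (asymmetric) quantization, illustrative only and not ESPnet's actual quantization pipeline:

```python
def quantize_int8(weights):
    """Affine post-training quantization of a list of floats to int8 values."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # avoid zero scale for constant weights
    zero_point = round(-lo / scale) - 128
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(qi - zero_point) * scale for qi in q]
```

The round trip loses at most half a quantization step per weight, which is what makes int8 inference a good accuracy/size trade-off.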

Lab4: Pruning

lab4.py
lab4_export_onnx.py
lab4_benchmark_onnx.py
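The technique behind lab4.py, magnitude pruning, zeroes out the smallest-magnitude weights before the model is exported and benchmarked. A few-line sketch of the idea (illustrative, not the project's implementation):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:  # k indices with the smallest |w|
        pruned[i] = 0.0
    return pruned
```

Unstructured pruning like this shrinks the model only after sparse storage (or a sparse-aware runtime) exploits the zeros, which is why the lab benchmarks the exported ONNX model rather than the raw PyTorch one.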

Lab5: Energy

  1. Export ONNX models, with or without the other optimizations (adjust the --pruned/--quantized flags as needed)

    cd pretrained/
    python export_onnx.py --pruned --quantized
  2. Run run_lab5.sh and run_lab5_gpu.sh; read the comments in each script carefully before running them.
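On Linux, CPU package energy is commonly read from the RAPL counters under /sys/class/powercap; the counter reports microjoules and wraps around at max_energy_range_uj. A hedged sketch of that bookkeeping (helper names are ours; the scripts above may measure energy differently, e.g. with an external meter):

```python
def read_energy_uj(path="/sys/class/powercap/intel-rapl:0/energy_uj"):
    """Read the cumulative RAPL energy counter in microjoules (requires root on many systems)."""
    with open(path) as f:
        return int(f.read())


def rapl_energy_joules(start_uj, end_uj, max_range_uj):
    """Energy in joules between two counter readings, handling one wraparound."""
    delta = end_uj - start_uj
    if delta < 0:  # counter wrapped past max_range_uj during the measurement
        delta += max_range_uj
    return delta / 1e6
```

Typical use: read the counter, run inference, read again, and divide the joules by the number of utterances to get energy per example.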

Final

Under pretrained/:

  1. Export ONNX models, with or without the other optimizations (adjust the --pruned/--quantized flags as needed)

    cd pretrained/
    python ../export_onnx.py --pruned --quantized  # adjust these flags as needed
  2. Run benchmarks

    cd pretrained/
    python ../final_benchmark.py --data_dir ../data --out_dir ../output
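Latency benchmarks generally follow a warmup-then-measure pattern so that one-time costs (model load, JIT, caches) don't pollute the numbers. A generic sketch of that pattern (function and field names are illustrative, not final_benchmark.py's actual interface):

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, runs=20):
    """Time fn(*args) over several runs after warmup; return mean and p90 latency in seconds."""
    for _ in range(warmup):
        fn(*args)  # discard warmup runs
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
    return {
        "mean_s": statistics.mean(times),
        "p90_s": sorted(times)[int(0.9 * len(times))],
    }
```

Reporting a tail percentile alongside the mean matters on-device, where occasional slow runs (thermal throttling, background load) dominate user-perceived latency.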

Ablation study

output_onnx/ collects the results of final_benchmark.py and contains the ablation study over ONNX export, pruning, and quantization. Run plot_all.py on this folder to generate the plots.
