/Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2


Repo of ZeroC

An inference system with zero QKV compression overhead.

Folder

/zeroc

  • exp: implementation code of ZeroC (zeroc.py, h2o-zeroc.py, gear-zeroc.py).
  • kernels: kernel functions.
  • measurements: measurement code for SVD and model analysis.
  • quantization: code for the quantization methods.
  • svd_qkv: SVD-related code for QKV compression and analysis.
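
For readers new to the idea, here is a rough illustration of what SVD-based compression of a Q/K/V matrix means. This is a minimal NumPy sketch and not the code in svd_qkv: the function name `svd_compress`, the matrix shape, and the rank are all assumptions chosen for the example.

```python
import numpy as np

def svd_compress(x, rank):
    """Return factors (a, b) such that a @ b is the best rank-`rank`
    approximation of x in the Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (tokens, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, head_dim)
    return a, b

rng = np.random.default_rng(0)

# Hypothetical key matrix: 128 tokens x 64 dims, built to be nearly rank 8.
k = rng.standard_normal((128, 8)) @ rng.standard_normal((8, 64))
k += 1e-3 * rng.standard_normal(k.shape)

a, b = svd_compress(k, rank=8)

# Relative reconstruction error and number of stored values after compression.
rel_err = np.linalg.norm(k - a @ b) / np.linalg.norm(k)
stored = a.size + b.size
print(f"relative error: {rel_err:.2e}, stored values: {stored} of {k.size}")
```

Storing the two factors instead of the full matrix saves memory only when the rank is well below both matrix dimensions. How ZeroC chooses the rank and where it applies the factorization is defined by the code in svd_qkv, not by this sketch.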