Pinned Repositories
Conic10K
Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP 2023 Findings.
hf-starter
General starter code for building custom model architectures with the Hugging Face Transformers library.
LCKV
Layer-Condensed KV cache: 10× larger batch sizes with fewer parameters and less computation. Dramatic speedup with better task performance. Accepted to ACL 2024.
nni-slurm
A patch that adds Slurm and W&B support to NNI.
Probabilistic-Transformer
A probabilistic model for contextual word representation. Accepted to ACL 2023 Findings.
tinyllama
A side project that applies all the acceleration tricks from TinyLlama, with minimal modifications to the Hugging Face Transformers code.
tinyllama-zh
A side project that pretrains a TinyLlama on Chinese corpora, with minimal modifications to the Hugging Face Transformers code.