Pinned Repositories
Conic10K
Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP 2023 Findings.
hf-starter
General starter code for building custom model architectures with the Hugging Face Transformers library.
LCKV
Layer-Condensed KV cache: 10× larger batch sizes with fewer parameters and less computation. Dramatic speedup with better task performance. Accepted to ACL 2024.
nni-slurm
A patch that adds Slurm and W&B support to NNI.
Probabilistic-Transformer
A probabilistic model for contextual word representation. Accepted to ACL 2023 Findings.
tinyllama
A side project that applies all the acceleration tricks from TinyLlama, with minimal modifications to the Hugging Face Transformers code.
tinyllama-zh
A side project that pretrains a TinyLlama on Chinese corpora, with minimal modifications to the Hugging Face Transformers code.