Pinned Repositories
CompeteSMoE
Code for the paper "CompeteSMoE: Effective Sparse Mixture of Experts Training via Competition"
crystalcoder-train
Pre-training code for CrystalCoder 7B LLM
llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
DL_project
Deep learning class project -- A rational search engine
EEG-Cross-Subject-Emotion-Recognition
eegnet_pytorch
EEGNet implementation in PyTorch
EMoE
Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL 2024]
HMA
HMA: Heterogeneous Memory Augmented Neural Networks
MemLMM
An implementation of model-parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Tuning-keys-v.s.-values
Official PyTorch Implementation of Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers [Tiny Paper @ ICLR 2024]
qiuzh20's Repositories