Pinned Repositories
CompeteSMoE
Code for the paper "CompeteSMoE - Effective Sparse Mixture of Experts Training via Competition" (a generic sparse MoE sketch follows this list)
llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
DL_project
Deep learning class project -- A rational search engine
EEG-Cross-Subject-Emotion-Recognition
eegnet_pytorch
EEGNet implementation in PyTorch
EMoE
Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL 2024]
HMA
HMA: Heterogeneous Memory Augmented Neural Networks
MemLMM
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
RMoE
Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)
Tuning-keys-v.s.-values
Official PyTorch Implementation of Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers [Tiny Paper @ ICLR 2024]
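Several of the pinned repositories (CompeteSMoE, llama-moe, EMoE, RMoE) build on sparse Mixture-of-Experts layers. As shared background, here is a minimal sketch of a generic top-k softmax-routed MoE block in PyTorch. It is illustrative only and not the method of any repository above: CompeteSMoE replaces the plain learned gate with expert competition, and RMoE adds a layerwise recurrent router. All names below (TopKMoE, d_hidden, etc.) are assumptions for the sketch.

```python
# Minimal top-k softmax-routed sparse MoE layer (background sketch only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # learned gate
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        logits = self.router(x)                      # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)   # keep k experts per token
        weights = F.softmax(weights, dim=-1)         # renormalize over the kept k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            rows, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if rows.numel() > 0:
                out[rows] += weights[rows, slot].unsqueeze(-1) * expert(x[rows])
        return out

moe = TopKMoE(d_model=64, d_hidden=256)
y = moe(torch.randn(10, 64))  # -> (10, 64)
```

Only k of the n_experts expert MLPs run per token, which is what makes the layer sparse; the repositories above differ mainly in how the routing decision is made and trained.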
qiuzh20's Repositories
qiuzh20/EMoE
Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL 2024]
qiuzh20/RMoE
Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)
qiuzh20/EEG-Cross-Subject-Emotion-Recognition
qiuzh20/HMA
HMA: Heterogeneous Memory Augmented Neural Networks
qiuzh20/Tuning-keys-v.s.-values
Official PyTorch Implementation of Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers [Tiny Paper @ ICLR 2024] (see the key-value memory sketch after this list)
qiuzh20/DL_project
Deep learning class project -- A rational search engine
qiuzh20/eegnet_pytorch
EEGNet implementation in PyTorch
qiuzh20/MemLMM
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
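The Tuning-keys-v.s.-values repository builds on the key-value memory reading of transformer feed-forward layers (Geva et al., 2021): rows of the first projection act as keys matched against the hidden state, and rows of the second as values mixed by the resulting activation scores. A minimal PyTorch sketch of that view follows; the names (KeyValueFFN, n_memories) are illustrative assumptions, not the repository's API.

```python
# Key-value memory view of a transformer FFN (illustrative sketch).
import torch
import torch.nn as nn

class KeyValueFFN(nn.Module):
    def __init__(self, d_model: int, n_memories: int):
        super().__init__()
        self.keys = nn.Linear(d_model, n_memories, bias=False)    # first projection: "keys"
        self.values = nn.Linear(n_memories, d_model, bias=False)  # second projection: "values"

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = torch.relu(self.keys(x))  # how strongly each memory slot fires
        return self.values(scores)         # weighted sum of value vectors

ffn = KeyValueFFN(d_model=64, n_memories=256)
y = ffn(torch.randn(10, 64))

# "Tuning keys vs. values" then reduces to freezing one projection and
# updating the other, e.g. to update keys only:
for p in ffn.values.parameters():
    p.requires_grad = False
```

Under this framing, the study's question is which half of the FFN memory is more effective to update during fine-tuning: the keys (what triggers a memory) or the values (what the memory writes back).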