DewEfresh's Repositories
DewEfresh/BitNet
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
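For context, the core idea BitNet implements is a drop-in "BitLinear" layer: weights are centered, binarized to ±1, and rescaled by their mean absolute value on the forward pass, while a straight-through estimator keeps gradients full-precision during training. A minimal PyTorch sketch of that weight-binarization step follows; the class name is illustrative (not this repo's API), and the paper's 8-bit activation quantization and SubLN normalization are omitted for brevity.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BitLinearSketch(nn.Linear):
        # Minimal sketch of BitNet-style 1-bit weights; not the repo's actual class.
        def forward(self, x):
            w = self.weight
            alpha = w.mean()              # center the weights before taking the sign
            beta = w.abs().mean()         # per-tensor scale restores overall magnitude
            w_bin = torch.sign(w - alpha) * beta    # weights live in {-beta, +beta}, ~1 bit each
            w_q = w + (w_bin - w).detach()          # straight-through estimator for training
            return F.linear(x, w_q, self.bias)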
DewEfresh/BitNet-Transformers
0️⃣1️⃣🤗 BitNet-Transformers: Hugging Face Transformers implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch with the Llama(2) architecture
DewEfresh/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
DewEfresh/grokadamw
A new optimizer
DewEfresh/Kosmos2.5
My implementation of Kosmos2.5 from the paper "KOSMOS-2.5: A Multimodal Literate Model"
DewEfresh/Magic-AI-Wiki
DewEfresh/mamba-1.58bits
DewEfresh/matmulfreellm
Implementation of the MatMul-free LM.
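The MatMul-free LM builds on BitNet b1.58-style ternary weights: each weight is rounded to {-1, 0, +1} so that, with suitable kernels, a dense layer needs only additions and subtractions rather than multiplications. A hedged sketch of just that quantization step is below; the class name is illustrative, and the actual work also replaces attention with a MatMul-free token mixer, which this sketch does not attempt.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TernaryLinearSketch(nn.Linear):
        # Illustrative ternary ({-1, 0, +1}) weight quantization; not the repo's actual class.
        def forward(self, x):
            w = self.weight
            scale = w.abs().mean().clamp(min=1e-5)            # absmean scaling, as in BitNet b1.58
            w_ternary = torch.clamp(torch.round(w / scale), -1, 1) * scale
            w_q = w + (w_ternary - w).detach()                # straight-through estimator
            # with true ternary kernels this F.linear would reduce to adds/subtracts
            return F.linear(x, w_q, self.bias)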
DewEfresh/MS-AMP
Microsoft Automatic Mixed Precision Library
DewEfresh/qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
DewEfresh/relora
Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates"
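The ReLoRA loop is easy to sketch: train a low-rank adapter on a frozen weight, periodically fold it into the base weight, then reinitialize the factors so that successive low-rank updates compose into a high-rank one. A minimal PyTorch sketch under those assumptions follows; class and method names are illustrative, and the paper's partial optimizer-state reset and jagged learning-rate schedule are omitted.

    import torch
    import torch.nn as nn

    class ReLoRALinearSketch(nn.Module):
        # Illustrative ReLoRA wrapper; not the paper's official implementation.
        def __init__(self, base: nn.Linear, rank: int = 8):
            super().__init__()
            self.base = base
            self.base.weight.requires_grad_(False)   # base weight is frozen between merges
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: starts as a no-op

        def forward(self, x):
            return self.base(x) + x @ self.A.t() @ self.B.t()

        @torch.no_grad()
        def merge_and_reinit(self):
            # fold the learned low-rank update into the frozen base weight...
            self.base.weight += self.B @ self.A
            # ...then restart fresh factors so the next cycle can add new directions
            nn.init.normal_(self.A, std=0.01)
            nn.init.zeros_(self.B)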
DewEfresh/RWKV-infctx-trainer
RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!
DewEfresh/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities