DewEfresh's Repositories
DewEfresh/BitNet
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
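For context, the core idea BitNet implements is a drop-in "BitLinear" layer: weights are centered, binarized to ±1, and rescaled by their mean absolute value on the forward pass, while a straight-through estimator keeps gradients full-precision during training. A minimal PyTorch sketch of that weight-binarization step follows; the class name is illustrative (not this repo's API), and the paper's 8-bit activation quantization and SubLN normalization are omitted for brevity.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BitLinearSketch(nn.Linear):
        # Minimal sketch of BitNet-style 1-bit weights; not the repo's actual class.
        def forward(self, x):
            w = self.weight
            alpha = w.mean()              # center the weights before taking the sign
            beta = w.abs().mean()         # per-tensor scale restores overall magnitude
            w_bin = torch.sign(w - alpha) * beta    # weights live in {-beta, +beta}, ~1 bit each
            w_q = w + (w_bin - w).detach()          # straight-through estimator for training
            return F.linear(x, w_q, self.bias)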
DewEfresh/BitNet-Transformers
0️⃣1️⃣🤗 BitNet-Transformers: Hugging Face Transformers implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch with the Llama(2) architecture
DewEfresh/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
DewEfresh/grokadamw
A new optimizer
DewEfresh/Kosmos2.5
My implementation of Kosmos2.5 from the paper "KOSMOS-2.5: A Multimodal Literate Model"
DewEfresh/Magic-AI-Wiki
DewEfresh/mamba-1.58bits
DewEfresh/matmulfreellm
Implementation of the MatMul-free LM.
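The MatMul-free LM builds on BitNet b1.58-style ternary weights: each weight is rounded to {-1, 0, +1} so that, with suitable kernels, a dense layer needs only additions and subtractions rather than multiplications. A hedged sketch of just that quantization step is below; the class name is illustrative, and the actual work also replaces attention with a MatMul-free token mixer, which this sketch does not attempt.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TernaryLinearSketch(nn.Linear):
        # Illustrative ternary ({-1, 0, +1}) weight quantization; not the repo's actual class.
        def forward(self, x):
            w = self.weight
            scale = w.abs().mean().clamp(min=1e-5)            # absmean scaling, as in BitNet b1.58
            w_ternary = torch.clamp(torch.round(w / scale), -1, 1) * scale
            w_q = w + (w_ternary - w).detach()                # straight-through estimator
            # with true ternary kernels this F.linear would reduce to adds/subtracts
            return F.linear(x, w_q, self.bias)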
DewEfresh/MS-AMP
Microsoft Automatic Mixed Precision Library
DewEfresh/qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
DewEfresh/relora
Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates"
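The ReLoRA loop is easy to sketch: train a low-rank adapter on a frozen weight, periodically fold it into the base weight, then reinitialize the factors so that successive low-rank updates compose into a high-rank one. A minimal PyTorch sketch under those assumptions follows; class and method names are illustrative, and the paper's partial optimizer-state reset and jagged learning-rate schedule are omitted.

    import torch
    import torch.nn as nn

    class ReLoRALinearSketch(nn.Module):
        # Illustrative ReLoRA wrapper; not the paper's official implementation.
        def __init__(self, base: nn.Linear, rank: int = 8):
            super().__init__()
            self.base = base
            self.base.weight.requires_grad_(False)   # base weight is frozen between merges
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: starts as a no-op

        def forward(self, x):
            return self.base(x) + x @ self.A.t() @ self.B.t()

        @torch.no_grad()
        def merge_and_reinit(self):
            # fold the learned low-rank update into the frozen base weight...
            self.base.weight += self.B @ self.A
            # ...then restart fresh factors so the next cycle can add new directions
            nn.init.normal_(self.A, std=0.01)
            nn.init.zeros_(self.B)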
DewEfresh/RWKV-infctx-trainer
RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!
DewEfresh/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities