vasqu's Stars
lucidrains/minGRU-pytorch
Implementation of the proposed minGRU in PyTorch
Adibvafa/CodonTransformer
CodonTransformer: The ultimate tool for codon optimization, optimizing DNA sequences for heterologous protein expression across 164 species.
Modalities/modalities
Modalities, a PyTorch-native framework for distributed and reproducible foundation model training.
goombalab/phi-mamba
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models)
kyutai-labs/moshi
sustcsonglin/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
goombalab/hydra
Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"
Hprairie/Bi-Mamba2
A Triton kernel for incorporating bidirectionality in Mamba2
black-forest-labs/flux
Official inference repo for FLUX.1 models
facebookresearch/sam2
Code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks showing how to use the model.
InternLM/xtuner
An efficient, flexible, and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
LSX-UniWue/SuperGLEBer
German Language Understanding Evaluation Benchmark @NAACL24
marzenakrp/nocha
hsiehjackson/RULER
Source code for RULER: What's the Real Context Size of Your Long-Context Language Models?
karpathy/LLM101n
LLM101n: Let's build a Storyteller
Dao-AILab/causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
HazyResearch/based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
NX-AI/xlstm
Official repository of xLSTM.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
yyyujintang/Awesome-Mamba-Papers
Awesome papers related to Mamba.
microsoft/FILM
Official repo for "Make Your LLM Fully Utilize the Context"
kolinko/effort
An implementation of bucketMul LLM inference
kuleshov-group/caduceus
Bi-Directional Equivariant Long-Range DNA Sequence Modeling
flbbb/locost-summarization
xfactlab/orpo
Official repository for ORPO
facebookresearch/schedule_free
Schedule-Free Optimization in PyTorch
Yale-LILY/DYLE
Repository for ACL'22 paper: Dynamic Latent Extraction for Abstractive Long-Input Summarization
abertsch72/unlimiformer
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
psunlpgroup/Summ-N
Code for ACL 2022 Paper "SUMM^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents"