Pinned Repositories
annotated-mamba
Annotated version of the Mamba paper
Awesome-state-space-models
Collection of papers on state-space models
benchmark_sequence_modeling
Curse-of-memory
Curse-of-memory phenomenon of RNNs in sequence modelling
flash-fft-conv
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
mamba
mamba-minimal-jax
profiling-cuda-in-torch
radarFudan.github.io
S6
Figure out what's next for S6
radarFudan's Repositories
radarFudan/Awesome-state-space-models
Collection of papers on state-space models
radarFudan/mamba-minimal-jax
radarFudan/Curse-of-memory
Curse-of-memory phenomenon of RNNs in sequence modelling
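The effect this repo studies shows up already in a toy linear RNN: with a stable recurrence, an input's influence on later hidden states decays exponentially with distance. A minimal NumPy sketch (the scalar recurrence and decay rate are illustrative, not the repo's experimental setup):

```python
import numpy as np

# Toy linear RNN: h_t = lam * h_{t-1} + x_t, stable when |lam| < 1.
lam, T = 0.9, 100

h, trace = 0.0, []
for t in range(T):
    x_t = 1.0 if t == 0 else 0.0  # unit impulse at t = 0
    h = lam * h + x_t
    trace.append(h)

# The impulse response is lam**t: memory of early inputs fades
# exponentially, which is the "curse of memory".
print(np.allclose(trace, [lam**t for t in range(T)]))  # True
```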
radarFudan/mamba
radarFudan/profiling-cuda-in-torch
radarFudan/radarFudan.github.io
radarFudan/annotated-mamba
Annotated version of the Mamba paper
radarFudan/flash-fft-conv
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
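The primitive FlashFFTConv accelerates is long causal convolution, which the FFT computes in O(L log L). A minimal PyTorch sketch of the identity (zero-padding to 2L turns circular convolution into linear convolution; this shows the math, not the repo's fused tensor-core kernel):

```python
import torch
import torch.nn.functional as F

L = 256
x = torch.randn(L)  # input sequence
k = torch.randn(L)  # long convolution filter

# FFT path: pad to 2L so circular convolution equals linear convolution.
n = 2 * L
y_fft = torch.fft.irfft(torch.fft.rfft(x, n=n) * torch.fft.rfft(k, n=n), n=n)[:L]

# Direct reference: y[t] = sum_{s<=t} k[s] * x[t-s], via flipped cross-correlation.
y_ref = F.conv1d(x.view(1, 1, -1), k.flip(-1).view(1, 1, -1), padding=L - 1)
y_ref = y_ref.view(-1)[:L]

print(torch.allclose(y_fft, y_ref, atol=1e-3))  # True
```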
radarFudan/S5
radarFudan/attention_with_linear_biases
Code for the ALiBi method for transformer language models (ICLR 2022)
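ALiBi drops positional embeddings and instead penalizes attention scores with a head-specific linear function of query-key distance. A minimal sketch of the bias tensor (slopes follow the paper's geometric sequence for power-of-two head counts; an illustration, not the repo's exact code):

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # Head-specific slopes: a geometric sequence ending at 2^-8.
    slopes = torch.tensor([2.0 ** (-8.0 * (i + 1) / n_heads) for i in range(n_heads)])
    pos = torch.arange(seq_len)
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)  # i - j for keys j <= i
    return -slopes[:, None, None] * distance  # (n_heads, seq_len, seq_len)

# Added to the attention logits before the causal mask and softmax:
# scores = q @ k.transpose(-2, -1) / d**0.5 + alibi_bias(H, L)
print(alibi_bias(8, 4)[0])  # head 0, slope 0.5: penalty grows 0.5 per step back
```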
radarFudan/causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
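Functionally, this kernel computes a depthwise (per-channel) 1-D convolution in which each position sees only current and past inputs; in plain PyTorch that is a left-padded F.conv1d with groups equal to channels. A reference sketch (shapes assumed (batch, channels, length), matching the repo's interface only loosely):

```python
import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """x: (batch, channels, length); weight: (channels, kernel_size)."""
    channels, kernel_size = weight.shape
    # Left-pad so output t depends only on x[..., t-k+1 : t+1] (no future leakage);
    # groups=channels makes the convolution depthwise (one filter per channel).
    x = F.pad(x, (kernel_size - 1, 0))
    return F.conv1d(x, weight.unsqueeze(1), groups=channels)

x = torch.randn(2, 4, 16)
w = torch.randn(4, 3)
print(causal_depthwise_conv1d(x, w).shape)  # torch.Size([2, 4, 16])
```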
radarFudan/EffHDC
radarFudan/flash-attention
Fast and memory-efficient exact attention
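The memory saving comes from computing softmax attention block-by-block with a running max and normalizer (online softmax), so the full L×L score matrix never materializes. A PyTorch sketch of that math (single head, no causal mask; the real repo fuses this into a CUDA kernel):

```python
import torch

def attention_tiled(q, k, v, block=32):
    L, d = q.shape
    m = torch.full((L,), float("-inf"))  # running row-wise max of scores
    l = torch.zeros(L)                   # running softmax denominator
    o = torch.zeros(L, d)                # running unnormalized output
    for s in range(0, k.shape[0], block):
        scores = q @ k[s:s + block].T / d ** 0.5           # (L, block)
        m_new = torch.maximum(m, scores.max(dim=-1).values)
        p = torch.exp(scores - m_new[:, None])
        scale = torch.exp(m - m_new)     # rescale old stats to the new max
        l = l * scale + p.sum(dim=-1)
        o = o * scale[:, None] + p @ v[s:s + block]
        m = m_new
    return o / l[:, None]

q, k, v = (torch.randn(128, 32) for _ in range(3))
ref = torch.softmax(q @ k.T / 32 ** 0.5, dim=-1) @ v
print(torch.allclose(attention_tiled(q, k, v), ref, atol=1e-5))  # True
```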
radarFudan/google-research
Google Research
radarFudan/lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
radarFudan/pythia
The hub for EleutherAI's work on interpretability and learning dynamics
radarFudan/RWKV-CUDA
The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM)
radarFudan/SSM_examples
radarFudan/t5-pegasus-pytorch
radarFudan/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
radarFudan/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
radarFudan/triton
Development repository for the Triton language and compiler
radarFudan/flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
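Linear attention removes the softmax, so causal attention collapses into a running outer-product state updated in O(1) per token. A minimal single-head sketch of the recurrence and its parallel equivalent (no feature map, gating, or normalization, which the models in this repo add):

```python
import torch

L, d = 16, 8
q, k, v = (torch.randn(L, d) for _ in range(3))

# Recurrent form: S_t = S_{t-1} + k_t v_t^T,  o_t = S_t^T q_t
S = torch.zeros(d, d)
o_rec = []
for t in range(L):
    S = S + torch.outer(k[t], v[t])
    o_rec.append(S.T @ q[t])
o_rec = torch.stack(o_rec)

# Parallel (causal) form: (Q K^T, lower-triangular masked) V
o_par = (q @ k.T * torch.tril(torch.ones(L, L))) @ v
print(torch.allclose(o_rec, o_par, atol=1e-5))  # True
```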
radarFudan/gateloop-transformer
Implementation of GateLoop Transformer in PyTorch and JAX
radarFudan/in-context-operator-networks
ICON for in-context operator learning
radarFudan/llm.c
LLM training in simple, raw C/CUDA
radarFudan/LongMamba
Some preliminary explorations of Mamba's context scaling.
radarFudan/mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
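At the core of Mamba-2 is a selective state-space recurrence with an input-dependent scalar decay per step, h_t = a_t h_{t-1} + b_t x_t. A sequential sketch of one channel (names and the sigmoid parametrization are illustrative; real implementations batch this and use a parallel scan / the SSD block decomposition):

```python
import torch

L, d_state = 32, 16
x = torch.randn(L)                 # one input channel over time
a = torch.sigmoid(torch.randn(L))  # input-dependent decay in (0, 1)
b = torch.randn(L, d_state)        # input-dependent input projection
c = torch.randn(L, d_state)        # input-dependent output projection

# Selective SSM: h_t = a_t * h_{t-1} + b_t * x_t,  y_t = <c_t, h_t>
h = torch.zeros(d_state)
y = []
for t in range(L):
    h = a[t] * h + b[t] * x[t]
    y.append(torch.dot(c[t], h))
print(torch.stack(y).shape)  # torch.Size([32])
```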
radarFudan/s4
Structured state space sequence models
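An S4 layer is a linear time-invariant SSM, so over a sequence it is equivalent to a convolution with kernel K_j = C A^j B; S4's contribution is parametrizing and computing this efficiently. A sketch of the equivalence with a small random discrete system (HiPPO initialization and discretization are omitted):

```python
import torch

d_state, L = 4, 32
A = 0.9 * torch.eye(d_state) + 0.05 * torch.randn(d_state, d_state)  # stable-ish
B = torch.randn(d_state, 1)
C = torch.randn(1, d_state)
u = torch.randn(L)

# Recurrent form: x_t = A x_{t-1} + B u_t,  y_t = C x_t
xs = torch.zeros(d_state, 1)
y_rec = []
for t in range(L):
    xs = A @ xs + B * u[t]
    y_rec.append((C @ xs).squeeze())
y_rec = torch.stack(y_rec)

# Convolutional form: y_t = sum_j K_j u_{t-j} with K_j = C A^j B
K = torch.stack([(C @ torch.matrix_power(A, j) @ B).squeeze() for j in range(L)])
y_conv = torch.stack([sum(K[j] * u[t - j] for j in range(t + 1)) for t in range(L)])
print(torch.allclose(y_rec, y_conv, atol=1e-4))  # True
```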
radarFudan/snippets