Pinned Repositories
attention-surgery
Attention surgery for LLMs
chessformer
Chessformer
flash-attention
Fast and memory-efficient exact attention
flashT5
A fast implementation of T5/UL2 in PyTorch using Flash Attention
portfolio
Personal Portfolio in Machine Learning
simple-decoder
A simple yet optimized decoder-only architecture
rustlm
RustLM: An efficient Rust CTC decoder supporting external language models
triton-rust
An API for interfacing the NVIDIA Triton Inference Server with Rust
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration