jen-pan's Stars
stas00/ml-engineering
Machine Learning Engineering Open Book
cs231n/cs231n.github.io
Public-facing notes page
gpu-mode/lectures
Material for gpu-mode lectures
eureka-research/Eureka
Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
johnma2006/mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
isaac-sim/IsaacGymEnvs
Isaac Gym Reinforcement Learning Environments
hyp1231/awesome-llm-powered-agent
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
facebookresearch/diplomacy_cicero
Code for Cicero, an AI agent that plays the game of Diplomacy with open-domain natural language negotiation.
gpu-mode/resource-stream
GPU programming related news and material links
alxndrTL/mamba.py
A simple and efficient Mamba implementation in pure PyTorch and MLX.
Denys88/rl_games
RL implementations
isaac-sim/OmniIsaacGymEnvs
Reinforcement Learning Environments for Omniverse Isaac Gym
sublee/trueskill
An implementation of the TrueSkill rating system for Python
lucidrains/ring-attention-pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
apoorvumang/prompt-lookup-decoding
srush/annotated-mamba
Annotated version of the Mamba paper
linjames0/Transformer-CUDA
An implementation of the transformer architecture as an NVIDIA CUDA kernel
shreyansh26/FlashAttention-PyTorch
Implementation of FlashAttention in PyTorch
PeaBrane/mamba-tiny
Simple, minimal implementation of the Mamba SSM in one PyTorch file, using logcumsumexp (Heisen sequence).
tlc-pack/libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
kotoba-tech/kotomamba
Mamba training library developed by Kotoba Technologies
NVIDIA/online-softmax
Benchmark code for the "Online normalizer calculation for softmax" paper
tanaymeh/mamba-train
A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM
johnma2006/candle
Deep learning library implemented from scratch in NumPy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments.
google-deepmind/diplomacy
andyzoujm/breaking-llama-guard
Code to break Llama Guard
sgiraz/CUDA-Training
Some CUDA projects and utilities
debowin/cuda-parallel-scan-prefix-sum
An implementation of a work-efficient Parallel Prefix-Sum (Scan) algorithm on the GPU.
TasnimAK/BPE-Vocabulary-Builder
An implementation of Byte Pair Encoding (BPE), a data compression technique that can also be used for efficient subword tokenization in natural language processing tasks