Pinned Repositories
CRATE
Code for CRATE (Coding RAte reduction TransformEr).
CUDA-Learn-Note
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
GaLore
github-slideshow
A robot powered training repository 🤖
hallow
i just wanna learn deep learning
llama2.c
Inference Llama 2 in one file of pure C
OpenMMLabCamp
paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
ScanNet_Vis
Transformer-Series
hmxiong's Repositories
hmxiong/Transformer-Series
hmxiong/OpenMMLabCamp
hmxiong/paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
hmxiong/CRATE
Code for CRATE (Coding RAte reduction TransformEr).
hmxiong/CUDA-Learn-Note
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
hmxiong/GaLore
hmxiong/github-slideshow
A robot powered training repository 🤖
hmxiong/hallow
i just wanna learn deep learning
hmxiong/llama2.c
Inference Llama 2 in one file of pure C
hmxiong/ScanNet_Vis
hmxiong/Tarurs
competition files
hmxiong/pytorch-distributed-training
Simple tutorials on PyTorch DDP training
hmxiong/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
hmxiong/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)