Pinned Repositories
ai_poem
AI poetry writing
alien_game
A small Python game, a demo from the book Python Crash Course
AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
angry_bird
A clone of the Android game Angry Birds
apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
comm_channel_bench
The ping-pong benchmark of DOCA Comm Channel
dma_bench
A DMA benchmark in BlueField
my_ib_traffic_gen_ib
The IB traffic generator for IB devices. Basically RDMA write/send on the same memory.
notes
some notes
Oobleck
A resilient distributed training framework
ZhuJiaqi9905's Repositories
ZhuJiaqi9905/comm_channel_bench
The ping-pong benchmark of DOCA Comm Channel
ZhuJiaqi9905/dma_bench
A DMA benchmark in BlueField
ZhuJiaqi9905/my_ib_traffic_gen_ib
The IB traffic generator for IB devices. Basically RDMA write/send on the same memory.
ZhuJiaqi9905/notes
some notes
ZhuJiaqi9905/Oobleck
A resilient distributed training framework
ZhuJiaqi9905/AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
ZhuJiaqi9905/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
ZhuJiaqi9905/benchmark
Benchmark of I/O and network in Rust
ZhuJiaqi9905/blog_service
ZhuJiaqi9905/c_rdma_demo
ZhuJiaqi9905/copy_an_os
ZhuJiaqi9905/CppCoreGuidelines
The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++
ZhuJiaqi9905/stack_share
A website for a certain scientific entrepreneurship course
ZhuJiaqi9905/dpdk_engineer_manual
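[Break through the kernel bottleneck and boost I/O performance] DPDK engineer's manual: official documentation, latest videos, open-source projects, practical case studies, papers, internal slide decks from major companies, and a list of well-known engineers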
ZhuJiaqi9905/FlashFlex
Accommodating Large Language Model Training over Heterogeneous Environment.
ZhuJiaqi9905/ib_traffic_gen
ZhuJiaqi9905/learn-kvm
QEMU KVM (Kernel-based Virtual Machine) study notes
ZhuJiaqi9905/leetcode_rust
ZhuJiaqi9905/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
ZhuJiaqi9905/lsm_manager
An LSM manager
ZhuJiaqi9905/Megatron-LM
Ongoing research training transformer models at scale
ZhuJiaqi9905/my_ib_traffic_gen_roce
The IB traffic generator for RoCE. Basically RDMA send/write on the same memory.
ZhuJiaqi9905/my_leetcode
ZhuJiaqi9905/my_vmm
A VMM demo written in Rust, using the rust-vmm crates, which provide bindings to KVM.
ZhuJiaqi9905/nccl
Optimized primitives for collective multi-GPU communication
ZhuJiaqi9905/sicp
ZhuJiaqi9905/simple_kv
ZhuJiaqi9905/Triton-Puzzles
Puzzles for learning Triton
ZhuJiaqi9905/UCAS-enroll
A course enrollment assistant framework in Python.
ZhuJiaqi9905/varuna