derekwin
EngD student @ SDU, focusing on Cloud Native, HPC Network Protocols, GPU, Distributed Memory, eBPF, Congestion Control, Reinforcement Learning, etc.
CS@SDU
Qingdao, China
derekwin's Stars
commaai/openpilot
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
numpy/numpy
The fundamental package for scientific computing with Python.
Byaidu/PDFMathTranslate
PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker/Zotero
pybind/pybind11
Seamless operability between C++11 and Python
huggingface/smolagents
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
triton-lang/triton
Development repository for the Triton language and compiler
richards199999/Thinking-Claude
Let your Claude be able to think
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
cupy/cupy
NumPy & SciPy for GPU
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
cpp-best-practices/cppbestpractices
Collaborative Collection of C++ Best Practices. This online resource is part of Jason Turner's collection of C++ Best Practices resources. See README.md for more information.
mixxxdj/mixxx
Mixxx is Free DJ software that gives you everything you need to perform live mixes.
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
ysymyth/ReAct
[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models
punica-ai/punica
Serving multiple LoRA-finetuned LLMs as one
NVIDIA/gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
eunomia-bpf/bpftime
Userspace eBPF runtime for Observability, Network & General Extensions Framework
Mellanox/libvma
Linux user space library for network socket acceleration based on RDMA compatible network adaptors
p12tic/cppreference-doc
C++ standard library reference
facebookincubator/dynolog
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also integrates with pytorch and can trigger traces for distributed training applications.
IBM/tensorflow-large-model-support
Large Model Support in Tensorflow
AIFM-sys/AIFM
AIFM: High-Performance, Application-Integrated Far Memory
hyperai/triton-cn
Triton documentation in Simplified Chinese
Sys-KU/DeepPlan
Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access (ACM EuroSys '23)
0x5ec1ab/gpu-tlb
DataManagementLab/RDMA_synchronization
This is the source code for our (Tobias Ziegler, Jacob Nelson-Slivon, Carsten Binnig and Viktor Leis) published paper at SIGMOD’23: Design Guidelines for Correct, Efficient, and Scalable Synchronization using One-Sided RDMA
wangchenxi7/Atlas
0x5ec1ab/invalidate-compare
joonspk-research/gabm-stanford-main
liuweiseu/400GbE_Demo