ClubieDong

I'm a graduate student at Nanjing University, interested in accelerating ML training/inference.

Nanjing University, Nanjing, China

ClubieDong's Stars

markverick/ns3-ospf
Simplified, native OSPFv2 implementation on ns-3's external module for research purpose.
Language:C++4
HigherOrderCO/Bend
A massively parallel, high-level programming language
Language: Rust, 17.9k stars, 439 forks
pytorch/kineto
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
Language:HTML755170
facebookresearch/HolisticTraceAnalysis
A library to analyze PyTorch traces.
Language:Python31945
openucx/ucc
Unified Collective Communication Library
Language:C221103
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.
Language: Jupyter Notebook, 15.8k stars, 2.3k forks
rendercv/rendercv
The engine of the RenderCV App
Language: Python, 2.1k stars, 179 forks
AnthonyCalandra/modern-cpp-features
A cheatsheet of modern C++ language and library features.
20k stars, 2.1k forks
ClubieDong/QAQ-KVCacheQuantization
QAQ: Quality Adaptive Quantization for LLM KV Cache
Language:Python437
merrymercy/awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
2.5k stars, 308 forks
cabaletta/baritone
google maps for block game
Language: Java, 7.4k stars, 1.5k forks
NiHoel/Anno1800Calculator
Calculator for the production and consumption of goods in the computer game Anno 1800
Language:JavaScript10528
KarlsruheMIS/KaMIS
Maximum independent sets and vertex covers of large sparse graphs.
Language:C++7327
ChatGPTNextWeb/ChatGPT-Next-Web
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini/Claude LLM app with one click.
Language: TypeScript, 78.4k stars, 60k forks
LijunChang/Near-Maximum-Independent-Set
Near-linear time algorithm for computing near-maximum independent set
Language:C++184
iPapatsoris/Maximum-Independent-Set
An exact algorithm for computing the Maximum Independent Set on graphs
Language:C++93
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Language: Python, 7.4k stars, 2k forks
alibaba/clusterdata
cluster data collected from production clusters in Alibaba for cluster management research
Language: Jupyter Notebook, 1.7k stars, 413 forks
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python, 33.4k stars, 5.1k forks
cypress-io/cypress
Fast, easy and reliable testing for anything that runs in a browser.
Language: JavaScript, 47.6k stars, 3.2k forks
gabime/spdlog
Fast C++ logging library.
Language: C++, 24.9k stars, 4.6k forks
michalusio/screeps-async-example
Using some fancy Rollup plugins to convert async/await into generators under the hood!
Language:TypeScript1
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
Language: Python, 2.1k stars, 341 forks
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python, 36.2k stars, 4.2k forks
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++, 6k stars, 895 forks
IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Language: Python, 2k stars, 161 forks
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language: Python, 146k stars, 27.3k forks
jupyter-xeus/xeus-cling
Jupyter kernel for the C++ programming language
Language: C++, 3.1k stars, 302 forks
bencbartlett/screeps-packrat
Lightning-fast and memory-efficient serialization of Screeps IDs, Coords, and RoomPositions
Language:JavaScript318
mgth/LittleBigMouse
DPI Aware mouse move across screens
Language: C#, 4.3k stars, 200 forks
