wangtianxia-sjtu

A true fan

膜都

wangtianxia-sjtu's Stars

QwenLM/Qwen2.5-Coder
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.
Language: Python 70058
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
4.7k stars, 255 forks
jingyaogong/minimind
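"Large model": train a small 26M-parameter GPT entirely from scratch in 3 hours; a personal GPU is enough for both inference and training!
Language: Python, 2.4k stars, 278 forks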
ekondis/gpumembench
A GPU benchmark suite for assessing on-chip GPU memory bandwidth
Language: C++ 9923
intel/xFasterTransformer
Language: C++ 36862
NVIDIA/cccl
CUDA Core Compute Libraries
Language: C++, 1.2k stars, 149 forks
lllyasviel/Fooocus
Focus on prompting and generating
Language: Python, 40.8k stars, 5.7k forks
karpathy/build-nanogpt
Video+code lecture on building nanoGPT from scratch
Language: Python, 3.5k stars, 485 forks
xtensor-stack/xsimd
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
Language: C++, 2.2k stars, 254 forks
karpathy/LLM101n
LLM101n: Let's build a Storyteller
29.4k stars, 1.6k forks
FlagOpen/FlagPerf
FlagPerf is an open-source software platform for benchmarking AI chips.
Language: Python 305102
NVIDIA/gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Language: C++ 871143
ProjectMitosisOS/dmerge-eurosys24-ae
Artifact evaluation repo for EuroSys'24.
Language: Python 192
smartnickit-project/smartnic-bench
A rust-based benchmark for BlueField SmartNICs.
Language: Rust 264
boostorg/compute
A C++ GPU Computing Library for OpenCL
Language: C++, 1.6k stars, 333 forks
lipracer/cuda-rt-hook
Language: C++ 2511
LargeWorldModel/LWM
Language: Python, 7.1k stars, 550 forks
3b1b/manim
Animation engine for explanatory math videos
Language: Python, 68.2k stars, 6.1k forks
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Language: C, 6.3k stars, 1.8k forks
ccfddl/ccf-deadlines
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet). If you find it useful, please star this project, thanks~
Language: Vue, 6.1k stars, 429 forks
vosen/ZLUDA
CUDA on non-NVIDIA GPUs
Language: Rust, 9.5k stars, 623 forks
haoliuhl/ringattention
Transformers with Arbitrarily Large Context
Language: Python 62748
andravin/wincnn
Winograd minimal convolution algorithm generator for convolutional neural networks.
Language: Python 601145
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python, 21.9k stars, 2.1k forks
NVIDIA/cuda-checkpoint
CUDA checkpoint and restore utility
Language: Cuda 21210
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python, 13.8k stars, 1.3k forks
SJTU-IPADS/Bamboo
Bamboo-7B Large Language Model
881
microsoft/DirectML
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
Language: C++, 2.2k stars, 293 forks
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C++, 7.9k stars, 409 forks
halpz/re3
Language: C++, 2.4k stars, 413 forks
