wenyawei

A Computer Science Student

TU Kaiserslautern, Kaiserslautern, Germany

wenyawei's Stars

facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook · 48.4k 314 6805.7k
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python · 38.3k 382 3206.2k
pybind/pybind11
Seamless operability between C++11 and Python
Language: C++ · 16k 250 2.1k2.1k
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Language: Python · 15.2k 112 1.1k1.2k
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python · 14.9k 123 1.2k1.4k
state-spaces/mamba
Mamba SSM architecture
Language: Python · 13.7k 101 5831.2k
chenzomi12/AISystem
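AISystem mainly refers to AI systems, covering the full stack of low-level AI technologies, including AI chips, AI compilers, and AI inference and training frameworks.
Language: Jupyter Notebook · 11.8k 153 411.7k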
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Language: Python · 11k 70 108698
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Language: Python · 11k 168 8132.5k
FMInference/FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
Language: Python · 9.1k 111 81540
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Language: Python · 8.9k 77 580634
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Language: Python · 8.5k 100 1.2k1.4k
huggingface/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
Language: Python · 8.1k 98 1.7k1k
facebookresearch/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Language: Python · 6.7k 97 6941.2k
InternLM/InternLM
Official release of InternLM2.5 base and chat models. 1M context support
Language: Python · 6.6k 59 339464
baichuan-inc/Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Language: Python · 5.7k 68 128506
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python · 5.3k 50 456400
fundamentalvision/BEVFormer
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Language: Python · 3.5k 72 274560
mit-han-lab/bevfusion
[ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Language: Python · 2.4k 41 607438
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
Language: Python · 2.1k 33 369338
andikleen/pmu-tools
Intel PMU profiling tools
Language: Python · 2.1k 90 443341
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Language: Python · 1.5k 26 67110
databricks/megablocks
Language: Python · 1.2k 16 62176
NVIDIA/MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
Language: C++ · 1.2k 24 20488
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Language: Python · 1.1k 17 3955
NVIDIA-Merlin/Merlin
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
Language: Python · 788 34 443119
NVIDIA/nvbench
CUDA Kernel Benchmarking Library
Language: Cuda · 541 19 10169
bytedance/ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
Language: C++ · 466 10 1038
MegEngine/mperf
mperf is an operator performance tuning toolbox for mobile and embedded platforms.
Language: C++ · 174 7 1528
sunlex0717/DissectingTensorCores
Language: Cuda · 81 3 619
