Pinned Repositories
MegEngine
MegEngine is a fast, scalable, easy-to-use deep learning framework with automatic differentiation support
Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
PaddleNLP
👑 Easy-to-use and powerful NLP and LLM library with 🤗 an awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
TensorFlow.NET
.NET Standard bindings for Google's TensorFlow for developing, training and deploying Machine Learning models in C# and F#.
answer-of-ImageProcessing100Wen
Answers to ImageProcessing100Wen (Image Processing 100 Questions)
employee-management-system-implemented-by-CPP
A reproduction of the Heima Programmer C++ tutorial on Bilibili
Huffman-coding
Huffman encoding and decoding implemented in Python
mnist_alexnet_we_app
TinyTensor
web_app
Wanglongzhi2001's Repositories
Wanglongzhi2001/Awesome-LLM-System-Papers
Wanglongzhi2001/BitNet
Official inference framework for 1-bit LLMs
Wanglongzhi2001/CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Wanglongzhi2001/CustomPaddleCudaKernel
Custom CUDA kernels for Paddle, used for learning CUDA as well as benchmarking, profiling, and comparing numerical precision
Wanglongzhi2001/cutlass-kernels
Wanglongzhi2001/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
Wanglongzhi2001/EAGLE
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Wanglongzhi2001/FastDeploy
⚡️ An easy-to-use and fast deep learning model deployment toolkit for ☁️ cloud, 📱 mobile, and 📹 edge. Covers 20+ mainstream image, video, text, and audio scenarios and 150+ SOTA models, with end-to-end optimization and multi-platform, multi-framework support.
Wanglongzhi2001/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
Wanglongzhi2001/flashinfer
FlashInfer: Kernel Library for LLM Serving
Wanglongzhi2001/flux
Official inference repo for FLUX.1 models
Wanglongzhi2001/gpt2-from-scratch
Build and Train a GPT-2 from scratch using PyTorch
Wanglongzhi2001/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance
Wanglongzhi2001/KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Wanglongzhi2001/llumnix
Efficient and easy multi-instance LLM serving
Wanglongzhi2001/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Wanglongzhi2001/MagPy
Wanglongzhi2001/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Wanglongzhi2001/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
Wanglongzhi2001/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Wanglongzhi2001/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
Wanglongzhi2001/PaddleNLP
👑 Easy-to-use and powerful NLP and LLM library with 🤗 an awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
Wanglongzhi2001/sglang
SGLang is a fast serving framework for large language models and vision language models.
Wanglongzhi2001/Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
Wanglongzhi2001/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
Wanglongzhi2001/stable-fast
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Wanglongzhi2001/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Wanglongzhi2001/weekly
Technology Enthusiast Weekly, published every Friday
Wanglongzhi2001/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
Wanglongzhi2001/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.