Pinned Repositories
MegEngine
MegEngine is a fast, scalable, easy-to-use deep learning framework with automatic differentiation support
Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
PaddleNLP
👑 Easy-to-use and powerful NLP and LLM library with 🤗 an awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
TensorFlow.NET
.NET Standard bindings for Google's TensorFlow for developing, training and deploying Machine Learning models in C# and F#.
answer-of-ImageProcessing100Wen
Answers to ImageProcessing100Wen (Image Processing 100 Questions)
employee-management-system-implemented-by-CPP
A reproduction of the Heima Programmer C++ tutorial on Bilibili
Huffman-coding
Huffman encoding and decoding implemented in Python
mnist_alexnet_we_app
TinyTensor
web_app
Wanglongzhi2001's Repositories
Wanglongzhi2001/Awesome-LLM-System-Papers
Wanglongzhi2001/BitNet
Official inference framework for 1-bit LLMs
Wanglongzhi2001/CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Wanglongzhi2001/CustomPaddleCudaKernel
Custom CUDA kernels for Paddle, used for learning CUDA as well as benchmarking, profiling, and comparing numerical precision
Wanglongzhi2001/cutlass-kernels
Wanglongzhi2001/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
Wanglongzhi2001/EAGLE
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Wanglongzhi2001/FastDeploy
⚡️ An easy-to-use and fast deep learning model deployment toolkit for ☁️ cloud, 📱 mobile, and 📹 edge. Covers 20+ mainstream image, video, text, and audio scenarios and 150+ SOTA models, with end-to-end optimization and multi-platform, multi-framework support.
Wanglongzhi2001/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
Wanglongzhi2001/flashinfer
FlashInfer: Kernel Library for LLM Serving
Wanglongzhi2001/flux
Official inference repo for FLUX.1 models
Wanglongzhi2001/gpt2-from-scratch
Build and Train a GPT-2 from scratch using PyTorch
Wanglongzhi2001/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance
Wanglongzhi2001/KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Wanglongzhi2001/llumnix
Efficient and easy multi-instance LLM serving
Wanglongzhi2001/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Wanglongzhi2001/MagPy
Wanglongzhi2001/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Wanglongzhi2001/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
Wanglongzhi2001/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Wanglongzhi2001/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
Wanglongzhi2001/PaddleNLP
👑 Easy-to-use and powerful NLP and LLM library with 🤗 an awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
Wanglongzhi2001/sglang
SGLang is a fast serving framework for large language models and vision language models.
Wanglongzhi2001/Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
Wanglongzhi2001/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
Wanglongzhi2001/stable-fast
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Wanglongzhi2001/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Wanglongzhi2001/weekly
Technology Enthusiast Weekly, published every Friday
Wanglongzhi2001/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
Wanglongzhi2001/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.