Pinned Repositories
.vim
My vim configuration
action-automatic-releases
READONLY: Auto-generated mirror for https://github.com/marvinpinto/actions/tree/master/packages/automatic-releases
agora
A universal log collection system
hey
HTTP load generator, ApacheBench (ab) replacement, formerly known as rakyll/boom
Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』/PaddlePaddle core framework for high-performance single-machine and distributed training and cross-platform deployment in deep learning & machine learning)
Paddle-Lite
Multi-platform, high-performance deep learning inference engine for PaddlePaddle (『飞桨』)
Serving
A flexible, high-performance carrier for machine learning models (the PaddlePaddle 『飞桨』 model-serving deployment framework)
task-schedule
A thread pool and task queue implementation for running tasks on multi-core CPUs
tf_serving_client_brpc
A TensorFlow Serving client using bRPC
zhangjun.github.io
https://zhangjun.github.io
zhangjun's Repositories
zhangjun/zhangjun.github.io
https://zhangjun.github.io
zhangjun/ai-chatbot
A full-featured, hackable Next.js AI chatbot built by Vercel
zhangjun/stable_diffusion_compile
Compile Stable Diffusion to run faster
zhangjun/WeChatMsg
Extract WeChat chat history, export it to HTML, Word, or CSV documents for permanent storage, and analyze the chat history to generate an annual chat report
zhangjun/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』/PaddlePaddle core framework for high-performance single-machine and distributed training and cross-platform deployment in deep learning & machine learning)
zhangjun/llm-inference-benchmark
LLM Inference benchmark
zhangjun/llm-quant
zhangjun/llm-tools
zhangjun/llm_chat
zhangjun/llmc
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
zhangjun/my_notes
zhangjun/oneflow-diffusers
OneFlow backend for 🤗 Diffusers and ComfyUI
zhangjun/openai-node
The official Node.js / TypeScript library for the OpenAI API
zhangjun/paper-reading
zhangjun/puck
zhangjun/sarathi-serve
A low-latency & high-throughput serving engine for LLMs
zhangjun/sglang
SGLang is a fast serving framework for large language models and vision language models.
zhangjun/stable-diffusion-webui-docker
Stable Diffusion WebUI in Docker
zhangjun/stable-fast
An ultra lightweight inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
zhangjun/StableTriton
The first open-source Triton inference engine for Stable Diffusion, specifically for SDXL
zhangjun/Taipy-Chatbot-Demo
A template for creating LLM inference web apps using Python only
zhangjun/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
zhangjun/tmp
zhangjun/tmp2
zhangjun/torch-play
zhangjun/torch2trt
An easy to use PyTorch to TensorRT converter
zhangjun/torchtune-example
torchtune, llm
zhangjun/transformer_framework
Framework for plug-and-play use of various transformers (vision and NLP) with FSDP
zhangjun/triton
Development repository for the Triton language and compiler
zhangjun/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs