Pinned Repositories
- ai_models
- ArchMeasureBench
- BladeDISC: an end-to-end Dynamic Shape Compiler (DISC) project for machine learning workloads.
- CLBlast: tuned OpenCL BLAS.
- flexible-gemm: flexible-gemm conv of deepcore.
- hpc_dev_docs
- miCore
- vim-setup
- WorkTips
xingjinglu's Repositories
- xingjinglu/WorkTips
- xingjinglu/MNN: a lightweight deep neural network inference engine.
- xingjinglu/vim-setup
- xingjinglu/Awesome-GPU: awesome resources for GPUs.
- xingjinglu/Awesome-LLM-Strawberry: a collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
- xingjinglu/baichuan-7B: a large-scale 7B pretrained language model developed by BaiChuan-Inc.
- xingjinglu/ComfyUI: a powerful, modular Stable Diffusion GUI, API, and backend with a graph/nodes interface.
- xingjinglu/FlexGen: running large language models on a single GPU for throughput-oriented scenarios.
- xingjinglu/inference: reference implementations of MLPerf™ inference benchmarks.
- xingjinglu/ios-cmake: a CMake toolchain file for iOS, macOS, watchOS & tvOS C/C++/Obj-C++ development.
- xingjinglu/KsanaLLM
- xingjinglu/lightllm: a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
- xingjinglu/llama: inference code for LLaMA models.
- xingjinglu/LLM-FineTuning-Large-Language-Models: fine-tuning large language models (LLMs).
- xingjinglu/mnn-llm: an LLM deployment project based on MNN.
- xingjinglu/mupdf: MuPDF mirror.
- xingjinglu/netron: a visualizer for neural network, deep learning, and machine learning models.
- xingjinglu/Open-Sora: democratizing efficient video production for all.
- xingjinglu/opencv-mobile: minimal OpenCV for Android, iOS, ARM Linux, Windows, Linux, macOS, and WebAssembly.
- xingjinglu/prompt-cache: modular and structured prompt caching for low-latency LLM inference.
- xingjinglu/PTX-Samples: reproducers for various PTX-related issues.
- xingjinglu/rtp-llm: Alibaba's high-performance LLM inference engine for diverse applications.
- xingjinglu/sgl-learning-materials: materials for learning SGLang.
- xingjinglu/sglang: a fast serving framework for large language models and vision language models.
- xingjinglu/Stable-Diffusion-WebUI-TensorRT: TensorRT extension for the Stable Diffusion Web UI.
- xingjinglu/streaming-llm: efficient streaming language models with attention sinks.
- xingjinglu/TensorRT-LLM: an easy-to-use Python API for defining Large Language Models (LLMs) and building TensorRT engines with state-of-the-art optimizations for efficient inference on NVIDIA GPUs, plus components for creating Python and C++ runtimes that execute those engines.
- xingjinglu/triton: development repository for the Triton language and compiler.
- xingjinglu/tvm: an open deep learning compiler stack for CPUs, GPUs, and specialized accelerators.
- xingjinglu/vllm: a high-throughput and memory-efficient inference and serving engine for LLMs.