wzzju's Stars
huggingface/nanotron
Minimalistic large language model 3D-parallelism training
huggingface/candle
Minimalist ML framework for Rust
AIMPED/plotly_dash
A collection of small dash apps which I created for learning purposes. Some of them answer questions asked on the plotly forum. https://community.plotly.com/
liguodongiot/llm-action
This project aims to share the technical principles behind large language models, along with hands-on experience.
mli/paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
huggingface/safetensors
Simple, safe way to store and distribute tensors
KEKE046/mlir-tutorial
Hands-On Practical MLIR Tutorial
pabloariasal/modern-cmake-sample
Example library that shows best practices and proper usage of CMake by using targets
QidiLiu/project-example
Project template
DefTruth/Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
immich-app/immich
High performance self-hosted photo and video management solution.
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
sshaoshuai/MTR
MTR: Motion Transformer with Global Intention Localization and Local Movement Refinement, NeurIPS 2022.
ztxz16/fastllm
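A pure C++ cross-platform LLM acceleration library, callable from Python; ChatGLM-6B-class models can reach 10000+ tokens/s on a single card; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices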
bigscience-workshop/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
flet-dev/flet
Flet enables developers to easily build realtime web, mobile and desktop apps in Python. No frontend experience required.
ggerganov/llama.cpp
LLM inference in C/C++
Mycenae/PaperWeekly
Papers for CNN, object detection, keypoint detection, semantic segmentation, medical image processing, SLAM, etc.
HiveChat/hive-desktop
🐝 A small LAN chat app
siboehm/ShallowSpeed
Small scale distributed training of sequential deep learning models, built on Numpy and MPI.
milahu/awesome-qt6
amhndu/SimpleNES
An NES emulator in C++
daohu527/dig-into-apollo
Apollo notes: Apollo learning notes for beginners.
Visualize-ML/Book4_Power-of-Matrix
Book 4, "Power of Matrix" | Iris Book series: from basic arithmetic to machine learning; now on sale!
PaddlePaddle/PaddleHub
Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please be patient]
facebookincubator/AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
nod-ai/pandas-mlir
Bridging Pandas and MLIR ecosystems