yhwang-hub's Stars
deepseek-ai/DeepSeek-V3
deepseek-ai/DeepSeek-R1
huggingface/open-r1
Fully open reproduction of DeepSeek-R1
kvcache-ai/ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Jiayi-Pan/TinyZero
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
lyogavin/airllm
AirLLM: 70B model inference on a single 4GB GPU
jingyaogong/minimind-v
🚀 Train a 26M-parameter visual multimodal VLM from scratch in just 1 hour! 🌏
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
alibaba/TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
FireRedTeam/FireRedASR
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recognition capability.
triton-inference-server/tutorials
This repository contains tutorials and examples for Triton Inference Server
zeux/calm
CUDA/Metal accelerated language model inference
deeperlearning/professional-cuda-c-programming
Maharshi-Pandya/cudacodes
Learnings and programs related to CUDA
alibaba/ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
godweiyang/GrabGPU
A convenient script for grabbing idle GPUs
andrewkchan/deepseek.cpp
CPU inference for the DeepSeek family of large language models in pure C++
IrohXu/Awesome-Multimodal-LLM-Autonomous-Driving
[WACV 2024 Survey Paper] Multimodal Large Language Models for Autonomous Driving
leimao/ONNX-Runtime-Inference
ONNX Runtime Inference C++ Example
datawhalechina/llm-deploy
Theory and practice of large language model (LLM) inference and deployment
Tongkaio/CUDA_Kernel_Samples
A guide to hand-writing CUDA kernels and preparing for interviews
quic/ai-hub-apps
The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
bytedance/decoupleQ
A quantization algorithm for LLMs
daquexian/faster-rwkv
BBuf/tensorrt-llm-moe
ViffyGwaanl/DeepSeek-Api-Test
There are currently many DeepSeek API providers on the market. Use DeepSeek Api Test to determine which API performs best.
DataXujing/DeepSeek-R1-Android
:fire: Deploying the DeepSeek-R1 distilled 1.5B model on Android phones
caibucai22/awesome-cuda
Awesome code, projects, books, etc. related to CUDA
Shibodd/cpp_scipy_rectangular_lsap
scipy.optimize.linear_sum_assignment edited for straightforward usage in C++ and Eigen
shifan3/TensorRT-LLM-qwen2-vl
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.