SirlyDreamer's Stars
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
seerge/g-helper
Lightweight Armoury Crate alternative for Asus laptops and ROG Ally. Control tool for ROG Zephyrus G14, G15, G16, M16, Flow X13, Flow X16, TUF, Strix, Scar and other models
microsoft/T-MAC
Low-bit LLM inference on CPU with lookup table
hkust-nlp/dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
ClConstantine/CS106L
wandb/openui
OpenUI lets you describe a UI using your imagination, then see it rendered live.
HSLiu-Initial/CtrlA
This repository contains the original implementation of CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control.
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
ray-project/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
louislam/uptime-kuma
A fancy self-hosted monitoring tool
fbelavenuto/arpl
Automated Redpill Loader
meta-llama/llama3
The official Meta Llama 3 GitHub site
BlairLeng/Expression
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
ClConstantine/CCNU-Beamer-Theme
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
lllyasviel/sd-forge-layerdiffuse
[WIP] Layer Diffusion for WebUI (via Forge)
SpursGoZmy/IM-TQA
Dataset and code for the ACL 2023 paper "IM-TQA: A Chinese Table Question Answering Dataset with Implicit and Multi-type Table Structures". We propose a new TQA problem oriented toward real application scenarios, together with a supporting dataset and a baseline method.
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
huggingface/text-embeddings-inference
A blazing-fast inference solution for text embedding models
huggingface/text-generation-inference
Large Language Model Text Generation Inference
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
ChatGPTNextWeb/ChatGPT-Next-Web
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app with one click.
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Jittor/JittorLLMs
A large language model inference library based on Jittor, featuring high performance, low hardware requirements, good Chinese support, and portability.
clue-ai/PromptCLUE
PromptCLUE: a zero-shot learning model with support for all-Chinese tasks.
kubeflow/kubeflow
Machine Learning Toolkit for Kubernetes
deepseek-ai/DeepSeek-LLM
DeepSeek LLM: Let there be answers