pprp's Stars
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
abhisheknaiidu/awesome-github-profile-readme
😎 A curated list of awesome GitHub profiles, updated in real time
outlines-dev/outlines
Structured Text Generation
TimDettmers/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
fchollet/ARC-AGI
The Abstraction and Reasoning Corpus
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen2.x, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
ridgerchu/matmulfreellm
Implementation for MatMul-free LM.
verazuo/jailbreak_llms
[CCS'24] A dataset consisting of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
yingDev/Tickeys
Instant audio feedback for typing. macOS version. (Rust)
AGI-Edgerunners/LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
AgentEra/Agently-Daily-News-Collector
An open-source, LLM-based workflow showcase for automatic daily news collection, powered by the Agently AI application development framework.
pratyushasharma/laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Infini-AI-Lab/Sequoia
Scalable and robust tree-based speculative decoding algorithm
mit-han-lab/Quest
[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
AutoSurveys/AutoSurvey
xuyuzhuang11/OneBit
The homepage of OneBit model quantization framework.
SkyworkAI/Skywork-MoE
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
pprp/Pruner-Zero
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
htqin/BiBench
[ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.
ThisisBillhe/EfficientDM
[ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models"
tanganke/fusion_bench
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
ModelTC/Outlier_Suppression_Plus
Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
TianjinYellow/EdgeDeviceLLMCompetition-Starting-Kit
Gaffey/ExCP
Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking".
xufangzhi/Symbol-LLM
[ACL 2024] The project of Symbol-LLM
2404589803/hf-daily-paper-newsletter-chinese
HF🤗 daily briefing bot
snu-mllab/LayerMerge
Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML'24)
Nicolas-BZRD/llm-recipes
yyyujintang/Awesome-VideoLLM-Papers
This repository compiles a list of papers related to Video LLM.
paulohenriquecrs/RelevAI-Reviewer