polarispw's Stars
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
state-spaces/mamba
Mamba SSM architecture
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
HanGuo97/lq-lora
nomic-ai/nomic
Interact with, analyze, and structure massive text, image, embedding, audio, and video datasets
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
HuangOwen/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
princeton-nlp/MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
OpenMatch/UniVL-DR
[ICLR 2023] Code repo for the paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval".
horseee/LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
hkust-nlp/ceval
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
yuchenlin/rebiber
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
brucefan1983/CUDA-Programming
Sample codes for my CUDA programming book
ggerganov/ggml
Tensor library for machine learning
BlinkDL/RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). Now at RWKV-7 "Goose", it combines the best of RNNs and transformers: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embeddings.
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
ggerganov/llama.cpp
LLM inference in C/C++
wangzhaode/mnn-llm
LLM deployment project based on MNN.
QC-LY/Prompt-Tuning-For-Sentiment-Classification
Code for the Internship at NEU-NLP
alibaba/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
Tencent/TNN
TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by its cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, while drawing on the extensibility and high performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome; collaborate with us to make TNN a better framework.
storage-db/ToolDiy
A guide to tools and ready-to-use configurations, aimed at helping everyone choose and get started with the right tools.
jaywcjlove/reference
Quick-reference cheat sheets for developers
fluctlight001/SampleCPU
jgraph/drawio-desktop
Official electron build of draw.io
NiuTrans/MTBook
Machine Translation: Foundations and Models (《机器翻译:基础与模型》), by Tong Xiao and Jingbo Zhu