dwzhu-pku's Stars
mem0ai/mem0
The Memory layer for your AI apps
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
intel-analytics/ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
jbhuang0604/awesome-tips
deepseek-ai/DeepSeek-Coder-V2
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
multimodal-art-projection/MAP-NEO
zhijing-jin/nlp-phd-global-equality
A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP
microsoft/MInference
To speed up long-context LLM inference, MInference computes attention with approximate, dynamic sparse methods, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.
okhat/blog
TIGER-AI-Lab/LongRAG
Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".
HaozheZhao/UltraEdit
metame-ai/awesome-llm-plaza
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g., LLMs for coding, robotics, reasoning, multimodal, etc.
google-deepmind/loft
LOFT: A 1 Million+ Token Long-Context Benchmark
KbsdJames/Awesome-LLM-Preference-Learning
The official repository of our survey paper: "Towards a Unified View of Preference Learning for Large Language Models: A Survey"
hkust-nlp/llm-compression-intelligence
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
AI21Labs/Parallel-Context-Windows
llyx97/TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou
MozerWang/Loong
[EMNLP 2024 Main] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
zhiyuanhubj/LongRecipe
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
alonj/Same-Task-More-Tokens
The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models"
Zefan-Cai/Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of 📙 Awesome LLM KV Cache Papers with Codes.
mutonix/pyramidinfer
Yifan-Song793/GoodBadGreedy
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
WeiminXiong/IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement
KbsdJames/Omni-MATH
The official repository of the Omni-MATH benchmark.
chenllliang/MMEvalPro
Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
FranxYao/Retrieval-Head-with-Flash-Attention
Efficient retrieval head analysis with Triton flash attention that supports top-K probabilities
geronimi73/accelerate_tricks