Executedone's Stars
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
modularml/mojo
The Mojo Programming Language
mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
modelscope/facechain
FaceChain is a deep-learning toolchain for generating your digital twin.
THUDM/CodeGeeX
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
mnotgod96/AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
CrazyBoyM/llama3-Chinese-chat
Chinese-language repository for Llama3 and Llama3.1 (companion book in progress... interesting fine-tuned and modified weights from community members and vendors, plus tutorial videos and docs covering training, inference, evaluation, and deployment)
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Ucas-HaoranWei/Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
KwaiKEG/KwaiAgents
A generalized information-seeking agent system with Large Language Models (LLMs).
amirgholami/PyHessian
PyHessian is a PyTorch library for second-order-based analysis and training of neural networks
ezelikman/quiet-star
Code for Quiet-STaR
GAIR-NLP/MathPile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
HKUNLP/ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
sanderwood/bgpt
Beyond Language Models: Byte Models are Digital World Simulators
alibaba/ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
yuhuixu1993/qa-lora
Official PyTorch implementation of QA-LoRA
billvsme/my_openai_api
Deploy your own OpenAI API 🤩, built on Flask and Transformers (uses the Baichuan2-13B-Chat-4bits model and runs on a single Tesla T4 GPU); implements the OpenAI Chat, Models, and Completions endpoints, including streaming responses
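Serving an OpenAI-compatible API mostly comes down to reproducing the Chat Completions response shapes. The sketch below is a minimal, stdlib-only illustration of those shapes (the repo itself uses Flask and Transformers; the helper names here are hypothetical, and the field layout follows OpenAI's public schema):

```python
import json
import time
import uuid


def chat_completion_response(model: str, content: str) -> dict:
    """Build a non-streaming response in the OpenAI Chat Completions shape."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": content},
                "finish_reason": "stop",
            }
        ],
    }


def stream_chunks(model: str, content: str):
    """Yield server-sent-event lines in the streaming chunk shape,
    ending with the sentinel [DONE] event."""
    for token in content.split():
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{"index": 0, "delta": {"content": token + " "}}],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```

A real server wires these into routes (`/v1/chat/completions`, `/v1/models`) and replaces the canned `content` with model output from Transformers.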
VITA-Group/LiGO
[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
locuslab/edge-of-stability
Code for the paper "Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability" (ICLR 2021)
formll/dog
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
zyushun/hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective