Executedone's Stars
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
modularml/mojo
The Mojo Programming Language
mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
modelscope/facechain
FaceChain is a deep-learning toolchain for generating your digital twin.
THUDM/CodeGeeX
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
mnotgod96/AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
CrazyBoyM/llama3-Chinese-chat
Chinese-language repository for Llama3 and Llama3.1 (companion book in progress... interesting fine-tuned and modified weights from community members and vendors, plus tutorial videos and docs covering training, inference, evaluation, and deployment)
atfortes/Awesome-LLM-Reasoning
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Ucas-HaoranWei/Vary
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
KwaiKEG/KwaiAgents
A generalized information-seeking agent system with Large Language Models (LLMs).
amirgholami/PyHessian
PyHessian is a PyTorch library for second-order-based analysis and training of neural networks
ezelikman/quiet-star
Code for Quiet-STaR
GAIR-NLP/MathPile
[NeurIPS D&B 2024] Generative AI for Math: MathPile
HKUNLP/ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
sanderwood/bgpt
Beyond Language Models: Byte Models are Digital World Simulators
alibaba/ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
yuhuixu1993/qa-lora
Official PyTorch implementation of QA-LoRA
billvsme/my_openai_api
Deploy your own OpenAI API 🤩, built on Flask and Transformers (uses the Baichuan2-13B-Chat-4bits model and runs on a single Tesla T4 GPU); implements the OpenAI Chat, Models, and Completions endpoints, including streaming responses
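Serving an OpenAI-compatible API mostly comes down to reproducing the Chat Completions response shapes. The sketch below is a minimal, stdlib-only illustration of those shapes (the repo itself uses Flask and Transformers; the helper names here are hypothetical, and the field layout follows OpenAI's public schema):

```python
import json
import time
import uuid


def chat_completion_response(model: str, content: str) -> dict:
    """Build a non-streaming response in the OpenAI Chat Completions shape."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": content},
                "finish_reason": "stop",
            }
        ],
    }


def stream_chunks(model: str, content: str):
    """Yield server-sent-event lines in the streaming chunk shape,
    ending with the sentinel [DONE] event."""
    for token in content.split():
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{"index": 0, "delta": {"content": token + " "}}],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```

A real server wires these into routes (`/v1/chat/completions`, `/v1/models`) and replaces the canned `content` with model output from Transformers.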
VITA-Group/LiGO
[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Zhangyang Wang, Yoon Kim
locuslab/edge-of-stability
Code for the paper "Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability" (ICLR 2021)
formll/dog
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
zyushun/hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective