649453932's Stars

xai-org/grok-1
Grok open release
Language: Python 49.6k 569 2108.3k
oobabooga/text-generation-webui
A Gradio web UI for Large Language Models.
Language: Python 40.6k 331 3.7k5.3k
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Language: Python 34.4k 213 5.3k4.2k
Johnshall/Shadowrocket-ADBlock-Rules-Forever
Provides multiple Shadowrocket rule sets with powerful ad filtering. The rules are rebuilt daily at 8:00.
13k 76 287823
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Language: Shell 9.6k 56 859593
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
Language: Jupyter Notebook 7.1k 76 212454
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
5.2k 83 9286
arcee-ai/mergekit
Tools for merging pretrained large language models.
Language: Python 4.8k 52 317439
openai/transformer-debugger
Language: Python 4k 25 14235
modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Language: Python 2.9k 19 195175
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
Language: Python 2.5k 41 23237
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Language: Jupyter Notebook 2.3k 31 90158
LC1332/Chat-Haruhi-Suzumiya
Chat-Haruhi-Suzumiya (Chat凉宫春日), an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.
Language: Jupyter Notebook 1.8k 17 62163
SakanaAI/evolutionary-model-merge
Official repository of Evolutionary Optimization of Model Merging Recipes
Language: Python 1.2k 40 1190
lmmlzn/Awesome-LLMs-Datasets
A summary of existing representative LLM text datasets.
1k 4 2105
pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Language: Python 883 8 2246
SafeAILab/EAGLE
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
Language: Python 823 12 14181
TencentARC/LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
Language: Python 477 20 3235
FranxYao/Long-Context-Data-Engineering
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
Language: Python 435 8 1728
microsoft/TransformerCompression
For releasing code related to compression methods for transformers, accompanying our publications
Language: Python 371 10 4536
microsoft/FILM
Official repo for "Make Your LLM Fully Utilize the Context"
Language: Python 241 6 419
skywalker023/sodaverse
🥤🧑🏻‍🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization"
Language: Python 221 18 813
sail-sg/regmix
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
Language: Jupyter Notebook 88 5 104
yegcjs/mixinglaws
Language: Jupyter Notebook 88 1 56
SalesforceAIResearch/GemFilter
Language: Python 63 1 16
thu-coai/ComplexBench
Language: Python 52 3 24
xverse-ai/XVERSE-MoE-A4.2B
XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.
Language: Python 36 5 16
PedroUria/NLP-Movie_Scripts
Trying to predict a movie's success based on the script (before filming)
Language: Jupyter Notebook 35 3 07
Magnetic2014/RoleEval
A Bilingual Role Evaluation Benchmark for Large Language Models
34 2 40
F2-Song/ScalingAlignment
The official implementation of "Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment".
Language: Python 8 1 00