649453932's Stars
xai-org/grok-1
Grok open release
oobabooga/text-generation-webui
A Gradio web UI for Large Language Models.
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Johnshall/Shadowrocket-ADBlock-Rules-Forever
Provides multiple Shadowrocket rule sets with strong ad-filtering capabilities. Rules are rebuilt daily at 8:00.
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
arcee-ai/mergekit
Tools for merging pretrained large language models.
openai/transformer-debugger
modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
LC1332/Chat-Haruhi-Suzumiya
Chat凉宫春日 (Chat-Haruhi-Suzumiya): an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.
SakanaAI/evolutionary-model-merge
Official repository of Evolutionary Optimization of Model Merging Recipes
lmmlzn/Awesome-LLMs-Datasets
A summary of existing representative LLM text datasets.
pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
SafeAILab/EAGLE
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
TencentARC/LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
FranxYao/Long-Context-Data-Engineering
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
microsoft/TransformerCompression
Code for transformer compression methods, accompanying our publications
microsoft/FILM
Official repo for "Make Your LLM Fully Utilize the Context"
skywalker023/sodaverse
🥤🧑🏻🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization"
sail-sg/regmix
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
yegcjs/mixinglaws
SalesforceAIResearch/GemFilter
thu-coai/ComplexBench
xverse-ai/XVERSE-MoE-A4.2B
XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.
PedroUria/NLP-Movie_Scripts
Trying to predict a movie's success based on the script (before filming)
Magnetic2014/RoleEval
A Bilingual Role Evaluation Benchmark for Large Language Models
F2-Song/ScalingAlignment
The official implementation of "Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment".