jiejie1993's Stars
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
liguodongiot/llm-action
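This project shares the technical principles and hands-on experience behind large models (LLM engineering and deploying LLM applications in production).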
nlpxucan/WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
arcee-ai/mergekit
Tools for merging pretrained large language models.
modelscope/ms-swift
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek3, ...) and 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
315386775/DeepLearing-Interview-Awesome-2024
A collection of AIGC, CV, and LLM interview questions and answers, along with new ideas, problems, resources, and projects from work and research.
huggingface/nanotron
Minimalistic large language model 3D-parallelism training
react-financial/react-financial-charts
Charts dedicated to finance.
WangRongsheng/CareGPT
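🌞 CareGPT (关怀GPT) is a medical large language model that also brings together dozens of publicly available medical fine-tuning datasets and openly available medical LLMs, covering LLM training, evaluation, and deployment to accelerate the development of medical LLMs. Medical LLM, Open Source Driven for a Healthy Future.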
IEIT-Yuan/Yuan-2.0
Yuan 2.0 Large Language Model
HIT-SCIR/Chinese-Mixtral-8x7B
Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B)
SciPhi-AI/synthesizer
A multi-purpose LLM framework for RAG and data creation.
chaoswork/sft_datasets
A curated collection of open-source SFT datasets, updated on an ongoing basis
databonsai/databonsai
Clean & curate your data with LLMs.
bigcode-project/bigcode-dataset
liucongg/ChatGPTBook
"ChatGPT Principles and Practice: Algorithms, Techniques, and Private Deployment of Large Language Models"
sangmichaelxie/doremi
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
FlagOpen/FlagData
qiaoliangxiang/cfa
FRM & CFA study notes
p-lambda/dsir
DSIR: a large-scale data selection framework for language model training
adlnlp/FinLLMs
This repository contains related work, benchmarks and datasets for the paper "Large Language Models in Finance (FinLLMs)", currently under review.
SciPhi-AI/library-of-phi
SUFE-AIFLM-Lab/FinEval
FinEval is a collection of high-quality multiple-choice and text question-answering problems for the Chinese financial domain.
zhenlohuang/awesome-chinese-llm
Awesome Chinese LLM: a curated list of datasets and model resources for Chinese large language models
yanqiangmiffy/how-to-train-tokenizer
How to train an LLM tokenizer
xv44586/Chinese-instruction-datasets
Chinese instruction-tuning datasets
qianniucity/llm_notebooks
A collection of AI application examples