gujita's Stars
bigscience-workshop/Megatron-DeepSpeed
Ongoing research on training transformer language models at scale, including BERT & GPT-2.
PhoebusSi/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use, building a fine-tuning platform that makes it easy for researchers to get started with large models. We welcome open-source enthusiasts to open any meaningful PR on this repo and integrate as many LLM-related techniques as possible!
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
lyogavin/airllm
AirLLM: 70B-model inference on a single 4 GB GPU.
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
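To make the technique concrete, here is a minimal sketch of a QLoRA-style setup using Hugging Face transformers, peft, and bitsandbytes. The model name and LoRA hyperparameters are illustrative assumptions, not the repo's exact configuration.

```python
# Sketch of QLoRA-style fine-tuning setup: load the base model in 4-bit NF4
# precision, then attach small trainable LoRA adapters on top of the frozen,
# quantized weights. Hyperparameters below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "huggyllama/llama-7b"  # assumption: any causal LM checkpoint works

# 4-bit NF4 quantization with double quantization -- the core QLoRA idea.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# Only the LoRA adapter weights are trainable; the 4-bit base stays frozen.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the tiny trainable fraction
```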
brightmart/nlp_chinese_corpus
Large-scale Chinese corpora for natural language processing.
EmbraceAGI/awesome-chatgpt-zh
ChatGPT Chinese guide 🔥: guides for prompting and instructing ChatGPT in Chinese, application-development guides, and a curated resource list to make better use of ChatGPT and boost your productivity! 🚀
microsoft/Megatron-DeepSpeed
Ongoing research on training transformer language models at scale, including BERT & GPT-2.
microsoft/DeepSpeedExamples
Example models using DeepSpeed
Macielyoung/Bloom-Lora
Fine-tune the BLOOM large language model with the LoRA method.
salesforce/CodeGen
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
togethercomputer/OpenChatKit
tloen/alpaca-lora
Instruct-tune LLaMA on consumer hardware
LC1332/Luotuo-Chinese-LLM
Luotuo (骆驼): open-sourced Chinese language models. Developed by 陈启源 @ Central China Normal University, 李鲁鲁 @ SenseTime, and 冷子昂 @ SenseTime.
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment.
yizhongw/self-instruct
Aligning pretrained language models with instruction data generated by themselves.
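Below is a rough sketch of the Self-Instruct bootstrapping loop the description refers to: seed tasks prompt the model to propose new instructions, which are filtered for novelty and fed back into the pool. `llm_generate` and the substring-based novelty check are hypothetical placeholders (the paper uses a ROUGE-L filter), so treat this as an illustration of the idea rather than the repo's pipeline.

```python
# Sketch of the Self-Instruct bootstrapping loop: grow an instruction pool
# by prompting the model with in-context examples drawn from the pool itself.
import random

def self_instruct(seed_tasks, llm_generate, rounds=4, per_round=100):
    pool = list(seed_tasks)  # start from a small set of human-written tasks
    for _ in range(rounds):
        for _ in range(per_round):
            # Prompt the model with a few in-context examples from the pool.
            demos = random.sample(pool, k=min(8, len(pool)))
            prompt = "\n\n".join(f"Task: {t}" for t in demos) + "\n\nTask:"
            candidate = llm_generate(prompt).strip()
            # Keep only sufficiently novel instructions (ROUGE-L in the
            # paper; a plain substring check here for illustration).
            if candidate and all(candidate not in t and t not in candidate
                                 for t in pool):
                pool.append(candidate)
    return pool  # grown instruction set, later paired with generated outputs
```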
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language Model Engine (an open-source Chinese conversational LLM).
CLUEbenchmark/CLUECorpus2020
Large-scale pre-training corpus for Chinese (100 GB of Chinese text).
CVI-SZU/Linly
Chinese-LLaMA 1 & 2 and Chinese-Falcon base models; the ChatFlow Chinese dialogue model; a Chinese OpenLLaMA model; NLP pre-training and instruction-tuning datasets.
shibing624/pycorrector
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to correction scenarios, ready to use out of the box.
ljynlp/W2NER
Source code for AAAI 2022 paper: Unified Named Entity Recognition as Word-Word Relation Classification
cchen-nlp/weiboNER
A rearrangement of the Chinese social media (Weibo) corpus, using the word rather than the character as the basic unit.
SophonPlus/ChineseNlpCorpus
Collecting, curating, and releasing Chinese NLP corpora/datasets, working with like-minded contributors to advance Chinese natural language processing.
CLUEbenchmark/CLUENER2020
CLUENER2020: fine-grained named entity recognition for Chinese.
CLUEbenchmark/CLUEDatasetSearch
Search across all Chinese NLP datasets, with commonly used English NLP datasets included.
meta-llama/llama
Inference code for Llama models
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings. A toy sketch of the underlying recurrence follows.
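The sketch below illustrates the linear "WKV" recurrence idea behind RWKV: attention-like weighted averages computed as a fixed-size RNN state update, so inference costs O(1) per token instead of growing with context. It is a simplified toy, assuming per-channel decay weights and omitting the numerical-stability tricks and the current-token bonus term of the real kernel.

```python
# Toy sketch of an RWKV-style WKV recurrence: a time-decayed weighted
# average of past values, maintained as a constant-size running state.
import numpy as np

def wkv_recurrence(k, v, w):
    """k, v: (T, D) key/value sequences; w: (D,) per-channel decay > 0."""
    T, D = k.shape
    num = np.zeros(D)   # running sum of exp(k) * v
    den = np.zeros(D)   # running sum of exp(k)
    out = np.zeros((T, D))
    decay = np.exp(-w)  # each step, past contributions shrink by exp(-w)
    for t in range(T):
        num = decay * num + np.exp(k[t]) * v[t]
        den = decay * den + np.exp(k[t])
        out[t] = num / (den + 1e-8)  # time-weighted average of past values
    return out

# The fixed-size state (num, den) replaces a transformer's growing KV cache,
# which is why VRAM stays flat as the context lengthens.
```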
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
voidful/TextRL
Implementation of ChatGPT-style RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/BLOOM/GPT/BART/T5/MetaICL).
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.