oneal2000

oneal2000's Stars

karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python 36.5k 371 3155.7k
hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Language: Python 31.8k 203 4.9k 3.9k
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python 29.4k 339 2684k
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
Language: Python 18.2k 183 7311.9k
HqWu-HITCS/Awesome-Chinese-LLM
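A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheap to train, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.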
15.2k 196 241.4k
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
9.4k 286 451.5k
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)
Language: HTML 7.8k 107 441753
baichuan-inc/Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Language: Python 5.7k 66 129506
imoneoi/openchat
OpenChat: Advancing Open-source Language Models with Imperfect Data
Language: Python 5.2k 49 187399
lixin4ever/Conference-Acceptance-Rate
Acceptance rates for the major AI conferences
Language: Jupyter Notebook 4.2k 129 28295
Facico/Chinese-Vicuna
Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model (a low-resource Chinese LLaMA + LoRA approach, with a structure modeled on Alpaca)
Language: C 4.1k 58 244421
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
Language: Python 4.1k 40 394293
attardi/wikiextractor
A tool for extracting plain text from Wikipedia dumps
Language: Python 3.7k 74 243965
HillZhang1999/llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
907 11 347
zjunlp/KnowledgeEditingPapers
Must-read Papers on Knowledge Editing for Large Language Models.
852 27 854
jianzhnie/LLamaTuner
Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Language: Python 568 9 4163
potsawee/selfcheckgpt
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Language: Python 444 6 2654
xv44586/Chinese-instruction-datasets
Chinese instruction tuning datasets
113 2 06
AtomEcho/AtomBulb
Aims to provide an intuitive, concrete, and standardized evaluation of current mainstream LLMs
92 3 04
CoinCheung/gdGPT
Train LLMs (bloom, llama, baichuan2-7b, chatglm3-6b) with DeepSpeed pipeline mode. Faster than zero/zero++/fsdp.
Language: Python 91 1 88
oneal2000/DRAGIN
Source code of DRAGIN, ACL 2024 main conference Long Paper
Language: Python 67 2 011
NJU-LegalAI/Legal-ChatGLM
Instruction fine-tuning of ChatGLM based on Chinese legal knowledge
42 3 11
oneal2000/MIND
Source code of our paper MIND, ACL 2024 Long Paper
Language: Python 22 2 63
oneal2000/Wikiformer
Code for AAAI 2024 paper Wikiformer
Language: Python 16 1 00
andy-yangz/Awesome-Chinese-Instruction-Datasets
A curated collection of Chinese instruction-related datasets
8 1 02
THUlawtech/LegalAttack
Language: Python 7 1 01
oneal2000/STARD
StaRD: Statute Retrieval Dataset based on Real-World Legal Consultation
Language: Python 6 1 10
ict-bigdatalab/utility_judgments
Language: Python 4 0 01
oneal2000/EntityHallucination
Language: Python 31
oneal2000/Caseformer
Source code of our long paper: Caseformer: Pre-training for Legal Case Retrieval
Language: Python 2 1 11