Pinned Repositories
Awesome-Chinese-LLM
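A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.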
baichuan-speedup
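A pure C++ cross-platform LLM acceleration library, callable from Python, supporting Baichuan, GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices, and ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU,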
BERT-BiLSTM-CRF-NER
TensorFlow solution for the NER task using a BiLSTM-CRF model with Google BERT fine-tuning, plus private server services
BERT-NER
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
ColossalAI
Making large AI models cheaper, faster and more accessible
data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
makeMoE-
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
TC-GAN
code for IJCAI-2019
vincenschan's Repositories
vincenschan/makeMoE-
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
vincenschan/TC-GAN
code for IJCAI-2019
vincenschan/Awesome-Chinese-LLM
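A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.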
vincenschan/baichuan-speedup
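A pure C++ cross-platform LLM acceleration library, callable from Python, supporting Baichuan, GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices, and ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU,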
vincenschan/BERT-BiLSTM-CRF-NER
TensorFlow solution for the NER task using a BiLSTM-CRF model with Google BERT fine-tuning, plus private server services
vincenschan/BERT-NER
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
vincenschan/Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
vincenschan/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
vincenschan/ColossalAI
Making large AI models cheaper, faster and more accessible
vincenschan/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
vincenschan/funNLP
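Chinese and English sensitive words, language detection, domestic and international mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese personal name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment scores, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of unspaced English text, various Chinese word vectors, company name collection, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, historical figures lexicon, poetry lexicon, medical lexicon, food and drink lexicon, legal lexicon, automotive lexicon, animal lexicon, Chinese chat corpus, Chinese rumor data, Baidu Chinese question-answering dataset, collection of sentence similarity matching algorithms, BERT resources, text generation and summarization tools, cocoNLP information extraction tool, regex matching for Chinese domestic phone numbers, Tsinghua University XLORE: Chinese-English cross-lingual encyclopedic knowledge graph, Tsinghua University artificial intelligence technology series…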
vincenschan/GPT2-Chinese
Chinese version of GPT2 training code, using BERT tokenizer.
vincenschan/llama
Inference code for LLaMA models
vincenschan/LLaMA-Efficient-Tuning
Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, ChatGLM2)
vincenschan/embedchain
The Open Source RAG framework
vincenschan/falcon
The no-magic web data plane API and microservices framework for Python developers, with a focus on reliability, correctness, and performance at scale.
vincenschan/invoice2data
Extract structured data from PDF invoices
vincenschan/Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM), a local knowledge-base question-answering app built on Langchain and language models such as ChatGLM
vincenschan/LaWGPT
🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese legal knowledge. A large language model based on Chinese legal knowledge.
vincenschan/mainstream_LLM_flow
An overview of mainstream LLM approaches and related techniques, for study and reference
vincenschan/MedicalGPT
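MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training, supervised fine-tuning, RLHF (reward modeling and reinforcement learning training), and DPO (direct preference optimization).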
vincenschan/Multi-Label-Task
vincenschan/NEO4J_project
Import SPO triples into NEO4J with py2neo-v3 and build a graph.
vincenschan/projects_Multi-Label
vincenschan/QAnything
Question and Answer based on Anything.
vincenschan/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
vincenschan/rags
Build ChatGPT over your data, all with natural language
vincenschan/TextRank
TextRank for keyword extraction and sentence importance ranking.
vincenschan/TigerBot
TigerBot: A multi-language multi-task LLM
vincenschan/Yi
A series of large language models trained from scratch by developers @01-ai