litetoooooom's Stars
hankcs/HanLP
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification
HqWu-HITCS/Awesome-Chinese-LLM
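A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and trained at low cost, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.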
ymcui/Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
ricklamers/gridstudio
Grid studio is a web-based application for data science with full integration of open source data science frameworks and languages.
minimaxir/textgenrnn
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
brightmart/albert_zh
A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS; large-scale pretrained Chinese ALBERT models
FlagAI-Open/FlagAI
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.
esbatmop/MNBVC
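MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, including even "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.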
BingLingGroup/autosub
Command-line utility to transcribe/translate from video/audio/subtitles to subtitles
jxzhangjhu/Awesome-LLM-RAG
Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models
grammarly/gector
Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
baiyang2464/chatbot-base-on-Knowledge-Graph
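A dialogue system for the medical vertical domain: parses questions with deep learning methods, stores knowledge in a knowledge graph, and queries knowledge points.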
SimmerChan/corpus
Corpora for natural language processing and knowledge graphs, organized by task. PRs welcome.
luge-ai/luge-ai
cokelaer/fitter
Fit data to many distributions
CLUEbenchmark/SimCLUE
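A 3,000,000+ semantic understanding and matching dataset. Can be used for unsupervised contrastive learning, semi-supervised learning, and similar methods to build the best-performing pretrained models for Chinese.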
GanjinZero/ChineseEHRBert
A Chinese EHR Bert Pretrained Model.
Jyouhou/ICDAR2019-ArT-Recognition-Alchemy
PKU Team Zero's code for participation in ICDAR2019 ArT Recognition track (Champion)
HITsz-TMG/awesome-llm-attributions
A Survey of Attributions for Large Language Models
moskytw/uniout
Never see escaped bytes in output.
ZhuiyiTechnology/roformer-v2
Upgraded version of RoFormer
google-research/head2toe
Xovee/simplified-chinese-translation-of-neural-networks-and-deep-learning
This is a Simplified Chinese translation of "Neural Networks and Deep Learning" by Michael Nielsen.
luoyangbiao/bert_flask
Trains BERT on the MRPC dataset, packaged as an API with a simple HTML interface
litetoooooom/chatbox_pattern
Multi-type matching implemented with a trie
JarbasAl/MrData
Unified data for the semantic web
JinbiaoZhu/PaperReading
This is a GitHub repository that focuses on articles related to skill-based meta reinforcement learning. The main focus is on skill extraction, combination, and generalization.
litetoooooom/context2vec
litetoooooom/litetoooooom.github.io
litetoooooom/PaperAssistant
Paper reading assistant