litetoooooom

litetoooooom's Stars

hankcs/HanLP
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification
Language: Python 34.3k 1.1k 1.4k 10.3k
HqWu-HITCS/Awesome-Chinese-LLM
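A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train. Covers base models, vertical-domain fine-tunes and applications, datasets, and tutorials.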
17.4k 219 271.7k
ymcui/Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series models)
Language: Python 9.8k 143 2401.4k
ricklamers/gridstudio
Grid studio is a web-based application for data science with full integration of open source data science frameworks and languages.
Language: JavaScript 8.9k 323 1311.5k
minimaxir/textgenrnn
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Language: Python 4.9k 136 230751
brightmart/albert_zh
A Lite BERT for Self-Supervised Learning of Language Representations; large-scale Chinese pre-trained ALBERT models
Language: Python 3.9k 103 166754
FlagAI-Open/FlagAI
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale model.
Language: Python 3.8k 44 211416
esbatmop/MNBVC
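MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.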
3.6k 66 56252
BingLingGroup/autosub
Command-line utility to transcribe/translate from video/audio/subtitles to subtitles
Language: Python 2k 34 196246
jxzhangjhu/Awesome-LLM-RAG
Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models
1k 10 064
grammarly/gector
Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Language: Python 913 21 173219
baiyang2464/chatbot-base-on-Knowledge-Graph
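A dialogue system for the medical vertical domain that parses questions with deep learning and stores and queries knowledge points in a knowledge graph.
Language: Python 731 16 29204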
SimmerChan/corpus
Corpora for natural language processing and knowledge graphs, organized by task. PRs welcome.
Language: Python 719 20 0153
luge-ai/luge-ai
Language: JavaScript 438 13 467
cokelaer/fitter
Fit data to many distributions
Language: Python 375 11 7158
CLUEbenchmark/SimCLUE
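A dataset of 3,000,000+ examples for semantic understanding and matching. It can be used for unsupervised contrastive learning, semi-supervised learning, and similar approaches to build the best-performing pre-trained models for Chinese.
Language: Python 288 5 340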
GanjinZero/ChineseEHRBert
A Chinese EHR Bert Pretrained Model.
Language: Python 253 11 1345
Jyouhou/ICDAR2019-ArT-Recognition-Alchemy
PKU Team Zero's code for participation in ICDAR2019 ArT Recognition track (Champion)
Language: Roff 221 15 1867
HITsz-TMG/awesome-llm-attributions
A Survey of Attributions for Large Language Models
181 5 49
moskytw/uniout
Never see escaped bytes in output.
Language: Python 158 12 918
ZhuiyiTechnology/roformer-v2
Upgraded version of RoFormer
Language: Python 150 7 615
google-research/head2toe
Language: Python 81 6 713
Xovee/simplified-chinese-translation-of-neural-networks-and-deep-learning
A Simplified Chinese translation of "Neural Networks and Deep Learning" by Michael Nielsen.
50 3 010
luoyangbiao/bert_flask
Trains BERT on the MRPC dataset and exposes it as an API, with a simple HTML interface.
Language: Python 22 1 013
litetoooooom/chatbox_pattern
Multi-type matching implemented with a trie
Language: Python 50
JarbasAl/MrData
Unified data for the semantic web
Language: Python 2 3 02
JinbiaoZhu/PaperReading
A GitHub repository of articles on skill-based meta reinforcement learning, with a focus on skill extraction, combination, and generalization.
20
litetoooooom/context2vec
Language: Python 1 0 00
litetoooooom/litetoooooom.github.io
Language: HTML 1 1 00
litetoooooom/PaperAssistant
Paper reading assistant
Language: Mermaid 1 1 00