huangshaoze

huangshaoze's Stars

chinese-poetry/chinese-poetry
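The most comprehensive database of classical Chinese poetry 🧶: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 1,564 ci poets of the Song period with 21,050 ci.
Language: JavaScript 48.5k 1.2k 2069.7k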
jhao104/proxy_pool
Python ProxyPool for web spider
Language: Python 21.8k 445 6175.2k
unbug/codelf
A search tool that helps developers solve the problem of naming things.
Language: JavaScript 14.1k 254 120974
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
9.6k 287 451.5k
VincentSit/ChinaMobilePhoneNumberRegex
Regular expressions that match mobile phone numbers in mainland China.
4.8k 164 22506
macanv/BERT-BiLSTM-CRF-NER
TensorFlow solution for the NER task using a BiLSTM-CRF model with Google BERT fine-tuning, plus private server services
Language: Python 4.8k 88 4391.3k
codemayq/chinese_chatbot_corpus
Public Chinese chat corpus
Language: Python 3.7k 75 18773
ownthink/Jiagu
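Jiagu deep learning natural language processing tool: knowledge graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keywords, text summarization, text clustering
Language: Python 3.3k 87 71614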
zjy-ucas/ChineseNER
A neural network model for Chinese named entity recognition
Language: Python 1.8k 68 86567
NetEaseGame/git-webhook
:octocat: A tool built with Python Flask + SQLAlchemy + Celery + Redis + React for quickly setting up and using WebHooks for automated deployment and operations; supports GitHub / GitLab / Gogs / GitOsc.
Language: Python 1.5k 88 0413
liuhuanyong/CrimeKgAssitant
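Crime assistant including crime type prediction and a crime consultation service based on NLP methods and a crime knowledge graph. An intelligent legal project for criminal charges, comprising a knowledge graph of 856 charges, charge prediction based on a training corpus of 2.8 million charge records, and 13-category question classification and legal information Q&A based on 200,000 legal question-answer pairs.
Language: Python 1.4k 36 19385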
buppt/ChineseNER
Chinese named entity recognition, entity extraction, TensorFlow, PyTorch, BiLSTM+CRF
Language: Python 1.4k 18 51395
liuhuanyong/TextGrapher
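Text content grapher based on key information extraction with NLP methods. Given an input document, it extracts the key information, structures it, and organizes it into a graph, giving a graph-based presentation of the article's semantic information.
Language: Python 1.4k 27 24362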
liuhuanyong/ComplexEventExtraction
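A collection of concepts and explicit expression patterns for Chinese compound event extraction, which is then evolved into a ComplexEventGraph. The project proposes the concept and explicit patterns of Chinese compound events, covers the extraction of conditional, causal, sequential, and reversal events, and builds an event logic graph.
Language: Python 1.2k 27 5286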
fighting41love/cocoNLP
A Chinese information extraction tool.
Language: Python 1.1k 28 36259
thunlp/THUOCL
THUOCL (THU Open Chinese Lexicon), a Chinese lexicon
883 28 4197
wb14123/couplet-dataset
Dataset for couplets. A database of 700,000 couplets.
Language: Python 720 19 3215
nonamestreet/weixin_public_corpus
WeChat official account corpus
574 35 7166
yaoguangluo/Deta_Parser
Fast Chinese word segmentation and analysis
Language: Java 478 21 2988
liuhuanyong/ChineseNLPCorpus
A collection of Chinese NLP corpora including basic Chinese syntactic word sets, semantic word sets, historical corpora, and evaluation corpora. The collection covers semantic words, domain-specific synchronic and diachronic corpora, evaluation corpora, and more.
Language: Python 440 22 5116
sujeek/chinese_nlp
Corpus for an introductory hands-on course on Chinese natural language processing
400 21 0179
FanhuaandLuomu/BiLstm_CNN_CRF_CWS
BiLstm+CNN+CRF word segmentation for the legal document domain (contract cases), with 100 annotated sample documents
Language: Python 383 16 13108
fighting41love/hardNLU
NLU is hard!!!
269 13 234
SeanLee97/nlp_learning
Learning natural language processing (NLP) with Python: language models, HMM, PCFG, Word2vec, cloze-style reading comprehension tasks, naive Bayes classifiers, TFIDF, PCA, SVD
Language: Python 239 11 394
supercoderhawk/DeepLearning_NLP
A deep learning based natural language processing library
Language: Python 153 15 340
RicherDong/NER-LOC
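Chinese named entity recognition and Chinese named entity detection. A Python implementation based on characters plus word positions, using TensorFlow IDCNN+CRF and BiLSTM+CRF respectively, combined with part-of-speech tagging, to perform Chinese named entity recognition and detection
Language: Python 64 1 224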
ZephyrChenzf/NER-Sequence-labeling--Textcnn-bilstm-crf-pytorch
Named entity recognition implemented in PyTorch with a Textcnn-bilstm-crf model
Language: Python 42 3 66
shen1994/chinese_bilstm_cnn_crf
Chinese word segmentation with keras + tensorflow + python3; trainable on large datasets, solving the out-of-memory problem
Language: Python 41 4 213
xiongxianzhu/qingmi
An application-layer framework built on top of Python3 + Flask
Language: Python 18 4 17
engqiu/kaka
Chinese word segmentation with a rigorously curated segmentation dictionary and n-grams generated from google_books, for precise segmentation.
1