chinese-text-segmentation
There are 41 repositories under the chinese-text-segmentation topic.
wolfgarbe/SymSpell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
koth/kcws
Deep learning Chinese word segmentation
fukuball/jieba-php
"Jieba" (結巴, Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.
lionsoul2014/jcseg
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.
mammothb/symspellpy
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
amutu/zhparser
zhparser is a PostgreSQL extension for full-text search of Chinese-language text
qinwf/jiebaR
Chinese text segmentation with R. (Documentation updated 🎉: https://qinwenfeng.com/jiebaR/ )
yongzhuo/Pytorch-NLU
Chinese text classification and sequence labeling toolkit (PyTorch). Supports multi-class and multi-label classification of long and short Chinese texts, text similarity, and sequence labeling tasks such as Chinese named entity recognition (NER), part-of-speech tagging, word segmentation, and extractive text summarization.
hankcs/hanlp-lucene-plugin
HanLP Chinese word segmentation plugin for Lucene; supports Lucene-based systems, including Solr.
blueshen/ik-analyzer
Tokenizer supporting Lucene 5/6/7/8/9+ versions, LTS
supercoderhawk/DNN_CWS
Chinese word segmentation implemented with deep learning.
yingrui/mahjong
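Open-source Chinese word segmentation toolkit: a Chinese word segmentation Web API, Chinese word segmentation for Lucene, and mixed Chinese-English segmentation.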
ReubenBond/HanBaoBao
Mandarin Chinese text segmentation and mobile dictionary Android app (Chinese word segmentation)
oscarsun72/TextForCtext
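Built for text entry on the Chinese Text Project (中國哲學書電子化計劃): faster typing and typesetting, and a better input experience. One treasure of the study that beats the traditional four. C# + Word VBA tools for literature and history, written by a PhD in Chinese.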
qiaofei32/dnn-lstm-word-segment
Chinese word segmentation based on deep learning and an LSTM neural network
blueshen/ik-rs
ik-analyzer for Rust; Chinese tokenizer for tantivy
fumiama/jieba
A performance-optimized version of Jiebago; supports loading dictionaries from an io.Reader.
fg607/ChatterBot
ChatterBot adapted for Chinese; supports Chinese word-segmentation search and Chinese stop words.
jason2506/esapp
An unsupervised Chinese word segmentation tool.
stephanoskomnenos/vscode-jieba
A Chinese word segmentation extension based on jieba-rs.
wycm/xuexin-ocr
Content recognition for student-status and academic-credential images from CHSI (学信网).
ChiChou/zhparser-docker
PostgreSQL with zhparser
Colearo/HuhuSeg
Simple Chinese segmenter, keyword extractor, and other examples
hshrimp/HMM_Chinese_seg
HMM (hidden Markov model) Chinese word segmentation
numb3r3/text_utils
Text Pre-processing toolkit
zhangsoledad/solr-ik
solr-ik
ssb22/CedPane
Chinese-English Dictionary Public-domain Additions for Names Etc (CedPane)
jk195417/chinese-segmentation-as-service
Uses Flask to expose jieba, SnowNLP, and pkuseg as an HTTP API web service.
deminy/jieba-php
PHP version of "Jieba" Chinese word segmentation (结巴中文分词).
FlyingOE/q_BosonNLP
Wrapper for BosonNLP online API
Jarod-Wingfield/Sentimental-Response-of-COVID-19-Outbreak-in-Guangzhou-China-Based-on-Weibo-Night-Comments
This is a practical exercise in processing Chinese text using R packages.
ssb22/adjuster
Web Adjuster + Annotator Generator
ericlingit/jieba-go
A copy-cat implementation of jieba as a learning exercise.
smart-lands-com/smla-cut
Chinese text segmentation
secsilm/text-segmentation-trap
Some sentences that word segmentation tools tend to segment incorrectly.