jieba

There are 136 repositories under the jieba topic.

  • go-ego/gse

    Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others.

    Language:Go2.6k6356215
  • napi-rs/node-rs

    Node.js bindings ❤️ Rust crates

    Language:Rust1.2k109634
  • anderscui/jieba.NET

    .NET version of jieba Chinese word segmentation (supports .NET Framework and .NET Core)

    Language:C#1.1k7190253
  • messense/jieba-rs

    The Jieba Chinese Word Segmentation Implemented in Rust

    Language:Rust755134847
  • deepcs233/jieba_fast

    Use the C API and SWIG to speed up jieba: an efficient Chinese word segmentation library

    Language:Python632132675
  • sing1ee/elasticsearch-jieba-plugin

    jieba analysis plugin for Elasticsearch 7.0.0, 6.4.0, 6.0.0, 5.4.0, 5.3.0, 5.2.2, 5.2.1, 5.2, 5.1.2, 5.1.1

    Language:Java5262593148
  • 01joy/news-search-engine

    News search engine

    Language:Python4321311128
  • fuqiuai/wordCloud

    Segment text and generate word clouds with Python

    Language:Python4075397
  • qinwf/jiebaR

    Chinese text segmentation with R (documentation updated 🎉: https://qinwenfeng.com/jiebaR/)

    Language:C++3444870108
  • lining0806/TextMining

    Python text mining system (Research of Text Mining System)

    Language:Python334352152
  • GaoQ1/rasa_nlu_gq

    Turn natural language into structured data (supports Chinese; includes N custom models for different scenarios and tasks)

    Language:Python304213797
  • fendouai/Chinese-Text-Classification

    Chinese-Text-Classification: Chinese text classification implemented with a TensorFlow CNN (convolutional neural network). QQ group: 522785813; WeChat group QR code: http://www.tensorflownews.com/

    Language:Python29024889
  • lb2281075105/Python-WeChat-ItChat

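    WeChat bot with feature examples built on the Python itchat interface: 01 itchat fetching articles shared by WeChat friends or groups; 02 itchat fetching WeChat official account articles; 03 itchat monitoring articles sent by WeChat official accounts; 04 itchat monitoring messages recalled by WeChat groups or friends; 05 itchat fetching WeChat friend information with table and chart comparisons; 06 Python printing the WeChat friends who have deleted you; 07 itchat auto-replying to friends; 08 itchat word cloud of WeChat friends' personal signatures; 09 itchat gender ratio of WeChat friends; 10 intercepting messages recalled by WeChat groups or friends; 11 itchat forwarding messages between WeChat groups or friends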

    Language:Python287170119
  • Snailclimb/python

    A summary of third-party library examples for learning Python

    Language:Python259260115
  • ixqbar/phpjieba

    PHP extension for jieba Chinese word segmentation; works with PHP 5 and PHP 7

    Language:C++16111732
  • moyuweiqing/bilibili-barrage-analysis

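    bilibili danmaku (bullet comment) analysis, including a crawler, word cloud analysis, word frequency analysis, sentiment analysis, derived metric construction, and visualization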

    Language:HTML1562219
  • houbb/segment

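    The jieba-analysis tool for Java. (A more flexible, elegant, easy-to-use, high-performance Java word segmentation implementation built on the jieba dictionary. Supports part-of-speech tagging.)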

    Language:Java14441027
  • limccn/cacl2

    Lexicon for Chinese lexical analysis; a word segmentation dictionary for the Chinese language

    Language:Python1176022
  • HongZhaoHua/jstarcraft-nlp

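    Focuses on several core problems in natural language processing: lexical analysis, syntactic analysis, semantic analysis, language detection, information extraction, text clustering, and text classification. Provides a complete general-purpose design and reference implementation for researchers and developers in related fields. Covers a variety of natural language processing algorithms and adapts to multiple natural language processing frameworks. Compatible with Lucene/Solr/ElasticSearch plugins.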

    Language:Java1126129
  • Sweetiee-yi/Jaba

    jieba word segmentation (Java version)

    Language:Java962125
  • 3inchtime/douban_sentiment_analysis

    Sentiment analysis of Douban movie reviews based on naive Bayes

    Language:Python942022
  • zh3389/chatbot

    Knowledge graph (neo4j) answer lookup + machine learning classification model for question analysis = movie knowledge base Q&A bot

    Language:Python851316
  • hockyy/miteiru

    Miteiru is an open source Electron video player for learning Chinese, Cantonese, and Japanese. It can play YouTube videos and all HTML5-supported video formats (.mkv, .mp4, .mov, and many more), and it supports many subtitle formats (.srt, .ass, .vtt, and many more)

    Language:TypeScript846471
  • sileixinhua/News-classification

    News classification system & rumor handling system

    Language:Python7810044
  • xujingguo58/tinySearchEngine

    A small search engine built with the Vue front-end framework, the Scrapy crawler framework, and jieba word segmentation

    Language:JavaScript735115
  • realdennis/igcloud

    *UNSUPPORTED* Use igcloud to generate Instagram Word Cloud ! 🛫 🛫 ✈ 🔝

    Language:Python663211
  • Alex-CHUN-YU/Word2vec

    Training Chinese word vectors with Word2vec. Word2vec was created by a team of researchers led by Tomas Mikolov at Google.

    Language:Jupyter Notebook585029
  • fengkx/jieba-wasm

    WASM binding to jieba-rs

    Language:Rust58278
  • apple-han/flask-reptiles

    Word segmentation search and a web app built with Flask

    Language:Python44208
  • Cheereus/PdfSplitter

    Convert PDF to TXT, then segment the text and compute word frequency statistics

    Language:Python30102
  • cxumol/jieba-wasm-html

    Fast Jieba Chinese text segmentation in the browser without backend/NPM | Web version of jieba word segmentation, a pure front-end implementation based on WebAssembly; also usable with Deno

    Language:JavaScript28112
  • qiwihui/SMSFilters

    A machine-learning-based iOS app for filtering Chinese spam SMS messages

    Language:C++24126
  • MatoYing/TextMining

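    A fairly basic yet comprehensive text mining workflow. It covers building a sentiment analysis model with machine learning and text mining techniques; determining sentiment orientation through sentiment polarity judgment and degree calculation; using word frequency and TF-IDF to mine the key points in positive and negative texts; and using text mining algorithms to find the topics that users on the platform discuss most.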

    Language:Jupyter Notebook23122
  • messense/rjieba-py

    jieba-rs Python binding

    Language:Python23222
  • bmxbmx3/anki_cloze_maker

    Batch-generate Anki cloze deletions for .txt files based on jieba's TF-IDF algorithm and custom keywords.

    Language:Python22306
  • moyuweiqing/CNKI-analysis

    Use Python to crawl relevant data from CNKI and analyze it; involves PyCharm and Jupyter Notebook

    Language:Jupyter Notebook222110