stopwords
There are 273 repositories under the stopwords topic.
sing1ee/elasticsearch-jieba-plugin
Jieba analysis plugin for Elasticsearch 7.0.0, 6.4.0, 6.0.0, 5.4.0, 5.3.0, 5.2.2, 5.2.1, 5.2, 5.1.2, 5.1.1
MihaiValentin/lunr-languages
A collection of language stemmers and stopwords for the Lunr JavaScript library
stopwords-iso/stopwords-iso
All languages stopwords collection
lining0806/TextMining
Python text mining system (Research of Text Mining System)
Alir3z4/stop-words
List of common stop words in various languages.
mohataher/arabic-stop-words
Largest list of Arabic stop words on GitHub.
igorbrigadir/stopwords
Default English stopword lists from many different sources
Donatello-za/rake-php-plus
A keyword and phrase extraction library based on the Rapid Automatic Keyword Extraction algorithm (RAKE).
milaan9/Python_Natural_Language_Processing
This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.
kharazi/persian-stopwords
Persian (Farsi) Stop Words List
bbalet/stopwords
Removes the most frequent words (stop words) from text content. Based on a curated list of language statistics.
biolab/orange3-text
🍊 📄 Text Mining add-on for Orange3
trinker/lexicon
A data package containing lexicons and dictionaries for text analysis
voku/stop-words
PHP | A collection of stop words for use in, e.g., search functions.
skupriienko/Ukrainian-Stopwords
A list of ~2000 Ukrainian stopwords (with numbers)
ziaa/Persian-stopwords-collection
A collection of Persian stopwords
yihleego/trie
📒 An Aho-Corasick-based string-searching utility for Go. It supports tokenization, case-insensitive matching, and text replacement, so you can use it to find keywords in an article, filter sensitive words, etc.
huned/node-stopwords
npm install stopwords
mustafaturan/omnicat-bayes
Naive Bayes text classification implementation as an OmniCat classifier strategy. (#ruby #naivebayes)
SannketNikam/Emotion-Detection-in-Text
This project employs emotion detection in textual data, specifically trained on Twitter data comprising tweets labeled with corresponding emotions. It seamlessly takes text inputs and provides the most fitting emotion assigned to it.
yihleego/trie4j
📒 An Aho-Corasick-based string-searching utility for Java. It supports tokenization, case-insensitive matching, and text replacement, so you can use it to find keywords in an article, filter sensitive words, etc.
davidsbatista/lexicons
Dictionaries of names, surnames, acronyms and their extensions, stop-words, etc., which I gathered for different experiments.
cmccomb/rust-stop-words
Common stop words in a variety of languages
koheiw/marimo
Multilingual stopword lists
dohliam/hawaiian-corpus
Data from a corpus of written Hawaiian
dohliam/more-stoplists
Stoplists for African languages generated from the ASP corpus
eklem/stopword-trainer
A module for creating stopword lists for any language, based on a set of documents.
vikasing/news-stopwords
A huge list of stopwords collected from millions of news articles
ani10030/bad-words-detector
A Python script to detect the language of a text and filter out bad words
SssiiiSssiii/ArabicTextCleaner
Arabic Text Cleaner
trajceskijovan/Structural-Topic-Modeling-in-R
Structural Topic Modeling in R (published two articles on Medium). STM, LDA, metadata, NLP.
ddhira123/Stop-Words-List
A stop words list for all languages, made by contributors around the world! Start your contributions now!
hklemp/dotnet-stop-words
Get a list of common stop words in various languages in .NET
quasoft/postgres-tsearch-bulgarian
Bulgarian Full Text Search Dictionaries for PostgreSQL based on Ispell from bgOffice Project
icflorescu/postgresql-tsearch-utils
A collection of files and patterns to improve PostgreSQL text search
hantang/data-corpus
A collection of corpus data and word lists: Chinese and English stop words, sentiment analysis, classification dictionaries, and sensitive-word lists (prohibited and censored words).