Pinned Repositories
freefields-from-string
Code for extracting field-like text from unformatted strings
gs-scripts
Samples of .gs scripts
ht-getter
Searches a document for hashtags. Supports multiple natural languages. Works in various contexts.
ja-sentence
Light-weight sentence tokenizer for Japanese.
js-sentence-tokenizers
JavaScript sentence tokenizers for multiple natural languages.
kr-sentence
Light-weight sentence tokenizer for Korean. Supports full-width and half-width punctuation marks.
rr-search-tries
Trie-based search classes for JavaScript
sentence-tokenizers
Sentence tokenizers for several languages
thelangbot
Twitter bot to help you learn foreign languages. Building a community through tweets. Retweets #100DaysOfLanguage and #langtwt.
zh-sentence
Light-weight sentence tokenizer for Chinese languages.
Rairye's Repositories
Rairye/zh-sentence
Light-weight sentence tokenizer for Chinese languages.
Rairye/ht-getter
Searches a document for hashtags. Supports multiple natural languages. Works in various contexts.
Rairye/ja-sentence
Light-weight sentence tokenizer for Japanese.
Rairye/js-sentence-tokenizers
JavaScript sentence tokenizers for multiple natural languages.
Rairye/kr-sentence
Light-weight sentence tokenizer for Korean. Supports full-width and half-width punctuation marks.
Rairye/sentence-tokenizers
Sentence tokenizers for several languages
Rairye/thelangbot
Twitter bot to help you learn foreign languages. Building a community through tweets. Retweets #100DaysOfLanguage and #langtwt.
Rairye/back-cleaner
Server-side Python tool for escaping script tags and converting characters into HTML entities (no regex).
Rairye/content_moderation_ideas
A collection of proof-of-concept approaches for using ideas from NLP/text processing to handle content moderation. (Light-weight approaches, no ML)
Rairye/convert-with-ents
Light-weight tool for converting characters in a string into common HTML entities (without regex).
Rairye/freefields-from-string
Code for extracting field-like text from unformatted strings
Rairye/gs-scripts
Samples of .gs scripts
Rairye/rr-search-tries
Trie-based search classes for JavaScript
Rairye/CPP-samples
C++ samples
Rairye/js-mnl-punct-norm
Light-weight tool for removing punctuation. Supports multiple natural languages. Useful for scraping, machine learning, and data analysis.
Rairye/js-mnl-ws-norm
Light-weight tool for normalizing whitespace and accurately tokenizing words. Multiple natural languages supported. Useful for scraping, machine learning, and data analysis.
Rairye/ko-ww-stopwords
Set of whole-word (independent) stop words in Korean
Rairye/mnl-punct-norm
Light-weight tool for removing punctuation. Supports multiple natural languages. Useful for scraping, machine learning, and data analysis.
Rairye/mnl-ws-norm
Light-weight tool for normalizing whitespace and accurately tokenizing words (no regex). Multiple natural languages supported. Useful for scraping, machine learning, and data analysis.
Rairye/RairyeTrieSample
Sample implementation of a trie (as an auto-complete dictionary)
Rairye/sentence-tk-checker
Checks output of an English sentence tokenizer and modifies the output according to default or user-defined rules.
Rairye/st-no-love
Tool for escaping script tags using backslashes (no regex).
Rairye/TranslationQATools
Java / Swing / Apache POI translation quality assurance tool
Rairye/TwoLanguageFormOutputFromSingleLanguageInput
(React Native, JavaScript) App for outputting forms in two languages from single-language user input.
Rairye/variant_lists