NLP Recipes for Japanese

This repository contains sample code for natural language processing in Japanese. It is heavily inspired by microsoft/nlp-recipes.

Content

The table below summarizes the common NLP scenarios covered in this repository. Each scenario is demonstrated in one or more scripts or Jupyter notebooks that use the repository's core models and utilities.

Category	Methods
Basic	Cleaning, Normalization, Stopwords, Sentence Segmentation, Ruby
Embeddings	Word2Vec, fastText, Universal Sentence Encoder
Feature Engineering	Bag-of-Words, TF-IDF, BM25, SWEM, SCDV
Morphological Analysis	Konoha, nagisa
Sentence Similarity	Cosine Similarity
Sentiment Analysis	oseti
Text Classification	TF-IDF & Logistic Regression, TF-IDF & LightGBM, BERT, T5
Visualization	Visualization with Japanese texts

Environment

Build and start the container in the background, then open a shell inside it:

docker-compose up -d --build
docker exec -it nlp-recipes-ja bash
