protext is a Python library for processing Japanese text. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
from protext import ProText

# Put the Japanese text to analyze between the triple quotes.
text = """
"""

doc = ProText(text)
doc.tags          # part-of-speech tags
doc.noun_phrases  # noun phrases
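Once you have tags, a common next step is to group tokens by their part of speech. The sketch below assumes that doc.tags yields (token, tag) pairs, which this README does not confirm. The helper group_by_tag, the sample data, and the tag names are all hypothetical and written by hand, so the block runs without protext installed.

```python
from collections import defaultdict


def group_by_tag(tagged):
    """Group tokens by tag, keeping tokens in their original order."""
    groups = defaultdict(list)
    for token, tag in tagged:
        groups[tag].append(token)
    return dict(groups)


# Hand-written sample, NOT real protext output; the tag names are hypothetical.
sample_tags = [
    ("猫", "NOUN"),
    ("が", "PARTICLE"),
    ("魚", "NOUN"),
    ("を", "PARTICLE"),
    ("食べる", "VERB"),
]

grouped = group_by_tag(sample_tags)
for tag, tokens in grouped.items():
    print(tag, tokens)
```

If doc.tags does return pairs in this shape, group_by_tag(doc.tags) should work unchanged.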
- Noun phrase extraction
- Part-of-speech tagging
- Tokenization (splitting text into words and sentences)
- Parsing
- Word inflection (pluralization and singularization) and lemmatization
- WordNet integration
- Word embeddings
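To make the tokenization item above concrete, here is a minimal sketch of sentence splitting for Japanese text using only the standard library. It is not protext's implementation; the function name split_sentences and the choice of terminators (。, ！, ？) are assumptions made for illustration.

```python
import re

# One sentence = a run of non-terminator characters followed by any
# trailing terminators (so "！？" stays attached to its sentence).
_SENTENCE_RE = re.compile(r"[^。！？]+[。！？]*")


def split_sentences(text):
    """Split Japanese text into sentences on 。, ！ and ？ (naive sketch)."""
    pieces = (match.strip() for match in _SENTENCE_RE.findall(text))
    return [piece for piece in pieces if piece]


sentences = split_sentences("猫が走る。犬が寝る。")
print(sentences)
```

A rule like this ignores quotations and other edge cases, which is one reason to rely on a library for real text.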
To install protext, simply run:
$ pip install protext
$ python -m protext.download_corpora
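To check that the package is importable after installation, a short script along these lines can help. It uses only the standard library; the helper name is_installed is hypothetical and not part of protext.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None when a top-level module is missing.
    return importlib.util.find_spec(module_name) is not None


print("protext installed:", is_installed("protext"))
```

This only confirms that the package can be found; it does not verify that the corpora were downloaded.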
MIT licensed. See the bundled LICENSE file for more details.