Inspired by Goken's idea presented in this YouTube video, this is an interactive tool for translating Japanese subtitles (and text in general) selectively with ease, using automation based on natural language processing engines.
At this stage, this is only a proof of concept of the value such a tool could have for those who want to learn Japanese (or possibly other languages) using Goken's subtitle format. If you like the idea or have any suggestions, please (!) open an issue or contact me!
Clone the repository and then:
pip install -e ./seltran
The seltran-editor executable will be installed in your Python scripts directory.
- All tokenization, translation and phonetization operations are performed offline, using the spacy and pykakasi NLP libraries for token generation and phonetization, and a simple offline Japanese dictionary using jamdict.
- An easy-to-configure filtering mechanism determines which types of tokens (e.g. nouns, verbs, or any other token quality) get translation suggestions and which only get phonetized. Currently this filtering cannot be configured through the UI; see the roadmap.
- Interactive text translation for individual tokens with suggestions based on part of speech.
- Configure suggestion filters through UI
- Apply actions to all tokens according to rules - e.g. phonetize all particles, translate all nouns using first dictionary entry
- Use an actual translation engine to select the best-fitting translation for a word based on context.
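
The token filtering described above (some token types get translation suggestions, the rest are only phonetized) can be illustrated with a minimal sketch. This is not seltran's actual implementation: the `Token` class, the `TRANSLATABLE_POS` set and the `partition` function are hypothetical names invented for the example, and the part-of-speech tags are assigned by hand here rather than produced by spacy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a tokenizer result (not a seltran class)."""
    text: str
    pos: str  # Universal POS tag such as "NOUN", "VERB" or "ADP"


# Assumed filter configuration: token types that get translation suggestions.
TRANSLATABLE_POS = frozenset({"NOUN", "VERB", "ADJ"})


def partition(tokens, translatable_pos=TRANSLATABLE_POS):
    """Split tokens into (to_translate, to_phonetize) by part of speech."""
    to_translate = [t for t in tokens if t.pos in translatable_pos]
    to_phonetize = [t for t in tokens if t.pos not in translatable_pos]
    return to_translate, to_phonetize


# Hand-tagged example sentence: 猫が魚を食べる ("the cat eats the fish").
tokens = [
    Token("猫", "NOUN"),
    Token("が", "ADP"),
    Token("魚", "NOUN"),
    Token("を", "ADP"),
    Token("食べる", "VERB"),
]

to_translate, to_phonetize = partition(tokens)
print("translate:", [t.text for t in to_translate])
print("phonetize:", [t.text for t in to_phonetize])
```

Passing a different `translatable_pos` set changes which tokens receive suggestions, which is roughly the kind of setting the roadmap item about configuring filters through the UI would expose.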