Moses is an open source statistical machine translation (SMT) system, developed for the most part at the University of Edinburgh. It can be trained to translate between any pair of languages.
Moses is licensed under the LGPL.
This Dockerfile builds an English-to-Japanese translation environment that uses Moses.
To run this system, you need an environment that can execute Docker commands. First, build the Docker image as follows.
    $ cd dockerfile
    $ docker build -t moses:0.1 .
    ...
    Successfully built d2e2324dfe06
The build takes some time: about 40 minutes on my PC (Core i7, 2.7 GHz). Once it has finished, you can confirm that the image exists as follows.
    $ docker images
    REPOSITORY   TAG   IMAGE ID       CREATED         VIRTUAL SIZE
    moses        0.1   d2e2324dfe06   7 minutes ago   2.828 GB
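Because the build takes a long time, it can be worth checking for the image before rebuilding. The following is a minimal sketch, not part of the original Dockerfile: the function name `ensure_image` is hypothetical, and it assumes that `docker images -q` prints nothing when no image matches the given tag.

```shell
# Sketch: build the image only if it does not exist yet.
# ensure_image is a hypothetical helper name, not part of this repository.
#   $1: image tag (for example moses:0.1)
#   $2: build context directory (for example .)
ensure_image() {
    image="$1"
    context="$2"
    # docker images -q prints the image ID, or nothing if no image matches.
    if [ -n "$(docker images -q "$image")" ]; then
        echo "$image already exists, skipping build"
    else
        docker build -t "$image" "$context"
    fi
}
```

From the `dockerfile` directory, this would be called as `ensure_image moses:0.1 .`.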
To translate, start Moses inside a Docker container and type an English sentence, as shown below.
    $ docker run -i -t moses:0.1                                // (1)
    /opt# mosesdecoder-master/bin/moses -f model/moses.ini      // (2)
    Defined parameters (per moses.ini or switch):
        config: model/moses.ini
        distortion-limit: 6
        feature: UnknownWordPenalty WordPenalty PhrasePenalty PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/opt/model/phrase-table.gz input-factor=0 output-factor=0 Distortion KENLM name=LM0 factor=0 path=/opt/corpus/tanaka.ja.arpa order=5
        input-factors: 0
        mapping: 0 T 0
        weight: UnknownWordPenalty0= 1 WordPenalty0= -1 PhrasePenalty0= 0.2 TranslationModel0= 0.2 0.2 0.2 0.2 Distortion0= 0.3 LM0= 0.5
    line=UnknownWordPenalty
    FeatureFunction: UnknownWordPenalty0 start: 0 end: 0
    line=WordPenalty
    FeatureFunction: WordPenalty0 start: 1 end: 1
    line=PhrasePenalty
    FeatureFunction: PhrasePenalty0 start: 2 end: 2
    line=PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/opt/model/phrase-table.gz input-factor=0 output-factor=0
    FeatureFunction: TranslationModel0 start: 3 end: 6
    line=Distortion
    FeatureFunction: Distortion0 start: 7 end: 7
    line=KENLM name=LM0 factor=0 path=/opt/corpus/tanaka.ja.arpa order=5
    Loading the LM will be faster if you build a binary file.
    Reading /opt/corpus/tanaka.ja.arpa
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    FeatureFunction: LM0 start: 8 end: 8
    Loading UnknownWordPenalty0
    Loading WordPenalty0
    Loading PhrasePenalty0
    Loading Distortion0
    Loading LM0
    Loading TranslationModel0
    Start loading text phrase table. Moses format : [1.693] seconds
    Reading /opt/model/phrase-table.gz
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    Created input-output object : [8.323] seconds
    my name is nakano                                           // (3)
    Translating: my name is nakano
    Line 0: Initialize search took 0.002 seconds total
    Line 0: Collecting options took 0.000 seconds at moses/Manager.cpp Line 141
    Line 0: Search took 0.008 seconds
    私の 名前 nakano だ 。                                       // (4)
    BEST TRANSLATION: 私の 名前 nakano|UNK|UNK|UNK だ 。 [1111] [total=-112.850] core=(-100.000,-6.000,4.000,-1.224,-4.902,-3.495,-5.582,-3.000,-31.419)
    Line 0: Decision rule took 0.000 seconds total
    Line 0: Additional reporting took 0.000 seconds total
    Line 0: Translation took 0.014 seconds total
(1) Start the Docker container.
(2) Start Moses.
(3) The English sentence typed as input.
(4) The Japanese sentence produced as output.
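The interactive session above can also be driven from a pipe. The following is a minimal sketch under several assumptions: the function name `translate_en_ja` and the `MOSES_IMAGE` variable are hypothetical, the image defines no ENTRYPOINT that would swallow the command, and the absolute paths are inferred from the `/opt#` prompt and the `/opt/model/...` paths in the log. As far as I know, Moses writes translations to standard output and its log messages to standard error, so redirecting standard error should leave only the translation.

```shell
# Sketch: translate English sentences without an interactive session.
# MOSES_IMAGE and translate_en_ja are hypothetical names used for illustration.
MOSES_IMAGE="${MOSES_IMAGE:-moses:0.1}"

# Reads English sentences from stdin, one per line.
# -i keeps stdin open for the pipe; -t is omitted because stdin is not a terminal.
# --rm removes the container once Moses exits.
translate_en_ja() {
    docker run --rm -i "$MOSES_IMAGE" \
        /opt/mosesdecoder-master/bin/moses -f /opt/model/moses.ini
}
```

For example, `echo "my name is nakano" | translate_en_ja 2>/dev/null` sends one sentence through the decoder and hides the log output.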
This Dockerfile is based on the blog post linked below. Many thanks to its author.
My next steps will be:
- Japanese-to-English translation
- Exposing translation as a RESTful service
- Integration with a chat application