Quality Engineering Term Project
- Quality Evaluation of Google Translator

Analysis Flow and Development Guide

1. Download the test data: an English-Spanish parallel corpus (the data comes from the UN).
2. Translate the source language (English) into the target language (Spanish) using Google Cloud Translator.
3. Calculate the RIBES score using NLTK.
4. Calculate features for each sentence.
5. Run the quality engineering analysis on the processed data.

Input Form of csv file

The input csv file must follow the format below; otherwise the program may fail with an error.

english	spanish
My name is john.	(Spanish reference sentence)
...	...
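
As a rough illustration of the format requirement above, the following sketch checks that a csv header contains the two required columns. It assumes a comma-delimited file; the function name `check_input_header` and the sample text are hypothetical and not part of this project.

```python
import csv
import io

# Column names taken from the input format above.
REQUIRED_COLUMNS = ("english", "spanish")


def check_input_header(csv_text, delimiter=","):
    """Return the list of required columns missing from the header row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip().lower() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical sample content; the Spanish cell is only a placeholder.
sample = "english,spanish\nMy name is john.,(Spanish reference sentence)\n"
print(check_input_header(sample))          # nothing missing
print(check_input_header("en,es\na,b\n"))  # both required columns missing
```

An empty list means the header looks usable; anything else names the columns that still need to be added.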

Output Form of csv file

english	spanish	translated_spanish	ribes_score	number_of_words	number_of_alphabets	noun	adj	verb	adp	conj	height_of_parse_tree
My name is john.	(Spanish reference sentence)	(Sentence generated by Google Translator)	0.2232	3	...	...	...	...	...	...	...
...	...	...	...	...	...	...	...	...	...	...	...

RIBES score: The quality score of the translated sentence, evaluated against the reference Spanish sentence.
Feature #n: Features we define, such as the height of the dependency parse tree, in order to examine relationships among the features. These will be used for ANOVA, or for orthogonal arrays in DOE or the Taguchi method.

Setup

Authentication

Authentication for this service is done via an API Key. To obtain an API Key:

1. Open the Cloud Platform Console.
2. Make sure that billing is enabled for your project.
3. From the Credentials page, create a new API Key or use an existing one for your project.
4. Set the environment variable before starting the program, like this:

$ export GOOGLE_APPLICATION_CREDENTIALS=path_to_service_account_file
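
Before running the scripts, it can help to confirm that the variable is actually visible to Python. The sketch below is only a convenience check and not part of this project; the helper name `credentials_problem` is hypothetical, and it assumes the variable should point at an existing file, as the export command above suggests.

```python
import os


def credentials_problem(environ=None):
    """Return a message describing what is wrong, or None if the setup looks usable."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return "credentials file not found: " + path
    return None


problem = credentials_problem()
if problem:
    print("Warning:", problem)
```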

Install Dependencies

Install pip and virtualenv if you do not already have them.
Create a virtualenv. Samples are compatible with Python 3.4+.

$ virtualenv -p python3 env
$ source env/bin/activate

Install the dependencies needed to run the samples.

$ pip install -r requirements.txt

Samples

For step 1,

To make an input file for the program, run step 1.

$ python initialize_test.py ./data/es-en.csv 100

This writes a csv file in the input format described above, built from the data set 'es-en.csv'. The last argument, '100', limits the output to 100 sentences from the data set.
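
The idea of limiting the data set to a fixed number of sentences can be sketched as follows. This is not the code of initialize_test.py: it assumes the first rows are taken in order (the script may select them differently), and the helper name `take_rows` and the generated rows are hypothetical.

```python
import csv
import io
from itertools import islice


def take_rows(csv_text, limit):
    """Return the header plus at most `limit` data rows from csv text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return [header] + list(islice(reader, limit))


# Hypothetical five-row data set with placeholder cell values.
data = "english,spanish\n" + "".join(f"en {i},es {i}\n" for i in range(5))

subset = take_rows(data, 3)
print(len(subset) - 1)  # number of data rows kept
```

Asking for more rows than the data set contains simply returns every row.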

Main Function including steps 2, 3, 4

To run the main program with the csv file ./data/es-en.csv:

$ python main.py ./data/es-en.csv

Then it will output ./result/es-en.csv.

If the file passed as an argument is not a csv file, the program prints: Input file is not a csv file.
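
The behaviour described above (reject non-csv input, otherwise write a file of the same name under ./result) could look roughly like this. The sketch assumes the check is based on the file extension, which the README does not state; the function name `result_path_for` is hypothetical.

```python
import os


def result_path_for(input_path, result_dir="./result"):
    """Return the assumed output path, or None if the input is not a .csv file."""
    if not input_path.lower().endswith(".csv"):
        print("Input file is not a csv file.")
        return None
    # Keep the file name and place it in the result directory.
    return result_dir + "/" + os.path.basename(input_path)


print(result_path_for("./data/es-en.csv"))
print(result_path_for("./data/es-en.txt"))
```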

For step 2,

To run the program with the csv file ./data/es-en.csv:

$ python translator_csv.py ./data/es-en.csv

Then it will output ./result/es-en.csv.

For step 3,

To run the program with the csv file ./data/es-en.csv, which is the file generated in step 2:

$ python calculate_ribes.py ./data/es-en.csv

Then it will output ./result/es-en.csv.
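
For intuition about what RIBES rewards: the metric combines a word-order correlation term with precision and brevity factors. The sketch below illustrates only the word-order term, a normalized Kendall's tau over the reference positions of the aligned words. It is a simplified illustration and neither NLTK's implementation nor this project's code (NLTK provides `sentence_ribes` in `nltk.translate.ribes_score`). It assumes the positions are distinct integers, and the function name is hypothetical.

```python
from itertools import combinations


def normalized_kendall_tau(positions):
    """Fraction of position pairs in increasing order (1.0 means same word order)."""
    pairs = list(combinations(positions, 2))
    if not pairs:
        return 0.0  # fewer than two aligned words carry no order information
    concordant = sum(1 for a, b in pairs if a < b)
    return concordant / len(pairs)


print(normalized_kendall_tau([0, 1, 2, 3]))  # identical word order
print(normalized_kendall_tau([3, 2, 1, 0]))  # fully reversed order
print(normalized_kendall_tau([0, 2, 1, 3]))  # one swapped pair
```

A translation that keeps the reference word order scores close to 1 on this term, while a scrambled one scores close to 0.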

For step 4,

To run the program with the csv file ./data/es-en.csv, which is the file generated in step 3:

$ python calculate_features.py ./data/es-en.csv

Then it will output ./result/es-en.csv.
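
One of the features in the output format is height_of_parse_tree. As a minimal sketch of how such a value can be derived, the function below computes the height of a dependency tree from a list of head indices. It assumes a parser has already produced, for each token, the index of its head, with -1 marking the root, and that the list forms a valid tree. How calculate_features.py obtains the parse and whether it counts edges or nodes is not stated here; the function name and the example heads are hypothetical.

```python
def tree_height(heads):
    """Height, counted in edges, of a dependency tree given each token's head index.

    heads[i] is the index of the head of token i; the root token has head -1.
    """
    def depth(i):
        # Walk up from token i to the root, counting edges.
        steps = 0
        while heads[i] != -1:
            i = heads[i]
            steps += 1
        return steps

    return max((depth(i) for i in range(len(heads))), default=0)


# Hypothetical head indices for a five-token sentence: token 2 is the root.
print(tree_height([1, 2, -1, 2, 2]))
```

With this convention a single-token sentence has height 0, and a sentence whose tokens form one long chain has height equal to the number of tokens minus one.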

thesunghwan/google-translator-performance-analyzer
