tensorflow/tensor2tensor

Can't train translation on my own data


### Description

When I try to train English-to-Hindi translation, I get the following error:
raphil@raphil-Lenovo-G580:~$ PROBLEM=translateenhi
raphil@raphil-Lenovo-G580:~$ MODEL=transformer
raphil@raphil-Lenovo-G580:~$ HPARAMS=transformer_base
raphil@raphil-Lenovo-G580:~$ DATA_DIR=$HOME/t2t_data
raphil@raphil-Lenovo-G580:~$ TMP_DIR=/tmp/t2t_datagen
raphil@raphil-Lenovo-G580:~$ t2t-datagen \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR \
  --problem=$PROBLEM
raphil@raphil-Lenovo-G580:~$ PROBLEM=translateenhi
raphil@raphil-Lenovo-G580:~$ MODEL=transformer
raphil@raphil-Lenovo-G580:~$ HPARAMS=transformer_base
raphil@raphil-Lenovo-G580:~$ DATA_DIR=$HOME/t2t_data
raphil@raphil-Lenovo-G580:~$ TMP_DIR=/tmp/t2t_datagen
raphil@raphil-Lenovo-G580:~$ TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS
raphil@raphil-Lenovo-G580:~$ mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR
raphil@raphil-Lenovo-G580:~$ t2t-trainer \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --train_steps=1000 \
  --eval_steps=100
INFO:tensorflow:Found unparsed command-line arguments. Checking if any start with --hp_ and interpreting those as hparams settings.
WARNING:tensorflow:Found unknown flag: --problems=translateenhi
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
WARNING:tensorflow:Schedule=continuous_train_and_eval. Assuming that training is running on a single machine.
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
Traceback (most recent call last):
  File "/home/raphil/.local/bin/t2t-trainer", line 32, in <module>
    tf.app.run()
  File "/home/raphil/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/raphil/.local/bin/t2t-trainer", line 28, in main
    t2t_trainer.main(argv)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/bin/t2t_trainer.py", line 354, in main
    exp = exp_fn(create_run_config(hparams), hparams)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/trainer_lib.py", line 344, in experiment_fn
    return create_experiment(run_config, hparams, *args, **kwargs)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/trainer_lib.py", line 265, in create_experiment
    add_problem_hparams(hparams, problem_name)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/trainer_lib.py", line 356, in add_problem_hparams
    problem = registry.problem(problem_name)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/registry.py", line 265, in problem
    base_name, was_reversed, was_copy = parse_problem_name(name)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/registry.py", line 256, in parse_problem_name
    if problem_name.endswith("_rev"):
AttributeError: 'NoneType' object has no attribute 'endswith'
raphil@raphil-Lenovo-G580:~$ ^C
raphil@raphil-Lenovo-G580:~$

### Environment information

OS: Ubuntu

$ pip freeze | grep tensor
tensor2tensor==1.6.5
tensorboard==1.7.0
tensorflow==1.6.0
tensorflow-tensorboard==0.4.0

$ python -V
Python 2.7.12


### For bugs: reproduction and error logs

Steps to reproduce:

...


Error logs:

...

INFO:tensorflow:Found unparsed command-line arguments. Checking if any start with --hp_ and interpreting those as hparams settings.
WARNING:tensorflow:Found unknown flag: --problems=translateenhi
INFO:tensorflow:schedule=continuous_train_and_eval
INFO:tensorflow:worker_gpu=1
INFO:tensorflow:sync=False
WARNING:tensorflow:Schedule=continuous_train_and_eval. Assuming that training is running on a single machine.
INFO:tensorflow:datashard_devices: ['gpu:0']
INFO:tensorflow:caching_devices: None
INFO:tensorflow:ps_devices: ['gpu:0']
Traceback (most recent call last):
  File "/home/raphil/.local/bin/t2t-trainer", line 32, in <module>
    tf.app.run()
  File "/home/raphil/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/raphil/.local/bin/t2t-trainer", line 28, in main
    t2t_trainer.main(argv)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/bin/t2t_trainer.py", line 354, in main
    exp = exp_fn(create_run_config(hparams), hparams)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/trainer_lib.py", line 344, in experiment_fn
    return create_experiment(run_config, hparams, *args, **kwargs)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/trainer_lib.py", line 265, in create_experiment
    add_problem_hparams(hparams, problem_name)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/trainer_lib.py", line 356, in add_problem_hparams
    problem = registry.problem(problem_name)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/registry.py", line 265, in problem
    base_name, was_reversed, was_copy = parse_problem_name(name)
  File "/home/raphil/.local/lib/python2.7/site-packages/tensor2tensor/utils/registry.py", line 256, in parse_problem_name
    if problem_name.endswith("_rev"):
AttributeError: 'NoneType' object has no attribute 'endswith'

Please format your logs as code blocks (see https://help.github.com/articles/creating-and-highlighting-code-blocks/)
and read the warnings in the output you posted, namely:

INFO:tensorflow:Found unparsed command-line arguments. Checking if any start with --hp_ and interpreting those as hparams settings.
WARNING:tensorflow:Found unknown flag: --problems=translateenhi

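This unknown flag seems to be the cause of your problem: the trainer does not recognize --problems, so the problem name is apparently never set and is still None when the registry tries to parse it. Note that your t2t-datagen command uses --problem (singular).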
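
To make the failure mode concrete, here is a minimal Python sketch. It is not the tensor2tensor code: `parse_trainer_flags` is a hypothetical helper built on `argparse`, and `parse_problem_name` is a simplified stand-in for the registry function named in the traceback (the real one also returns a `was_copy` value). It assumes the trainer defines a singular `--problem` flag, as the t2t-datagen command in the report suggests.

```python
import argparse


def parse_trainer_flags(argv):
    """Parse a tiny, hypothetical subset of trainer flags.

    Unknown flags are returned separately instead of raising, which mirrors
    the "Found unknown flag" warning in the posted log.
    """
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--problem", default=None)
    parser.add_argument("--model", default=None)
    flags, unknown = parser.parse_known_args(argv)
    return flags, unknown


def parse_problem_name(problem_name):
    """Simplified stand-in for the registry helper, with an explicit None check."""
    if problem_name is None:
        raise ValueError(
            "No problem name was set; check the log for unknown-flag warnings.")
    was_reversed = problem_name.endswith("_rev")
    base_name = problem_name[:-len("_rev")] if was_reversed else problem_name
    return base_name, was_reversed


# Plural flag: it lands in `unknown`, and `problem` keeps its default of None.
flags, unknown = parse_trainer_flags(
    ["--problems=translateenhi", "--model=transformer"])
print(flags.problem, unknown)

# Singular flag: the problem name is set and can be parsed.
flags, unknown = parse_trainer_flags(
    ["--problem=translateenhi", "--model=transformer"])
print(parse_problem_name(flags.problem))
```

Without the `None` check, calling `.endswith` on the unset name would raise the same kind of `AttributeError` as in the report, so the warning earlier in the log is the more useful clue.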

If you are adding your own component, please specify --t2t_usr_dir=. See the "Adding your own components" section and related examples for more details.

https://github.com/tensorflow/tensor2tensor#adding-your-own-components

Just a heads up on the sample file:
The translation file contains several problem variants (each decorated with @registry.register_problem). You only need one of them to get a sample run working.