suriyadeepan/easy_seq2seq

UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 1: invalid start byte

martin@ubuntu:~/Downloads/easy_seq2seq-master$ python execute.py

>> Mode : train

Preparing data in working_dir/
Tokenizing data in data/test.enc
Traceback (most recent call last):
  File "execute.py", line 303, in <module>
    train()
  File "execute.py", line 117, in train
    enc_train, dec_train, enc_dev, dec_dev, _, _ = data_utils.prepare_custom_data(gConfig['working_directory'],gConfig['train_enc'],gConfig['train_dec'],gConfig['test_enc'],gConfig['test_dec'],gConfig['enc_vocab_size'],gConfig['dec_vocab_size'])
  File "/home/martin/Downloads/easy_seq2seq-master/data_utils.py", line 147, in prepare_custom_data
    data_to_token_ids(test_enc, enc_dev_ids_path, enc_vocab_path, tokenizer)
  File "/home/martin/Downloads/easy_seq2seq-master/data_utils.py", line 125, in data_to_token_ids
    normalize_digits)
  File "/home/martin/Downloads/easy_seq2seq-master/data_utils.py", line 104, in sentence_to_token_ids
    words = basic_tokenizer(sentence)
  File "/home/martin/Downloads/easy_seq2seq-master/data_utils.py", line 51, in basic_tokenizer
    word = str.encode(space_separated_fragment)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 1: invalid start byte
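
A possible reading of this error, offered as a guess rather than a diagnosis: byte 0x96 is a UTF-8 continuation byte, so it can never start a character, whereas in Windows-1252 the same byte is an en dash. The sketch below assumes the data file was saved as Windows-1252; the sample bytes are hypothetical, since the contents of data/test.enc are not shown here.

```python
# Hypothetical sample line: the real contents of data/test.enc are not shown in this report.
raw = b"a\x96b"

utf8_error = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    # 0x96 is a UTF-8 continuation byte, so it cannot begin a character.
    utf8_error = err

# Assumption: the file was written as Windows-1252, where 0x96 is an en dash (U+2013).
text = raw.decode("cp1252")

# Re-encoding as UTF-8 yields bytes that a UTF-8 reader can decode.
utf8_bytes = text.encode("utf-8")
```

If that assumption holds, converting the data files to UTF-8 once, before running execute.py, should avoid the decode error without touching data_utils.py.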

Then I switched from Python 2.7 to Python 3:

martin@ubuntu:~/Downloads/easy_seq2seq-master$ python3 execute.py

>> Mode : train

Preparing data in working_dir/
Tokenizing data in data/test.dec
Creating 3 layers of 256 units.
WARNING:tensorflow:At least two cells provided to MultiRNNCell are the same object and will share weights.
Traceback (most recent call last):
  File "execute.py", line 301, in <module>
    train()
  File "execute.py", line 124, in train
    model = create_model(sess, False)
  File "execute.py", line 96, in create_model
    model = seq2seq_model.Seq2SeqModel( gConfig['enc_vocab_size'], gConfig['dec_vocab_size'], _buckets, gConfig['layer_size'], gConfig['num_layers'], gConfig['max_gradient_norm'], gConfig['batch_size'], gConfig['learning_rate'], gConfig['learning_rate_decay_factor'], forward_only=forward_only)
  File "/home/martin/Downloads/easy_seq2seq-master/seq2seq_model.py", line 147, in __init__
    self.outputs, self.losses = tf.nn.seq2seq.model_with_buckets(
AttributeError: module 'tensorflow.nn' has no attribute 'seq2seq'
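
For context on this second error: in the TensorFlow 1.x releases the functions that used to live under tf.nn.seq2seq are available under tf.contrib.legacy_seq2seq instead. The sketch below shows one way a script could look up whichever location exists. The helper name resolve_seq2seq is hypothetical and not part of this repository, and the stand-in objects only imitate the two module layouts so the snippet runs without TensorFlow installed.

```python
from types import SimpleNamespace


def resolve_seq2seq(tf_module):
    """Return the seq2seq module from whichever location this TensorFlow exposes.

    Hypothetical helper for illustration; it is not part of easy_seq2seq.
    """
    nn = getattr(tf_module, "nn", None)
    if hasattr(nn, "seq2seq"):
        # Layout used before TensorFlow 1.0.
        return nn.seq2seq
    contrib = getattr(tf_module, "contrib", None)
    if hasattr(contrib, "legacy_seq2seq"):
        # Layout used by the TensorFlow 1.x releases.
        return contrib.legacy_seq2seq
    raise AttributeError("no seq2seq module found in this TensorFlow install")


# Stand-ins that mimic the two layouts (not real TensorFlow modules).
old_tf = SimpleNamespace(nn=SimpleNamespace(seq2seq="old location"))
new_tf = SimpleNamespace(
    nn=SimpleNamespace(),
    contrib=SimpleNamespace(legacy_seq2seq="new location"),
)

old_result = resolve_seq2seq(old_tf)
new_result = resolve_seq2seq(new_tf)
```

With a real install this would be called as resolve_seq2seq(tf) after import tensorflow as tf, and the result used in place of tf.nn.seq2seq in seq2seq_model.py. Note that tf.contrib itself was removed in TensorFlow 2.x, so this only helps on 1.x.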