BUG: Byte like object is expected not a dict error
andreamoro opened this issue · 0 comments
I have been trying to read a bunch of text lines from a txt file and pass each one to text_to_word_sequence, without success, in a Colab notebook (which runs keras_preprocessing 1.1.0).
import tensorflow as tf
import keras
from keras.preprocessing.text import Tokenizer

lines_dataset = tf.data.TextLineDataset(CSV_PATH)
k_vocabulary_set = set()
for text_tensor in lines_dataset:
    print(text_tensor)
    print(type(text_tensor.numpy()))
    # .numpy() returns bytes here, and this call raises the TypeError shown below
    print(keras.preprocessing.text.text_to_word_sequence(text_tensor.numpy()))
    print()
    break
Output
tf.Tensor(b'free game', shape=(), dtype=string)
<class 'bytes'>
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-71-04ae0d1bc4ad> in <module>()
6 print(text_tensor)
7 print(type(text_tensor.numpy()))
----> 8 print(keras.preprocessing.text.text_to_word_sequence(text_tensor.numpy()))
9 print()
10
/usr/local/lib/python3.6/dist-packages/keras_preprocessing/text.py in text_to_word_sequence(text, filters, lower, split)
56 text = text.replace(c, split)
57 else:
---> 58 translate_dict = {c: split for c in filters}
59 translate_map = maketrans(translate_dict)
60 text = text.translate(translate_map)
TypeError: a bytes-like object is required, not 'dict'
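A possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the map built by `maketrans(translate_dict)` is a dict, which `str.translate` accepts but `bytes.translate` rejects, so the call fails whenever `text` is `bytes`. The snippet below reproduces that in plain Python without Keras. `ensure_text` is a hypothetical helper, the filter characters are a small illustrative subset, and UTF-8 is an assumed encoding for the file.

```python
def ensure_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Same kind of value that text_tensor.numpy() returns for a tf.string scalar.
raw = b"free game"

# Dict-based translation table, similar in spirit to the one in the traceback.
# The two filter characters are an illustrative subset, not the library default.
table = str.maketrans({"!": " ", ",": " "})

# bytes.translate expects a 256-byte table, so a dict-based table is rejected.
try:
    raw.translate(table)
    bytes_failed = False
except TypeError:
    bytes_failed = True

# Decoding first gives a str, for which the dict-based table works.
words = [w for w in ensure_text(raw).lower().translate(table).split(" ") if w]

print(bytes_failed, words)
```

If this reading is right, decoding before the call in the loop above, for example `text_to_word_sequence(text_tensor.numpy().decode("utf-8"))`, should work around the error, assuming the file is UTF-8 encoded.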