keras-team/autokeras

Bug: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205 when run from a Chinese path, but not from an English path

huangliang0828 opened this issue · 0 comments

Bug Description

Local Python scripts (mainly those using autokeras) report the following error when run from a directory whose path contains Chinese characters, but run normally from an English-only directory path.
File D:\ProgramData\miniconda3\envs\autokeras_1\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205: invalid continuation byte.
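
For illustration, 0xd5 falls in the lead-byte range that GBK uses for Chinese characters, and decoding such bytes as UTF-8 produces this kind of message. The sketch below is only an assumption about the cause, not a confirmed diagnosis: the function name diagnose_path_bytes and the sample paths are hypothetical, and it uses only the Python standard library.

```python
def diagnose_path_bytes(path: str, legacy_codec: str = "gbk") -> str:
    """Encode a path with a legacy codec and report whether it decodes as UTF-8."""
    raw = path.encode(legacy_codec)
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte that UTF-8 could not decode.
        return f"byte 0x{raw[exc.start]:02x} at position {exc.start}: {exc.reason}"
    return "decodes cleanly as UTF-8"


if __name__ == "__main__":
    # Hypothetical paths, not the ones from this report.
    print(diagnose_path_bytes("D:\\张\\proj"))    # path with a Chinese character
    print(diagnose_path_bytes("D:\\data\\proj"))  # ASCII-only path
```

If the real working directory fails the same way, that would point to the path being handed to a UTF-8 decoder as GBK bytes somewhere below clf.fit.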

Dataset used
from sklearn.datasets import fetch_20newsgroups
fetch_20newsgroups(subset="train", shuffle=True, random_state=42, categories=categories)
or others, such as
tf.keras.utils.get_file("train.csv", TRAIN_DATA_URL)
and even
mnist.load_data() and imdb.load_data()

I don't know what the problem is.

Setup Details

Include the details about the versions of:

  • OS type and version: Windows 10 64-bit, Miniconda 2023
  • Python: 3.9.13 or 3.9.16, Spyder 5.3.3
  • autokeras: >=1.0.20
  • keras-tuner: >=1.1.0
  • scikit-learn: >=1.0.1
  • numpy: >=1.21.5
  • pandas: >=1.3.5
  • tensorflow: >=2.9.1 or 2.10

Additional context

CMD settings: chcp 65001 (UTF-8) or chcp 936 (GBK)
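
Since chcp only changes the console code page, it may help to record which encodings the Python process itself reports. The following is a minimal sketch using only the standard library; the function name encoding_report is hypothetical, and whether these values influence the TensorFlow call that fails is an assumption.

```python
import locale
import os
import sys


def encoding_report() -> dict:
    """Collect the encodings Python reports and whether the cwd is ASCII-only."""
    cwd = os.getcwd()
    return {
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "utf8_mode": bool(sys.flags.utf8_mode),
        "cwd": cwd,
        "cwd_is_ascii": cwd.isascii(),
    }


if __name__ == "__main__":
    for key, value in encoding_report().items():
        print(f"{key}: {value}")
```

Comparing this output between the Chinese and English directories would show whether anything other than the path itself differs between the two runs.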

All scripts run into the UnicodeDecodeError at clf.fit(...):
Search: Running Trial #1
Hyperparameter |Value |Best Value So Far
text_block_1/bl...|vanilla |?
........
optimizer |adam |?
learning_rate |0.001 |?

Epoch 1/100
Traceback (most recent call last):

Cell In[7], line 1
clf.fit(doc_train, label_train,epochs=100, verbose=2)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\tasks\text.py:160 in fit
history = super().fit(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\auto_model.py:292 in fit
history = self.tuner.search(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\engine\tuner.py:193 in search
super().search(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras_tuner\engine\base_tuner.py:179 in search
results = self.run_trial(trial, *fit_args, **fit_kwargs)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras_tuner\engine\tuner.py:304 in run_trial
obj_value = self._build_and_fit_model(trial, *args, **copied_kwargs)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\engine\tuner.py:101 in _build_and_fit_model
_, history = utils.fit_with_adaptive_batch_size(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:88 in fit_with_adaptive_batch_size
history = run_with_adaptive_batch_size(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:101 in run_with_adaptive_batch_size
history = func(x=x, validation_data=validation_data, **fit_kwargs)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:89 in <lambda>
batch_size, lambda **kwargs: model.fit(**kwargs), **fit_kwargs
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras\utils\traceback_utils.py:67 in error_handler
raise e.with_traceback(filtered_tb) from None
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205: invalid continuation byte