DeepRec-AI/DeepRec

Parquet dataset reader throws InternalError: Read uninitialized Dataset variant.


Here is my test code:

import tensorflow as tf
from tensorflow.python.data.experimental.ops import parquet_dataset_ops
s = tf.Session()
ds = parquet_dataset_ops.ParquetDataset('test.parquet', batch_size=10)
dataset_output_types = tf.data.get_output_types(ds)
dataset_output_shapes = tf.data.get_output_shapes(ds)
iterator = tf.data.Iterator.from_structure(dataset_output_types, dataset_output_shapes)
init_op = iterator.make_initializer(ds)
s.run(init_op)
s.run(iterator.get_next())

The following is the full log:

Traceback (most recent call last):
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1374, in _do_call
    return fn(*args)
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1358, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1450, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InternalError: Read uninitialized Dataset variant.
         [[{{node IteratorGetNext}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    s.run(iterator.get_next())
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 964, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1188, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1367, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1393, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Read uninitialized Dataset variant.
         [[node IteratorGetNext (defined at /app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'IteratorGetNext':
  File "test.py", line 10, in <module>
    s.run(iterator.get_next())
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 425, in get_next
    flat_ret = gen_dataset_ops.iterator_get_next(
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 3683, in iterator_get_next
    _, _, _op = _op_def_lib._apply_op_helper(
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/op_def_library.py", line 792, in _apply_op_helper
    op = g.create_op(op_type_name, inputs, dtypes=None, name=scope,
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 3360, in create_op
    return self._create_op_internal(op_type, inputs, dtypes, input_types, name,
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 3422, in _create_op_internal
    ret = Operation(
  File "/app/anaconda3/lib/python3.8/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

I used the DeepRec release image alideeprec/deeprec-release:deeprec2310-cpu-py38-ubuntu20.04 with the following test code:

import tensorflow as tf
from tensorflow.python.data.experimental.ops import parquet_dataset_ops
s = tf.Session()
ds = parquet_dataset_ops.ParquetDataset('feature.parquet', batch_size=10)
dataset_output_types = tf.data.get_output_types(ds)
dataset_output_shapes = tf.data.get_output_shapes(ds)
iterator = tf.data.Iterator.from_structure(dataset_output_types, dataset_output_shapes)
init_op = iterator.make_initializer(ds)
s.run(init_op)
a = s.run(iterator.get_next())
print(a)

The result of the execution is as follows:
[image]

I was unable to reproduce the issue; the program runs as expected in this image.
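Since the failing run above uses an Anaconda Python 3.8 install while this run uses the release image, it may help to compare the two environments side by side. Below is a minimal sketch for that; `describe_environment` is a hypothetical helper written for this comment, not part of DeepRec or TensorFlow, and it assumes the installed package exposes `__version__`.

```python
import importlib
import platform
import sys


def describe_environment(module_names=("tensorflow",)):
    """Collect interpreter details and the versions of the named modules.

    Hypothetical helper for comparing two environments; not a DeepRec API.
    """
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
        "platform": platform.platform(),
    }
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # ImportError, or a failure while the package loads
            info[name] = "not importable ({})".format(type(exc).__name__)
        else:
            # Assumption: the package publishes its version as __version__.
            info[name] = getattr(module, "__version__", "unknown")
    return info
```

Running `print(describe_environment())` in both the failing environment and the release image, then posting the two outputs here, should show whether the same build is being tested.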