yscacaca/DeepSense

OutOfRangeError in loading data

Closed this issue · 9 comments

I downloaded the data, unzipped it in ./, ran deepSense_HHAR_tf.py, and got this error:

Traceback (most recent call last):
  File "deepSense_HHAR_tf.py", line 257, in <module>
    _, lossV, _trainY, _predict = sess.run([discOptimizer, loss, batch_label, predict])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 64, current size 10)
	 [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
	 [[Node: shuffle_batch/_19477 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_2574_shuffle_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op u'shuffle_batch', defined at:
  File "deepSense_HHAR_tf.py", line 209, in <module>
    batch_feature, batch_label = input_pipeline(csvFileList, BATCH_SIZE)
  File "deepSense_HHAR_tf.py", line 59, in input_pipeline
    min_after_dequeue=min_after_dequeue)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 1217, in shuffle_batch
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 788, in _shuffle_batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 457, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 946, in _queue_dequeue_many_v2
    timeout_ms=timeout_ms, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

OutOfRangeError (see above for traceback): RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 64, current size 10)
	 [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
	 [[Node: shuffle_batch/_19477 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_2574_shuffle_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

The "current size" changes randomly between runs. If shuffle_sample is set to False, the same error occurs, but with RandomShuffleQueue replaced by FIFOQueue.

Any recommendation to solve it? Thanks!

Python 2.7, tensorflow_gpu 1.2.1
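For context: in TF 1.x queue-based pipelines, this OutOfRangeError usually means the reader threads closed the queue before it held batch_size usable examples, which often points at the input files themselves rather than the model code. A minimal stdlib sketch (the path pattern is an assumption based on this thread, not taken from the repo) to check that every training CSV parses and has a consistent number of columns:

```python
import csv
import glob

def check_csv_files(pattern):
    """Return paths of CSV files that are empty or have ragged rows."""
    bad = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            widths = {len(row) for row in csv.reader(f) if row}
        if len(widths) != 1:  # empty file, or rows of differing width
            bad.append(path)
    return bad

# Example (directory layout assumed from the thread):
# print(check_csv_files('sepHARData_a/train/*.csv'))
```

Files flagged by a check like this would explain why the queue never reaches the requested 64 elements.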

Could you please check whether csvFileList contains the correct file paths of the training data?

I've printed csvFileList. At first I found many files with a "._" prefix, like "sepHARData_a/train/._train_96.csv", containing meaningless content. After skipping all of those files, I have a full csvFileList with correct data paths. But the previous error still occurs.
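Those "._" entries are AppleDouble metadata files that macOS adds when creating tarballs; if they slip into the file list, TensorFlow's CSV reader will choke on them. A small sketch of the filtering step (csvFileList and the example paths are taken from this thread; the helper name is hypothetical):

```python
import os

def keep_real_csvs(paths):
    """Drop macOS AppleDouble entries ('._*') and any non-CSV files."""
    return [p for p in paths
            if not os.path.basename(p).startswith('._')
            and p.endswith('.csv')]

csvFileList = keep_real_csvs([
    'sepHARData_a/train/train_96.csv',
    'sepHARData_a/train/._train_96.csv',
])
# csvFileList -> ['sepHARData_a/train/train_96.csv']
```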

Sorry, I'm not sure what's happening on your side. I've tested the code on three machines, and they all work fine. Maybe you can try increasing your file descriptor limit?
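For reference, the soft file-descriptor limit can be inspected, and raised up to the hard limit, from Python's standard library; a sketch (the value 4096 is an arbitrary example, not from this thread):

```python
import resource

# Current soft and hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('soft=%d hard=%d' % (soft, hard))

# Raise the soft limit toward the hard limit if it looks low.
target = min(4096, hard) if hard != resource.RLIM_INFINITY else 4096
if soft < target:
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
```

On most Linux systems the same check is `ulimit -n` in the shell.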

My file descriptor limit seems fine. BTW, I put the downloaded data in the same directory as the .py files and unzipped it with

tar xzf sepHARData_a.tar.gz 

Is it the right way?

Yes, that's right.

I am getting the same problem reported here with tensorflow_gpu 1.8. Has anyone found a solution to this problem?

Which CUDA version are you using? I'm using Python 2.7.18. I've tried TensorFlow versions 1.5.0, 1.6.0, and 1.15.0 and am still facing the issue. Please help.

Testing on Ubuntu 18.04

Hi @XinyuZhou-1014,
I faced this problem too. Have you found a way to solve it?