Exception occurred when running on Colab
HonkCY opened this issue · 1 comments
HonkCY commented
Environment
- Python version: Python 3.6.9
- Tensorflow version: 1.15.4
- GPU Info:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.45.01 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... Off | 00000000:00:04.0 Off | 0 |
| N/A 31C P0 23W / 300W | 0MiB / 16130MiB | 0% Default |
| | | ERR! |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
- Memory Info:
MemTotal: 26751732 kB
MemFree: 17748536 kB
MemAvailable: 25743180 kB
Buffers: 128664 kB
Cached: 7904776 kB
SwapCached: 0 kB
Active: 824088 kB
Inactive: 7606188 kB
Active(anon): 361548 kB
Inactive(anon): 408 kB
Active(file): 462540 kB
Inactive(file): 7605780 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 228 kB
Writeback: 0 kB
AnonPages: 397280 kB
Mapped: 208216 kB
Shmem: 960 kB
Slab: 405832 kB
SReclaimable: 343264 kB
SUnreclaim: 62568 kB
KernelStack: 4304 kB
PageTables: 5488 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 13375864 kB
Committed_AS: 2823532 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
Percpu: 1904 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 181480 kB
DirectMap2M: 6109184 kB
DirectMap1G: 23068672 kB
Expected behavior
The following exceptions occurred when I run the colab notebook.
INFO:tensorflow:Restoring parameters from models/mega/model.ckpt-220000
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
(0) Out of range: Read less bytes than requested
[[{{node save/RestoreV2}}]]
[[save/RestoreV2/_1499]]
(1) Out of range: Read less bytes than requested
[[{{node save/RestoreV2}}]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "scripts/demo.py", line 171, in <module>
saver.restore(sess, args.ckpt_fn)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1290, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
(0) Out of range: Read less bytes than requested
[[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[save/RestoreV2/_1499]]
(1) Out of range: Read less bytes than requested
[[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'save/RestoreV2':
File "scripts/demo.py", line 170, in <module>
saver = tf.train.Saver()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 828, in __init__
self.build()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 840, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 878, in _build
build_restore=build_restore)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 508, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 328, in _AddRestoreOps
restore_sequentially)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 575, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1696, in restore_v2
name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
HonkCY commented
It's because the pl script cannot download the model from Google Cloud completely.