Can't download GLUE RTE dataset
shunyuzh opened this issue · 1 comment
shunyuzh commented
t5_mesh_transformer \
--model_dir="${MODEL_DIR}" \
--t5_tfds_data_dir="${DATA_DIR}" \
--gin_file="dataset.gin" \
--gin_param="utils.run.mesh_shape = 'model:1,batch:1'" \
--gin_param="utils.run.mesh_devices = ['gpu:0']" \
--gin_param="MIXTURE_NAME = 'glue_rte_v002'" \
--gin_file="./t5_data/small/operative_config.gin"
Using the script above, I can't download the RTE task dataset. However, I can download the MRPC dataset by replacing 'glue_rte_v002' with 'glue_mrpc_v002'.
Generating dataset glue (/home/shunyu/container/Project/t5_data/glue/glue/rte/1.0.0)
Downloading and preparing dataset glue/rte/1.0.0 (download: 680.81 KiB, generated: Unknown size, total: 680.81 KiB) to /home/shunyu/container/Project/t5_data/glue/glue/rte/1.0.0...
Dl Completed...: 0 url [00:00, ? url/s] I0810 06:04:16.266083 139915764279104 download_manager.py:476] Downloading https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FRTE.zip?alt=media&token=5efa7e85-a0bb-4f19-8ea2-9e1840f077fb into /home/shunyu/container/Project/t5_data/glue/downloads/fire.goog.com_v0_b_mtl-sent-repr.apps.6LYu5E5vi2rqdhk1koV5_-GqVdFhgIxILgclq73PnGQ.zipalt=media&token=5efa7e85-a0bb-4f19-8ea2-9e1840f077fb.tmp.e6859f06c60544f7a5e3e3b8972b64ea...
Extraction completed...: 0 file [00:00, ? file/s] | 0/1 [00:00<?, ? url/s]
Dl Size...: 0 MiB [00:00, ? MiB/s]
Dl Completed...: 0%| | 0/1 [00:00<?, ? url/s]
INFO:tensorflow:training_loop marked as finished
I0810 06:04:16.657165 139915764279104 error_handling.py:115] training_loop marked as finished
WARNING:tensorflow:Reraising captured error
W0810 06:04:16.657362 139915764279104 error_handling.py:149] Reraising captured error
Traceback (most recent call last):
File "/anaconda/envs/t5/bin/t5_mesh_transformer", line 8, in <module>
sys.exit(console_entry_point())
File "/home/shunyu/container/Project/text-to-text-transfer-transformer/t5/models/mesh_transformer_main.py", line 283, in console_entry_point
app.run(main)
File "/home/shunyu/.local/lib/python3.8/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/home/shunyu/.local/lib/python3.8/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/home/shunyu/container/Project/text-to-text-transfer-transformer/t5/models/mesh_transformer_main.py", line 272, in main
utils.run(
File "/anaconda/envs/t5/lib/python3.8/site-packages/gin/config.py", line 1069, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/anaconda/envs/t5/lib/python3.8/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/anaconda/envs/t5/lib/python3.8/site-packages/gin/config.py", line 1046, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/anaconda/envs/t5/lib/python3.8/site-packages/mesh_tensorflow/transformer/utils.py", line 2598, in run
train_model_fn(estimator, vocabulary, sequence_length, batch_size,
File "/anaconda/envs/t5/lib/python3.8/site-packages/mesh_tensorflow/transformer/utils.py", line 1815, in train_model
estimator.train(input_fn=input_fn, max_steps=train_steps, hooks=hooks)
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3110, in train
rendezvous.raise_errors()
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 150, in raise_errors
six.reraise(typ, value, traceback)
File "/anaconda/envs/t5/lib/python3.8/site-packages/six.py", line 703, in reraise
raise value
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3100, in train
return super(TPUEstimator, self).train(
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 349, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1175, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1201, in _train_model_default
self._get_features_and_labels_from_input_fn(input_fn, ModeKeys.TRAIN))
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1037, in _get_features_and_labels_from_input_fn
self._call_input_fn(input_fn, mode))
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3062, in _call_input_fn
return input_fn(**kwargs)
File "/anaconda/envs/t5/lib/python3.8/site-packages/mesh_tensorflow/transformer/utils.py", line 1792, in input_fn
dataset = train_dataset_fn(
File "/anaconda/envs/t5/lib/python3.8/site-packages/gin/config.py", line 1069, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/anaconda/envs/t5/lib/python3.8/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/anaconda/envs/t5/lib/python3.8/site-packages/gin/config.py", line 1046, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/home/shunyu/container/Project/text-to-text-transfer-transformer/t5/models/mesh_transformer.py", line 77, in mesh_train_dataset_fn
ds = mixture_or_task.get_dataset(
File "/anaconda/envs/t5/lib/python3.8/site-packages/seqio/dataset_providers.py", line 1041, in get_dataset
ds = source.get_dataset(split=split, shuffle=shuffle, seed=seed)
File "/anaconda/envs/t5/lib/python3.8/site-packages/seqio/dataset_providers.py", line 371, in get_dataset
return self.tfds_dataset.load(
File "/anaconda/envs/t5/lib/python3.8/site-packages/seqio/utils.py", line 130, in load
return tfds.load(
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/load.py", line 346, in load
dbuilder.download_and_prepare(**download_and_prepare_kwargs)
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/dataset_builder.py", line 385, in download_and_prepare
self._download_and_prepare(
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1022, in _download_and_prepare
super(GeneratorBasedBuilder, self)._download_and_prepare(
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/dataset_builder.py", line 961, in _download_and_prepare
for split_generator in self._split_generators(
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/text/glue.py", line 448, in _split_generators
dl_dir = dl_manager.download_and_extract(self.builder_config.data_url)
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/download/download_manager.py", line 603, in download_and_extract
return _map_promise(self._download_extract, url_or_urls)
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/download/download_manager.py", line 636, in _map_promise
res = tf.nest.map_structure(lambda p: p.get(), all_promises) # Wait promises
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow/python/util/nest.py", line 867, in map_structure
structure[0], [func(*x) for x in entries],
File "/home/shunyu/.local/lib/python3.8/site-packages/tensorflow/python/util/nest.py", line 867, in <listcomp>
structure[0], [func(*x) for x in entries],
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/download/download_manager.py", line 636, in <lambda>
res = tf.nest.map_structure(lambda p: p.get(), all_promises) # Wait promises
File "/anaconda/envs/t5/lib/python3.8/site-packages/promise/promise.py", line 512, in get
return self._target_settled_value(_raise=True)
File "/anaconda/envs/t5/lib/python3.8/site-packages/promise/promise.py", line 516, in _target_settled_value
return self._target()._settled_value(_raise)
File "/anaconda/envs/t5/lib/python3.8/site-packages/promise/promise.py", line 226, in _settled_value
reraise(type(raise_val), raise_val, self._traceback)
File "/anaconda/envs/t5/lib/python3.8/site-packages/six.py", line 703, in reraise
raise value
File "/anaconda/envs/t5/lib/python3.8/site-packages/promise/promise.py", line 844, in handle_future_result
resolve(future.result())
File "/anaconda/envs/t5/lib/python3.8/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/anaconda/envs/t5/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
File "/anaconda/envs/t5/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/download/downloader.py", line 184, in _sync_download
with _open_url(url) as (response, iter_content):
File "/anaconda/envs/t5/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/download/downloader.py", line 231, in _open_with_requests
_assert_status(response)
File "/anaconda/envs/t5/lib/python3.8/site-packages/tensorflow_datasets/core/download/downloader.py", line 258, in _assert_status
raise DownloadError('Failed to get url {}. HTTP code: {}.'.format(
tensorflow_datasets.core.download.downloader.DownloadError: Failed to get url https://firebasestorage.googleapis.com/v0/b/mtl-sentence-representations.appspot.com/o/data%2FRTE.zip?alt=media&token=5efa7e85-a0bb-4f19-8ea2-9e1840f077fb. HTTP code: 403.
In call to configurable 'mesh_train_dataset_fn' (<function mesh_train_dataset_fn at 0x7f3fd8aeb790>)
In call to configurable 'run' (<function run at 0x7f3fd8b630d0>)
Who can help?
craffel commented
Hi, we use TensorFlow Datasets (TFDS) to download and prepare datasets, and that's where this error is occurring: the TFDS downloader gets an HTTP 403 for the RTE download URL, and I'm not sure why. You should open an issue at https://github.com/tensorflow/datasets.
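One way to confirm that the failure is in the download itself, and not in T5 or gin, is to request the URL from the DownloadError message directly and look at the status code. The following is a minimal sketch that assumes only the Python standard library; `probe_url` and `RTE_URL` are hypothetical names introduced here for illustration and are not part of T5 or TFDS.

```python
import urllib.error
import urllib.request

# URL copied verbatim from the DownloadError message in the traceback.
RTE_URL = (
    "https://firebasestorage.googleapis.com/v0/b/"
    "mtl-sentence-representations.appspot.com/o/data%2FRTE.zip"
    "?alt=media&token=5efa7e85-a0bb-4f19-8ea2-9e1840f077fb"
)


def probe_url(url: str, timeout: float = 10.0) -> int:
    """Send a GET request to `url` and return the HTTP status code.

    Hypothetical helper for diagnosis only; it does not save the response body.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is what we want.
        return err.code
```

Calling `probe_url(RTE_URL)` from the same machine should show whether the server also refuses the request outside of TFDS. If it returns 403 there as well, the problem lies with the hosted file or its access token, which is useful detail to include in the TFDS issue.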