google-research/t5x

Accessing data in a requester pays bucket

tomlimi opened this issue · 0 comments

I'm trying to pretrain an mT5 model on the mC4 data, which is hosted in a requester pays bucket, following the example:

TFDS_DATA_DIR="gs://allennlp-tensorflow-datasets/c4/multilingual/3.0.1"

python3 ${T5X_DIR}/t5x/train.py \
  --gin_file="${T5X_DIR}/t5x/examples/pretrain_mt5_mc4.gin" \
  --gin.MODEL_DIR="'${MODEL_DIR}'" \
  --tfds_data_dir="${TFDS_DATA_DIR}"

I get the following error, apparently because the request does not include my project. How can I specify my project ID?

tensorflow.python.framework.errors_impl.InvalidArgumentError: Error executing an HTTP request: HTTP response code 400 with body '{
  "error": {
    "code": 400,
    "message": "Bucket is a requester pays bucket but no user project provided.",
    "errors": [
      {
        "message": "Bucket is a requester pays bucket but no user project provided.",
        "domain": "global",
        "reason": "required"
      }
    ]
  }
}
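
One possible workaround, offered as an untested sketch rather than a confirmed fix for T5X: copy the data into a bucket you own and bill the transfer to your own project, using gsutil's top-level `-u` (user project) option. The project ID `my-gcp-project` and the destination `gs://my-bucket/mc4/` are hypothetical placeholders, and the script only prints the command unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Source path taken from the command in this issue.
SRC_DIR="gs://allennlp-tensorflow-datasets/c4/multilingual/3.0.1"

# Hypothetical placeholders: replace with your own project ID and bucket.
BILLING_PROJECT="my-gcp-project"
DEST_DIR="gs://my-bucket/mc4/"

# -u names the project to bill for requester pays access;
# -m runs the copy in parallel; cp -r copies the directory recursively.
copy_cmd=(gsutil -u "${BILLING_PROJECT}" -m cp -r "${SRC_DIR}" "${DEST_DIR}")

# Dry run by default: print the command so it can be reviewed first.
echo "${copy_cmd[*]}"

# Set RUN=1 in the environment to actually perform the copy.
if [[ "${RUN:-0}" == "1" ]]; then
  "${copy_cmd[@]}"
fi
```

After the copy, `--tfds_data_dir` would need to point at the new location, assuming the directory layout is preserved. The billed project pays for the transfer, so it is worth checking the dataset size before running this. The sketch does not address whether TensorFlow can attach a billing project when reading the original bucket directly.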