Permission denied error when accessing a model in GCS
giappham opened this issue · 11 comments
Google just updated the authentication method on April 1st, which leads to an error when accessing the model on the cloud. How can this be fixed?
InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
From /job:worker/replica:0/task:0:
Unsuccessful TensorSliceReader constructor: Failed to get matching files on gs://test_model/model.ckpt-524288: Permission denied: Error executing an HTTP request: HTTP response code 403 with body '{
  "error": {
    "code": 403,
    "message": "service-495559152420@cloud-tpu.iam.gserviceaccount.com does not have storage.objects.list access to the Google Cloud Storage bucket.",
    "errors": [
      {
        "message": "service-495559152420@cloud-tpu.iam.gserviceaccount.com does not have storage.objects.list access to the Google Cloud Storage bucket.",
        "domain": "global",
        "reason": "forbidden"
      }
    ]
  }
}
'
I no longer see the adc.json file since Google upgraded Colab. Before, Google authenticated via a key file; now you just click "Allow".
You can go to the GCS menu --> IAM & Admin --> add the Storage Admin role for the principal.
Hello,
Thank you for your answer. Unfortunately, I still have the same issue. I am running the following code in Google Colab:
print("Installing dependencies...")
%tensorflow_version 2.x
#!pip install -q tensorflow==2.8
#!pip install -q tensorflow-gcs-config==2.8
!pip install -q t5
import functools
import os
import time
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
import tensorflow.compat.v1 as tf
import tensorflow_datasets as tfds
import gin
import t5
ON_CLOUD = True
if ON_CLOUD:
print("Setting up GCS access...")
import tensorflow_gcs_config
from google.colab import auth
# Set credentials for GCS reading/writing from Colab and TPU.
TPU_TOPOLOGY = "v3-8" # v3-8
try:
tpu = tf.distribute.cluster_resolver.TPUClusterResolver() # TPU detection
TPU_ADDRESS = tpu.get_master()
print('Running on TPU:', TPU_ADDRESS)
except ValueError:
raise BaseException('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')
auth.authenticate_user()
tf.config.experimental_connect_to_host(TPU_ADDRESS)
tensorflow_gcs_config.configure_gcs_from_colab_auth()
tf.disable_v2_behavior()
# Improve logging.
from contextlib import contextmanager
import logging as py_logging
if ON_CLOUD:
tf.get_logger().propagate = False
py_logging.root.setLevel('INFO')
@contextmanager
def tf_verbosity_level(level):
og_level = tf.logging.get_verbosity()
tf.logging.set_verbosity(level)
yield
tf.logging.set_verbosity(og_level)
Thank you in advance.
Hello, my code is working now.
I commented out these two lines from the code:
tf.config.experimental_connect_to_host(TPU_ADDRESS)
tensorflow_gcs_config.configure_gcs_from_colab_auth()
Then, each bucket has certain permissions, as the image shows. So, I added the permissions indicated in the following image.
The most important thing is that I had to make my bucket public. Otherwise, it did not work.
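For reference, this is roughly the Colab-cell equivalent of making the bucket public from the Console; the bucket name gs://test_model is taken from the error message above, and allUsers with the Storage Object Viewer role is what "public" means here:

# Rough sketch: grant allUsers read access to the bucket's objects, i.e. make it public.
# "test_model" is the bucket name from the error above; substitute your own bucket.
!gsutil iam ch allUsers:objectViewer gs://test_model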
> You can go to the GCS menu --> IAM & Admin --> add the Storage Admin role for the principal.
Let me clarify this. We have to grant the appropriate storage role to the TPU service account: service-495559152420@cloud-tpu.iam.gserviceaccount.com. In our case, granting just Storage Admin solved the problem.
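If you prefer to do this from code rather than the Console, here is a minimal sketch using the google-cloud-storage client; the bucket name "test_model" is taken from the error message above, and "your-gcp-project" is a placeholder for your own project ID:

# Sketch: add a Storage Admin binding for the Cloud TPU service account
# to the bucket's IAM policy.
from google.cloud import storage

client = storage.Client(project="your-gcp-project")  # placeholder project ID
bucket = client.bucket("test_model")                  # bucket name from the error above

policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.admin",
    "members": {"serviceAccount:service-495559152420@cloud-tpu.iam.gserviceaccount.com"},
})
bucket.set_iam_policy(policy)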
I have made my bucket public by giving allUsers the Storage Admin role, just like @JessicaLopezEspejel 's screenshot shows, but it still gives me the error FileNotFoundError: [Errno 2] No such file or directory: '/content/adc.json'.
Does anyone know why that might be the case? Thanks.
@mshen2 Can you try this code, please?
%tensorflow_version 2.x

import tensorflow.compat.v1 as tf
import tensorflow_datasets as tfds
import gin

ON_CLOUD = True

if ON_CLOUD:
  print("Setting up GCS access...")
  import tensorflow_gcs_config
  from google.colab import auth
  # Set credentials for GCS reading/writing from Colab and TPU.
  TPU_TOPOLOGY = "v3-8"  # v3-8
  try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()  # TPU detection
    TPU_ADDRESS = tpu.get_master()
    print('Running on TPU:', TPU_ADDRESS)
  except ValueError:
    raise BaseException('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')
  auth.authenticate_user()
  #tf.config.experimental_connect_to_host(TPU_ADDRESS)
  #tensorflow_gcs_config.configure_gcs_from_colab_auth()

tf.disable_v2_behavior()

# Improve logging.
from contextlib import contextmanager
import logging as py_logging

if ON_CLOUD:
  tf.get_logger().propagate = False
  py_logging.root.setLevel('INFO')

@contextmanager
def tf_verbosity_level(level):
  og_level = tf.logging.get_verbosity()
  tf.logging.set_verbosity(level)
  yield
  tf.logging.set_verbosity(og_level)
Normally, if you have modified the storage role for the TPU service account, it will work correctly. This is the code from T5; it is the one I am using, and it is working well.
I'm told the solution is to authenticate with a service account instead, due to https://developers.googleblog.com/2022/02/making-oauth-flows-safer.html#disallowed-oo
Can someone please try auth.authenticate_service_account instead of auth.authenticate_user to verify that it works? You can create the required key with http://cloud/iam/docs/creating-managing-service-account-keys#creating.
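For reference, a rough, untested sketch of the service-account route using the standard Application Default Credentials mechanism (not necessarily the Colab helper named above); "/content/sa-key.json" is just a placeholder for wherever you upload the downloaded key file:

# Untested sketch: make a service-account key available via the standard
# GOOGLE_APPLICATION_CREDENTIALS environment variable. The path below is a
# placeholder for the key created via the link above.
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/content/sa-key.json"

# Re-run the GCS/TPU setup cell afterwards; libraries that use Application
# Default Credentials (including TensorFlow's gs:// filesystem) will pick up
# the service-account key from this variable.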
@JessicaLopezEspejel Thank you, commenting out those two lines does work, and I can indeed read from and write to GCS on Colab. I am not entirely sure what those two lines do and whether they are needed later, but the code is working fine at this point.
@adarob The same error happens when using auth.authenticate_service_account(), and the link is actually down.
Hello everyone, I'm probably a bit late to this issue, but for me, adding the command below before setting up GCS access solved the problem:
os.environ['USE_AUTH_EPHEM'] = '0'
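In context, that means setting the flag at the top of the setup cell, before the auth call runs, roughly like this (my understanding is that the flag makes Colab fall back to the older flow that writes /content/adc.json, which the GCS setup code above looks for):

# Sketch: set the flag before Colab auth / GCS setup runs.
import os
os.environ['USE_AUTH_EPHEM'] = '0'  # reportedly falls back to the older adc.json-based auth flow

from google.colab import auth
import tensorflow_gcs_config

auth.authenticate_user()
tensorflow_gcs_config.configure_gcs_from_colab_auth()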