Package upload fails with "AttributeError: can't set attribute" when using pypicloud[gcs] 1.3.9 or later at GAE
vanguard737 opened this issue ยท 6 comments
Hi,
I'm running pypicloud
within Google App Engine, using uWSGI
and with GCS as the storage backend. On pypicloud
1.3.8 and earlier, this works fine. However, after upgrading to pypicloud
1.3.9 or later, I can no longer upload packages (either via poetry publish --repository <my_repo_name>
, or via the web UI). I observed the errors below in /var/log/pypicloud.log
.
Given that this behavior manifested in 1.3.9, and that the errors are coming up from smart_open
, PR #304 seems possibly relevant. In meantime, workaround was just to roll back to 1.3.8.
Thanks, and let me know if you need any further debugging info -
C.J.
ERROR 2022-09-25 19:36:35,783 [pypicloud.views] can't set attribute
Traceback (most recent call last):
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid/tweens.py", line 41, in excview_tween
response = handler(request)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid/router.py", line 143, in handle_request
response = _call_view(
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid/view.py", line 674, in _call_view
response = view_callable(context, request)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid/config/views.py", line 151, in __call__
return view(context, request)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid/config/views.py", line 170, in attr_view
return view(context, request)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid/config/views.py", line 196, in predicate_wrapper
return view(context, request)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid/viewderivers.py", line 427, in rendered_view
result = view(context, request)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pyramid_duh/params.py", line 306, in param_twiddler
return fxn(**scope)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pypicloud/views/simple.py", line 39, in upload
return request.db.upload(
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pypicloud/cache/base.py", line 151, in upload
self.storage.upload(new_pkg, data)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pypicloud/storage/gcs.py", line 207, in upload
with _open(
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/smart_open/smart_open_lib.py", line 224, in open
binary = _open_binary_stream(uri, binary_mode, transport_params)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/smart_open/smart_open_lib.py", line 400, in _open_binary_stream
fobj = submodule.open_uri(uri, mode, transport_params)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/smart_open/gcs.py", line 108, in open_uri
return open(parsed_uri['bucket_id'], parsed_uri['blob_id'], mode, **kwargs)
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/smart_open/gcs.py", line 149, in open
fileobj = Writer(
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/smart_open/gcs.py", line 425, in __init__
setattr(self._blob, k, v)
AttributeError: can't set attribute
hmmm interesting.
I did some light triage because this caught my eye.
BLUF: believe the issue is that pypicloud is trying to set properties that do not have setters for gcs using transport_method.
# from https://github.com/stevearc/pypicloud/blob/master/pypicloud/storage/gcs.py starting at line 207
with _open(
self.get_uri(package),
"wb",
compression="disable",
transport_params={
"client": self.bucket.client,
"blob_properties": {
"metadata": metadata,
"acl": self.object_acl,
"storage_class": self.storage_class,
},
},
) as fp:
for chunk in stream_file(datastream):
fp.write(chunk) # multipart upload
and after reviewing the following:
- https://github.com/RaRe-Technologies/smart_open/blob/develop/smart_open/smart_open_lib.py
- https://github.com/RaRe-Technologies/smart_open/blob/4c6bc38d6b35aa75e5e9b8041bb80f0f956ac486/smart_open/smart_open_lib.py#L366
- https://github.com/RaRe-Technologies/smart_open/blob/develop/smart_open/gcs.py
- https://github.com/RaRe-Technologies/smart_open/blob/4c6bc38d6b35aa75e5e9b8041bb80f0f956ac486/smart_open/gcs.py#L395
- https://cloud.google.com/python/docs/reference/storage/latest/google.cloud.storage.blob.Blob#google_cloud_storage_blob_Blob_storage_class
- https://docs.python.org/3/library/functions.html#setattr
- https://stackoverflow.com/questions/27396339/attributeerror-cant-set-attribute
Appears to be straight forward:
pypicloud is passing "acl", "metadata", and "storage_class" which is resulting in these (ultimately) being passed to setattr
if blob_properties:
for k, v in blob_properties.items():
setattr(self._blob, k, v)
Reviewing the docs for GCS at https://cloud.google.com/python/docs/reference/storage/latest/google.cloud.storage.blob.Blob#google_cloud_storage_blob_Blob_storage_class
What jumps out to me:
- acl appears to be a dynamically generated property, and not settable.
- Essentially, it appears that pypicloud is not using the SDK for gcs properly; there's appears to be no reason to expect setting the object/blob ACL this way should work.
- storage_class seems to be same situation as for acl; setting this attribute is not the correct usage of the SDK; pypicloud should directly set this value for the blob via the proper SDK functions rather than passing down to smart_open.
- metadata appears to actually support the usage that pypicloud implements currently, so is likely ok; recommend validating with additional testing.
I'm just a passer-by whose attention was caught, and don't have bandwidth or use-case (I don't use the gcs backend at this time) to implement the changes. From my very brief perusal of the SDK docs though, it seems that the needed updates would be relatively minor (and in fact I suspect the correct usage is what's in v1.3.8 pre-smart_open, but haven't checked)
ok, I was interested to see what 1.3.8 had, and yep it's 90% of the way there.
def upload(self, package, datastream):
"""Upload the package to GCS"""
metadata = {"name": package.name, "version": package.version}
metadata.update(package.get_metadata())
blob = self._get_gcs_blob(package)
blob.metadata = metadata
blob.upload_from_file(datastream, predefined_acl=self.object_acl)
if self.storage_class is not None:
blob.update_storage_class(self.storage_class)
so recommended changes:
- update smart_open.storage.gcs to update the storage class method using the approach from 1.3.8
- For setting the object acl, here's where we hit the wall for my familiarity with gcs (I really don't use it) and the end of my mid-week late night curiosity. That said, i'm confident that whatever the solution is, it won't be that hard to implement.
P.S. did some more looking at the acl, and appears like using argname "predefined_acl" instead of "acl" might be all that's needed?
Thanks for the cc, and sorry that this slipped through. I think this needs upstream PRs.
- storage class: smart_open can do a
setattr(bucket, "storage_class", storage_class)
before uploading (docs, this behaviour is local only ref)- this to avoid a
blob.update_storage_class
which will create a new blob and copy over the contents
- this to avoid a
- acl: smart_open uses
blob.create_resumable_upload_session
, which (as only function of the bunch!) hard-codespredefined_acl=None
.- reverting to
blob.upload_from_file(datastream, predefined_acl=self.object_acl)
won't work, because for that one to perform a multipart upload, thesize
kwarg needs to be not None (and we don't know the total size ofdatastream
at this point)
- reverting to
Hi ๐
Quick update: we're looking at a major version bump from smart_open from a PR opened 2 days after mine: piskvorky/smart_open#729
As it changes internals significantly, my original PR at gcs became unnecessary for this issue. Also, instead of updating our mocks to account for new smart_open internals, I switched to fake-gcs-server analogous to switching to azurite in #304.
And that is working like a charm (as we can now actually assert ACL being returned on an uploaded blob, instead of mocking which caused this issue in the first place): I've tested the fix here using unreleased smart_open
, CI is green :)
So now we wait for smart_open to release 7.0.0! ๐