Unable to load serialized object persisted on Azure Blob storage
demihuman2020 opened this issue · 0 comments
demihuman2020 commented
We serialize the `pipeline` object and store it on Azure Blob Storage:
```python
import pickle

import dill
from azure.storage.blob import BlobServiceClient, PublicAccess

with open("feg_pipeline.pkl", "wb") as f:
    dill.dump(pipeline, f, protocol=pickle.HIGHEST_PROTOCOL)

blob_service_client = BlobServiceClient.from_connection_string(conn_str='XXX')
container_name = 'YYY'
blob_service_client.create_container(container_name, public_access=PublicAccess.Container)
blob_client = blob_service_client.get_blob_client(
    container=container_name, blob='feg_pipeline.pkl')
blob_client.upload_blob('feg_pipeline.pkl')
```
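Note that in the last line above, `upload_blob('feg_pipeline.pkl')` passes the filename *string* as the blob's data, so the stored blob contains the literal text `feg_pipeline.pkl` rather than the pickled bytes; that would match the `invalid load key, 'f'` error reported below, since the blob then begins with the byte `f`. A minimal sketch of uploading the pickled bytes instead (the `pipeline` dict here is a stand-in for the real object, the Azure call is commented out, and stdlib `pickle` is used where `dill` would work the same way):

```python
import pickle

pipeline = {"step": "stand-in for the real pipeline object"}

# Serialize to a local file, as in the snippet above.
with open("feg_pipeline.pkl", "wb") as f:
    pickle.dump(pipeline, f, protocol=pickle.HIGHEST_PROTOCOL)

# Re-open the file and hand upload_blob the *bytes* (or the open file
# handle itself), never the filename string:
with open("feg_pipeline.pkl", "rb") as f:
    payload = f.read()
    # blob_client.upload_blob(payload, overwrite=True)
```

`upload_blob` accepts bytes or a readable stream; if you pass a `str`, the SDK stores that text verbatim as the blob's contents.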
Later, in a different function, we read it back from Blob Storage as follows:
```python
import dill
from azure.storage.blob import BlobClient

blob = BlobClient(account_url='https://fhghjhjh',
                  container_name='YYY',
                  blob_name='feg_pipeline.pkl',
                  credential='XXX')

feg_from_blob = None
with open("feg_pipeline.pkl", "wb") as f:
    data = blob.download_blob()
    data.readinto(f)

with open("feg_pipeline.pkl", "rb") as f:
    feg_from_blob = dill.load(f)
```
This raises:

```
UnpicklingError: invalid load key, 'f'.
```
We have tried:

- dill, joblib, cPickle, cloudpickle, and pickle for serializing and deserializing; all of them raise the same load-key error when loading the object from the file downloaded from Blob Storage.
- Base64-encoding while serializing and decoding while deserializing; this raises a padding error while loading.
What is the best way to persist and reuse such objects in Azure?
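For reference, here is a sketch of a round trip that serializes in memory and skips the temporary file entirely. The helper names (`save_object`, `load_object`) are illustrative; `upload_blob`, `download_blob`, and `readall` are the azure-storage-blob v12 `BlobClient` methods, and `dill.dumps`/`dill.loads` can be substituted for the `pickle` calls if the pipeline is not plain-picklable:

```python
import pickle

def save_object(obj, blob_client):
    """Serialize obj in memory and upload the raw bytes."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    blob_client.upload_blob(payload, overwrite=True)

def load_object(blob_client):
    """Download the blob's bytes and deserialize them directly."""
    payload = blob_client.download_blob().readall()
    return pickle.loads(payload)
```

Because both directions work on bytes, there is no encoding step (and hence no Base64 padding issue), and nothing is written to local disk between upload and download.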