Azure/azure-storage-python

list_blob_names is slow on my storage, returns markers

sordonia opened this issue · 5 comments

Which service(blob, file, queue) does this issue concern?

blob

Which version of the SDK was used? Please provide the output of pip freeze.

azure-storage-blob==2.1.0
azure-storage-common==2.1.0
azure-storage-file==2.1.0

What problem was encountered?

list_blob_names is extremely slow for one particular container, because it seems to be listing "markers". I have never seen that before. I just call list_blob_names without num_results.

The response object is an empty list and a marker:

x-ms-date:Thu, 28 May 2020 15:43:56 GMT
x-ms-version:2019-02-02
/alsordon/phillytools
comp:list
delimiter:/
marker:2!472!MDAwMzA5IXByb2plY3RzL2luZm9uY2VzL2MxMF9yZXMxOC9wdC1yZXN1bHRzL2FwcGxpY2F0aW9uXzE1ODMzMDcxNTM4NjhfMzIzNjMvU3VwQ29uL2NpZmFyMTBfdGVuc29yYm9hcmQvSW5mb05DRXNfY2lmYXIxMF9yZXNuZXQxOF9scl8wLjVfZGVjYXlfMC4wMDAxX2Jzel8zMDcyX3RlbXBfMC41X2wxMl8xLjAwX2wxY3V0Ml8xLjAwX2wxMkkxY3V0XzEuMDBfbnNfNTEyX3Zhcl8wLjEwMF90cmlhbF8wX2Nvc2luZV93YXJtL2V2ZW50cy5vdXQudGZldmVudHMuMTU5MDM0ODMxOS5jb250YWluZXItZTc3OC0xNTgzMzA3MTUzODY4LTMyMzYzLTAxLTAwMDAwMiEwMDAwMjghMjAyMC0wNS0yNVQwMzo0Mzo0OS41NjEyMDg5WiE-
prefix:projects/infonces/
restype:container
2020-05-28 11:44:01 AM - DD https://alsordon.blob.core.windows.net:443 "GET /phillytools?restype=container&comp=list&prefix=projects%2Finfonces%2F&delimiter=%2F&marker=2%21472%21MDAwMzA5IXByb2plY3RzL2luZm9uY2VzL2MxMF9yZXMxOC9wdC1yZXN1bHRzL2FwcGxpY2F0aW9uXzE1ODMzMDcxNTM4NjhfMzIzNjMvU3VwQ29uL2NpZmFyMTBfdGVuc29yYm9hcmQvSW5mb05DRXNfY2lmYXIxMF9yZXNuZXQxOF9scl8wLjVfZGVjYXlfMC4wMDAxX2Jzel8zMDcyX3RlbXBfMC41X2wxMl8xLjAwX2wxY3V0Ml8xLjAwX2wxMkkxY3V0XzEuMDBfbnNfNTEyX3Zhcl8wLjEwMF90cmlhbF8wX2Nvc2luZV93YXJtL2V2ZW50cy5vdXQudGZldmVudHMuMTU5MDM0ODMxOS5jb250YWluZXItZTc3OC0xNTgzMzA3MTUzODY4LTMyMzYzLTAxLTAwMDAwMiEwMDAwMjghMjAyMC0wNS0yNVQwMzo0Mzo0OS41NjEyMDg5WiE- HTTP/1.1" 200 None
2020-05-28 11:44:01 AM - DD String_to_sign=GET

x-ms-client-request-id:0da62ca2-a0fa-11ea-a479-554857c9fd70
x-ms-date:Thu, 28 May 2020 15:44:01 GMT
x-ms-version:2019-02-02
/alsordon/phillytools
comp:list
delimiter:/
marker:2!400!MDAwMjU1IXByb2plY3RzL2luZm9uY2VzL2MxMF9yZXMxOC9wdC1yZXN1bHRzL2FwcGxpY2F0aW9uXzE1ODMzMDcxNTM4NjhfMzIzNjYvU3VwQ29uL2NpZmFyMTBfdGVuc29yYm9hcmQvU2ltQ0xSX2NpZmFyMTBfcmVzbmV0MThfbHJfMC41X2RlY2F5XzAuMDAwMV9ic3pfMzA3Ml90ZW1wXzAuNV90cmlhbF8wX2Nvc2luZV93YXJtL2V2ZW50cy5vdXQudGZldmVudHMuMTU5MDM0ODQyMC5jb250YWluZXItZTc3OC0xNTgzMzA3MTUzODY4LTMyMzY2LTAxLTAwMDAwMiEwMDAwMjghMjAyMC0wNS0yNFQyMjowNjo1NS40NzUzMDk4WiE-
prefix:projects/infonces/
restype:container
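
For readers following the log above, here is a minimal sketch that decodes the query parameters of such a request using only the Python standard library. The path is a shortened copy of the logged one; the marker value is a hypothetical placeholder (the real token is long and opaque), and the helper name listing_params is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the logged request path. The marker value is a
# hypothetical placeholder, not the real continuation token.
logged_path = (
    "/phillytools?restype=container&comp=list"
    "&prefix=projects%2Finfonces%2F&delimiter=%2F"
    "&marker=2%21472%21EXAMPLE"
)


def listing_params(path):
    """Return the percent-decoded query parameters of a request path."""
    query = urlsplit(path).query
    # parse_qs returns a list per key; each key appears once here.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = listing_params(logged_path)
print(params["prefix"])     # projects/infonces/
print(params["delimiter"])  # /
print(params["marker"])     # 2!472!EXAMPLE
```

The decoded values match the prefix, delimiter, and marker lines shown in the string-to-sign dump: every logged request asks for the next page under projects/infonces/ with / as the delimiter.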

Have you found a mitigation/solution?

No

Note: for table service, please post the issue here instead: https://github.com/Azure/azure-cosmosdb-python.

Hi @sordonia

Thanks for reaching out!
Are you doing something like this next(self.bs.list_blobs(container_name))?
Just to clarify: the "log" you pasted shows requests. If there is a marker in a request, that means an earlier response returned that marker, and that response should also have returned the list of blobs.
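
To make the marker mechanism concrete, here is a minimal, self-contained sketch of marker-based paging. It does not call the SDK: fetch_page, FAKE_PAGES, and fake_fetch are hypothetical stand-ins for one list request per call. It assumes, as the issue reports, that a page can come back with no items but still carry a marker.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# One page of results: (blob names, marker for the next page or None).
Page = Tuple[List[str], Optional[str]]


def iterate_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every name, following markers until none is returned."""
    marker = None
    while True:
        items, marker = fetch_page(marker)
        yield from items          # may be empty for some pages
        if not marker:            # no marker means the listing is complete
            break


# Hypothetical canned responses: two of the four pages are empty.
FAKE_PAGES = {
    None: (["a.txt", "b.txt"], "m1"),
    "m1": ([], "m2"),
    "m2": ([], "m3"),
    "m3": (["c.txt"], None),
}
requests_made = []


def fake_fetch(marker: Optional[str]) -> Page:
    """Stand-in for one list request; records the marker it was sent."""
    requests_made.append(marker)
    return FAKE_PAGES[marker]


names = list(iterate_all(fake_fetch))
print(names)               # ['a.txt', 'b.txt', 'c.txt']
print(len(requests_made))  # 4
```

In this sketch three names cost four round trips. If the real service returns many empty pages in a row, a simple for loop over the listing would appear to hang while those requests are made one after another.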

BTW, you said "never seen that before". Does that mean this was working and then suddenly stopped working? If that's the case, we will look into the service logs and see what happened!

Thanks!

Hi!

I am looping over the return value of the list_blobs call:

gen = self.bs.list_blobs(..)
for x in gen:
    ...

Yes, suddenly markers started to show up. The first few blobs list fine, then the markers slow the listing down terribly.

Thanks!

Hi @sordonia
Could you clarify what you mean by "markers slow down the listing terribly"? Do you mean you have to wait for the marker before the next request can continue?

Hi @xiafu-msft , @lmazuel

I am getting the same issue.
Listing blobs is very slow:

self.blob_service = BlockBlobService(account_name=self.account_name, account_key=self.account_key)
for i in self.blob_service.list_blobs(container_name):
    print(i)

It gets stuck as follows:

[image]

Azure package versions:
azure-common==1.1.25
azure-nspkg==3.0.2
azure-storage==0.36.0

Thanks

Hi @erlisb
azure-storage has been deprecated. Could you try azure-storage-blob<=2.1.0 or azure-storage-blob>=12.0.0 instead? (There are breaking changes if you use >=12.0.0.)
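
If you do move to azure-storage-blob>=12.0.0, a rough sketch like the one below could help check how many blobs each page of a listing returns. The helper name blobs_per_page is made up for illustration. It only assumes the object passed in behaves like ContainerClient, that is, list_blobs(name_starts_with=...) returns a paged result with by_page(). The connection string and container name in the comment are placeholders.

```python
def blobs_per_page(container_client, prefix=None):
    """Return the number of blobs in each page of a flat listing.

    container_client is assumed to behave like
    azure.storage.blob.ContainerClient from azure-storage-blob>=12.
    """
    pages = container_client.list_blobs(name_starts_with=prefix).by_page()
    # Each page is itself an iterator over blob entries.
    return [sum(1 for _ in page) for page in pages]


# With the SDK installed, a client could be created like this
# (both arguments are placeholders):
#
#   from azure.storage.blob import ContainerClient
#   client = ContainerClient.from_connection_string(
#       "<connection-string>", "<container-name>")
#   print(blobs_per_page(client, prefix="projects/infonces/"))
```

A long run of zeros in the result would suggest that most of the time goes into round trips for empty pages rather than into the client code.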

@zezha-msft do you have any idea about the list_blobs behavior in azure-storage==0.36...