Azure/azure-storage-python

Uploading a large file (>256 MB) using an HTTP client from Python, without using the Azure SDK

akshayDhotre opened this issue · 5 comments

Which service(blob, file, queue) does this issue concern?

Azure Blob Storage

Which version of the SDK was used? Please provide the output of pip freeze.

NA

What problem was encountered?

Unable to upload a large file (>256 MB) to Azure Blob Storage with an HTTP client against an Azure Blob SAS URL.

Hi @akshayDhotre

Thanks for reaching out. It looks like this is not the correct repo for your question.
Could you explain a bit more about "on Azure Blob SAS URL" so we can see whether there's anything we can help with?

@xiafu-msft To elaborate on @akshayDhotre 's question -

  • We have an Azure Blob SAS URL pointing to a customer's zip file (to reproduce, you can upload a zip file to Azure Blob Storage and create a SAS URL for it).
  • Our business logic needs to upload/replace an expected zipped file (whose size can be >256 MB, >400 MB, etc.) at the URL above. (Again, to reproduce on your end, zip a folder with any number of files so that the archive is
    ~500 MB, and try to upload/replace it at the above URL.) As mentioned by @akshayDhotre in point 1, this is a customer-provided SAS URL; we have no control over it or its credentials, and cannot use the Azure SDK for Python to upload (?)
  • While doing this from Python, we are using the code below:
    import requests

    headers = {
        'x-ms-blob-type': 'BlockBlob'
    }

    logging_message = None
    try:
        # Stream the file instead of reading it fully into memory
        with open(<local_zip_file_path>, mode='rb') as data:
            response = requests.put(<customer-provided-destination-path-url>,
                                    headers=headers, data=data)
        response.raise_for_status()
    except requests.exceptions.HTTPError as err:
        logging_message = f'Http error while uploading output zip: {str(err)}'
    except requests.exceptions.ConnectionError as err:
        logging_message = f'Error connecting while uploading output zip: {str(err)}'
    except requests.exceptions.Timeout as err:
        logging_message = f'Timeout error while uploading output zip: {str(err)}'
    except requests.exceptions.RequestException as err:
        logging_message = f'An error occurred while uploading output zip: {str(err)}'
  • During execution, we get a '413 Client Error: The request body is too large and exceeds the maximum permissible limit.' response.
    Hope this helps.

Please let us know if you need any more information
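One way to work around the 413 error without the Azure SDK is to use Azure Blob Storage's own REST operations, Put Block and Put Block List, directly with `requests` against the SAS URL (the SAS URL already carries the credentials in its query string, so appending `&comp=block&blockid=...` is enough). Below is a minimal sketch of that approach; the function names, the `sas_url`/`local_path` parameters, and the 4 MiB `CHUNK_SIZE` are illustrative assumptions, not code from this thread.

```python
import base64
from urllib.parse import quote

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per block (illustrative; well under the service's per-block limit)

def make_block_id(index):
    # Block IDs must be Base64-encoded and the same length for every block in the blob.
    return base64.b64encode(f"block-{index:08d}".encode()).decode()

def build_block_list_xml(block_ids):
    # The Put Block List body lists each staged block in commit order.
    inner = "".join(f"<Latest>{bid}</Latest>" for bid in block_ids)
    return '<?xml version="1.0" encoding="utf-8"?>' + f"<BlockList>{inner}</BlockList>"

def upload_in_blocks(sas_url, local_path):
    # Hypothetical helper: sas_url stands in for the customer-provided SAS URL.
    import requests  # imported here so the helpers above stay dependency-free
    block_ids = []
    with open(local_path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            block_id = make_block_id(index)
            # Put Block: stage one chunk (Base64 block IDs must be URL-encoded).
            put_block_url = f"{sas_url}&comp=block&blockid={quote(block_id, safe='')}"
            requests.put(put_block_url, data=chunk).raise_for_status()
            block_ids.append(block_id)
            index += 1
    # Put Block List: commit the staged blocks as the final blob.
    requests.put(f"{sas_url}&comp=blocklist",
                 data=build_block_list_xml(block_ids)).raise_for_status()
```

This only works when the destination is Azure Blob Storage, so a fully storage-agnostic service would need to detect that case, but it avoids the single-request size limit entirely.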

I think the question title states that we need to do this without using the Azure SDK. Currently it is possible using the Azure Blob SDK or by calling the Azure Blob Storage REST API, but the case here is different.

Consider that you are building an independent service, and the service does not know whether it is going to write to Azure Blob Storage, Amazon, Google, or any HTTP path on a file server.

The service is responsible for creating a zip and uploading it to the destination path provided by the user. The destination path can be any valid URI with write permission.

To keep the solution generic, we have used an HTTP client to PUT the file. The 256 MB limit is blocking larger files.
I see only two ways we can solve this problem:

  1. Increase the 256 MB size limit, or make it configurable
  2. Partial (chunked) upload using the HTTP client

Point 2, partial upload, does not seem to be possible with plain HTTP; the Stack Overflow link below discusses the same:
https://stackoverflow.com/questions/20969331/standard-method-for-http-partial-upload-resume-upload

Hi @deeagar

If you are using service version 2019-12-12, the maximum Put Blob size is 5 GB. Can you try that and let us know if it works?
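The suggestion above amounts to sending an `x-ms-version: 2019-12-12` header alongside the existing `x-ms-blob-type` header, which raises the single-request Put Blob limit well beyond 256 MB. A minimal sketch, assuming a customer-provided `sas_url` (the function name and parameters are illustrative, not from this thread):

```python
def build_headers():
    # The x-ms-version header opts the request into the newer service
    # version, whose single Put Blob size limit is 5 GB instead of 256 MB.
    return {
        "x-ms-blob-type": "BlockBlob",
        "x-ms-version": "2019-12-12",
    }

def upload_large_blob(sas_url, local_path):
    # Hypothetical helper: sas_url stands in for the customer-provided SAS URL.
    import requests  # same HTTP client as the snippet earlier in the thread
    with open(local_path, "rb") as f:
        # Passing the file object streams the body instead of loading it into memory.
        response = requests.put(sas_url, headers=build_headers(), data=f)
    response.raise_for_status()
    return response.status_code
```

Note this still uploads the whole blob in one request, so a flaky connection means restarting from zero; for very large files the block-based approach remains more robust.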