ansible-collections/amazon.aws

`s3_object` fails to copy in AWS when source is larger than 5GiB

colin-nolan opened this issue · 1 comments

Summary

amazon.aws.s3_object fails to copy objects within AWS when they are larger than 5GiB. We encountered this when copying between buckets (mode: copy with copy_src set), but it likely affects all copy usage.

I'd guess a switch to a multi-part upload strategy is required for objects over 5GiB.
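To illustrate what a multi-part strategy would involve, here is a minimal sketch of how the byte ranges for a multi-part copy could be planned. It is not taken from the module: `plan_copy_parts` and its 1GiB default part size are hypothetical, and the sketch assumes the documented S3 limits of 5MiB to 5GiB per part and at most 10,000 parts. Each returned string has the `bytes=first-last` form that the S3 UploadPartCopy operation accepts as its copy source range.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# 5 GiB in bytes; this is the 5368709120 quoted in the CopyObject error.
MAX_SINGLE_COPY = 5 * GIB

MIN_PART_SIZE = 5 * MIB   # S3 minimum part size (the last part may be smaller)
MAX_PART_SIZE = 5 * GIB   # S3 maximum part size
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def plan_copy_parts(size, part_size=GIB):
    """Split an object of `size` bytes into inclusive byte ranges.

    Hypothetical helper: returns a list of 'bytes=first-last' strings,
    one per part, using zero-based inclusive offsets.
    """
    if size <= 0:
        raise ValueError("size must be a positive number of bytes")
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part_size must be between 5MiB and 5GiB")

    ranges = []
    start = 0
    while start < size:
        # The final part is allowed to be shorter than part_size.
        end = min(start + part_size, size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1

    if len(ranges) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return ranges


if __name__ == "__main__":
    # A 7GiB object, like the 7G.bin in the traceback, needs seven 1GiB parts.
    for byte_range in plan_copy_parts(7 * GIB):
        print(byte_range)
```

A real fix would presumably fall back to this path only when the source exceeds `MAX_SINGLE_COPY`, and keep the single CopyObject call for smaller objects.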

Issue Type

Bug Report

Component Name

s3_object

Ansible Version

$ ansible --version
ansible [core 2.16.7]
  config file = <redacted>/ansible/ansible.cfg
  configured module search path = ['<redacted>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = <redacted>/ansible/.venv/lib/python3.12/site-packages/ansible
  ansible collection location = <redacted>/.ansible/collections:/usr/share/ansible/collections
  executable location = <redacted>/ansible/.venv/bin/ansible
  python version = 3.12.3 (main, Apr  9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] (<redacted>/ansible/.venv/bin/python)
  jinja version = 3.1.4
  libyaml = True

Collection Versions

$ ansible-galaxy collection list
# <redacted>/.ansible/collections/ansible_collections
Collection                               Version
---------------------------------------- -------
amazon.aws                               8.0.0  

# <redacted>/ansible/.venv/lib/python3.12/site-packages/ansible_collections
Collection                               Version
---------------------------------------- -------
amazon.aws                               7.6.0  
ansible.netcommon                        5.3.0  
ansible.posix                            1.5.4  
ansible.utils                            2.12.0 
ansible.windows                          2.3.0  
arista.eos                               6.2.2  
awx.awx                                  23.9.0 
azure.azcollection                       1.19.0 
check_point.mgmt                         5.2.3  
chocolatey.chocolatey                    1.5.1  
cisco.aci                                2.9.0  
cisco.asa                                4.0.3  
cisco.dnac                               6.13.3 
cisco.intersight                         2.0.9  
cisco.ios                                5.3.0  
cisco.iosxr                              6.1.1  
cisco.ise                                2.9.1  
cisco.meraki                             2.18.1 
cisco.mso                                2.6.0  
cisco.nxos                               5.3.0  
cisco.ucs                                1.10.0 
cloud.common                             2.1.4  
cloudscale_ch.cloud                      2.3.1  
community.aws                            7.2.0  
community.azure                          2.0.0  
community.ciscosmb                       1.0.9  
community.crypto                         2.20.0 
community.digitalocean                   1.26.0 
community.dns                            2.9.1  
community.docker                         3.10.1 
community.general                        8.6.1  
community.grafana                        1.8.0  
community.hashi_vault                    6.2.0  
community.hrobot                         1.9.2  
community.library_inventory_filtering_v1 1.0.1  
community.libvirt                        1.3.0  
community.mongodb                        1.7.4  
community.mysql                          3.9.0  
community.network                        5.0.2  
community.okd                            2.3.0  
community.postgresql                     3.4.1  
community.proxysql                       1.5.1  
community.rabbitmq                       1.3.0  
community.routeros                       2.15.0 
community.sap                            2.0.0  
community.sap_libs                       1.4.2  
community.sops                           1.6.7  
community.vmware                         4.4.0  
community.windows                        2.2.0  
community.zabbix                         2.4.0  
containers.podman                        1.13.0 
cyberark.conjur                          1.2.2  
cyberark.pas                             1.0.25 
dellemc.enterprise_sonic                 2.4.0  
dellemc.openmanage                       8.7.0  
dellemc.powerflex                        2.4.0  
dellemc.unity                            1.7.1  
f5networks.f5_modules                    1.28.0 
fortinet.fortimanager                    2.5.0  
fortinet.fortios                         2.3.6  
frr.frr                                  2.0.2  
gluster.gluster                          1.0.2  
google.cloud                             1.3.0  
grafana.grafana                          2.2.5  
hetzner.hcloud                           2.5.0  
hpe.nimble                               1.1.4  
ibm.qradar                               2.1.0  
ibm.spectrum_virtualize                  2.0.0  
ibm.storage_virtualize                   2.3.1  
infinidat.infinibox                      1.4.5  
infoblox.nios_modules                    1.6.1  
inspur.ispim                             2.2.1  
inspur.sm                                2.3.0  
junipernetworks.junos                    5.3.1  
kaytus.ksmanage                          1.2.1  
kubernetes.core                          2.4.2  
lowlydba.sqlserver                       2.3.2  
microsoft.ad                             1.5.0  
netapp.aws                               21.7.1 
netapp.azure                             21.10.1
netapp.cloudmanager                      21.22.1
netapp.elementsw                         21.7.0 
netapp.ontap                             22.11.0
netapp.storagegrid                       21.12.0
netapp.um_info                           21.8.1 
netapp_eseries.santricity                1.4.0  
netbox.netbox                            3.18.0 
ngine_io.cloudstack                      2.3.0  
ngine_io.exoscale                        1.1.0  
openstack.cloud                          2.2.0  
openvswitch.openvswitch                  2.1.1  
ovirt.ovirt                              3.2.0  
purestorage.flasharray                   1.28.0 
purestorage.flashblade                   1.17.0 
purestorage.fusion                       1.6.1  
sensu.sensu_go                           1.14.0 
splunk.es                                2.1.2  
t_systems_mms.icinga_director            2.0.1  
telekom_mms.icinga_director              1.35.0 
theforeman.foreman                       3.15.0 
vmware.vmware_rest                       2.3.1  
vultr.cloud                              1.12.1 
vyos.vyos                                4.1.0  
wti.remote                               1.0.5  

AWS SDK versions

$ pip show boto boto3 botocore
WARNING: Package(s) not found: boto
Name: boto3
Version: 1.34.99
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: <redacted>/ansible/.venv/lib/python3.12/site-packages
Requires: botocore, jmespath, s3transfer
Required-by: 
---
Name: botocore
Version: 1.34.99
Summary: Low-level, data-driven core of boto 3.
Home-page: https://github.com/boto/botocore
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: <redacted>/ansible/.venv/lib/python3.12/site-packages
Requires: jmespath, python-dateutil, urllib3
Required-by: boto3, s3transfer

Configuration

$ ansible-config dump --only-changed
CONFIG_FILE() = <redacted>/ansible/ansible.cfg
DEFAULT_INVENTORY_PLUGIN_PATH(<redacted>/ansible/ansible.cfg) = ['<redacted>/ansible/plugins/inventory']
DUPLICATE_YAML_DICT_KEY(<redacted>/ansible/ansible.cfg) = ignore
INVENTORY_IGNORE_EXTS(<redacted>/ansible/ansible.cfg) = ["{{(REJECT_EXTS + ('.orig'", '.cfg', "'.retry'))}}"]
INVENTORY_UNPARSED_IS_FAILED(<redacted>/ansible/ansible.cfg) = True

OS / Environment

N/A

Steps to Reproduce

- amazon.aws.s3_object:
    bucket: bucket-wanting-big-file
    mode: copy
    copy_src:
      bucket: bucket-with-big-file

Expected Results

Objects over 5GiB should be copied to the destination bucket in an idempotent manner.

Actual Results

Task failure, resulting in the traceback:

The full traceback is:
Traceback (most recent call last):
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1320, in copy_object_to_bucket
    s3.copy_object(aws_retry=True, **params)
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/retries.py", line 105, in deciding_wrapper
    return retrying_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/cloud.py", line 119, in _retry_wrapper
    return _retry_func(
           ^^^^^^^^^^^^
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/cloud.py", line 68, in _retry_func
    return func()
           ^^^^^^
  File "<redacted>/ansible/.venv/lib/python3.12/site-packages/botocore/client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<redacted>/ansible/.venv/lib/python3.12/site-packages/botocore/client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1579, in main
    func(module, s3, s3_v4, s3_object_params)
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1354, in s3_object_do_copy
    updated, result = copy_object_to_bucket(
                      ^^^^^^^^^^^^^^^^^^^^^^
  File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1331, in copy_object_to_bucket
    raise S3ObjectFailure(
S3ObjectFailure: Failed while copying object 7G.bin from bucket None.
fatal: [staging]: FAILED! => {
    "boto3_version": "1.34.99",
    "botocore_version": "1.34.99",
    "changed": false,
    "error": {
        "code": "InvalidRequest",
        "message": "The specified copy source is larger than the maximum allowable size for a copy source: 5368709120"
    },
    "invocation": {
        "module_args": {
            "access_key": "<redacted>",
            "aws_ca_bundle": null,
            "aws_config": null,
            "bucket": "<redacted>",
            "ceph": false,
            "content": null,
            "content_base64": null,
            "copy_src": {
                "bucket": "<redacted>",
                "object": null,
                "prefix": "",
                "version_id": null
            },
            "debug_botocore_endpoint_logs": false,
            "dest": null,
            "dualstack": false,
            "encrypt": true,
            "encryption_kms_key_id": null,
            "encryption_mode": "AES256",
            "endpoint_url": null,
            "expiry": 600,
            "headers": null,
            "ignore_nonexistent_bucket": false,
            "marker": "",
            "max_keys": 1000,
            "metadata": null,
            "mode": "copy",
            "object": null,
            "overwrite": "different",
            "permission": [],
            "prefix": "",
            "profile": null,
            "purge_tags": true,
            "region": null,
            "retries": 0,
            "secret_key": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "session_token": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "sig_v4": true,
            "src": null,
            "tags": null,
            "validate_bucket_name": true,
            "validate_certs": true,
            "version": null
        }
    },
    "msg": "Failed while copying object 7G.bin from bucket None.: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120",
    "response_metadata": {
        "host_id": "<redacted>",
        "http_headers": {
            "connection": "close",
            "content-type": "application/xml",
            "date": "Wed, 29 May 2024 12:59:34 GMT",
            "server": "AmazonS3",
            "transfer-encoding": "chunked",
            "x-amz-id-2": "<redacted>",
            "x-amz-request-id": "<redacted>"
        },
        "http_status_code": 400,
        "request_id": "<redacted>",
        "retry_attempts": 0
    }
}

Code of Conduct

  • I agree to follow the Ansible Code of Conduct

@alinabuzachis many thanks for picking this one up. I understand that you have a lot to do, but could you give an indication of whether, and roughly when, this might sit on your roadmap?

Ideally, we would contribute a fix ourselves, but we're not currently in a good position to do so. I'm trying to determine whether we should invest in a temporary workaround, or just put up with manually syncing some of our larger data until a fix is in place.

Thanks again.
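
For reference, one possible shape for such a temporary workaround is sketched below. It is an untested assumption, not the module's behaviour: it calls boto3's managed `copy` method on the S3 client directly, which, as far as I understand, switches to a multi-part copy for large sources. The helper names `build_copy_source` and `copy_large_object` are hypothetical, and the sketch copies a single key rather than a whole bucket.

```python
def build_copy_source(bucket, key, version_id=None):
    """Build the CopySource dict expected by the boto3 managed copy.

    Hypothetical helper; VersionId is only included when provided.
    """
    source = {"Bucket": bucket, "Key": key}
    if version_id is not None:
        source["VersionId"] = version_id
    return source


def copy_large_object(src_bucket, key, dest_bucket, dest_key=None,
                      version_id=None, extra_args=None):
    """Copy one object between buckets using boto3's managed transfer.

    Assumption: credentials and region come from the usual boto3
    configuration chain (environment, profile, instance role).
    """
    # Imported lazily so the helper above can be used without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    s3.copy(
        CopySource=build_copy_source(src_bucket, key, version_id),
        Bucket=dest_bucket,
        Key=dest_key or key,
        ExtraArgs=extra_args,
    )


if __name__ == "__main__":
    # Bucket names are from the reproduction steps; 7G.bin is the key in the traceback.
    copy_large_object(
        "bucket-with-big-file",
        "7G.bin",
        "bucket-wanting-big-file",
        extra_args={"ServerSideEncryption": "AES256"},
    )
```

This does not give the idempotency that `overwrite: different` provides in the module, so it would only be a stopgap for the largest objects.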