`s3_object` fails to copy in AWS when source is larger than 5GiB
colin-nolan opened this issue · 1 comment
Summary
amazon.aws.s3_object
fails to copy objects within AWS when they are larger than 5GiB. The use-case where we encountered this was copying between buckets (mode: copy
with copy_src
set) - but it likely affects all copy usage.
I'd guess a switch to a multipart copy strategy is required for objects over 5GiB, since a single CopyObject call is capped at 5GiB.
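For reference, a minimal sketch of what such a fix might look like (this is not the module's current code; it assumes the module could use boto3's managed copy() helper instead of a bare copy_object call, and the bucket/key names are illustrative):

# Illustrative only: boto3's injected copy() helper (backed by s3transfer)
# switches to a multipart UploadPartCopy automatically once the source
# exceeds the configured threshold, so it is not limited to 5GiB.
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Threshold/chunk sizes are illustrative; anything at or below 5GiB works.
config = TransferConfig(
    multipart_threshold=5 * 1024**3,
    multipart_chunksize=512 * 1024**2,
)

s3.copy(
    CopySource={"Bucket": "bucket-with-big-file", "Key": "7G.bin"},
    Bucket="bucket-wanting-big-file",
    Key="7G.bin",
    Config=config,
)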
Issue Type
Bug Report
Component Name
s3_object
Ansible Version
$ ansible --version
ansible [core 2.16.7]
config file = <redacted>/ansible/ansible.cfg
configured module search path = ['<redacted>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = <redacted>/ansible/.venv/lib/python3.12/site-packages/ansible
ansible collection location = <redacted>/.ansible/collections:/usr/share/ansible/collections
executable location = <redacted>/ansible/.venv/bin/ansible
python version = 3.12.3 (main, Apr 9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] (<redacted>/ansible/.venv/bin/python)
jinja version = 3.1.4
libyaml = True
Collection Versions
$ ansible-galaxy collection list
# <redacted>/.ansible/collections/ansible_collections
Collection Version
---------------------------------------- -------
amazon.aws 8.0.0
# <redacted>/ansible/.venv/lib/python3.12/site-packages/ansible_collections
Collection Version
---------------------------------------- -------
amazon.aws 7.6.0
ansible.netcommon 5.3.0
ansible.posix 1.5.4
ansible.utils 2.12.0
ansible.windows 2.3.0
arista.eos 6.2.2
awx.awx 23.9.0
azure.azcollection 1.19.0
check_point.mgmt 5.2.3
chocolatey.chocolatey 1.5.1
cisco.aci 2.9.0
cisco.asa 4.0.3
cisco.dnac 6.13.3
cisco.intersight 2.0.9
cisco.ios 5.3.0
cisco.iosxr 6.1.1
cisco.ise 2.9.1
cisco.meraki 2.18.1
cisco.mso 2.6.0
cisco.nxos 5.3.0
cisco.ucs 1.10.0
cloud.common 2.1.4
cloudscale_ch.cloud 2.3.1
community.aws 7.2.0
community.azure 2.0.0
community.ciscosmb 1.0.9
community.crypto 2.20.0
community.digitalocean 1.26.0
community.dns 2.9.1
community.docker 3.10.1
community.general 8.6.1
community.grafana 1.8.0
community.hashi_vault 6.2.0
community.hrobot 1.9.2
community.library_inventory_filtering_v1 1.0.1
community.libvirt 1.3.0
community.mongodb 1.7.4
community.mysql 3.9.0
community.network 5.0.2
community.okd 2.3.0
community.postgresql 3.4.1
community.proxysql 1.5.1
community.rabbitmq 1.3.0
community.routeros 2.15.0
community.sap 2.0.0
community.sap_libs 1.4.2
community.sops 1.6.7
community.vmware 4.4.0
community.windows 2.2.0
community.zabbix 2.4.0
containers.podman 1.13.0
cyberark.conjur 1.2.2
cyberark.pas 1.0.25
dellemc.enterprise_sonic 2.4.0
dellemc.openmanage 8.7.0
dellemc.powerflex 2.4.0
dellemc.unity 1.7.1
f5networks.f5_modules 1.28.0
fortinet.fortimanager 2.5.0
fortinet.fortios 2.3.6
frr.frr 2.0.2
gluster.gluster 1.0.2
google.cloud 1.3.0
grafana.grafana 2.2.5
hetzner.hcloud 2.5.0
hpe.nimble 1.1.4
ibm.qradar 2.1.0
ibm.spectrum_virtualize 2.0.0
ibm.storage_virtualize 2.3.1
infinidat.infinibox 1.4.5
infoblox.nios_modules 1.6.1
inspur.ispim 2.2.1
inspur.sm 2.3.0
junipernetworks.junos 5.3.1
kaytus.ksmanage 1.2.1
kubernetes.core 2.4.2
lowlydba.sqlserver 2.3.2
microsoft.ad 1.5.0
netapp.aws 21.7.1
netapp.azure 21.10.1
netapp.cloudmanager 21.22.1
netapp.elementsw 21.7.0
netapp.ontap 22.11.0
netapp.storagegrid 21.12.0
netapp.um_info 21.8.1
netapp_eseries.santricity 1.4.0
netbox.netbox 3.18.0
ngine_io.cloudstack 2.3.0
ngine_io.exoscale 1.1.0
openstack.cloud 2.2.0
openvswitch.openvswitch 2.1.1
ovirt.ovirt 3.2.0
purestorage.flasharray 1.28.0
purestorage.flashblade 1.17.0
purestorage.fusion 1.6.1
sensu.sensu_go 1.14.0
splunk.es 2.1.2
t_systems_mms.icinga_director 2.0.1
telekom_mms.icinga_director 1.35.0
theforeman.foreman 3.15.0
vmware.vmware_rest 2.3.1
vultr.cloud 1.12.1
vyos.vyos 4.1.0
wti.remote 1.0.5
AWS SDK versions
$ pip show boto boto3 botocore
WARNING: Package(s) not found: boto
Name: boto3
Version: 1.34.99
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email:
License: Apache License 2.0
Location: <redacted>/ansible/.venv/lib/python3.12/site-packages
Requires: botocore, jmespath, s3transfer
Required-by:
---
Name: botocore
Version: 1.34.99
Summary: Low-level, data-driven core of boto 3.
Home-page: https://github.com/boto/botocore
Author: Amazon Web Services
Author-email:
License: Apache License 2.0
Location: <redacted>/ansible/.venv/lib/python3.12/site-packages
Requires: jmespath, python-dateutil, urllib3
Required-by: boto3, s3transfer
Configuration
$ ansible-config dump --only-changed
CONFIG_FILE() = <redacted>/ansible/ansible.cfg
DEFAULT_INVENTORY_PLUGIN_PATH(<redacted>/ansible/ansible.cfg) = ['<redacted>/ansible/plugins/inventory']
DUPLICATE_YAML_DICT_KEY(<redacted>/ansible/ansible.cfg) = ignore
INVENTORY_IGNORE_EXTS(<redacted>/ansible/ansible.cfg) = ["{{(REJECT_EXTS + ('.orig'", '.cfg', "'.retry'))}}"]
INVENTORY_UNPARSED_IS_FAILED(<redacted>/ansible/ansible.cfg) = True
OS / Environment
N/A
Steps to Reproduce
- amazon.aws.s3_object:
    bucket: bucket-wanting-big-file
    mode: copy
    copy_src:
      bucket: bucket-with-big-file
Expected Results
Expected the module to copy objects over 5GiB to the destination bucket in an idempotent manner.
Actual Results
The task fails with the following traceback:
The full traceback is:
Traceback (most recent call last):
File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1320, in copy_object_to_bucket
s3.copy_object(aws_retry=True, **params)
File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/retries.py", line 105, in deciding_wrapper
return retrying_wrapper(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/cloud.py", line 119, in _retry_wrapper
return _retry_func(
^^^^^^^^^^^^
File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/module_utils/cloud.py", line 68, in _retry_func
return func()
^^^^^^
File "<redacted>/ansible/.venv/lib/python3.12/site-packages/botocore/client.py", line 565, in _api_call
return self._make_api_call(operation_name, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<redacted>/ansible/.venv/lib/python3.12/site-packages/botocore/client.py", line 1021, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1579, in main
func(module, s3, s3_v4, s3_object_params)
File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1354, in s3_object_do_copy
updated, result = copy_object_to_bucket(
^^^^^^^^^^^^^^^^^^^^^^
File "/var/folders/hh/s685n1156js1ll921mgz3b8r0000gn/T/ansible_amazon.aws.s3_object_payload_p8li_bbd/ansible_amazon.aws.s3_object_payload.zip/ansible_collections/amazon/aws/plugins/modules/s3_object.py", line 1331, in copy_object_to_bucket
raise S3ObjectFailure(
S3ObjectFailure: Failed while copying object 7G.bin from bucket None.
fatal: [staging]: FAILED! => {
"boto3_version": "1.34.99",
"botocore_version": "1.34.99",
"changed": false,
"error": {
"code": "InvalidRequest",
"message": "The specified copy source is larger than the maximum allowable size for a copy source: 5368709120"
},
"invocation": {
"module_args": {
"access_key": "<redacted>",
"aws_ca_bundle": null,
"aws_config": null,
"bucket": "<redacted>",
"ceph": false,
"content": null,
"content_base64": null,
"copy_src": {
"bucket": "<redacted>",
"object": null,
"prefix": "",
"version_id": null
},
"debug_botocore_endpoint_logs": false,
"dest": null,
"dualstack": false,
"encrypt": true,
"encryption_kms_key_id": null,
"encryption_mode": "AES256",
"endpoint_url": null,
"expiry": 600,
"headers": null,
"ignore_nonexistent_bucket": false,
"marker": "",
"max_keys": 1000,
"metadata": null,
"mode": "copy",
"object": null,
"overwrite": "different",
"permission": [],
"prefix": "",
"profile": null,
"purge_tags": true,
"region": null,
"retries": 0,
"secret_key": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
"session_token": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
"sig_v4": true,
"src": null,
"tags": null,
"validate_bucket_name": true,
"validate_certs": true,
"version": null
}
},
"msg": "Failed while copying object 7G.bin from bucket None.: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120",
"response_metadata": {
"host_id": "<redacted>",
"http_headers": {
"connection": "close",
"content-type": "application/xml",
"date": "Wed, 29 May 2024 12:59:34 GMT",
"server": "AmazonS3",
"transfer-encoding": "chunked",
"x-amz-id-2": "<redacted>",
"x-amz-request-id": "<redacted>"
},
"http_status_code": 400,
"request_id": "<redacted>",
"retry_attempts": 0
}
}
Code of Conduct
- I agree to follow the Ansible Code of Conduct
@alinabuzachis many thanks for picking this one up. I absolutely understand that you have a lot on your plate, but could you give an indication of whether/when this might land on your roadmap?
Ideally, we would push a fix up from our side, but we're not currently in a good position to do so. I'm trying to work out whether we should invest in a temporary workaround, or simply put up with manually syncing some of our larger data until a fix is in place.
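The kind of temporary workaround we have in mind is roughly the sketch below (plain boto3 outside of Ansible; bucket names are placeholders): list the source bucket and copy each object via the managed transfer API, which falls back to multipart copy for large objects.

import boto3

SRC_BUCKET = "bucket-with-big-file"      # placeholder
DST_BUCKET = "bucket-wanting-big-file"   # placeholder

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket=SRC_BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        # s3.copy() uses multipart UploadPartCopy when the source exceeds the
        # multipart threshold, so the 5GiB CopyObject limit does not apply.
        s3.copy({"Bucket": SRC_BUCKET, "Key": key}, DST_BUCKET, key)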
Thanks again.