An example of how to stream large files to S3 from an external source on AWS Lambda using the JavaScript AWS SDK v3
When a Lambda function needs to download and upload files that vary widely in size, a standard flow that downloads the whole file up front and then uploads it requires configuring enough Lambda memory to hold the entire file while the transfer occurs.
With a streaming approach, the Lambda memory configuration is no longer the constraint: files of any size can be processed as long as the transfer completes within the maximum execution time (15 minutes).
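The memory argument above can be illustrated without any AWS dependencies. The following is a minimal sketch using only Node.js built-in streams; it is not the code from this repository. `fakeDownload`, `fakeUpload`, and `streamCopy` are hypothetical names that stand in for the HTTP download and the S3 upload, and the sizes are arbitrary.

```javascript
// Sketch only: stand-ins for the remote download and the S3 upload.
const { Readable, Writable } = require("node:stream");
const { pipeline } = require("node:stream/promises");

// Hypothetical source: emits `totalBytes` of zero bytes in pieces of at most
// `chunkSize`, standing in for a remote file arriving over HTTP.
function fakeDownload(totalBytes, chunkSize) {
  let sent = 0;
  return new Readable({
    highWaterMark: chunkSize,
    read() {
      if (sent >= totalBytes) {
        this.push(null); // signal end of stream
        return;
      }
      const size = Math.min(chunkSize, totalBytes - sent);
      sent += size;
      this.push(Buffer.alloc(size));
    },
  });
}

// Hypothetical sink: counts bytes and discards them, standing in for an upload.
function fakeUpload(stats) {
  return new Writable({
    write(chunk, _encoding, callback) {
      stats.bytes += chunk.length;
      stats.largestChunk = Math.max(stats.largestChunk, chunk.length);
      callback();
    },
  });
}

// Copies the whole "file" from source to sink one chunk at a time.
async function streamCopy(totalBytes, chunkSize) {
  const stats = { bytes: 0, largestChunk: 0 };
  // pipeline() applies backpressure: the source is read only as fast as the
  // sink accepts data, so the full file is never buffered in memory.
  await pipeline(fakeDownload(totalBytes, chunkSize), fakeUpload(stats));
  return stats;
}
```

Calling `streamCopy(10 * 1024 * 1024, 64 * 1024)` moves 10 MiB through the pipeline while the sink only ever sees pieces of at most 64 KiB, which is why the size of the file does not dictate the memory setting.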
To deploy this CDK application, you will need the following:
- Node.js v10 or later (LTS only)
- Docker (for building the lambda function)
- An AWS profile with valid IAM credentials
Deploy the CDK stack by cloning this repository, then running:
npm run build && npm run deploy
Run the invoke.sh script with the URL of a large file (this will take some time while the Lambda function pulls in the file and uploads it):
./invoke.sh "https://ai2-public-datasets.s3.us-west-2.amazonaws.com/arc/ARC-V1-Feb2018.zip" # 649 MB file
The Lambda function then streams the incoming data from your URL and uploads it to S3 as it arrives.
When the Lambda function has completed, open the S3 console to confirm that the file has been uploaded to your bucket.
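One way such a download-to-S3 stream might be wired up with the AWS SDK v3 is sketched below. This is an assumption about the general shape, not the function shipped in this repository. `streamUrlToS3` and its parameter names are hypothetical. The sketch assumes a Node.js 18 or later runtime (for the global `fetch` and `Readable.fromWeb`) and expects the caller to pass in `Upload` from `@aws-sdk/lib-storage` and an `S3Client` from `@aws-sdk/client-s3`, so the sketch itself needs no installed packages.

```javascript
const { Readable } = require("node:stream");

// Hypothetical helper: streams the body of `url` into s3://bucket/key.
// `UploadClass` is assumed to be `Upload` from "@aws-sdk/lib-storage" and
// `s3Client` an `S3Client` from "@aws-sdk/client-s3"; both are injected.
async function streamUrlToS3({ url, bucket, key, s3Client, UploadClass, fetchFn = fetch }) {
  const response = await fetchFn(url);
  if (!response.ok || !response.body) {
    throw new Error(`Download failed with status ${response.status}`);
  }

  const upload = new UploadClass({
    client: s3Client,
    params: {
      Bucket: bucket,
      Key: key,
      // Convert the web ReadableStream into a Node.js stream so the body is
      // consumed piece by piece instead of being buffered in full.
      Body: Readable.fromWeb(response.body),
    },
  });

  // Resolves once the upload has finished.
  return upload.done();
}

// Hypothetical wiring inside a handler (requires the SDK packages):
// const { S3Client } = require("@aws-sdk/client-s3");
// const { Upload } = require("@aws-sdk/lib-storage");
// await streamUrlToS3({ url, bucket, key, s3Client: new S3Client({}), UploadClass: Upload });
```

Passing the SDK pieces in as arguments is a choice made for this sketch so that it can be exercised with stand-ins; a real handler could just as well import them directly.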
Note: If you need to use an AWS profile other than default, set the AWS_PROFILE environment variable in your shell before running any commands.
Run the following command to remove all of the resources created by this CDK stack.
npm run destroy
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.