This project uses a combination of serverless and terraform to create a lambda that is trigged when
a csv (can be customized in serverless.yml
file) is created in s3.
After trigged the python code creates a stream with limitation (chunck size) to read each line and obtain a structured data to easy access.
The serverless is configured to encrypted all s3 files using AWS KMS even as Lambda function.
Continuos integrations has developed using Github actions. More detalis in .github/worflows
folder.
The sample utilized for this project is example/municipalities.csv
. This file contains all municipalities
of Brazil.
Must be contains headers and csv delimiter is ;
that can easyly configurated in fieldDelimiter
.
You can change row.get('HEADER')
to get a value
The max chunck size is limited to 1MB (s3 select limitation)
Used to manage terraform state. https://app.terraform.io/
After create your terraform cloud account and started new workspace. Same variables must be created to apply changes
AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
. Both generated in AWS IAM with programmatically access mode
AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
. See more information above.
TF_API_TOKEN
generated in terraform cloud. Settings > Tokens > Create an API token
- S3
- Lambda
- Provider
- KMS
- Python 3.8