This project details the deployment of a Python FastAPI application, backed by a PostgreSQL RDS database, to AWS. The AWS technologies used in production are:
- RDS
- Lambda
- CloudFormation
- API Gateway
- S3
Major shoutout to iwpnd. I followed his tutorial to get a successful first basic deployment. He also provided sage advice when I got stuck in a few places. Essentially, I adapted his project to my specific needs, and added the ability to connect to a PostgreSQL database.
I participate in a "fitness challenge" where players log daily points. The format of the collected data is not ideal, so I aim to clean this data, store it indefinitely on AWS in RDS, and make it available via FastAPI so that others can use the data for analysis.
.
├── app
│ ├── __init__.py
│ ├── crud.py
│ ├── database.py
│ ├── main.py
│ ├── models.py
│ ├── routers
│ │ ├── __init__.py
│ │ ├── players.py
│ │ ├── seasons.py
│ │ └── teams.py
│ └── schemas.py
├── .pre-commit.yaml
├── LICENSE
├── README.md
├── requirements.txt
└── template.yml
- crud.py: specifies crud (create, read, update, delete) actions
- database.py: sets up connection with PostgreSQL
- main.py: brings all routes together
- models.py: sqlalchemy models specified
- schemas.py: pydantic models specified, which I believe dictates the output format when API is called
- routers/: folder containing subsets of routes
- .pre-commit.yaml: config file for pre-commit tool
- requirements.txt: requirements to install when project is built using sam
- template.yml: essentially the recipe for deploying the project to AWS
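Since database.py sets up the PostgreSQL connection and the credentials come from environment variables on Lambda, here is a minimal sketch of how a connection URL could be assembled. The variable names (DB_USER, DB_PASSWORD, DB_HOST, DB_PORT, DB_NAME) and the function name are my own assumptions for illustration, not necessarily what the project uses.

```python
import os
from urllib.parse import quote_plus


def build_database_url(env=None):
    """Assemble a PostgreSQL connection URL from environment variables.

    The variable names below are hypothetical; adjust them to whatever
    your Lambda function and local shell actually define.
    """
    env = os.environ if env is None else env
    user = env["DB_USER"]
    # Percent-encode the password so characters like "@" or "/" do not break the URL.
    password = quote_plus(env["DB_PASSWORD"])
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "5432")  # default PostgreSQL port
    name = env["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```

The resulting string is in the form that SQLAlchemy's create_engine accepts. Passing a plain dict instead of os.environ makes the function easy to try out locally.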
In order to proceed with set-up and deployment, AWS CLI and SAM need to be installed and configured on your machine.
- IAM Console >> Roles >> Create
- Select AWS service as type, and choose Lambda as use case
- Add policies:
  - AWSLambdaBasicExecutionRole: Permission to upload logs to CloudWatch
  - AWSLambdaVPCAccessExecutionRole: Permission to connect our Lambda function to a VPC
- Finish creating role, and set name as fastapilambdarole. This name matches the role specified in template.yml.
When we deploy our code with AWS SAM, a zip folder of our code will be uploaded to S3. There are two options for creating an S3 bucket.
(1) In the AWS console
(2) With the AWS CLI
aws s3api create-bucket \
--bucket {your bucket name here} \
--region eu-central-1 \
--create-bucket-configuration LocationConstraint=eu-central-1
Please note that S3 bucket names need to be globally unique. So the name of the bucket you create here will determine the bucket name used in later steps. Also, ensure that you change the region to your local region.
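One gotcha when changing the region: for us-east-1 the LocationConstraint must be left out entirely, while every other region needs it. If you would rather create the bucket from Python, the sketch below builds the arguments accordingly. The helper name is hypothetical, and using boto3 here is my assumption (the project itself only shows the AWS CLI).

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for an S3 create-bucket call.

    us-east-1 is the default location and rejects an explicit
    LocationConstraint, so it is only added for other regions.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs
```

With boto3 installed and credentials configured, this could be used as boto3.client("s3", region_name=region).create_bucket(**create_bucket_kwargs(name, region)).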
git clone https://github.com/KurtKline/fastapi-postgres-aws-lambda.git
cd fastapi-postgres-aws-lambda
# create and activate a virtual environment
pip install -r requirements.txt
pip install uvicorn
In order to test locally without errors, PostgreSQL needs to be installed on your local machine, and the sample data needs to be loaded into a database table (see "Installing PostgreSQL on Ubuntu 20.04").
From the Linux terminal:
psql
postgres=# CREATE DATABASE fitness;
postgres=# \c fitness
fitness=# CREATE TABLE fit (id serial, player varchar(50), team varchar(50), season varchar(50), data_date date, points float);
fitness=# \copy fit(player, team, season, data_date, points) from 'clean_fit.csv' with DELIMITER ',' CSV HEADER;
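Before running \copy, it can help to sanity-check the CSV against the table definition. This is only a rough sketch: it assumes the header row uses the same column names as the table, that data_date is in ISO format (YYYY-MM-DD), and the function name is made up for illustration. PostgreSQL itself accepts more date formats than this check does.

```python
import csv
import io
from datetime import date

# Column order used in the \copy command above.
EXPECTED_COLUMNS = ["player", "team", "season", "data_date", "points"]


def check_csv(text):
    """Return a list of problems; an empty list means the data looks loadable."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    for row_no, row in enumerate(reader, start=1):
        try:
            date.fromisoformat(row["data_date"])  # assumes ISO dates
            float(row["points"])
        except (TypeError, ValueError):
            problems.append(f"row {row_no}: bad data_date or points")
            continue
        for column in ("player", "team", "season"):
            # These columns are varchar(50) in the table definition.
            if len(row[column] or "") > 50:
                problems.append(f"row {row_no}: {column} longer than 50 characters")
    return problems
```

To check the real file, read it first, for example check_csv(open("clean_fit.csv").read()).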
Start FastAPI
uvicorn app.main:app --reload
# click the link to open the browser at http://127.0.0.1:8000
Once you click the link, add /docs or /redoc to the URL, for example http://127.0.0.1:8000/docs. With /docs you will then see the Swagger UI.
In order to deploy to AWS, our code AND our database need to live on AWS. Here are some basic guidelines for setting up the RDS PostgreSQL instance.
- In RDS instance settings, make sure Public Accessibility is set to Yes
- Specify initial database name, which will be used in pg_restore below
- Whitelist IP
  - Create new EC2 security group
  - Inbound Rules: Type: All Traffic, Source: My IP
  - Add this security group to RDS instance
How to gain access to RDS instance from Linux terminal:
psql \
--host=<DB instance endpoint from AWS> \
--port=<port> \
--username=<master user name> \
--password \
--dbname=<database name>
Once the data is dumped into your RDS PostgreSQL instance, you can set Public Accessibility back to No if you'd like. This just prevents external sources, like your local PC, from accessing your RDS instance.
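Here are two options for loading the data into RDS PostgreSQL: dump and restore an existing database with pg_dump and pg_restore, or create the table in RDS and \copy the CSV file into it.
- Dump (done from the command line):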
$ pg_dump -Fc mydb > db.dump
- Restore with:
pg_restore -v -h [RDS endpoint] -U [master username ("postgres" by default)] -d [RDS database name] [dumpfile].dump
- Verify load was successful by connecting with psql block shown above
For the second option, first connect to RDS through psql as shown above. Depending on your database name, the fit=> prompt shown below may be different for you.
fit=> create table fit (id serial, player varchar(50), team varchar(50), season varchar(50), data_date date, points float);
fit=> \copy fit(player, team, season, data_date, points) from 'clean_fit.csv' with DELIMITER ',' CSV HEADER;
COPY 105
More options for loading data into PostgreSQL RDS
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html
The template.yml file is used for deployment with AWS SAM.
(1) Replace the values in template.yml specified as {replace}.
(2) Uncomment # openapi_prefix="/prod" in app/main.py. This allows proper access to the API when deployed.
(3) Run the following steps for SAM in the Linux terminal
sam validate
sam build --debug
sam package --s3-bucket {your bucket name here} --output-template-file out.yml --region eu-central-1
sam deploy --template-file out.yml --stack-name example-stack-name --region eu-central-1 --no-fail-on-empty-changeset --capabilities CAPABILITY_IAM
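If you end up redeploying often, the four SAM steps above could be wrapped in a small Python script. This is just a sketch under the assumption that the sam CLI is on your PATH; the function names are mine, and the flags are copied from the commands above.

```python
import subprocess


def sam_commands(bucket, stack_name, region, packaged_template="out.yml"):
    """Return the SAM CLI steps from this README as argument lists."""
    return [
        ["sam", "validate"],
        ["sam", "build", "--debug"],
        ["sam", "package", "--s3-bucket", bucket,
         "--output-template-file", packaged_template, "--region", region],
        ["sam", "deploy", "--template-file", packaged_template,
         "--stack-name", stack_name, "--region", region,
         "--no-fail-on-empty-changeset", "--capabilities", "CAPABILITY_IAM"],
    ]


def run_all(commands):
    """Run each step in order; check=True stops at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

For example, run_all(sam_commands("{your bucket name here}", "example-stack-name", "eu-central-1")) would run the same sequence as the terminal commands.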
- Accessing PostgreSQL RDS instance locally: Make sure Public Accessibility is set to Yes, otherwise you will get a timeout error.
- psycopg2-binary instead of psycopg2: For some reason, AWS lambda doesn't play well with psycopg2, even though it works locally
- Lambda VPC: When this project is deployed as-is, VPC connection is set to none. I needed to change this to Custom VPC, and add my default VPC and security group here. This has since been added directly into the template.yml file.
- openapi_prefix="/prod": This value needs to match StageName: prod in template.yml. The sample project I pulled from had Prod with a capital P, which would not load the /docs and /redoc properly when deployed.
- Need to add AWSLambdaVPCAccessExecutionRole policy to the fastapilambdarole, otherwise will get errors when using the sam deploy command.
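The Prod vs prod mismatch is easy to miss, so a small check before deploying might save some head-scratching. The sketch below assumes StageName appears in template.yml as a plain literal value (not a parameter reference), and the function names are hypothetical. It uses a regular expression rather than a YAML parser to stay within the standard library.

```python
import re


def stage_name(template_text):
    """Return the first literal StageName value in the template, or None."""
    match = re.search(
        r"^\s*StageName:\s*['\"]?([\w-]+)['\"]?\s*$",
        template_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def prefix_matches_stage(prefix, template_text):
    """True if prefix equals "/" + StageName; the comparison is case-sensitive."""
    stage = stage_name(template_text)
    return stage is not None and prefix == f"/{stage}"
```

For example, prefix_matches_stage("/prod", open("template.yml").read()) should be true before running sam deploy.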
- I'm currently using Lambda environment variables to set the database credentials (including password), so I need to figure out a more secure solution. Someone recommended to use KMS for this.
- Add VPC settings for Lambda to template.yml file if possible, so that no changes need to be made after deployment
- Add data samples which can be used to illustrate full set-up
- Add black formatting and pre-commit