Basically, a (dummy) blog API that interacts with a (dummy) content moderation system. The content moderation takes the form of detecting foul language in blog posts and flagging those posts.
## How to launch it?

Simple answer: `docker-compose up`.

A bit longer answer: if you don't want to use `docker-compose`, or Docker in general, you can launch it locally by following the steps below.
- Create a virtual environment, be it `venv` or `conda`.
- Once done, activate it. Depending on what kind of virtual environment you use, it will be either `source <virtual environment name>/bin/activate` for `venv`, or `conda activate <virtual environment name>` for `conda`.
- Before launching only the dummy API, run `pip install -r requirements.txt`.
- What do I mean by "only"? You see, to also run tests, type checking, or a better REPL, you will need to run `pip install -r dev-requirements.txt`. Why so many requirements files, you ask? To keep the Docker image size to a minimum.
- You're ready to launch it. Type `uvicorn blog_api:app` into your shell, and you're good to go.
You can access the Swagger docs at http://localhost:8000/docs.
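For reference, the `uvicorn blog_api:app` target implies a module `blog_api` exposing a FastAPI instance named `app`. Here is a minimal sketch of what such a module might look like; the endpoints, model, and in-memory store are illustrative assumptions, not the actual code:

```python
# blog_api.py -- illustrative sketch, not the actual module
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Dummy blog API")

class PostIn(BaseModel):
    title: str
    content: str

# Naive in-memory store; the real project would use something better.
POSTS: dict[int, dict] = {}

@app.post("/posts")
def create_post(post: PostIn) -> dict:
    post_id = len(POSTS) + 1
    POSTS[post_id] = {"id": post_id, "title": post.title, "content": post.content, "flagged": False}
    # Here the real app would defer a moderation task (see the sketches further down).
    return POSTS[post_id]

@app.get("/posts/{post_id}")
def read_post(post_id: int) -> dict:
    return POSTS[post_id]
```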
## How to test it?

First, follow the *How to launch it?* part. Then...
- Run `pip install -r dev-requirements.txt`.
- Run `pytest tests/` to test the project.
- Run `mypy .` to type check the project. Its first run might take a while; don't worry, it's not broken.
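For illustration, a test might look like the following, using FastAPI's `TestClient`. The endpoint and payload match the hypothetical sketch above, not the project's actual tests:

```python
# tests/test_blog_api.py -- hypothetical example, not the project's actual tests
from fastapi.testclient import TestClient

from blog_api import app

client = TestClient(app)

def test_create_post_returns_unflagged_post():
    response = client.post("/posts", json={"title": "Hi", "content": "A friendly post."})
    assert response.status_code == 200
    assert response.json()["flagged"] is False
```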
## Bells and whistles

I couldn't keep myself from showing off at least a bit. That's why this solution has the following bells and whistles.
- Because I decided it would be better to detect and flag blog posts with foul language soon after they are added, I opted for a deferred-task approach to architecting this thing.
- Because of this, I needed a way to defer tasks, and that's why I used a `ThreadPoolExecutor`. Why not go the `asyncio` way? Because using a `ThreadPoolExecutor` allows for a simple implementation of a kind of client-side rate limiter: it somewhat prevents overflowing the ML API with too many in-flight requests by limiting them to the number of `max_workers`, which can be set with an env var. (See the first sketch after this list.)
- Also, an exponential backoff strategy, with jitter, was implemented to handle occasional 5xx errors from the ML API. It was inspired by this blog post on AWS. (See the second sketch below.)
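To make the `ThreadPoolExecutor` point concrete, here is a minimal sketch of the deferral-plus-rate-limiting idea. The env var name (`MAX_WORKERS`) and the helper functions are assumptions for illustration, not the actual code:

```python
# Sketch of deferring moderation work to a bounded thread pool.
# MAX_WORKERS, check_text(), and flag_if_foul() are illustrative names.
import os
from concurrent.futures import ThreadPoolExecutor

# At most max_workers moderation requests are in flight at once,
# which acts as a crude client-side rate limiter for the ML API.
executor = ThreadPoolExecutor(max_workers=int(os.getenv("MAX_WORKERS", "4")))

def check_text(text: str) -> bool:
    """Call the (dummy) ML API and return True if foul language was found."""
    ...  # HTTP call to the moderation service goes here
    return False

def flag_if_foul(post_id: int, text: str) -> None:
    if check_text(text):
        ...  # mark the post as flagged in whatever store is used

def moderate_later(post_id: int, text: str) -> None:
    # submit() returns immediately; the work happens on a pool thread.
    executor.submit(flag_if_foul, post_id, text)
```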
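And a sketch of exponential backoff with full jitter, in the spirit of that AWS blog post. The retry count and base/cap values here are made up:

```python
# Sketch of retrying 5xx responses with full-jitter exponential backoff.
import random
import time

import requests  # assuming the ML API is called with requests

def call_ml_api(url: str, payload: dict, retries: int = 5) -> requests.Response:
    base, cap = 0.1, 5.0  # seconds; illustrative values
    for attempt in range(retries):
        response = requests.post(url, json=payload, timeout=5)
        if response.status_code < 500:
            return response
        if attempt < retries - 1:
            # Full jitter: sleep a random time in [0, min(cap, base * 2**attempt)].
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    return response
```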
## What's missing?

- More tests.
- Better (read: cleaner) code organization, although I tried.
- A persistent connection to the ML API, so as to not waste time on TCP and TLS handshakes. (One way to do it is sketched after this list.)
- Obviously a proper database, the actual content-moderation system, and an easy way to horizontally scale the application. I forwarded the container port outside, so it's non-trivial now to `docker-compose up --scale blog=K`.
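On the persistent-connection point: assuming the ML API is called over HTTP(S) with `requests`, a shared `Session` would reuse the underlying connection across calls. A sketch (filling in the `check_text` stub from the earlier sketch), not the project's code:

```python
import requests

# One Session per process: connections are pooled and kept alive, so
# repeated calls skip the TCP and TLS handshakes.
session = requests.Session()

ML_API_URL = "https://ml-api.example/moderate"  # placeholder URL

def check_text(text: str) -> bool:
    response = session.post(ML_API_URL, json={"text": text}, timeout=5)
    response.raise_for_status()
    # Assuming the ML API answers with something like {"foul": true}.
    return bool(response.json().get("foul", False))
```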
There's more, but you get the idea.