This is the first working version of a simple pipeline to extract Stripe Events using the API and store them in Redshift.
This initial version only works with Stripe Subscription Events and ignores Subscription Items and Metadata.
It was successfully used to backfill a period of data needed for ad-hoc analysis.
We plan on adding further functionality to this pipeline:
- Support pulling in all the subscription event data
- More event types, like invoices and charges; ideally we can pull in all event types at some point
- Smarter loading logic, with support for resuming from the last data in Redshift
- Compressing data before loading to Redshift to decrease bandwidth used
- More robust error handling
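The compression item above could use the standard library's `gzip` before uploading to S3; Redshift's `COPY` command accepts a `GZIP` option, so compressed files can be loaded without an extra decompression step. A minimal sketch (`compress_events` is a hypothetical helper, not part of the current pipeline):

```python
import gzip
import json

def compress_events(events):
    """Serialize events as newline-delimited JSON and gzip the result.

    Redshift's COPY can ingest gzip files directly via its GZIP option,
    so compressing before the S3 upload only reduces bandwidth, without
    changing the load path.
    """
    payload = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    return gzip.compress(payload)

# Example: a single subscription event round-trips through compression
blob = compress_events([{"id": "evt_1", "type": "customer.subscription.created"}])
original = gzip.decompress(blob).decode("utf-8")
```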
- Copy the `.env_template` file:

  ```shell
  cp .env_template .env
  ```

- Fill out the `.env` file. You'll need Redshift connection details and credentials, AWS credentials and an S3 bucket location, as well as a Stripe API key.
- Set up a dev environment and install all the dependencies:

  ```shell
  make init
  ```

- Create the table schema if it doesn't exist. Currently the only table to create is in `redshift/subscription_events.sql`.
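The real keys live in `.env_template`, so the variable names below are purely illustrative, not the template's actual contents. A small startup check along these lines can catch an incomplete `.env` before the crawler runs:

```python
# Hypothetical setting names for illustration only; the authoritative
# list of keys is whatever .env_template defines.
REQUIRED_VARS = ["STRIPE_API_KEY", "REDSHIFT_HOST", "AWS_ACCESS_KEY_ID", "S3_BUCKET"]

def missing_vars(env, required=REQUIRED_VARS):
    """Return the names of required settings absent or empty in `env`."""
    return [name for name in required if not env.get(name)]

# Example with a partially filled configuration dict:
incomplete = {"STRIPE_API_KEY": "sk_test_xxx", "S3_BUCKET": "my-bucket"}
gaps = missing_vars(incomplete)  # the two Redshift/AWS keys are reported
```

In the application itself this would be called with `os.environ` and turned into a clear startup error instead of a failure mid-crawl.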
After completing the setup, you should be able to run the crawler!

```shell
stripe-pipeline crawler run
```

Or:

```shell
make run
```
The `crawler run` command will query the database to find the date of the last event saved in the subscription events table.
It will then backfill from that date up until now, after which it will keep polling Stripe for new events every minute and load them into the pipeline.
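The backfill-then-poll behavior described above can be sketched as a small pure function. Stripe's events endpoint accepts a `created` filter dict with `gt`/`gte`/`lt`/`lte` bounds, so one helper can cover both the initial backfill and the minute-by-minute polling; this is an illustrative sketch, not the crawler's actual implementation:

```python
import time

def backfill_window(last_saved_ts, now=None):
    """Build a Stripe `created` filter from the newest event timestamp
    already stored in Redshift (Unix seconds), or None if the table
    is empty.
    """
    now = int(now if now is not None else time.time())
    if last_saved_ts is None:
        # Empty table: backfill everything up to the current moment
        return {"lte": now}
    # Resume just past the last saved event
    return {"gt": int(last_saved_ts), "lte": now}
```

A filter like this could then be passed to `stripe.Event.list(created=...)`; on each polling tick, `last_saved_ts` advances to the newest event loaded so far.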
The whole application can also be run within a Docker container. To build an image and run the crawler in Docker:

```shell
make rund
```
To deploy the project to Kubernetes, make sure you have `kubectl` set up with access to your cluster.
You also need the required secrets set up in Kubernetes. If you already have a `.env` file with the right credentials and ran the `make init` command, this will all be taken care of.
Otherwise, you'll need to create secret configuration files matching the `kube/*.secret.yaml.template` files (without the `.template` extension; these files will be ignored by git). Remember that you'll also need to base64-encode the values of each secret.
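Kubernetes Secret manifests expect the values under `data` to be base64-encoded. The encoding can be done with Python's standard library; `encode_secret` below is just an illustrative helper, not part of the repo:

```python
import base64

def encode_secret(value: str) -> str:
    """Base64-encode a secret value for use in a Kubernetes Secret manifest."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")

encoded = encode_secret("sk_test_123")  # -> "c2tfdGVzdF8xMjM="
```

The shell equivalent (`printf '%s' "$VALUE" | base64`) works too; just make sure no trailing newline sneaks into the encoded value.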
To deploy, all you need is:

```shell
make deploy
```
Once you have deployed, you can view your running application logs using:

```shell
make logs
```