pandas-sqlalchemy-integration

This repository showcases how Python's pandas library can be used to interact with a SQL database via SQLAlchemy. The primary use case for this kind of integration is a data science workflow that needs to plug into a software development workflow. The repository is organized around a scheduled job that fetches data, preprocesses it, and writes the result to a SQL database.
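The fetch, preprocess, and write steps can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the column names, the `employees` table name, the inline sample data, and the in-memory SQLite database are all assumptions made so the example is self-contained.

```python
import io

import pandas as pd
from sqlalchemy import create_engine, text

# Hypothetical raw data standing in for the CSV in the data folder.
RAW_CSV = """name,department,salary
 Alice ,Engineering,100000
Bob,Sales,
Carol,Engineering,120000
"""


def fetch() -> pd.DataFrame:
    """Load the raw data (here from an in-memory string instead of a file)."""
    return pd.read_csv(io.StringIO(RAW_CSV))


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Trim whitespace in names and drop rows with a missing salary."""
    df = df.copy()
    df["name"] = df["name"].str.strip()
    df = df.dropna(subset=["salary"])
    df["salary"] = df["salary"].astype(int)
    return df


def write(df: pd.DataFrame, engine, table: str = "employees") -> None:
    """Write the frame to the database, replacing the table if it exists."""
    df.to_sql(table, engine, if_exists="replace", index=False)


# An in-memory SQLite database is assumed here; a real job would point
# the engine at its own database URL.
engine = create_engine("sqlite:///:memory:")
write(preprocess(fetch()), engine)

# Read the rows back through SQLAlchemy to confirm the write.
with engine.connect() as conn:
    result = pd.read_sql(
        text("SELECT name, salary FROM employees ORDER BY name"), conn
    )
```

Keeping each step in its own function makes it straightforward for a single entry point to call them in order, and for each step to be tested on its own.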

Folder Organization

data

For illustrative purposes, a small CSV file with fake data is included to simulate what raw data on an organization's employees might look like.

schedule

config

main

All code that will run as part of the scheduled job goes in here.

There is a main function which, in this example, is run as a scheduled job using Python's apscheduler library.

main should call every function required to perform the workflow. Beyond this, there is no restriction on how modules are structured inside the main folder.

call graph

test

playground

Resources

  1. SQLAlchemy guide, ch. 1
  2. SQLAlchemy schema
  3. pandas with SQLAlchemy