FECDataConnect

An ETL pipeline for extracting data from the Federal Election Commission (FEC) and integrating it into a MariaDB database.

Overview

FECDataConnect fetches data from the FEC, transforms it, and loads it into a structured MariaDB database so that election-related data is easy to access and analyze.

Features

  • Data Extraction: Automated scraping of FEC data.
  • Transformation: Pre-processing and cleaning of raw FEC data to ensure database readiness.
  • Loading: Streamlined insertion of the transformed data into a MariaDB database.
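As an illustration of the transformation step, here is a minimal sketch. It assumes raw records arrive as pipe-delimited text lines with dates written as MMDDYYYY; the four-column layout and the function names are hypothetical and are not taken from this repository.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical column layout, for illustration only.
COLUMNS = ("committee_id", "contributor_name", "transaction_date", "amount")


def transform_line(line):
    """Turn one pipe-delimited raw line into a cleaned dict, or None if unusable."""
    parts = [part.strip() for part in line.rstrip("\n").split("|")]
    if len(parts) != len(COLUMNS):
        return None  # wrong number of fields: skip the record
    row = dict(zip(COLUMNS, parts))
    try:
        # Assumed date format: MMDDYYYY, e.g. 01152024
        row["transaction_date"] = datetime.strptime(
            row["transaction_date"], "%m%d%Y"
        ).date()
        row["amount"] = Decimal(row["amount"])
    except (ValueError, InvalidOperation):
        return None  # unparseable date or amount: skip the record
    # Normalize name casing; store empty names as None
    row["contributor_name"] = row["contributor_name"].title() or None
    return row


def transform(lines):
    """Clean an iterable of raw lines, dropping records that fail validation."""
    return [row for row in map(transform_line, lines) if row is not None]
```

Dropping bad records silently keeps the sketch short; a real pipeline would probably log or count the rejected lines.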

Installation & Setup

  1. Clone the Repository:
    git clone https://github.com/your_username/FECDataConnect.git
  2. Install Required Libraries:
    pip install -r requirements.txt
  3. Database Configuration: [Instructions for setting up your MariaDB database, configuring user privileges, etc.]
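One possible way to keep database credentials out of the code is to read them from environment variables. The sketch below is an assumption about how src/config.py could look, not the project's actual configuration; the FEC_DB_* variable names and the default database name are hypothetical. Port 3306 is the MariaDB default.

```python
import os


def load_db_settings(env=None):
    """Build connection settings from environment variables.

    The FEC_DB_* names are hypothetical; adapt them to your setup.
    """
    env = os.environ if env is None else env
    required = ("FEC_DB_USER", "FEC_DB_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": env.get("FEC_DB_HOST", "127.0.0.1"),
        "port": int(env.get("FEC_DB_PORT", "3306")),  # MariaDB default port
        "user": env["FEC_DB_USER"],
        "password": env["FEC_DB_PASSWORD"],
        "database": env.get("FEC_DB_NAME", "fec"),  # hypothetical default name
    }
```

The returned keys are chosen to match the keyword arguments that Python MariaDB drivers typically accept in their connect() call, so the dictionary can usually be passed as `connect(**settings)`. Which driver this project uses depends on requirements.txt.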

Usage

Run the main script to start the ETL process:

python main.py
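For orientation, the entry point might chain the three stages roughly as follows. This is a sketch under assumptions: run_pipeline and its three callable parameters are hypothetical names, and the actual main.py may be organized differently.

```python
import logging

logger = logging.getLogger("fecdataconnect")


def run_pipeline(extract, transform, load):
    """Run extract, transform, and load in order; return what load reports.

    Each stage is passed in as a callable so that it can be swapped or
    tested in isolation. All three parameter names are hypothetical.
    """
    raw = extract()
    logger.info("extracted %d raw records", len(raw))
    rows = transform(raw)
    logger.info("kept %d records after transformation", len(rows))
    loaded = load(rows)
    logger.info("loaded %s records", loaded)
    return loaded
```

With stand-in stages, `run_pipeline(lambda: ["a", "b"], lambda raw: [r.upper() for r in raw], len)` runs end to end without touching the network or a database.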

Contributing

Contributions are welcome!

  1. Fork the repository.
  2. Create your feature branch ('git checkout -b feature/AmazingFeature').
  3. Commit your changes ('git commit -m "Add some AmazingFeature"').
  4. Push the branch ('git push origin feature/AmazingFeature').
  5. Open a pull request.

For major changes, please open an issue first to discuss what you'd like to change.

License

This project is licensed under the MIT License. For more details, see the LICENSE file in the repository.

Project File Structure

FECDataConnect/
│
├── data/
│   ├── raw/                 # For storing raw scraped data
│   ├── processed/           # For data that's been cleaned/transformed
│   └── archive/             # For archival purposes (optional)
│
├── src/
│   ├── etl/
│   │   ├── extract.py       # Code to extract data
│   │   ├── transform.py     # Code to transform data
│   │   └── load.py          # Code to load data into MariaDB
│   │
│   ├── utils/               # Helper scripts, utilities, etc.
│   └── config.py            # Configuration variables/settings
│
├── logs/                    # Directory for logs (if you're logging events/errors)
│
├── tests/                   # For unit tests
│
├── .gitignore               # Specifies intentionally untracked files to ignore
├── LICENSE
├── README.md
└── requirements.txt         # Lists all project dependencies
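To give a sense of what src/etl/load.py in the tree above could contain, here is a minimal loading sketch written against the generic Python DB-API. The contributions table, its columns, and the load_rows function are hypothetical. The `?` placeholders assume a driver with the qmark parameter style; the standard-library sqlite3 module is used below only as a stand-in so the example runs without a MariaDB server.

```python
import sqlite3  # stand-in for a MariaDB driver, used only to make the sketch runnable


def load_rows(conn, rows):
    """Insert cleaned rows through a DB-API connection; return the number inserted."""
    params = [
        (row["committee_id"], row["contributor_name"], str(row["amount"]))
        for row in rows
    ]
    cur = conn.cursor()
    # Hypothetical table layout, for illustration only
    cur.execute(
        "CREATE TABLE IF NOT EXISTS contributions ("
        "committee_id VARCHAR(9), "
        "contributor_name VARCHAR(200), "
        "amount DECIMAL(14, 2))"
    )
    cur.executemany(
        "INSERT INTO contributions (committee_id, contributor_name, amount) "
        "VALUES (?, ?, ?)",
        params,
    )
    conn.commit()  # persist the whole batch at once
    cur.close()
    return len(params)


# Usage with an in-memory stand-in database
conn = sqlite3.connect(":memory:")
inserted = load_rows(
    conn,
    [
        {"committee_id": "C00000001", "contributor_name": "Smith, John", "amount": "250.00"},
        {"committee_id": "C00000002", "contributor_name": "Doe, Jane", "amount": "10.00"},
    ],
)
```

Batching the inserts with executemany and committing once keeps round trips low. If your driver uses a different placeholder style, such as `%s`, adjust the SQL accordingly.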