/Datacompass

Repository that stores resources for data competitions

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).




CURRENT STATUS: IN DEVELOPMENT

Repository storing automation resources, infrastructure, and more for Kaggle-based data competitions

General-purpose and adaptable project for various types of data competitions


Main features 🔥

General.

  • GitHub Codespaces support is included (see devcontainer)

  • Support for local execution is also included.


Getting started 🚀

This repository includes scripts and resources for the following areas:

  1. Data Processing Pipeline - An automated workflow that spans from data acquisition and cleaning to transformation and preparation for modeling.

  2. Dataset Management - Capabilities to easily load and manage Kaggle datasets, including automatic data download from the platform.

  3. Results Reporting - Automatic creation of detailed reports summarizing performance metrics, visualizations, and model outcomes.

  4. Integration with Kaggle - Ability to upload and submit results directly to Kaggle competitions, streamlining the development and testing cycle.
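
As a rough illustration of how the four areas above could fit together, here is a minimal Python sketch. It is not the code of this repository: every function name (`kaggle_download_cmd`, `kaggle_submit_cmd`, `clean_rows`, `summarize`) and the sample data are hypothetical. The only external detail assumed is the official Kaggle CLI, with its `kaggle competitions download` and `kaggle competitions submit` subcommands.

```python
import csv
import io
import statistics


def kaggle_download_cmd(competition: str, dest: str) -> list[str]:
    """Build the Kaggle CLI command that downloads a competition's data files."""
    return ["kaggle", "competitions", "download", "-c", competition, "-p", dest]


def kaggle_submit_cmd(competition: str, csv_path: str, message: str) -> list[str]:
    """Build the Kaggle CLI command that submits a predictions file."""
    return ["kaggle", "competitions", "submit", "-c", competition,
            "-f", csv_path, "-m", message]


def clean_rows(rows: list[dict]) -> list[dict]:
    """Cleaning step (hypothetical): drop rows that contain an empty field."""
    return [row for row in rows if all(value.strip() for value in row.values())]


def summarize(rows: list[dict], column: str) -> dict:
    """Reporting step (hypothetical): row count and mean of a numeric column."""
    values = [float(row[column]) for row in rows]
    return {"count": len(values), "mean": statistics.mean(values)}


# Tiny in-memory dataset standing in for a downloaded CSV file.
raw = "id,score\n1,0.5\n2,\n3,1.5\n"
rows = clean_rows(list(csv.DictReader(io.StringIO(raw))))
report = summarize(rows, "score")
```

The two command builders only return argument lists. To actually run them, pass the result to `subprocess.run(...)`, which requires the Kaggle CLI to be installed and API credentials to be configured.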

⚡ Any use of cloud services (AWS and Azure) generates costs. Use them at your own risk!

Built with 🛠️


Disclaimer 📝

  • I'm not responsible for bricked devices or software misconfigurations.
  • I'm not responsible for any high cloud costs incurred by using the code in this project.
    • You are free to use the software in this project, and doing so is your own decision.
  • I'm not responsible for data loss.

This is a personal project with academic origins and is not intended to be a commercial or professional solution. If you want to use it, it is at your own risk.


Roadmap 🗓️

❗ Check the Project dashboard for more info!

Wiki 📕

❗ No wiki at the moment!

License 📌

This project is licensed under the GNU General Public License v3.0 (GPL-3.0) - see the LICENSE.md file for details.


⌨️ with ❤️ by Alexvidalcor 😊