statistik-kurs.de - Improving Business Understanding for a Professional Statistics Coach

Capstone Project for neuefische Data Analytics Bootcamp by Helena Heil, Leon Hocker, Sascha Müller

Description

Julian is a new small-business owner offering online statistics courses, and he wanted to know which approach would best help his business grow. Through regular communication with him and data from the video hosting platform, we identified what he should focus on and built a data pipeline so that he can easily analyze the data himself in the future.

Data Pipeline

File Info

  1. connection_test: Script to test the user's connection
  2. environment.yml: Specification of the Python virtual environment, in particular the dependencies needed to run this repo's code
  3. full_dl_master: The heart of the project: downloads, cleans and uploads the data
  4. GUI: User Interface for data download
  5. sql_functions: Functions for communicating with the database, plus helper functions, e.g. for data cleaning
  6. t_testing: T-tests for checking statistical relationships in the data
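
To illustrate the kind of test t_testing is concerned with, here is a minimal sketch of a two-sample t-test. It is not the code from t_testing: the choice of Welch's variant (unequal variances), the function name `welch_t_test`, and the numbers are assumptions made purely for illustration. It uses only the standard library and returns the t statistic and the degrees of freedom; a p-value would additionally need a t-distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test.

    Hypothetical helper for illustration; not part of this repo.
    """
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)

    # Squared standard error of each group mean.
    se2_a, se2_b = var_a / n_a, var_b / n_b

    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Made-up example values, NOT data from the project.
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [9.0, 10.0, 8.0, 11.0, 7.0]

t_stat, df = welch_t_test(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A large absolute t value relative to the degrees of freedom suggests that the two group means differ by more than chance alone would explain.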

Hidden files

  1. .env: Stores access credentials / login data
  2. .gitignore: Lists everything that should not be published to a remote repository, e.g. the .env file
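
As a rough sketch of how credentials can be kept out of the code and read from a .env file, the following example parses simple KEY=VALUE lines using only the standard library. The function name `load_env_file` and the keys `DB_USER` and `DB_PASSWORD` are hypothetical; the variable names this repo actually expects are not stated here, and the repo may well use a package such as python-dotenv instead.

```python
import tempfile
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict.

    Hypothetical helper for illustration; not part of this repo.
    """
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # Ignore malformed lines without '='.
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo: a temporary file stands in for the real .env (placeholder values only).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / ".env"
    demo_path.write_text(
        "# hypothetical credentials\n"
        "DB_USER=demo_user\n"
        "DB_PASSWORD='not-a-real-password'\n",
        encoding="utf-8",
    )
    credentials = load_env_file(demo_path)

print(sorted(credentials))
```

Because .env is listed in .gitignore, the real credentials stay on the local machine while the code that reads them can be committed safely.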