This is university homework taken to an extreme.
We were asked to explore some of the reasons people drop out.
The first step is to come up with questions that could reveal
some of the factors, then fill in a dataset manually
and visualize it.
I thought it was silly to ask us to fill it in manually, so I spent an entire day
setting up genQuestionnaireDataset.ipynb
so I could have it create a dataset of 17k rows.
To be clear, the dataset is in no way perfect, but it does the job for now.
genQuestionnaireDataset.ipynb
: generates the questionnaire dataset using the data inside randomUsers.csv
genRandomUser.ipynb
: fetches random fake users using the API mentioned below and saves them to randomUsers.csv
visualizeDropouts.ipynb
: data visualization attempt
progress bar.ipynb
: self-explanatory 😋
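As a rough illustration of the generation step, here is a minimal sketch of how a notebook like genQuestionnaireDataset.ipynb could turn a list of users into questionnaire rows. The question names, answer options, and the `name` column are all hypothetical; the real notebook may use different ones.

```python
import csv
import io
import random

# Hypothetical questions and answer options; the real questionnaire may differ.
QUESTIONS = {
    "financial_difficulty": ["yes", "no"],
    "hours_worked_per_week": ["0", "1-10", "11-20", "20+"],
    "dropped_out": ["yes", "no"],
}


def generate_rows(users, n_rows, seed=0):
    """Build n_rows synthetic questionnaire responses, cycling through users."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    rows = []
    for i in range(n_rows):
        user = users[i % len(users)]
        row = {"name": user["name"]}
        for question, options in QUESTIONS.items():
            row[question] = rng.choice(options)  # uniform pick, no correlations
        rows.append(row)
    return rows


def write_csv(rows, handle):
    """Write the rows to any file-like object as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Stand-in for the contents of randomUsers.csv (the column name is assumed).
users = [{"name": "Alice"}, {"name": "Bob"}, {"name": "Carol"}]
rows = generate_rows(users, n_rows=10)
buffer = io.StringIO()
write_csv(rows, buffer)
```

Picking every answer uniformly at random means the answers are unrelated to each other, which is one reason a dataset built this way is far from perfect.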
If you want to create a new dataset of random users using the API, you'll need to sign up at RapidAPI, subscribe to the API for free, and get your own API key.
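Once you have a key, a request to a RapidAPI-hosted API is usually authenticated with the `X-RapidAPI-Key` and `X-RapidAPI-Host` headers. The sketch below shows one way to prepare such a request while keeping the key out of the notebook. The host, the path, and the `RAPIDAPI_KEY` environment variable name are placeholders, not the values this project actually uses.

```python
import os

# Placeholder values: substitute the host and path shown on the API's RapidAPI page.
API_HOST = "example-random-user.p.rapidapi.com"  # hypothetical host
API_PATH = "/users"  # hypothetical endpoint


def build_request(api_key, host=API_HOST, path=API_PATH):
    """Return the URL and headers for a call to a RapidAPI-hosted API."""
    if not api_key:
        raise ValueError("An API key is required")
    headers = {
        "X-RapidAPI-Key": api_key,
        "X-RapidAPI-Host": host,
    }
    return f"https://{host}{path}", headers


# Read the key from the environment so it never lands in the notebook or in git.
# "demo-key" is only a fallback so this sketch runs without a real key.
url, headers = build_request(os.environ.get("RAPIDAPI_KEY", "demo-key"))

# With the `requests` package installed, the call itself would then be:
# response = requests.get(url, headers=headers, timeout=10)
```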
I took the printProgressBar function from this Stack Overflow comment.
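For readers who have not seen this kind of helper, here is a minimal sketch of a text progress bar. It is not the exact code from the Stack Overflow comment; the function names and parameters below are my own simplification.

```python
def render_progress_bar(iteration, total, length=30, fill="#"):
    """Return a one-line text progress bar such as '|###---| 50.0%'."""
    fraction = iteration / total
    filled = int(length * fraction)
    bar = fill * filled + "-" * (length - filled)
    return f"|{bar}| {fraction * 100:.1f}%"


def print_progress_bar(iteration, total, **kwargs):
    """Print the bar in place, overwriting the previous one."""
    # "\r" returns the cursor to the line start so each call overwrites the last.
    print("\r" + render_progress_bar(iteration, total, **kwargs), end="", flush=True)
    if iteration == total:
        print()  # finish with a newline once complete


for step in range(1, 6):
    print_progress_bar(step, 5, length=10)
```

Separating the rendering from the printing keeps the string logic easy to check on its own.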
Create issues or pull requests if there are any improvements I can make to the dataset generation process or any other part of this project.