
Analyze No-show appointments data and unveil the relationships between multiple variables


Investigate the No-show appointments Dataset

Introduction

This dataset collects information from 100k medical appointments in Brazil and is focused on the question of whether or not patients show up for their appointment. A number of characteristics about the patient are included in each row.


(Image is from a copyright-free website: https://www.pexels.com/royalty-free-images/.)

  • ‘ScheduledDay’ tells us on what day the patient set up their appointment;
  • ‘Neighborhood’ indicates the location of the hospital;
  • ‘Scholarship’ indicates whether or not the patient is enrolled in the Brazilian welfare program Bolsa Família;
  • Be careful about the encoding of the last column: it says ‘No’ if the patient showed up to their appointment, and ‘Yes’ if they did not show up.
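
Because the last column reads the opposite way from what one might expect, it can help to convert it into an explicit "showed up" flag before any analysis. The snippet below is a minimal sketch using only the standard library. The column name `No-show`, the helper `showed_up`, and the sample rows are assumptions made for illustration, not part of the project's notebook.

```python
def showed_up(no_show_value: str) -> bool:
    """Translate the last column into a plain attendance flag.

    'No'  -> the patient attended   -> True
    'Yes' -> the patient was absent -> False
    """
    value = no_show_value.strip().lower()
    if value == "no":
        return True
    if value == "yes":
        return False
    raise ValueError(f"unexpected No-show value: {no_show_value!r}")


# Made-up rows for illustration only (column name 'No-show' is assumed).
sample_rows = [
    {"Scholarship": "0", "No-show": "No"},
    {"Scholarship": "1", "No-show": "Yes"},
    {"Scholarship": "0", "No-show": "No"},
]

attended = [showed_up(row["No-show"]) for row in sample_rows]
print(attended)  # [True, False, True]
```

Raising an error on any other value is a deliberate choice in this sketch: a silently mis-read label would invert every rate computed later.
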

Table of Contents

  • Prerequisites 🔍📜
  • Design 📐
  • Conclusions 📌
  • License 🔖

Prerequisites

  • Python 3.6.3
  • Jupyter Notebook
  • Anaconda-Navigator

Design

Step One - Choose Data Set

Click this link to download the corresponding data.

Step Two - Get Organized

This project will eventually contain:

  • The report communicating any findings;
  • Any Python code used during the analysis;
  • The data set.

Step Three - Analyze

Brainstorm some questions that could be answered using the data set, then start answering them. The main focus is on the relationships between multiple variables.
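
One such question might be "does the no-show rate differ between patients with and without a scholarship?". The sketch below shows one way to compare a rate across groups in plain Python. The function `no_show_rate_by`, the column name `No-show`, and the rows are hypothetical and chosen only to illustrate the idea; they are not results from the real data.

```python
from collections import defaultdict


def no_show_rate_by(rows, column):
    """Return {group value: share of appointments that were missed}."""
    totals = defaultdict(int)
    missed = defaultdict(int)
    for row in rows:
        group = row[column]
        totals[group] += 1
        # Remember the encoding: 'Yes' means the patient did NOT show up.
        if row["No-show"] == "Yes":
            missed[group] += 1
    return {group: missed[group] / totals[group] for group in totals}


# Made-up rows for illustration only.
rows = [
    {"Scholarship": "0", "No-show": "No"},
    {"Scholarship": "0", "No-show": "No"},
    {"Scholarship": "0", "No-show": "Yes"},
    {"Scholarship": "0", "No-show": "No"},
    {"Scholarship": "1", "No-show": "Yes"},
    {"Scholarship": "1", "No-show": "No"},
]

rates = no_show_rate_by(rows, "Scholarship")
print(rates)  # {'0': 0.25, '1': 0.5}
```

The same function can be reused with any other categorical column, such as ‘Neighborhood’, by changing the second argument.
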

Conclusions

This study carried out a step-by-step analysis of the dataset. Detailed instructions were given before each step, and interpretations were provided afterwards. The dataset contains 110,527 records of patient information, all from 2016. That is a substantial sample, but because it covers only one year, the findings may not generalize to other years. On the positive side, the data contained neither NaN values nor duplicates, either of which could have distorted the analysis.
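
The checks for missing values and duplicates mentioned above can be sketched as follows. This is a minimal standard-library illustration, not the code used in the notebook. The helpers `count_missing` and `count_duplicates`, the column names, and the inline CSV sample are all assumptions; the sample deliberately contains one duplicate and one empty field so that both checks have something to find.

```python
import csv
import io

# Made-up sample; the real data set is reported to have no such problems.
SAMPLE_CSV = """ScheduledDay,Neighborhood,Scholarship,No-show
2016-04-29,A,0,No
2016-04-29,A,0,No
2016-05-02,,1,Yes
2016-05-03,B,0,No
"""


def count_missing(rows):
    """Count rows with at least one empty or 'nan' field."""
    return sum(
        1
        for row in rows
        if any((value or "").strip().lower() in {"", "nan"} for value in row.values())
    )


def count_duplicates(rows):
    """Count rows that exactly repeat an earlier row."""
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(count_missing(rows), count_duplicates(rows))  # 1 1
```
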

License

MIT License