✨ This repo contains my projects for Udacity - Advanced Data Analysis. You can find my certificate here.
In this project, I went through the full data analysis process and saw how everything fits together. I used the Python libraries NumPy, pandas, and Matplotlib, which make writing data analysis code in Python a lot easier!
This project is based on a dataset that collects information from 100k medical appointments in Brazil and focuses on the question of whether or not patients show up for their appointment. Each row includes a number of characteristics about the patient.
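As a minimal sketch of the kind of question this dataset supports, the snippet below computes a no-show rate overall and per group with pandas. The rows are toy data invented for illustration, and the column names `sms_received` and `no_show` are assumptions rather than the dataset's confirmed schema.

```python
import pandas as pd

# Toy rows invented for illustration; the column names are assumptions.
appointments = pd.DataFrame(
    {
        "sms_received": [0, 0, 0, 1, 1, 1, 0, 1],
        "no_show": ["No", "No", "Yes", "Yes", "No", "Yes", "No", "No"],
    }
)

# Encode the outcome as a boolean so that its mean is a no-show rate.
appointments["missed"] = appointments["no_show"] == "Yes"

# Share of all appointments that were missed.
overall_rate = appointments["missed"].mean()

# Share of missed appointments within each group of the chosen characteristic.
rate_by_sms = appointments.groupby("sms_received")["missed"].mean()

print(f"Overall no-show rate: {overall_rate:.2f}")
print(rate_by_sms)
```

The same `groupby(...).mean()` pattern works for any other patient characteristic in a row. The toy rates here say nothing about the real data.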
A/B tests are very commonly performed by data analysts and data scientists, so it is important to get some practice with the difficulties they involve.
For this project, I worked to understand the results of an A/B test run by an e-commerce website. My goal was to help the company decide whether they should implement the new page, keep the old page, or perhaps run the experiment longer before making their decision.
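One common way to frame this decision is a one-sided two-proportion z-test on conversion rates. The sketch below uses only the standard library. The function name, the counts, and the 0.05 significance level are all hypothetical choices for illustration, not the project's actual numbers or necessarily the method used in the notebook.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_old, n_old, conv_new, n_new):
    """One-sided z-test of H0: p_new <= p_old against H1: p_new > p_old."""
    p_old = conv_old / n_old
    p_new = conv_new / n_new
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_old + conv_new) / (n_old + n_new)
    # Standard error of the difference in proportions under the null.
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / se
    # Upper-tail probability: chance of a z at least this large under H0.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts, invented for illustration only.
z, p_value = two_proportion_z_test(
    conv_old=1200, n_old=10000, conv_new=1260, n_new=10000
)

alpha = 0.05  # assumed significance level
if p_value < alpha:
    decision = "implement the new page"
else:
    decision = "keep the old page or collect more data"

print(f"z = {z:.3f}, p-value = {p_value:.3f} -> {decision}")
```

A p-value above the chosen threshold does not show that the pages perform equally. It only means the test did not detect an improvement, which is one reason running the experiment longer can be a reasonable outcome.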
This project has two parts that demonstrate the importance and value of data visualization techniques in the data analysis process. In the first part, I used Python visualization libraries to systematically explore a selected dataset, starting from plots of single variables and building up to plots of multiple variables. In the second part, I produced a short presentation that illustrates interesting properties, trends, and relationships that I discovered in the selected dataset.
This data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, and many others. This data dictionary explains the variables in the data set. The project is not expected to explore all of the variables in the dataset; instead, the exploration focuses on about 10-15 of them.