Data Analyst Nanodegree, Udacity

A Bertelsmann-sponsored initiative that selected participants for phase one from a pool of 10,000+ applicants, and then selected 500+ successful participants from a pool of 3,000+ for phase two. Phase 2 selection required completion of the course prerequisites as well as active engagement in Slack social events (days-of-code challenges, helping colleagues, and working together to host webinar sessions), as part of Udacity's goal to strengthen our ability to collaborate.

The projects are stored in separate directories in order to mimic Udacity's curation of the program.

data

Summary of Project Deliverables

  • Global temperature has become a hot topic over the years, as politicians argue about climate policies and scientists try to understand how the world's climate is actually changing. Temperature data from around the world is an important part of this conversation. In this project, the goal was to analyze my country's local temperature alongside the global temperature, and then compare the two trends.

png1

  • The analysis was performed using data from Udacity's learning database

  • Project specification: refer to this rubric

  • Repository for project code and artifacts: Click here

  • Tool: PostgreSQL & Excel
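
As a rough sketch of the trend comparison described above, a rolling (moving) average can be used to smooth year-to-year noise before comparing local and global temperatures. The column names and values below are invented for illustration, not the project's actual data from Udacity's database:

```python
import pandas as pd

# Hypothetical yearly average temperatures in °C (illustrative values only).
data = pd.DataFrame({
    "year": range(2000, 2010),
    "local_temp": [26.1, 26.3, 26.0, 26.4, 26.5, 26.2, 26.6, 26.7, 26.5, 26.8],
    "global_temp": [14.3, 14.5, 14.4, 14.6, 14.5, 14.7, 14.6, 14.8, 14.7, 14.9],
})

# A 5-year rolling mean smooths short-term fluctuations so the long-term
# trends in the two series are easier to compare.
data["local_ma"] = data["local_temp"].rolling(window=5).mean()
data["global_ma"] = data["global_temp"].rolling(window=5).mean()

print(data[["year", "local_ma", "global_ma"]].dropna())
```

In the project the same smoothing was expressed in SQL/Excel; the window size is a judgment call (larger windows show the trend more clearly but hide recent changes).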

  • Carried out a study that involved going through the data analysis process and seeing how everything fits together. It started by selecting a dataset from a pool of datasets, taking a look at it, and brainstorming what questions could be answered using it. Note: Inferential statistics and machine learning were not required for project completion. The process is demonstrated below (Image credit: Udacity Data Analyst Nanodegree)

png2

  • The analysis was performed using the cleaned TMDB Movies dataset from Kaggle.

  • Project specification: refer to this rubric

  • Repository for project code and artifacts: Click here

  • Tool: Python (pandas, NumPy, Matplotlib)
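
A minimal sketch of the assess-clean-explore loop described above, using a toy movie table (the column names and the "zero budget means missing" convention are assumptions for illustration, not the real TMDB schema):

```python
import pandas as pd

# Toy stand-in for the TMDB movies data (values are invented).
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genre": ["Action", "Drama", "Action", "Comedy"],
    "budget": [100, 40, 0, 60],      # assume 0 encodes a missing budget
    "revenue": [250, 90, 120, 30],
})

# Assess & clean: drop rows whose budget is missing before computing ratios.
clean = movies[movies["budget"] > 0].copy()
clean["roi"] = clean["revenue"] / clean["budget"]

# Explore: which genre had the best average return on investment?
print(clean.groupby("genre")["roi"].mean().sort_values(ascending=False))
```

The point is the loop, not the numbers: each assessment step (here, spotting the zero budgets) feeds a cleaning step, which in turn makes the exploratory question answerable.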

  • Demand for e-commerce has continued to grow as a result of ongoing digitalization and the effects of COVID-19 on the worldwide market. Because of this rise in demand, customers now expect more from online retailers in terms of services. Companies release changes to their e-commerce platforms (website and application) more frequently to cater to this demand, give customers the best possible service, and increase revenue. Although improving its services is one of a company's goals, choosing the best version of a change often requires weighing several options. The most advantageous choice can be discovered by running and evaluating an A/B test.

png3

  • The analysis was performed using cleaned sample e-commerce web A/B test data provided by Udacity

  • Repository for project code and artifacts: Click here

  • Project specification: refer to this rubric & Review

  • Tool: Python (pandas, NumPy, Matplotlib, random)
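
One standard way to evaluate an A/B test like the one above is a two-proportion z-test on conversion rates. The counts below are made up for illustration, not the Udacity dataset:

```python
import math

# Hypothetical conversion counts (invented numbers, not the project data).
control_conv, control_n = 1200, 10000        # old page
treatment_conv, treatment_n = 1290, 10000    # new page

p_control = control_conv / control_n
p_treatment = treatment_conv / treatment_n

# Pooled proportion under the null hypothesis that both pages convert equally.
p_pool = (control_conv + treatment_conv) / (control_n + treatment_n)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / treatment_n))
z = (p_treatment - p_control) / se

print(f"z-statistic: {z:.3f}")
```

A |z| above about 1.96 would reject the null at the 5% level for a two-sided test; with these toy numbers the lift falls just short of that threshold, which is exactly the kind of borderline result an A/B analysis has to interpret carefully.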

  • Real-world data rarely comes clean. For this challenge, I gathered data from a variety of sources and in a variety of formats, assessed its quality and tidiness, then cleaned it. This process is called data wrangling. The dataset that I wrangled (and analyzed and visualized) was the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about each dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10: 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.

png4

  • The analysis was performed using uncleaned data obtained from an SQL database, CSV files, and Twitter

  • Repository for project code and artifacts: Click here

  • Project specification: refer to this rubric & Review

  • Tool: SQL, Python (pandas, NumPy, Matplotlib), Twitter API, BeautifulSoup, Requests, Selenium
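
A typical cleaning step in this kind of wrangling is pulling the "numerator/denominator" rating out of free-form tweet text with a regular expression. The sample tweets below are invented, but the pattern mirrors the rating format described above:

```python
import re

# Invented examples in the WeRateDogs style (not real archive tweets).
tweets = [
    "This is Doug. He's a very good boy. 13/10",
    "Meet Bella. She did a heckin bamboozle. 12/10 would pet",
    "No rating here, just a dog photo",
]

ratings = []
for text in tweets:
    # Capture the first "digits/digits" pattern, e.g. "13/10".
    match = re.search(r"(\d+)/(\d+)", text)
    if match:
        ratings.append((int(match.group(1)), int(match.group(2))))

print(ratings)  # -> [(13, 10), (12, 10)]
```

In practice the real archive needs more care than this sketch suggests, e.g. tweets containing dates like 9/11 or multiple fractions, which is precisely why the assess step of wrangling matters.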

  • Data visualization is an important skill used in many parts of the data analysis process, and it divides into two parts. Exploratory data visualization generally occurs during and after the data wrangling process, and is the main method you use to understand the patterns and relationships present in your data; this understanding helps you approach statistical analyses, build conclusions and findings, and may also illuminate additional data cleaning tasks. Explanatory data visualization techniques are used after generating your findings, to help communicate your results to others; understanding design considerations ensures your message is clear and effective. In addition to making you a better producer of visualizations, this project also helps you become a better consumer of visualizations presented to you by others. Note: Both types of visualization were employed in this project

png5

  • The analysis was performed using flight data from amstat

  • Repository for project code and artifacts: Click here

  • Project specification: refer to this rubric & Review

  • Tool: Python (pandas, NumPy, Matplotlib, seaborn)
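
The exploratory/explanatory distinction above can be illustrated with the same data plotted twice: once quickly to inspect it, once polished to communicate a finding. The delay values below are invented, not the project's flight dataset:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Toy arrival-delay values in minutes (illustrative only).
delays = [5, -3, 12, 45, 0, 7, 60, -5, 15, 22, 3, 8]
mean_delay = sum(delays) / len(delays)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Exploratory: a quick, unadorned histogram just to see the distribution.
ax1.hist(delays, bins=6)
ax1.set_title("Exploratory: raw distribution")

# Explanatory: the same data, labeled and annotated to make one point.
ax2.hist(delays, bins=6, color="steelblue")
ax2.axvline(mean_delay, color="red", linestyle="--",
            label=f"mean = {mean_delay:.1f} min")
ax2.set_title("Explanatory: mean delay highlighted")
ax2.set_xlabel("Arrival delay (minutes)")
ax2.legend()

fig.savefig("delay_hist.png")
```

The left panel is what you draw for yourself while wrangling; the right panel is what you would put in front of an audience, with the single takeaway (the mean delay) made explicit.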