Data Visualization Project

Data

The data I propose to visualize for my project is the Udemy Courses dataset.

Prototypes

I’ve created a proof-of-concept visualization of this data. It is a bar chart showing the 10 most-subscribed courses and the number of subscribers for each.

image

Questions & Tasks

The following tasks and questions will drive the visualization and interaction decisions for this project:

  1. Which is the most popular subject on the Udemy website?
  2. Are the number of subscribers and the number of reviews correlated?
  3. What are the top 10 most popular courses?

Sketches

image

This sketch counts how many courses each subject has. From the bar chart, we can see which subject has the most courses, which is one measure of the most popular subject on the Udemy website.
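The per-subject count behind this bar chart could be computed roughly as follows. This is a minimal sketch, not the project's actual code: the rows are made-up placeholders, and the field name `subject` is an assumption about how the dataset labels each course's subject.

```python
from collections import Counter

# Hypothetical rows for illustration only; the real dataset's
# column names and values may differ.
courses = [
    {"title": "Course A", "subject": "Web Development"},
    {"title": "Course B", "subject": "Business Finance"},
    {"title": "Course C", "subject": "Web Development"},
    {"title": "Course D", "subject": "Musical Instruments"},
]


def count_courses_by_subject(rows):
    """Return (subject, course_count) pairs, largest count first."""
    return Counter(row["subject"] for row in rows).most_common()


subject_counts = count_courses_by_subject(courses)
```

Each pair in `subject_counts` maps directly to one bar: the subject is the label and the count is the bar height.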

image

This sketch is a scatter plot of the number of reviews against the number of subscribers. From the scatter plot, we can roughly see whether the two variables are correlated.
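To back up the visual impression with a number, one option is the Pearson correlation coefficient. The sketch below is an illustration under assumptions: the function name `pearson_r` is hypothetical, and the inputs are expected to be two equal-length lists, for example subscriber counts and review counts per course.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 suggests a strong positive linear relationship, a value near 0 suggests little linear relationship, and a value near -1 suggests a strong negative one. Pearson's coefficient only captures linear association, so it complements the scatter plot rather than replacing it.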

Open Questions

  • The scatter plot may need outliers removed before it is readable. I don't yet know exactly how to remove them.
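One common approach to this open question is the interquartile range (IQR) rule: drop any point that lies more than 1.5 × IQR below the first quartile or above the third quartile. The sketch below is one possible way to do it, not a settled choice for this project. The function names and the `(subscribers, reviews)` pair layout are assumptions made for illustration.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) cut-offs using the k * IQR rule."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(points, k=1.5):
    """Keep only (x, y) points that are within bounds on both axes."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_lo, x_hi = iqr_bounds(xs, k)
    y_lo, y_hi = iqr_bounds(ys, k)
    return [
        (x, y)
        for x, y in points
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    ]
```

Because counts such as subscribers are often heavily skewed, plotting both axes on a log scale is an alternative worth comparing, since it keeps every course visible instead of discarding the largest ones.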

Schedule of Deliverables

  1. Which is the most popular subject on the Udemy website?
     • Count the number of courses in each subject.
     • Find the total number of subscribers for each subject.
  2. Are the number of subscribers and the number of reviews correlated?
     • Draw a scatter plot of these two variables.
     • Remove outliers if necessary.
  3. What are the top 10 most popular courses?
     • Define popularity.
     • Draw a bar chart of the top 10 most popular courses.
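For the top 10 task, the selection step could look like the sketch below. It assumes popularity is defined as subscriber count, matching the proof-of-concept bar chart; that definition is still open, and the field name `num_subscribers` is an assumption about the dataset's columns.

```python
from heapq import nlargest


def top_courses(rows, n=10, key="num_subscribers"):
    """Return the n rows with the largest value for `key`, largest first."""
    return nlargest(n, rows, key=lambda row: row[key])
```

Passing a different `key`, such as a review-count field, would rank courses by another definition of popularity without changing the rest of the chart code.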

links:

Interaction

Add a menu to the bar chart so that when we select a variable, the corresponding variable is highlighted in the chart.

Future work

  • Remove outliers from the dataset before drawing scatter plots of two different variables.
  • Draw a dimension-view plot to reflect the popularity of each course.