Plots are used to convey different ideas. Depending on the objective of the visualisation task, an appropriate plot is chosen.
Two types of variables help you create charts and plots effectively:
1. Facts
2. Dimensions
Facts and dimensions are different types of variables that help you interpret data better.
Facts are numerical data, and dimensions are metadata, that is, the additional information that describes a fact variable. Both facts and dimensions are equally important for generating actionable insights from a given data set.
For example, in a data set about the height of students in a class, the height of the students would be a fact variable, whereas the gender of the students would be a dimensional variable. Dimensions are used to slice data for easier analysis. In this case, the distribution of height based on the gender of a student can be studied.
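The slicing described above can be sketched in a few lines of Python. This is only an illustration: the records, the field names `gender` and `height_cm`, and the helper `slice_by` are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: height_cm is the fact, gender is the dimension.
students = [
    {"gender": "F", "height_cm": 158},
    {"gender": "M", "height_cm": 172},
    {"gender": "F", "height_cm": 164},
    {"gender": "M", "height_cm": 168},
    {"gender": "F", "height_cm": 161},
]


def slice_by(records, dimension, fact):
    """Group the fact values by each distinct value of the dimension."""
    groups = defaultdict(list)
    for record in records:
        groups[record[dimension]].append(record[fact])
    return dict(groups)


heights_by_gender = slice_by(students, "gender", "height_cm")

# One summary number per slice; the full lists could feed a plot instead.
mean_height_by_gender = {g: mean(v) for g, v in heights_by_gender.items()}
print(mean_height_by_gender)
```

Each list in `heights_by_gender` is the height distribution for one gender, which is the quantity a per-gender plot would show.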
A bar graph is helpful when you need to visualise a numeric feature (fact) across multiple categories. Using the bar graph, one can easily distinguish between the performance of the categories.
A scatter plot, as the name suggests, displays how the data points are spread across the range considered. It can be used to identify a relationship or pattern between two quantitative variables and to spot outliers among them.
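A scatter plot shows a relationship visually. One common way to put a number on the same relationship is the Pearson correlation coefficient, which is an addition here rather than something the text prescribes. The sketch below computes it by hand on made-up marketing and sales figures; the function name `pearson_r` and the data are assumptions for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: each (spend, revenue) pair is one point on the scatter plot.
marketing_spend = [10, 20, 30, 40, 50]
sales_revenue = [25, 41, 62, 79, 103]

r = pearson_r(marketing_spend, sales_revenue)
print(round(r, 3))
```

A value of `r` close to 1 corresponds to points that lie near an upward-sloping straight line; a value near 0 suggests no linear pattern.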
A line graph is used to present continuous time-dependent data. It depicts the trend of a variable over a specified time period, so it is helpful whenever you want to see how a particular variable changes over time. Financial markets and weather forecasting are among the industries and services that rely on line graphs.
A histogram is a frequency chart that records how many values in a data set fall within each interval (bin). It can be useful when one wants to understand the distribution of a given series.
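The counting behind a histogram can be sketched as follows. The scores, the bin width of 10, and the helper name `histogram_counts` are assumptions chosen for the example; real plotting libraries offer many more binning options.

```python
def histogram_counts(values, bin_width, start):
    """Count how many values fall into each [left, left + bin_width) interval."""
    counts = {}
    for v in values:
        # Left edge of the bin that contains v.
        left = start + ((v - start) // bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical exam scores.
scores = [52, 67, 71, 58, 90, 85, 73, 66, 79, 94]
counts = histogram_counts(scores, bin_width=10, start=50)

# A rough text rendering: one '#' per value in the bin.
for left, n in counts.items():
    print(f"{left}-{left + 9}: {'#' * n}")
```

The height of each bar in the plotted histogram corresponds to one entry of `counts`.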
Box plots are quite effective in summarising the spread of a large data set in a single visual representation. They use percentiles to divide the data range. The percentile value of a data point gives the proportion of data points that fall below it when all the data points are arranged in ascending order. For example, if a data point with a value of 700 has a percentile value of 99% in a data set, then 99% of the values in the data set are less than 700.
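The percentile example can be checked with a short sketch. It uses the "share of values strictly below the point" definition from the paragraph above; the data set and the name `percentile_rank` are hypothetical.

```python
def percentile_rank(values, point):
    """Percentage of values that are strictly less than `point`."""
    below = sum(1 for v in values if v < point)
    return 100 * below / len(values)


# Hypothetical data set: 100 values from 7 to 700 in steps of 7.
data = [7 * i for i in range(1, 101)]

rank_of_700 = percentile_rank(data, 700)
print(rank_of_700)
```

Here 99 of the 100 values are smaller than 700, so 700 sits at the 99th percentile of this data set.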
Box plots highlight three important features of the data, which are as follows:
Median value: This is the value that divides the sorted data into two equal halves, i.e., the 50th percentile.
Interquartile range (IQR): This is the range between the 25th and 75th percentile values, which contains the middle 50% of the data points.
Outliers: These are data points that differ significantly from other observations and lie beyond the whiskers.
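The three quantities can be computed with Python's standard `statistics` module, as in the sketch below. Two choices are assumptions rather than anything stated above: the whiskers are placed at 1.5 × IQR beyond the quartiles (a common convention, not the only one), and the quartiles use the `inclusive` interpolation method. The data and the name `box_plot_summary` are made up.

```python
from statistics import quantiles


def box_plot_summary(values, whisker_factor=1.5):
    """Median, quartiles, IQR and outliers for a box plot.

    Assumes whiskers extend whisker_factor * IQR beyond the quartiles.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical measurements with one unusually large value.
data = [12, 15, 14, 10, 18, 16, 13, 17, 11, 45]
summary = box_plot_summary(data)
print(summary)
```

With a different whisker rule or quantile method the exact numbers can shift slightly, so the result is best read as one reasonable summary rather than the only one.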
A pie chart is a pictorial representation of data in the form of a circular chart, or pie, where the slices show the relative size of the data. It needs a numerical variable along with a categorical variable: each category becomes a slice, and the numerical value sets its size. The arc length of each slice, and consequently its area and central angle, is proportional to the quantity it represents.
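The proportionality can be made concrete by computing the central angle of each slice: a slice's share of the total, multiplied by 360 degrees. The regional sales figures and the name `pie_angles` below are hypothetical.

```python
def pie_angles(quantities):
    """Central angle in degrees for each category, proportional to its quantity."""
    total = sum(quantities.values())
    return {label: 360 * q / total for label, q in quantities.items()}


# Hypothetical sales by region: category (dimension) -> quantity (fact).
sales = {"North": 50, "South": 30, "East": 20}
angles = pie_angles(sales)
print(angles)
```

The angles always add up to 360 degrees, so a category with half of the total occupies half of the circle.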
To choose a plot type, first define the objective of creating the plot. Based on the objective, charts fall into the following categories:
- Comparison charts can be used when you want to compare one set of values with other sets of values.
- The objective is to differentiate one particular set of values from the other sets.
- Example: quarterly sales of competing phones in the market.
- Charts Used - Column Chart, Bar Chart
- Composition charts can be used to display how the various elements make up the complete data.
- Composition charts can be (a) static, showing the composition at a particular instant of time, or (b) dynamic, showing the changes in the composition over a period of time.
- Charts Used - Doughnut Chart, Stacked Column Chart
- Relationship charts help in visualising the correlation between variables.
- They can help in answering questions such as: (a) Is there a correlation between the amount spent on marketing and the sales revenue? (b) How does the gross profit vary with the change in offers?
- Charts Used - Scatter Plot, Bubble Chart
- Distribution charts try to answer the question: how is the data distributed?
- The distribution can be over a variable, or it can also be over a period of time.
- Charts Used - Histogram, Scatter Plot
- Histograms are quite good at displaying the distribution of data over intervals.
- Scatter plots are good at visualising the distribution of data over two different variables.
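
The objective-to-chart guidance in this list can be captured as a small lookup table. The chart names come from the list itself. The objective labels used as keys ("comparison", "composition", "relationship", "distribution") and the helper `suggest_charts` are naming assumptions made for this sketch.

```python
# Objective labels are assumed names; chart lists mirror the bullets above.
CHARTS_BY_OBJECTIVE = {
    "comparison": ["Column Chart", "Bar Chart"],
    "composition": ["Doughnut Chart", "Stacked Column Chart"],
    "relationship": ["Scatter Plot", "Bubble Chart"],
    "distribution": ["Histogram", "Scatter Plot"],
}


def suggest_charts(objective):
    """Return the candidate chart types for a visualisation objective."""
    key = objective.strip().lower()
    if key not in CHARTS_BY_OBJECTIVE:
        known = ", ".join(sorted(CHARTS_BY_OBJECTIVE))
        raise ValueError(f"Unknown objective {objective!r}; expected one of: {known}")
    return CHARTS_BY_OBJECTIVE[key]


print(suggest_charts("Relationship"))
```

A table like this is only a starting point: the final choice still depends on the data at hand, for example whether a distribution is over one variable (histogram) or two (scatter plot).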