Data Collection and Visualization Overview

Definition

Data Collection is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes.

The goal for all data collection is to capture quality evidence that allows analysis to lead to the formulation of convincing and credible answers to the questions that have been posed.

Data Visualization is an interdisciplinary field that deals with the graphic representation of data and information.

It is a particularly efficient way of communicating when the data or information is voluminous, as for example in a time series.

The goal is to communicate information clearly and efficiently to users. It is one of the steps in data analysis or data science.

History

Contrary to general belief, data visualization is not a modern development.

Stellar data, such as the locations of stars, have been visualized on the walls of caves (such as those found in Lascaux Cave in southern France) since the Pleistocene era.

The first documented data visualization can be traced back to 1160 B.C. with the Turin Papyrus Map, which accurately illustrates the distribution of geological resources and provides information about the quarrying of those resources.

The recent emphasis on visualization started in 1987 with the special issue of Computer Graphics on Visualization in Scientific Computing.

The invention of paper and parchment allowed further development of visualizations throughout history.

By the 16th century, techniques and instruments for precise observation and measurement of physical quantities, and geographic and celestial position were well-developed.

French philosophers and mathematicians René Descartes and Pierre de Fermat developed analytic geometry and the two-dimensional coordinate system, which heavily influenced the practical methods of displaying and calculating values.

In the second half of the 20th century, Jacques Bertin used quantitative graphs to represent information "intuitively, clearly, accurately, and efficiently".

John Tukey and Edward Tufte pushed the bounds of data visualization: Tukey with his new statistical approach of exploratory data analysis, and Tufte with his book "The Visual Display of Quantitative Information". Together they paved the way for refining data visualization techniques for a wider audience than statisticians alone.

Data Collection Methods

Direct observation.

1-on-1 interviews.

Open-ended surveys and questionnaires.

Closed-ended surveys and quizzes.

Focus groups.
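As an illustration of how answers gathered with a closed-ended survey might be prepared for analysis, the short Python sketch below tallies the chosen options. The function name tally_responses and the sample answers are hypothetical and exist only for this example; real survey data would usually need more careful cleaning.

```python
from collections import Counter


def tally_responses(responses):
    """Count how often each answer option was chosen, ignoring blank answers."""
    # Normalise case and surrounding whitespace so "Yes" and "yes " count together.
    cleaned = [r.strip().lower() for r in responses if r and r.strip()]
    return Counter(cleaned)


# Hypothetical answers to a closed-ended question with options yes / no / unsure.
answers = ["Yes", "no", "yes ", "Unsure", "", "NO", "yes"]
counts = tally_responses(answers)

# Print the options from most to least frequently chosen.
for option, n in counts.most_common():
    print(f"{option}: {n}")
```

Counts of this kind are the typical input for a bar chart or pie chart.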

Data Visualization Techniques

Bar Chart: It presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent.

Histogram: It is an approximate representation of the distribution of numerical data.

Scatter plot: It uses Cartesian coordinates to display values for typically two variables for a set of data.

Pie Chart: It is a circular chart divided into slices, where the arc length of each slice is proportional to the quantity it represents.

Network: It displays relationships between entities as nodes connected by links, which helps to reveal clusters in a network.

Line Chart: It represents information as a series of data points called 'markers' connected by straight line segments.

Heat Map and many more.
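Several of the charts listed above rest on a small calculation that happens before anything is drawn. The following Python sketch shows two of them under simple assumptions: equal-width binning for a histogram, and the conversion of category totals into slice angles for a pie chart. The function names histogram_counts and pie_angles are hypothetical helpers written for this illustration; plotting libraries perform similar steps internally, possibly with different binning rules.

```python
def histogram_counts(values, num_bins):
    """Split the range of values into equal-width bins and count values per bin."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / num_bins or 1.0
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin instead of overflowing.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def pie_angles(category_totals):
    """Return the angle in degrees of each pie slice, proportional to its total."""
    total = sum(category_totals.values())
    return {name: 360.0 * value / total for name, value in category_totals.items()}


# Hypothetical sample data, for illustration only.
print(histogram_counts([1, 2, 2, 3, 5, 8, 9, 10], 3))
print(pie_angles({"a": 1, "b": 1, "c": 2}))
```

The histogram groups numerical values into ranges, while the pie chart expresses each category as a share of the whole, which is why the slice angles always sum to 360 degrees.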

Applications

Data collection and visualization are used in almost all areas. Examples include:

Health sector

Data Mining

Financial data analysis

Market Studies and many more.

Zindi has hosted some challenges based on Data Collection and Visualization.

List of Data Collection and Visualization solutions from our hosted challenges:

  1. AFD Solutions for Gender based Violence Challenge

  2. Hack the Continent Open Buildings Challenge