COVID-19 Prediction with Visualization

Kaggle link: https://www.kaggle.com/jinghuiwong/arima-eda-with-map-visualization

Analysis of Novel Corona Virus 2019 (COVID-19) Dataset

Objective

We would like to analyse the impact of COVID-19 on different countries across the world. By examining the historical and current numbers of confirmed, death, and recovered cases, we can assess the growth rate of COVID-19 in each country and determine whether its spread there is slowing or accelerating. We would also like to predict the future rate of spread and the number of deaths using various time series models.
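One simple way to judge whether the spread is slowing or accelerating is to compare each day's new cases with the previous day's. The sketch below is only an illustration of that idea, not the method used in the notebook: the function name `growth_factors` is hypothetical and the numbers are made up. A factor above 1 suggests acceleration, a factor below 1 suggests slowing.

```python
def growth_factors(cumulative):
    """Return the ratio of each day's new cases to the previous day's new cases.

    `cumulative` is a list of cumulative confirmed counts, one per day.
    A ratio is None when the previous day had no new cases (division undefined).
    """
    # Day-over-day increments of the cumulative series.
    new_cases = [curr - prev for prev, curr in zip(cumulative, cumulative[1:])]

    factors = []
    for prev, curr in zip(new_cases, new_cases[1:]):
        factors.append(curr / prev if prev > 0 else None)
    return factors


# Made-up cumulative counts for one country over five days (illustrative only).
confirmed = [100, 150, 225, 300, 350]
for day, factor in enumerate(growth_factors(confirmed), start=2):
    print(f"day {day}: growth factor = {factor}")
```

A single day's factor is noisy, so in practice it would likely be averaged over several days before drawing conclusions.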


Context

From the World Health Organization: On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.

Daily-level information on the affected people can therefore yield useful insights once it is made available to the broader data science community.

Johns Hopkins University has built an excellent dashboard using the affected cases data. This data is extracted from the same link and made available in CSV format.


Content

2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC

This dataset has daily-level information on the number of affected cases, deaths, and recoveries from the 2019 novel coronavirus. Please note that this is time series data, so the number of cases on any given day is the cumulative number up to that day.
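Because the counts are cumulative, daily new cases have to be derived by differencing consecutive days. The following is a minimal sketch of that step under stated assumptions: the helper name `daily_new` is hypothetical and the sample values are made up, not taken from the dataset.

```python
def daily_new(cumulative):
    """Convert a cumulative series into per-day increments.

    The first value is kept unchanged, since the series has no earlier
    day to subtract from it.
    """
    if not cumulative:
        return []
    daily = [cumulative[0]]
    for prev, curr in zip(cumulative, cumulative[1:]):
        # A negative increment is possible if a total is revised downward.
        daily.append(curr - prev)
    return daily


# Made-up cumulative confirmed counts for four consecutive days.
cumulative_confirmed = [10, 15, 15, 27]
print(daily_new(cumulative_confirmed))  # [10, 5, 0, 12]
```

The same differencing applies to the death and recovery columns, since they are cumulative as well.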

The data is available from 22 Jan, 2020.

Data at individual level obtained from the below two sources


Acknowledgements

Johns Hopkins University has made the data available in Google Sheets format here. Sincere thanks to them.

Thanks to WHO, CDC, NHC and DXY for making the data available in the first place.