/covid-vaccinations-python

Basic Python program that loads the COVID vaccination data sets available on Kaggle and shows national progress.


Covid Vaccinations vs population

Tracking Covid vaccinations across the globe. How far have we travelled along the vaccination journey?

Vaccinations Scatter plot

Datasets

Platform

I used the Kaggle Python Container Image. It is way overkill for this task.

  • Kaggle Python Container Image: one data science Docker container to rule them all
  • Any reasonable Anaconda or Jupyter Notebook environment will also work.

Genesis / Credits

Execution with provided script

The included bash script will download the data and run the server.

  1. Open a terminal and cd to this directory
  2. Execute bash start-kaggle-container.sh. It will:
    1. download the data
    2. download the docker image
    3. run the container and Jupyter notebook server
  3. Open a browser to http://localhost:8080/
  4. Open and run code/vaccinations_by_country.ipynb in the Jupyter Notebook browser view in the left pane.
    1. It will prompt you for city or state and pick the correct data file based on your prompt

Manual Execution

  1. Open a terminal and cd to this directory

  2. Make a directory in this directory called data

  3. Download the CSV file from GitHub and put it in data/vaccinations.csv

  4. Download the global vaccination data and the US state vaccination data. This can be done from inside the notebook or from the command line:

    curl https://covid.ourworldindata.org/data/vaccinations/vaccinations.csv -o data/vaccinations_world.csv
    curl https://covid.ourworldindata.org/data/vaccinations/us_state_vaccinations.csv -o data/vaccinations_state.csv
  5. Start your Jupyter server

    1. You can use any environment: local, Docker, etc.
    2. I use the Kaggle Python Docker image by running bash start-kaggle-container.sh in this directory. It will download the container (18 GB) and start the Jupyter server.
  6. Open the Jupyter Notebook server.

    1. Open a browser to http://localhost:8080/ or wherever your notebook server is located
    2. Open and run vaccinations_by_country.ipynb in the Jupyter Notebook browser view in the left pane.
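Step 4 notes that the download can also be done from inside the notebook. The sketch below shows one way to do that with only the Python standard library. The URLs and target file names are copied from the curl commands above; the function name `download_datasets` and its parameters are hypothetical and are not taken from the notebook.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

# Target paths and URLs copied from the curl commands in step 4.
DATASETS = {
    "data/vaccinations_world.csv":
        "https://covid.ourworldindata.org/data/vaccinations/vaccinations.csv",
    "data/vaccinations_state.csv":
        "https://covid.ourworldindata.org/data/vaccinations/us_state_vaccinations.csv",
}


def download_datasets(datasets=DATASETS, base_dir="."):
    """Fetch each URL and write it to its target path under base_dir.

    Hypothetical helper: the notebook may do this differently.
    """
    written = []
    for relative_path, url in datasets.items():
        target = Path(base_dir) / relative_path
        # Create the data/ directory if it does not exist yet (step 2).
        target.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(url) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
        written.append(target)
    return written
```

Calling `download_datasets()` in a notebook cell started from this directory should leave both CSV files under data/, assuming the URLs are still served.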

Shutting down the server

There are a couple of ways to terminate the server:

  1. Press Ctrl-C in the terminal window and answer Y
  2. Terminate the server in the Jupyter Notebook menu in the browser window

Demo: order of operations

Source data may be missing days and columns

Sample Data 2 Countries

We add the missing days and interpolate or fill the missing cell values in vaccinations_by_country.ipynb.
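The two adjustments can be illustrated with a small standard-library sketch: first insert a row for every calendar day that is absent, then linearly interpolate the gaps between known values. This is an illustration under assumptions, not the notebook's code; the function names are hypothetical, the sample values are made up, and the notebook may use a different fill or interpolation rule.

```python
from datetime import date, timedelta


def fill_missing_days(records):
    """Return a dict covering every day from the first to the last date.

    records maps date -> value; days that were absent get the value None.
    """
    first, last = min(records), max(records)
    filled = {}
    day = first
    while day <= last:
        filled[day] = records.get(day)
        day += timedelta(days=1)
    return filled


def interpolate_linear(values):
    """Linearly fill None gaps that lie between two known values.

    Leading and trailing None entries are left untouched.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result


# Made-up example: two reported days with a two-day gap between them.
reported = {date(2021, 2, 1): 100, date(2021, 2, 4): 400}
daily = fill_missing_days(reported)
print(interpolate_linear(daily.values()))  # [100, 200.0, 300.0, 400]
```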

loading and adjusting the data flow

Sample results from the various data phases (19 Feb 2021 data set)

| Phase | Number of Records | Daily Vaccinations (populated) | Total Vaccinations (populated) | Vaccinated per 100 (populated) |
|---|---|---|---|---|
| Initial Load | 3679 | 3542 | 2461 | 1367 |
| Post row fill | 6831 | 3432 | 2461 | 1367 |
| Post value interpolation | 6831 | 3868 | 4008 | 1837 |
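Counts like these can be reproduced at any phase by tallying the rows and the non-empty cells per column. The following is a minimal sketch using the standard csv module. The function name is hypothetical, the column names `daily_vaccinations` and `total_vaccinations` are assumptions about the CSV header, and the sample rows are made up for illustration.

```python
import csv
import io


def populated_counts(csv_text, columns):
    """Return (number of records, {column: count of non-empty cells})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in columns}
    total = 0
    for row in reader:
        total += 1
        for name in columns:
            # Treat missing keys, empty strings, and whitespace as unpopulated.
            if (row.get(name) or "").strip():
                counts[name] += 1
    return total, counts


# Made-up sample, not taken from the real data set.
sample = """location,date,daily_vaccinations,total_vaccinations
A,2021-02-17,10,100
A,2021-02-18,,
A,2021-02-19,12,
"""
print(populated_counts(sample, ["daily_vaccinations", "total_vaccinations"]))
# (3, {'daily_vaccinations': 2, 'total_vaccinations': 1})
```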