Monkeypox 2022 repository

Monkeypox data

Contents

Data are updated Monday to Friday.

Data changes

  • 2022-07-07: Only confirmed cases for Brazil are reported
  • 2022-07-11: From this date, data files (latest.csv, timeseries-*.csv) include cases from the current outbreak as well as cases from countries where MPXV is endemic. The two lists are distinguished by the first letter of the ID, which is a string: N denotes cases from the current outbreak (equivalent to the current list), and E denotes cases from endemic countries.
  • 2022-07-22: Endemic data has been updated to accurately reflect confirmed/suspected/total cases from source reporting.
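The N/E prefix convention described in the 2022-07-11 entry can be used to separate the two lists after loading. The following is a minimal sketch, not part of the repository's own tooling; it assumes the identifier column is named `ID`, and the sample rows (including the `Country` column) are made up for illustration.

```python
import pandas as pd


def split_by_list(df: pd.DataFrame, id_column: str = "ID"):
    """Split a line list into current-outbreak (N...) and endemic (E...) rows.

    The column name "ID" is an assumption; adjust id_column if the file differs.
    """
    ids = df[id_column].astype(str)
    outbreak = df[ids.str.startswith("N")]
    endemic = df[ids.str.startswith("E")]
    return outbreak, endemic


# Made-up sample rows, used only to demonstrate the split.
sample = pd.DataFrame({"ID": ["N1", "N2", "E1"], "Country": ["A", "B", "C"]})
outbreak, endemic = split_by_list(sample)
```

Passing the data frame loaded from latest.csv in place of `sample` should give the same kind of split, provided the column name matches.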

This repository contains dated records of curated Monkeypox cases from the 2022 outbreak (April 2022 onward), a data dictionary, and a script used to pull contents from a spreadsheet into JSON and CSV files.

The script is intended for use by the curation team and supporting engineers. It requires access to the relevant Google Sheet, and a Google Cloud service account.

The data dictionary contains information about columns/fields in the data sets.

The archives folder contains dated JSON and CSV files, which are currently uploaded manually. Regularly and automatically updated data sets live in a (currently private) S3 bucket.

The analytics folder contains scripts that use the curated data set. This currently includes an R file that finds the risk of re-identification based on curated data.

There is also a daily briefing report generated from this data at https://www.monkeypox.global.health

Getting the data

Line list (CSV): https://raw.githubusercontent.com/globaldothealth/monkeypox/main/latest.csv
Line list (JSON): https://raw.githubusercontent.com/globaldothealth/monkeypox/main/latest.json

Timeseries: https://raw.githubusercontent.com/globaldothealth/monkeypox/main/timeseries-confirmed.csv
Timeseries by country: https://raw.githubusercontent.com/globaldothealth/monkeypox/main/timeseries-country-confirmed.csv

Python (requires pandas):

import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/globaldothealth/monkeypox/main/latest.csv")
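The other CSV files listed above share the same base URL, so a small helper can build the address from a file name. This is a hedged sketch: `BASE_URL`, `data_url`, and `load_csv` are hypothetical names introduced here, and the demonstration rows (including the `Status` column) are made up so the example runs without network access.

```python
import io

import pandas as pd

# Base of the raw file URLs listed in this section.
BASE_URL = "https://raw.githubusercontent.com/globaldothealth/monkeypox/main"


def data_url(filename: str) -> str:
    """Return the raw GitHub URL for a data file such as "latest.csv"."""
    return f"{BASE_URL}/{filename}"


def load_csv(source) -> pd.DataFrame:
    """Load a CSV from a URL, a local path, or a file-like object."""
    return pd.read_csv(source)


# Offline demonstration with made-up rows.
demo = load_csv(io.StringIO("ID,Status\nN1,confirmed\nE1,suspected\n"))

# With network access, the published files can be loaded the same way:
# timeseries = load_csv(data_url("timeseries-confirmed.csv"))
```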

R:

df <- read.csv("https://raw.githubusercontent.com/globaldothealth/monkeypox/main/latest.csv")

Contributing

If you would like to request changes, open an issue on this repository and we will happily consider your request. If you are requesting a fix, please include steps to reproduce the undesirable behavior.

If you would like to contribute, assign an issue to yourself and/or reach out to a contributor and we will happily help you help us.

If you want to send data to us, you can use our template at monkeypox-template.csv, which makes it easier for us to add to our list. Open an issue in this repository and attach a CSV or XLSX file, or email data to info@global.health. Remove any Personally Identifiable Information before sending.
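Before attaching a file, it may help to compare its columns with those of the template. The sketch below is one possible check, not a tool provided by this repository; `compare_columns` is a hypothetical helper, and every column name in the example is made up rather than taken from monkeypox-template.csv.

```python
import pandas as pd


def compare_columns(submission: pd.DataFrame, template: pd.DataFrame):
    """Return (missing, unexpected) column names relative to the template."""
    missing = [c for c in template.columns if c not in submission.columns]
    unexpected = [c for c in submission.columns if c not in template.columns]
    return missing, unexpected


# Hypothetical column names, for illustration only.
template = pd.DataFrame(columns=["ID", "Country", "Date_confirmation"])
submission = pd.DataFrame(
    {"ID": ["N1"], "Country": ["A"], "Patient_name": ["redact me"]}
)

missing, unexpected = compare_columns(submission, template)
```

Columns reported as unexpected are worth reviewing by hand, since they may hold Personally Identifiable Information that should be removed.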

Visualizations

License and attribution

This repository and data exports (except files in the ecdc folder) are published under the CC BY 4.0 license.

Please cite as: "Global.health Monkeypox (accessed on YYYY-MM-DD)"

and

Kraemer, Tegally, Pigott, Dasgupta, Sheldon, Wilkinson, Schultheiss, et al. Tracking the 2022 Monkeypox Outbreak with Epidemiological Data in Real-Time. The Lancet Infectious Diseases. https://doi.org/10.1016/S1473-3099(22)00359-0.

For files in the ecdc folder, please cite (reproduction is authorized, provided the source is acknowledged):

European Centre for Disease Prevention and Control/WHO Regional Office for Europe. Monkeypox, Joint Epidemiological overview, {day} {month}, 2022