Last updated: Mon 01 Feb 2021 (16:15:31 UTC [+0000])
- Data available from 26 Feb 2020 until 01 Feb 2021 (342 days).
- Download the user-friendly data from: covid19pt_DSSG_Long.csv or use the following direct link in your program: https://raw.githubusercontent.com/CEAUL/Dados_COVID-19_PT/master/data/covid19pt_DSSG_Long.csv
- Variables
  - `data`: Date (Portuguese spelling).
  - `origVars`: Variable name taken from source data.
  - `origType`: Original variable count type.
  - `other`: Other types of `origVars`.
  - `symptoms`: Recorded COVID-19 symptoms.
  - `sex`: Gender (`F` - Females, `M` - Males, `All` - Females & Males).
  - `ageGrp`: Age groups in years (`desconhecidos` - unknown).
  - `ageGrpLower`: Lower limit of age group (useful for sorting).
  - `ageGrpUpper`: Upper limit of age group.
  - `region`: Portuguese regions.
  - `value`: Numeric value.
  - `valueUnits`: Units for the variable `value`.
- Download the original unprocessed data (json to CSV) from: covid19pt_DSSG_Orig.csv
For more information about the data and variables see: https://github.com/dssg-pt/covid19pt-data
The original data were downloaded from an API provided by VOST: https://covid19-api.vost.pt/Requests/get_entry/
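To illustrate how the variables listed above combine in the long format, here is a minimal sketch on a few made-up rows. The values are invented, and the `obitos_m` name is assumed by analogy with `obitos_f`; only the column names follow the list above.

```r
library(data.table)

# Hypothetical rows in the long format (values are made up for illustration).
long <- data.table(
  data     = as.Date(c("2021-01-31", "2021-01-31", "2021-01-31", "2021-02-01")),
  origVars = c("obitos", "obitos_f", "obitos_m", "obitos"),
  origType = "obitos",
  sex      = c("All", "F", "M", "All"),
  ageGrp   = "",
  region   = "Portugal",
  value    = c(100, 48, 52, 103)
)

# National totals for both sexes combined: one row per date.
national <- long[origType == "obitos" & sex == "All" & ageGrp == "" & region == "Portugal"]
national[, .(data, value)]
```

Because each row carries its own `sex`, `ageGrp` and `region`, a single series is selected by fixing all three, which is the pattern the plots in this README rely on.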
Date | Cases (7 Day Mean) | Active Cases | Deaths (7 Day Mean) |
---|---|---|---|
Sat 23 Jan 2021 | 15333 (12150.4) | 162951 | 274 (212.1) |
Sun 24 Jan 2021 | 11721 (12341.3) | 169230 | 275 (229.7) |
Mon 25 Jan 2021 | 6923 (12372.9) | 170635 | 252 (241.9) |
Tue 26 Jan 2021 | 10765 (12417.1) | 167381 | 291 (252.3) |
Wed 27 Jan 2021 | 15073 (12478.0) | 172893 | 293 (262.9) |
Thu 28 Jan 2021 | 16432 (12890.6) | 180076 | 303 (274.6) |
Fri 29 Jan 2021 | 13200 (12778.1) | 181811 | 278 (280.9) |
Sat 30 Jan 2021 | 12435 (12364.1) | 179939 | 293 (283.6) |
Sun 31 Jan 2021 | 9498 (12046.6) | 181623 | 303 (287.6) |
Mon 01 Feb 2021 | 5805 (11886.9) | 179180 | 275 (290.9) |
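The bracketed 7 day means in the table can be checked by hand. The sketch below recomputes them for the daily cases column in base R, using `stats::filter` as a stand-in for the `data.table::frollmean` call used elsewhere in this README; it assumes the mean is a trailing one (the current day plus the six days before it).

```r
# Daily new cases, 23 Jan to 01 Feb 2021, copied from the table above.
cases <- c(15333, 11721, 6923, 10765, 15073, 16432, 13200, 12435, 9498, 5805)

# Trailing 7 day mean: each value averages the current day and the 6 before it.
# The first 6 entries are NA because a full window is not yet available.
mean7 <- as.numeric(stats::filter(cases, rep(1 / 7, 7), sides = 1))

round(tail(mean7, 4), 1)
# 12778.1 12364.1 12046.6 11886.9
```

These four values match the means reported for 29 Jan to 01 Feb 2021, which supports the trailing-window reading.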
Using the data.table package to process the data.
# Load Libraries
library(data.table)
library(here)
# Read in data as a data.frame and data.table object.
CVPT <- fread(here("data", "covid19pt_DSSG_Long.csv"))
# You can use the direct link:
# CV <- fread("https://raw.githubusercontent.com/CEAUL/Dados_COVID-19_PT/master/data/covid19pt_DSSG_Long.csv")
# Looking at the key variables in the original long dataset.
CVPT[, .(data, origVars, origType, sex, ageGrp, region, value, valueUnits)]
## data origVars origType sex ageGrp region value valueUnits
## 1: 2020-02-26 ativos ativos All Portugal NA
## 2: 2020-02-27 ativos ativos All Portugal NA
## 3: 2020-02-28 ativos ativos All Portugal NA
## 4: 2020-02-29 ativos ativos All Portugal NA
## 5: 2020-03-01 ativos ativos All Portugal NA
## ---
## 29750: 2021-01-28 vigilancia vigilancia All Portugal 223150 Count
## 29751: 2021-01-29 vigilancia vigilancia All Portugal 225507 Count
## 29752: 2021-01-30 vigilancia vigilancia All Portugal 225365 Count
## 29753: 2021-01-31 vigilancia vigilancia All Portugal 223991 Count
## 29754: 2021-02-01 vigilancia vigilancia All Portugal 220353 Count
# Order data by original variable name and date.
setkeyv(CVPT, c("origVars", "data"))
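# Convert `data` to a Date object and add a change-from-previous-day variable.
# The daily change is set to NA for symptoms (origVars starting with "sintomas").
# Add a 7 day rolling mean of the daily change for ativos, confirmados, obitos and recuperados.
# The column name `data` means "date" in Portuguese.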
CV <- CVPT[, data := as.Date(data, format = "%Y-%m-%d")][
, dailyChange := value - shift(value, n=1, fill=NA, type="lag"), by = origVars][
grepl("^sintomas", origVars), dailyChange := NA][
, mean7Day := fifelse(origVars %chin% c("ativos", "confirmados", "obitos", "recuperados"),
frollmean(dailyChange, 7), as.numeric(NA))]
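To make the chained steps above easier to follow, here is a small sketch of the same idea on made-up cumulative counts: a lagged difference computed within each variable, followed by a rolling mean of that difference. The toy values and the 2 day window are assumptions chosen so the result is easy to check by eye; the README itself uses a 7 day window.

```r
library(data.table)

# Made-up cumulative counts for two variables (not real data).
toy <- data.table(
  origVars = rep(c("confirmados", "obitos"), each = 4),
  data     = rep(as.Date("2021-01-01") + 0:3, times = 2),
  value    = c(100, 130, 150, 145, 10, 12, 12, 15)
)
setkeyv(toy, c("origVars", "data"))

# Difference from the previous day, restarted within each variable.
toy[, dailyChange := value - shift(value, n = 1, type = "lag"), by = origVars]

# Trailing 2 day mean of the daily change, also computed within each variable.
toy[, mean2Day := frollmean(dailyChange, 2), by = origVars]
toy[]
```

The first row of each variable has no previous day, so its `dailyChange` is `NA`, and any rolling window that includes that row is `NA` as well.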
library(ggplot2)
library(magrittr)
# Change the ggplot theme.
theme_set(theme_bw())
# A data error prevents the plot by sex.
# obMF <- CV[origType=="obitos" & sex %chin% c("M", "F") & ageGrp=="" & region == "Portugal"]
obAll <- CV[origType=="obitos" & sex %chin% c("All") & ageGrp=="" & region == "Portugal"][
, sex := NA]
obAll %>%
ggplot(aes(x=data, y=dailyChange)) +
geom_bar(stat = "identity", fill = "grey75") +
geom_line(data = obAll, aes(x = data, y = mean7Day), group=1, colour = "brown") +
scale_x_date(date_breaks = "1 month",
date_labels = "%b-%y",
# `cvwd` is not defined in this README; the earliest date in `obAll` is assumed instead.
limits = c(min(obAll$data, na.rm = TRUE), NA)) +
theme(legend.position = "bottom") +
labs(
title = "COVID-19 Portugal: Number of Daily Deaths with 7 Day Rolling Mean",
x = "",
y = "Number of Deaths",
colour = "",
fill = "",
caption = paste0("Updated on: ", format(Sys.time(), "%a %d %b %Y (%H:%M:%S %Z [%z])"))
)
## Warning: Removed 1 rows containing missing values (position_stack).
## Warning: Removed 7 row(s) containing missing values (geom_path).
CV[origType=="confirmados" & !(ageGrp %chin% c("", "desconhecidos"))][
, .(valueFM = sum(value)), .(data, ageGrp)] %>%
ggplot(., aes(x=data, y=valueFM, colour = ageGrp)) +
geom_line() +
scale_x_date(date_breaks = "1 month",
date_labels = "%b-%y",
# `cvwd` is not defined in this README; the earliest date in `CV` is assumed instead.
limits = c(min(CV$data, na.rm = TRUE), NA)) +
scale_y_continuous() +
theme(legend.position = "bottom") +
labs(
title = "COVID-19 Portugal: Number of Confirmed Cases by Age Group",
x = "",
y = "Number of Confirmed Cases",
caption = paste0("Updated on: ", format(Sys.time(), "%a %d %b %Y (%H:%M:%S %Z [%z])")),
colour = "Age Group")
## Warning: Removed 54 row(s) containing missing values (geom_path).
CV[origType=="confirmados" & ageGrp=="" & region!="Portugal"] %>%
ggplot(., aes(x=data, y=value, colour=region)) +
geom_line() +
scale_x_date(date_breaks = "1 month",
date_labels = "%b-%y",
# `cvwd` is not defined in this README; the earliest date in `CV` is assumed instead.
limits = c(min(CV$data, na.rm = TRUE), NA)) +
scale_y_log10() +
theme(legend.position = "bottom") +
labs(
title = "COVID-19 Portugal: Number of Confirmed Cases by Region",
x = "",
y = "Number of Confirmed Cases",
caption = paste0("Updated on: ", format(Sys.time(), "%a %d %b %Y (%H:%M:%S %Z [%z])")),
colour = "Region")
## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 326 row(s) containing missing values (geom_path).
The data are provided as is. Any quality issues or errors in the source data will be reflected in the user-friendly data.
Please create an issue to discuss any errors, issues, requests or improvements.
CV[dailyChange<0 & !(origType %in% c("vigilancia", "internados"))][
, .(data, origType, origVars, value, dailyChange)]
## data origType origVars value dailyChange
## 1: 2020-05-12 ativos ativos 23737 -249
## 2: 2020-05-16 ativos ativos 23785 -280
## 3: 2020-05-17 ativos ativos 23182 -603
## 4: 2020-05-18 ativos ativos 21548 -1634
## 5: 2020-05-22 ativos ativos 21321 -862
## ---
## 392: 2020-10-25 obitos obitos_arsalgarve 25 -10
## 393: 2020-05-23 obitos obitos_arscentro 230 -3
## 394: 2020-07-03 obitos obitos_arscentro 248 -1
## 395: 2020-06-20 obitos obitos_f 768 -1
## 396: 2020-05-21 transmissao transmissao_importada 767 -3
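A listing like the one above can be condensed into a count of negative daily changes per variable type, which gives a quick view of where the source data may have been revised. This is a minimal sketch on made-up rows; it reuses the exclusion of `vigilancia` and `internados` from the query above, and the numbers are not real.

```r
library(data.table)

# Made-up daily changes (not real data); negative values mark a drop in a count.
chk <- data.table(
  origType    = c("obitos", "obitos", "confirmados", "internados", "ativos"),
  dailyChange = c(-3, 2, -10, -5, -249)
)

# Count negative changes per type, skipping the types excluded in the query above.
negCounts <- chk[dailyChange < 0 & !(origType %in% c("vigilancia", "internados")),
                 .N, by = origType]
negCounts
```

On the full dataset the same expression could be applied to `CV` in place of `chk`.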