At the Mobile Tartu conference in Estonia I learned about the country’s strong open data culture in general, and its open traffic count data in particular. A quick search of the national open data portal led to this dataset: https://avaandmed.eesti.ee/datasets/liiklusloenduse-andmed
We’ll use the tidyverse:
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ purrr::%||%() masks base::%||%()
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
The data has to be downloaded manually from the portal. After downloading, the working directory contains the following files:
list.files()
[1] "d0cc4ba5-9a4d-448c-b268-bfb5e7b71537-LL-meta.xlsx.csv"
[2] "e501fb9b-4a71-453f-9d7f-bb5e819ee692-ll_2024.csv.csv"
[3] "map.html"
[4] "README.md"
[5] "README.qmd"
[6] "README.rmarkdown"
f = "e501fb9b-4a71-453f-9d7f-bb5e819ee692-ll_2024.csv.csv"
if (!file.exists(f)) {
  stop("Go to the portal and download the data")
}
You also need to download the counter location metadata, which gives the name, road and coordinates of each counting location:
locations = read_csv("d0cc4ba5-9a4d-448c-b268-bfb5e7b71537-LL-meta.xlsx.csv")
Rows: 118 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): Name, Connection ID, County
dbl (4): Road nr, Road km, Lon, Lat
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# # A tibble: 118 × 7
# Name `Connection ID` `Road nr` `Road km` County Lon Lat
# <chr> <chr> <dbl> <dbl> <chr> <dbl> <dbl>
# 1 LOO 1_13236 VBV 0cfcf 1 13.2 Harju… 25.0 59.4
# 2 PRÜGILA RIST 1_17794 … 25785 1 17.8 Harju… 25.0 59.5
# 3 KODASOO 1_32100 VBV 077a6 1 32.1 Harju… 25.3 59.4
# 4 VIITNA 1_73241 VBV 67241 1 73.2 Lääne… 26.0 59.5
# 5 SÄMI 1_109455 VBV 387ad 1 109. Lääne… 26.6 59.4
# 6 VARJA 1_146054 VBV 748fe 1 146. Ida-V… 27.1 59.4
# 7 KUKRUSE 1_158295 VBV 5e92d 1 158. Ida-V… 27.3 59.4
# 8 KONJU 1_176970 VBV 180b3 1 177. Ida-V… 27.6 59.4
# 9 SINIMÄE 1_194738 VBV 91e72 1 195. Ida-V… 27.9 59.4
# 10 PEETRI 2_7050 VBV 0f55c 2 7 Harju… 24.8 59.4
# # ℹ 108 more rows
# # ℹ Use `print(n = ...)` to see more rows
traffic_data = read_csv(f)
Rows: 1029423 Columns: 24
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): id
dbl (22): 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, kanal, <40Kph, 40-<50, 50-<60, 60-...
dttm (1): aeg
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
names(traffic_data)
[1] "1" "2" "3" "4" "5" "6"
[7] "7" "8" "9" "10" "id" "kanal"
[13] "aeg" "<40Kph" "40-<50" "50-<60" "60-<70" "70-<80"
[19] "80-<90" "90-<100" "100-<110" "110-<120" "120-<130" "=>130"
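The speed-bin columns (`<40Kph` to `=>130`) are awkward to work with in wide form. The following is a minimal sketch of reshaping them into long form with `tidyr::pivot_longer()`. It runs on a small made-up table, `traffic_toy`, that reuses a few of the real column names, and it assumes that each speed-bin column holds the number of vehicles recorded in that bin, which the source data does not state explicitly. The object names `traffic_toy`, `speed_long` and `speed_totals` are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy data with a subset of the real column names (all values are made up)
traffic_toy = tibble::tibble(
  id = c("a", "a", "b"),
  aeg = as.POSIXct(
    c("2024-01-01 00:00:00", "2024-01-01 01:00:00", "2024-01-01 00:00:00"),
    tz = "UTC"
  ),
  `<40Kph` = c(1, 0, 2),
  `40-<50` = c(3, 4, 0),
  `=>130` = c(0, 1, 0)
)

# One row per counter, timestamp and speed bin
# (assumption: the bin columns are vehicle counts)
speed_long = traffic_toy |>
  pivot_longer(
    cols = `<40Kph`:`=>130`,
    names_to = "speed_bin",
    values_to = "n_vehicles"
  )

# Total vehicles per counter across all speed bins
speed_totals = speed_long |>
  group_by(id) |>
  summarise(n_vehicles = sum(n_vehicles))
```

Because the speed-bin columns are adjacent in `traffic_data`, the same `cols` range should carry over to the full dataset, although that has not been run here.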
unique_counters = unique(traffic_data$id)
length(unique_counters) # 116
[1] 116
Let’s see how well the IDs of the locations match the IDs in the traffic data:
summary(locations$`Connection ID` %in% unique_counters)
Mode FALSE TRUE
logical 3 115
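The summary shows that 3 of the 118 locations have no counts, but not which ones. A minimal sketch of listing the mismatches in both directions is below. It uses made-up stand-ins (`locations_toy`, `traffic_toy`) so that it runs on its own; those names and their values are hypothetical, and only the column names `Connection ID` and `id` come from the real data.

```r
library(dplyr)

# Toy stand-ins for `locations` and `traffic_data` (all values are made up)
locations_toy = tibble::tibble(
  Name = c("A", "B", "C"),
  `Connection ID` = c("id_1", "id_2", "id_3")
)
traffic_toy = tibble::tibble(
  id = c("id_1", "id_1", "id_2", "id_9")
)

# Locations whose Connection ID never appears in the traffic data
locations_unmatched = anti_join(
  locations_toy,
  distinct(traffic_toy, id),
  by = c("Connection ID" = "id")
)

# Counter IDs in the traffic data that have no location record
counters_unmatched = base::setdiff(
  unique(traffic_toy$id),
  locations_toy$`Connection ID`
)
```

Swapping in `locations` and `traffic_data` for the toy tables should list the three unmatched locations, along with any counter IDs that lack coordinates.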
Let’s plot the locations on a map:
# Lon/Lat are assumed to be WGS84 (EPSG:4326), so set the CRS explicitly
locations_geo = sf::st_as_sf(locations, coords = c("Lon", "Lat"), crs = 4326)
map = tmap::qtm(locations_geo)
tmap::tmap_save(map, "map.html")
Interactive map saved to /home/robin/github/robinlovelace/estonianTrafficData/map.html
webshot2::webshot("map.html")