Comparison BIEN4/GBIF
basille opened this issue · 0 comments
Hey @bmaitner!
Following our conversation a couple of weeks ago, I finally took the time to put together a comparison (with an example) between BIEN4 and GBIF data, using the two relevant R packages (BIEN and rgbif). I'll use the sycamore maple (Acer pseudoplatanus) as an illustration, although the choice of species probably doesn't matter. Here we go:
BIEN4 occurrence data
Note: This comes from my own records from a few days ago, as the BIEN servers seem unresponsive as of today ("The BIEN servers are currently undergoing updates and may be slower than usual at present.").
Information about BIEN:
library("BIEN")
BIEN_metadata_database_version()
db_version db_release_date
1 4.2.5 2021-12-07
Get the data:
acps_bien <- BIEN_occurrence_species("Acer pseudoplatanus",
native.status = TRUE,
political.boundaries = TRUE)
dim(acps_bien)
[1] 1699 22
Keep only records collected after 1 January 1990:
acps_bien$date_collected <- lubridate::ymd(acps_bien$date_collected)
acps_bien <- subset(acps_bien, date_collected > lubridate::ymd("1990-01-01"))
dim(acps_bien)
[1] 728 22
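As a side note on the date filter above, here is a minimal sketch on made-up data (the `toy` data frame and its `id` column are hypothetical, and base R `as.Date()` stands in for `lubridate::ymd()` to keep it dependency-free). It illustrates two things: the strict `>` excludes 1 January 1990 itself, and `subset()` silently drops records with a missing date. The second filter is only one possible way to match the 1990–2020 window used for GBIF below; it is not what was run to get the 728 records.

```r
# Toy stand-in for the BIEN table; only `date_collected` matters here.
toy <- data.frame(
  id = 1:5,
  date_collected = as.Date(c("1985-06-01", "1990-01-01", "1990-01-02",
                             "2021-03-15", NA))
)

# Same logic as in the post: strictly after 1 January 1990.
# subset() treats NA conditions as FALSE, so undated records are dropped.
after_1990 <- subset(toy, date_collected > as.Date("1990-01-01"))

# Hypothetical alternative: same window as the GBIF query (1990-2020, both included).
yr <- as.integer(format(toy$date_collected, "%Y"))
in_window <- toy[!is.na(yr) & yr >= 1990 & yr <= 2020, ]

after_1990$id  # 3 4
in_window$id   # 2 3
```

The two filters disagree on records from 1 January 1990 and on anything after 2020, which is worth keeping in mind when comparing counts between the two sources.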
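Convert to sf class for mapping:
library("sf")
library("ggplot2")
# `world` is assumed to be an sf object of country polygons created beforehand
# (e.g. with rnaturalearth::ne_countries(returnclass = "sf")), and `acps_nom_scient`
# a string holding the scientific name; neither is defined in this post.
acps_bien <- st_as_sf(acps_bien, coords = c("longitude", "latitude"), remove = FALSE,
crs = 4326, agr = "constant")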
ggplot(data = world) +
geom_sf(color = gray(.5), fill= "antiquewhite") +
geom_sf(data = acps_bien, size = .1, alpha = .2, col = "brown3") +
coord_sf(xlim = c(2.5e6, 7e6), ylim = c(1.3e6, 5.3e6), crs = st_crs(3035)) +
labs(
x = "Longitude",
y = "Latitude",
title = acps_nom_scient,
subtitle = "BIEN data"
) +
theme(
panel.grid.major = element_line(color = gray(.7),
linetype = "dashed", size = 0.5),
panel.background = element_rect(fill = "aliceblue"),
plot.title = element_text(face = "italic")
)
GBIF occurrence data and comparison
Prepare the query and download the data:
library("rgbif")
acps_gbif_dl <- occ_download(
pred("taxonKey", name_backbone(name = "Acer pseudoplatanus", rank = "species")$speciesKey), # Main key
pred("hasGeospatialIssue", FALSE), # Remove default geospatial issues
pred("hasCoordinate", TRUE), # Keep only records with coordinates
pred("occurrenceStatus","PRESENT"), # Remove absent records
pred_not(pred_in("basisOfRecord",c("FOSSIL_SPECIMEN","LIVING_SPECIMEN"))), # Remove fossils and living specimens (zoo/botanical garden)
pred_and( # Between 1990–2020 (both included)
pred_gte("year", "1990"),
pred_lte("year", "2020")),
format = "SIMPLE_CSV"
)
occ_download_wait(acps_gbif_dl)
acps_gbif <- occ_download_get(acps_gbif_dl, path = "Data/gbif-acps/", overwrite = TRUE) |>
occ_download_import()
Remove records released under a non-commercial license and check the resulting data:
acps_gbif <- subset(acps_gbif, license != "CC_BY_NC_4_0")
dim(acps_gbif)
[1] 387557 50
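Before dropping records by license, it can help to see how many fall under each one. The sketch below does this on a made-up table (`toy_gbif` and its values are hypothetical; only the `license` column name and the `CC_BY_NC_4_0` value come from the code above, and the other license codes are assumed to follow the same naming pattern).

```r
# Toy stand-in for the imported GBIF table.
toy_gbif <- data.frame(
  gbifID = 1:6,
  license = c("CC_BY_4_0", "CC0_1_0", "CC_BY_NC_4_0",
              "CC_BY_4_0", "CC_BY_NC_4_0", "CC_BY_4_0")
)

# Tally records per license before filtering.
license_counts <- table(toy_gbif$license)

# Same filter as in the post: drop the non-commercial license.
kept <- subset(toy_gbif, license != "CC_BY_NC_4_0")
n_removed <- nrow(toy_gbif) - nrow(kept)

license_counts
nrow(kept)   # 4
n_removed    # 2
```

Running `table(acps_gbif$license)` on the real download would show how much of the data the filter actually removes.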
Convert to sf class for mapping:
acps_gbif <- st_as_sf(acps_gbif, coords = c("decimalLongitude", "decimalLatitude"),
remove = FALSE, crs = 4326, agr = "constant")
ggplot(data = world) +
geom_sf(color = gray(.5), fill= "antiquewhite") +
geom_sf(data = acps_gbif, size = .1, alpha = .05, col = "brown3") +
coord_sf(xlim = c(2.5e6, 7e6), ylim = c(1.3e6, 5.3e6), crs = st_crs(3035)) +
labs(
x = "Longitude",
y = "Latitude",
title = acps_nom_scient,
subtitle = "GBIF data"
) +
theme(
panel.grid.major = element_line(color = gray(.7),
linetype = "dashed", size = 0.5),
panel.background = element_rect(fill = "aliceblue"),
plot.title = element_text(face = "italic")
)
Summary
There is a striking difference between the two datasets: 728 records from BIEN vs. 387557 from GBIF, i.e. roughly 530 times more in GBIF, even after removing the GBIF records released under a non-commercial license.
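To put a number on the gap and see where it comes from, one option is to compare record counts per year. The sketch below uses the two totals reported above plus made-up year vectors (`bien_years` and `gbif_years` are hypothetical; on the real data they could be derived from `date_collected` for BIEN and, assuming the download has one, a `year` column for GBIF).

```r
# Totals reported above.
n_bien <- 728
n_gbif <- 387557
round(n_gbif / n_bien)  # 532

# Hypothetical collection years standing in for the real tables.
bien_years <- c(1995, 1995, 2003)
gbif_years <- c(1995, 2003, 2003, 2003, 2019)

# Count records per year on a common set of years so the columns line up.
years <- sort(unique(c(bien_years, gbif_years)))
per_year <- data.frame(
  year = years,
  bien = as.integer(table(factor(bien_years, levels = years))),
  gbif = as.integer(table(factor(gbif_years, levels = years)))
)
per_year
```

Using a shared set of factor levels keeps years with zero records in one source visible instead of dropping them from the table.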