Cells counted only once when extracting into overlapping geometries
turbanisch opened this issue · 3 comments
I'm preparing to teach a short course about geospatial data in R for economists. We will be using sf for the introduction to vector data and most students are familiar with the Tidyverse. For that reason, I am inclined to use stars rather than terra or raster.
When it comes to vector-raster interactions, aggregating and extracting raster data over vector geometries is probably the operation we use most. However, I found that when geometries overlap, each cell is extracted only once by aggregate() or st_extract() (which falls back to aggregate() in the case of polygons, as I understand it).
Even though the article linked above was posted in March 2021, the behavior doesn't seem to have changed. Is there another proposed workflow for this application? I know that st_join() and exactextractr can be used instead but am still curious to hear about the use case for st_extract()/aggregate(). Overlapping geometries are not an uncommon scenario for us, for example when we create buffers around nearby points.
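For anyone who wants to reproduce the situation described above, here is a minimal sketch. The toy raster, the point coordinates, and the buffer distance are made up for illustration; the result depends on the installed stars version, so no expected output is shown.

```r
library(sf)
library(stars)

# Toy raster (illustrative data): 10 x 10 cells with values 1..100
m <- matrix(as.numeric(1:100), nrow = 10)
dim(m) <- c(x = 10, y = 10)
r <- st_as_stars(m)

# Two nearby points whose buffers overlap (hypothetical coordinates)
pts <- st_sfc(st_point(c(4, 5)), st_point(c(6, 5)))
buffers <- st_buffer(pts, dist = 2.5)

# Number of cells that aggregate() hands to each buffer
counts_aggregate <- as.numeric(aggregate(r, buffers, FUN = length)[[1]])

# Number of cell centres that actually fall inside each buffer
cells <- st_as_sf(r, as_points = TRUE)
counts_true <- lengths(st_intersects(buffers, cells))

print(counts_aggregate)
print(counts_true)
```

If cells in the overlap are assigned to one buffer only, the counts from aggregate() add up to less than the counts from st_intersects().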
Yes, I can see this case; the thinking behind aggregate is the implementation of stats::aggregate, and the equivalence in SQL. For st_extract it is a different story, basically lack of time / priority.
After looking at the code, I think st_extract does the right thing for overlapping polygons: it calls aggregate for each polygon. This may be highly inefficient, but should do the job.
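The per-polygon approach can also be written out by hand, which makes it explicit that every polygon sees all of its cells. This is only a sketch on made-up data (the raster, points, and buffer distance are illustrative), and it assumes a stars version in which st_extract() accepts polygon geometries and a FUN argument that defaults to mean.

```r
library(sf)
library(stars)

# Toy raster (illustrative data): 10 x 10 cells with values 1..100
m <- matrix(as.numeric(1:100), nrow = 10)
dim(m) <- c(x = 10, y = 10)
r <- st_as_stars(m)

# Two overlapping buffers around nearby points (hypothetical coordinates)
pts <- st_sfc(st_point(c(4, 5)), st_point(c(6, 5)))
buffers <- st_buffer(pts, dist = 2.5)

# Call aggregate() once per polygon, so no cell can be claimed by another polygon
per_polygon <- sapply(seq_along(buffers), function(i) {
  a <- aggregate(r, buffers[i], FUN = mean)
  as.numeric(a[[1]])
})
print(per_polygon)

# st_extract() on the same polygons, for comparison with the loop above
extracted <- st_extract(r, buffers)
print(extracted)
```

Looping like this means one aggregate() call per geometry, which is where the inefficiency mentioned above comes from.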
Ah, I see, thank you so much! I didn't realize that st_extract() calls aggregate() on each individual polygon, but that indeed seems to be the case. Perhaps a short remark in the function description that it can be used on more than just point geometries would be helpful for other users? Anyway, I am looking forward to using the stars package in our course. For the teaching exercises, performance will hopefully not be too much of an issue.