JOSE Review - comment on Geographical Unit Data section
Opened this issue · 6 comments
This section provides a good introduction to the different formats of socio-economic data, but without too much unnecessary detail. My main comment here is simply the sequencing of material. The Hands-On Exercise, Step 2 makes use of geographical unit data, but this data is not discussed until after. I realize that you want the structure of the tutorial to be such that there is an Exercise at the end of each Chapter, but Step 2 seems out of place. Please consider how you might restructure the Exercises to better align with the flow of the content.
===============================
Section: Generating Geographical Unit Data
- first paragraph: remove "and" -> "collected corresponding to..."
- See Also box: link just goes to the current issue -> needs to be revised
- Note that some of the packages for GIS-type data contain the word "raster". Perhaps you should introduce this term in the GIS data section as well.
- update GDAL Georeferencer link -> there is a note on the website indicating the the tutorial is now obsolete.
- Caution Box: would have been good to know this for Hands-On Exercise, Step 2.
==============================
Section: Constructing weather averages within spatial units
Sub-section: Approach 1
- I am not an R user, so I don't really know what you are doing without doing a lot of googling. Is it possible to add the python (or matlab) implementation? Also it would help if you defined some of your variables: longitude0, longitude1, gridwidth, etc.
- paragraph after second code block: "upright" -> "up right"
Sub-section: Approach 2
- can you show a diagram of this? I think this would help students visualize what's going on. There are some similar ones online. I don't think this is exactly the right image, but maybe something similar to this: https://www.esri.com/arcgis-blog/products/spatial-analyst/analytics/getting-the-most-out-of-zonal-statistics/
=============================
Section: Weighting Schemes
- the learning objectives appear at the beginning of this section rather than the beginning of the chapter (as is done in the other chapters). I suggest moving the LOs to the beginning fo the chapter again for consistency.
- third paragraph: missing "the" -> "...as described in the section..."
- nomenclature: in Hands-On Exercise, Step 2, a Psi was used for the weighting factor, but here an omega is used. I suggest using consistent nomenclature.
- LandScan link not working
- in the "The spatial gridding scheme" sub-section: what do you mean by "vertical"? I would typically interpret this as altitude in the atmosphere, but I don't think this is what you mean. To me, surface temperature, for example only has horizontal spatial dimensions. Perhaps, the terms, "meridional" and "zonal" might be useful here.
- in the "The geographic projection" sub-section: "species" -> "specifies"
- can you provide a code example for the python Geogrid package as there are no examples in the Geogrid repo?
Sub-section: Aligning Weather and Weighting Grids
- first line: missing "the" -> "...it conform to the data grid..."
- last line of first paragraph: vertical -> meridional
- second line of third paragraph: "code below" -> which code below? The code seems to show the other three cases.
- code examples: can you provide an example of what 'rr' is? Is it the grid or the actual data on the grid. The way it is worded it seems like it is the grid itself, but I don't think it is.
Sub-section: Example
- I would appreciate a python implementation of this example.
Sub-section: Plotting
- I would appreciate a python implementation of this example.
- You might need to provide a bit more guidance on which tab to click on the IRI link to download the data. "Click on Data Files and then click the netcdf file format option."
- it might be nice to add a note at the end of this sub-section about choosing colour maps: https://www.nature.com/articles/s41467-020-19160-7
Section: Hands-On Exercise, Step 3
- couldn't get the conda install of xagg to work. Used pip but had to also add xesmf.
- in the code examples, use the same path setup as in the previous exercises: ".../data/"
- got a "warning" for the weightmap = xa.pixel.overlaps() step (see attached screenshot)
- got an error for the xa.aggregate step (see attached screenshot)
Looks like this points to a bug in xagg as well. I can start an issue there and hopefully take a stab at it. It's high time for a new release anyways.
couldn't get the conda install of xagg to work. Used pip but had to also add xesmf.
got a "warning" for the weightmap = xa.pixel.overlaps() step (see attached screenshot)
got an error for the xa.aggregate step (see attached screenshot)
This is now fixed in xagg
v0.3.1, which is the latest version on conda-forge. Currently conda
and mamba
still want to download v0.3.0.2 (which has a few bugs) unless explicitly told xagg==0.3.1
. Perhaps we should make a note of that - I don't think we currently have an environment creation step in the tutorial anyway?
@atrisovic I made the ../data/
edits on the PR #87, since I make other edits to the example step 3 there.
For the warnings and errors from xagg
, here's how things look with the new version:
>>> weightmap = xa.pixel_overlaps(ds_tas, gdf_counties, weights=ds_pop.Population, subset_bbox=False)
creating polygons for each pixel...
lat/lon bounds not found in dataset; they will be created.
calculating overlaps between pixels and output polygons...
success!
>>> aggregated = xa.aggregate(ds_tas, weightmap)
aggregating tas_adj...
aggregating tas_sq...
all variables aggregated to polygons!
>>>
Hello! 👋
Thank you @kls2177 for your thoughtful and detailed review. We really appreciate it. 🙏
Here are the responses to the comments. They are all incorporated and will be merged with the pull request #90
- first paragraph: remove "and" -> "collected corresponding to..."
Fixed.
- See Also box: link just goes to the current issue -> needs to be revised
Fixed.
- Note that some of the packages for GIS-type data contain the word "raster". Perhaps you should introduce this term in the GIS data section as well.
A definition is now added in the same paragraph.
- update GDAL Georeferencer link -> there is a note on the website indicating the the tutorial is now obsolete.
Updated.
- Caution Box: would have been good to know this for Hands-On Exercise, Step 2.
The hands-on example is now updated.
- I am not an R user, so I don't really know what you are doing without doing a lot of googling. Is it possible to add the python (or matlab) implementation? Also it would help if you defined some of your variables: longitude0, longitude1, gridwidth, etc.
Added.
- paragraph after second code block: "upright" -> "up right"
Fixed.
- can you show a diagram of this? I think this would help students visualize what's going on. There are some similar ones online. I don't think this is exactly the right image, but maybe something similar to this: https://www.esri.com/arcgis-blog/products/spatial-analyst/analytics/getting-the-most-out-of-zonal-statistics/
Thank you! A diagram is added.
- the learning objectives appear at the beginning of this section rather than the beginning of the chapter (as is done in the other chapters). I suggest moving the LOs to the beginning fo the chapter again for consistency.
Fixed.
- third paragraph: missing "the" -> "...as described in the section..."
Fixed.
- nomenclature: in Hands-On Exercise, Step 2, a Psi was used for the weighting factor, but here an omega is used. I suggest using consistent nomenclature.
Both are now
- LandScan link not working
Fixed.
- in the "The spatial gridding scheme" sub-section: what do you mean by "vertical"? I would typically interpret this as altitude in the atmosphere, but I don't think this is what you mean. To me, surface temperature, for example only has horizontal spatial dimensions. Perhaps, the terms, "meridional" and "zonal" might be useful here.
Thank you, we updated the wording here to include "meridional" and "zonal" instead of vertical.
- in the "The geographic projection" sub-section: "species" -> "specifies"
Fixed.
- can you provide a code example for the python Geogrid package as there are no examples in the Geogrid repo?
We excluded the reference to geogrid
as it is surpassed by Python's rasterio
package.
- first line: missing "the" -> "...it conform to the data grid..."
Fixed.
- last line of first paragraph: vertical -> meridional
Fixed.
- second line of third paragraph: "code below" -> which code below? The code seems to show the other three cases.
The wording is updated.
- code examples: can you provide an example of what 'rr' is? Is it the grid or the actual data on the grid. The way it is worded it seems like it is the grid itself, but I don't think it is.
Fixed.
- I would appreciate a python implementation of this example.
Added.
- I would appreciate a python implementation of this example.
Added.
- You might need to provide a bit more guidance on which tab to click on the IRI link to download the data. "Click on Data Files and then click the netcdf file format option."
Thank you. Added.
- it might be nice to add a note at the end of this sub-section about choosing colour maps: https://www.nature.com/articles/s41467-020-19160-7
Great idea. Added!
- couldn't get the conda install of xagg to work. Used pip but had to also add xesmf.
Now updated.
- in the code examples, use the same path setup as in the previous exercises: ".../data/"
Now updated.
- got a "warning" for the weightmap = xa.pixel.overlaps() step (see attached screenshot)
Now updated.
- got an error for the xa.aggregate step (see attached screenshot)
Now updated.
Thanks for these updates. A couple of outstanding comments:
==============================
Section: Constructing weather averages within spatial units
Sub-section: Approach 1
It would help if you defined some of your variables: longitude0, longitude1, gridwidth, etc.
==============================
Hands-On Exercise, Step 3
During the aggregated = xa.aggregate(ds_tas, weightmap)
step, I am getting the following repeated warning:
RuntimeWarning: invalid value encountered in divide
a2 = a2/a2.sum()