mlampros/IceSat2R

CopernicusDEM comparison

evetion opened this issue · 11 comments

In the documentation here https://mlampros.github.io/IceSat2R/articles/IceSat-2_Atlas_products_HTML.html, you compare to ICESat-2 to CopernicusDEM, which shows a significant vertical difference. I expect this difference is because ICESat-2 is reference to the ellipsoid, while CopernicusDEM is referenced to the EGM2008 geoid.

I assume you mean this code snippet,

#......................................................................
# compute also the difference in distance between the beam measurements
# and the DEM coordinates (based on the 30-meter resolution cells)
#......................................................................

merg_cells$dem_dif_dist = geodist::geodist(x = merg_cells[, c('longitude', 'latitude')], 
                                           y = merg_cells[, c('lon_dem30', 'lat_dem30')], 
                                           paired = TRUE,
                                           measure = 'geodesic')

summary(merg_cells$dem_dif_dist)

#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 0.2356  8.1619 11.5943 11.1490 14.3201 20.5394

which shows the point difference between the ICESat-2 and Copernicus DEM difference (I used the centroids of the Copernicus DEM grid cells)

ICESat-2 is reference to the ellipsoid, while CopernicusDEM is referenced to the EGM2008 geoid

You are right on this. The documentation page mentions

Data structure
Each file is its own object in Amazon S3. The data are organised per tiles in original 1x1 degree grid.

and regarding the data comparison between ICESat and Copernicus DEM (page 9 of 23) is mentioned,

In order to provide a most accurate data comparison, the ICESat elevation values have been compared with the
source DEM of the Copernicus DEM, the WorldDEM at its native resolution of 0.4’’ arc seconds. To compare the ICESat
GLAS v33 reference points with the same horizontal, WGS84, and the vertical reference system, EGM08, as the Copernicus
DEM/WorldDEM, a transformation has been applied.

I'll have a look into this over the weekend and I'll adjust the vignette accordingly.
Thanks for making me aware of this issue.

I downloaded a sample Copernicus DEM image ("Copernicus_DSM_COG_10_N27_00_E061_00_DEM.tif") from the AWS open data registry using the "CopernicusDEM" R package.
It seems that there is no vertical coordinate reference system (CRS) in the metadata of the .tif file using "gdalinfo". I tried using the 'sf' R package,

sf::gdal_utils(util = 'info', source = "Copernicus_DSM_COG_10_N27_00_E061_00_DEM.tif", quiet = FALSE)

# Driver: GTiff/GeoTIFF
# Files: /home/lampros/DIR_SAVE_DEM/Copernicus_DSM_COG_10_N27_00_E061_00_DEM.tif
# Size is 3600, 3600
# Coordinate System is:
# GEOGCS["WGS 84",
#     DATUM["WGS_1984",
#         SPHEROID["WGS 84",6378137,298.257223563,
#             AUTHORITY["EPSG","7030"]],
#         AUTHORITY["EPSG","6326"]],
#     PRIMEM["Greenwich",0],
#     UNIT["degree",0.0174532925199433],
#     AUTHORITY["EPSG","4326"]]
# Origin = (60.999861111111109,28.000138888888888)
# Pixel Size = (0.000277777777778,-0.000277777777778)
# Metadata:
#   AREA_OR_POINT=Point
# Image Structure Metadata:
#   COMPRESSION=DEFLATE
#   INTERLEAVE=BAND
# Corner Coordinates:
# Upper Left  (  60.9998611,  28.0001389) ( 60d59'59.50"E, 28d 0' 0.50"N)
# Lower Left  (  60.9998611,  27.0001389) ( 60d59'59.50"E, 27d 0' 0.50"N)
# Upper Right (  61.9998611,  28.0001389) ( 61d59'59.50"E, 28d 0' 0.50"N)
# Lower Right (  61.9998611,  27.0001389) ( 61d59'59.50"E, 27d 0' 0.50"N)
# Center      (  61.4998611,  27.5001389) ( 61d29'59.50"E, 27d30' 0.50"N)
# Band 1 Block=1024x1024 Type=Float32, ColorInterp=Gray
#   Overviews: 1800x1800, 900x900, 450x450

and using the latest gdal ubuntu docker image,

docker run --rm -v /home:/home/ osgeo/gdal:ubuntu-small-latest gdalinfo $PWD/Copernicus_DSM_COG_10_N27_00_E061_00_DEM.tif

# Driver: GTiff/GeoTIFF
# Files: /home/lampros/Copernicus_DSM_COG_10_N27_00_E061_00_DEM.tif
# Size is 3600, 3600
# Coordinate System is:
#   GEOGCRS["WGS 84",
#           DATUM["World Geodetic System 1984",
#                 ELLIPSOID["WGS 84",6378137,298.257223563,
#                           LENGTHUNIT["metre",1]]],
#           PRIMEM["Greenwich",0,
#                  ANGLEUNIT["degree",0.0174532925199433]],
#           CS[ellipsoidal,2],
#           AXIS["geodetic latitude (Lat)",north,
#                ORDER[1],
#                ANGLEUNIT["degree",0.0174532925199433]],
#           AXIS["geodetic longitude (Lon)",east,
#                ORDER[2],
#                ANGLEUNIT["degree",0.0174532925199433]],
#           ID["EPSG",4326]]
# Data axis to CRS axis mapping: 2,1
# Origin = (60.999861111111109,28.000138888888888)
# Pixel Size = (0.000277777777778,-0.000277777777778)
# Metadata:
#   AREA_OR_POINT=Point
# Image Structure Metadata:
#   COMPRESSION=DEFLATE
# INTERLEAVE=BAND
# LAYOUT=COG
# PREDICTOR=3
# Corner Coordinates:
#   Upper Left  (  60.9998611,  28.0001389) ( 60d59'59.50"E, 28d 0' 0.50"N)
# Lower Left  (  60.9998611,  27.0001389) ( 60d59'59.50"E, 27d 0' 0.50"N)
# Upper Right (  61.9998611,  28.0001389) ( 61d59'59.50"E, 28d 0' 0.50"N)
# Lower Right (  61.9998611,  27.0001389) ( 61d59'59.50"E, 27d 0' 0.50"N)
# Center      (  61.4998611,  27.5001389) ( 61d29'59.50"E, 27d30' 0.50"N)
# Band 1 Block=1024x1024 Type=Float32, ColorInterp=Gray
#   Overviews: 1800x1800, 900x900, 450x450

I would expect that the "VERT_CS" tag is present in the "gdalinfo" output as mentioned in this stackoverflow thread.
The maintainer of the Copernicus DEM on AWS ("sentinelhub") confirmed in a related question that the vertical CRS is the "EGM2008".

I re-run the vignette to use the Copernicus DEM .tif files that correspond (as the ICESat-2 data) to the area of interest. Then I modified the code in the following way,

#...........................................................................................................
# The Copernicus DEM has a "horizontal" CRS (Coordinate Reference System) of WGS84-G1150 (EPSG 4326)
# and a "vertical" CRS of EGM2008 (EPSG 3855) - geoid. Therefore, a transformation of the vertical 
# CRS is required to match the ellipsoid CRS of the ICESat-2 data
# The Table 1 (page 10 of 37) of the following weblink includes more information:
# https://spacedata.copernicus.eu/documents/20126/0/GEO1988-CopernicusDEM-SPE-002_ProductHandbook_I1.00.pdf
# The 2.5 minute geoid grid .gtx file of EGM2008 was downloaded from the following "osgeo" url:
# http://download.osgeo.org/proj/vdatum/egm08_25/
#...........................................................................................................

pth_gtx = file.path(dem_dir, 'egm08_25.gtx')

download.file(url = 'http://download.osgeo.org/proj/vdatum/egm08_25/egm08_25.gtx', 
              destfile = pth_gtx, 
              method = 'curl')

pth_dem_transformed = file.path(dem_dir, 'copernicus_dem_egm2008_transformed.tif')

convrt = sf::gdal_utils(
  util = "warp",
  source = pth_egm,
  destination = pth_dem_transformed,
  options = c("-s_srs", glue::glue("+proj=longlat +datum=WGS84 +no_defs +geoidgrids={pth_gtx}"),
              "-t_srs", "+proj=longlat +datum=WGS84 +no_def"),
  quiet = FALSE)


#....................................
# load the transformed Copernicus DEM
#....................................

rst_crop_transformed = terra::rast(x = pth_dem_transformed)


#...............................................
# we also have to find the closest elevation 
# value to the 'winter' and 'summer' coordinates
# using the raster resolution
#...............................................

ter_dtbl = data.table::as.data.table(x = rst_crop_transformed, xy = TRUE, cells = TRUE)
colnames(ter_dtbl) = c("cell", "lon_dem30", "lat_dem30", "dem30")
# length(unique(ter_dtbl$cell)) == nrow(ter_dtbl)

xy = as.matrix(sw_hq_merg_beams[, c('longitude', 'latitude')])
sw_cells = terra::cellFromXY(object = rst_crop_transformed, xy = xy)

sw_hq_merg_beams$cell = sw_cells
# length(unique(sw_hq_merg_beams$cell)) < nrow(sw_hq_merg_beams)

merg_cells = merge(x = sw_hq_merg_beams, y = ter_dtbl, by = 'cell')


#......................................................................
# compute also the difference in distance between the beam measurements
# and the DEM coordinates (based on the 30-meter resolution cells)
#......................................................................

merg_cells$dem_dif_dist = geodist::geodist(x = merg_cells[, c('longitude', 'latitude')], 
                                           y = merg_cells[, c('lon_dem30', 'lat_dem30')], 
                                           paired = TRUE,
                                           measure = 'geodesic')

summary(merg_cells$dem_dif_dist)

 #   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 # 0.1228  5.9179  8.9506  9.1205 12.5481 17.1965 

Therefore,

  • I used the 2.5 min. grid .gtx file to perform the transformation using "gdalwarp"
  • then I computed the cells for the transformed Copernicus DEM and the ICESat-2 data
  • then I finally merged both based on the grid cells and I computed the difference using the lat long values (of the ICESat-2 and Copernicus DEM)

The difference is reduced from

#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 0.2356  8.1619 11.5943 11.1490 14.3201 20.5394

to

 #   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 # 0.1228  5.9179  8.9506  9.1205 12.5481 17.1965 

Worth mentioning is that the vertically transformed Copernicus DEM elevation is now higher than the ICESat-2 land-ice height (both in winter and summer)

# before transformation (previous summary data)

# summary(merg_cells$h_li_winter - merg_cells$dem30)
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 51.24   53.95   54.32   54.27   54.65   55.94 

# summary(merg_cells$h_li_summer - merg_cells$dem30)
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 53.02   54.36   54.67   54.68   54.99   56.34 


# after the transformation (using the .gtx file and gdalwarp)

# summary(merg_cells$h_li_winter - merg_cells$dem30)
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# -49.80  -46.53  -46.01  -46.22  -45.68  -44.15 

# summary(merg_cells$h_li_summer - merg_cells$dem30)
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# -48.21  -46.16  -45.73  -45.81  -45.37  -44.04 

What I would expect is that the ICESat-2 land-ice-height data would have higher values compared to the Copernicus DEM

An alternative way to come to the .gtx file is by using GeographicLib
I ported this C++ library in R and saved the .gtx file for different intervals per degree which allows a better resolution for the output .gtx file, however I observed that it doesn't give different results compared to the existing 2.5 minute geoid grid downloaded from "osgeo"

I'm happy to see you transformed the vertical reference. I can't say much about the results though, the first difference is the horizontal distance of the closest matching elevation? And the vertical differences are now negative instead of positive? I don't think the geoid has +100m values anywhere, are you sure you didn't apply the correction twice?

the first difference is the horizontal distance of the closest matching elevation?

yes this is the difference between the point coordinates of the ICESat-2 and their corresponding Copernicus DEM data

And the vertical differences are now negative instead of positive?

the negative differences are the differences in elevation between the ICESat-2 and the Copernicus DEM data

I applied the correction once based on 'gdalwarp' using the 'sf' R package (you can see the code in my previous message).

The attached "copernicus_dem_egm2008.tif.zip" includes the Copernicus DEM that I used as "pth_egm" in the previous mentioned code snippet (inside the "sf::gdal_utils()" function) and the "egm08_25.gtx" file can be downloaded from http://download.osgeo.org/proj/vdatum/egm08_25/ in case you want to reproduce (or verify) the differences. I don't think that I do something wrong.

copernicus_dem_egm2008.tif.zip

Now I realize that the "sw_hq_merg_beams" R object (data.table) is required, which I attach here as a .csv file,

sw_hq_merg_beams.csv

yes this is the difference between the point coordinates of the ICESat-2 and their corresponding Copernicus DEM data

So if you only changed the raster elevation values, and not the resolution or bounds, these horizontal differences should've stayed the same?

I don't think that I do something wrong.

I couldn't spot any mistakes as well, but I would argue that the results, either a ~+50m or ~-50m median difference are suspect. Especially given that the egm2008 geoid is ~+47m at that spot in Greenland. And ICESat-2, in ideal conditions, is known to be accurate within a handful of centimeters.

So if you only changed the raster elevation values, and not the resolution or bounds, these horizontal differences should've stayed the same?

That's true. Let me see if I have to explicitly specify these parameters (resolution, extend and CRS) of the input (source) image in the "gdalwarp" function (if it makes any difference and there is no difference compared to the initial computation)

It seems this was the reason for the negative differences in elevation. After specifying the resolution, extent and CRS (the following code snippet shows the changes compared to the previous code of my message),

pth_egm = file.path(dem_dir, 'copernicus_dem_egm2008.tif')
rst_dem = terra::rast(pth_egm)
# rst_dem

CRS_dem = terra::crs(rst_dem, proj = TRUE)
RES_dem = as.character(terra::res(rst_dem))
EXT_dem = as.vector(terra::ext(rst_dem))
EXT_dem = c(EXT_dem['xmin'], EXT_dem['ymin'], EXT_dem['xmax'], EXT_dem['ymax'])
EXT_dem = as.character(EXT_dem)

convrt = sf::gdal_utils(
  util = "warp",
  source = pth_egm,
  destination = pth_dem_transformed,
  options = c("-s_srs", glue::glue("{CRS_dem} +geoidgrids={pth_gtx}"),
              "-t_srs", CRS_dem,
              "-tr", RES_dem,
              "-te", EXT_dem),
  quiet = FALSE)

I received the same summary statistics for the differences in elevation (between ICESat-2 and Copernicus DEM) as previously (so there is no change in the horizontal differences),

summary(merg_cells$dem_dif_dist)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2356  8.1619 11.5943 11.1490 14.3201 20.5394 

and now the differences (between the ICESat-2 and Copernicus DEM data) for the summer and winter elevation data is positive,

summary(merg_cells$h_li_winter - merg_cells$dem30)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.6775  3.6997  4.1428  4.0206  4.4779  5.8357 
summary(merg_cells$h_li_summer - merg_cells$dem30)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.562   4.090   4.451   4.431   4.788   6.147

which is something would someone expect ( higher values for the land-ice-height ICESat-2 data compared to the Copernicus DEM)

I'll make the changes also to the code in the vignette and I'll push the changes to the repository

Those differences indeed look very sensible, glad we got it fixed :)