Much smaller output image, when compared to input image

Question

Closed this issue 24 days ago · 1 comments

dennisshushackost commented 3 months ago

I have a question about the spatial extent of the super-resolution output image. My input is a (512, 512, 4) Sentinel-2 image with an extent of around 5 km x 5 km. I am using the pre-trained model as indicated in the instructions, with the default ts (512). However, I noticed that the super-resolution image has a much smaller extent than the original image. Is there a way to fix this? (The first image is the input, the second one is the output.) Any help or tips are greatly appreciated.

GDALINFO of the .tif file:
Size is 512, 512
Coordinate System is:
PROJCRS["WGS 84 / UTM zone 32N",
BASEGEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["UTM zone 32N",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",9,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Navigation and medium accuracy spatial referencing."],
AREA["Between 6°E and 12°E, northern hemisphere between equator and 84°N, onshore and offshore. Algeria. Austria. Cameroon. Denmark. Equatorial Guinea. France. Gabon. Germany. Italy. Libya. Liechtenstein. Monaco. Netherlands. Niger. Nigeria. Norway. Sao Tome and Principe. Svalbard. Sweden. Switzerland. Tunisia. Vatican City State."],
BBOX[0,6,84,12]],
ID["EPSG",32632]]
Data axis to CRS axis mapping: 1,2
Origin = (456680.000000000000000,5269380.000000000000000)
Pixel Size = (10.000000000000000,-10.000000000000000)
Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left ( 456680.000, 5269380.000) ( 8d25'26.28"E, 47d34'35.00"N)
Lower Left ( 456680.000, 5264260.000) ( 8d25'28.10"E, 47d31'49.16"N)
Upper Right ( 461800.000, 5269380.000) ( 8d29'31.36"E, 47d34'36.15"N)
Lower Right ( 461800.000, 5264260.000) ( 8d29'32.97"E, 47d31'50.31"N)
Center ( 459240.000, 5266820.000) ( 8d27'29.68"E, 47d33'12.67"N)
Band 1 Block=512x2 Type=UInt16, ColorInterp=Gray
Min=684.000 Max=17728.000
Minimum=684.000, Maximum=17728.000, Mean=1722.797, StdDev=603.749
NoData Value=0
Metadata:
STATISTICS_MAXIMUM=17728
STATISTICS_MEAN=1722.7968677642
STATISTICS_MINIMUM=684
STATISTICS_STDDEV=603.74862160063
STATISTICS_VALID_PERCENT=95.56
Band 2 Block=512x2 Type=UInt16, ColorInterp=Undefined
Min=1145.000 Max=12056.000
Minimum=1145.000, Maximum=12056.000, Mean=1771.242, StdDev=457.061
NoData Value=0
Metadata:
STATISTICS_MAXIMUM=12056
STATISTICS_MEAN=1771.2417562238
STATISTICS_MINIMUM=1145
STATISTICS_STDDEV=457.06098967893
STATISTICS_VALID_PERCENT=95.56
Band 3 Block=512x2 Type=UInt16, ColorInterp=Undefined
Min=1083.000 Max=7524.000
Minimum=1083.000, Maximum=7524.000, Mean=1519.054, StdDev=379.979
NoData Value=0
Metadata:
STATISTICS_MAXIMUM=7524
STATISTICS_MEAN=1519.0543051275
STATISTICS_MINIMUM=1083
STATISTICS_STDDEV=379.97874273481
STATISTICS_VALID_PERCENT=95.56
Band 4 Block=512x2 Type=UInt16, ColorInterp=Undefined
Min=1076.000 Max=11880.000
Minimum=1076.000, Maximum=11880.000, Mean=4433.195, StdDev=849.466
NoData Value=0
Metadata:
STATISTICS_MAXIMUM=11880
STATISTICS_MEAN=4433.1945611037
STATISTICS_MINIMUM=1076
STATISTICS_STDDEV=849.46564802441
STATISTICS_VALID_PERCENT=95.56

Answer 1 · 2024-04-18T08:49:12.000Z

Hi @dennisshushackost, this is intentional: the output is cropped to avoid blocking artifacts.
If you want the exact same spatial extent, you can use the main output of the network (i.e. not cropped), but that will produce some blocking artifacts due to zero padding in the model's convolutions.

If you want a larger output, you have to feed a larger input, taking the cropping margin into account.
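
To illustrate the "feed a larger input" advice, here is a minimal sketch of the arithmetic, using the origin, pixel size, and raster size from the gdalinfo output above. The margin value (`MARGIN_PX = 64`) and all function names are hypothetical and chosen only for illustration; the actual cropping margin depends on the model and its settings, so check it before relying on these numbers.

```python
from typing import NamedTuple, Tuple


class Bounds(NamedTuple):
    left: float
    bottom: float
    right: float
    top: float


def bounds_from_geotransform(origin_x: float, origin_y: float,
                             pixel_size: float, width: int, height: int) -> Bounds:
    """Bounds of a north-up raster from its upper-left origin and pixel size."""
    return Bounds(
        left=origin_x,
        bottom=origin_y - height * pixel_size,
        right=origin_x + width * pixel_size,
        top=origin_y,
    )


def expand_for_crop(target: Bounds, pixel_size: float, margin_px: int) -> Bounds:
    """Grow the target bounds by margin_px input pixels on every side."""
    pad = margin_px * pixel_size
    return Bounds(target.left - pad, target.bottom - pad,
                  target.right + pad, target.top + pad)


def size_in_pixels(bounds: Bounds, pixel_size: float) -> Tuple[int, int]:
    """Raster width and height (in pixels) covering the given bounds."""
    width = round((bounds.right - bounds.left) / pixel_size)
    height = round((bounds.top - bounds.bottom) / pixel_size)
    return width, height


# Values taken from the gdalinfo output in the question (EPSG:32632, 10 m pixels).
PIXEL_SIZE = 10.0
target = bounds_from_geotransform(456680.0, 5269380.0, PIXEL_SIZE, 512, 512)

# Hypothetical margin: NOT the model's real value, only a placeholder.
MARGIN_PX = 64
needed = expand_for_crop(target, PIXEL_SIZE, MARGIN_PX)

print("extent wanted in the output:", target)
print("extent to request as input: ", needed)
print("input size in pixels:       ", size_in_pixels(needed, PIXEL_SIZE))
```

With the placeholder margin of 64 pixels, the input to request grows from 512 x 512 to 640 x 640 pixels, so that the cropped output still covers the original 5120 m x 5120 m area.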