Re-download of RSR files
ralphk11 opened this issue · 9 comments
from glob import glob
from satpy import Scene
from pyresample.geometry import AreaDefinition
from pyresample.utils import proj4_str_to_dict
from pyproj import Proj
from satpy.utils import debug_on
import dask as da
from multiprocessing.pool import ThreadPool
from datetime import datetime
def EPSG_4326_definition(ll_ur=None, resolution=2.0):
    # ll_ur is (lon, lat, lon, lat)
    area_id = 'epsg4326'
    description = 'EPSG:4326'
    proj_id = 'epsg4326'
    #projection = '+proj=eqc +lat_ts=0 +lat_0=0 +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m'
    projection = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
    proj4_dict = proj4_str_to_dict(projection)
    #resolution = 2.0
    y_size = 20480 / resolution # Divide by two for 2km resolution
    x_size = 40960 / resolution # ditto!
    ll_ur_ref = (-180.0, -90.0, 180.0, 90.0)
    area_extent = ll_ur
    y_size = int((ll_ur[3] - ll_ur[1]) / (ll_ur_ref[3] - ll_ur_ref[1]) * y_size)
    x_size = int((ll_ur[2] - ll_ur[0]) / (ll_ur_ref[2] - ll_ur_ref[0]) * x_size)
    print(y_size, x_size)
    return AreaDefinition(area_id, description, proj_id, proj4_dict, x_size, y_size, area_extent)
debug_on()
da.config.set(pool=ThreadPool(4))
FILES = glob("/data/kuehn/AHI_rt/DAT/two/*DAT")
scn = Scene(filenames=FILES, reader='ahi_hsd')
scn.load(['true_color'])
#scn['true_color'].data.chunks
areadef = EPSG_4326_definition(ll_ur=(80.0, -10, 180.0, 30))
local_scene = scn.resample(areadef, cache_dir='/data/kuehn/AHI_rt/cache')
local_scene.save_dataset('true_color', filename='./local_true_color_crefl.tif', GDAL_OPTIONS=['COMPRESS=JPEG', 'PHOTOMETRIC=YCBCR', 'TILED=YES'])
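For reference, the grid-size arithmetic inside EPSG_4326_definition can be checked on its own, without pyresample or any input data. The helper name grid_shape below is hypothetical (it is not part of the original script); it only repeats the same scaling of a full-globe grid by the requested lon/lat extent:

```python
def grid_shape(ll_ur, resolution=2.0):
    """Return (y_size, x_size) using the same arithmetic as the script above."""
    # Size of the full-globe grid at the given resolution (constants from the script).
    full_y = 20480 / resolution
    full_x = 40960 / resolution
    ll_ur_ref = (-180.0, -90.0, 180.0, 90.0)
    # Scale the full grid by the fraction of latitude/longitude that is covered.
    y_size = int((ll_ur[3] - ll_ur[1]) / (ll_ur_ref[3] - ll_ur_ref[1]) * full_y)
    x_size = int((ll_ur[2] - ll_ur[0]) / (ll_ur_ref[2] - ll_ur_ref[0]) * full_x)
    return y_size, x_size


shape = grid_shape((80.0, -10, 180.0, 30))
print(shape)  # (2275, 5688)
```

So the extent used in the script (80E to 180E, 10S to 30N) gives a target grid of 2275 rows by 5688 columns.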
Problem description
The rsr data is downloaded every time I run the script. The data is indeed downloaded and stored in a local directory with write permission. It should not be downloaded every time.
Note that I've modified /satpy/etc/enhancements/generic.yaml such that crefl_scaling is the default true_color enhancement.
Actual Result, Traceback if applicable
[INFO: 2018-10-10 21:29:45 : pyspectral.rayleigh] Atmosphere chosen: us-standard
[DEBUG: 2018-10-10 21:29:45 : pyspectral.rayleigh] LUT filename: /home/kuehn/.local/share/pyspectral/marine_clean_aerosol/rayleigh_lut_us-standard.h5
[DEBUG: 2018-10-10 21:29:45 : pyspectral.rsr_reader] Filename: /home/kuehn/.local/share/pyspectral/rsr_ahi_Himawari-8.h5
[WARNING: 2018-10-10 21:29:45 : pyspectral.rsr_reader] rsr data may not be up to date: /home/kuehn/.local/share/pyspectral/rsr_ahi_Himawari-8.h5
[INFO: 2018-10-10 21:29:45 : pyspectral.rsr_reader] Will download from internet...
[INFO: 2018-10-10 21:29:45 : pyspectral.utils] Download RSR files and store in directory /home/kuehn/.local/share/pyspectral
[DEBUG: 2018-10-10 21:29:45 : pyspectral.utils] Get data. URL: https://zenodo.org/record/1409621/files/pyspectral_rsr_data.tgz
[DEBUG: 2018-10-10 21:29:45 : pyspectral.utils] Destination = /home/kuehn/.local/share/pyspectral
[DEBUG: 2018-10-10 21:29:45 : urllib3.connectionpool] Starting new HTTPS connection (1): zenodo.org:443
[DEBUG: 2018-10-10 21:29:46 : urllib3.connectionpool] https://zenodo.org:443 "GET /record/1409621/files/pyspectral_rsr_data.tgz HTTP/1.1" 200 2949478
2949478it [00:04, 611429.75it/s]
[DEBUG: 2018-10-10 21:29:53 : pyspectral.rsr_reader] Filename: /home/kuehn/.local/share/pyspectral/rsr_ahi_Himawari-8.h5
Versions of Python, package at hand and relevant dependencies
packages in environment at /home/kuehn/miniconda3/envs/satpy3:
Name Version Build Channel
affine 2.2.1 py_0 conda-forge
appdirs 1.4.3 py_1 conda-forge
asn1crypto 0.24.0 py36_1003 conda-forge
attrs 18.2.0 py_0 conda-forge
blas 1.0 mkl
bokeh 0.13.0 py36_0 conda-forge
boost-cpp 1.67.0 h3a22d5f_0 conda-forge
boto3 1.9.16 py_0 conda-forge
botocore 1.12.17 py_0 conda-forge
bottleneck 1.2.1 py36h7eb728f_1 conda-forge
bzip2 1.0.6 h470a237_2 conda-forge
ca-certificates 2018.8.24 ha4d7672_0 conda-forge
cairo 1.14.12 he6fea26_5 conda-forge
certifi 2018.8.24 py36_1001 conda-forge
cffi 1.11.5 py36h5e8e0c9_1 conda-forge
cftime 1.0.1 py36h7eb728f_0 conda-forge
chardet 3.0.4 py36_1003 conda-forge
click 7.0 py_0 conda-forge
click-plugins 1.0.4 py_0 conda-forge
cligj 0.5.0 py_0 conda-forge
cloudpickle 0.5.6 py_0 conda-forge
configobj 5.0.6 py_0 conda-forge
cryptography 2.3.1 py36hdffb7b8_0 conda-forge
cryptography-vectors 2.3.1 py36_1000 conda-forge
curl 7.61.0 h93b3f91_2 conda-forge
cytoolz 0.9.0.1 py36h470a237_1 conda-forge
dask 0.19.2 py_0 conda-forge
dask-core 0.19.2 py_0 conda-forge
distributed 1.23.2 py36_1 conda-forge
docutils 0.14 py36_1001 conda-forge
expat 2.2.5 hfc679d8_2 conda-forge
fontconfig 2.13.1 h65d0f4c_0 conda-forge
freetype 2.9.1 h6debe1e_4 conda-forge
freexl 1.0.5 h470a237_2 conda-forge
gdal 2.2.4 py36hb00a9d7_9 conda-forge
geos 3.6.2 hfc679d8_3 conda-forge
geotiff 1.4.2 h700e5ad_4 conda-forge
gettext 0.19.8.1 h5e8e0c9_1 conda-forge
giflib 5.1.4 h470a237_1 conda-forge
glib 2.55.0 h464dc38_2 conda-forge
h5netcdf 0.6.2 py_0 conda-forge
h5py 2.8.0 py36h7eb728f_3 conda-forge
hdf4 4.2.13 h951d187_2 conda-forge
hdf5 1.10.2 hc401514_2 conda-forge
heapdict 1.0.0 py36_1000 conda-forge
icu 58.2 hfc679d8_0 conda-forge
idna 2.7 py36_1002 conda-forge
intel-openmp 2019.0 118
jinja2 2.10 py_1 conda-forge
jmespath 0.9.3 py_1 conda-forge
jpeg 9c h470a237_1 conda-forge
json-c 0.12.1 h470a237_1 conda-forge
kealib 1.4.9 h0bee7d0_2 conda-forge
krb5 1.14.6 0 conda-forge
libdap4 3.19.1 h8fe5423_1 conda-forge
libffi 3.2.1 hfc679d8_5 conda-forge
libgcc-ng 7.2.0 hdf63c60_3 conda-forge
libgdal 2.2.4 hbd6f514_9 conda-forge
libgfortran 3.0.0 1 conda-forge
libgfortran-ng 7.2.0 hdf63c60_3 conda-forge
libiconv 1.15 h470a237_3 conda-forge
libkml 1.3.0 hccc92b1_8 conda-forge
libnetcdf 4.6.1 he6cff42_8 conda-forge
libpng 1.6.35 ha92aebf_2 conda-forge
libpq 9.6.3 0 conda-forge
libspatialite 4.3.0a hdfcc80b_23 conda-forge
libssh2 1.8.0 h5b517e9_2 conda-forge
libstdcxx-ng 7.2.0 hdf63c60_3 conda-forge
libtiff 4.0.9 he6b73bb_2 conda-forge
libuuid 2.32.1 h470a237_2 conda-forge
libxcb 1.13 h470a237_2 conda-forge
libxml2 2.9.8 h422b904_5 conda-forge
locket 0.2.0 py_2 conda-forge
markupsafe 1.0 py36h470a237_1 conda-forge
mkl 2019.0 118
mkl_fft 1.0.6 py36_0 conda-forge
mkl_random 1.0.1 py36_0 conda-forge
msgpack-python 0.5.6 py36h2d50403_3 conda-forge
ncurses 6.1 hfc679d8_1 conda-forge
netcdf4 1.4.1 py36h62672b6_0 conda-forge
numpy 1.15.0 py36h1b885b7_0
numpy-base 1.15.0 py36h3dfced4_0
olefile 0.46 py_0 conda-forge
openjpeg 2.3.0 h0e734dc_3 conda-forge
openssl 1.0.2p h470a237_0 conda-forge
packaging 18.0 py_0 conda-forge
pandas 0.23.4 py36hf8a1672_0 conda-forge
partd 0.3.8 py_1 conda-forge
pcre 8.41 hfc679d8_3 conda-forge
pillow 5.3.0 py36hc736899_0 conda-forge
pip 18.0 py36_1001 conda-forge
pixman 0.34.0 h470a237_3 conda-forge
poppler 0.67.0 h4d7e492_3 conda-forge
poppler-data 0.4.9 0 conda-forge
postgresql 9.6.3 4 conda-forge
proj4 4.9.3 h470a237_8 conda-forge
psutil 5.4.7 py36h470a237_1 conda-forge
pthread-stubs 0.4 h470a237_1 conda-forge
pycparser 2.19 py_0 conda-forge
pykdtree 1.3.1 py36h7eb728f_2 conda-forge
pyopenssl 18.0.0 py36_0 conda-forge
pyorbital 1.3.1 py_0 conda-forge
pyparsing 2.2.2 py_0 conda-forge
pyproj 1.9.5.1 py36h508ed2a_5 conda-forge
pyresample 1.10.2 py36hf8a1672_0 conda-forge
pysocks 1.6.8 py36_1002 conda-forge
pyspectral 0.8.3 py_0 conda-forge
python 3.6.6 h5001a0f_0 conda-forge
python-dateutil 2.7.3 py_0 conda-forge
python-geotiepoints 1.1.6 py36h24bf2e0_0 conda-forge
pytz 2018.5 py_0 conda-forge
pyyaml 3.13 py36h470a237_1 conda-forge
rasterio 1.0.8 py36h1b5fcde_0 conda-forge
readline 7.0 haf1bffa_1 conda-forge
requests 2.19.1 py36_1 conda-forge
s3transfer 0.1.13 py36_1001 conda-forge
satpy 0.9.1+307.g75e8cb86.dirty <pip>
scipy 1.1.0 py36hc49cb51_0
setuptools 40.4.3 py36_0 conda-forge
six 1.11.0 py36_1001 conda-forge
snuggs 1.4.1 py_1 conda-forge
sortedcontainers 2.0.5 py_0 conda-forge
sqlite 3.25.2 hb1c47c0_0 conda-forge
tblib 1.3.2 py_1 conda-forge
tk 8.6.8 ha92aebf_0 conda-forge
toolz 0.9.0 py_1 conda-forge
tornado 5.1.1 py36h470a237_0 conda-forge
tqdm 4.26.0 py_0 conda-forge
trollimage 1.5.7 py_0 conda-forge
trollsift 0.3.0 py_0 conda-forge
urllib3 1.23 py36_1 conda-forge
wheel 0.32.1 py36_0 conda-forge
xarray 0.10.9 py36_0 conda-forge
xerces-c 3.2.0 h5d6a6da_2 conda-forge
xorg-kbproto 1.0.7 h470a237_2 conda-forge
xorg-libice 1.0.9 h470a237_4 conda-forge
xorg-libsm 1.2.2 h8c8a85c_6 conda-forge
xorg-libx11 1.6.6 h470a237_0 conda-forge
xorg-libxau 1.0.8 h470a237_6 conda-forge
xorg-libxdmcp 1.1.2 h470a237_7 conda-forge
xorg-libxext 1.3.3 h470a237_4 conda-forge
xorg-libxrender 0.9.10 h470a237_2 conda-forge
xorg-renderproto 0.11.1 h470a237_2 conda-forge
xorg-xextproto 7.3.0 h470a237_2 conda-forge
xorg-xproto 7.0.31 h470a237_7 conda-forge
xz 5.2.4 h470a237_1 conda-forge
yaml 0.1.7 h470a237_1 conda-forge
zict 0.1.3 py_0 conda-forge
zlib 1.2.11 h470a237_3 conda-forge
@ralphk11 It would be very helpful if you could make a minimal code example, using pyspectral only, that produces the same error.
@ralphk11 It is not a sustainable solution, but you could set an environment variable pointing to a local customized pyspectral config file in which you tell pyspectral not to try to download. For example:
PSP_CONFIG_FILE=/home/a000680/pyspectral.yaml
And in there you could put
download_from_internet: False
You can read about interacting with this config file in the pyspectral documentation.
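A minimal sketch of that workaround done from Python rather than the shell. It assumes that pyspectral picks up PSP_CONFIG_FILE from the environment and merges the file with its built-in defaults; neither is verified here, so setting the variable before pyspectral is imported is the cautious choice. The temporary path is only a placeholder:

```python
import os
import tempfile

# Placeholder location for the custom config file; any writable path will do.
config_path = os.path.join(tempfile.gettempdir(), "pyspectral.yaml")

# Write the single setting suggested above.
with open(config_path, "w") as fh:
    fh.write("download_from_internet: False\n")

# Assumption: pyspectral looks up PSP_CONFIG_FILE, so set it before importing pyspectral.
os.environ["PSP_CONFIG_FILE"] = config_path
```

Any pyspectral import and the satpy script itself would then follow these lines in the same process.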
Of course, if you already have the latest version of the RSR files, it shouldn't try to download them more than once, even if download_from_internet is True.
But, please provide a minimal use case and I will do my best!
I am travelling until tonight and may not have time to look at it further before tomorrow.
Adam
@ralphk11 I tried with the code below and could not see any unwanted repeated downloading. Maybe you can try the code below, run it a few times, and see whether the behaviour is okay on your side as well?
import numpy as np
from pyspectral.rayleigh import Rayleigh
from pyspectral.utils import debug_on
debug_on()
msi = Rayleigh('Himawari-8', 'ahi', aerosol_type='marine_clean_aerosol')
sunz = np.array([[32., 40.], [31., 41.]])
satz = np.array([[45., 20.], [46., 21.]])
ssadiff = np.array([[110, 170], [120, 180]])
refl_cor_red = msi.get_reflectance(sunz, satz, ssadiff, 'B02')
@adybbroe I tried your example with pyspectral only and it only downloaded the files once. My test code with ABI data does not have the problem; however, the example with AHI as shown above still downloads rsr_ahi_Himawari-8.h5 every time.
@ralphk11 Ok, many thanks for verifying. So, perhaps a satpy or a satpy/pyspectral problem. I will try to see if I can get the same behaviour as you with AHI and satpy later this week.
@ralphk11 I am finally looking more in detail into this. I am able to run your example above.
I simplified it to focus only on the multiple download issue:
import os
from glob import glob
from satpy import Scene
from satpy.utils import debug_on
import dask as da
from multiprocessing.pool import ThreadPool
#DATADIR = "/data/kuehn/AHI_rt/DAT/two"
#CACHE_DIR = '/data/kuehn/AHI_rt/cache'
DATADIR = "/home/a000680/data/himawari8/201502070300"
CACHE_DIR = '/tmp'
debug_on()
da.config.set(pool=ThreadPool(4))
FILES = glob(os.path.join(DATADIR, "*DAT"))
scn = Scene(filenames=FILES, reader='ahi_hsd')
scn.load(['true_color'])
And indeed, after removing my local Himawari RSR LUTs (and the version-indicator file PYSPECTRAL_RSR_VERSION), it downloads twice. However, the second time I run it, no download is attempted. I am using conda on Linux with Python 3 and the latest satpy and pyspectral from conda-forge. Could you try upgrading your environment, run my stripped-down example code above, and tell me what the behaviour is on your side?
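To illustrate the kind of check a version-indicator file makes possible, here is a hypothetical sketch. It is not pyspectral's actual code: the function name rsr_download_needed, the version string "v1" and the file layout are all assumptions made for the example. A download is considered necessary only if the RSR file or the version marker is missing, or if the recorded version differs from the expected one:

```python
import os
import tempfile

VERSION_FILENAME = "PYSPECTRAL_RSR_VERSION"  # marker file name mentioned above


def rsr_download_needed(data_dir, rsr_filename, expected_version):
    """Hypothetical check: True if the local RSR data looks missing or outdated."""
    rsr_path = os.path.join(data_dir, rsr_filename)
    version_path = os.path.join(data_dir, VERSION_FILENAME)
    if not os.path.exists(rsr_path) or not os.path.exists(version_path):
        return True
    with open(version_path) as fh:
        return fh.read().strip() != expected_version


with tempfile.TemporaryDirectory() as data_dir:
    # Empty directory: a download would be needed.
    first_run = rsr_download_needed(data_dir, "rsr_ahi_Himawari-8.h5", "v1")
    # Simulate a completed download by creating the data file and the marker.
    open(os.path.join(data_dir, "rsr_ahi_Himawari-8.h5"), "w").close()
    with open(os.path.join(data_dir, VERSION_FILENAME), "w") as fh:
        fh.write("v1\n")
    second_run = rsr_download_needed(data_dir, "rsr_ahi_Himawari-8.h5", "v1")
    # A newer expected version makes the local copy count as outdated.
    outdated = rsr_download_needed(data_dir, "rsr_ahi_Himawari-8.h5", "v2")

print(first_run, second_run, outdated)  # True False True
```

With a check of this shape, a download on every run would point to the marker not being written, or not matching what the code expects, rather than to the data file itself being missing.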
I have been able to fix the double download issue, which is in pyspectral. That fix is currently being put in a PR.
Maybe @djhoese you could have a look as well?