pytroll/pyspectral

Re-download of RSR files

ralphk11 opened this issue · 9 comments

from glob import glob
from satpy import Scene
from pyresample.geometry import AreaDefinition
from pyresample.utils import proj4_str_to_dict
from pyproj import Proj
from satpy.utils import debug_on
import dask as da
from multiprocessing.pool import ThreadPool
from datetime import datetime


def EPSG_4326_definition(ll_ur=None, resolution=2.0):
    # ll_ur is (lon, lat, lon, lat)
    area_id = 'epsg4326'
    description = 'EPSG:4326'
    proj_id = 'epsg4326'
    #projection = '+proj=eqc +lat_ts=0 +lat_0=0 +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m'
    projection = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
    proj4_dict = proj4_str_to_dict(projection)

    #resolution = 2.0
    y_size = 20480 / resolution  # Divide by two for 2km resolution
    x_size = 40960 / resolution  # ditto!

    ll_ur_ref = (-180.0, -90.0, 180.0, 90.0)
    
    area_extent = ll_ur

    y_size = int((ll_ur[3] - ll_ur[1]) / (ll_ur_ref[3] - ll_ur_ref[1]) * y_size)
    x_size = int((ll_ur[2] - ll_ur[0]) / (ll_ur_ref[2] - ll_ur_ref[0]) * x_size)
    print(y_size, x_size)

    return(AreaDefinition(area_id, description, proj_id, proj4_dict, x_size, y_size, area_extent))
    

debug_on()

da.config.set(pool=ThreadPool(4))

FILES = glob("/data/kuehn/AHI_rt/DAT/two/*DAT")
scn = Scene(filenames=FILES, reader='ahi_hsd')

scn.load(['true_color'])
#scn['true_color'].data.chunks

areadef = EPSG_4326_definition(ll_ur=(80.0, -10, 180.0, 30))

local_scene = scn.resample(areadef, cache_dir='/data/kuehn/AHI_rt/cache')
local_scene.save_dataset('true_color', filename='./local_true_color_crefl.tif', GDAL_OPTIONS=['COMPRESS=JPEG', 'PHOTOMETRIC=YCBCR', 'TILED=YES'])

Problem description
The RSR data is downloaded every time I run the script, even though the download succeeds and the files are stored in a local directory with write permission. Once stored, the data should not be downloaded again.
Note that I've modified /satpy/etc/enhancements/generic.yaml such that crefl_scaling is the default true_color enhancement.

Actual Result, Traceback if applicable

[INFO: 2018-10-10 21:29:45 : pyspectral.rayleigh] Atmosphere chosen: us-standard
[DEBUG: 2018-10-10 21:29:45 : pyspectral.rayleigh] LUT filename: /home/kuehn/.local/share/pyspectral/marine_clean_aerosol/rayleigh_lut_us-standard.h5
[DEBUG: 2018-10-10 21:29:45 : pyspectral.rsr_reader] Filename: /home/kuehn/.local/share/pyspectral/rsr_ahi_Himawari-8.h5
[WARNING: 2018-10-10 21:29:45 : pyspectral.rsr_reader] rsr data may not be up to date: /home/kuehn/.local/share/pyspectral/rsr_ahi_Himawari-8.h5
[INFO: 2018-10-10 21:29:45 : pyspectral.rsr_reader] Will download from internet...
[INFO: 2018-10-10 21:29:45 : pyspectral.utils] Download RSR files and store in directory /home/kuehn/.local/share/pyspectral
[DEBUG: 2018-10-10 21:29:45 : pyspectral.utils] Get data. URL: https://zenodo.org/record/1409621/files/pyspectral_rsr_data.tgz
[DEBUG: 2018-10-10 21:29:45 : pyspectral.utils] Destination = /home/kuehn/.local/share/pyspectral
[DEBUG: 2018-10-10 21:29:45 : urllib3.connectionpool] Starting new HTTPS connection (1): zenodo.org:443
[DEBUG: 2018-10-10 21:29:46 : urllib3.connectionpool] https://zenodo.org:443 "GET /record/1409621/files/pyspectral_rsr_data.tgz HTTP/1.1" 200 2949478
2949478it [00:04, 611429.75it/s]
[DEBUG: 2018-10-10 21:29:53 : pyspectral.rsr_reader] Filename: /home/kuehn/.local/share/pyspectral/rsr_ahi_Himawari-8.h5

Versions of Python, package at hand and relevant dependencies
packages in environment at /home/kuehn/miniconda3/envs/satpy3:

 Name                    Version                   Build  Channel
affine                    2.2.1                      py_0    conda-forge
appdirs                   1.4.3                      py_1    conda-forge
asn1crypto                0.24.0                py36_1003    conda-forge
attrs                     18.2.0                     py_0    conda-forge
blas                      1.0                         mkl  
bokeh                     0.13.0                   py36_0    conda-forge
boost-cpp                 1.67.0               h3a22d5f_0    conda-forge
boto3                     1.9.16                     py_0    conda-forge
botocore                  1.12.17                    py_0    conda-forge
bottleneck                1.2.1            py36h7eb728f_1    conda-forge
bzip2                     1.0.6                h470a237_2    conda-forge
ca-certificates           2018.8.24            ha4d7672_0    conda-forge
cairo                     1.14.12              he6fea26_5    conda-forge
certifi                   2018.8.24             py36_1001    conda-forge
cffi                      1.11.5           py36h5e8e0c9_1    conda-forge
cftime                    1.0.1            py36h7eb728f_0    conda-forge
chardet                   3.0.4                 py36_1003    conda-forge
click                     7.0                        py_0    conda-forge
click-plugins             1.0.4                      py_0    conda-forge
cligj                     0.5.0                      py_0    conda-forge
cloudpickle               0.5.6                      py_0    conda-forge
configobj                 5.0.6                      py_0    conda-forge
cryptography              2.3.1            py36hdffb7b8_0    conda-forge
cryptography-vectors      2.3.1                 py36_1000    conda-forge
curl                      7.61.0               h93b3f91_2    conda-forge
cytoolz                   0.9.0.1          py36h470a237_1    conda-forge
dask                      0.19.2                     py_0    conda-forge
dask-core                 0.19.2                     py_0    conda-forge
distributed               1.23.2                   py36_1    conda-forge
docutils                  0.14                  py36_1001    conda-forge
expat                     2.2.5                hfc679d8_2    conda-forge
fontconfig                2.13.1               h65d0f4c_0    conda-forge
freetype                  2.9.1                h6debe1e_4    conda-forge
freexl                    1.0.5                h470a237_2    conda-forge
gdal                      2.2.4            py36hb00a9d7_9    conda-forge
geos                      3.6.2                hfc679d8_3    conda-forge
geotiff                   1.4.2                h700e5ad_4    conda-forge
gettext                   0.19.8.1             h5e8e0c9_1    conda-forge
giflib                    5.1.4                h470a237_1    conda-forge
glib                      2.55.0               h464dc38_2    conda-forge
h5netcdf                  0.6.2                      py_0    conda-forge
h5py                      2.8.0            py36h7eb728f_3    conda-forge
hdf4                      4.2.13               h951d187_2    conda-forge
hdf5                      1.10.2               hc401514_2    conda-forge
heapdict                  1.0.0                 py36_1000    conda-forge
icu                       58.2                 hfc679d8_0    conda-forge
idna                      2.7                   py36_1002    conda-forge
intel-openmp              2019.0                      118  
jinja2                    2.10                       py_1    conda-forge
jmespath                  0.9.3                      py_1    conda-forge
jpeg                      9c                   h470a237_1    conda-forge
json-c                    0.12.1               h470a237_1    conda-forge
kealib                    1.4.9                h0bee7d0_2    conda-forge
krb5                      1.14.6                        0    conda-forge
libdap4                   3.19.1               h8fe5423_1    conda-forge
libffi                    3.2.1                hfc679d8_5    conda-forge
libgcc-ng                 7.2.0                hdf63c60_3    conda-forge
libgdal                   2.2.4                hbd6f514_9    conda-forge
libgfortran               3.0.0                         1    conda-forge
libgfortran-ng            7.2.0                hdf63c60_3    conda-forge
libiconv                  1.15                 h470a237_3    conda-forge
libkml                    1.3.0                hccc92b1_8    conda-forge
libnetcdf                 4.6.1                he6cff42_8    conda-forge
libpng                    1.6.35               ha92aebf_2    conda-forge
libpq                     9.6.3                         0    conda-forge
libspatialite             4.3.0a              hdfcc80b_23    conda-forge
libssh2                   1.8.0                h5b517e9_2    conda-forge
libstdcxx-ng              7.2.0                hdf63c60_3    conda-forge
libtiff                   4.0.9                he6b73bb_2    conda-forge
libuuid                   2.32.1               h470a237_2    conda-forge
libxcb                    1.13                 h470a237_2    conda-forge
libxml2                   2.9.8                h422b904_5    conda-forge
locket                    0.2.0                      py_2    conda-forge
markupsafe                1.0              py36h470a237_1    conda-forge
mkl                       2019.0                      118  
mkl_fft                   1.0.6                    py36_0    conda-forge
mkl_random                1.0.1                    py36_0    conda-forge
msgpack-python            0.5.6            py36h2d50403_3    conda-forge
ncurses                   6.1                  hfc679d8_1    conda-forge
netcdf4                   1.4.1            py36h62672b6_0    conda-forge
numpy                     1.15.0           py36h1b885b7_0  
numpy-base                1.15.0           py36h3dfced4_0  
olefile                   0.46                       py_0    conda-forge
openjpeg                  2.3.0                h0e734dc_3    conda-forge
openssl                   1.0.2p               h470a237_0    conda-forge
packaging                 18.0                       py_0    conda-forge
pandas                    0.23.4           py36hf8a1672_0    conda-forge
partd                     0.3.8                      py_1    conda-forge
pcre                      8.41                 hfc679d8_3    conda-forge
pillow                    5.3.0            py36hc736899_0    conda-forge
pip                       18.0                  py36_1001    conda-forge
pixman                    0.34.0               h470a237_3    conda-forge
poppler                   0.67.0               h4d7e492_3    conda-forge
poppler-data              0.4.9                         0    conda-forge
postgresql                9.6.3                         4    conda-forge
proj4                     4.9.3                h470a237_8    conda-forge
psutil                    5.4.7            py36h470a237_1    conda-forge
pthread-stubs             0.4                  h470a237_1    conda-forge
pycparser                 2.19                       py_0    conda-forge
pykdtree                  1.3.1            py36h7eb728f_2    conda-forge
pyopenssl                 18.0.0                   py36_0    conda-forge
pyorbital                 1.3.1                      py_0    conda-forge
pyparsing                 2.2.2                      py_0    conda-forge
pyproj                    1.9.5.1          py36h508ed2a_5    conda-forge
pyresample                1.10.2           py36hf8a1672_0    conda-forge
pysocks                   1.6.8                 py36_1002    conda-forge
pyspectral                0.8.3                      py_0    conda-forge
python                    3.6.6                h5001a0f_0    conda-forge
python-dateutil           2.7.3                      py_0    conda-forge
python-geotiepoints       1.1.6            py36h24bf2e0_0    conda-forge
pytz                      2018.5                     py_0    conda-forge
pyyaml                    3.13             py36h470a237_1    conda-forge
rasterio                  1.0.8            py36h1b5fcde_0    conda-forge
readline                  7.0                  haf1bffa_1    conda-forge
requests                  2.19.1                   py36_1    conda-forge
s3transfer                0.1.13                py36_1001    conda-forge
satpy                     0.9.1+307.g75e8cb86.dirty           <pip>
scipy                     1.1.0            py36hc49cb51_0  
setuptools                40.4.3                   py36_0    conda-forge
six                       1.11.0                py36_1001    conda-forge
snuggs                    1.4.1                      py_1    conda-forge
sortedcontainers          2.0.5                      py_0    conda-forge
sqlite                    3.25.2               hb1c47c0_0    conda-forge
tblib                     1.3.2                      py_1    conda-forge
tk                        8.6.8                ha92aebf_0    conda-forge
toolz                     0.9.0                      py_1    conda-forge
tornado                   5.1.1            py36h470a237_0    conda-forge
tqdm                      4.26.0                     py_0    conda-forge
trollimage                1.5.7                      py_0    conda-forge
trollsift                 0.3.0                      py_0    conda-forge
urllib3                   1.23                     py36_1    conda-forge
wheel                     0.32.1                   py36_0    conda-forge
xarray                    0.10.9                   py36_0    conda-forge
xerces-c                  3.2.0                h5d6a6da_2    conda-forge
xorg-kbproto              1.0.7                h470a237_2    conda-forge
xorg-libice               1.0.9                h470a237_4    conda-forge
xorg-libsm                1.2.2                h8c8a85c_6    conda-forge
xorg-libx11               1.6.6                h470a237_0    conda-forge
xorg-libxau               1.0.8                h470a237_6    conda-forge
xorg-libxdmcp             1.1.2                h470a237_7    conda-forge
xorg-libxext              1.3.3                h470a237_4    conda-forge
xorg-libxrender           0.9.10               h470a237_2    conda-forge
xorg-renderproto          0.11.1               h470a237_2    conda-forge
xorg-xextproto            7.3.0                h470a237_2    conda-forge
xorg-xproto               7.0.31               h470a237_7    conda-forge
xz                        5.2.4                h470a237_1    conda-forge
yaml                      0.1.7                h470a237_1    conda-forge
zict                      0.1.3                      py_0    conda-forge
zlib                      1.2.11               h470a237_3    conda-forge

Thank you for reporting an issue!

Thanks @ralphk11, I am looking into it. It might overlap with #38.
I will get back to you!

@ralphk11 It would be very helpful if you could make a minimal code example, using pyspectral only, that produces the same error.

@ralphk11 It is not a sustainable solution, but you could set an environment variable pointing to a local, customized pyspectral config file in which you tell pyspectral not to attempt a download. For example:
PSP_CONFIG_FILE=/home/a000680/pyspectral.yaml

And in there you could put
download_from_internet: False

You can read about interacting with this config file in the pyspectral documentation.

Of course, if you already have the latest version of the RSR files, pyspectral should not try to download them more than once, even if the above variable is True.
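The workaround above can be sketched in Python as follows. This is only a minimal illustration: the temporary directory is a hypothetical location, and the file contains nothing but the one key quoted in this thread. Whether pyspectral fills in the remaining settings from its built-in defaults when given such a partial file is an assumption that may depend on the installed version, so check the pyspectral documentation before relying on it.

```python
import os
import tempfile

# Hypothetical location for the custom config; any writable path would do.
config_dir = tempfile.mkdtemp(prefix="pyspectral_cfg_")
config_path = os.path.join(config_dir, "pyspectral.yaml")

# Write only the setting mentioned in this thread.
with open(config_path, "w") as fobj:
    fobj.write("download_from_internet: False\n")

# Point pyspectral at the file. Setting the variable before pyspectral is
# imported is the cautious choice (assumption: the config is read early).
os.environ["PSP_CONFIG_FILE"] = config_path

# Read the file back to confirm what was written.
with open(config_path) as fobj:
    written = fobj.read()
print(config_path)
print(written)
```

The same effect can be had by exporting PSP_CONFIG_FILE in the shell before starting Python; doing it from the script just keeps the test case in one file.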

But, please provide a minimal use case and I will do my best!

I am travelling until tonight and may not have time to look at it further before tomorrow.
Adam

@ralphk11 I tried with the code below and could not see any unwanted repeated downloading. Could you run it a few times and check whether it also behaves correctly on your side?

import numpy as np
from pyspectral.rayleigh import Rayleigh
from pyspectral.utils import debug_on
debug_on()

msi = Rayleigh('Himawari-8', 'ahi', aerosol_type='marine_clean_aerosol')

sunz = np.array([[32., 40.], [31., 41.]])
satz = np.array([[45., 20.], [46., 21.]])
ssadiff = np.array([[110, 170], [120, 180]])

refl_cor_red = msi.get_reflectance(sunz, satz, ssadiff, 'B02')

@ralphk11 What do you want me to do with this?

@adybbroe I tried your pyspectral-only example and it downloaded the files only once. My test code with ABI data does not have the problem either; however, the AHI example shown above still downloads rsr_ahi_Himawari-8.h5 every time.

@ralphk11 OK, many thanks for verifying. So it is perhaps a satpy or a satpy/pyspectral problem. I will try to see if I can reproduce your behaviour with AHI and satpy later this week.

@ralphk11 I am finally looking into this in more detail. I am able to run your example above.

I simplified it to focus only on the multiple download issue:

import os
from glob import glob
from satpy import Scene
from satpy.utils import debug_on
import dask as da
from multiprocessing.pool import ThreadPool

#DATADIR = "/data/kuehn/AHI_rt/DAT/two"
#CACHE_DIR = '/data/kuehn/AHI_rt/cache'
DATADIR = "/home/a000680/data/himawari8/201502070300"
CACHE_DIR = '/tmp'

debug_on()

da.config.set(pool=ThreadPool(4))

FILES = glob(os.path.join(DATADIR, "*DAT"))
scn = Scene(filenames=FILES, reader='ahi_hsd')
scn.load(['true_color'])

And indeed, after removing my local Himawari RSR LUTs (and the version-indicator file PYSPECTRAL_RSR_VERSION), it downloads twice. However, the second time I run it, no download is attempted. I was using conda on Linux with Python 3 and the latest satpy and pyspectral from conda-forge. Could you try upgrading your environment, run my stripped-down example code above, and tell me what the behaviour is on your side?
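To compare the state of the local RSR directory before and after a run, a small helper like the one below may be handy. It is a hypothetical sketch and not part of pyspectral: the function name `describe_rsr_cache` is made up, and it only relies on the file names that appear in this thread (PYSPECTRAL_RSR_VERSION and rsr_*.h5). The version string written in the demo is a placeholder, since the real file's content is not shown here.

```python
import os
import tempfile
from glob import glob


def describe_rsr_cache(rsr_dir):
    """Return (version string or None, sorted list of rsr_*.h5 file names)."""
    version_file = os.path.join(rsr_dir, "PYSPECTRAL_RSR_VERSION")
    version = None
    if os.path.exists(version_file):
        with open(version_file) as fobj:
            version = fobj.read().strip()
    pattern = os.path.join(rsr_dir, "rsr_*.h5")
    rsr_files = sorted(os.path.basename(path) for path in glob(pattern))
    return version, rsr_files


# Demo on a throwaway directory so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp(prefix="rsr_demo_")
empty_state = describe_rsr_cache(demo_dir)

# Placeholder version content; the real file's format is an assumption.
with open(os.path.join(demo_dir, "PYSPECTRAL_RSR_VERSION"), "w") as fobj:
    fobj.write("placeholder-version\n")
# Empty stand-in for a downloaded RSR file.
open(os.path.join(demo_dir, "rsr_ahi_Himawari-8.h5"), "w").close()

filled_state = describe_rsr_cache(demo_dir)
print(empty_state)
print(filled_state)
```

On a real system the function would be pointed at the directory shown in the debug log, for example `describe_rsr_cache(os.path.expanduser("~/.local/share/pyspectral"))`, once before and once after running the script.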

I have been able to fix the double-download issue, which is in pyspectral. That fix is currently being put in a PR.

Maybe @djhoese you could have a look as well?