user conda environments
We've now set up staging.nasa.pangeo.io to allow users to create their own conda environments
(see https://github.com/pangeo-data/pangeo-cloud-federation/blob/staging/deployments/nasa/config/common.yaml#L34).
However, I'm currently running into "The environment is inconsistent" and hanging "solving environment" issues with conda in our image. I noticed that /srv/conda/.condarc has the following config:
channels:
- conda-forge
- defaults
auto_update_conda: false
show_channel_urls: true
update_dependencies: false
I'm wondering whether update_dependencies: false is causing trouble. It comes from repo2docker (https://github.com/jupyter/repo2docker/blob/9099def40a331df04ba3ed862ee27a8e4a77fe43/repo2docker/buildpacks/conda/install-miniconda.bash#L39).
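As a side note, one way to confirm which file each of these settings comes from inside the running image is standard conda config introspection (a quick sketch):
# Show every config file conda reads and which settings each one defines
conda config --show-sources
# Show the effective (merged) values for the settings in question
conda config --show | grep -E "auto_update_conda|update_dependencies|channel_priority"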
I also noticed we end up with a mix of packages from conda-forge, defaults, and pypi currently, which I guess is originating from pangeo-stacks:
https://github.com/pangeo-data/pangeo-stacks/blob/master/base-notebook/binder/environment.yml
So... @yuvipanda, @jhamman:
- Why is update_dependencies: false?
- Should we change pangeo-stacks to just use conda-forge?
@ocefpaf, any insight here?
I'm running into "The environment is inconsistent" and hanging "solving environment" issues with conda currently though in our image. I noticed that /srv/conda/.condarc has the following config:
Can you try updating conda in your env and activating the strict channel option?
channel_priority: strict
That should help with the "hanging" and inconsistent environments. And if you do find any errors with it, they will be easier to debug.
PS: please check this gist for more on the .condarc options.
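Concretely, applying that suggestion from inside the image might look something like this (a sketch; the conda version you end up with depends on the configured channels):
# Update conda itself in the base environment
conda update -n base conda
# Turn on strict channel priority; conda config writes this to the user's ~/.condarc
conda config --set channel_priority strict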
We've now set up staging.nasa.pangeo.io to allow users to create their own conda environments
BTW, the options
auto_update_conda: false
update_dependencies: false
are very useful for managing Dockerfiles but quite bad for local users. If you want users to manage their own envs, I would remove those options. A Dockerfile admin knows when to update them, but average users should try to stay as up to date as possible. Conda is still evolving, and the options that make it faster and more consistent are being updated/added constantly. The strict channel option, for example, will be the default in conda 4.7.
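For an interactive user session, those Dockerfile-oriented defaults can be overridden at the user level without rebuilding the image, for example (a sketch; conda config writes these to ~/.condarc, which takes precedence over the image-wide /srv/conda/.condarc for these keys — the /home/jovyan/.condarc posted later in this thread relies on exactly that):
# Re-enable the interactive-friendly behavior for this user only
conda config --set auto_update_conda true
conda config --set update_dependencies true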
Thanks @ocefpaf! This is great information.
After adding your suggestions, things proceed but definitely still feel slow. In case we want to modify our base environment coming from Docker, I'm copying the results of a conda update --all with the strict channel requirement below. Also see the bottom for packages currently installed via pypi (this includes dask, so @jhamman - should we try to modify pangeo-stacks?).
(base) jovyan@jupyter-scottyhq:~$ conda update --all
Collecting package metadata: done
Solving environment: |
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:
- conda-forge/linux-64::ipykernel==5.1.0=py36h24bf2e0_1002
- conda-forge/label/broken/noarch::jupyter_client==5.2.4=py_1
- conda-forge/noarch::jupyterlab_launcher==0.13.1=py_2
- conda-forge/label/broken/linux-64::notebook==5.7.4=py36_1000
- conda-forge/linux-64::awscli==1.16.149=py36_0
- conda-forge/noarch::boto3==1.9.139=py_0
- conda-forge/noarch::botocore==1.12.139=py_0
- conda-forge/noarch::dask==1.1.1=py_0
- conda-forge/noarch::datacube==1.6.2=py_1
- conda-forge/noarch::datashader==0.6.9=py_0
- conda-forge/noarch::geocube==0.0.2=py_0
- conda-forge/noarch::intake==0.4.4=py_0
- conda-forge/noarch::intake-xarray==0.3.0=py_0
- conda-forge/linux-64::matplotlib==3.0.3=py36_1
- conda-forge/linux-64::matplotlib-base==3.0.3=py36h5f35d83_1
- conda-forge/linux-64::nb_conda_kernels==2.2.1=py36_0
- conda-forge/noarch::regionmask==0.4.0=py_0
- conda-forge/noarch::rioxarray==0.0.3=py_0
- conda-forge/noarch::xarray==0.12.1=py_0
- conda-forge/noarch::alembic==1.0.8=py_0
- conda-forge/linux-64::bokeh==1.0.4=py36_1000
- conda-forge/linux-64::cartopy==0.17.0=py36h0aa2c8f_1004
- conda-forge/linux-64::climlab==0.7.3=py36h4c70da7_0
- conda-forge/noarch::dask-glm==0.1.0=0
- conda-forge/noarch::dask-jobqueue==0.4.1=py_0
- conda-forge/noarch::dask-kubernetes==0.7.0=py_0
- conda-forge/noarch::dask-ml==0.12.0=py_0
- conda-forge/noarch::datashape==0.5.4=py_1
- conda-forge/noarch::descartes==1.1.0=py_3
- conda-forge/noarch::geopandas==0.4.1=py_1
- conda-forge/noarch::geoviews==1.6.2=py_0
- conda-forge/noarch::geoviews-core==1.6.2=py_0
- conda-forge/noarch::holoviews==1.11.3=py_0
- conda-forge/noarch::hvplot==0.4.0=py_1
- conda-forge/noarch::intake-esm==2019.2.28=py_1
- conda-forge/linux-64::ipyleaflet==0.10.1=py36_0
- conda-forge/noarch::ipywidgets==7.4.2=py_0
- conda-forge/linux-64::iris==2.2.0=py36_1003
- conda-forge/linux-64::jupyterhub==0.9.6=py36_0
- conda-forge/noarch::mapclassify==2.0.1=py_0
- conda-forge/linux-64::metpy==0.10.0=py36_1001
- conda-forge/noarch::nbserverproxy==0.8.8=py_1000
- conda-forge/noarch::owslib==0.17.1=py_0
- conda-forge/linux-64::pandas==0.24.2=py36hf484d3e_0
- conda-forge/noarch::panel==0.4.0=1
- conda-forge/noarch::pyspectral==0.8.7=py_0
- conda-forge/linux-64::python-geotiepoints==1.1.7=py36h3010b51_0
- conda-forge/linux-64::python-kubernetes==4.0.0=py36_1
- conda-forge/noarch::s3fs==0.2.1=py_0
- conda-forge/linux-64::s3transfer==0.2.0=py36_0
- conda-forge/noarch::satpy==0.14.1=pyh326bf55_0
- conda-forge/linux-64::scikit-image==0.14.2=py36hf484d3e_1
- conda-forge/noarch::trollimage==1.7.0=py_0
- conda-forge/linux-64::widgetsnbextension==3.4.2=py36_1000
- conda-forge/linux-64::xesmf==0.1.1=py36_1
- conda-forge/noarch::xgcm==0.2.0=py_0
- conda-forge/noarch::xrft==0.2.0=py_0
- conda-forge/noarch::jupyter==1.0.0=py_2
- conda-forge/noarch::jupyter_console==6.0.0=py_0
- conda-forge/linux-64::jupyterlab==0.35.4=py36_0
- conda-forge/noarch::jupyterlab_server==0.2.0=py_0
- conda-forge/noarch::qtconsole==4.4.3=py_0
done
Full Output
## Package Plan ##
environment location: /srv/conda
The following packages will be downloaded:
package | build
---------------------------|-----------------
atk-2.25.90 | hb9dd440_1002 430 KB conda-forge
attrs-19.1.0 | py_0 32 KB conda-forge
binutils_impl_linux-64-2.31.1| h6176602_1 16.5 MB defaults
binutils_linux-64-2.31.1 | h6176602_3 9 KB defaults
colorama-0.4.1 | py_0 15 KB conda-forge
conda-env-2.6.0 | 1 2 KB conda-forge
cryptography-2.5 | py36hb7f436b_1 645 KB conda-forge
curl-7.64.0 | h646f8bb_0 143 KB conda-forge
dask-core-1.2.1 | py_0 534 KB conda-forge
dbus-1.13.6 | he372182_0 602 KB conda-forge
fiona-1.8.6 | py36hf242f0b_0 1.1 MB conda-forge
gcc_impl_linux-64-7.3.0 | habb00fd_1 73.2 MB conda-forge
gcc_linux-64-7.3.0 | h553295d_3 10 KB conda-forge
gdal-2.4.0 |py36h1c6dbfb_1002 1.3 MB conda-forge
gdk-pixbuf-2.36.12 | h49783d7_1002 592 KB conda-forge
giflib-5.1.9 | h516909a_0 108 KB conda-forge
gobject-introspection-1.58.2|py36h2da5eee_1000 1.2 MB conda-forge
graphviz-2.40.1 | h0dab3d1_0 6.8 MB conda-forge
grpcio-1.16.0 |py36h4f00d22_1000 1.0 MB conda-forge
gsw-3.3.1 | py36h516909a_0 1.8 MB conda-forge
gtk2-2.24.31 | hb68c50a_1001 7.3 MB conda-forge
gxx_impl_linux-64-7.3.0 | hdf63c60_1 18.7 MB conda-forge
gxx_linux-64-7.3.0 | h553295d_3 9 KB conda-forge
jsonschema-3.0.1 | py36_0 84 KB conda-forge
keras-2.1.6 | py36_0 500 KB conda-forge
keras-applications-1.0.7 | py_0 30 KB conda-forge
keras-preprocessing-1.0.9 | py_0 32 KB conda-forge
kiwisolver-1.1.0 | py36hc9558a2_0 86 KB conda-forge
krb5-1.16.3 | hc83ff2d_1000 1.4 MB conda-forge
libcurl-7.64.0 | h01ee5af_0 586 KB conda-forge
libedit-3.1.20170329 | hf8c457e_1001 172 KB conda-forge
libgdal-2.4.0 | h982c1cc_1002 18.5 MB conda-forge
libpq-10.6 | h13b8bad_1000 2.5 MB conda-forge
libssh2-1.8.0 | h1ad7b7a_1003 246 KB conda-forge
libxml2-2.9.9 | h13577e0_0 2.0 MB conda-forge
lz4-2.1.6 |py36hd79334b_1001 37 KB conda-forge
lz4-c-1.8.3 | he1b5a44_1001 187 KB conda-forge
mock-2.0.0 | py36_1001 106 KB conda-forge
nbconvert-5.5.0 | py_0 375 KB conda-forge
netcdf4-1.5.1 | py36had58050_0 535 KB conda-forge
nodejs-11.14.0 | he1b5a44_0 16.6 MB conda-forge
openssl-1.0.2r | h14c3975_0 3.1 MB conda-forge
pandoc-2.7.2 | 0 21.7 MB conda-forge
pbr-5.1.3 | py_0 70 KB conda-forge
pcre-8.41 | hf484d3e_1003 249 KB conda-forge
pooch-0.3.1 | py36_0 26 KB conda-forge
postgresql-10.6 | h66cca7a_1000 4.7 MB conda-forge
prometheus_client-0.6.0 | py_0 34 KB conda-forge
psycopg2-2.7.7 | py36hb7f436b_0 305 KB conda-forge
pycurl-7.43.0.2 | py36hb7f436b_0 60 KB defaults
pyqt-5.6.0 |py36h13b7fb3_1008 5.4 MB conda-forge
pyrsistent-0.15.1 | py36h516909a_0 88 KB conda-forge
python-3.6.7 | hd21baee_1002 34.6 MB conda-forge
qt-5.6.2 | hce4f676_1013 44.6 MB conda-forge
rasterio-1.0.22 | py36h5b3f9e8_0 8.2 MB conda-forge
shapely-1.6.4 |py36h2afed24_1004 330 KB conda-forge
sip-4.18.1 |py36hf484d3e_1000 277 KB conda-forge
tensorboard-1.13.1 | py36_0 3.3 MB conda-forge
tensorflow-1.13.1 | py36_0 77.2 MB conda-forge
tensorflow-estimator-1.13.0| py_0 205 KB defaults
terminado-0.8.2 | py36_0 23 KB conda-forge
testpath-0.4.2 | py_1001 85 KB conda-forge
theano-1.0.4 |py36hf484d3e_1000 3.6 MB conda-forge
websocket-client-0.56.0 | py36_0 58 KB conda-forge
zarr-2.3.1 | py36_0 223 KB conda-forge
------------------------------------------------------------
Total: 384.2 MB
The following NEW packages will be INSTALLED:
atk conda-forge/linux-64::atk-2.25.90-hb9dd440_1002
binutils_impl_lin~ pkgs/main/linux-64::binutils_impl_linux-64-2.31.1-h6176602_1
binutils_linux-64 pkgs/main/linux-64::binutils_linux-64-2.31.1-h6176602_3
gcc_impl_linux-64 conda-forge/linux-64::gcc_impl_linux-64-7.3.0-habb00fd_1
gcc_linux-64 conda-forge/linux-64::gcc_linux-64-7.3.0-h553295d_3
gdk-pixbuf conda-forge/linux-64::gdk-pixbuf-2.36.12-h49783d7_1002
gobject-introspec~ conda-forge/linux-64::gobject-introspection-1.58.2-py36h2da5eee_1000
gtk2 conda-forge/linux-64::gtk2-2.24.31-hb68c50a_1001
gxx_impl_linux-64 conda-forge/linux-64::gxx_impl_linux-64-7.3.0-hdf63c60_1
gxx_linux-64 conda-forge/linux-64::gxx_linux-64-7.3.0-h553295d_3
mock conda-forge/linux-64::mock-2.0.0-py36_1001
pbr conda-forge/noarch::pbr-5.1.3-py_0
tensorflow-estima~ pkgs/main/noarch::tensorflow-estimator-1.13.0-py_0
zstd conda-forge/linux-64::zstd-1.3.3-1
The following packages will be UPDATED:
attrs 18.2.0-py_0 --> 19.1.0-py_0
blas 2.5-openblas --> 2.8-openblas
cffi 1.12.2-py36hf0e25f4_1 --> 1.12.3-py36h8022711_0
colorama 0.3.9-py_1 --> 0.4.1-py_0
dask-core 1.1.1-py_0 --> 1.2.1-py_0
dbus 1.13.0-h4e0c4b3_1000 --> 1.13.6-he372182_0
decorator 4.3.2-py_0 --> 4.4.0-py_0
distributed 1.25.3-py36_0 --> 1.27.1-py36_0
giflib 5.1.7-h516909a_1 --> 5.1.9-h516909a_0
graphviz 2.38.0-hf68f40c_1011 --> 2.40.1-h0dab3d1_0
gsw 3.3.0-py36h14c3975_0 --> 3.3.1-py36h516909a_0
ipython 7.1.1-py36h24bf2e0_1000 --> 7.5.0-py36h24bf2e0_0
jedi 0.13.2-py36_1000 --> 0.13.3-py36_0
jinja2 2.10-py_1 --> 2.10.1-py_0
jsonschema 3.0.0a3-py36_1000 --> 3.0.1-py36_0
keras-applications 1.0.4-py_1 --> 1.0.7-py_0
keras-preprocessi~ 1.0.2-py_1 --> 1.0.9-py_0
kiwisolver 1.0.1-py36h6bb024c_1002 --> 1.1.0-py36hc9558a2_0
libblas 3.8.0-5_openblas --> 3.8.0-8_openblas
libcblas 3.8.0-5_openblas --> 3.8.0-8_openblas
libedit pkgs/main::libedit-3.1.20170329-h6b74~ --> conda-forge::libedit-3.1.20170329-hf8c457e_1001
libffi 3.2.1-hf484d3e_1005 --> 3.2.1-he1b5a44_1006
libgcc-ng 7.3.0-hdf63c60_0 --> 8.2.0-hdf63c60_1
liblapack 3.8.0-5_openblas --> 3.8.0-8_openblas
liblapacke 3.8.0-5_openblas --> 3.8.0-8_openblas
libstdcxx-ng 7.3.0-hdf63c60_0 --> 8.2.0-hdf63c60_1
libtiff 4.0.10-h648cc4a_1001 --> 4.0.10-h9022e91_1002
libxml2 2.9.8-h143f9aa_1005 --> 2.9.9-h13577e0_0
lz4 2.1.6-py36ha8eefa0_1000 --> 2.1.6-py36hd79334b_1001
lz4-c 1.8.1.2-0 --> 1.8.3-he1b5a44_1001
markupsafe 1.1.0-py36h14c3975_1000 --> 1.1.1-py36h14c3975_0
nbconvert conda-forge/label/broken::nbconvert-5~ --> conda-forge::nbconvert-5.5.0-py_0
netcdf4 1.5.0.1-py36had58050_0 --> 1.5.1-py36had58050_0
nodejs 11.11.0-hf484d3e_0 --> 11.14.0-he1b5a44_0
numpy 1.16.2-py36h8b7e671_1 --> 1.16.3-py36he5ce36f_0
openblas 0.3.5-h9ac9557_1001 --> 0.3.6-h6e990d7_1
pandoc 1.19.2-0 --> 2.7.2-0
parso 0.3.3-py_0 --> 0.4.0-py_0
pexpect 4.6.0-py36_1000 --> 4.7.0-py36_0
pooch 0.2.1-py36_1000 --> 0.3.1-py36_0
prometheus_client 0.5.0-py_0 --> 0.6.0-py_0
prompt_toolkit 2.0.8-py_0 --> 2.0.9-py_0
psutil 5.6.1-py36h14c3975_0 --> 5.6.2-py36h516909a_0
ptyprocess conda-forge/linux-64::ptyprocess-0.6.~ --> conda-forge/noarch::ptyprocess-0.6.0-py_1001
pyqt 4.11.4-py36_3 --> 5.6.0-py36h13b7fb3_1008
pyrsistent 0.14.10-py36h14c3975_0 --> 0.15.1-py36h516909a_0
pyyaml 3.13-py36h14c3975_1001 --> 5.1-py36h14c3975_0
pyzmq 17.1.2-py36h6afc9c9_1001 --> 18.0.1-py36hc4ba49a_1
qt pkgs/free::qt-4.8.7-2 --> conda-forge::qt-5.6.2-hce4f676_1013
setuptools 40.8.0-py36_0 --> 41.0.1-py36_0
shapely 1.6.4-py36h2afed24_1003 --> 1.6.4-py36h2afed24_1004
sip 4.18-py36_1 --> 4.18.1-py36hf484d3e_1000
sqlite 3.26.0-h67949de_1000 --> 3.26.0-h67949de_1001
tensorboard 1.10.0-py36_0 --> 1.13.1-py36_0
tensorflow 1.10.0-py36_0 --> 1.13.1-py36_0
terminado 0.8.1-py36_1001 --> 0.8.2-py36_0
testpath conda-forge/linux-64::testpath-0.3.1-~ --> conda-forge/noarch::testpath-0.4.2-py_1001
theano 1.0.3-py36_0 --> 1.0.4-py36hf484d3e_1000
tk 8.6.9-h84994c4_1000 --> 8.6.9-h84994c4_1001
tornado 5.1.1-py36h14c3975_1000 --> 6.0.2-py36h516909a_0
urllib3 1.24.1-py36_1000 --> 1.24.2-py36_0
websocket-client 0.40.0-py36_0 --> 0.56.0-py36_0
wheel 0.32.3-py36_0 --> 0.33.1-py36_0
yaml pkgs/main::yaml-0.1.7-had09818_2 --> conda-forge::yaml-0.1.7-h14c3975_1001
zarr conda-forge/noarch::zarr-2.2.0-py_1 --> conda-forge/linux-64::zarr-2.3.1-py36_0
zeromq 4.2.5-hf484d3e_1006 --> 4.3.1-hf484d3e_1000
The following packages will be SUPERSEDED by a higher-priority channel:
conda-env pkgs/main/linux-64 --> conda-forge/noarch
grpcio pkgs/main::grpcio-1.16.1-py36hf8bcb03~ --> conda-forge::grpcio-1.16.0-py36h4f00d22_1000
pcre pkgs/main::pcre-8.43-he6710b0_0 --> conda-forge::pcre-8.41-hf484d3e_1003
The following packages will be DOWNGRADED:
cryptography 2.6.1-py36h72c5cf5_0 --> 2.5-py36hb7f436b_1
curl 7.64.1-hf8cf82a_0 --> 7.64.0-h646f8bb_0
fiona 1.8.6-py36hf242f0b_3 --> 1.8.6-py36hf242f0b_0
gdal 2.4.1-py36hf242f0b_0 --> 2.4.0-py36h1c6dbfb_1002
keras 2.2.4-py36_0 --> 2.1.6-py36_0
krb5 1.16.3-h05b26f9_1001 --> 1.16.3-hc83ff2d_1000
libcurl 7.64.1-hda55be3_0 --> 7.64.0-h01ee5af_0
libgdal 2.4.1-hdb8f723_0 --> 2.4.0-h982c1cc_1002
libpq 11.2-h4770945_0 --> 10.6-h13b8bad_1000
libssh2 1.8.2-h22169c7_2 --> 1.8.0-h1ad7b7a_1003
openssl 1.1.1b-h14c3975_1 --> 1.0.2r-h14c3975_0
postgresql 11.2-h61314c7_0 --> 10.6-h66cca7a_1000
psycopg2 2.8.2-py36h72c5cf5_0 --> 2.7.7-py36hb7f436b_0
pycurl 7.43.0.2-py36h1ba5d50_0 --> 7.43.0.2-py36hb7f436b_0
python 3.6.7-h381d211_1004 --> 3.6.7-hd21baee_1002
rasterio 1.0.22-py36h5b3f9e8_1 --> 1.0.22-py36h5b3f9e8_0
(base) jovyan@jupyter-scottyhq:~$ conda list | grep pypi
alembic 1.0.7 pypi_0 pypi
bokeh 1.1.0 pypi_0 pypi
cachetools 3.1.0 pypi_0 pypi
click 7.0 pypi_0 pypi
dask 1.2.0 pypi_0 pypi
dask-labextension 0.3.1 pypi_0 pypi
distributed 1.27.0 pypi_0 pypi
heapdict 1.0.0 pypi_0 pypi
intake-stac 0+untagged.28.g661390e pypi_0 pypi
jupyterhub 0.9.4 pypi_0 pypi
kubernetes 9.0.0 pypi_0 pypi
mako 1.0.7 pypi_0 pypi
mercantile 1.0.4 pypi_0 pypi
msgpack 0.6.1 pypi_0 pypi
msgpack-python 0.5.6 pypi_0 pypi
nteract-on-jupyter 2.0.0 pypi_0 pypi
pillow 6.0.0 pypi_0 pypi
pyasn1 0.4.5 pypi_0 pypi
pyasn1-modules 0.2.4 pypi_0 pypi
python-dateutil 2.7.5 pypi_0 pypi
python-editor 1.0.4 pypi_0 pypi
python-oauth2 1.1.0 pypi_0 pypi
pyyaml 5.1 pypi_0 pypi
rio-cogeo 1.0.0 pypi_0 pypi
rsa 4.0 pypi_0 pypi
sat-search 0.2.0 pypi_0 pypi
sat-stac 0.1.2 pypi_0 pypi
sqlalchemy 1.2.17 pypi_0 pypi
supermercado 0.0.5 pypi_0 pypi
tblib 1.3.2 pypi_0 pypi
toolz 0.9.0 pypi_0 pypi
websocket-client 0.56.0 pypi_0 pypi
zict 0.1.4 pypi_0 pypi
After adding your suggestions things proceed but definitely still feel slow. In case we want to modify our base environment coming from Docker, I'm copying the results of a conda update --all with the strict channel requirement below
conda is getting faster, but some big envs are still slow to solve, especially when trying to update on top of an existing env. I usually recommend never updating: just remove the old env and re-create it. (At least until conda's solver gets better.)
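For a named environment, that remove-and-recreate workflow might look roughly like this (env and file names are illustrative):
# Keep a spec of what you want (ideally a hand-maintained, loosely pinned environment.yml)
conda env export -n my-env > environment.yml
# Drop the old environment and rebuild it from the spec
conda env remove -n my-env
conda env create -n my-env -f environment.yml
Note that re-creating from an exact export just reproduces the old versions; to actually pick up newer packages, re-create from a loosely pinned spec instead.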
Also see the bottom for packages currently installed via pypi (includes dask so @jhamman - should we try to modify pangeo-stacks?)
I'm not familiar with the current pangeo dependencies, but I know that in the past they relied on development versions of some packages. Maybe those are a "pip install from master"? If that is not the case, I would love to work with the pangeo group to get all the packages you need into conda-forge.
Thanks again. Pinging @rabernat and adding a couple of links to related issues:
conda-forge packages: pangeo-data/pangeo-stacks#23
image size: pangeo-data/pangeo-stacks#22
Also, one more comment confirming that because we've set dask workers to have the same home directory, you can launch a KubeCluster that matches a user-created conda environment with the snippet below. Not sure how to get this incorporated with the dask jupyterlab extension (@ian-r-rose, @mrocklin).
import os  # needed for the client.run(lambda: os.environ) check below
from dask_kubernetes import KubeCluster
from dask.distributed import Client

cluster = KubeCluster(env={'PATH': '/home/jovyan/my-conda-envs/dask-minimal/bin:$PATH'})
cluster.scale(2)
client = Client(cluster)
check worker environments with
client.get_versions(check=True)
client.run(lambda: os.environ)
Just documenting that for persistent user-defined conda environments, we currently must place this config file at /home/jovyan/.condarc:
# Override Dockerfile conda settings
channel_priority: strict
channels:
- conda-forge
- defaults
auto_update_conda: true
show_channel_urls: true
update_dependencies: true
auto_activate_base: false
envs_dirs:
- /home/jovyan/my-conda-envs/
create_default_packages:
- ipykernel
- blas=*=openblas
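With that .condarc in place, user-created environments land on the NFS-backed home volume and therefore persist across sessions. For illustration (package list is hypothetical; the env name matches the dask-minimal env used with KubeCluster above):
# Created under /home/jovyan/my-conda-envs/ thanks to envs_dirs;
# ipykernel is pulled in automatically via create_default_packages
conda create -n dask-minimal -c conda-forge python=3.6 dask distributed
conda activate dask-minimal
# The new env should show up under the user-writable envs_dirs path
conda env list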
And dask_config.yaml is currently:
https://github.com/pangeo-data/pangeo-cloud-federation/blob/staging/deployments/nasa/image/binder/dask_config.yaml
distributed:
  logging:
    bokeh: critical
  dashboard:
    link: /user/{JUPYTERHUB_USER}/proxy/{port}/status
  admin:
    tick:
      limit: 5s

kubernetes:
  name: dask-{JUPYTERHUB_USER}-{uuid}
  worker-template:
    spec:
      nodeSelector:
        alpha.eksctl.io/nodegroup-name: dask-worker
      restartPolicy: Never
      containers:
        - name: dask-${JUPYTERHUB_USER}
          image: ${JUPYTER_IMAGE_SPEC}
          args:
            - dask-worker
            - --local-directory
            - /home/jovyan/dask-worker-space
            - --nthreads
            - '2'
            - --no-bokeh
            - --memory-limit
            - 7GB
            - --death-timeout
            - '60'
          resources:
            limits:
              cpu: "1.75"
              memory: 7G
            requests:
              cpu: 1
              memory: 7G
          volumeMounts:
            - name: nfs
              mountPath: /home/jovyan
              subPath: "nasa.pangeo.io/home/${JUPYTERHUB_USER}"
      volumes:
        - name: nfs
          persistentVolumeClaim:
            claimName: home-nfs

labextension:
  factory:
    module: dask_kubernetes
    class: KubeCluster
    args: []
    kwargs: {}
And another update... since this change to repo2docker (jupyterhub/repo2docker#651), containers seem to have a new entrypoint script: https://github.com/jupyter/repo2docker/blob/80b979f8580ddef184d2ba7d354e7a833cfa38a4/repo2docker/buildpacks/conda/activate-conda.sh
So to share the currently active conda environment in a notebook with dask workers, launch a cluster with:
import sys
cluster = KubeCluster(env={'NB_PYTHON_PREFIX': sys.prefix})
Not sure this is the 'best' way to use different conda environments among workers with dask-kubernetes. In particular, this setup means each worker is accessing the same python files under /home/jovyan/myenv via NFS instead of making a local copy of the conda environment. Thoughts @mrocklin and @TomAugspurger?
Looking into this now.
@scottyhq can you test whether including
kubernetes:
  env:
    NB_PYTHON_PREFIX: $NB_PYTHON_PREFIX
works / breaks anything? You can remove the env={'NB_PYTHON_PREFIX': sys.prefix}. It looks like dask_kubernetes will expand environment variables (which I think will evaluate to the correct thing in the notebook) before passing them through.