client dashboard 404
msmicker opened this issue · 26 comments
With dask and dask.distributed 2.15.0 I now get 404 errors in the client dashboard. When downgrading, everything works fine. I am using a cluster (dask-jobqueue SGECluster).
Thank you for reporting. Were there any log messages that came out of the scheduler process?
Those will be helpful to determine what went wrong.
I still see this error with 2.15.1. @mrocklin apologies, but how do I check the distributed logs? I am not scheduling anything yet; I simply start the cluster and client and then open the dashboard.
from dask_jobqueue import SGECluster
from dask.distributed import Client
cluster = SGECluster(...)
cluster.adapt(minimum_jobs=5, maximum_jobs=150)
client = Client(cluster)
client.get_scheduler_logs()
(('INFO', 'distributed.scheduler - INFO - Clear task state'),
('INFO',
'distributed.scheduler - INFO - Scheduler at: tcp://157.206.230.212:38448'),
('INFO',
'distributed.scheduler - INFO - dashboard at: :8787'),
('INFO',
'distributed.scheduler - INFO - Receive client connection: Client-cc91b07d-89c3-11ea-9dc8-94f128bfa670'),
('INFO',
"distributed.scheduler - INFO - Register worker <Worker 'tcp://157.206.230.55:42049', name: 3, memory: 0, processing: 0>"),
('INFO',
'distributed.scheduler - INFO - Starting worker compute stream, tcp://157.206.230.55:42049'),
('INFO',
"distributed.scheduler - INFO - Register worker <Worker 'tcp://157.206.230.76:34078', name: 0, memory: 0, processing: 0>"),
('INFO',
'distributed.scheduler - INFO - Starting worker compute stream, tcp://157.206.230.76:34078'),
('INFO',
"distributed.scheduler - INFO - Register worker <Worker 'tcp://157.206.230.76:33739', name: 1, memory: 0, processing: 0>"),
('INFO',
'distributed.scheduler - INFO - Starting worker compute stream, tcp://157.206.230.76:33739'),
('INFO',
"distributed.scheduler - INFO - Register worker <Worker 'tcp://157.206.230.76:32793', name: 2, memory: 0, processing: 0>"),
('INFO',
'distributed.scheduler - INFO - Starting worker compute stream, tcp://157.206.230.76:32793'),
('INFO',
"distributed.scheduler - INFO - Register worker <Worker 'tcp://157.206.230.55:34488', name: 4, memory: 0, processing: 0>"),
('INFO',
'distributed.scheduler - INFO - Starting worker compute stream, tcp://157.206.230.55:34488'))
client.get_worker_logs()
{'tcp://157.206.230.55:34488': (('INFO',
'distributed.worker - INFO - Start worker at: tcp://157.206.230.55:34488'),
('INFO',
'distributed.worker - INFO - Listening to: tcp://157.206.230.55:34488'),
('INFO',
'distributed.worker - INFO - dashboard at: 157.206.230.55:34286'),
('INFO',
'distributed.worker - INFO - Waiting to connect to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Threads: 1'),
('INFO',
'distributed.worker - INFO - Memory: 24.00 GB'),
('INFO',
'distributed.worker - INFO - Local Directory: '),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Registered to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------')),
'tcp://157.206.230.55:42049': (('INFO',
'distributed.worker - INFO - Start worker at: tcp://157.206.230.55:42049'),
('INFO',
'distributed.worker - INFO - Listening to: tcp://157.206.230.55:42049'),
('INFO',
'distributed.worker - INFO - dashboard at: 157.206.230.55:39319'),
('INFO',
'distributed.worker - INFO - Waiting to connect to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Threads: 1'),
('INFO',
'distributed.worker - INFO - Memory: 24.00 GB'),
('INFO',
'distributed.worker - INFO - Local Directory: '),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Registered to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------')),
'tcp://157.206.230.76:32793': (('INFO',
'distributed.worker - INFO - Start worker at: tcp://157.206.230.76:32793'),
('INFO',
'distributed.worker - INFO - Listening to: tcp://157.206.230.76:32793'),
('INFO',
'distributed.worker - INFO - dashboard at: 157.206.230.76:34934'),
('INFO',
'distributed.worker - INFO - Waiting to connect to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Threads: 1'),
('INFO',
'distributed.worker - INFO - Memory: 24.00 GB'),
('INFO',
'distributed.worker - INFO - Local Directory: '),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Registered to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------')),
'tcp://157.206.230.76:33739': (('INFO',
'distributed.worker - INFO - Start worker at: tcp://157.206.230.76:33739'),
('INFO',
'distributed.worker - INFO - Listening to: tcp://157.206.230.76:33739'),
('INFO',
'distributed.worker - INFO - dashboard at: 157.206.230.76:37945'),
('INFO',
'distributed.worker - INFO - Waiting to connect to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Threads: 1'),
('INFO',
'distributed.worker - INFO - Memory: 24.00 GB'),
('INFO',
'distributed.worker - INFO - Local Directory: '),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Registered to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------')),
'tcp://157.206.230.76:34078': (('INFO',
'distributed.worker - INFO - Start worker at: tcp://157.206.230.76:34078'),
('INFO',
'distributed.worker - INFO - Listening to: tcp://157.206.230.76:34078'),
('INFO',
'distributed.worker - INFO - dashboard at: 157.206.230.76:38661'),
('INFO',
'distributed.worker - INFO - Waiting to connect to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Threads: 1'),
('INFO',
'distributed.worker - INFO - Memory: 24.00 GB'),
('INFO',
'distributed.worker - INFO - Local Directory: '),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'),
('INFO',
'distributed.worker - INFO - Registered to: tcp://157.206.230.212:38448'),
('INFO',
'distributed.worker - INFO - -------------------------------------------------'))}
With SGECluster there may be a silence_logs=False keyword option. I don't remember exactly though.
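Something roughly like this, though treat it as a sketch only; I'm not certain SGECluster forwards that keyword, so check the dask-jobqueue docs (the cores/memory values below are placeholders):

from dask_jobqueue import SGECluster

# Sketch: if SGECluster accepts silence_logs (via the underlying SpecCluster),
# silence_logs=False keeps scheduler/worker log output visible rather than
# suppressing it. cores/memory are placeholder values.
cluster = SGECluster(cores=1, memory="24 GB", silence_logs=False)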
@mrocklin just checking whether the logs I added above are enough, or should I try for more (e.g. with silence_logs=False)?
I have the same problem. I'm using dask-jobqueue with an HTCondor cluster.
@msmicker @lenaWitterauf can you try with 2.15.2? It includes a second fix that others have reported success with.
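If it helps, here is a quick illustrative sanity check that the environment you launch from actually picked up the upgrade:

import dask
import distributed

# Illustrative version check before retrying the dashboard
print("dask", dask.__version__)
print("distributed", distributed.__version__)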
@TomAugspurger I just updated dask distributed to 2.15.2 and the dashboard worked! Thanks a lot :-)
Same, all is good with 2.15.2, thanks!
For anyone stumbling upon this issue by searching for dask dashboard 404:
You will see 404 if you try to run it in an environment without bokeh installed.
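A quick illustrative way to confirm from the scheduler's environment:

# Illustrative check: the dashboard serves 404 pages when bokeh cannot be
# imported in the scheduler's environment.
try:
    import bokeh
    print("bokeh", bokeh.__version__, "is installed")
except ImportError:
    print("bokeh is missing; install it (e.g. pip install bokeh) to enable the dashboard")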
Thanks for engaging @elanmart. There's an open PR (xref dask/dask#6215) to add a note to the docs about getting 404 errors when bokeh isn't installed. Hopefully that will help others running into this issue.
@jrbourbeau oops, in that case apologies for adding noise, thanks for the reference!
@dankerrigan you might find this interesting
For anyone stumbling upon this issue by searching for dask dashboard 404: You will see 404 if you try to run it in an environment without bokeh installed.
I have bokeh installed, but unfortunately I'm still getting 404: Not Found when visiting the dashboard address.
Same here:
$ /deploy/env/bin/pip freeze | grep bokeh
bokeh==2.1.0
$ curl localhost:8787
<html><title>404: Not Found</title><body>404: Not Found</body></html>
If it helps, this reliably reproduces it for me:
$ virtualenv -p python3 dask
$ source dask/bin/activate
$ pip install 'dask[distributed,diagnostics]'
$ pip freeze | grep bokeh
bokeh==2.1.0
$ pip freeze | grep dask
dask==2.18.1
$ dask-scheduler
# localhost:8787 now 404s
Yes, reproduced. It looks like an internal function in bokeh moved in bokeh 2.1.0. I'll resolve this on our end. In the meantime, I recommend downgrading to bokeh 2.0.
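If you want a guard in the meantime, something along these lines works (an illustrative check, not part of distributed; it assumes the packaging library is available):

# Illustrative guard: bokeh 2.1.0 breaks the dashboard until distributed is
# patched, so fail fast if the environment has drifted past 2.0.x.
# Downgrade with: pip install "bokeh<2.1"
import bokeh
from packaging.version import Version

assert Version(bokeh.__version__) < Version("2.1"), (
    "bokeh %s is known to 404 the dashboard; downgrade below 2.1" % bokeh.__version__
)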
In the future, I also recommend raising a new issue rather than commenting on an old one; although the symptoms are the same as this old issue, the cause is entirely new.
Thanks! Apologies, the issue seemed pretty fresh.
No need to apologize, and thank you for providing explicit steps to reproduce. This is a serious issue and I thank you for catching and reporting it.
I'm seeing this as well, and I've got bokeh installed (distributed 2.18.0).
Also, shouldn't this issue be reopened?
@CMCDragonkai I'd recommend opening a new issue with a minimal reproducer
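Something along these lines is usually enough as a starting point (a sketch only; swap in your own cluster setup and include your dask/distributed/bokeh versions):

from dask.distributed import Client, LocalCluster

# Minimal reproducer sketch: start a local cluster, print the dashboard
# address, and report what the browser shows at that URL.
cluster = LocalCluster()
client = Client(cluster)
print(client.dashboard_link)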

