Open dask monitoring interface?
davidlandry93 opened this issue · 7 comments
I see there is an example of how to use dask_distributed in this doc.
Is there a way to open the monitoring interface? Usually it's made available on port 8787. As far as I understand, this is impossible without port forwarding, unless the scheduler is run on a head node?
Cheers,
David
Hi @davidlandry93 ,
When I was using dask_distributed, I did not use the monitoring interface, and I think it would be difficult to do so with the example I wrote, because Jean Zay typically prohibits port forwarding.
That being said, I want to add 2 things:
- with JupyterHub, there might be a way to use Dask in a much more convenient way. I am not aware of one and haven't looked into it, but if you do, please let us know.
- I would try to avoid using the dask_distributed example, for the following reason: there is a process that monitors all the CPU processes on the front nodes and kills them if they exceed a certain CPU time limit (I don't know what that limit is). The scheduler will therefore be killed at some point, because it pings the workers quite often and will reach this limit pretty quickly. This was not in place when I created the example. I would now suggest using something more tailored to SLURM, like submitit (a minimal sketch is below).
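For reference, a minimal sketch of what the submitit route could look like (the partition name, time limit, and function are placeholders, not actual Jean Zay settings):

```python
import submitit

def preprocess(path):
    # Placeholder for the actual work you want to run on a compute node.
    return path

# AutoExecutor writes the SLURM submission scripts and logs into this folder.
executor = submitit.AutoExecutor(folder="submitit_logs")
# Placeholders: use the partition and time limit matching your allocation.
executor.update_parameters(timeout_min=60, slurm_partition="cpu_p1")

job = executor.submit(preprocess, "some_file.nc")
print(job.result())  # blocks until the SLURM job finishes
```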
Hi. I can confirm that you should be able to use the Dask dashboard from JupyterHub.
Thanks to both of you
with JupyterHub, there might be a way to use Dask in a much more convenient way. I am not aware of one and haven't looked into it, but if you do, please let us know.
I did make some progress. Using the Jupyter launcher does not work: if I click on the "Dask" button, whatever port I input, I get an error page when it launches.
However, if I go to a terminal inside JupyterHub and start the Dask cluster myself, I am able to access it through the Jupyter proxy (e.g. https://jupyterhub.idris.fr/user/ugd42cy/jupyterlab_1/proxy/8787/status/).
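For reference, this is roughly what I run before opening that URL (a minimal sketch; the worker count is arbitrary, and the dashboard port has to match the one in the proxy URL):

```python
from dask.distributed import Client, LocalCluster

# Started from inside JupyterHub; the dashboard port (8787 here) must match
# the port used in the .../proxy/<port>/status URL above.
cluster = LocalCluster(n_workers=4, dashboard_address=":8787")
client = Client(cluster)
print(cluster.dashboard_link)
```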
I would try to avoid using the dask_distributed example, for the following reason: there is a process that monitors all the CPU processes on the front nodes and kills them if they exceed a certain CPU time limit (I don't know what that limit is). The scheduler will therefore be killed at some point, because it pings the workers quite often and will reach this limit pretty quickly. This was not in place when I created the example. I would now suggest using something more tailored to SLURM, like submitit.
I avoid this problem by creating a JupyterHub that lives on a compute node rather than a head node
Using the Jupyter launcher does not work. If I click on the "Dask" button, whatever port I input
That's because it is just a launcher for the Dask dashboard itself, not the Dask cluster.
Indeed, the procedure to start a Dask cluster is:
- Inside the Jupyter server page, click the Dask icon in the left toolbar
- Launch a cluster using the side menu
- Once the cluster is started, click the "Dask" button in the launcher tab to open the monitoring interface (see also the config note below)
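One extra note (an assumption based on Dask's generic dashboard-link templating, not something I have verified on Jean Zay specifically): you can configure Dask so that the dashboard links it generates go through the JupyterHub proxy, which makes the links shown by cluster widgets directly clickable:

```python
import dask

# {port} is filled in by Dask; JUPYTERHUB_SERVICE_PREFIX is expanded from the
# environment variable set by JupyterHub for each single-user server.
dask.config.set(
    {"distributed.dashboard.link": "{JUPYTERHUB_SERVICE_PREFIX}proxy/{port}/status"}
)
```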
Thanks all for your help
I am interested in Dask usage on Jean Zay, so if you manage to get something useful for your research working, don't hesitate to share!
Also, just out of curiosity: I see that you do climate science, so does that mean you use packages like xarray?
I would try to avoid using the dask_distributed example, for the following reason: there is a process that monitors all the CPU processes on the front nodes and kills them if they exceed a certain CPU time limit (I don't know what that limit is)
As far as I remember, the limit is 30 minutes of CPU time, so for a process like dask-scheduler that uses ~5% of a CPU (last time I checked), that corresponds to roughly 10 hours of wall-clock time (30 min / 0.05 = 600 min). After that, if you run dask-scheduler on the login node it will get killed and you will lose the work done by your workers (unless you have some mechanism to save partial progress).
There was this old issue dask/dask-jobqueue#471 with more details but no real solution.
@lesteve Yes, I use xarray quite a bit! I do all my data preparation in it and then convert the data to PyTorch tensors. Feel free to shoot me an email if you ever come to INRIA Paris ;)