ProjectPythia/interactive-sentinel-2-cookbook

Best practice for creating a local dask cluster?

Closed this issue · 2 comments

I'm not sure that creating a local cluster with n_workers=os.cpu_count() is best practice. First, os.cpu_count() can return None, and it's not clear what n_workers=None would do. Second, os.cpu_count() returns the number of logical CPUs on the system, which may not be the same as the number of CPUs available to the process. Thus len(os.sched_getaffinity(0)) might be a better choice on platforms where it is available. However, though the Dask documentation on LocalCluster is incomplete, the best choice may simply be to let LocalCluster figure out the value itself. See discussion here.

cluster = LocalCluster(n_workers=os.cpu_count())
client = Client(cluster)
client
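For illustration, here is a minimal sketch of the two alternatives mentioned above. The helper names `available_cpus` and `start_cluster` are hypothetical and not part of the cookbook. The fallback to 1 when the count is unknown is an assumption, and passing `n_workers=None` assumes that this is LocalCluster's default and that it then picks the worker count itself.

```python
import os


def available_cpus() -> int:
    """Return the number of CPUs this process may use (hypothetical helper)."""
    if hasattr(os, "sched_getaffinity"):
        # Only present on some Unix platforms; respects the CPU affinity mask.
        return len(os.sched_getaffinity(0))
    # os.cpu_count() can return None, so fall back to 1 (an assumption).
    return os.cpu_count() or 1


def start_cluster(n_workers=None):
    """Start a local Dask cluster and return a connected client."""
    # Imported lazily so available_cpus() works without dask installed.
    from dask.distributed import Client, LocalCluster

    # n_workers=None leaves the choice of worker count to LocalCluster.
    cluster = LocalCluster(n_workers=n_workers)
    return Client(cluster)
```

Calling `start_cluster()` leaves the decision to Dask, while `start_cluster(n_workers=available_cpus())` sets the count explicitly from the CPUs the process is allowed to use.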


I hadn't thought about this behavior of os.cpu_count() earlier; thanks for your review. I have opened a PR (#6) addressing this issue.

Fixed by #6