How to avoid fails in parallel

Question

VEZcoding opened this issue 2 years ago · 2 comments

VEZcoding commented 2 years ago

First of all thanks for this package. It is super helpful for me and my workflow.

I was wondering whether you have any tips on how to avoid failures in the parallel workflow.

Most of the time when I run it, I get errors like this:
ValueError: n_samples=64 should be >= n_clusters=100.

or the same just for batches.

Do you have any simple tips to point me in the right direction for overcoming this problem?

Keep up the good work :)

Answer 1 · 2023-06-07T04:35:24.000Z

Could you share more details on the configuration used to run Mango when you get these errors? We do not need details of the objective function, so those can be abstracted away.

Answer 2 · 2023-11-29T13:37:04.000Z

Hello, bringing this up again because I encountered the same error as above. I use the following configuration for param_space:

dict(profile_size=range(10, 20), distance_metric=['euclidean', 'dtw', 'cc'])

and the error is the following:
ValueError: n_samples=7 should be >= n_clusters=12.

when using n_jobs=12. I suppose n_clusters depends on the number of jobs, but what is the relation between n_samples and the shape of param_space, and how should n_jobs be determined?
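
For anyone landing here with the same error: the wording matches the check scikit-learn's KMeans performs when it is given fewer samples than clusters. The sketch below assumes, without confirmation from the maintainers in this thread, that the parallel mode clusters candidate configurations into as many groups as there are jobs, so the batch size has to stay at or below the number of available candidates. `count_configurations` and `cap_batch_size` are hypothetical helpers written for illustration; they are not part of Mango.

```python
# Search space copied from the comment above.
param_space = dict(
    profile_size=range(10, 20),
    distance_metric=['euclidean', 'dtw', 'cc'],
)


def count_configurations(space):
    """Count the distinct points in a purely discrete search space."""
    total = 1
    for values in space.values():
        total *= len(values)
    return total


def cap_batch_size(requested, n_candidates):
    """Hypothetical helper: never request more clusters than candidates."""
    return max(1, min(requested, n_candidates))


# 10 profile sizes times 3 distance metrics.
n_configs = count_configurations(param_space)
print("distinct configurations:", n_configs)

# With only 7 candidate samples, 12 clusters cannot be formed,
# so the batch size is reduced to the number of candidates.
print("safe batch size:", cap_batch_size(12, 7))
```

Under that assumption, a small discrete space like this one leaves few untried candidates after some iterations, so a lower n_jobs (or a larger search space) would be the first things to try.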