`n_jobs=-1` is not converted to use all cores; instead it silently fails to run any evaluations
iXanthos opened this issue · 5 comments
Greetings,
I am trying to run GAMA on a small dataset (180 samples, 8 features) with the default settings, but I receive a BrokenPipeError.
More specifically, this is the code I am running:
from gama import GamaClassifier
print("Started GAMA demo...")
automl = GamaClassifier(max_total_time=300, n_jobs=-1)
automl.fit(train_data, train_target)
predictions = automl.predict(test_data)
and the error I get:
(The following BrokenPipeError traceback appeared five times, interleaved with the main traceback below.)
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
    send_bytes(obj)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
  File "AutoML_init_tester.py", line 113, in <module>
    gama_demo(train_data, train_target, test_data, test_target)
  File "AutoML_init_tester.py", line 83, in gama_demo
    automl.fit(train_data, train_target)
  File "/home/ixanthos/Documents/gama_venv/lib/python3.8/site-packages/gama/GamaClassifier.py", line 134, in fit
    super().fit(x, y, *args, **kwargs)
  File "/home/ixanthos/Documents/gama_venv/lib/python3.8/site-packages/gama/gama.py", line 549, in fit
    self.model = self._post_processing.post_process(
  File "/home/ixanthos/Documents/gama_venv/lib/python3.8/site-packages/gama/postprocessing/best_fit.py", line 26, in post_process
    self._selected_individual = selection[0]
IndexError: list index out of range
Does the error mean that no model could be fitted in the allotted time (5 minutes), or does it mean something else?
Regards,
IX
Hi, thanks for reporting your issue! The BrokenPipeError does not necessarily mean that no model could be fit. It typically occurs when shutting down the subprocesses that evaluate pipelines, and it can normally be safely ignored (we're working on making sure it doesn't happen, though).
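For reference, here is a minimal, self-contained illustration (not GAMA's shutdown code) of how this kind of error arises on POSIX systems: something writes to a pipe whose receiving end, here standing in for an evaluation subprocess, has already gone away.
import multiprocessing as mp

# Sketch only, not GAMA's shutdown logic: sending over a pipe whose reading
# end is already closed raises BrokenPipeError, which is roughly what the
# queue's feeder thread hits when a subprocess exits before the send completes.
reader, writer = mp.Pipe(duplex=False)
reader.close()                          # the "subprocess" side goes away first
try:
    writer.send("evaluation result")    # BrokenPipeError: [Errno 32] Broken pipe
except BrokenPipeError as exc:
    print(f"harmless shutdown noise: {exc}")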
It's actually this little part which likely prevented predictions from being made:
    super().fit(x, y, *args, **kwargs)
  File "/home/ixanthos/Documents/gama_venv/lib/python3.8/site-packages/gama/gama.py", line 549, in fit
    self.model = self._post_processing.post_process(
  File "/home/ixanthos/Documents/gama_venv/lib/python3.8/site-packages/gama/postprocessing/best_fit.py", line 26, in post_process
    self._selected_individual = selection[0]
IndexError: list index out of range
Could you provide us with the data you used to generate this behavior?
Unfortunately, I cannot provide you with the data, as it is proprietary. However, I have tested the same data (same splits and all) in other AutoML frameworks, and so far only GAMA produces this error.
It's actually this little part which likely prevented predictions from being made:
    super().fit(x, y, *args, **kwargs)
  File "/home/ixanthos/Documents/gama_venv/lib/python3.8/site-packages/gama/gama.py", line 549, in fit
    self.model = self._post_processing.post_process(
  File "/home/ixanthos/Documents/gama_venv/lib/python3.8/site-packages/gama/postprocessing/best_fit.py", line 26, in post_process
    self._selected_individual = selection[0]
IndexError: list index out of range
I also saw this IndexError; does it mean that there was no fitted model, or is it something else?
Do you think this error can be fixed if I increase max_total_time?
Regards,
Iordanis
Yes, I suspect that no pipeline was successfully evaluated. This could mean either that there is something in the input data that GAMA doesn't deal with, or that it simply did not have enough time. Given how small the dataset is, I would rather expect the former. The logs (gama.log, evaluations.log) might reveal more specifically what the issue is, if you can share those.
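For what it's worth, here is a minimal sketch of how evaluations.log can be inspected programmatically (the path is a placeholder for GAMA's output directory; the file is semicolon-separated with, among others, a score and an error column):
import pandas as pd

# Sketch only: replace the path with your own GAMA output directory.
# evaluations.log is semicolon-separated; each row is one evaluated pipeline.
evals = pd.read_csv("path/to/gama_output/evaluations.log", sep=";")
print(f"{len(evals)} pipelines evaluated")
print(evals["error"].dropna())  # per-pipeline failure messages, if any
If the file contains only the header and no rows, no pipeline was evaluated at all.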
I am pasting the logs:
evaluations.log (header only)
id;pid;t_start;t_wallclock;t_process;score;pipeline;error;parent0;parent1;origin
gama.log
[2021-07-30 02:45:31,327 - gama.gama] Using GAMA version 21.0.0.
[2021-07-30 02:45:31,327 - gama.gama] INIT:GamaClassifier(scoring=neg_log_loss,regularize_length=True,max_pipeline_length=None,random_state=None,max_total_time=300,max_eval_time=None,n_jobs=-1,max_memory_mb=None,verbosity=30,search=AsyncEA(),post_processing=BestFitPostProcessing(),output_directory=gama_1b8b00a8-fcb2-4c7f-9af6-f6de86c35809,store=logs)
[2021-07-30 02:45:31,328 - gama.utilities.generic.timekeeper] START: preprocessing default
[2021-07-30 02:45:31,331 - gama.utilities.generic.timekeeper] STOP: preprocessing default after 0.0026s.
[2021-07-30 02:45:31,331 - gama.utilities.generic.timekeeper] START: search AsyncEA
[2021-07-30 02:45:31,339 - gama.utilities.generic.async_evaluator] Process 19297 starting -1 subprocesses.
[2021-07-30 02:45:31,339 - gama.search_methods.async_ea] Starting EA with new population.
[2021-07-30 02:50:00,379 - gama.utilities.generic.async_evaluator] Signaling 0 subprocesses to stop.
[2021-07-30 02:50:00,383 - gama.gama] Search phase evaluated 0 individuals.
[2021-07-30 02:50:00,384 - gama.utilities.generic.timekeeper] STOP: search AsyncEA after 269.0520s.
[2021-07-30 02:50:00,384 - gama.utilities.generic.timekeeper] START: postprocess BestFitPostProcessing
Regards
It looks like n_jobs=-1 is broken and no longer correctly sets the CPU count, which results in no evaluation processes being started (and thus no individuals being evaluated). I'll patch it, but because I'm on vacation I might not do an immediate PyPI release. In the meantime, please set n_jobs explicitly to the number of cores you want to use (or leave it None to use half of your cores); see the sketch below. Thanks again for raising the issue and providing me with the details to find the bug.
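For example, a minimal sketch of the workaround, requesting all cores explicitly via multiprocessing.cpu_count() instead of passing -1:
import multiprocessing

from gama import GamaClassifier

# Workaround sketch: pass an explicit core count instead of n_jobs=-1,
# or omit n_jobs to let GAMA default to half of the available cores.
automl = GamaClassifier(max_total_time=300, n_jobs=multiprocessing.cpu_count())
automl.fit(train_data, train_target)    # same data splits as in the snippet above
predictions = automl.predict(test_data)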
The bug should be fixed in the 21.0.1 release (not yet published).