Is it possible that pqdm passes over error messages
SilasK opened this issue · 9 comments
- Parallel TQDM version: 0.1.0
- Python version: 3.6
- Operating System: linux cluster
Description
I wanted to:
Read a chunk of a big file
Process it and save the chunk to a new file
Run this in parallel with pqdm.threads
What I Did
import pandas as pd
from pqdm.threads import pqdm

def process_chunck(genome):
    D = pd.read_hdf(input_tmp_file, where=f"Genome == {genome}")
    out = process ...
    out.to_parquet(output_file)

pqdm(all_genomes, process_chunck, n_jobs=threads)
There was a bug in my function process_chunck, but the exception was never raised.
What can I do to do better error handling with pqdm?
Regrettably, I've run into this issue too, and ended up writing a wrapper function akin to:
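from typing import Optional

def do_work_or_error(item) -> Optional[Exception]:
    # Return the exception instead of raising it, so the caller can inspect it.
    try:
        do_work(item)
    except Exception as e:
        return e
    return None
...
# pqdm takes the iterable first, then the function.
errors = pqdm(work, do_work_or_error, n_jobs=n_jobs)
assert not any(errors), [e for e in errors if e]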
@niedakh Also saw this in a place where it would be useful to interrupt execution. Would you take a PR to add a kwarg to enable raising the exceptions rather than returning them?
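To make the question concrete, here is a minimal sketch of what such a switch could look like. It is written against the standard library's `concurrent.futures` rather than pqdm itself, so it only illustrates the idea and does not show how pqdm works internally. The names `run_parallel`, `raise_errors`, and `flaky` are hypothetical and exist only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(items, function, n_jobs, raise_errors=False):
    """Hypothetical helper (not pqdm's API): map `function` over `items` in threads."""
    results = []
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        futures = [pool.submit(function, item) for item in items]
        for future in futures:
            try:
                results.append(future.result())
            except Exception as error:
                if raise_errors:
                    raise  # surface the worker's exception to the caller
                results.append(error)  # otherwise hand it back as a value
    return results


def flaky(x):
    # Stand-in for a worker function with a bug on one input.
    if x == 2:
        raise ValueError(f"bad input: {x}")
    return x * 10


# Default: the failure is returned in place of a result, so nothing is raised.
returned = run_parallel([1, 2, 3], flaky, n_jobs=2)
```

With `raise_errors=True` the same call raises the `ValueError` instead of placing it in the result list, which is the behaviour that would allow execution to be interrupted.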
I've done some work on this. PR to follow.
I just got a chance to work with the @dangercrow PR and the immediate behavior worked well for my use case!
Right, I've emailed @niedakh.
If that doesn't get a response, I'll tweet him or something in a few days.
It's in, thanks for the tweet. I'm pretty overworked these days, but it's merged now. I'll look into #2 in the morning and release a new pqdm later tomorrow.
Also, many thanks for the PR, for your work, and for using pqdm!