niedakh/pqdm

Is it possible that pqdm passes over error messages?

SilasK opened this issue · 9 comments

  • Parallel TQDM version: 0.1.0
  • Python version: 3.6
  • Operating System: linux cluster

Description

I wanted to:
Read chunks of a big file
Process each chunk and save it to a new file
Run this in parallel with pqdm.threads

What I Did

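        import pandas as pd
        from pqdm.threads import pqdm

        def process_chunk(genome):
            # Read only the rows that belong to this genome
            D = pd.read_hdf(input_tmp_file, where=f"Genome == {genome}")
            out = process ...

            # Save the processed chunk
            out.to_parquet(output_file)

        # Run process_chunk over all genomes in parallel threads
        pqdm(all_genomes, process_chunk, n_jobs=threads)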

There was a bug in my function process_chunk, but no exception was raised.

How can I handle errors better with pqdm?

Regrettably, I ran into this issue too and ended up writing a wrapper function akin to:

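from typing import Optional

def do_work_or_error(item) -> Optional[Exception]:
    # Return the exception instead of letting it vanish; None means success
    try:
        do_work(item)
    except Exception as e:
        return e

...

# pqdm takes the iterable first, then the function
errors = pqdm(work, do_work_or_error, n_jobs=threads)
assert not any(errors), [e for e in errors if e]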
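
For anyone who wants to know which inputs failed, here is a minimal sketch that builds on the wrapper idea above. It assumes the result list comes back in the same order as the inputs and that failed tasks show up as exception objects in that list. The helper names `collect_failures` and `raise_if_failed` are hypothetical and not part of pqdm.

```python
def collect_failures(items, results):
    """Pair each input with the exception it produced, if any."""
    return [
        (item, result)
        for item, result in zip(items, results)
        if isinstance(result, Exception)
    ]


def raise_if_failed(items, results):
    """Raise a summary error if any result is an exception, else return results."""
    failures = collect_failures(items, results)
    if failures:
        item, error = failures[0]
        # Chain the first underlying exception so its traceback is preserved
        raise RuntimeError(
            f"{len(failures)} task(s) failed; first failing input: {item!r}"
        ) from error
    return results
```

Under those assumptions, something like `raise_if_failed(all_genomes, pqdm(all_genomes, process_chunk, n_jobs=threads))` would surface the bug from the original post instead of passing over it silently.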

@niedakh I also saw this in a place where it would be useful to interrupt execution. Would you take a PR to add a kwarg that enables raising the exceptions rather than returning them?
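
To illustrate the difference between returning and raising, here is a rough sketch using only the standard library's `concurrent.futures`. It is not pqdm's implementation and not the PR's code; `run_parallel` and its `raise_errors` flag are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(items, fn, n_jobs, raise_errors=False):
    """Apply fn to every item in a thread pool (hypothetical helper, not pqdm).

    With raise_errors=False, an exception is stored in the result list in
    place of the value. With raise_errors=True, the first exception in
    input order is re-raised to the caller.
    """
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        futures = [pool.submit(fn, item) for item in items]
        results = []
        for future in futures:
            error = future.exception()  # blocks until this task has finished
            if error is not None:
                if raise_errors:
                    # Tasks that were already submitted still run to
                    # completion while the pool shuts down.
                    raise error
                results.append(error)
            else:
                results.append(future.result())
    return results
```

With `raise_errors=False` the caller has to inspect the returned list, which is how errors can go unnoticed; with `raise_errors=True` the failure stops the caller right away.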

I've done some work on this. PR to follow.

@niedakh please do review the linked PR #56

@niedakh Found myself missing this feature today. Please do review the linked PR #56

I just got a chance to work with the @dangercrow PR and the immediate behavior worked well for my use case! 🙇

Right, I've emailed @niedakh
If that doesn't get any response I'll tweet him or something in a few days

It's in, thanks for the tweet. I'm pretty overworked these days, but it's merged now. I'll look into #2 in the morning and release a new pqdm later tomorrow.

Also, many thanks for the PR, for your work, and for using pqdm!