Parallel processing with progress bars

Primary language: Python. License: MIT.

p_tqdm

p_tqdm makes parallel processing with progress bars easy.

p_tqdm is a wrapper around pathos.multiprocessing and tqdm. Unlike Python's default multiprocessing library, pathos provides a more flexible parallel map that can apply almost any type of function (including lambda functions, nested functions, and class methods) and can easily handle functions with multiple arguments. tqdm is applied on top of pathos's parallel map and displays a progress bar that includes an estimated time to completion.

Installation

pip install p_tqdm

p_tqdm works with Python versions 2.7, 3.4, 3.5, and 3.6.

Example

Let's say you want to add two lists element by element. Without any parallelism, this can be done easily with a Python map.

l1 = ['1', '2', '3']
l2 = ['a', 'b', 'c']

def add(a, b):
    return a + b
    
# In Python 3, map returns a lazy iterator, so wrap it in list() to get the results.
added = list(map(add, l1, l2))
# added == ['1a', '2b', '3c']

But if the lists are much larger or the computation is more expensive, running the calls in parallel can save a lot of time. The syntax for doing so is often cumbersome, however. p_tqdm keeps it as simple as a map and adds a progress bar too.

from p_tqdm import p_map

added = p_map(add, l1, l2)
# added == ['1a', '2b', '3c']
  0%|                                    | 0/3 [00:00<?, ?it/s]
 33%|████████████                        | 1/3 [00:01<00:02, 1.00s/it]
 66%|████████████████████████            | 2/3 [00:02<00:01, 1.00s/it]
100%|████████████████████████████████████| 3/3 [00:03<00:00, 1.00s/it]

p_tqdm functions

Parallel maps

  • p_map - parallel ordered map
  • p_imap - iterator for parallel ordered map
  • p_umap - parallel unordered map
  • p_uimap - iterator for parallel unordered map

Sequential maps

  • t_map - sequential ordered map
  • t_imap - iterator for sequential ordered map
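
The difference between ordered and unordered results can be sketched with the standard library alone. This is only an analogy, not p_tqdm's implementation: it assumes a thread pool from concurrent.futures instead of pathos, shows no progress bar, and slow_add is a hypothetical function whose earlier inputs take longer to finish.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def slow_add(a, b):
    # Hypothetical workload: earlier items sleep longer, so later items
    # tend to finish first.
    time.sleep(0.05 * (3 - int(a)))
    return a + b

l1 = ['1', '2', '3']
l2 = ['a', 'b', 'c']

with ThreadPoolExecutor(max_workers=3) as executor:
    # Ordered: results come back in input order, whatever the finish order.
    ordered = list(executor.map(slow_add, l1, l2))

    # Unordered: results are collected as each task completes.
    futures = [executor.submit(slow_add, a, b) for a, b in zip(l1, l2)]
    unordered = [future.result() for future in as_completed(futures)]

print(ordered)            # ['1a', '2b', '3c']
print(sorted(unordered))  # same items; the unsorted order may vary between runs
```

An unordered map is useful when the order of results does not matter, because each result becomes available as soon as its task finishes. The iterator variants hand results over one at a time instead of returning a complete list.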

p_map

Performs an ordered map in parallel.

from p_tqdm import p_map

def add(a, b):
    return a + b

added = p_map(add, ['1', '2', '3'], ['a', 'b', 'c'])
# added == ['1a', '2b', '3c']

p_imap

Returns an iterator for an ordered map in parallel.

from p_tqdm import p_imap

def add(a, b):
    return a + b

iterator = p_imap(add, ['1', '2', '3'], ['a', 'b', 'c'])

for result in iterator:
    print(result) # prints '1a', '2b', '3c'

p_umap

Performs an unordered map in parallel.

from p_tqdm import p_umap

def add(a, b):
    return a + b

added = p_umap(add, ['1', '2', '3'], ['a', 'b', 'c'])
# added is a list containing '1a', '2b', '3c' in any order

p_uimap

Returns an iterator for an unordered map in parallel.

from p_tqdm import p_uimap

def add(a, b):
    return a + b

iterator = p_uimap(add, ['1', '2', '3'], ['a', 'b', 'c'])

for result in iterator:
    print(result) # prints '1a', '2b', '3c' in any order

t_map

Performs an ordered map sequentially.

from p_tqdm import t_map

def add(a, b):
    return a + b

added = t_map(add, ['1', '2', '3'], ['a', 'b', 'c'])
# added == ['1a', '2b', '3c']

t_imap

Returns an iterator for an ordered map to be performed sequentially.

from p_tqdm import t_imap

def add(a, b):
    return a + b

iterator = t_imap(add, ['1', '2', '3'], ['a', 'b', 'c'])

for result in iterator:
    print(result) # prints '1a', '2b', '3c'

Shared properties

Arguments

All p_tqdm functions accept any number of lists (of the same length) as input, as long as the total number of inputs matches the number of arguments of the function. Additionally, if any non-list variable is passed as an input to a p_tqdm function, that variable is passed unchanged to every call of the function. See the example below.

l1 = ['1', '2', '3']
l2 = ['a', 'b', 'c']

def add(a, b, c):
    return a + b + c

added = p_map(add, l1, l2, '!')
# added == ['1a!', '2b!', '3c!']
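
One way to picture this behavior is with a plain sequential map. The sketch below is not p_tqdm's code: broadcast_args is a hypothetical helper, and it assumes that only list inputs are mapped element by element while everything else is repeated for each call.

```python
from itertools import repeat

def broadcast_args(*args):
    """Hypothetical helper: keep lists as they are, repeat everything else."""
    return [arg if isinstance(arg, list) else repeat(arg) for arg in args]

def add(a, b, c):
    return a + b + c

l1 = ['1', '2', '3']
l2 = ['a', 'b', 'c']

# map stops at the shortest input, so the endless repeat('!') is cut off
# once the lists are exhausted.
added = list(map(add, *broadcast_args(l1, l2, '!')))
print(added)  # ['1a!', '2b!', '3c!']
```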

CPUs

All the parallel p_tqdm functions can be passed the keyword num_cpus to indicate how many CPUs to use. The default is all CPUs. num_cpus can either be an integer to indicate the exact number of CPUs to use or a float to indicate the proportion of CPUs to use.
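
As a rough illustration of the two forms, the sketch below turns a num_cpus value into a worker count. resolve_num_cpus and its available parameter are hypothetical and not part of p_tqdm, and the rounding and the minimum of one worker are assumptions made for this example.

```python
import os

def resolve_num_cpus(num_cpus=None, available=None):
    """Hypothetical helper, not part of p_tqdm."""
    if available is None:
        available = os.cpu_count() or 1
    if num_cpus is None:
        # Default: use every CPU.
        return available
    if isinstance(num_cpus, float):
        # Assumption: round to the nearest whole CPU, never below one.
        return max(1, int(round(num_cpus * available)))
    # An integer is taken as the exact number of CPUs.
    return int(num_cpus)

print(resolve_num_cpus(available=8))       # 8
print(resolve_num_cpus(2, available=8))    # 2
print(resolve_num_cpus(0.5, available=8))  # 4
```

With p_tqdm itself, the value is passed as a keyword argument, for example p_map(add, l1, l2, num_cpus=2) for two CPUs or num_cpus=0.5 for half of them.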