ubelt: A Python repository from Zheaoli

Purpose

UBelt is a "utility belt" of commonly needed utility and helper functions. It is a migration of the most useful parts of utool (https://github.com/Erotemic/utool) into a minimal and standalone module.

The utool library contains many useful utility functions, but a number of them are too specific or poorly documented. The goal of this migration is to gradually port the most reusable parts of utool into a stable package.

In addition to utility functions, utool also contains a custom doctest harness as well as code introspection and auto-generation features. A rewrite of the test harness has been ported to a new module called xdoctest. A small subset of the auto-generation and code introspection features will be ported to, or made visible through, ubelt.

Installation:

From GitHub:

pip install git+https://github.com/Erotemic/ubelt.git

From PyPI:

pip install ubelt

Available Functions:

The following functions and classes are currently available. See the corresponding docstrings for more details.

import ubelt as ub

ub.dict_hist
ub.dict_subset
ub.dict_take
ub.find_duplicates
ub.group_items
ub.map_keys
ub.map_vals
ub.readfrom
ub.writeto
ub.ensuredir
ub.ensure_app_resource_dir
ub.chunks
ub.compress
ub.take
ub.flatten
ub.memoize
ub.NiceRepr
ub.NoParam
ub.CaptureStdout
ub.Timer
ub.Timerit (powerful multiline alternative to timeit)
ub.ProgIter (simplified alternative to tqdm)
ub.Cacher
ub.cmd
ub.editfile
ub.startfile
ub.delete
ub.repr2
ub.hzcat
ub.argval
ub.argflag
ub.modname_to_modpath (works via static analysis)
ub.modpath_to_modname (works via static analysis)
ub.import_module_from_path
ub.import_module_from_name
ub.download
ub.AutoDict

A minimal version of the doctest harness has been completed. This can be accessed using ub.doctest_package.

Examples

Here are examples of some of the features in ubelt.

Caching

Cache intermediate results in a script with minimal boilerplate.

>>> import ubelt as ub
>>> cfgstr = 'repr-of-params-that-uniquely-determine-the-process'
>>> cacher = ub.Cacher('test_process', cfgstr)
>>> data = cacher.tryload()
>>> if data is None:
>>>     myvar1 = 'result of expensive process'
>>>     myvar2 = 'another result'
>>>     data = myvar1, myvar2
>>>     cacher.save(data)
>>> myvar1, myvar2 = data
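
The idea behind a cfgstr-keyed cache can be sketched with the standard library alone. This is only an illustration of the pattern, not how ub.Cacher is implemented: the helper names (cache_path, tryload, save), the SHA-1 file naming, and the pickle format are all assumptions made for this example.

```python
import hashlib
import os
import pickle
import tempfile


def cache_path(name, cfgstr, cache_dir):
    """Build a cache file path whose name depends on the configuration string."""
    # Hypothetical naming scheme: hash the cfgstr so any string is filename-safe.
    digest = hashlib.sha1(cfgstr.encode('utf8')).hexdigest()[:16]
    return os.path.join(cache_dir, '{}_{}.pkl'.format(name, digest))


def tryload(path):
    """Return the cached object, or None if nothing has been cached yet."""
    if not os.path.exists(path):
        return None
    with open(path, 'rb') as file:
        return pickle.load(file)


def save(path, data):
    """Pickle the data to the cache file."""
    with open(path, 'wb') as file:
        pickle.dump(data, file)


cache_dir = tempfile.mkdtemp()
path = cache_path('test_process', 'params-v1', cache_dir)

first = tryload(path)  # cache miss on the first call
if first is None:
    first = ('result of expensive process', 'another result')
    save(path, first)

second = tryload(path)  # cache hit: the stored tuple comes back
# A different configuration string maps to a different file, so it misses.
other = tryload(cache_path('test_process', 'params-v2', cache_dir))
```

Because the configuration string is part of the file name, changing any parameter that feeds into it automatically invalidates the old result.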

Timing

Quickly time a single line.

>>> import ubelt as ub
>>> timer = ub.Timer('Timer demo!', verbose=1)
>>> with timer:
>>>     prime = ub.find_nth_prime(40)
tic('Timer demo!')
...toc('Timer demo!')=0.0008s

Robust Timing

Easily do robust timings on existing blocks of code by simply indenting them. There is no need to refactor into a string representation or convert to a single line. With ub.Timerit there is no need to resort to the timeit module!

The quick and dirty way just requires one indent.

>>> import ubelt as ub
>>> for _ in ub.Timerit(num=200, verbose=2):
>>>     ub.find_nth_prime(100)
Timing for 200 loops
Timing complete, 200 loops
    time per loop : 0.003288508653640747 seconds

Use the loop variable as a context manager for more accurate timings or to incorporate a setup phase that is not timed. You can also access properties of the ub.Timerit class to use the results programmatically.

>>> import ubelt as ub
>>> t1 = ub.Timerit(num=200, verbose=2)
>>> for timer in t1:
>>>     setup_vars = 100
>>>     with timer:
>>>         ub.find_nth_prime(setup_vars)
>>> print('t1.total_time = %r' % (t1.total_time,))
Timing for 200 loops
Timing complete, 200 loops
    time per loop : 0.003165217638015747 seconds
t1.total_time = 0.6330435276031494
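
To make the "loop variable as a context manager" pattern concrete, here is a rough standard-library sketch. The LoopTimer class is hypothetical and written only for this illustration; it is not ub.Timerit and makes no claim about how ub.Timerit works internally.

```python
import time


class LoopTimer:
    """Hypothetical helper: times only the code inside each `with` block."""

    def __init__(self, num):
        self.num = num
        self.times = []

    def __iter__(self):
        # Yield the same object each iteration so it can be used in `with`.
        for _ in range(self.num):
            yield self

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.times.append(time.perf_counter() - self._start)
        return False  # never swallow exceptions

    @property
    def total_time(self):
        return sum(self.times)

    @property
    def min_time(self):
        return min(self.times)


t1 = LoopTimer(num=50)
for timer in t1:
    data = list(range(1000))  # setup phase: not timed
    with timer:
        total = sum(data)     # only this statement is timed
```

Keeping the per-loop measurements in a list makes it easy to report a minimum or a total afterwards, and the setup code outside the with block never contributes to them.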

Grouping

Group the items in a sequence into a dictionary, keyed by a parallel list of group ids.

>>> import ubelt as ub
>>> item_list    = ['ham',     'jam',   'spam',     'eggs',    'cheese', 'banana']
>>> groupid_list = ['protein', 'fruit', 'protein',  'protein', 'dairy',  'fruit']
>>> result = ub.group_items(item_list, groupid_list)
>>> print(result)
{'dairy': ['cheese'], 'fruit': ['jam', 'banana'], 'protein': ['ham', 'spam', 'eggs']}
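
For readers who want to see what grouping by a parallel id list amounts to, the same result can be produced with collections.defaultdict. The function name group_items_sketch is made up for this example and is not part of ubelt.

```python
from collections import defaultdict


def group_items_sketch(items, group_ids):
    """Pair each item with its group id and collect items per id."""
    groups = defaultdict(list)
    for item, group_id in zip(items, group_ids):
        groups[group_id].append(item)
    return dict(groups)


items = ['ham', 'jam', 'spam', 'eggs', 'cheese']
group_ids = ['protein', 'fruit', 'protein', 'protein', 'dairy']
grouped = group_items_sketch(items, group_ids)
```

Items keep their original relative order within each group, because they are appended in the order they appear in the input.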

Dictionary Histogram

Find the frequency of each item in a sequence.

>>> import ubelt as ub
>>> item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist = ub.dict_hist(item_list)
>>> print(hist)
{1232: 2, 1: 1, 2: 4, 900: 3, 39: 1}
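
A frequency count like this can also be obtained from the standard library with collections.Counter, shown here for comparison. This is an equivalent for the example above, not a statement about how ub.dict_hist is implemented.

```python
from collections import Counter

item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]

# Counter maps each distinct item to the number of times it occurs.
counts = Counter(item_list)
hist = dict(counts)

# The most frequent item and its count.
top_item, top_count = counts.most_common(1)[0]
```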

Dictionary Manipulation

Take a subset of a dictionary.

>>> import ubelt as ub
>>> dict_ = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
>>> subdict_ = ub.dict_subset(dict_, ['K', 'dcvs_clip_max'])
>>> print(subdict_)
{'K': 3, 'dcvs_clip_max': 0.2}

Take only the values, optionally specify a default value.

>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> print(list(ub.dict_take(dict_, [1, 2, 3, 4, 5], default=None)))
['a', 'b', 'c', None, None]

Apply a function to each value in the dictionary (see also ub.map_keys).

>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> newdict = ub.map_vals(len, dict_)
>>> print(newdict)
{'a': 3, 'b': 0}
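
The three operations above correspond to short comprehensions in plain Python. The sketch below reproduces the same results without ubelt, which may help clarify what each helper returns. It is an equivalent for these inputs only and does not cover any extra options the ubelt functions may accept.

```python
# Subset: keep only the requested keys.
params = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
subset = {key: params[key] for key in ['K', 'dcvs_clip_max']}

# Take: look up values in order, falling back to a default for missing keys.
letters = {1: 'a', 2: 'b', 3: 'c'}
taken = [letters.get(key, None) for key in [1, 2, 3, 4, 5]]

# Map values: apply a function to every value, keeping the keys.
lists = {'a': [1, 2, 3], 'b': []}
lengths = {key: len(value) for key, value in lists.items()}
```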

Loop Progress

See also tqdm for an alternative implementation.

>>> import ubelt as ub
>>> def is_prime(n):
>>>     return n >= 2 and not any(n % i == 0 for i in range(2, n))
>>> for n in ub.ProgIter(range(1000), verbose=2):
>>>     # do some work
>>>     is_prime(n)
    0/1000... rate=0.00 Hz, eta=?, total=0:00:00, wall=14:05 EST 
    1/1000... rate=82241.25 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST 
  257/1000... rate=177204.69 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST 
  642/1000... rate=94099.22 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST 
 1000/1000... rate=71886.74 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST
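
The core of a progress wrapper is a generator that passes items through and periodically reports how far along it is. The following is a minimal sketch of that idea using only the standard library. The progress function, its freq parameter, and its output format are assumptions for illustration and are not ub.ProgIter's API.

```python
import io
import sys
import time


def progress(iterable, total, freq=250, stream=sys.stdout):
    """Yield items unchanged, writing a status line every `freq` items."""
    start = time.perf_counter()
    for index, item in enumerate(iterable, start=1):
        yield item
        if index % freq == 0 or index == total:
            elapsed = time.perf_counter() - start
            rate = index / elapsed if elapsed > 0 else float('inf')
            stream.write('{:>5}/{}... rate={:.2f} Hz\n'.format(index, total, rate))


def is_prime(n):
    return n >= 2 and not any(n % i == 0 for i in range(2, n))


# Write the report to an in-memory buffer so it can be inspected afterwards.
buffer = io.StringIO()
primes = [n for n in progress(range(1000), total=1000, stream=buffer) if is_prime(n)]
report_lines = buffer.getvalue().splitlines()
```

Reporting only every freq items keeps the overhead of the wrapper small compared with the work done in the loop body.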

Horizontal String Concatenation

>>> import ubelt as ub
>>> B = ub.repr2([[1, 2], [3, 4]], nl=1, cbr=True, trailsep=False)
>>> C = ub.repr2([[5, 6], [7, 8]], nl=1, cbr=True, trailsep=False)
>>> print(ub.hzcat(['A = ', B, ' * ', C]))
A = [[1, 2], * [[5, 6],
     [3, 4]]    [7, 8]]
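
Horizontal concatenation comes down to splitting each block into lines, padding every block to a common height and width, and joining the rows. Here is a small sketch of that procedure. The hzcat_sketch function is written for this example and is not ubelt's implementation.

```python
def hzcat_sketch(blocks):
    """Join multi-line strings side by side, padding each column to a fixed width."""
    columns = [block.split('\n') for block in blocks]
    height = max(len(lines) for lines in columns)
    padded = []
    for lines in columns:
        width = max(len(line) for line in lines)
        # Pad short blocks with empty lines, then pad every line to the column width.
        lines = lines + [''] * (height - len(lines))
        padded.append([line.ljust(width) for line in lines])
    # Join corresponding rows of every column, dropping trailing spaces.
    return '\n'.join(''.join(row).rstrip() for row in zip(*padded))


text = hzcat_sketch(['A = ', '[[1, 2],\n [3, 4]]', ' * ', '[[5, 6],\n [7, 8]]'])
```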

Command Line Interaction

The ub.cmd function provides a simple interface to the command line. It is an alternative to os.system and subprocess (although it uses subprocess under the hood). Its key feature is that it prints stdout and stderr to the terminal in real-time, while simultaneously capturing the output.

This allows you to easily run a command line executable as part of a Python process, see what it is doing, and then do something based on its output, just as you would if you were interacting with the command line itself.

Also, ub.cmd removes the need to think about whether to pass a list of args or a string. Both will work. This utility has been tested on both Windows and Linux.

>>> import ubelt as ub
>>> info = ub.cmd(('echo', 'simple cmdline interface'), verbose=1)
simple cmdline interface
>>> assert info['ret'] == 0
>>> assert info['out'].strip() == 'simple cmdline interface'
>>> assert info['err'].strip() == ''
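
The "print while capturing" behavior can be approximated directly with subprocess, as in the sketch below. It is a simplified illustration and not ub.cmd itself. The run_and_tee name is hypothetical, stderr is merged into stdout to keep the example short, and string commands are assumed to follow POSIX shell splitting rules.

```python
import shlex
import subprocess
import sys


def run_and_tee(command):
    """Run a command, echoing its output as it arrives while also capturing it."""
    # Accept either a string or a sequence of arguments.
    # Assumption: strings are split with POSIX shell rules via shlex.
    args = shlex.split(command) if isinstance(command, str) else list(command)
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # simplification: merge stderr into stdout
        universal_newlines=True,
    )
    captured = []
    for line in proc.stdout:    # lines are read as they become available
        sys.stdout.write(line)  # show the line right away
        captured.append(line)   # and keep a copy for the caller
    proc.stdout.close()
    ret = proc.wait()
    return {'out': ''.join(captured), 'ret': ret}


# Use the current interpreter as the child so the example is portable.
info = run_and_tee([sys.executable, '-c', "print('simple cmdline interface')"])
```

Capturing stdout and stderr separately, as the ub.cmd example above does, needs extra care (for example reader threads) so that neither pipe fills up and blocks the child process.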
