UBelt is a "utility belt" of commonly needed utility and helper functions.
It is a migration of the most useful parts of utool
(https://github.com/Erotemic/utool) into a minimal and standalone module.
The utool library contains a number of useful utility functions; however, many
of these are too specific or not well documented. The goal of this migration is
to slowly port the most reusable parts of utool into a stable package.
In addition to utility functions, utool also contains a custom doctest
harness and code introspection and auto-generation features.
A rewrite of the test harness has been ported to a new module called xdoctest.
A small subset of the auto-generation and code introspection features will be
ported to, or made visible through, ubelt.
From GitHub:
pip install git+https://github.com/Erotemic/ubelt.git
From PyPI:
pip install ubelt
The following functions and classes are currently available. See the corresponding docstrings for more details.
import ubelt as ub
ub.dict_hist
ub.dict_subset
ub.dict_take
ub.find_duplicates
ub.group_items
ub.map_keys
ub.map_vals
ub.readfrom
ub.writeto
ub.ensuredir
ub.ensure_app_resource_dir
ub.chunks
ub.compress
ub.take
ub.flatten
ub.memoize
ub.NiceRepr
ub.NoParam
ub.CaptureStdout
ub.Timer
ub.Timerit (powerful multiline alternative to timeit)
ub.ProgIter (simplified alternative to tqdm)
ub.Cacher
ub.cmd
ub.editfile
ub.startfile
ub.delete
ub.repr2
ub.hzcat
ub.argval
ub.argflag
ub.modname_to_modpath (works via static analysis)
ub.modpath_to_modname (works via static analysis)
ub.import_module_from_path
ub.import_module_from_name
ub.download
ub.AutoDict
A minimal version of the doctest harness has been completed.
This can be accessed using ub.doctest_package.
Here are some examples of features inside ubelt.
Cache intermediate results in a script with minimal boilerplate.
>>> import ubelt as ub
>>> cfgstr = 'repr-of-params-that-uniquely-determine-the-process'
>>> cacher = ub.Cacher('test_process', cfgstr)
>>> data = cacher.tryload()
>>> if data is None:
>>>     myvar1 = 'result of expensive process'
>>>     myvar2 = 'another result'
>>>     data = myvar1, myvar2
>>>     cacher.save(data)
>>> myvar1, myvar2 = data
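To illustrate why the cfgstr matters, here is a minimal standard-library sketch of a cache keyed on a configuration string. It is not ubelt's implementation: the class name SimpleCacher, the file naming scheme, and the use of pickle are all assumptions made for illustration. The idea it shows is that a different cfgstr maps to a different file, so stale results are never loaded for new parameters.

```python
import hashlib
import os
import pickle
import tempfile


class SimpleCacher:
    """Hypothetical stand-in for a cfgstr-keyed cache; not ubelt's code."""

    def __init__(self, fname, cfgstr, dpath=None):
        # Hash the config string so any change in parameters yields a new file.
        digest = hashlib.sha1(cfgstr.encode('utf8')).hexdigest()[:16]
        self.dpath = dpath if dpath is not None else tempfile.gettempdir()
        self.fpath = os.path.join(self.dpath, '{}_{}.pkl'.format(fname, digest))

    def tryload(self):
        # Return the cached object, or None on a cache miss.
        if not os.path.exists(self.fpath):
            return None
        with open(self.fpath, 'rb') as file:
            return pickle.load(file)

    def save(self, data):
        # Persist the object so that a later tryload() can return it.
        with open(self.fpath, 'wb') as file:
            pickle.dump(data, file)


dpath = tempfile.mkdtemp()
cacher = SimpleCacher('test_process', 'params-v1', dpath=dpath)
first = cacher.tryload()                    # nothing saved yet: a miss
cacher.save(('result', 'another result'))
second = cacher.tryload()                   # same cfgstr: a hit
other = SimpleCacher('test_process', 'params-v2', dpath=dpath).tryload()  # miss
```

One limitation of this sketch is that None doubles as the "miss" signal, so a legitimately cached None value cannot be distinguished from an empty cache.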
Quickly time a single line.
>>> import ubelt as ub
>>> timer = ub.Timer('Timer demo!', verbose=1)
>>> with timer:
>>>     prime = ub.find_nth_prime(40)
tic('Timer demo!')
...toc('Timer demo!')=0.0008s
Easily do robust timings on existing blocks of code by simply indenting them.
There is no need to refactor into a string representation or convert to a
single line. With ub.Timerit there is no need to resort to the timeit module!
The quick and dirty way just requires one indent.
>>> import ubelt as ub
>>> for _ in ub.Timerit(num=200, verbose=2):
>>>     ub.find_nth_prime(100)
Timing for 200 loops
Timing complete, 200 loops
time per loop : 0.003288508653640747 seconds
Use the loop variable as a context manager for more accurate timings or to
incorporate a setup phase that is not timed. You can also access properties
of the ub.Timerit class to use the results programmatically.
>>> import ubelt as ub
>>> t1 = ub.Timerit(num=200, verbose=2)
>>> for timer in t1:
>>>     setup_vars = 100
>>>     with timer:
>>>         ub.find_nth_prime(setup_vars)
>>> print('t1.total_time = %r' % (t1.total_time,))
Timing for 200 loops
Timing complete, 200 loops
time per loop : 0.003165217638015747 seconds
t1.total_time = 0.6330435276031494
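The loop-plus-context-manager pattern can be sketched with the standard library alone. The following is a hedged illustration, not ubelt's implementation: the names SimpleTimerit and _LoopTimer are hypothetical, and time.perf_counter is an assumed choice of clock. Iterating yields one timer per loop; if the body enters the timer, only that block is measured, otherwise the whole loop body is.

```python
import time


class _LoopTimer:
    """Context manager that records the elapsed time of one `with` block."""

    def __init__(self):
        self.elapsed = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self._start
        return False  # never swallow exceptions


class SimpleTimerit:
    """Hypothetical sketch of a loop-based timer; not ubelt's implementation."""

    def __init__(self, num=100):
        self.num = num
        self.times = []

    def __iter__(self):
        for _ in range(self.num):
            timer = _LoopTimer()
            outer_start = time.perf_counter()
            yield timer  # the caller's loop body runs here
            outer_stop = time.perf_counter()
            if timer.elapsed is None:
                # The body never entered `with timer:`, so fall back to
                # timing the whole body (this includes any setup code).
                self.times.append(outer_stop - outer_start)
            else:
                self.times.append(timer.elapsed)

    @property
    def total_time(self):
        return sum(self.times)


t1 = SimpleTimerit(num=50)
for timer in t1:
    data = list(range(1000))        # setup, excluded from the measurement
    with timer:
        sorted(data, reverse=True)  # only this line is timed
```

Because each measurement is recorded when the generator resumes, breaking out of the loop early would drop the final measurement in this sketch.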
Group items in a sequence into a dictionary using a parallel list of group ids.
>>> import ubelt as ub
>>> item_list = ['ham', 'jam', 'spam', 'eggs', 'cheese', 'banana']
>>> groupid_list = ['protein', 'fruit', 'protein', 'protein', 'dairy', 'fruit']
>>> result = ub.group_items(item_list, groupid_list)
>>> print(result)
{'dairy': ['cheese'], 'fruit': ['jam', 'banana'], 'protein': ['ham', 'spam', 'eggs']}
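For readers who want to see the grouping logic spelled out, here is a small standard-library sketch of the same idea. The helper name group_items_sketch and the sample data are hypothetical, and this is not necessarily how ub.group_items is implemented.

```python
from collections import defaultdict


def group_items_sketch(items, groupids):
    """Group `items` by the parallel `groupids` list (hypothetical helper)."""
    groups = defaultdict(list)
    for item, groupid in zip(items, groupids):
        # Items keep their original relative order within each group.
        groups[groupid].append(item)
    return dict(groups)


files = ['a.py', 'b.txt', 'c.py', 'd.md']
extensions = ['py', 'txt', 'py', 'md']
grouped = group_items_sketch(files, extensions)
```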
Find the frequency of items in a sequence.
>>> import ubelt as ub
>>> item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist = ub.dict_hist(item_list)
>>> print(hist)
{1232: 2, 1: 1, 2: 4, 900: 3, 39: 1}
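As a point of comparison, the same counts can be obtained with collections.Counter from the standard library. This is only an equivalent sketch for checking results, not a statement about how ub.dict_hist works internally. Dictionary equality ignores key order, so the comparison holds regardless of how the result is printed.

```python
from collections import Counter

item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]

# Counter tallies how many times each hashable item occurs.
hist = dict(Counter(item_list))
```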
Take a subset of a dictionary.
>>> import ubelt as ub
>>> dict_ = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
>>> subdict_ = ub.dict_subset(dict_, ['K', 'dcvs_clip_max'])
>>> print(subdict_)
{'K': 3, 'dcvs_clip_max': 0.2}
Take only the values, optionally specifying a default value.
>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> print(list(ub.dict_take(dict_, [1, 2, 3, 4, 5], default=None)))
['a', 'b', 'c', None, None]
Apply a function to each value in the dictionary (see also ub.map_keys).
>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> newdict = ub.map_vals(len, dict_)
>>> print(newdict)
{'a': 3, 'b': 0}
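The three dictionary helpers shown above behave like short comprehensions. The plain-Python sketch below reproduces the same results for the same inputs; it is an illustration of the semantics as demonstrated in the examples, not ubelt's implementation, and the variable names are arbitrary.

```python
dict_ = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}

# Subset: keep only the requested keys.
subset = {key: dict_[key] for key in ['K', 'dcvs_clip_max']}

# Take: look up values in key order, substituting a default for missing keys.
letters = {1: 'a', 2: 'b', 3: 'c'}
taken = [letters.get(key, None) for key in [1, 2, 3, 4, 5]]

# Map values: apply a function to every value, leaving the keys untouched.
lengths = {key: len(val) for key, val in {'a': [1, 2, 3], 'b': []}.items()}
```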
ub.ProgIter reports progress while iterating over a sequence. See also tqdm
for an alternative implementation.
>>> import ubelt as ub
>>> def is_prime(n):
...     return n >= 2 and not any(n % i == 0 for i in range(2, n))
>>> for n in ub.ProgIter(range(1000), verbose=2):
>>>     # do some work
>>>     is_prime(n)
0/1000... rate=0.00 Hz, eta=?, total=0:00:00, wall=14:05 EST
1/1000... rate=82241.25 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST
257/1000... rate=177204.69 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST
642/1000... rate=94099.22 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST
1000/1000... rate=71886.74 Hz, eta=0:00:00, total=0:00:00, wall=14:05 EST
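The basic mechanism behind a progress iterator is a generator that passes items through while periodically writing a status line. The sketch below is a simplified assumption of that mechanism, not ubelt's implementation: the helper name prog_iter_sketch, its freq and stream parameters, and the output format are all hypothetical.

```python
import io
import sys
import time


def prog_iter_sketch(iterable, total=None, freq=100, stream=None):
    """Yield items from `iterable`, writing a progress line every `freq` items.

    Hypothetical helper for illustration only; not ubelt's implementation.
    """
    stream = sys.stderr if stream is None else stream
    if total is None and hasattr(iterable, '__len__'):
        total = len(iterable)
    start = time.perf_counter()
    for count, item in enumerate(iterable, start=1):
        yield item
        # Report after the caller has finished processing this item.
        if count % freq == 0 or count == total:
            elapsed = time.perf_counter() - start
            rate = count / elapsed if elapsed > 0 else float('inf')
            stream.write('{}/{}... rate={:.2f} Hz\n'.format(count, total, rate))


log = io.StringIO()
seen = [n for n in prog_iter_sketch(range(250), freq=100, stream=log)]
lines = log.getvalue().splitlines()
```

Writing to an explicit stream keeps the progress output separate from the program's own results, which also makes the behavior easy to check.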
Horizontally concatenate multiline strings with ub.hzcat.
>>> import ubelt as ub
>>> B = ub.repr2([[1, 2], [3, 4]], nl=1, cbr=True, trailsep=False)
>>> C = ub.repr2([[5, 6], [7, 8]], nl=1, cbr=True, trailsep=False)
>>> print(ub.hzcat(['A = ', B, ' * ', C]))
A = [[1, 2], * [[5, 6],
     [3, 4]]    [7, 8]]
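Horizontal concatenation amounts to splitting each string into lines, padding every column to a common height and width, and joining row by row. The following sketch shows that approach under the assumption that columns are top-aligned; the helper name hzcat_sketch is hypothetical and this is not ubelt's implementation.

```python
def hzcat_sketch(parts):
    """Concatenate multiline strings side by side (hypothetical helper)."""
    columns = [part.split('\n') for part in parts]
    height = max(len(col) for col in columns)
    rows = [''] * height
    for col in columns:
        width = max(len(line) for line in col)
        # Pad short columns with blank lines, and every line to a common width.
        padded = col + [''] * (height - len(col))
        rows = [row + line.ljust(width) for row, line in zip(rows, padded)]
    # Drop the trailing padding on each finished row.
    return '\n'.join(row.rstrip() for row in rows)


B = '[[1, 2],\n [3, 4]]'
C = '[[5, 6],\n [7, 8]]'
text = hzcat_sketch(['A = ', B, ' * ', C])
print(text)
```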
The ub.cmd function provides a simple interface to the command line. It is
an alternative to os.system and subprocess (although it uses subprocess
under the hood). Its key feature is that it prints stdout and stderr to the
terminal in real time, while simultaneously capturing the output.
This allows you to easily run a command line executable as part of a Python process, see what it is doing, and then act on its output, just as you would if you were interacting with the command line itself.
Also, ub.cmd removes the need to think about whether to pass a list of
args or a string; both will work. This utility has been tested on both Windows
and Linux.
>>> import ubelt as ub
>>> info = ub.cmd(('echo', 'simple cmdline interface'), verbose=1)
simple cmdline interface
>>> assert info['ret'] == 0
>>> assert info['out'].strip() == 'simple cmdline interface'
>>> assert info['err'].strip() == ''
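The "print while capturing" behavior can be approximated with subprocess directly. The sketch below is a simplified assumption of the idea, not ubelt's implementation: the helper name tee_cmd_sketch is hypothetical, stderr is merged into stdout to keep the code short, and the child process is the current Python interpreter so the example does not depend on a platform-specific echo command.

```python
import subprocess
import sys


def tee_cmd_sketch(command):
    """Run `command`, echoing stdout live while also capturing it.

    Hypothetical helper for illustration; not ubelt's implementation.
    For simplicity stderr is merged into stdout.
    """
    # Accept either a string (run through the shell) or a list of arguments.
    use_shell = isinstance(command, str)
    proc = subprocess.Popen(
        command, shell=use_shell, stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT, universal_newlines=True)
    captured = []
    for line in proc.stdout:
        sys.stdout.write(line)   # show the line as soon as it arrives
        captured.append(line)    # and keep a copy for the caller
    proc.stdout.close()
    ret = proc.wait()
    return {'out': ''.join(captured), 'ret': ret}


info = tee_cmd_sketch([sys.executable, '-c', 'print("simple cmdline interface")'])
```

Keeping stdout and stderr separate while still streaming both in real time requires reading the two pipes concurrently, for example with threads, which this sketch deliberately leaves out.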