Python Cheat Sheet

This repository is for general Python information that could be useful for any development work.

Notes from 'Transforming Code into Beautiful, Idiomatic Python'

This section contains notes from the video by Raymond Hettinger about taking advantage of some of Python's language features. The video can be found here.

General looping

Use xrange in Python 2.7 (in Python 3, use range). This saves memory by producing the values on demand instead of building a full list.

Do not use indices to reference items in a list. Do the following instead:

colors = ['blue', 'red', 'green']
for color in colors:
    print(color)

To go backwards through a list use the function reversed.

If you need to index at the same time then use enumerate.
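As a minimal sketch of the two points above, using the same colors list; the names backwards and indexed are only illustrative:

```python
colors = ['blue', 'red', 'green']

# reversed walks the list from last to first without any index arithmetic.
backwards = list(reversed(colors))

# enumerate yields (index, item) pairs, so no separate counter is needed.
indexed = list(enumerate(colors))

for color in reversed(colors):
    print(color)

for i, color in enumerate(colors):
    print(i, '-->', color)
```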

If you want to loop over two lists at the same time, use izip from itertools (Python 2). In Python 3 the built-in zip is already lazy and izip no longer exists:

from itertools import izip  # Python 2 only; in Python 3 use the built-in zip

colors = ['blue', 'red', 'green']
names = ['alice', 'bob', 'eve']
for name, color in izip(names, colors):
    print(name, color)

Note that in Python 2 zip does the same thing but builds the entire list of pairs in memory beforehand. izip produces one pair at a time and may end up reusing the same allocations, so there could be performance savings too.

Use sorted to reorder a list. Try not to use custom comparison functions for sorting, as a comparison function may be called on the order of n log n times. Use sorted's key parameter instead; the key function is called just once per item. Note that the cmp parameter was removed in Python 3 (functools.cmp_to_key can convert an old comparison function into a key function).
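A small sketch of sorting with a key function, here ordering by string length; the list contents and variable names are illustrative only:

```python
colors = ['blue', 'red', 'green', 'yellow']

# len is called once per item; sorted compares the resulting lengths.
by_length = sorted(colors, key=len)

# reverse=True flips the order without needing a comparison function.
longest_first = sorted(colors, key=len, reverse=True)

print(by_length)
print(longest_first)
```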

Breaking out of loops

With this general type of code:

blocks = []
while True:
    block = f.read(32)
    if block == '':
        break
    blocks.append(block)

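You can use iter with its sentinel parameter instead to make the code clearer:

from functools import partial

# f is the same open text-mode file as above; use b'' as the sentinel in binary mode
blocks = []
for block in iter(partial(f.read, 32), ''):
    blocks.append(block)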

When iter is given a sentinel as its second argument, the first argument must be a callable that takes no arguments; iter calls it repeatedly until it returns the sentinel. partial creates such a callable by binding 32 to f.read, so each call returns the result of the read.

You can avoid using a boolean flag when searching a list by using else with for. For example, with:

def find(seq, target):
    found = False
    for i, value in enumerate(seq):
        if value == target:
            found = True
            break
    if not found:
        return -1
    return i

The boolean can be removed as follows:

def find(seq, target):
    for i, value in enumerate(seq):
        if value == target:
            break
    else:
        return -1
    return i

The else block runs when the for loop finishes without executing break. The only issue with this style is that this use of else is not widely known and may cause some confusion. According to the video, this use of else was proposed by Donald Knuth.

General dictionary uses

Looping over dictionary keys is simply:

d = { '1': 'alice', '2': 'bob', '3': 'eve' }
for k in d:
    print(k)

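Do not use this if you wish to modify the dictionary whilst iterating. In Python 2, keys returns a copy of the keys as a list, allowing you to make modifications. In Python 3, keys returns a view, so copy it with list first:

for k in list(d.keys()):  # list() makes the copy explicit in Python 3
    if k == '2':
        del d[k]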

To loop over keys and values at the same time, use iteritems in Python 2. Unlike Python 2's items, this will not create a huge list in memory. In Python 3, items returns a lightweight view and iteritems no longer exists.
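A minimal sketch of looping over keys and values together, written for Python 3 where items is the method to use (on Python 2, iteritems would take its place); the pairs list is only there to show what the loop sees:

```python
d = {'1': 'alice', '2': 'bob', '3': 'eve'}

pairs = []
for k, v in d.items():
    # Each iteration unpacks one (key, value) pair.
    pairs.append((k, v))
    print(k, '-->', v)
```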

The function izip can be used to construct dictionaries from lists:

from itertools import izip  # Python 2 only; in Python 3 use the built-in zip

ids = ['1', '2', '3']
names = ['alice', 'bob', 'eve']
d = dict(izip(ids, names))

Using izip rather than Python 2's zip should consume less memory, as the pairs are produced one at a time instead of being collected into an intermediate list.

Use defaultdict for doing something like maintaining a count of items in a list. For example:

from collections import defaultdict

names = ['alice', 'bob', 'eve', 'alice', 'bob', 'alice']
d = defaultdict(int)
for name in names:
    d[name] += 1

Calling int with no arguments returns 0, so a missing key starts at zero and can be incremented straight away. Without defaultdict, the dictionary method get could be used with its default parameter.
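For comparison, the get-based alternative mentioned above might look like this minimal sketch; counts is just an illustrative name:

```python
names = ['alice', 'bob', 'eve', 'alice', 'bob', 'alice']

counts = {}
for name in names:
    # get returns 0 when the name has not been seen yet.
    counts[name] = counts.get(name, 0) + 1

print(counts)
```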

To create a dictionary where each key groups items from a list (in the below example by string length) you can use defaultdict. For example:

from collections import defaultdict

names = ['alice', 'bob', 'eve', 'steve']
d = defaultdict(list)
for name in names:
    key = len(name)
    d[key].append(name)

In the above example, defaultdict supplies an empty list whenever the key is not yet in the dictionary. An older way of doing this would be to use the dictionary method setdefault when querying the key.
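The older setdefault approach could be sketched as follows, using a plain dictionary; groups is an illustrative name:

```python
names = ['alice', 'bob', 'eve', 'steve']

groups = {}
for name in names:
    # setdefault inserts an empty list for a new key, then returns the stored list.
    groups.setdefault(len(name), []).append(name)

print(groups)
```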

Sometimes you want to link dictionaries together so that a key found in one dictionary takes priority over the same key in another. A common use of this is program arguments which override values in environment variables, which in turn override default values. ChainMap (in collections since Python 3.3) is an efficient way to do this, as it looks keys up in each dictionary in turn without copying any of them.
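A minimal sketch of that layering with ChainMap. The three dictionaries are stand-ins: in a real program command_line would come from an argument parser and environment from os.environ.

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
environment = {'user': 'alice'}      # stand-in for os.environ
command_line = {'color': 'blue'}     # stand-in for parsed program arguments

# Look-ups search the mappings left to right, so earlier ones take priority.
settings = ChainMap(command_line, environment, defaults)

print(settings['color'])
print(settings['user'])
```

The underlying dictionaries are not copied, so a later change to defaults is visible through settings.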

Improving clarity

Use keyword arguments for more obscure function parameters to make the code much more readable. They add a slight performance cost, so consider avoiding them in performance-critical code (e.g. function calls inside tight loops).
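To illustrate the readability gain, here is a sketch with a hypothetical search function (not a real API) that simply echoes its arguments:

```python
def search(query, retweets, numtweets, popular):
    """Hypothetical function used only to compare the two call styles."""
    return {'query': query, 'retweets': retweets,
            'numtweets': numtweets, 'popular': popular}

# Positional call: the reader has to look up what False, 20 and True mean.
unclear = search('python', False, 20, True)

# Keyword call: each value is labelled at the call site.
clear = search('python', retweets=False, numtweets=20, popular=True)
```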

When returning tuples from functions, consider using named tuples to make the meaning of the values clearer (i.e. use namedtuple). Named tuples are a subclass of tuple, so they still behave like a tuple.
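A short sketch of a function returning a named tuple. TestResults and run_tests are illustrative names, and the returned numbers are made up:

```python
from collections import namedtuple

TestResults = namedtuple('TestResults', ['failed', 'attempted'])

def run_tests():
    """Hypothetical function; returns fixed values for illustration."""
    return TestResults(failed=0, attempted=4)

results = run_tests()
print(results)                 # the repr names each field
print(results.attempted)       # access by name

# Still an ordinary tuple: unpacking and indexing keep working.
failed, attempted = results
```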

For clarity and speed, always unpack sequences as follows:

employee = 'alice', 'work', 64
name, place, id = employee

Tuple packing and unpacking should be used to avoid temporary variables. For example, when swapping:

x = 10
y = 20
tmp = x
x = y
y = tmp

Instead just do:

x = 10
y = 20
x, y = y, x

This updates the state all at once. It avoids potential mistakes with temporary variables and is much clearer. This works even better if you're making calculations with the old values of the state and want to update a set of variables with the new state in one go.
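A common illustration of updating several variables from their old values in one step is a Fibonacci generator; this is a sketch, and the function name is illustrative:

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers."""
    values = []
    x, y = 0, 1
    for _ in range(n):
        values.append(x)
        # The right-hand side is evaluated first, using the old x and y,
        # so no temporary variable is needed.
        x, y = y, x + y
    return values

print(fibonacci(8))
```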

Efficiency

Avoid building up a string by adding pieces together repeatedly, as each addition can copy the whole string so far. Use join instead.
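The two approaches side by side, as a minimal sketch; slow and fast are illustrative names and produce the same string:

```python
names = ['alice', 'bob', 'eve']

# Repeated addition: each += may build a brand new string.
slow = names[0]
for name in names[1:]:
    slow += ', ' + name

# join builds the result in a single pass.
fast = ', '.join(names)

print(fast)
```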

When updating sequences with the following:

del seq[0]
seq.pop(0)
seq.insert(0, 'value')

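This usually means that the wrong data structure is being used. Removing from or inserting at the front of a list takes time proportional to its length, so consider replacing the list with a deque and using the following instead:

from collections import deque

seq = deque(['alice', 'bob', 'eve'])
del seq[0]
seq.popleft()
seq.appendleft('value')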

Decorators and context managers

Decorators can be used to separate business logic from administration code. For example, caching the results of a look-up function can be done with a @cache decorator. Python 3 provides this as functools.lru_cache (and functools.cache from Python 3.9), but it can easily be written for earlier versions.
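A minimal sketch of caching with the standard library's functools.lru_cache on Python 3. The lookup function and the calls list are hypothetical and exist only to show that the function body runs once per distinct argument:

```python
from functools import lru_cache

calls = []  # records each time the function body actually runs

@lru_cache(maxsize=None)
def lookup(key):
    """Hypothetical slow look-up; here it just upper-cases the key."""
    calls.append(key)
    return key.upper()

first = lookup('alice')
second = lookup('alice')   # served from the cache, body not re-run

print(first, second, len(calls))
```

The caching logic stays out of lookup itself, which is the separation the decorator is meant to provide.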

Rather than saving a copy of the thread's decimal context with getcontext().copy(), changing it and restoring it by hand, use with instead. For example, if you need to change the decimal precision:

from decimal import Context, Decimal, localcontext

with localcontext(Context(prec=50)):
    print(Decimal(355) / Decimal(113))

The with statement manages the creation of the new context and cleans it up afterwards. This is far easier to read and maintain. Another use is opening files, as with ensures that the file is closed automatically.

Locks are another good example. Looking at the bad way of doing it:

import threading

lock = threading.Lock()
lock.acquire()
try:
    print("Critical section 1")
    print("Critical section 2")
finally:
    lock.release()

This can be improved as follows:

lock = threading.Lock()
with lock:
    print("Critical section 1")
    print("Critical section 2")

A more advanced use of a context manager is an example of trying to remove a file that might not be there:

with ignored(OSError):
    os.remove("file.tmp")

Note that Python 3.4 provides this as contextlib.suppress; ignored is the name used in the video. It is also possible to write your own version of ignored: a context manager with a try block that yields, and an except clause for the exceptions to ignore that simply passes.
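One way such a hand-written ignored could look is sketched below using contextlib.contextmanager. This is an illustrative implementation rather than the exact code from the video, and the file name is a placeholder assumed not to exist:

```python
import os
from contextlib import contextmanager

@contextmanager
def ignored(*exceptions):
    """Swallow the given exception types raised inside the with block."""
    try:
        yield
    except exceptions:
        # One of the listed exceptions occurred: drop it silently.
        pass

with ignored(OSError):
    # Raises FileNotFoundError (a subclass of OSError) if the file is missing.
    os.remove('no_such_file.tmp')

print('still running')
```

Exceptions that are not listed still propagate as usual, so only the expected failure is hidden.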