Combination of map and accumulate
rindis opened this issue · 0 comments
I found myself needing something that seemed like a combination of map and accumulate, and thought it might be a useful addition to toolz.
The problem: you have a sequence of objects and want to manipulate each one, but each manipulation depends on the result of the previous manipulation.
Example
Given the following list of dicts:
data = [{'id': 'a', 'value': 1},
        {'id': 'b', 'value': 2},
        {'id': 'c', 'value': 3},
        {'id': 'd', 'value': 4}]
I would like to take the number x = 4 and subtract the 'value' property of each element in data from it in turn, reducing that element's 'value' by the amount used up, until x becomes zero.
This is the expected behaviour of a function that does what I want:
>>> data = [{'id': 'a', 'value': 1},
...         {'id': 'b', 'value': 2},
...         {'id': 'c', 'value': 3},
...         {'id': 'd', 'value': 4}]
>>> func(data, 4)
[{'id': 'a', 'value': 0}, {'id': 'b', 'value': 0}, {'id': 'c', 'value': 2}, {'id': 'd', 'value': 4}]
Notice that for the dictionaries with 'id': 'a' and 'id': 'b', the 'value' is 0. For 'id': 'c' the value is 2, because at this point the value of x is 1. For 'id': 'd' the value is the same as the original, because x is 0 at this point.
A direct implementation of func could be:
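from toolz import assoc_in

def func(data, x_init):
    x = x_init
    new_data = []
    for elem in data:
        # Reduce this element's value by whatever is left of x, never going below zero
        new_data.append(assoc_in(elem, ['value'], max(elem['value'] - x, 0)))
        # Use up x by this element's original value
        x = max(x - elem['value'], 0)
    return new_data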
Which can be refactored to:
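def f(acc, elem):
    # Map step: build the new element from the current accumulator value
    return assoc_in(elem, ['value'], max(elem['value'] - acc, 0))

def g(acc, elem):
    # Accumulate step: update the accumulator from the original element
    return max(acc - elem['value'], 0)

def func(data, x_init):
    x = x_init
    new_data = []
    for elem in data:
        new_data.append(f(x, elem))
        x = g(x, elem)
    return new_data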
And could be generalised as:
def mapacc(mapper, accumulator, sequence, accumulator_init):
    sequence = iter(sequence)
    acc = accumulator_init
    for elem in sequence:
        result = mapper(acc, elem)
        acc = accumulator(acc, elem)
        yield result
Which could be used to solve the above example by:
>>> list(mapacc(mapper=lambda acc, elem: assoc_in(elem, ['value'], max(elem['value'] - acc, 0)), # Manipulate element using acc
... accumulator=lambda acc, elem: max(acc - elem['value'], 0), # Update acc using element
... sequence=data,
... accumulator_init=4))
[{'id': 'a', 'value': 0}, {'id': 'b', 'value': 0}, {'id': 'c', 'value': 2}, {'id': 'd', 'value': 4}]
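For comparison, the same behaviour can probably be assembled from standard-library pieces, which shows in what sense this is a combination of map and accumulate. The sketch below is only an illustration, not a proposed toolz implementation: the name mapacc_via_accumulate is hypothetical, it assumes Python 3.8+ (for the initial argument of itertools.accumulate), and it uses a plain dict merge as a stand-in for assoc_in so that it runs without toolz.

```python
from itertools import accumulate, tee

def mapacc_via_accumulate(mapper, accumulator, sequence, accumulator_init):
    # Hypothetical name; a sketch of the idea, not part of toolz.
    # Two independent iterators over the same input:
    # one feeds accumulate, the other feeds map.
    for_acc, for_map = tee(sequence)
    # With initial=..., accumulate yields accumulator_init first, so each
    # element is paired with the accumulator value from *before* that
    # element was consumed.
    accs = accumulate(for_acc, accumulator, initial=accumulator_init)
    # map stops at the shortest input, which drops the final accumulator value.
    return map(mapper, accs, for_map)

data = [{'id': 'a', 'value': 1},
        {'id': 'b', 'value': 2},
        {'id': 'c', 'value': 3},
        {'id': 'd', 'value': 4}]

result = list(mapacc_via_accumulate(
    # {**elem, 'value': ...} stands in for assoc_in(elem, ['value'], ...)
    mapper=lambda acc, elem: {**elem, 'value': max(elem['value'] - acc, 0)},
    accumulator=lambda acc, elem: max(acc - elem['value'], 0),
    sequence=data,
    accumulator_init=4))
print(result)
```

This stays lazy like the generator version, but tee buffers one element between the two iterators, so a dedicated mapacc is arguably easier to read.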
EDIT
Updating the implementation to:
def mapacc(mapper, accumulator, sequence, accumulator_init):
    return zip(*_mapacc(mapper, accumulator, sequence, accumulator_init))

def _mapacc(mapper, accumulator, sequence, accumulator_init):
    sequence = iter(sequence)
    acc = accumulator_init
    for elem in sequence:
        result = mapper(acc, elem)
        acc = accumulator(acc, elem)
        yield result, acc
This version also returns how the accumulator was updated at each step:
>>> x, y = mapacc(mapper=lambda acc, elem: assoc_in(elem, ['value'], max(elem['value'] - acc, 0)),  # Manipulate element using acc
...               accumulator=lambda acc, elem: max(acc - elem['value'], 0),  # Update acc using element
...               sequence=data,
...               accumulator_init=4)
>>> x
({'id': 'a', 'value': 0}, {'id': 'b', 'value': 0}, {'id': 'c', 'value': 2}, {'id': 'd', 'value': 4})
>>> y
(3, 1, 0, 0)
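One caveat with the zip(*...) approach: for an empty sequence, zip(*...) yields nothing, so unpacking into x, y raises a ValueError. It also consumes the whole input eagerly. A minimal sketch of a variant that handles the empty case is below; the name mapacc_unzipped is hypothetical, the helper is repeated so the snippet runs on its own, and a plain dict merge stands in for assoc_in.

```python
def _mapacc(mapper, accumulator, sequence, accumulator_init):
    # Yield (mapped element, updated accumulator) pairs, one per input element.
    acc = accumulator_init
    for elem in sequence:
        result = mapper(acc, elem)
        acc = accumulator(acc, elem)
        yield result, acc

def mapacc_unzipped(mapper, accumulator, sequence, accumulator_init):
    # Hypothetical name; returns (results, accumulator_values) as two tuples.
    pairs = list(_mapacc(mapper, accumulator, sequence, accumulator_init))
    if not pairs:
        # zip(*[]) would yield nothing, so return two empty tuples explicitly.
        return (), ()
    results, accs = zip(*pairs)
    return results, accs

data = [{'id': 'a', 'value': 1},
        {'id': 'b', 'value': 2},
        {'id': 'c', 'value': 3},
        {'id': 'd', 'value': 4}]

x, y = mapacc_unzipped(
    # {**elem, 'value': ...} stands in for assoc_in(elem, ['value'], ...)
    mapper=lambda acc, elem: {**elem, 'value': max(elem['value'] - acc, 0)},
    accumulator=lambda acc, elem: max(acc - elem['value'], 0),
    sequence=data,
    accumulator_init=4)

empty_x, empty_y = mapacc_unzipped(lambda acc, elem: elem,
                                   lambda acc, elem: acc, [], 4)
```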