Dictionary arguments cause combinatorial memory usage in `add_task`
richpsharp opened this issue · 0 comments
To reproduce:
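import taskgraph

task_graph = taskgraph.TaskGraph('.', -1)

# Build a 4000-key dictionary whose values all point at the same small dict.
arg_dict = {}
x = {None: None}
for i in range(4000):
    arg_dict[i] = x

def my_op(my_dict):
    print(my_dict)

# Memory use climbs here, before my_op ever runs.
task_graph.add_task(func=my_op, args=(), kwargs={'my_dict': arg_dict})
task_graph.join()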
Then observe heavy memory use before `my_op` is called. This is caused by a bug in `Task._filter_non_files`, which handles dictionary arguments incorrectly: instead of cleaning the dictionary's values, it builds a tuple for each element of the dictionary, and each tuple includes a copy of the original dictionary.
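To make the growth concrete, here is a minimal sketch of the pattern described above. It is not the actual `Task._filter_non_files` code: `clean_dict_buggy`, `clean_dict_linear`, `clean_value`, and `total_entries` are hypothetical names made up for illustration, and the sketch only assumes that each per-key tuple carries a shallow copy of the whole dictionary.

```python
def clean_dict_buggy(arg_dict):
    """Hypothetical stand-in for the failure mode described in this issue.

    Builds one tuple per key, and each tuple holds a shallow copy of the
    entire dictionary, so the result holds n * n entries for n keys.
    """
    return [(key, dict(arg_dict)) for key in arg_dict]


def clean_dict_linear(arg_dict, clean_value=lambda value: value):
    """Hypothetical alternative: clean each value once, keep one dictionary.

    `clean_value` is a placeholder for whatever per-value filtering is
    needed; the default leaves values unchanged.
    """
    return {key: clean_value(value) for key, value in arg_dict.items()}


def total_entries(cleaned):
    """Count the dictionary entries held across all tuples."""
    return sum(len(dict_copy) for _, dict_copy in cleaned)


n = 200  # smaller than the 4000 in the report so this runs quickly
arg_dict = {i: {None: None} for i in range(n)}

buggy = clean_dict_buggy(arg_dict)
linear = clean_dict_linear(arg_dict)

buggy_entries = total_entries(buggy)
linear_entries = len(linear)
print(buggy_entries, linear_entries)
```

Under the same assumption, the 4000-key dictionary from the reproduction would expand to 4000 * 4000 = 16,000,000 entries, whereas a per-value clean keeps it at 4000.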