didactic-meme/pyfuncol

[FEATURE] Optimise chained transformations

Opened this issue · 5 comments

Describe the solution you'd like
We should optimise chained transformations in a way similar to Apache Spark: they should be lazy and re-ordered so that transformations that reduce the number of elements are executed as early as possible.

That's a good idea. Do you have any idea how we could detect that calls are being chained?

We could instantiate an object that keeps track, in a dict, of which methods have been called whenever a transformation is invoked, then re-orders them and executes them. For example, with list.map(...).filter(...), the map call instantiates the object, which records the map operation. The filter call is then recorded as well, the operations are re-ordered so that filter runs first, and then they are executed.
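A rough sketch of that idea, in case it helps the discussion. Everything here is hypothetical (`RecordedChain`, `plan` and `run` are not part of pyfuncol), and it uses a list rather than a dict because the order of calls matters. One caveat: a filter placed after a map sees the mapped values, so moving it first is only equivalent if the predicate is composed with the map function. That means the map function runs again on the elements that are kept, so the swap is not automatically a win.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple

# One recorded operation: its kind ("map" or "filter") and the user function.
Op = Tuple[str, Callable[[Any], Any]]


class RecordedChain:
    """Hypothetical helper that records transformations instead of running them."""

    def __init__(self, source: Iterable[Any], ops: Optional[List[Op]] = None):
        self._source = source
        self._ops: List[Op] = list(ops or [])

    def map(self, fn: Callable[[Any], Any]) -> "RecordedChain":
        # Record the call and return a new chain; nothing is executed here.
        return RecordedChain(self._source, self._ops + [("map", fn)])

    def filter(self, pred: Callable[[Any], Any]) -> "RecordedChain":
        return RecordedChain(self._source, self._ops + [("filter", pred)])

    def plan(self) -> List[Op]:
        """Move each filter ahead of a directly preceding map."""
        planned: List[Op] = []
        for kind, fn in self._ops:
            if kind == "filter" and planned and planned[-1][0] == "map":
                _, mapper = planned.pop()
                # The predicate expects mapped values, so compose it with the mapper.
                planned.append(("filter", lambda x, p=fn, m=mapper: p(m(x))))
                planned.append(("map", mapper))
            else:
                planned.append((kind, fn))
        return planned

    def run(self) -> list:
        """Execute the re-ordered plan eagerly and return a list."""
        items = list(self._source)
        for kind, fn in self.plan():
            if kind == "map":
                items = [fn(x) for x in items]
            else:
                items = [x for x in items if fn(x)]
        return items


chain = RecordedChain([1, 2, 3, 4]).map(lambda x: x * 10).filter(lambda x: x > 20)
print([kind for kind, _ in chain.plan()])  # ['filter', 'map']
print(chain.run())  # [30, 40]
```

Note that `run()` still has to be called by hand here, which is exactly the open question of knowing when the chain ends.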

Mmhh yes, sounds good! But how do you know when the last call occurs, so that you can execute them all?

I guess that's why Spark has transformations and actions haha

We could do the same, but I don't like it very much

haha most probably ^^

Without doing something of the sort, I don't really see yet how we could do it... I'll think about it, but my intuition is that we can't without either manipulating the AST or having an explicit call to execute the transformations.
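For reference, this is roughly what the "explicit call" option could look like, as a minimal sketch. The names `LazySeq` and `collect` are made up for illustration (`collect` is borrowed from Spark's vocabulary) and nothing here reflects pyfuncol's actual API. Transformations only stack up generator steps, and the work happens when the action is called.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple

# A step turns one iterator into another without consuming it.
Step = Callable[[Iterator[Any]], Iterator[Any]]


class LazySeq:
    """Hypothetical lazy wrapper: transformations are deferred until collect()."""

    def __init__(self, source: Iterable[Any], steps: Tuple[Step, ...] = ()):
        self._source = source
        self._steps = steps

    def map(self, fn: Callable[[Any], Any]) -> "LazySeq":
        # Transformation: only stores a generator-producing step.
        return LazySeq(self._source, self._steps + (lambda it: (fn(x) for x in it),))

    def filter(self, pred: Callable[[Any], Any]) -> "LazySeq":
        return LazySeq(self._source, self._steps + (lambda it: (x for x in it if pred(x)),))

    def collect(self) -> list:
        # Action: builds the generator pipeline and consumes it.
        it: Iterator[Any] = iter(self._source)
        for step in self._steps:
            it = step(it)
        return list(it)


calls = []


def tracked_increment(x: int) -> int:
    calls.append(x)  # lets us observe when the function actually runs
    return x + 1


seq = LazySeq([1, 2, 3]).map(tracked_increment).filter(lambda x: x % 2 == 0)
print(calls)          # [] because nothing has run yet
print(seq.collect())  # [2, 4]
print(calls)          # [1, 2, 3]
```

The downside is the one already mentioned: users have to remember the final `collect()`, which changes how chained calls read compared to eager methods.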