spotify/pythonflow

Can you use Pythonflow in OOP?

Closed this issue · 6 comments

Hi there, thanks for open-sourcing this library. I am wondering whether Pythonflow can support object-based, stateful DAG relationships.

I have seen a similar library before, and a similar question was posted here:
man-group/mdf#23

I really think it would be cool for Python to have this kind of dataflow toolkit. It would be great to hear the case for Pythonflow.

@sorying, can you provide an example of what you're looking for? Pythonflow might support your use case. For example,

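import pythonflow as pf


class MyObject:
    def __init__(self, a):
        self.a = a

    def add_a(self, b):
        return self.a + b


# Build the graph: the method call on the placeholder becomes part of the graph.
with pf.Graph() as g:
    obj = pf.placeholder()
    b = pf.placeholder()
    result = obj.add_a(b)

# Evaluate `result`, supplying concrete values for both placeholders.
g(result, {obj: MyObject(3), b: 4})  # 7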

@tillahoffmann thanks for the quick response.

Perhaps what I am looking for is a DAG/dataflow concept integrated into a database object representation. With your example above, my database would most likely have many rows that share the same ORM representation of MyObject, each with a "calculated property" that is expensive to compute.

If I borrow an example from SQLAlchemy's Hybrid Attributes page
(http://docs.sqlalchemy.org/en/latest/orm/extensions/hybrid.html)

from sqlalchemy import Column, Integer
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+

Base = declarative_base()


class Interval(Base):
    __tablename__ = "interval"

    id = Column(Integer, primary_key=True)
    start = Column(Integer, nullable=False)
    end = Column(Integer, nullable=False)

    def __init__(self, start, end):
        self.start = start
        self.end = end

    @property
    def length(self):
        # Stands in for the expensive calculation.
        return self.end - self.start

    @property
    def is_very_long(self):
        return self.length > 1000

    @property
    def is_very_short(self):
        return self.length < 1000

So let's say the property `length` is a very expensive calculation (but it is still a pure function with no side effects).

`is_very_long` and `is_very_short` are both quick calculations based on `length`. With Pythonflow, if I have already calculated `is_very_long` once, then calculating `is_very_short` shouldn't need to re-compute the expensive `length`. Is there a property decorator for that?

thanks for the pointer

@sorying, it sounds like a standard cache may be more appropriate for your use case.

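class Interval:
    def __init__(self):
        self._length = None  # None means "not computed yet"

    @property
    def length(self):
        # Compute the expensive value on first access only, then reuse it.
        # some_expensive_calculation is a placeholder for the real computation.
        if self._length is None:
            self._length = some_expensive_calculation()
        return self._length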

I'll close this for now because caching with an ORM probably lies outside the scope of Pythonflow. But feel free to reopen.
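
For reference, the caching pattern above can also be written with the standard library's `functools.cached_property` (Python 3.8+). This is a minimal sketch on a plain Python class, not a Pythonflow feature and not tested against ORM-mapped classes; the `length_calls` counter is a hypothetical addition that only shows how often the expensive step runs.

```python
from functools import cached_property  # Python 3.8+


class Interval:
    """Plain-Python stand-in for the ORM class discussed above."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.length_calls = 0  # hypothetical counter, for demonstration only

    @cached_property
    def length(self):
        # Stand-in for the expensive calculation; the counter records each run.
        self.length_calls += 1
        return self.end - self.start

    @property
    def is_very_long(self):
        return self.length > 1000

    @property
    def is_very_short(self):
        return self.length < 1000


interval = Interval(0, 5000)
print(interval.is_very_long)   # True
print(interval.is_very_short)  # False
print(interval.length_calls)   # 1: length was computed only once
```

The cached value is stored on the instance and is not invalidated when `start` or `end` change; `del interval.length` clears it so the next access recomputes.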

@sorying, would you be able to provide an MCVE, e.g. as a gist, so I can get a better understanding of (a) what your requirements are and (b) how you want to use Pythonflow?

@tillahoffmann - do you have any examples of using Pythonflow with Python objects, classes, or instances?