bmuller/twistar

find() variant returning a generator?

Opened this issue · 4 comments

Hi,

In my code I sometimes have to go through all objects of a certain type. When I use the DBObject.all() function, it seems all objects are instantiated in one large array. Would it be possible to have all() or find() return a generator, so that code like
for object in Object.all():
     object.doSomething()
would involve only one instance of Object at a time, which could be garbage collected after each iteration in the loop?

Jan-Pascal

Hey @janpascal - This is made a bit more complicated given the asynchronous nature of the queries. What sort of interface were you thinking of?

Hi Brian,

I've only been using twisted for a couple of weeks, so forgive me if I think too lightly of these things...

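Maybe something along the lines of

@inlineCallbacks
def f():
     ...
     gen = MyObject.all(generator=True)
     # gen.next() returns a Deferred; wait for each object in turn
     o = yield gen.next()
     while o is not None:
          o.doSomething()
          o = yield gen.next()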

Would that be possible? gen.next() would return a Deferred that fires with either the next object or None.

Thanks

Jan-Pascal

That could work, but it wouldn't be able to function like a true generator. For instance, you wouldn't be able to do:

for o in MyObject.all(generator=True):
     print o.name

It could maybe be done as a class method that takes a function that is called with a list of objects at a time (like find_in_batches in ActiveRecord). That would look something like:

def print_names(olist):
     for o in olist:
          print o.name
     # may return a deferred

def done(result):
     # addCallback passes the Deferred's result as the first argument
     print "Finished finding and printing everything"

MyObject.find_in_batches(print_names).addCallback(done)

But it would have to be done carefully to avoid blowing out the call stack if there are a ton of calls.

I like the find_in_batches pattern. It would suit my needs, and it looks
like a clean interface, especially if you could give it an optional
batch_size argument.