morepath/morepath

Peewee and PonyORM support

Opened this issue · 30 comments

I just ran into this lightweight ORM:

http://docs.peewee-orm.com/en/latest/

May be worth it supporting it along the lines of more.transaction, or maybe even with more.transaction if we can get Peewee to talk to the transaction module.

http://docs.peewee-orm.com/en/latest/peewee/transactions.html

Just looking into Peewee. Looks really simple to use and reason about but still powerful.
Was nearly convinced to use it when I found Pony ORM.

Looks really awesome!

Pony is an advanced object-relational mapper. The most interesting feature of Pony is its ability to write queries to the database using Python generator expressions. Pony analyzes the abstract syntax tree of the generator expression and translates it to into SQL query.

@faassen What do you think?

I saw a talk about PonyORM years ago. It looked really cool if I recall correctly, but I didn't look into it further as the license wasn't compatible with commercial projects (you'd need to buy one). But they changed to an Apache license at one point and I need to look into it again now.

I'm not yet decided. Would like to integrate one of them in my Morepath project.

So if you have looked into please tell me what you think.

I think ponyorm is really cool. We need to think about zope.transaction integration, so it can work with more.transaction. Alternatively you could integrate it directly.

href commented

+1 for zope.transaction integration. Usually you want to tie in other things to the transaction (like e-mail delivery or file creation). Zope.transaction is perfect for this.

For now I start to integrate it directly and it seems to work well.

@href I don't really understand zope.transaction yet so maybe you can give me a hint.

href commented

It's really just a generic transaction implementation. Concrete implementations like SQLAlchemy can hook themselves into zope.transaction. This gives you the ability to always use the same interface and to tie multiple modules together that provide transactions independently.

As the end-user your only interface is the transaction module:

import transaction
try:
    # do things
except:
   transaction.abort()
   raise
else:
   transaction.commit()

More.transaction integrates with zope.transaction (as do other web-frameworks), which means you can have an unlimited number of things provide a transaction which will only be committed if the request was successful.

For example, we use this to tie in our mail-sending and our file-writing into the transaction (the file-writing is used to send sms). As a result we can be sure that our confirmation e-mail only goes out if action that prompted it was actually successful. Writing your own datamanagers for simple things like this is quite easy:

https://github.com/OneGov/onegov.core/blob/master/onegov/core/datamanager.py

For more about zope.transaction see the docs:

http://transaction.readthedocs.io/en/latest/

If you wanted to integrate it with Peewee or PonyORM you probably want to look at the way SQLAlchemy does it:
https://github.com/zopefoundation/zope.sqlalchemy/tree/master/src/zope/sqlalchemy

@href thanks that sounds interesting. I will look into it.

Hmm the thing is that pony_orm makes it really easy to create the transaction. You just wrap your transaction function with a @db_session decorator like

@db_session
def check_user(username):
    return User.exists(username=username)

When the session ends it performs the following actions:

  • Commits transaction if data was changed and no exceptions occurred otherwise it rolls back transaction.
  • Returns the database connection to the connection pool.
  • Clears the Identity Map cache.

So you can be really explicit where a commit happens.

I was working before with pure SQL so this looks natural to me.

href commented

Sure, you can use any other way of handling transactions, I would recommend you do tie the transactions to the request in some way though. But that's your personal decision.

However, if we integrate any of those ORMs into morepath (say through more.pewee or more.ponyorm) we should do it with more.transaction, otherwise we'll end up with two incompatible ways of handling transactions in morepath!

That's what I'd do if I wasn't a big SQLAlchemy fanboy :)

href commented

Come to think of it we don't even need more.pewee or more.ponyorm. As soon as there's a zope.transaction integration for any of those orms we are basically done (though we could add tests for popular integrations, like we do with sqlalchemy).

@href just looked into zope.transaction and, zope.sqlalchemy.
It seems to add much boilerplate and make it less intuitive.

The charm of pony is, that it's lightweight, easy to reason about and intuitive.
Similar to Morepath.

I think for someone not coming from the zope world it's really much to learn if he wants to use more.transaction. I already tried several times to go into it but it's really difficult for me.

Pony on the other hand seems to be very intuitive.

@href @faassen In the pony code base there's an example for integration with bottle.
It uses a simple plugin to tie the transactions to the request.
So maybe something similar would be enough for more.pony.

@href Maybe it's not so bad to have a more lightweight solution for people like me.

@href @faassen Is there some documentation how to use more.transaction?

Any API?

If someone is interested in the internals of pony ORM, here is an explanation from the author.

Really mind blowing! 😉

href commented

I'm not sure we have any other documentation than this: http://blog.startifact.com/posts/racing-the-morepath-sqlalchemy-integration.html

@faassen I have published morepath_ponyorm.
If you want I can move it to Morepath.

@href Maybe you can have a look at morepath_ponyorm.

I see several points why I would prefer to use Pony ORM directly and not wrap it with more.transaction:

  • Simplicity: Pony ORM provides so much simplicity that it would be a pity to loose it.
  • Pony has his own transaction manager which is also configurable.
    The signature is
    db_session(allowed_exceptions=[], immediate=False, retry=0, retry_exceptions=[TransactionError], serializable=False, strict=False)
  • Performance: Pony ORM claims to be 1.5-3 times faster than Django and SQLAlchemy even without query result caching. In addition it has a better cache mechanism. Don't know what happens when we use it with more.transaction.
  • more.transaction is hard to maintain. When I refactored morepath_sqlalchemy I tried to incorporate the changes to pyramid_tm until more.transaction was forked, but at the end I gave up only removing some legacy code because I not really understand it and I don't know what is relevant for morepath.
  • Lack of documentation for more.transaction.

Still remains your argument about integration of file upload and sending emails. At the moment I have no use case.
But it seems that this has not to be tight together with the database transactions.

We also have no documentation for this.

href commented

One thing I've noticed is that you're doing transactions manually - unless I'm missing something. This might lead to behaviour that the users of your applications might not expect.

For example, this is how you create a record:

@App.html(model=DocumentCollection, name='add_submit', request_method='POST')
def document_collection_add_submit(self, request):
    with request.db_session:
        title = request.POST.get('title')
        content = request.POST.get('content')
        document = self.add(title=title, content=content)
        request.db_commit()
        return "<p>Awesome %s</p>" % document.id

Since your commit happens manually, your database changes even if your request fails. For example, it could fail further up the stack in a tween, or you could have an error in your template rendering (assuming you added that).

If that happens, your users will see an error and they would rightfully assume that their request was unsuccessful. However, internally your transaction went through and the record was created.

Now unless you're adding some kind of transaction handling in a tween (like more.transaction does), your web application will behave in an unorthodox manner: It'll show an error, even though the goal of the request (to add a record) was achieved.

It's not necessary to commit manually.
Normally this happens automatically.
But here I need to get the document.id of the newly inserted document.
If I don't commit the id doesn't exist.

@henri-hulski, when you say committing happens automatically, do you mean it happens when you leave the with statement? As there's no integration with Morepath's request system, right? @href pointed out that if you don't do it in a tween there's no way to unroll in case of HTTP errors, and the with statement's automatic commit behavior doesn't change this.

The transaction system makes available subtransaction (savepoint) functionality to handle the use case of committing something so you have an id, if I recall correctly.

I would prefer it if we had transaction (and thus more.transaction) integration for other ORMs, including Pony. We should investigate whether we can use Pony's db_session to implement transaction. If that is really not possible, I would still prefer it if we could at least create a tween that behaves like more.transaction but uses the Pony session mechanism, as it seems to make sense to me to relate the transaction to the request, not just individual methods.

I think we can forget about zope.sqlalchemy, that indeed is something I designed years ago to make SQLAlchemy's scoped session mechanism fit with the Zope Component Architecture, and it's not very relevant to Morepath. Except we may look at the source code to remind ourselves how it cleverly integrates with the transaction module.

@href Yeah after looking at it more carefully I think you're right that we need a tween.

My initial plan was actually to make it first work like this and then move db_session to more.pony and use it inside a tween. But when I saw that it works without I forgot why it's necessary.
So thanks for reminding.

@faassen hmm actually not really sure how to archive this. db_session can either be used as decorator or context manager.
In the docs a tween factory looks like this:

def make_tween(app, handler):
    def my_tween(request):
        print "Enter"
        response = handler(request)
        print "Exit"
        return response
    return my_tween

So how can I attach here db_session?

href commented

Oh I didn't know you also wrote zope.sqlalchemy. Thanks for doing that :)

I still think integrating PonyORM/Pewee into transaction is the way to go. As I stated, I can already tie various things to the transaction underlying my requests, so to me it seems adding PonyORM is just another.

more.transaction would become the transaction handler to rule them all, which will be helpful for future integrations (e.g. integrating Mailgun or Twilio).

That being said, technically we could also just provide our own API with more.transaction to accomplish that, but I think there needs to be a real good reason to not reuse transaction.

As Pony is build around its own transaction manager will this not be in conflict with an external transaction manager?
Do they not try to make the same things?
Is it not enough if we tie these different things all to the request instead of the database transaction?
There seems to be also some incompatibility. Pony eg. doesn't support savepoints.

@href It would be nice if you could create a simple test app which shows this problem so we can try different solutions.

Another point is if someone wants to use Pony he probably knows the Pony API and why he should use another transaction API.

href commented

I went ahead and tried writing a transaction integration for Pony. Not as a real project but just to see what it would entail and to gain a better understanding of the issue.

I certainly see now where you're coming from @henri-hulski. It's not that Pony is all that incompatible with such an integration, but that it's basically a black box. There are no ways to add any hooks to the transaction machinery. This in stark contrast to SQLAlchemy which is very much open to all kinds of extensions.

It is also certainly an integration that needs to be approached carefully. To do it properly there would have to be changes to PonyORM as well. At least some basic hooks/events.

Unfortunately I can't be of too much help, because I have no use for another ORM than SQLAlchemy and as it is I have no time to write the things I actually need ;)

Here's the code I produced while exploring:
https://gist.github.com/href/f7627eee7fa10f6f78f0a43ca0ba7619

One last thing, to get the id from a new record you do not need to commit the session, you just need to flush it. I think all ORMs should work that way.

@href Thanks for looking into this.
I think for now I try to wrap db_session around the request in a tween.
But when I decide to use transaction your gist is a good start point.

You're right with using flush instead of commit. I fixed that yesterday but didn't upload yet.
That's the same way we do it in morepath_sqlalchemy.

Ok it was actually really simple. I had only to decorate the tween with @db_session.
I also added some settings.

@App.setting_section(section='pony')
def get_pony_settings():
    return {
        'allowed_exceptions': [],
        'immediate': False,
        'retry': 0,
        'retry_exceptions': [TransactionError],
        'serializable': False,
        'strict': False
    }


@App.tween_factory(over=morepath.EXCVIEW)
def pony_tween_factory(app, handler):

    @db_session(
        allowed_exceptions=app.settings.pony.allowed_exceptions,
        immediate=app.settings.pony.immediate,
        retry=app.settings.pony.retry,
        retry_exceptions=app.settings.pony.retry_exceptions,
        serializable=app.settings.pony.serializable,
        strict=app.settings.pony.strict
    )
    def pony_tween(request):
        return handler(request)

    return pony_tween

@faassen If you're agree i can extract this to more.pony.

@href Pony provides actually some entity hooks:

  • after_delete(): Called after the entity instance is deleted in the database.
  • after_insert(): Called after the row is inserted into the database.
  • after_update(): Called after the instance updated in the database.
  • before_delete(): Called before deletion the entity instance in the database.
  • before_insert(): Called only for newly created objects before it is inserted into the database.
  • before_update(): Called for entity instances before updating the instance in the database.

I don't think they are compatible with Transaction, but they allow some fine-grained actions when changing the database, eg. sending an email after updating payment status.

I agree more.transaction is the way to go to have a generic transaction system where you hook other components and tie them together. It's a better approach to integrate other modules with or without transactions through their own hooks/events to more.transaction rather than making your project relying on a specific transaction machinery, like SQLA or Pony.

I've released more.pony a while ago which integrates the above tween in Morepath.

We can still create a more.transaction integration for Pony if someone is interested.
For now more.pony works fine for me.

Not sure if we should close this issue for now.