MongoEngine/mongoengine

switch_collection method is not thread safe

Opened this issue ยท 15 comments

Hi guys,

We just run into an issue when archiving data from one collection into another using switch_collection in multi threaded environment (sample code we used below):

def archive(self):
        self.switch_collection(Item._meta['archive_collection'])
        self.save()
        self.switch_collection(Item._meta['collection'])
        self.delete()

The scenario is a following:
1st thread runs archive, executes self.switch_collection(Item._meta['archive_collection'])
2nd thread wants to query different document and doesn't find it, because 1st thread has switched collection.

It would be a good idea to make switch_collection (and preferably switch_db) method thread safe.
What do you think?

Hi

I did a little investigation. Document.switch_collection internally use context manager switch_collection. That manager rewrite collection name on document class. Is this behavior necessary?

@matino , why not wrap with lock your archive and query methods?

@mmarksnippety , I think it is because otherwise how would you be able to do other actions on the second collection?

@DavidBord - I would have to lock all queries in a non archived collection, remember about locking new queries - not really an elegant solution for me.

I think that read locking the collection while in the context of switch_collection would also lock the collection for reading from the collection switch_collection switched to

I need to be 100% sure ;)

But its not only about you, right? :)
I don't think that starving all readers between calls to switch_collection is a proper behaviour so I am marking this as won't fix

Regardless of mongoengine, I would have separated the archive and query stuff into 2 separate processes because I wouldn't want a problem with one of them to affect the other

@DavidBord - fine by me, thanks for your 2prompt replies anyway! Maybe there should be a warning at least in the docs about this behavior? What do you think?

+1 for including a warning in the docs. We've been seeing a lot of strange bugs that all turned out to be caused by the non-thread-safety of the switch_db context manager. It would've been really helpful if there were a big fat warning in the docs.

Why not completely review this or maybe even remove this feature? I know that this sound bald but beyond the thread issue the concept of a "switch_db" is already dangerous, doesn't looks like an streamlined code.

Ideally one should use simply use connA and connB to reference different collections, because in fact they are different connections

Ran into the same issue today, also in a multithreaded environment. Hours spend catching an inconsistent bug because of such behaviour. Ended up in mongoengine's sources (leaky abstraction, yeah). Docs, unfortunately, still have no info about switch_db's (lack of) thread-safety. A little example of consequences:

Thread 1:

q = Question.objects(...)  # assuming that we use some db
# processing q's data, time goes...
q.save()  # intending to use the same db as for the query, nothing seems to affect q's connection

Thread 2:

outgoing_question = Question(...)
outgoing_question.switch_db('some_alias')
# at this moment q from Thread 1 successfully goes to 'some_alias', wow!
outgoing_question.save() 

Thread 2 is almost the same as the usage shown in switch_db's docstring. Expected behaviour's logic is like that: switching with an instance, thus affecting only this instance, right? Unexpectedly, no. First lines of current switch_db's implementation:

    def switch_db(self, db_alias, keep_created=True):
        """< docstring > """
        with switch_db(self.__class__, db_alias) as cls:  # switches the db on the whole class
            collection = cls._get_collection()
            db = cls._get_db()

Summarizing: it's quite an unexpected move to switch the whole Document's database affecting all queries and instances when using switch_db method of an instance, isn't it? Manual locking seems completely inelegant in my case as well, because real code is rather more complex than the examples I've provided, but that seems to be the only option for now.

+1 for feature implementation reviewing, or at least a sad warning in docs.

My approach to resolve this problem was to stop using switch_collection altogether and instead dynamically generate one class per collection using inheritance. See here for more details.

I think something similar can be done to switch_collection itself, by having it dynamically generate a new class per collection requested (if it hasn't already generated it).

6 years passed and the bug which is "just a little bit" critical unless you do a lot of concurrency testing is still open :(

rather than switch, what about using using(), i haven't looked at the source code yet, but presumably this only affects the calling thread? https://docs.mongoengine.org/apireference.html#mongoengine.queryset.QuerySet.using (and I assume there is a similar thing for inserts, updates, etc.)...?

I took a look at the code and it all uses get_db() under-the-hood in a non-thread-safe way. I've proposed a change to support switching db's in a safe way, which could be extended to collections. See here: #2686