girder/girder_worker

Girder-friendly locks

Opened this issue · 7 comments

I'm currently working on a GW task in which I download data from a Girder item, update that data, and then upload back to the item. Obviously, if more than one instance of this task is run at the same time, the update from one could overwrite the other, so I would like to use a lock to assert ownership over the item while the task does its work.

Celery has documentation for how to do this with the Django cache here. I'm wondering how best to do the equivalent of this in Girder. This may be a matter of developing a Girder plugin with endpoints to facilitate lock acquisition, or there may be some lighter solution which could be added to the documentation. I'm fairly agnostic on exactly what form the solution might take, but I'm interested in what other people think would be the best way to handle it.

I haven't fully thought this through, and I'm not very experienced with this sort of atomicity via MongoDB, so apologies if my proposed solution is incorrect.

I believe this can be solved in a truly atomic way using the MongoDB findAndModify method: set a "locked" field on the item's document and inspect the document that the call returns, which by default is the version from before the update. If the calling thread receives an object that already has the "locked" field set, it would know that another thread possesses the lock, and could fail, block, or do whatever you want in the case of lock contention. Releasing the lock would then just be a matter of deleting or modifying that field.
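For what it's worth, here is a minimal sketch of that idea in Python. It assumes `items` is a pymongo `Collection` and uses pymongo's `find_one_and_update`, which issues a findAndModify under the hood. The function names and the boolean "locked" field layout are just illustrative assumptions, not anything that exists in Girder today.

```python
def try_acquire_lock(items, item_id):
    """Return True if this caller took the lock, False if it was already held.

    `items` is assumed to be a pymongo Collection (or any object exposing
    the same find_one_and_update / update_one methods).
    """
    # find_one_and_update is atomic on a single document and, by default,
    # returns the document as it was *before* the update was applied.
    previous = items.find_one_and_update(
        {"_id": item_id},
        {"$set": {"locked": True}},
    )
    if previous is None:
        raise KeyError(f"no item with _id {item_id!r}")
    # If the field was already set, another caller holds the lock.
    return not previous.get("locked", False)


def release_lock(items, item_id):
    """Release the lock by removing the (hypothetical) "locked" field."""
    items.update_one({"_id": item_id}, {"$unset": {"locked": ""}})
```

Note that this sketch has no expiry: if a task crashes while holding the lock, the field stays set until something clears it, so a real version would probably want a timestamp or owner ID in that field.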

Since the use of this atomic primitive is not baked into Girder's core model layer, you would need a little bit of plugin code to accomplish it.

Let me know if you need help writing such plugin code, e.g. a REST endpoint to acquire the lock on a document. I think it's something I could prototype quickly.

Actually, since you're using Girder Worker, the sensible place to acquire the lock would be in the same request where you create and send off the worker task that will operate on the item. You could use a file reference value to signal the unlock upon upload of the output file. In any case, my offer to help write that code stands.

Just my two cents: this would be another reason to introduce Redis into the mix. Redis has pretty good support for HA distributed locking, and could replace RabbitMQ as the broker.
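As a rough illustration of what the Redis side could look like, here is a sketch of the common single-instance locking pattern (`SET` with `NX` and an expiry, then a compare-and-delete on release). It assumes `client` is a redis-py `redis.Redis` instance. The key prefix, function names, and default TTL are made up for the example, and this is the single-node pattern, not a multi-node HA setup.

```python
import uuid

# Delete the key only if it still holds our token, so that one task cannot
# release a lock that expired and was since taken by another task.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


def _lock_key(item_id):
    # Hypothetical key naming scheme; any unique per-item key would do.
    return f"girder:item-lock:{item_id}"


def acquire_item_lock(client, item_id, ttl_seconds=300):
    """Return a token if the lock was taken, or None if it is already held."""
    token = uuid.uuid4().hex
    # SET key token NX EX ttl: create the key only if it does not exist,
    # with an expiry so a crashed task cannot hold the lock forever.
    if client.set(_lock_key(item_id), token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_item_lock(client, item_id, token):
    """Return True if the lock was held with this token and is now released."""
    released = client.eval(RELEASE_SCRIPT, 1, _lock_key(item_id), token)
    return released == 1
```

A task would call `acquire_item_lock` before downloading the item, bail out or retry if it gets `None`, and call `release_item_lock` with the same token after the upload finishes. The TTL needs to be longer than the task can plausibly run.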

If we wanted to do this entirely on the worker side, with no help from the server, Redis would be fine. If we are OK with having Girder manage the locks, it's best just to do it with the existing DBMS.

Thanks very much, @zachmullen! I'll be bringing this up as part of a meeting later today and see who on the project will be tasked with working on this. We'll definitely ping you after that to at least get some guidance.

Edit: Sorry, I didn't reload the page, so I didn't see those last two comments. I am very interested in the costs/benefits of doing this in Mongo vs. introducing Redis, but it's not an area that I have much expertise in.

After discussion in the meeting, I think we'll be introducing Redis as the straightest path forward for making this happen. Once I've done so, I'll submit either a documentation PR or an example file showing how we did this for our use case, which will hopefully provide some guidance to other folks going forward. Thanks for the help with this!

Cool!