Add Mongodb backend

Question

michielbdejong opened this issue 6 years ago · 5 comments

Redis is a cache that can also persist. Mongo is a persister that can also cache. :)
Mongo is more popular than Redis, although that is probably almost entirely due to its use as a model store in Rails and similar MVC frameworks.

But I do like the idea of having a persistence-oriented persistence layer, because data loss should be the biggest concern in a pod-server, much more important than latency/throughput.

So it makes sense to implement a MongoDB backend as well!

Just looking at the docs, we could maybe use https://docs.mongodb.com/manual/tutorial/model-embedded-one-to-many-relationships-between-documents/ for the Container-Member relationship...

Answer 1 · 2019-05-17T08:22:46.000Z

Ah no, "to model large hierarchical data sets", "use normalized data models", see https://docs.mongodb.com/manual/core/data-model-design/#data-modeling-embedding. So I'll probably have to use https://docs.mongodb.com/manual/core/transactions/#transactions-api

Answer 2 · 2019-05-17T08:39:18.000Z

OK, so answer: https://stackoverflow.com/questions/16523621/atomicity-and-cas-operations-in-mongodb and https://docs.mongodb.com/manual/core/write-operations-atomicity/#update-if-current

I'll use MongoDB documents that just say { "value": "..." } and then set the current value as an update condition. This is going to be very bad in terms of performance compared to Redis, though: in Redis I only have to do WATCH, whereas in Mongo I actually have to retrieve the current value first. For large blobs that's going to be a nightmare.
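A minimal sketch of that update-if-current idea, in Python for illustration. It only builds the filter and update documents; the function name and the use of `_id` as the key are assumptions, not something from the pod-server code. With pymongo, the pair would be passed to `collection.update_one(query, update)`, and `result.matched_count == 1` would indicate that the stored value was still the one we read.

```python
from typing import Any, Dict, Tuple


def build_update_if_current(
    key: str, current_value: str, new_value: str
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (filter, update) for a compare-and-set on the stored value.

    Hypothetical helper: the document shape {"_id": ..., "value": ...}
    is an assumption made for this sketch.
    """
    # The filter matches only if the stored value still equals what we read,
    # so a concurrent writer makes the update match zero documents.
    query = {"_id": key, "value": current_value}
    # $set replaces the value field and leaves other fields untouched.
    update = {"$set": {"value": new_value}}
    return query, update


if __name__ == "__main__":
    query, update = build_update_if_current("/alice/notes.txt", "old", "new")
    print(query)
    print(update)
```

Note that the whole current value travels to the server inside the filter, which is the cost for large blobs described above.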

Note that this is not necessary for containers, they're basically read-only except for the 'DELETE' operation, and for that we can set the update condition to { "members": [] }, thanks to solid/solid-spec#172.
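As a rough illustration of the container case, the condition can be expressed as a filter that only matches a container whose member list is empty. The helper names, the `_id` key, and the choice of a conditional delete are assumptions for this sketch. With pymongo the filter would go to `collection.delete_one(query)`, and `result.deleted_count == 1` would mean the container was empty and has been removed. The `matches` function is a local stand-in that compares top-level fields for equality; it is not MongoDB's query engine.

```python
from typing import Any, Dict


def build_empty_container_delete(container_uri: str) -> Dict[str, Any]:
    """Filter matching a container document only when it has no members."""
    # Equality against [] requires "members" to be an empty array,
    # so a non-empty container is left alone.
    return {"_id": container_uri, "members": []}


def matches(doc: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """Local stand-in: plain top-level equality, for illustration only."""
    return all(doc.get(field) == expected for field, expected in query.items())


if __name__ == "__main__":
    query = build_empty_container_delete("/alice/photos/")
    empty = {"_id": "/alice/photos/", "members": []}
    full = {"_id": "/alice/photos/", "members": ["cat.jpg"]}
    print(matches(empty, query), matches(full, query))
```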

Answer 3 · 2019-05-17T08:41:53.000Z

Ah, for updating large blobs we can add a 'version' field, which is retrieved on getBlob and then used as a condition (and updated!) on setData.
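A sketch of the version-field variant, again in Python and again only building the documents. The function name and the integer counter are assumptions; the source only says a 'version' field is read on getBlob and used as a condition (and updated) on setData. With pymongo, the pair would be passed to `collection.update_one(query, update)`, and `result.matched_count == 0` would signal that someone else wrote in between.

```python
from typing import Any, Dict, Tuple


def build_versioned_update(
    key: str, expected_version: int, new_value: str
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (filter, update) for a compare-and-set keyed on a version counter.

    Hypothetical helper: assumes documents shaped like
    {"_id": ..., "value": ..., "version": <int>}.
    """
    # Only the small version number is compared, not the blob itself.
    query = {"_id": key, "version": expected_version}
    update = {
        "$set": {"value": new_value},  # store the new blob
        "$inc": {"version": 1},        # bump the version in the same write
    }
    return query, update


if __name__ == "__main__":
    query, update = build_versioned_update("/alice/big.bin", 7, "new contents")
    print(query)
    print(update)
```

Compared with using the value itself as the condition, the filter stays small regardless of blob size.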

Answer 4 · 2019-05-26T22:28:41.000Z

We've used Redis as a primary data store with no issues. It isn't just a cache database; it's a persistent database with caching abilities as well (by way of expiration and LRU configuration).

We did use MongoDB for a lot of things for a while, but when you're only looking up data by key (or URI value), there is no need for MongoDB.

That said, MongoDB is a good option, and would even allow you to use its _id field as the primary URI.

I think it would be nice to abstract away whether it is MongoDB or Redis by way of a "key store", which can be backed by either. Users could specify either REDIS_URL or MONGO_URL; if neither is provided, fall back to whatever the default is (e.g. fs store).
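The selection logic could look something like the Python sketch below. The function name and the returned labels are made up for illustration, and the precedence when both variables are set (Redis first) is an assumption, since the suggestion above does not cover that case.

```python
import os
from typing import Mapping, Optional, Tuple


def select_backend(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Pick a key-store backend from environment-style settings.

    Returns (backend_name, url). Hypothetical helper for illustration.
    """
    # Assumption: REDIS_URL wins if both variables are set.
    if env.get("REDIS_URL"):
        return "redis", env["REDIS_URL"]
    if env.get("MONGO_URL"):
        return "mongodb", env["MONGO_URL"]
    # Neither configured: fall back to the default (filesystem) store.
    return "fs", None


if __name__ == "__main__":
    backend, url = select_backend(os.environ)
    print(backend, url)
```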

So really what I'm saying is: define the pod's requirements for a data store, rather than letting the choice of data store define the requirements.
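One way to read this is as a small interface that any backend must satisfy. The Python sketch below is an assumption about what such requirements might be (a versioned read plus a conditional write, borrowing the getBlob/setData names from earlier in the thread), with an in-memory implementation standing in for Redis or MongoDB.

```python
from typing import Dict, Optional, Protocol, Tuple


class BlobStore(Protocol):
    """Hypothetical storage requirements; not the pod-server's actual API."""

    def get_blob(self, key: str) -> Optional[Tuple[str, int]]:
        """Return (value, version), or None if the key does not exist."""
        ...

    def set_data(self, key: str, value: str, expected_version: Optional[int]) -> bool:
        """Write only if the stored version equals expected_version."""
        ...


class InMemoryBlobStore:
    """Reference implementation used here in place of a real backend."""

    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[str, int]] = {}

    def get_blob(self, key: str) -> Optional[Tuple[str, int]]:
        return self._docs.get(key)

    def set_data(self, key: str, value: str, expected_version: Optional[int]) -> bool:
        current = self._docs.get(key)
        current_version = current[1] if current is not None else None
        # Reject the write if someone else changed (or created) the blob.
        if current_version != expected_version:
            return False
        self._docs[key] = (value, (current_version or 0) + 1)
        return True


if __name__ == "__main__":
    store: BlobStore = InMemoryBlobStore()
    store.set_data("/alice/notes.txt", "hello", None)  # create
    print(store.get_blob("/alice/notes.txt"))
```

A Redis- or MongoDB-backed class would then only need to provide the same two methods.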

Answer 5 · 2019-07-08T09:40:44.000Z

We're pretty happy with Redis atm; we may reopen this later if pod providers explicitly ask us for this feature.