Usage with PouchDB?
Thanks for the article at https://loune.net/2017/04/using-aws-s3-as-a-database-with-pouchdb/ !
I was optimistic that I could provide an S3-backed db to the levelup PouchDB adapter until I arrived at the part of the article which says...
While unfortunately, we can’t use S3LevelDOWN with PouchDB
...and I couldn't find a reason why.
I accept all the limitations you mentioned (needing single-threading, not being performant for reverse key traversal, etc.), but I think none of these would be a particular issue in my case.
The reason I want an S3LevelDown implementation is for persistence (through replication) of a LevelUp PouchDB hosted in a NodeJS Lambda.
All data queries would be handled by the LevelUp PouchDB, and only CHANGES to the data would propagate through replication to the S3LevelDown PouchDB. Everything would take place in a single Node thread, in a Lambda with a reservedConcurrency of 1.
I am happy for a singleton to be responsible for this syncing to S3, and I'm hoping the replication protocol will end up reading and writing only the small number of S3 objects that have actually changed when a replication happens.
The S3LevelDown implementation would effectively be responsible for storing the index after an execution context had shut down. When a new execution context was brought up, it would be used to repopulate the LevelUp PouchDB.
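To make the intended flow concrete, here is a rough sketch; the memory adapter and the `persistent-stub` database are just stand-ins (the latter for the S3-backed instance, which is the part I haven't wired up):

```js
const PouchDB = require('pouchdb');
// Memory adapter stands in for the in-Lambda LevelUp-backed instance.
PouchDB.plugin(require('pouchdb-adapter-memory'));

// Fast working copy that serves all reads and writes inside the execution context.
const workingDb = new PouchDB('working-copy', { adapter: 'memory' });

// Stand-in for the S3LevelDown-backed PouchDB that would own persistence.
const persistentDb = new PouchDB('persistent-stub');

// Cold start: repopulate the working copy from the persisted copy.
const warmUp = () => workingDb.replicate.from(persistentDb);

// After handling writes: push only the changed documents back out.
const persistChanges = () => workingDb.replicate.to(persistentDb);
```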
I haven't yet figured out how to pass S3LevelDOWN as a db into pouchdb-adapter-levelup. Is there an example of this usage, or is there some fundamental issue I've missed which means it was never really possible?
Hi @cefn, if you can guarantee through other means that there will only be one concurrent writer, then it might be fine. I've never used S3LevelDown with PouchDB in production workloads, so I can't really comment on stability. But I'm keen to hear your experiences if you do go down this path. I've added an example with PouchDB: https://github.com/loune/s3leveldown/tree/master/examples/pouchdb
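The gist of the wiring is roughly this (a sketch; the bucket name is a placeholder, and depending on the s3leveldown version the constructor may be the default export or a named one):

```js
const PouchDB = require('pouchdb');
const S3LevelDOWN = require('s3leveldown'); // may be a named export in newer versions

// PouchDB's Node (LevelDB) adapter accepts a LevelDOWN factory via the
// `db` option, so the S3 backend can be substituted per database location.
const db = new PouchDB('my_pouch_db', {
  db: (location) => new S3LevelDOWN('my-bucket/' + location)
});

// Quick smoke test: write a document, then read it back.
db.put({ _id: 'hello', value: 'world' })
  .then(() => db.get('hello'))
  .then((doc) => console.log(doc))
  .catch((err) => console.error(err));
```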
That is brilliant!
I'm working slowly towards the pouchdb-on-lambda end of the stack, initially proving out the logic of the app on CouchDB and wrapping it in GraphQL.
I already have GraphQL in a Lambda and can host PouchDB on a Lambda backed by DynamoDB, though, so I think it should be feasible to bring all these parts together.
One impressive part of S3LevelDown backing PouchDB in a Lambda is that there is no need to provision read or write capacity. Once the PouchDB Lambda is up and has read in the backing S3 data, it can handle a very large number of reads and writes against the data, charged only for compute time, with only the 'differences' having to be synchronised to S3.
I implemented it successfully with a PouchDB HTTP router I wrote.
However, in my tests so far, access to S3 is terribly slow.
Do you think that's normal, or do you have any configuration hints I could use to speed up S3 access?
Thanks for your help.
The performance considerations explain some of the performance issues and potential fixes. The main issue is probably the number of API calls required to get the value of each key, but without seeing the code I can't really comment. Also, the PouchDB example is more of a proof of concept and shouldn't be used in production environments, because of the concurrency issues.
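If the slowness is dominated by many small GET requests, one generic mitigation is to reuse HTTP connections by passing in a pre-configured S3 client. A sketch, assuming the AWS SDK v2, a placeholder bucket name, and that S3LevelDOWN takes the client as its second constructor argument:

```js
const https = require('https');
const AWS = require('aws-sdk');
const PouchDB = require('pouchdb');
const S3LevelDOWN = require('s3leveldown');

// Keep-alive agent so the many small S3 requests reuse TCP/TLS connections
// instead of paying the connection setup cost on every key lookup.
const s3 = new AWS.S3({
  httpOptions: {
    agent: new https.Agent({ keepAlive: true, maxSockets: 50 })
  }
});

const db = new PouchDB('my_pouch_db', {
  db: (location) => new S3LevelDOWN('my-bucket/' + location, s3)
});
```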