rwynn/monstache

Creating multiple indices for one collection on resume


Hi @rwynn

When I run the monstache server, it creates the index cdp.master-users in Elasticsearch as expected. When I stop the server and start it again after some time, it creates a new index, 65167c8a25634f0d84692fc5_cdp.master-users, in Elasticsearch.

When this resume happens, all the operations missed since the last timestamp are synced to 65167c8a25634f0d84692fc5_cdp.master-users, while all new operations are synced to the cdp.master-users index.

Is this expected behavior? I am unable to find anything about it in the documentation. Thanks.

I am using the latest monstache version with AWS Elasticsearch 7.10 and MongoDB Atlas 6.0.10.

I have the following monstache config:

mongo-url = 

elasticsearch-urls = [""]

change-stream-namespaces = ["cdp.master-users"]

elasticsearch-user = 
elasticsearch-password = 
resume = true
verbose = true
exit-after-direct-reads = false


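[[script]]
script = """
module.exports = function(doc) {
  // Use the string form of an extended-JSON ObjectId as the document ID.
  if (doc.id && doc.id.$oid) {
    doc._id = doc.id.$oid;
  }

  // $numberLong is a string in canonical extended JSON, so convert it to a
  // number first; new Date("<digits>") would otherwise give an invalid date.
  if (doc.name && doc.name.updatedAt && doc.name.updatedAt.$date) {
    doc.name.updatedAt = new Date(Number(doc.name.updatedAt.$date.$numberLong));
  }

  if (doc.updatedAt && doc.updatedAt.$date) {
    doc.updatedAt = new Date(Number(doc.updatedAt.$date.$numberLong));
  }

  if (doc.createdAt && doc.createdAt.$date) {
    doc.createdAt = new Date(Number(doc.createdAt.$date.$numberLong));
  }
  return doc;
};
"""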

Hi @sachin1ag,

This appears to be an old bug with MongoDB Atlas Free Tier (not an issue in monstache). You can find the history here:

#196

I previously submitted a ticket and it was eventually fixed in the Atlas service, but the fix regressed at some point.

The problem does not appear (or did not) on paid tiers, probably because the Atlas free tier uses some sort of multi-tenant feature of a shared database under the covers. The Atlas account/tenant ID prefix should not be present in the change events returned to the client from a resumed change stream call.

You may want to submit another ticket if you need the free tier.
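
In the meantime, a script-side workaround might be possible. The following is only a sketch under several assumptions: that the prefix is always a 24-hex-character tenant ID followed by an underscore (as in 65167c8a25634f0d84692fc5_cdp.master-users), that the script receives the prefixed namespace as its second argument, and that setting _meta_monstache.index on the returned document overrides the target index. The names stripTenantPrefix and routeToBaseIndex are hypothetical, and none of this has been verified against the Atlas free tier.

```javascript
// Hypothetical helper: removes a leading 24-hex-character tenant ID and
// underscore from a namespace, leaving unprefixed namespaces unchanged.
function stripTenantPrefix(ns) {
  return ns.replace(/^[0-9a-f]{24}_/, "");
}

// Hypothetical script body. In a monstache [[script]] block this function
// would be assigned to module.exports.
function routeToBaseIndex(doc, ns) {
  var index = stripTenantPrefix(ns);
  if (index !== ns) {
    // Assumption: _meta_monstache.index overrides the destination index.
    doc._meta_monstache = { index: index };
  }
  return doc;
}
```

If those assumptions hold, resumed events arriving under the prefixed namespace would be routed to the same cdp.master-users index as live events, while events with a normal namespace pass through untouched.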