
Records are missing in sync

sachinnagesh opened this issue · 5 comments

Hi @rwynn ,

We are facing a very strange issue with monstache: some records are not synced to the Elasticsearch index at all. It affects roughly 5-10 records out of every 100, seemingly at random, and we see it for both creates and updates. The monstache logs contain no entries at all for the affected records.
To give you an idea of our setup: we have a MongoDB deployment with a replica set. The deployment contains multiple databases, one per company (multi-tenant). From each company database we want to sync a MongoDB view created on the products collection.

    $lookup: {
      from: "product-features",
      localField: "productid",
      foreignField: "productid",
      as: "features"
    }
    $lookup: {
      from: "product-technical-details",
      localField: "productid",
      foreignField: "productid",
      as: "technicals"
    }
    $lookup: {
      from: "product-inventory",
      localField: "productid",
      foreignField: "productid",
      as: "inventory"
    }

dbName : company1
collections : products, product-features, product-technical-details, product-inventory
dbName : company2
collections : products, product-features, product-technical-details, product-inventory

Here is what our monstache.toml file looks like:

mongo-url = "{{ .MongoURL }}"

elasticsearch-urls =[ "{{ .Elasticsearch.URL }}" ]
{{if .Elasticsearch.Auth.Enabled }}
elasticsearch-user = "{{ .Elasticsearch.Auth.UserName }}"
elasticsearch-password = "{{ .Elasticsearch.Auth.Password }}"
{{ end }}
{{if .Elasticsearch.SSL.Enabled }}
elasticsearch-pem-file = "{{ .Elasticsearch.SSL.Path }}"
{{ end }}

direct-read-namespaces=["company1.products-view","company2.products-view" ]

change-stream-namespaces=[ '' ]
gzip = true
stats = true
index-stats = true
dropped-collections = false
dropped-databases = false
replay = false
resume = true
resume-write-unsafe = false
resume-name = "default"
resume-strategy = 0
verbose = true
exit-after-direct-reads = false
direct-read-stateful = true
elasticsearch-retry = true
prune-invalid-json = true
relate-buffer = 500000
delete-index-pattern = "*_product-detail-index"

buffer-duration = "100ms"

## Relate Mapping for company1
[[mapping]]
namespace = "company1.products-view"
index = "company1_product-detail-index"

[[relate]]
namespace = "company1.products"
with-namespace = "company1.products-view"
keep-src = false

[[relate]]
namespace = "company1.product-features|"
with-namespace = "company1.products"
src-field = "productid"
match-field = "productid"
keep-src = false

[[relate]]
namespace = "company1.product-technical-details"
with-namespace = "company1.products"
src-field = "productid"
match-field = "productid"
keep-src = false

[[relate]]
namespace = "company1.product-inventory"
with-namespace = "company1.products"
src-field = "productid"
match-field = "productid"
keep-src = false

## Relate Mapping for company2
[[mapping]]
namespace = "company2.products-view"
index = "company2_product-detail-index"

[[relate]]
namespace = "company2.products"
with-namespace = "company2.products-view"
keep-src = false

[[relate]]
namespace = "company2.product-features"
with-namespace = "company2.products"
src-field = "productid"
match-field = "productid"
keep-src = false

[[relate]]
namespace = "company2.product-technical-details"
with-namespace = "company2.products"
src-field = "productid"
match-field = "productid"
keep-src = false

[[relate]]
namespace = "company2.product-inventory"
with-namespace = "company2.products"
src-field = "productid"
match-field = "productid"
keep-src = false

We also tried setting the parameter below and removing namespace-regex, but the issue persists:

resume-strategy = 1

We think monstache is somehow missing those create/update events. We are using monstache:6.7.10.
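
To quantify which records are missing, one option is to compare the `_id` values in the MongoDB view with the `_id` values in the index. A minimal sketch, assuming both id lists have already been fetched by some other means; the function name `missingIds` and the sample ids are hypothetical:

```javascript
// Hypothetical helper: return the ids present in MongoDB but absent from
// the Elasticsearch index. Ids are compared as strings.
function missingIds(mongoIds, indexedIds) {
  const indexed = new Set(indexedIds.map(String));
  return mongoIds.map(String).filter((id) => !indexed.has(id));
}

// Example with made-up ids: "p2" was never indexed.
const notSynced = missingIds(["p1", "p2", "p3"], ["p1", "p3"]);
console.log(notSynced);
```

Running this periodically per company database would show whether the gap stays at the reported 5-10 per 100 or whether missing records eventually appear.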

Did this start happening recently? We are having the same problem, but we haven't changed our monstache config in 2 months, which is weird.

@yunusemrecatalcam Yes, we started facing this issue in the last 2-3 months.

@yunusemrecatalcam We found where the issue comes from. When data is fetched from the MongoDB view while processing a relate, the record is sometimes not returned at all during insertion. We have a MongoDB replica set deployment, and I suspect that some of the services writing to the collections are not configured with write concern majority. For now we have added a retry mechanism (5 attempts) with a delay between attempts. There will still be an issue with updates, though: the fetch may not return the latest version of the record.
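
The retry described above could look roughly like the following. This is only a sketch of the general pattern, not the code actually used in this deployment: `retryUntilFound`, `lookup`, and the 200 ms default delay are hypothetical names and values, and `lookup` stands in for whatever fetches the document from the view.

```javascript
// Default sleep: resolve after `ms` milliseconds.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper (not part of monstache): call `lookup` until it
// returns a document, waiting `delayMs` between attempts.
async function retryUntilFound(
  lookup,
  { attempts = 5, delayMs = 200, sleep = defaultSleep } = {}
) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const doc = await lookup(attempt);
    if (doc !== null && doc !== undefined) return doc;
    // Do not sleep after the final attempt.
    if (attempt < attempts) await sleep(delayMs);
  }
  return null; // caller decides whether to log, skip, or re-queue
}
```

A retry of this kind only detects a record that is absent. It cannot detect a stale record, because a stale document is still a non-empty result, which is why updates remain a problem.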

@yunusemrecatalcam I think another way to solve this is to set readPreference to primary.
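
Both `readPreference` and `w` are standard MongoDB connection string options, so one way to apply these settings is through the connection URLs. The sketch below assumes made-up host names and a replica set called `rs0`; `withMongoOptions` is a hypothetical helper, not a monstache or driver API. Note that MongoDB drivers already default to reading from the primary, so the first setting only changes behavior if the URL currently specifies a different read preference.

```javascript
// Hypothetical helper: return `mongoUrl` with the given connection-string
// options set, replacing existing values for the same keys.
// Assumes the URL already contains a "/" after the host list.
function withMongoOptions(mongoUrl, options) {
  const q = mongoUrl.indexOf("?");
  const base = q === -1 ? mongoUrl : mongoUrl.slice(0, q);
  const params = new URLSearchParams(q === -1 ? "" : mongoUrl.slice(q + 1));
  for (const [key, value] of Object.entries(options)) {
    params.set(key, value);
  }
  return `${base}?${params.toString()}`;
}

// Placeholder URL; hosts and replica set name are assumptions.
const baseUrl = "mongodb://db1:27017,db2:27017,db3:27017/?replicaSet=rs0";

// Reader side: the value that would be rendered into monstache's mongo-url.
const readerUrl = withMongoOptions(baseUrl, { readPreference: "primary" });

// Writer side: the services that insert and update products.
const writerUrl = withMongoOptions(baseUrl, { w: "majority" });

console.log(readerUrl);
console.log(writerUrl);
```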

I have a very similar problem.
I'm using these versions:

  • Elastic v7.17.9
  • Mongodb 6.0.14
  • Monstache 6.7.17

This is my toml file:

elasticsearch-urls =["url"]
elasticsearch-max-conns = 50
change-stream-namespaces = [ "collection1","collection2","collection3"]
replay = false
resume = true
resume-name = "default"
index-as-update = true
direct-read-no-timeout = true
elasticsearch-retry = true
fail-fast = false
stats = false
verbose = true
disable-change-events = false
enable-patches = true
[[mapping]]
namespace = "collection1"
index = "index1"

[[mapping]]
namespace = "collection2"
index = "index2"

[[mapping]]
namespace = "collection3"
index = "index3"

[[script]]
namespace = "collection1"
script = """
module.exports = function(doc) {
  if ( { 
    doc.owner = findId(doc.owner_id, {
      collection: "collection1"

  function removeKey(obj) {
    Object.keys(obj).forEach(function(key) {
      if (key === "_class") delete(obj[key]);
      if (typeof obj[key] === 'object' && obj[key] !== null) {

  function isNumber (value) {
    if (value === null || value === undefined) {
      return false;
    }
    if (typeof value === "string") {
      return !isNaN(value) && !isNaN(parseFloat(value));
    }
    return !isNaN(value);
  }

  if (isNumber(doc.amount)) {
    doc.amount = doc.amount * 100
  }
  if (isNumber(doc.presales_amount)) {
    doc.presales_amount = doc.presales_amount * 100
  }

  return doc;
}
"""

[[relate]]
namespace = "collection1"
with-namespace = "collection2"
src-field = "_id"
match-field = "owner_id"
keep-src = true

It happens for 5-10 records out of every 100 and is very random, exactly as @sachinnagesh reported.
Have you got any suggestions?