couchbase/sync_gateway

Problem with tombstone purging

erikbotta opened this issue · 7 comments

Sync Gateway version

{"couchdb":"Welcome","vendor":{"name":"Couchbase Sync Gateway","version":"2.1"},"version":"Couchbase Sync Gateway/2.1.2(2;35fe28e)"}

Operating system

Ubuntu 16.04.5 LTS - k8s : v1.12.6

Config file

  {
      "log": [
        "*"
      ],
      "interface": ":4984",
      "adminInterface": ":4985",
      "databases": {
        "wallet": {
          "server": "couchbase-0.couchbase.default.svc.cluster.local:8091",
          "bucket": "somebucket",
          "username": "syncgwuser",
          "password": "password",
          "enable_shared_bucket_access": true,
          "import_docs": "continuous",
          "users": {
            "GUEST": {
              "disabled": true
            }
          },
          "sync": `
          function (doc, oldDoc) {
            // check if the document was removed on the server or via the SDK
            if (isRemoved()) {
              return;
            }
            if (isDeleted()) {
              requireUser(oldDoc.reservedOwnerId);
              return;
            }
            if (!doc.reservedOwnerId) {
              throw({forbidden: 'Missing required properties'});
            }
            if (oldDoc == null) {
              requireUser(doc.reservedOwnerId);
            } else {
              requireUser(oldDoc.reservedOwnerId);
            }

            channel(doc.reservedOwnerId);

            // true when the document was removed via the SDK or directly on the server
            function isRemoved() {
              return (isDeleted() && oldDoc == null);
            }

            function isDeleted() {
              return (doc._deleted == true);
            }
          }`,
          "num_index_replicas": 0
        }
      },
      "CORS": {
        "Origin": [
          "http://localhost:8080", "http://127.0.0.1:8080"
        ],
        "LoginOrigin": [
          "http://localhost:8080", "http://127.0.0.1:8080"
        ],
        "Headers": [
          "Content-Type", "Accept", "Authorization"
        ],
        "MaxAge": 33
      }
  }

Log output

2019-04-12T10:42:13.980Z [WRN] Unable to retrieve server's metadata purge interval - will use default value. Get 127.0.0.1/settings/autoCompaction: unsupported protocol scheme "" -- db.NewDatabaseContext() at database.go:374
2019-04-12T10:42:13.980Z [INF] Using metadata purge interval of 1.25 days for tombstone compaction.
2019-04-12T10:42:13.987Z [INF] Reset guest user to config
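
As the warning suggests, the configured server value carries no URL scheme, so Sync Gateway's request to /settings/autoCompaction is rejected with unsupported protocol scheme "" and it falls back to the 1.25-day default purge interval.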

Expected behavior

Sync Gateway should purge tombstones when its _compact endpoint is called, using the metadata purge interval defined on Couchbase Server (set to 0.04 days for testing purposes).

Actual behavior

Tombstones are only purged after the default 1.25 days, even when the /_compact endpoint is called.

Steps to reproduce

  1. Set the metadata purge interval on Couchbase Server to 0.04 days
  2. Delete a document directly on Couchbase Server
  3. After ~1 hour (0.04 days), call the Sync Gateway _compact endpoint (see the sketch below)
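
A minimal sketch of these steps with curl; the admin credentials are placeholders, the host and database names are taken from the config above, and /controller/setAutoCompaction (with its required parallelDBAndViewCompaction parameter) is the Couchbase Server REST endpoint for cluster-wide auto-compaction settings:

  # 1. Set the cluster-wide metadata purge interval to 0.04 days (~1 hour).
  curl -X POST -u Administrator:password \
    http://couchbase-0.couchbase.default.svc.cluster.local:8091/controller/setAutoCompaction \
    -d purgeInterval=0.04 \
    -d parallelDBAndViewCompaction=false

  # 2. Delete a document directly on Couchbase Server (e.g. via the web UI or an SDK),
  #    which leaves a tombstone that Sync Gateway imports.

  # 3. After ~1 hour, trigger tombstone compaction via the Sync Gateway admin API.
  curl -X POST http://localhost:4985/wallet/_compact
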
bbrks commented

Hi @erikbotta,

Please make sure your database's "server" configuration value includes a protocol.
E.g.: "server": "http://couchbase-0.couchbase.default.svc.cluster.local:8091",

You can find a reference of the supported values in the documentation:
https://docs.couchbase.com/sync-gateway/2.1/config-properties.html#databases-foo_db-server

We've already got a couple of fixes in the upcoming Sync Gateway release that improve the default behaviour when retrieving the metadata purge interval in cases like this, but for now, adding http:// in front of your server config value should be enough!
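
For context, the warning in the log corresponds to Sync Gateway fetching the server's auto-compaction settings over HTTP. A sketch of the equivalent request, using the credentials from the config above as placeholders, which only succeeds when the URL carries a scheme:

  # Fetch the cluster's auto-compaction settings, including purgeInterval,
  # from which Sync Gateway derives its tombstone compaction interval.
  curl -s -u syncgwuser:password \
    http://couchbase-0.couchbase.default.svc.cluster.local:8091/settings/autoCompaction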

erikbotta commented

Thx @bbrks, I changed it to:
"server": "couchbase://couchbase.default.svc.cluster.local", which makes more sense (SRV DNS from k8s via the service),
but the error is nearly the same:

2019-04-15T11:13:42.858Z [INF] DCP: Starting mutation feed on bucket wallet due to either channel cache mode or doc tracking (auto-import/bucketshadow)
2019-04-15T11:13:42.858Z [INF] DCP: Using DCP feed for bucket: "wallet" (based on feed_type specified in config file)
2019-04-15T11:13:43.430Z [WRN] Unable to retrieve server's metadata purge interval - will use default value. Get couchbase://couchbase.default.svc.cluster.local/settings/autoCompaction: unsupported protocol scheme "couchbase" -- db.NewDatabaseContext() at database.go:373
2019-04-15T11:13:43.430Z [INF] Using metadata purge interval of 1.25 days for tombstone compaction.

bbrks commented

Hi @erikbotta,

It looks like the fixes I mentioned previously for the next SG version also include support for the couchbase:// protocol, so unfortunately you will need to use http:// as a workaround in the meantime! Sorry for the inconvenience.

erikbotta commented

2019-04-15T11:41:58.131Z [INF] Using metadata purge interval of 0.04 days for tombstone compaction.
2019-04-15T11:41:58.154Z [INF] Reset guest user to config
2019-04-15T11:41:58.154Z [INF] Starting admin server on :4985
2019-04-15T11:41:58.157Z [INF] Starting server on :4984 ...

Seems OK now, thx for the help @bbrks.
Is there any info on which version will include the couchbase protocol? Having Sync Gateway communicate only with the Couchbase master node is a possible single point of failure for us.

bbrks commented

@erikbotta Sync Gateway 2.5.0 will have these enhancements. I can't comment on release timeframes other than "soon"!

I assume having a 0.04 day tombstone compaction interval is necessary for your data model? If it is not, you can use the couchbase protocol today, but you'd have to rely on the default metadata purge interval of 1.25 days in the meantime.
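
For what it's worth, Couchbase connection strings accept multiple comma-separated seed nodes, e.g. couchbase://node1,node2,node3, so assuming Sync Gateway passes the connection string through to the SDK, the single-point-of-failure concern should go away once the protocol is supported.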

erikbotta commented

@bbrks 0.04 was just for testing. We are searching for the best setup of Sync Gateway and Couchbase on k8s under the CE license, so we are working through issues like:

  • tombstone purging

  • architectural problems when multiple Sync Gateway replicas (via a StatefulSet) connect to a Couchbase cluster (also a StatefulSet); there is no documentation on how Sync Gateway caching behaves across multiple replicas in a StatefulSet

  • etc.

Our purge interval needs are: as long as possible :D We need to support users who stay offline for extended periods :)
So we were testing the maximum by using the minimum interval :)

Closing based on discussion above.