Configure replication heartbeat interval and timeout on Android
Closed this issue · 4 comments
Hi,
We are using Couchbase Lite on Android mobile devices and replicate data through Sync Gateway. Replication does not behave well on bad networks, even though we already handle the fatal network errors that cause replication to stop permanently. If something in the network breaks (for example, Wi-Fi is turned off and turned back on 2 minutes later), the status changes to "BUSY" and only becomes "STOPPED" about 15 minutes later. We can't afford to wait that long to react.
We think this is related to the heartbeat. With a heartbeat, a break should be detected in a relatively short time, no matter which layer or which side it occurs on.
I searched and skimmed the code base and found that the default replication heartbeat interval is set to 5 minutes in the C++ code (and I didn't see any configuration for a heartbeat timeout). In the Java code, I could not find any logic related to the keyword "heartbeat", in particular not in ReplicationConfiguration or the other configuration classes.
I hope this feature can be considered and supported. Thanks.
We are using the latest release, 2.8.1.
My apologies for taking so long to get to this.
Can you get me a log in which this happens? Also, just to confirm, this isn't the same problem that you saw in #13, correct?
😄
#13 and this are actually different problems.
- #13 happens during the WebSocket handshake and will probably cause replication to hang.
- This one happens after the WebSocket connection is established. It delays replicator status updates for a long time, which makes it hard to detect the real replicator status from outside and decide whether the replicator should be restarted. (For example, it still reports "Busy" minutes after the network has actually gone down.) We ran some experiments and confirmed that there are no WebSocket ping-pongs. We need a heartbeat to detect bad networks.
We made some ad hoc patches to AbstractCBLWebSocket.java for these two problems and have run them in our staging environment for several days; they have worked fine so far.
Maybe I'll submit a pull request later. Then we can see whether it breaks your original design or has other problems.
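For readers following along, the heartbeat idea described above can be sketched roughly as below. This is only an illustration in plain Java, not the patch mentioned in this comment and not Couchbase Lite code: the class `HeartbeatWatchdog`, its methods, and the 30-second timeout used in the usage note are all hypothetical. It assumes the caller supplies a `sendPing` callback that sends a WebSocket ping, calls `onPong` whenever a pong (or any other frame) arrives, and restarts the replicator from `onDead`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of Couchbase Lite. */
public class HeartbeatWatchdog {
    private final long timeoutMillis;
    private final Runnable sendPing; // assumed to send a WebSocket ping
    private final Runnable onDead;   // assumed to stop or restart the replicator
    private long lastPongMillis;
    private boolean dead;

    public HeartbeatWatchdog(long timeoutMillis, long nowMillis,
                             Runnable sendPing, Runnable onDead) {
        this.timeoutMillis = timeoutMillis;
        this.lastPongMillis = nowMillis;
        this.sendPing = sendPing;
        this.onDead = onDead;
    }

    /** Record that the peer answered (call this when a pong arrives). */
    public synchronized void onPong(long nowMillis) {
        lastPongMillis = nowMillis;
    }

    /**
     * One heartbeat step: report the connection as dead if no pong has been
     * seen within the timeout, otherwise send another ping.
     * Returns true once the connection is considered dead.
     */
    public synchronized boolean tick(long nowMillis) {
        if (dead) {
            return true; // onDead has already been invoked once
        }
        if (nowMillis - lastPongMillis > timeoutMillis) {
            dead = true;
            onDead.run();
            return true;
        }
        sendPing.run();
        return false;
    }

    /** Run tick() periodically; the caller shuts the executor down when done. */
    public ScheduledExecutorService start(long intervalMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> tick(System.currentTimeMillis()),
                intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

With an interval of, say, 10 seconds and a timeout of 30 seconds (both values are assumptions, not library defaults), a dead connection would be reported within roughly 40 seconds instead of about 15 minutes. Passing the time in explicitly keeps `tick` easy to test; a production version would probably prefer a monotonic clock over `System.currentTimeMillis()`.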
I believe this is fixed in couchbase-lite-java-common @ d2413f29730b1d9a4544244f