read operation to server failed on database local
dblado opened this issue · 41 comments
I've been seeing the following exception in elasticsearch today:
[2013-09-12 22:36:49,823][ERROR][river.mongodb ] [M] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
I did add two indexes to my mongo collection last night. In the mongo I see the following query executing (the client is the elasticsearch river):
{
"opid" : 3150,
"active" : true,
"secs_running" : 40,
"op" : "query",
"ns" : "local.oplog.rs",
"query" : {
"$query" : {
"$and" : [
{
"$or" : [
{
"ns" : "vendop.vendor"
},
{
"ns" : "vendop.$cmd"
}
]
},
{
"ts" : {
"$gt" : Timestamp(1378589134, 45)
}
}
]
},
"$orderby" : {
"$natural" : 1
}
},
"client" : "10.159.61.227:38940",
"desc" : "conn103",
"threadId" : "0x7f31d9c25700",
"connectionId" : 103,
"locks" : {
"^" : "r",
"^local" : "R"
},
"waitingForLock" : false,
"numYields" : 171,
"lockStats" : {
"timeLockedMicros" : {
"r" : NumberLong(79863946),
"w" : NumberLong(0)
},
"timeAcquiringMicros" : {
"r" : NumberLong(40534881),
"w" : NumberLong(0)
}
}
}
]
}
one of these queries takes a long time to execute:
Thu Sep 12 22:37:21.277 [conn87] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909236201096610 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 3873 locks(micros) r:1835294560 nreturned:92 reslen:15638 932067ms
Thu Sep 12 22:37:21.277 [conn90] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909235441195695 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 3099 locks(micros) r:1480829880 nreturned:92 reslen:15638 751916ms
Thu Sep 12 22:37:21.277 [conn91] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909237130526342 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 2894 locks(micros) r:1362870376 nreturned:92 reslen:15638 691878ms
Thu Sep 12 22:37:21.278 [conn85] query local.oplog.rs query: { $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } } cursorid:22909235755883542 ntoreturn:0 ntoskip:0 nscanned:18878069 keyUpdates:0 numYields: 4315 locks(micros) r:2072460163 nreturned:92 reslen:15638 1052126ms
I'm on es 0.90.3, river-mongodb 1.7.0, mongodb 2.4.5.
any ideas what is causing the queries from river-mongodb to mongo to take so long to run?
updated w/ more log info:
The org.elasticsearch.action.NoShardAvailableActionException is also new. Any help would be greatly appreciated.
[2013-09-13 04:06:41,811][INFO ][node ] [Spiral] version[0.90.3], pid[11057], build[5c38d60/2013-08-06T13:18:31Z]
[2013-09-13 04:06:41,812][INFO ][node ] [Spiral] initializing ...
[2013-09-13 04:06:41,873][INFO ][plugins ] [Spiral] loaded [mongodb-river, mapper-attachments, lang-javascript], sites [river-mongodb, head]
[2013-09-13 04:06:44,635][INFO ][node ] [Spiral] initialized
[2013-09-13 04:06:44,635][INFO ][node ] [Spiral] starting ...
[2013-09-13 04:06:44,738][INFO ][transport ] [Spiral] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.159.61.227:9300]}
[2013-09-13 04:06:47,774][INFO ][cluster.service ] [Spiral] new_master [Spiral][1-o0xVXqTw-hIS9zNXLnZA][inet[/10.159.61.227:9300]], reason: zen-disco-join (elected_as_master)
[2013-09-13 04:06:47,810][INFO ][discovery ] [Spiral] elasticsearch/1-o0xVXqTw-hIS9zNXLnZA
[2013-09-13 04:06:47,838][INFO ][http ] [Spiral] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.159.61.227:9200]}
[2013-09-13 04:06:47,839][INFO ][node ] [Spiral] started
[2013-09-13 04:06:47,861][DEBUG][action.get ] [Spiral] [_river][0]: failed to execute [[_river][vendors_river][_meta]: routing [null]]
org.elasticsearch.action.NoShardAvailableActionException: [_river][0] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:123)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:72)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:47)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:179)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:112)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
at org.elasticsearch.river.routing.RiversRouter$1.execute(RiversRouter.java:109)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
[2013-09-13 04:06:48,773][INFO ][gateway ] [Spiral] recovered [2] indices into cluster_state
[2013-09-13 04:06:49,122][DEBUG][action.get ] [Spiral] [_river][0]: failed to execute [[_river][vendors_river][_meta]: routing [null]]
org.elasticsearch.action.NoShardAvailableActionException: [_river][0] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:123)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:72)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:47)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:179)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:112)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
at org.elasticsearch.river.routing.RiversRouter$1.execute(RiversRouter.java:109)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
[2013-09-13 04:06:49,344][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Prefix: [[Spiral] [mongodb][vendors_river] ] - name: [river.mongodb]
[2013-09-13 04:06:49,355][INFO ][river.mongodb ] mongoServersSettings: [{port=27017, host=db.prod.vendop.com}]
[2013-09-13 04:06:49,356][INFO ][river.mongodb ] Server: db.prod.vendop.com - 27017
[2013-09-13 04:06:49,359][TRACE][river.mongodb ] mongoOptionsSettings: {initial_timestamp={script_type=js, script=var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());}, secondary_read_preference=true}
[2013-09-13 04:06:49,730][TRACE][river.mongodb ] initialTimestamp script returned: 1379045214687
[2013-09-13 04:06:49,738][INFO ][river.mongodb ] [Spiral] [mongodb][vendors_river] Using mongodb server(s): host [db.prod.vendop.com], port [27017]
[2013-09-13 04:06:49,739][INFO ][river.mongodb ] [Spiral] [mongodb][vendors_river] MongoDB River Plugin version: [1.7.0]
[2013-09-13 04:06:49,739][INFO ][river.mongodb ] [Spiral] [mongodb][vendors_river] starting mongodb stream. options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [500], gridfs [false], filter [], db [vendop], collection [vendor], script [null], indexing to [vendors]/[vendop]
[2013-09-13 04:06:49,964][TRACE][river.mongodb ] [Spiral] [mongodb][vendors_river] serverStatus: { "serverUsed" : "ip-10-181-140-155/10.181.140.155:27017" , "host" : "ip-10-181-140-155" , "version" : "2.4.5" , "process" : "mongod" , "pid" : 1719 , "uptime" : 25144.0 , "uptimeMillis" : 25144689 , "uptimeEstimate" : 24948.0 , "localTime" : { "$date" : "2013-09-13T04:06:49.934Z"} , "asserts" : { "regular" : 0 , "warning" : 0 , "msg" : 0 , "user" : 0 , "rollovers" : 0} , "backgroundFlushing" : { "flushes" : 419 , "total_ms" : 366 , "average_ms" : 0.8735083532219571 , "last_ms" : 0 , "last_finished" : { "$date" : "2013-09-13T04:06:45.362Z"}} , "connections" : { "current" : 32 , "available" : 19968 , "totalCreated" : 476} , "cursors" : { "totalOpen" : 34 , "clientCursors_size" : 34 , "timedOut" : 388 , "totalNoTimeout" : 23} , "dur" : { "commits" : 88 , "journaledMB" : 0.0 , "writeToDataFilesMB" : 0.0 , "compression" : 0.0 , "commitsInWriteLock" : 0 , "earlyCommits" : 0 , "timeMs" : { "dt" : 3026 , "prepLogBuffer" : 0 , "writeToJournal" : 0 , "writeToDataFiles" : 0 , "remapPrivateView" : 0}} , "extra_info" : { "note" : "fields vary by platform" , "heap_usage_bytes" : 64792968 , "page_faults" : 258863} , "globalLock" : { "totalTime" : 25144690000 , "lockTime" : 22065038 , "currentQueue" : { "total" : 0 , "readers" : 0 , "writers" : 0} , "activeClients" : { "total" : 23 , "readers" : 23 , "writers" : 0}} , "indexCounters" : { "accesses" : 1050 , "hits" : 1050 , "misses" : 0 , "resets" : 0 , "missRatio" : 0.0} , "locks" : { "." : { "timeLockedMicros" : { "R" : 1720996 , "W" : 22065038} , "timeAcquiringMicros" : { "R" : 2022298 , "W" : 859466476}} , "admin" : { "timeLockedMicros" : { } , "timeAcquiringMicros" : { }} , "local" : { "timeLockedMicros" : { "r" : 820853527560 , "w" : 66890} , "timeAcquiringMicros" : { "r" : 417564575283 , "w" : 894218}} , "vendop" : { "timeLockedMicros" : { "r" : 3367283 , "w" : 1696108} , "timeAcquiringMicros" : { "r" : 1756863 , "w" : 140851}} , "config" : { "timeLockedMicros" : { "r" : 23137 , "w" : 0} , "timeAcquiringMicros" : { "r" : 1159 , "w" : 0}}} , "network" : { "bytesIn" : 807799 , "bytesOut" : 15995589 , "numRequests" : 12997} , "opcounters" : { "insert" : 55 , "query" : 2600 , "update" : 11 , "delete" : 1 , "getmore" : 0 , "command" : 11593} , "opcountersRepl" : { "insert" : 0 , "query" : 0 , "update" : 0 , "delete" : 0 , "getmore" : 0 , "command" : 0} , "recordStats" : { "accessesNotInMemory" : 337 , "pageFaultExceptionsThrown" : 324 , "config" : { "accessesNotInMemory" : 0 , "pageFaultExceptionsThrown" : 0} , "local" : { "accessesNotInMemory" : 1 , "pageFaultExceptionsThrown" : 1} , "vendop" : { "accessesNotInMemory" : 336 , "pageFaultExceptionsThrown" : 323}} , "repl" : { "setName" : "set1" , "ismaster" : true , "secondary" : false , "hosts" : [ "ip-10-181-140-155:27017"] , "primary" : "ip-10-181-140-155:27017" , "me" : "ip-10-181-140-155:27017"} , "writeBacksQueued" : false , "mem" : { "bits" : 64 , "resident" : 7313 , "virtual" : 33443 , "supported" : true , "mapped" : 16409 , "mappedWithJournal" : 32818} , "metrics" : { "document" : { "deleted" : 1 , "inserted" : 55 , "returned" : 42419 , "updated" : 11} , "getLastError" : { "wtime" : { "num" : 0 , "totalMillis" : 0} , "wtimeouts" : 0} , "operation" : { "fastmod" : 3 , "idhack" : 247 , "scanAndOrder" : 17} , "queryExecutor" : { "scanned" : 7532378605} , "record" : { "moves" : 3} , "repl" : { "apply" : { "batches" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0} , "buffer" : { "count" : 0 , "maxSizeBytes" : 268435456 , "sizeBytes" : 0} , "network" : { "bytes" : 0 , "getmores" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0 , "readersCreated" : 3} , "oplog" : { "insert" : { "num" : 68 , "totalMillis" : 11} , "insertBytes" : 13725} , "preload" : { "docs" : { "num" : 0 , "totalMillis" : 0} , "indexes" : { "num" : 0 , "totalMillis" : 0}}} , "ttl" : { "deletedDocuments" : 0 , "passes" : 419}} , "ok" : 1.0}
[2013-09-13 04:06:49,964][TRACE][river.mongodb ] [Spiral] [mongodb][vendors_river] process: mongod
[2013-09-13 04:06:50,005][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 04:06:50,009][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}
[2013-09-13 04:07:50,074][ERROR][river.mongodb ] [Spiral] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
[2013-09-13 04:07:50,077][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 04:07:50,078][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}
[2013-09-13 04:08:50,138][ERROR][river.mongodb ] [Spiral] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
[2013-09-13 04:08:50,141][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 04:08:50,142][DEBUG][river.mongodb ] [Spiral] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}
How big is "oplog.rs" collection?
The river is using a tailable cursor to query this capped collection so that should be fast.
Regarding the second exception "org.elasticsearch.action.NoShardAvailableActionException" it is a new known issue with ES 0.90.3. It does not seem to affect the river but I am currently looking at it.
the oplog.rs collection is 9553 (I believe):
set1:PRIMARY> db.printReplicationInfo()
configured oplog size: 9553.9705078125MB
log length start to end: 2219103secs (616.42hrs)
oplog first event time: Sun Aug 18 2013 21:05:59 GMT+0000 (UTC)
oplog last event time: Fri Sep 13 2013 13:31:02 GMT+0000 (UTC)
now: Fri Sep 13 2013 15:58:46 GMT+0000 (UTC)
set1:PRIMARY> db.getReplicationInfo()
{
"logSizeMB" : 9553.9705078125,
"usedMB" : 8992.25,
"timeDiff" : 2219103,
"timeDiffHours" : 616.42,
"tFirst" : "Sun Aug 18 2013 21:05:59 GMT+0000 (UTC)",
"tLast" : "Fri Sep 13 2013 13:31:02 GMT+0000 (UTC)",
"now" : "Fri Sep 13 2013 15:58:58 GMT+0000 (UTC)"
}
@dblado I was actually asking the number of rows in "oplog.rs" (sorry for the confusion).
The way the river monitors this collection is the recommended way by Mongo [1] to access mapped collection using a tailable cursor.
I have the feeling something is not correctly setup in your MongoDB instance.
[1] - http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/
no worries -- db.oplog.rs.count() shows 18878152 records
On Fri, Sep 13, 2013 at 9:16 AM, Richard Louapre
notifications@github.comwrote:
@dblado https://github.com/dblado I was actually asking the number of
rows in "oplog.rs" (sorry for the confusion).The way the river monitors this collection is the recommended way by Mongo
[1] to access mapped collection using a tailable cursor.I have the feeling something is not correctly setup in your MongoDB
instance.[1] - http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/
—
Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24405936
.
Hi,
There are actually hardcoded values for "connect timeout" and "socket timeout" in the current implementation... That seems to be the issue has it takes more than 15 seconds in your scenario.
// TODO: MongoClientOptions should be configurable
private Mongo getMongoClient() {
if (mongo == null) {
Builder builder = MongoClientOptions.builder()
.autoConnectRetry(true).connectTimeout(15000)
.socketKeepAlive(true).socketTimeout(60000);
if (definition.isMongoUseSSL()) {
builder.socketFactory(getSSLSocketFactory());
}
MongoClientOptions mco = builder.build();
mongo = new MongoClient(definition.getMongoServers(), mco);
}
return mongo;
}
I will make a change to be able to specify these parameters in the river settings.
Yea but I don't the indexing through the river to be taking so darn long --
I want as near real time as possible. I have the db and es on 2 m1.large
instances in EC2 -- could my problem be undersized hardware or nodes?
On Fri, Sep 13, 2013 at 9:37 AM, Richard Louapre
notifications@github.comwrote:
Hi,
There are actually hardcoded values for "connect timeout" and "socket
timeout" in the current implementation... That seems to be the issue has it
takes more than 15 seconds in your scenario.// TODO: MongoClientOptions should be configurable private Mongo getMongoClient() { if (mongo == null) { Builder builder = MongoClientOptions.builder() .autoConnectRetry(true).connectTimeout(15000) .socketKeepAlive(true).socketTimeout(60000); if (definition.isMongoUseSSL()) { builder.socketFactory(getSSLSocketFactory()); } MongoClientOptions mco = builder.build(); mongo = new MongoClient(definition.getMongoServers(), mco); } return mongo; }
I will make a change to be able to specify these parameters in the river
settings.—
Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24407363
.
the queries that the river sends to mongo are taking 20+ minutes to execute. the load on both m1.large instances are near 0 since i'm the only one using it at the moment. there might be a few other hits to the db and es from the app but not much
Can you do an .explain() on the query to see if it can be optimized or indexes can be created?
I will have a new SNAPSHOT ready for testing soon (if you are interested).
not sure how to do the explain...I thought I could do:
set1:PRIMARY> db.oplog.rs.find({ $query: { $and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts: { $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }).explain()
Fri Sep 13 17:24:15.723 JavaScript execution failed: SyntaxError: Unexpected number
I just copied one of the offending queries from the mongo logfile
ya, send me a jar file w/ the timeout hardcode increased (or customizable) and I'll give it a try -- maybe once the river receives the response to this specific query my problem will go away. Although I did set an initial timestamp on the river to force to ignore old oplog entries
Here is the snapshot [1]. Please give it a try.
No need to change the river settings.
Just stop ES replace the jar files and restart.
[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip
still seeing the read timed out exception:
[2013-09-13 17:31:45,243][INFO ][node ] [Jimmy Woo] version[0.90.3], pid[13379], build[5c38d60/2013-08-06T13:18:31Z]
[2013-09-13 17:31:45,244][INFO ][node ] [Jimmy Woo] initializing ...
[2013-09-13 17:31:45,306][INFO ][plugins ] [Jimmy Woo] loaded [mongodb-river, mapper-attachments, lang-javascript], sites [river-mongodb, head]
[2013-09-13 17:31:48,056][INFO ][node ] [Jimmy Woo] initialized
[2013-09-13 17:31:48,056][INFO ][node ] [Jimmy Woo] starting ...
[2013-09-13 17:31:48,157][INFO ][transport ] [Jimmy Woo] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.159.61.227:9300]}
[2013-09-13 17:31:51,193][INFO ][cluster.service ] [Jimmy Woo] new_master [Jimmy Woo][rLYcLpO8QtyDJ1JaGrRvIg][inet[/10.159.61.227:9300]], reason: zen-disco-join (elected_as_master)
[2013-09-13 17:31:51,229][INFO ][discovery ] [Jimmy Woo] elasticsearch/rLYcLpO8QtyDJ1JaGrRvIg
[2013-09-13 17:31:51,246][INFO ][http ] [Jimmy Woo] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.159.61.227:9200]}
[2013-09-13 17:31:51,246][INFO ][node ] [Jimmy Woo] started
[2013-09-13 17:31:51,281][DEBUG][action.get ] [Jimmy Woo] [_river][0]: failed to execute [[_river][vendors_river][_meta]: routing [null]]
org.elasticsearch.action.NoShardAvailableActionException: [_river][0] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:123)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:72)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:47)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:61)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:179)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:112)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
at org.elasticsearch.river.routing.RiversRouter$1.execute(RiversRouter.java:109)
at org.elasticsearch.river.cluster.RiverClusterService$1.run(RiverClusterService.java:103)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
[2013-09-13 17:31:52,232][INFO ][gateway ] [Jimmy Woo] recovered [2] indices into cluster_state
[2013-09-13 17:31:52,396][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Prefix: [[Jimmy Woo] [mongodb][vendors_river] ] - name: [river.mongodb]
[2013-09-13 17:31:52,406][INFO ][river.mongodb ] mongoServersSettings: [{port=27017, host=db.prod.vendop.com}]
[2013-09-13 17:31:52,406][INFO ][river.mongodb ] Server: db.prod.vendop.com - 27017
[2013-09-13 17:31:52,568][TRACE][river.mongodb ] mongoOptionsSettings: {initial_timestamp={script_type=js, script=var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());}, secondary_read_preference=true}
[2013-09-13 17:31:52,827][TRACE][river.mongodb ] initialTimestamp script returned: 1379093517797
[2013-09-13 17:31:52,835][INFO ][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Using mongodb server(s): host [db.prod.vendop.com], port [27017]
[2013-09-13 17:31:52,836][INFO ][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] MongoDB River Plugin version: [1.7.0]
[2013-09-13 17:31:52,836][INFO ][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] starting mongodb stream. options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [500], gridfs [false], filter [], db [vendop], collection [vendor], script [null], indexing to [vendors]/[vendop]
[2013-09-13 17:31:53,286][TRACE][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] serverStatus: { "serverUsed" : "ip-10-181-140-155/10.181.140.155:27017" , "host" : "ip-10-181-140-155" , "version" : "2.4.5" , "process" : "mongod" , "pid" : 4808 , "uptime" : 41183.0 , "uptimeMillis" : 41182854 , "uptimeEstimate" : 40862.0 , "localTime" : { "$date" : "2013-09-13T17:31:53.170Z"} , "asserts" : { "regular" : 0 , "warning" : 0 , "msg" : 0 , "user" : 0 , "rollovers" : 0} , "backgroundFlushing" : { "flushes" : 686 , "total_ms" : 136 , "average_ms" : 0.19825072886297376 , "last_ms" : 0 , "last_finished" : { "$date" : "2013-09-13T17:31:30.401Z"}} , "connections" : { "current" : 31 , "available" : 19969 , "totalCreated" : 699} , "cursors" : { "totalOpen" : 34 , "clientCursors_size" : 34 , "timedOut" : 652 , "totalNoTimeout" : 20} , "dur" : { "commits" : 88 , "journaledMB" : 0.0 , "writeToDataFilesMB" : 0.0 , "compression" : 0.0 , "commitsInWriteLock" : 0 , "earlyCommits" : 0 , "timeMs" : { "dt" : 3033 , "prepLogBuffer" : 0 , "writeToJournal" : 0 , "writeToDataFiles" : 0 , "remapPrivateView" : 0}} , "extra_info" : { "note" : "fields vary by platform" , "heap_usage_bytes" : 64683304 , "page_faults" : 372578} , "globalLock" : { "totalTime" : 41182884000 , "lockTime" : 22903163 , "currentQueue" : { "total" : 0 , "readers" : 0 , "writers" : 0} , "activeClients" : { "total" : 20 , "readers" : 20 , "writers" : 0}} , "indexCounters" : { "accesses" : 318 , "hits" : 318 , "misses" : 0 , "resets" : 0 , "missRatio" : 0.0} , "locks" : { "." : { "timeLockedMicros" : { "R" : 2835632 , "W" : 22903163} , "timeAcquiringMicros" : { "R" : 3015339 , "W" : 1469808177}} , "admin" : { "timeLockedMicros" : { } , "timeAcquiringMicros" : { }} , "local" : { "timeLockedMicros" : { "r" : 1429902579128 , "w" : 58855} , "timeAcquiringMicros" : { "r" : 727932420060 , "w" : 227207}} , "vendop" : { "timeLockedMicros" : { "r" : 2435005 , "w" : 1234970} , "timeAcquiringMicros" : { "r" : 828842 , "w" : 55977}} , "test" : { "timeLockedMicros" : { "r" : 18719 , "w" : 0} , "timeAcquiringMicros" : { "r" : 90 , "w" : 0}}} , "network" : { "bytesIn" : 1105334 , "bytesOut" : 10207728 , "numRequests" : 18142} , "opcounters" : { "insert" : 9 , "query" : 2973 , "update" : 5 , "delete" : 1 , "getmore" : 0 , "command" : 16927} , "opcountersRepl" : { "insert" : 0 , "query" : 0 , "update" : 0 , "delete" : 0 , "getmore" : 0 , "command" : 0} , "recordStats" : { "accessesNotInMemory" : 102 , "pageFaultExceptionsThrown" : 93 , "local" : { "accessesNotInMemory" : 1 , "pageFaultExceptionsThrown" : 1} , "test" : { "accessesNotInMemory" : 0 , "pageFaultExceptionsThrown" : 0} , "vendop" : { "accessesNotInMemory" : 101 , "pageFaultExceptionsThrown" : 92}} , "repl" : { "setName" : "set1" , "ismaster" : true , "secondary" : false , "hosts" : [ "ip-10-181-140-155:27017"] , "primary" : "ip-10-181-140-155:27017" , "me" : "ip-10-181-140-155:27017"} , "writeBacksQueued" : false , "mem" : { "bits" : 64 , "resident" : 7283 , "virtual" : 33602 , "supported" : true , "mapped" : 16489 , "mappedWithJournal" : 32978} , "metrics" : { "document" : { "deleted" : 1 , "inserted" : 9 , "returned" : 72010 , "updated" : 5} , "getLastError" : { "wtime" : { "num" : 0 , "totalMillis" : 0} , "wtimeouts" : 0} , "operation" : { "fastmod" : 1 , "idhack" : 82 , "scanAndOrder" : 6} , "queryExecutor" : { "scanned" : 12572869515} , "record" : { "moves" : 3} , "repl" : { "apply" : { "batches" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0} , "buffer" : { "count" : 0 , "maxSizeBytes" : 268435456 , "sizeBytes" : 0} , "network" : { "bytes" : 0 , "getmores" : { "num" : 0 , "totalMillis" : 0} , "ops" : 0 , "readersCreated" : 3} , "oplog" : { "insert" : { "num" : 14 , "totalMillis" : 5} , "insertBytes" : 4388} , "preload" : { "docs" : { "num" : 0 , "totalMillis" : 0} , "indexes" : { "num" : 0 , "totalMillis" : 0}}} , "ttl" : { "deletedDocuments" : 0 , "passes" : 686}} , "ok" : 1.0}
[2013-09-13 17:31:53,286][TRACE][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] process: mongod
[2013-09-13 17:31:53,346][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 17:31:53,350][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}
[2013-09-13 17:32:53,383][ERROR][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Mongo gave an exception
com.mongodb.MongoException$Network: Read operation to server ip-10-181-140-155/10.181.140.155:27017 failed on database local
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at org.elasticsearch.river.mongodb.MongoDBRiver$Slurper.run(MongoDBRiver.java:1211)
at java.lang.Thread.run(Thread.java:679)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.bson.io.Bits.readFully(Bits.java:46)
at org.bson.io.Bits.readFully(Bits.java:33)
at org.bson.io.Bits.readFully(Bits.java:28)
at com.mongodb.Response.(Response.java:40)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 8 more
[2013-09-13 17:32:53,386][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] vendop.vendor last timestamp: { "$ts" : 1378589134 , "$inc" : 45}
[2013-09-13 17:32:53,387][DEBUG][river.mongodb ] [Jimmy Woo] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}
this explain seems to be doing something:
set1:PRIMARY> db.oplog.rs.find({ "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}).explain()
it's been running a couple minutes already but I guess that's normal because the explain() still runs the query normally just outputs additional data??
I don't believe you are have correctly the SNAPHOT version. The log file should show:
MongoDB River Plugin version: [1.7.1-SNAPSHOT]
instead of MongoDB River Plugin version: [1.7.0]
Did you replace elasticsearch-river-mongodb-1.7.0.jar by elasticsearch-river-mongodb-1.7.1-SNAPSHOT.jar in $ES_HOME/plugins/river-mongodb?
my bad -- I didn't delete 1.7.0 just copied in the SNAPSHOT -- running again
On Fri, Sep 13, 2013 at 11:09 AM, Richard Louapre
notifications@github.comwrote:
I don't believe you are have correctly the SNAPHOT version. The log file
should show:
MongoDB River Plugin version: [1.7.1-SNAPSHOT] instead of MongoDB River
Plugin version: [1.7.0]Did you replace elasticsearch-river-mongodb-1.7.0.jar by
elasticsearch-river-mongodb-1.7.1-SNAPSHOT.jar in
$ES_HOME/plugins/river-mongodb?—
Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24413525
.
in the startup of ES I'm seeing:
[2013-09-13 18:14:26,106][TRACE][river.mongodb ] mongoOptionsSettings: {initial_timestamp={script_type=js, script=var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());}, secondary_read_preference=true}
[2013-09-13 18:14:26,514][TRACE][river.mongodb ] initialTimestamp script returned: 1379096071448
but the query to mongo doesn't seem to respect the initial timestamp:
[2013-09-13 18:14:26,997][DEBUG][river.mongodb ] [Topspin] [mongodb][vendors_river] Using filter: { "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}
here is the explain:
set1:PRIMARY> db.oplog.rs.find({ "$and" : [ { "$or" : [ { "ns" : "vendop.vendor"} , { "ns" : "vendop.$cmd"}]} , { "ts" : { "$gt" : { "$ts" : 1378589134 , "$inc" : 45}}}]}).explain()
{
"cursor" : "ForwardCappedCursor",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 18878164,
"nscanned" : 18878164,
"nscannedObjectsAllPlans" : 18878164,
"nscannedAllPlans" : 18878164,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 3989,
"nChunkSkips" : 0,
"millis" : 930870,
"indexBounds" : {
},
"server" : "ip-10-181-140-155:27017"
}
wow, just had LOTS of activity in the ES log:
seems like everything updated now in the index
http://pastebin.com/SyVfbrwY
things appear to be coming in live now -- so the last_ts in the river was GMT: Sat, 07 Sep 2013 21:25:34 GMT. According to my log files my issue has actually been happening for a while I just never noticed until yesterday. I'm guessing river queries were taking longer and longer each time because it was trying to play catchup??
The initial sync can definitely take a long time but hopefully it should be fast once all data are in sync.
Is it what you are seeing?
Regarding the initial timestamp can you please provide your river settings so I can take a look?
here is my river setting:
{
"index": {
"name": "vendors"
},
"_enabled": true,
"_name": "vendors_river",
"mongodb": {
"db": "vendop",
"servers": [
{
"port": 27017,
"host": "db.prod.vendop.com"
}
],
"collection": "vendor",
"options": {
"initial_timestamp": {
"script_type": "js",
"script": "var date = new Date(); date.setSeconds(date.getSeconds() + 5); new java.lang.Long(date.getTime());"
},
"secondary_read_preference": true
}
},
"type": "mongodb"
}
as soon as it caught the river was near live but it's slowing down -- response time from mongo is 400000+ ms
Did you try to register the river multiple times? Did you set "initial_timestamp" in one of your last version of the river settings?
"initial_timestamp" is used only if _last_ts does not exist.
So you want to use this parameter make sure the previous river settings have been deleted first:
curl -XDELETE localhost:9200/_river/vendors_river
-- response time from mongo is 400000+ ms
Can you tell which query is slowing down?
ahh ok, will remember to delete the river and not use multiple versions to
modify the timestamp -- didn't see that in the docs.
seems all the queries form the river are slow still:
Fri Sep 13 18:20:51.854 [conn744] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188027466574947012 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 2320 locks(micros) r:1053631876 nreturned:101
reslen:17333 536445ms
Fri Sep 13 18:20:51.854 [conn744] end connection 10.159.61.227:39294 (17
connections now open)
Fri Sep 13 18:20:51.854 [conn741] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188027467292149267 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 2965 locks(micros) r:1408357829 nreturned:101
reslen:17333 716624ms
Fri Sep 13 18:20:51.854 [conn741] end connection 10.159.61.227:39291 (16
connections now open)
Fri Sep 13 18:20:51.858 [conn737] query local.oplog.rs query: { query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: { $ts: 1378589134.0, $inc: 45.0 } } } ] }, $explain: true }
ntoreturn:0 ntoskip:0 nscanned:18878164 keyUpdates:0 numYields: 3989
locks(micros) r:1831009723 nreturned:1 reslen:401 930874ms
Fri Sep 13 18:20:54.785 [conn754] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188039966367486858 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 1492 locks(micros) r:762280729 nreturned:101
reslen:17333 387781ms
Fri Sep 13 18:20:54.785 [conn746] query local.oplog.rs query: { $query: {
$and: [ { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, { ts:
{ $gt: Timestamp 1378589134000|45 } } ] }, $orderby: { $natural: 1 } }
cursorid:188039965764332883 ntoreturn:0 ntoskip:0 nscanned:18878150
keyUpdates:0 numYields: 1576 locks(micros) r:824228690 nreturned:101
reslen:17333 419325ms
looks like those are all for the same timestamp
On Fri, Sep 13, 2013 at 12:15 PM, Richard Louapre
notifications@github.comwrote:
-- response time from mongo is 400000+ ms
Can you tell which query is slowing down?
—
Reply to this email directly or view it on GitHubhttps://github.com//issues/123#issuecomment-24417964
.
It looks like there might be an undocumented QUERYOPTION_OPLOGREPLAY option [1] to make query on oplog.rs more efficient.
I will test it and let you know.
Thanks,
Richard.
Can you please try this version [1]?
QUERYOPTION_OPLOGREPLAY has been implemented.
[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip
everything seems to have stabilized w/ the most recent change to use QUERYOPTION_OPLOGREPLAY. WIll continue monitoring and let you know if I have other issues.
Thanks so much for the quick turnaround.
@dblado could you please provide the response time of the query using QUERYOPTION_OPLOGREPLAY?
What is odd is that the oplog is now empty...even when I make changes to the db...so nothing is being indexed??
oh nevermind, I was looking at localhost :)
here are log entries:
Sun Sep 15 21:13:42.290 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:114 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:13:47.302 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:93 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:13:52.314 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:124 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:13:57.325 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:103 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:14:02.337 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:137 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:14:07.349 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:135 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:14:12.361 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:161 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:14:17.373 [conn3] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:4325292602388 ntoreturn:0 keyUpdates:0 locks(micros) r:106 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:16:33.228 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:114 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:16:38.240 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:111 nreturned:0 reslen:20 5010ms
Sun Sep 15 21:16:43.252 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:122 nreturned:0 reslen:20 5011ms
Sun Sep 15 21:16:48.263 [conn12] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:489838434203944 ntoreturn:0 keyUpdates:0 locks(micros) r:104 nreturned:0 reslen:20 5010ms
each query is still taking about 5 seconds -- considering that there are no updated records in the db in the last few hours that seems a bit high to me
Could you please provide the response time for these 3 queries (sorry I don't have a MongoDB instance with a huge oplog.rs):
db.oplog.rs.find({ $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } })
db.oplog.rs.find({ $query: { ts: { $gt: Timestamp 1379278843000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } })
db.oplog.rs.find({ $query: { ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } })
Had to modify them to:
db.oplog.rs.find({ $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp(1379278843,1) } }, $orderby: { $natural: 1 } })
db.oplog.rs.find({ $query: { ts: { $gt: Timestamp(1379278843,1) }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } })
db.oplog.rs.find({ $query: { ts: { $gt: Timestamp(1379278843,1) } }, $orderby: { $natural: 1 } })
waiting for the results
results:
Mon Sep 16 16:50:40.649 [conn15] query local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 919 locks(micros) r:613458050 nreturned:0 reslen:20 308179ms
Mon Sep 16 16:52:50.190 [conn17] query local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 1062 locks(micros) r:701058589 nreturned:0 reslen:20 351803ms
Mon Sep 16 16:52:50.190 [conn16] query local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379278843000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:18878319 keyUpdates:0 numYields: 1164 locks(micros) r:770496332 nreturned:0 reslen:20 386413ms
the river is producing these queries right now @ 5seconds per query:
Mon Sep 16 17:03:09.425 [conn8] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:39500982229712 ntoreturn:0 keyUpdates:0 locks(micros) r:108 nreturned:0 reslen:20 5010ms
Mon Sep 16 17:03:12.546 [conn7] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:12.546 [conn7] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:12.546 [conn3] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:12.547 [conn3] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:14.437 [conn8] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:39500982229712 ntoreturn:0 keyUpdates:0 locks(micros) r:157 nreturned:0 reslen:20 5011ms
Mon Sep 16 17:03:17.547 [conn7] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:17.547 [conn7] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:17.547 [conn3] run command admin.$cmd { ismaster: 1 }
Mon Sep 16 17:03:17.548 [conn3] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Mon Sep 16 17:03:19.448 [conn8] getmore local.oplog.rs query: { $query: { $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ], ts: { $gt: Timestamp 1379278843000|1 } }, $orderby: { $natural: 1 } } cursorid:39500982229712 ntoreturn:0 keyUpdates:0 locks(micros) r:92 nreturned:0 reslen:20 5010ms
5 seconds seems still a lot to me.
I have changed the order if the index filter in oplog.rs: ts comes first.
Can you please give a try to this version [1]?
[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip
yea 5 seconds is still a lot of time.
new jar didn't change it
Wed Sep 18 21:11:05.118 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:188 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:09.905 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:09.905 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:09.986 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:09.986 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:10.130 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:187 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:14.908 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:14.908 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:14.988 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:14.988 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:15.142 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:182 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:19.910 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:19.910 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:19.989 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:19.989 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:20.154 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:187 nreturned:0 reslen:20 5011ms
Wed Sep 18 21:11:22.439 [DataFileSync] flushing mmaps took 0ms for 17 files
Wed Sep 18 21:11:24.913 [conn63] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:24.913 [conn63] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:24.990 [conn67] run command admin.$cmd { ismaster: 1 }
Wed Sep 18 21:11:24.990 [conn67] command admin.$cmd command: { ismaster: 1 } ntoreturn:1 keyUpdates:0 reslen:257 0ms
Wed Sep 18 21:11:25.166 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:195 nreturned:0 reslen:20 5011ms
@dblado I have just create a new post in MongoDB user group [1].
[1] - https://groups.google.com/forum/#!topic/mongodb-user/E7BSv624nBg
@dblado just a quick note according to Asya Kamsky the log should be interpreted as:
Wed Sep 18 21:11:25.166 [conn68] getmore local.oplog.rs query: { $query: { ts: { $gt: Timestamp 1379529386000|1 }, $or: [ { ns: "vendop.vendor" }, { ns: "vendop.$cmd" } ] }, $orderby: { $natural: 1 } } cursorid:1102542858630512437 ntoreturn:0 keyUpdates:0 locks(micros) r:195 nreturned:0 reslen:20 5011ms
r:195 -> it took 195 microseconds, the rest of the time was spent waiting.