Performance issue of tailable cursor + awaitdata
Closed this issue · 11 comments
I have one more issue:
One query with a tailable cursor takes up to 10% CPU load on the mongod process on my machine.
Every additional query adds the same amount of load. In the end, using 5 of them costs 40-50% of CPU.
I don't see any answers on the web ....
Hi, I have been working through similar issues as well as following your progress this week. I think I have read everything there is to read on the subject, which isn't a whole lot. Today I went back to the roots and tried to understand and replicate a capped collection with a tailable cursor, and here's what I got:
https://github.com/squalrus/sandbox/blob/master/mongo/server.js
I believe the tailable cursor relies heavily on the cursor that is doing the lookups. In my recursive function I just pass along the cursor to be used to look for the next items (rather than passing the collection and a timestamp, as I have seen in a few examples). In my initial testing this appears to be working, with low CPU usage. The above link also uses multiple threads, and the load is still insignificant.
You guys should be able to get my project running in no time if you'd like to do some testing as well -- I would appreciate it and any feedback/thoughts/etc.
Thanks.
Haven't tried it yet, but I see that you are not using awaitdata:true. In this case your nextObject callback gets called every time, with or without a doc. This is actually just normal polling from the cursor. The reason we use awaitdata:true is to avoid polling on the client: the Mongo server returns a document as soon as it finds one.
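To make the difference concrete, here is a minimal sketch of the two option sets. It assumes the option names of the current Node.js driver (`tailable`, `awaitData`, `maxAwaitTimeMS`); older driver versions, like the one used in this thread, spell the flag `awaitdata`. `tailOptions` and `openTail` are hypothetical helper names, and the 1000 ms default is an arbitrary choice.

```javascript
// Hypothetical helper: build find() options for a capped collection.
// Option names follow the current Node.js driver; older drivers used
// the lowercase spelling `awaitdata` seen in this thread.
function tailOptions(awaitData, maxAwaitTimeMS = 1000) {
  const options = { tailable: true, awaitData };
  if (awaitData) {
    // Ask the server to hold an empty getMore open for up to this long
    // instead of answering immediately with nothing.
    options.maxAwaitTimeMS = maxAwaitTimeMS;
  }
  return options;
}

// Hypothetical helper: `collection` is assumed to be a capped collection
// handle exposing find(filter, options), as the driver's Collection does.
function openTail(collection, filter, awaitData) {
  return collection.find(filter, tailOptions(awaitData));
}
```

With `awaitData` set to false the cursor is still tailable, but the client has to keep asking, which is the polling behaviour described above.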
But something goes wrong there ...
Yeah, I had that in there at some point today, but I guess it was removed.
I just added it back and ran it, and the CPU usage jumped to 60% then died with:
errorError: tailable cursor timed out
errorRangeError: Maximum call stack size exceeded
From what I remember reading, it seemed like awaitdata: true was necessary for a tailable cursor, but I don't understand why it appears to be working without it, and with minimal load on the server. I am going to investigate some more...
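One plausible reading of the "Maximum call stack size exceeded" error (not confirmed in this thread) is that the recursive function re-enters itself from inside a callback that fires synchronously, so the stack never unwinds. The sketch below reproduces that pattern with a hypothetical fake cursor, not the real driver, and shows that deferring the next read with `setImmediate` keeps the stack flat.

```javascript
// Hypothetical stand-in for a cursor whose callback fires synchronously.
// It serves `total` documents and then signals the end with null.
function makeFakeCursor(total) {
  let served = 0;
  return {
    nextObject(callback) {
      served += 1;
      if (served > total) return callback(null, null);
      callback(null, { n: served });
    },
  };
}

// Naive version: calls itself from inside the callback, so every
// synchronous callback adds more frames to the same stack.
function readRecursively(cursor, onDoc) {
  cursor.nextObject((err, doc) => {
    if (err || doc === null) return;
    onDoc(doc);
    readRecursively(cursor, onDoc);
  });
}

// Deferred version: schedules the next read on a fresh stack.
function readDeferred(cursor, onDoc, onDone) {
  cursor.nextObject((err, doc) => {
    if (err || doc === null) return onDone(err);
    onDoc(doc);
    setImmediate(() => readDeferred(cursor, onDoc, onDone));
  });
}
```

With a large enough `total`, `readRecursively` throws a RangeError while `readDeferred` works through the same number of documents one event-loop turn at a time.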
Make your mongod verbose, you will see the difference between polling and waiting for data.
Ahh, I see. So it isn't actually acting as a tailable cursor, it is just polling? If so, it seems strange to me that it isn't bogging down the CPU more than it is.
When I use awaitdata: true, if I keep sending data to the db, it seems to be fine, but as soon as I stop there's a delay and it dies. Is this due to cursor variables like numberOfRetries and/or tailableRetryInterval?
This code is wrong:
https://github.com/scttnlsn/mubsub/blob/master/lib/channel.js#L75
It does not use the _id field in the $gt query.
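For illustration, a resume filter keyed on `_id` could look like the sketch below. `resumeFilter` is a hypothetical helper, not part of mubsub, and it assumes `_id` values grow with insertion order (default ObjectIds are only roughly ordered when several clients insert at once).

```javascript
// Hypothetical helper: filter that resumes a tail after the last document
// already processed. On the first run there is no lastId, so match all.
function resumeFilter(lastId) {
  if (lastId === undefined || lastId === null) return {};
  return { _id: { $gt: lastId } };
}
```

The result would be passed as the query when the cursor is (re)opened, so a restart picks up after the last seen document instead of rereading the whole capped collection.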
Created an issue on mongodb tracker
Now it is clear that mongod is broken and we need to remove awaitdata because it's unusable.
I am going to fall back to the normal cursor polling solution.