Need to cache current state of each slob in memcache
Opened this issue · 10 comments
GoogleCodeExporter commented
What steps will reproduce the problem?
1. Deploy full-text-search walkaround
2. Import waves from google wave that have several participants and lots of
text. (In my case 5 participants and one wave bot that doesn't function because
I haven't implemented robot-apis. It has 44 entries in the wave and most are
several paragraphs of text with multiple colors, font size and typeface)
3. Navigate through the wave with up and down arrows. Edit an entry. Add an
entry. Etc.
What is the expected output? What do you see instead?
I expected to be able to use walkaround as a free google app, but it is
exceeding the maximum free "Datastore Read Ops" so my walkround instance is
inaccessible for a lot of the day.
Only about 15 minutes of actual typing by one user over the span of an hour
caused it to reach 0.05 Million Datastore Read Ops yesterday. Only about 5
minutes of typing by one user caused 0.02 Million Datastore Read Ops today.
What browser and browser version are you using? On what operating system?
Firefox 11 on Windows 7
What URL does your browser show when the problem occurs? Did you compile
walkaround on your machine, or are you using a public instance?
https://kwalkaround.appspot.com/
I compiled the instance on my Mac OSX 10.6.8 and deployed it to google's app
engine, but the site is being accessed by other users from other machines
through the website.
Please provide any additional information below.
This could be associated with Issue 7, but since Issue 7 has been labeled
fixed, and those code changes were merged with the full-text-search branch, I
thought this might be a different issue with the same symptoms. Also, maybe
there are settings I could enable or change that would help, but I do not know
what those would be.
Here is my Quota Detail information for Storage:
Storage
Datastore Write Operations 0.00 Million Ops Okay
Datastore Read Operations 0.02 Million Ops Okay
Datastore Small Operations 0.00 Million Ops Okay
Datastore API Calls 4,287 Okay
Datastore Queries 1,306 Okay
Blobstore API Calls 0 Okay
Datastore Stored Data 0.01 GBytes Okay
Blobstore Stored Data 0.01 GBytes Okay
Data Sent to Datastore API 0.00 GBytes Okay
Data Received from Datastore API 0.02 GBytes Okay
Datastore Entity Fetch Ops 17,075 Okay
Datastore Entity Put Ops 470 Okay
Datastore Entity Delete Ops 73 Okay
Datastore Index Write Ops 1,347 Okay
Datastore Query Ops 1,311 Okay
Datastore Key Fetch Ops 295 Okay
Datastore Id Allocation Ops 1 Okay
Number of Indexes 3%3% 6 of 200 Okay
Prospective Search Subscriptions 0%0% 0 of 10,000 Okay
Logs Stored Data 0.05 GBytes Okay
Original issue reported on code.google.com by otter...@gmail.com
on 12 Apr 2012 at 8:08
GoogleCodeExporter commented
You can use wavereactor.appspot.com instance to import waves - it has more
quota.
I don't think that Google ever promised to provide free, non limited Walkaround
instance to anyone.
Original comment by vega113
on 12 Apr 2012 at 9:26
GoogleCodeExporter commented
Thanks, I'll check that out. I wasn't really saying google promised it would be
free, though. I just thought that a couple of friends using a walkaround
instance a few times a day shouldn't put it over google's quota maximum. Unless
google's quota is insanely low and this is expected behavior?
Original comment by otter...@gmail.com
on 12 Apr 2012 at 9:31
GoogleCodeExporter commented
Per the rest of the issue text, the bigger concern is whether Walkaround is
using excessive read operations along the same lines as Issue 7 , I believe.
Very small changes causing thousands of reads would likely impact project
adoption, even (or especially) for users paying for quota.
Original comment by betterth...@gmail.com
on 12 Apr 2012 at 9:32
GoogleCodeExporter commented
Can you go to /admin/appstats and check in the "RPC Stats" section which RPCs
are being called too much (e.g. datastore_v3.Get), and which paths are calling
them the most (e.g. POST /channel)? Maybe just expand the top few RPCs and
paste the output here.
Original comment by oh...@google.com
on 12 Apr 2012 at 11:20
GoogleCodeExporter commented
Here is the list. None of those counts seems like much compared to the number
of Datastore read Ops the Admin Console is listing. When I googled this issue
to see if I could locate what was causing all the reads I did find some forum
posts indicating that the RCP counts might not tell you every read that is
happening, which is unfortunate, but maybe it will still show where the excess
is coming from. Here is one of those links:
http://stackoverflow.com/questions/7807529/how-many-datastore-reads-consume-each
-fetch-count-and-query-operations
memcache.Get 1366
POST /channel 1326
/wave 21
/photos 7
/ 4
/contacts 4
/inbox 4
memcache.Set 277
POST /channel 263
/wave 11
/ 1
/photos 1
POST /_ah/queue/__deferred__ 1
datastore_v3.RunQuery 151
POST /channel 138
/wave 13
datastore_v3.BeginTransaction 82
POST /channel 69
/wave 10
/ 1
/contacts 1
/photos 1
datastore_v3.Rollback 81
POST /channel 69
/wave 10
/ 1
/photos 1
datastore_v3.Get 74
POST /channel 69
/wave 3
/ 1
/photos 1
datastore_v3.Next 17
/wave 17
Original comment by otter...@gmail.com
on 13 Apr 2012 at 12:15
GoogleCodeExporter commented
The appstats data is only stored in memcache, so any part or all of it can
indeed go missing at any point. But there's a good chance that it will point
us in the right direction anyway. Can you try your five minutes of typing
again that cause 20,000 datastore ops, and tell us what appstats shows
immediately after that?
Original comment by oh...@google.com
on 13 Apr 2012 at 12:53
GoogleCodeExporter commented
Okay, I replicated my issue. I opened a new tab in Chrome on Windows 7 and
logged into the admin console to see the datastore quota prior to starting
typing in my wave:
Storage
Datastore Write Operations 0.01 Million Ops Okay
Datastore Read Operations 0.11 Million Ops Okay
Datastore Small Operations 0.00 Million Ops Okay
Datastore API Calls 13,870 Okay
Datastore Queries 4,564 Okay
Blobstore API Calls 0 Okay
Datastore Stored Data 0.01 GBytes Okay
Blobstore Stored Data 0.01 GBytes Okay
Data Sent to Datastore API 0.00 GBytes Okay
Data Received from Datastore API 0.15 GBytes Okay
Datastore Entity Fetch Ops 105,798 Okay
Datastore Entity Put Ops 1,384 Okay
Datastore Entity Delete Ops 73 Okay
Datastore Index Write Ops 3,771 Okay
Datastore Query Ops 4,578 Okay
Datastore Key Fetch Ops 895 Okay
Datastore Id Allocation Ops 1 Okay
Number of Indexes 3% 6 of 200 Okay
Prospective Search Subscriptions 0% 0 of 10,000 Okay
Logs Stored Data 0.05 GBytes Okay
Here is the datastore quota AFTER doing 5 minutes of typing in my wave:
Storage
Datastore Write Operations 0.01 Million Ops Okay
Datastore Read Operations 0.13 Million Ops Okay
Datastore Small Operations 0.00 Million Ops Okay
Datastore API Calls 17,106 Okay
Datastore Queries 5,490 Okay
Blobstore API Calls 0 Okay
Datastore Stored Data 0.01 GBytes Okay
Blobstore Stored Data 0.01 GBytes Okay
Data Sent to Datastore API 0.00 GBytes Okay
Data Received from Datastore API 0.17 GBytes Okay
Datastore Entity Fetch Ops 122,766 Okay
Datastore Entity Put Ops 1,768 Okay
Datastore Entity Delete Ops 73 Okay
Datastore Index Write Ops 4,855 Okay
Datastore Query Ops 5,461 Okay
Datastore Key Fetch Ops 1,276 Okay
Datastore Id Allocation Ops 1 Okay
Number of Indexes 3% 6 of 200 Okay
Prospective Search Subscriptions 0% 0 of 10,000 Okay
Logs Stored Data 0.05 GBytes Okay
You can see the "Datastore Read Ops" went up by 0.02 Million and the "Datastore
Entity Fetch Ops" went up by about 17k.
And here is the RPC stat info you wanted for when this happens:
RPC Count
memcache.Get 1158
POST /submitdelta 894
POST /channel 157
POST /taskqueue/postcommit 90
/wave 11
/contacts 2
/photos 2
/ 1
/inbox 1
datastore_v3.RunQuery 594
POST /submitdelta 290
POST /taskqueue/postcommit 265
POST /channel 32
/wave 7
datastore_v3.BeginTransaction 426
POST /submitdelta 232
POST /taskqueue/postcommit 156
POST /channel 31
/wave 5
/contacts 2
memcache.Set 342
POST /submitdelta 255
POST /channel 47
POST /taskqueue/postcommit 33
/wave 5
/contacts 1
POST /_ah/queue/__deferred__ 1
channel.SendChannelMessage 299
POST /submitdelta 299
datastore_v3.Commit 260
POST /submitdelta 226
POST /taskqueue/postcommit 33
/contacts 1
datastore_v3.Put 260
POST /submitdelta 226
POST /taskqueue/postcommit 33
/contacts 1
datastore_v3.Next 232
POST /submitdelta 212
POST /taskqueue/postcommit 20
datastore_v3.Rollback 166
POST /taskqueue/postcommit 123
POST /channel 31
POST /submitdelta 6
/wave 5
/contacts 1
search.IndexDocument 128
POST /taskqueue/postcommit 128
datastore_v3.Get 84
POST /taskqueue/postcommit 47
POST /submitdelta 34
/contacts 1
/wave 1
POST /channel 1
logservice.Flush 38
taskqueue.BulkAdd 25
urlfetch.Fetch 8
channel.CreateChannel 1
memcache.Delete 1
search.Search 1
Original comment by otter...@gmail.com
on 13 Apr 2012 at 5:46
GoogleCodeExporter commented
[deleted comment]
GoogleCodeExporter commented
Oh, also, what I typed in the wave ended up being 881 characters, not including
any time I backspaced, if that helps.
Original comment by otter...@gmail.com
on 13 Apr 2012 at 5:50
GoogleCodeExporter commented
Looking at the numbers:
RPCs related to entity fetch ops:
datastore_v3.RunQuery 594
POST /submitdelta 290
POST /taskqueue/postcommit 265
POST /channel 32
/wave 7
datastore_v3.Next 232
POST /submitdelta 212
POST /taskqueue/postcommit 20
datastore_v3.Get 84
POST /taskqueue/postcommit 47
POST /submitdelta 34
/contacts 1
/wave 1
POST /channel 1
other datastore RPCs:
datastore_v3.BeginTransaction 426
POST /submitdelta 232
POST /taskqueue/postcommit 156
POST /channel 31
/wave 5
/contacts 2
datastore_v3.Commit 260
POST /submitdelta 226
POST /taskqueue/postcommit 33
/contacts 1
datastore_v3.Put 260
POST /submitdelta 226
POST /taskqueue/postcommit 33
/contacts 1
datastore_v3.Rollback 166
POST /taskqueue/postcommit 123
POST /channel 31
POST /submitdelta 6
/wave 5
/contacts 1
non-datastore RPCs:
channel.SendChannelMessage 299
POST /submitdelta 299
memcache.Get 1158
POST /submitdelta 894
POST /channel 157
POST /taskqueue/postcommit 90
/wave 11
/contacts 2
/photos 2
/ 1
/inbox 1
memcache.Set 342
POST /submitdelta 255
POST /channel 47
POST /taskqueue/postcommit 33
/wave 5
/contacts 1
POST /_ah/queue/__deferred__ 1
search.IndexDocument 128
POST /taskqueue/postcommit 128
logservice.Flush 38
taskqueue.BulkAdd 25
urlfetch.Fetch 8
channel.CreateChannel 1
memcache.Delete 1
search.Search 1
I realize now that I should have asked you to go to /admin and clear memcache
first to reset these counters to zero before the experiment.
But, given the above numbers, it looks like most of the read ops are in the
reconstruction of the wavelet state during submitdelta and post-commit. You
could edit webinf/flags.json and increase post_commit_action_interval_millis to
e.g. 300000 to reduce the number of datastore ops consumed by post-commit.
(This will also make waves take five minutes to show up in new participants'
inboxes -- we should probably short-circuit AddParticipant similarly to how we
treat RemoveParticipant.)
The long-term solution to this is probably to cache the current state of each
wavelet in memcache to avoid the datastore accesses (this can also be used to
make waves load faster). It's not hard, but I'm not sure when I'll get around
to it. Export-related bugs are a higher priority at the moment.
Another solution would be to make affinity work with dynamic backends so that
MutationLog.StateCache is more effective, increase
post_commit_action_interval_millis, and short-circuit AddParticipant. Caching
slob states seems more useful overall though.
Original comment by oh...@google.com
on 15 Apr 2012 at 4:35
- Changed title: Need to cache current state of each slob in memcache
- Changed state: Accepted