bengler/grove

Optimize occurrence-queries

Opened this issue · 2 comments

There is something awfully wrong with getting posts based on an occurrence query. It generates loads of extra queries per row.

Example:

http://api.bagera.no/api/grove/v1/posts/post.event:apdm.stream.ba.calendar.*?occurrence%5Blabel%5D=start_time&occurrence%5Bfrom%5D=2012-12-05&occurrence%5Bto%5D=2012-12-09&offset=0&limit=300

For each entry returned by the initial query, Grove runs the following queries:

SELECT DISTINCT "locations".* FROM "locations" INNER JOIN "locations_posts" ON "locations"."id" = "locations_posts"."location_id" WHERE "locations_posts"."post_id" = 250698;
SELECT "occurrence_entries".* FROM "occurrence_entries"  WHERE "occurrence_entries"."post_id" = 250698;

So fetching 300 entries generates at least 601 SQL queries. These should be collapsed into single queries using IN matching, and/or use Memcache to cache the lookups.

In addition, SELECT "occurrence_entries" happens when petroglyph asks for post.merged_document.

I've put in place a default_scope (https://github.com/bengler/grove/blob/db-roundtrip-optimization/lib/models/post.rb#L39) which eagerly fetches occurence_entries. This saves us one extra DB trip when we later call post.occurrences (via petroglyph rendering), but for some reason we still get the separate SELECT "occurrence_entries" call when we initially fetch a post from the DB. Will look more into this.