Request cache for non id access
mschipperheyn opened this issue · 7 comments
A number of queries require a findOne-style approach in which you specify additional parameters. Dataloader does not handle these: it doesn't recognise two queries that use the same findOne keys, so it doesn't collapse them into a single execution.
Another scenario where you lose efficiency is a complex page on which the same queries are executed multiple times.
E.g. a social feed that publishes posts that include group and group membership info.
When you are displaying the same group multiple times in the feed, dataloader will make sure that you load the group only once.
But the group type resolver might have various keys to verify whether the current user
- is a member,
- can comment,
- can post,
- is an admin, etc.
This means that:
- you get multiple GroupMember.findOne({ groupId: id, userId: userId }) style queries
- inside the service manager you may need to load the group again in various locations
So, some optimization strategies might be:
- Passing the group parent to the membership service manager query for the various canComment, isAdmin style keys
- Creating a request cache to store findOne style queries by key, e.g. group${groupId}, groupMember${groupId}${userId}
I set out to create this request cache. But it didn't go exactly as expected, because the resolver executes all these queries in parallel. So this was the output I saw on my cache utility:
[1579467458791] INFO : request-cache cache hit false, group18b35b78-c4a0-48a6-974a-514a5a9ff8a2 +520ms
[1579467458791] INFO : request-cache cache hit false, groupMember18b35b78-c4a0-48a6-974a-514a5a9ff8a2347d118f-e48e-4f94-be5c-46c81edeb800 +0ms
[1579467458792] INFO : request-cache cache hit false, group18b35b78-c4a0-48a6-974a-514a5a9ff8a2 +1ms
[1579467458795] INFO : request-cache cache hit false, groupc47247f5-17fc-49ae-a8b5-b0f8572d4738 +3ms
[1579467458795] INFO : request-cache cache hit false, groupMemberc47247f5-17fc-49ae-a8b5-b0f8572d4738347d118f-e48e-4f94-be5c-46c81edeb800 +0ms
[1579467458795] INFO : request-cache cache hit false, groupc47247f5-17fc-49ae-a8b5-b0f8572d4738 +0ms
[1579467458799] INFO : request-cache cache hit false, groupMemberef802182-6838-4541-ae79-b563ca9d23c8347d118f-e48e-4f94-be5c-46c81edeb800 +4ms
[1579467458801] INFO : request-cache cache hit false, groupef802182-6838-4541-ae79-b563ca9d23c8 +2ms
[1579467458801] INFO : request-cache cache hit false, groupMemberef802182-6838-4541-ae79-b563ca9d23c8347d118f-e48e-4f94-be5c-46c81edeb800 +0ms
[1579467458801] INFO : request-cache cache hit false, groupef802182-6838-4541-ae79-b563ca9d23c8 +0ms
[..]
[1579467459043] INFO : request-cache cache set group18b35b78-c4a0-48a6-974a-514a5a9ff8a2 +209ms
[1579467459044] INFO : request-cache cache set groupc47247f5-17fc-49ae-a8b5-b0f8572d4738 +1ms
[1579467459045] INFO : request-cache cache set groupef802182-6838-4541-ae79-b563ca9d23c8 +1ms
[1579467459045] INFO : request-cache cache set group18b35b78-c4a0-48a6-974a-514a5a9ff8a2 +0ms
[1579467459046] INFO : request-cache cache set group1f29a29b-4691-43af-8ee0-064f5dc7902e +1ms
[1579467459046] INFO : request-cache cache set group1f29a29b-4691-43af-8ee0-064f5dc7902e +0ms
Followed by the same query executed multiple times. (I have a query cache in place, but that's beside the point.)
Basically, it tries to query the cache multiple times for the items that are not there yet, then executes the query multiple times and places them in the request cache multiple times. Too late.
This is obviously a fail. I was thinking that something like apollo batch http on the client might make sense, where it aggregates queries, delays executing them for x ms, and then executes the required query only once.
Any suggestions on how to achieve something like this?
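The log above shows every parallel lookup missing the cache before any result has been set. One possible way around that, sketched here as an assumption rather than something tried in this thread, is to store the pending promise under the key synchronously, so that later callers in the same tick receive the same in-flight query. The names `createRequestCache` and `fakeFindGroup` are hypothetical, and `fakeFindGroup` merely stands in for a real database call.

```javascript
// Hypothetical request-scoped cache: it stores the in-flight promise,
// not the resolved row, so parallel callers share one query.
function createRequestCache() {
  const pending = new Map(); // cache key -> promise

  return {
    // `key` is a cache key such as `group${groupId}`;
    // `fetch` is any function that returns a promise for the row.
    get(key, fetch) {
      if (!pending.has(key)) {
        const promise = Promise.resolve()
          .then(() => fetch())
          .catch((err) => {
            pending.delete(key); // do not keep failed lookups cached
            throw err;
          });
        // Set synchronously, before the query resolves.
        pending.set(key, promise);
      }
      return pending.get(key);
    },
  };
}

// Stand-in for a database call such as Group.findOne; counts executions.
let queryCount = 0;
function fakeFindGroup(id) {
  queryCount += 1;
  return new Promise((resolve) => setTimeout(() => resolve({ id }), 10));
}

async function demo() {
  const cache = createRequestCache();
  const load = (id) => cache.get(`group${id}`, () => fakeFindGroup(id));
  // Three parallel lookups, two of them for the same group.
  const [a, b, c] = await Promise.all([load('g1'), load('g1'), load('g2')]);
  return { a, b, c, queryCount };
}
```

Calling `demo()` issues three parallel lookups for two distinct groups, and the stand-in query runs once per distinct key. This only deduplicates identical keys; it does not combine different keys into one query.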
Queries in parallel should be handled automatically by creating your own Dataloader; that's how we handle more complex queries internally.
So, can you go into that a bit more in terms of how to implement something like that?
For your example, something like this:
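import DataLoader from 'dataloader';
import {property} from 'lodash';

// Batch function: receives every {group, user} pair requested in the same tick.
loaders.groupMember = new DataLoader(async (sets) => {
  // One query for all requested pairs. It may also return group/user
  // combinations nobody asked for; the mapping below ignores those.
  const members = await GroupMember.findAll({
    where: {
      groupId: sets.map(property('group.id')),
      userId: sets.map(property('user.id')),
    }
  });
  // DataLoader expects one result per key, in the same order as the keys.
  return sets.map(set => {
    return members.find(member => member.groupId === set.group.id && member.userId === set.user.id);
  });
}, {
  cache: true,
  batch: true,
  cacheKeyFn: ({group, user}) => `${group.id}:${user.id}`
});

// Calls with the same group and user share one cached result.
loaders.groupMember.load({group, user});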
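To illustrate why parallel `load` calls end up as a single query, here is a simplified stand-in for the batching and caching that the Dataloader library provides. It is not the library's actual code: `createTinyLoader`, the in-memory `rows`, and the `batchCalls` counter are all hypothetical, and the batch function is assumed to return a promise.

```javascript
// Simplified stand-in for Dataloader-style batching (not the library's real code).
function createTinyLoader(batchFn, cacheKeyFn) {
  const cache = new Map(); // cache key -> promise
  let queue = [];          // keys collected since the last dispatch

  function dispatch() {
    const batch = queue;
    queue = [];
    // One batchFn call for every key collected so far.
    batchFn(batch.map((item) => item.key)).then(
      (values) => batch.forEach((item, i) => item.resolve(values[i])),
      (err) => batch.forEach((item) => item.reject(err))
    );
  }

  return {
    load(key) {
      const cacheKey = cacheKeyFn(key);
      if (!cache.has(cacheKey)) {
        cache.set(cacheKey, new Promise((resolve, reject) => {
          // The first key of a batch schedules a single dispatch.
          if (queue.length === 0) queueMicrotask(dispatch);
          queue.push({ key, resolve, reject });
        }));
      }
      return cache.get(cacheKey);
    },
  };
}

// In-memory stand-in for the GroupMember table.
const rows = [
  { groupId: 'g1', userId: 'u1', role: 'admin' },
  { groupId: 'g2', userId: 'u1', role: 'member' },
];
let batchCalls = 0;

const groupMember = createTinyLoader(
  async (sets) => {
    batchCalls += 1; // one call stands for one database query
    return sets.map((set) =>
      rows.find((r) => r.groupId === set.group.id && r.userId === set.user.id) || null);
  },
  ({ group, user }) => `${group.id}:${user.id}`
);

async function demo() {
  const user = { id: 'u1' };
  const results = await Promise.all([
    groupMember.load({ group: { id: 'g1' }, user }),
    groupMember.load({ group: { id: 'g1' }, user }), // duplicate key, served from cache
    groupMember.load({ group: { id: 'g2' }, user }),
    groupMember.load({ group: { id: 'g3' }, user }), // no membership row
  ]);
  return { results, batchCalls };
}
```

In this sketch the four parallel loads reach the batch function as one call with three distinct keys, which is the behaviour the request cache in the original post was missing.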
Ah, so you can create custom loaders based on certain keys being used. Wow, that's cool. Is there any place where I can read more about that?
Ah, I see that that stuff is covered by the underlying Dataloader library. Never occurred to me to look there.
Yeah, we do that for more complex stuff. I haven't seen any clear patterns that might be generalized into a library yet.
Well thanks! All sorts of lightbulbs turning on in my head.