Usage with Wagtail Personalisation
drcongo opened this issue ยท 6 comments
Hello. I'm using Wagtail-Cache on a site that also uses Wagtail-Personalisation which allows different users to see different versions of pages based on segmentation. Obviously I need to cache the segmented versions of the page and serve the correct one to each user, but what I'm getting is whichever version of a page was first matched saved in the cache and then every user seeing that version.
Given that Wagtail-Cache uses vary on cookie, I figured that I should be able to store the segmentation rules that a user matches in a cookie and each page version would then get cached separately, but unfortunately it's still serving the first requested version. I don't know Django's cache mechanisms particularly well, do you know if there are some rules around which cookies the vary checks against?
Plan B is to try overriding Wagtail-Cache's middleware to add checks against segmentation in there but I'd like to keep a clean Wagtail-Cache install if possible.
Thanks, and sorry it's more of a question than an issue.
edit: I've also tried adding a custom x-segmentation
header and using patch_vary_headers
to add that to the vary list. I'm starting to think this is all just down to the order that things are happening. If I find a solution I'll report back.
This is an important use-case you raised, and it would make sense for wagtail-cache to support this out of the box.
Regarding the vary headers, it only ignores the cache in this case if the request did not have a cookie, but the response set a cookie. This part could probably use some improvements.
The relevant code that decides to set the cache for a response is here. If you can think of a way to modify it to work better with wagtail personalization that would be great. I know WordPress uses some special cookie names which generally if present means the page should not be cached. We could probably do something similar here. You could of course test this out by making a hook in your project is_response_cacheable
(which will fully override wagtail-cache's decision).
wagtail-cache/wagtailcache/cache.py
Lines 95 to 110 in fc27ce0
The actual cache key which identifies the entry in the cache is a bit more opaque because it is based on django internals django.utils.cache.learn_cache_key
and django.utils.cache.get_cache_key
. I have been thinking of replacing this with our own mechanism for generating cache keys, which will probably be necessary at some point to support more edge cases.
Thanks @vsalvino - If I fork and have a poke at adding some kind of configurability to vary headers do you think that might be a good way to add support for these kinds of use cases without tying it specifically to Wagtail-Personalisation?
This was the middleware I added to try to get it to work, but it didn't seem to succeed no matter where I placed it in the MIDDLEWARE settings.
class SegmentCookieMiddleware(MiddlewareMixin):
"""Sets segmentation matches in a cookie so that Wagtail Cache
returns different results for different cookie settings.
"""
def process_response(self, request, response):
patch_vary_headers(response, ['x-segmentation'])
try:
user_segs = []
segments = request.session.get('segments', [])
if len(segments):
for segment in segments:
user_segs.append(segment.get('encoded_name', ''))
seg_str = '_'.join(user_segs)
hashed = md5(seg_str.encode('utf-8')).hexdigest()
response.set_cookie('segmentation', hashed)
response['x-segmentation'] = hashed
except Exception:
pass # Fail silently
return response
Yes - I would recommend forking and then:
-
add some debug/print statements around
django.utils.cache.learn_cache_key
anddjango.utils.cache.get_cache_key
to see how it is keying the cache. It is highly possible that those are just too naive to key the cache including specific headers etc. Hashing the cache key based on response headers would probably be smart if the built-in django utils are not already doing that. -
Play with identifying vary headers better in those lines I highlighted above.
I have a little code snippet here which has also been helpful for me in the past to debug the cache. Essentially you create another cache entry to save a list of all cache keys so you can see them #2 (comment)
Update: No matter what I tried, I couldn't get Django's cache key functions to take my custom headers into account. However, I have found an elegant / hacky (delete as applicable) solution to the original problem, which is to use the cache prefix setting, which was previously always None
in Wagtail-Cache. So, I've added the ability to a) set a key_prefix as a new setting WAGTAIL_CACHE_PREFIX
, and b) to have that be optionally a callable. What this means is in the project I'm working on with Wagtail-Personalisation I've been able to do this...
def gen_cache_prefix(request):
seg_str = None
try:
user_segs = []
segments = request.session.get('segments', [])
if len(segments):
for segment in segments:
user_segs.append(segment.get('encoded_name', ''))
seg_str = '_'.join(user_segs)
except Exception:
pass # Fail silently
return seg_str
WAGTAIL_CACHE_PREFIX = gen_cache_prefix
which generates a cache key prefix taking the user segmentation into account. This seems to work perfectly.
I don't know if doing this conflicts with future plans for the prefix functionality though, so happy to run this in our fork if so. If not, and you feel like this could be useful to merge, let me know and I'll add documentation and put in a pull request.
The behavior your describe with the django cache keys is kind of what I expected. Similar to how we had to write our own middleware, I think we will need to write our own keying mechanism to support these types of use-cases.
Your use of a prefix is a clever workaround. I wouldn't be opposed merging it in as a possibly useful feature. But I think the real "fix" is to improve the keys and possibly make that user-configurable to determine what headers/cookies/etc. will be considered by wagtail-cache. This would be especially relevant for sites that do internationalization based on browser language rather than unique URLs. It would also improve reliability of handling CSRF tokens.
I would like to get some feedback from other wagtail-cache users - will try posting in the slack.
Following up on this. If you're using a custom header, then you will want to patch your response
object with Vary: Cookie, <header-name>
. For example, if your header is X-AB-Test
, then it should be Vary: Cookie, X-AB-Test
.
With that setup, the caching middleware should in fact take your header into account and cache separate versions of those pages. You can set this on the response itself from the page/view/etc. No need to do this in middleware. The middleware should preserve it and cache appropriately.