googleapis/python-ndb

Custom Data Format in GlobalCache

phil-lopreiato opened this issue · 1 comments

I'd like a bit more configurability in the GlobalCache interface to control the exact format of the data we store. This is a fairly advanced request, so I think the right abstraction is to move the existing logic into the base class (so most implementations don't have to provide it themselves) while still allowing subclasses to override it.

what

Specifically, I want to add the following methods to GlobalCache and refactor the existing code to use them:

class GlobalCache:

  # the existing stuff
  # ...

  @staticmethod
  def cache_key(key: datastore.Key) -> bytes:
    # current implementation of _cache.global_cache_key
    # https://github.com/googleapis/python-ndb/blob/f113dbd6332ebc1dd1ebfed3b90e19d9101b785b/google/cloud/ndb/_cache.py#L732-L741
    return _PREFIX + key.to_protobuf().SerializeToString()

  @staticmethod
  def from_cache_value(key: datastore.Key, cache_data: bytes) -> entity_pb2.Entity:
    # from this code: https://github.com/googleapis/python-ndb/blob/b77dd5f2f9fea95ac6307b6c10fd66cd073f1b5b/google/cloud/ndb/_datastore_api.py#L146-L147
    # parsing data serialized by the legacy ndb library requires passing the key, because it does not include it in the blob
    entity_pb = entity_pb2.Entity()
    entity_pb.MergeFromString(cache_data)
    return entity_pb

  @staticmethod
  def to_cache_value(entity_pb: entity_pb2.Entity) -> bytes:
    # from this code: https://github.com/googleapis/python-ndb/blob/b77dd5f2f9fea95ac6307b6c10fd66cd073f1b5b/google/cloud/ndb/_datastore_api.py#L378
    return entity_pb.SerializeToString()

okay but why

The migration I'm doing has a complex data model backing it, and I would like to be able to run this library alongside the legacy one, pointing at the same underlying data. There have also apparently been some recent changes around the legacy builtin libraries, which are now ostensibly supported in py3.

I want to continue using this library for the ndb code, but I would like to plug in the legacy builtin memcache as a global cache provider for full compatibility, both forwards and backwards. The issue is that the key format (a different prefix and the urlsafe ndb key) and the data format (the legacy library uses a serialized EntityProto instead of entity_pb2.Entity) are different. With the change I described above, I could write a conversion function between the two Entity types and have full compatibility using an implementation of the GlobalCache interface backed by google.appengine.api.memcache.
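To make the override idea concrete, here is a minimal, self-contained sketch of the pattern. It is not python-ndb code: `HookedCache` and `LegacyFormatCache` are hypothetical names, a plain dict stands in for memcache, JSON stands in for the protobuf encodings, and both prefixes are placeholders rather than the real values used by either library.

```python
import base64
import json
from typing import Dict, Optional


class HookedCache:
    """Hypothetical stand-in for a GlobalCache with overridable format hooks."""

    _PREFIX = b"new:"  # placeholder, not the library's real prefix

    def __init__(self, store: Optional[Dict[bytes, bytes]] = None) -> None:
        # A dict stands in for the real cache backend (e.g. memcache).
        self._store = {} if store is None else store

    @classmethod
    def cache_key(cls, key: str) -> bytes:
        return cls._PREFIX + key.encode("utf-8")

    @classmethod
    def to_cache_value(cls, entity: dict) -> bytes:
        # JSON stands in for entity_pb.SerializeToString().
        return json.dumps(entity, sort_keys=True).encode("utf-8")

    @classmethod
    def from_cache_value(cls, key: str, cache_data: bytes) -> dict:
        # The default format embeds the key, so `key` is unused here.
        return json.loads(cache_data.decode("utf-8"))

    def put(self, key: str, entity: dict) -> None:
        self._store[self.cache_key(key)] = self.to_cache_value(entity)

    def get(self, key: str) -> Optional[dict]:
        data = self._store.get(self.cache_key(key))
        return None if data is None else self.from_cache_value(key, data)


class LegacyFormatCache(HookedCache):
    """Overrides only the format hooks; put/get are inherited unchanged."""

    _PREFIX = b"legacy:"  # placeholder for the legacy library's prefix

    @classmethod
    def cache_key(cls, key: str) -> bytes:
        # Different prefix plus a urlsafe encoding of the key.
        return cls._PREFIX + base64.urlsafe_b64encode(key.encode("utf-8"))

    @classmethod
    def to_cache_value(cls, entity: dict) -> bytes:
        # The legacy blob does not include the key.
        body = {k: v for k, v in entity.items() if k != "key"}
        return json.dumps(body, sort_keys=True).encode("utf-8")

    @classmethod
    def from_cache_value(cls, key: str, cache_data: bytes) -> dict:
        # Restore the key from the argument, since the blob omits it.
        entity = json.loads(cache_data.decode("utf-8"))
        entity["key"] = key
        return entity
```

In this sketch the subclass changes only how keys and values are encoded, which is why `from_cache_value` takes the key as an argument: a format that leaves the key out of the blob cannot be decoded without it. A real implementation would replace the JSON steps with the EntityProto to entity_pb2.Entity conversion.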

how

If the maintainers think this is an okay change, I'm happy to write the diff to do the refactor.

I think this ended up being more trouble than it's worth, and it's likely preferable to just switch back to the legacy ndb library if full compatibility is required.