aws/aws-encryption-sdk-python

CacheKey Error while cache is being evicted

darkpssngr opened this issue · 2 comments

Problem:

I'm using the following code to initialize the SDK in a FastAPI server:

    import aws_encryption_sdk

    # Master key provider restricted to a single KMS key ARN
    kms_key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(
        key_ids=[get_integration_credentials_kms_key_arn()]
    )

    # In-memory cache holding up to CACHE_ENTRIES entries
    cache = aws_encryption_sdk.LocalCryptoMaterialsCache(CACHE_ENTRIES)

    # Create a caching CMM
    kms_manager = aws_encryption_sdk.CachingCryptoMaterialsManager(
        master_key_provider=kms_key_provider,
        cache=cache,
        max_age=float(CACHE_TTL_IN_SECONDS),
    )

We mostly use the SDK to decrypt data. On one call, decryption failed with a CacheKeyError with the message "Key not found in cache". The exception appears to have been raised while the caching CMM was removing an expired cache entry. This is the stack trace:

Traceback (most recent call last):
  File "/backend/app/models.py", line 236, in decrypt_with_kms
    plaintext, _ = get_encryption_client().decrypt(
  File "/root/.local/share/virtualenvs/backend-gPBFdWVG/lib/python3.8/site-packages/aws_encryption_sdk/__init__.py", line 218, in decrypt
    plaintext = decryptor.read()
  File "/root/.local/share/virtualenvs/backend-gPBFdWVG/lib/python3.8/site-packages/aws_encryption_sdk/streaming_client.py", line 342, in read
    self._prep_message()
  File "/root/.local/share/virtualenvs/backend-gPBFdWVG/lib/python3.8/site-packages/aws_encryption_sdk/streaming_client.py", line 941, in _prep_message
    self._header, self.header_auth = self._read_header()
  File "/root/.local/share/virtualenvs/backend-gPBFdWVG/lib/python3.8/site-packages/aws_encryption_sdk/streaming_client.py", line 1045, in _read_header
    decryption_materials = self.config.materials_manager.decrypt_materials(request=decrypt_materials_request)
  File "/root/.local/share/virtualenvs/backend-gPBFdWVG/lib/python3.8/site-packages/aws_encryption_sdk/materials_managers/caching.py", line 236, in decrypt_materials
    self.cache.remove(cache_entry)
  File "/root/.local/share/virtualenvs/backend-gPBFdWVG/lib/python3.8/site-packages/aws_encryption_sdk/caches/local.py", line 151, in remove
    raise CacheKeyError("Key not found in cache")
aws_encryption_sdk.exceptions.CacheKeyError: Key not found in cache

Shouldn't this be handled so that it fails silently? If not, how is it supposed to be handled? I know I'm using a global variable even though StrictAwsKmsMasterKeyProvider is not supposed to be thread safe, but this is happening at the CachingCryptoMaterialsManager layer, which I understood to be thread safe.

If my implementation is wrong could you help me with the right way to do this?

Solution:

N/A

Out of scope:

N/A

@darkpssngr we may need to dig into this more,
but I am not certain the Caching Cryptographic Materials Manager is thread safe.
I think the Cryptographic Materials Cache is thread safe,
as it uses RLock;
but a quick glance at the Caching Materials Manager suggests that it MAY NOT be thread safe...

Which MAY mean that you have a race on updating an expired entry,
and then end up getting an exception on one or more of the racing threads.
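
To make the suspected interleaving concrete, here is a minimal toy model. It is only a sketch: `ToyCache` and `simulate_expired_entry_race` are hypothetical names, not SDK code, and the two "threads" are replayed step by step so the outcome is deterministic. Each cache operation takes the lock, but the look-up-then-remove sequence as a whole is not atomic, so the second remover fails.

```python
import threading


class ToyCache:
    """Stand-in for a lock-protected cache; NOT the SDK's implementation."""

    def __init__(self):
        self._lock = threading.RLock()
        self._entries = {}

    def put(self, key, value):
        with self._lock:
            self._entries[key] = value

    def get(self, key):
        with self._lock:
            return self._entries[key]

    def remove(self, key):
        with self._lock:
            if key not in self._entries:
                raise KeyError("Key not found in cache")
            del self._entries[key]


def simulate_expired_entry_race():
    """Replay the interleaving in order and return the second caller's error."""
    cache = ToyCache()
    cache.put("k", "expired-materials")

    # Both callers look up the same entry and decide it is too old...
    cache.get("k")  # caller A
    cache.get("k")  # caller B

    # ...then each tries to evict it. Caller A gets there first.
    cache.remove("k")
    try:
        cache.remove("k")  # caller B: the entry is already gone
    except KeyError as error:
        return error
    return None
```

In this model every individual call is safe under the lock; the failure comes only from two callers acting on the same stale observation.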

Can you try making the Local Cache and a Partition ID global variables,
but everything else thread local?

You will need to pass the global Partition ID/Name to each Caching CMM's constructor.

This library was built back when Boto Clients were not thread safe;
we have yet to refactor it to take advantage of Boto's improvements.

But you should be able to use the library in a multi-threaded environment with the work-around I have prescribed.
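
A rough sketch of that arrangement follows, under some assumptions: `partition_name` is taken to be the constructor argument meant by "Partition ID/Name", and `SHARED_PARTITION_NAME`, `build_caching_cmm`, and `get_thread_local_cmm` are hypothetical helper names rather than SDK API. The cache would be created once at module level and passed in, while the key provider and caching CMM are built once per thread.

```python
import threading
import uuid

# Shared by all threads: one partition name, so every thread-local CMM
# uses the same partition of the shared cache.
SHARED_PARTITION_NAME = uuid.uuid4().hex

# Per-thread storage for everything that should not be shared.
_thread_state = threading.local()


def build_caching_cmm(shared_cache, key_arn, max_age_seconds):
    """Build a key provider and caching CMM for the calling thread."""
    # Imported lazily so the thread-local helper below has no SDK dependency.
    import aws_encryption_sdk

    key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(key_ids=[key_arn])
    return aws_encryption_sdk.CachingCryptoMaterialsManager(
        master_key_provider=key_provider,
        cache=shared_cache,
        max_age=float(max_age_seconds),
        # Assumption: this is the argument the Partition ID/Name maps to.
        partition_name=SHARED_PARTITION_NAME,
    )


def get_thread_local_cmm(factory):
    """Return this thread's CMM, creating it with factory() on first use."""
    cmm = getattr(_thread_state, "cmm", None)
    if cmm is None:
        cmm = factory()
        _thread_state.cmm = cmm
    return cmm
```

A request handler could then call something like `get_thread_local_cmm(lambda: build_caching_cmm(cache, key_arn, CACHE_TTL_IN_SECONDS))`, where `cache` is the single module-level `LocalCryptoMaterialsCache` and `key_arn` is the KMS key ARN.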

@texastony for now I have put this in a thread local to make sure the race condition is not triggered.