Azure/azure-sdk-for-python

Client Connection Sharing confuses ETag value

paisleypark opened this issue · 4 comments

  • Package Name: azure-cosmos
  • Package Version: 4.6.0
  • Operating System: n/a
  • Python Version: 3.9

Describe the bug
I am using a single database and a single client with two containers (one source, one destination) in the same database. With that single client, I read the source asynchronously through the Change Feed and write to the destination. Afterwards, last_response_headers returns the ETag of the last destination container operation, even when it is requested through the source container, whose last operation was the Change Feed read. It appears that the client is discarding the source ETag from the Change Feed. This makes it impossible to use the continuation token from the source, so the entire Change Feed has to be read from the beginning.

To Reproduce
Steps to reproduce the behavior:
cosmosDbSourceResponse = sourceClientContainer.query_items_change_feed( is_start_from_beginning = True )
itemsTempIn = []
async for itemOne in cosmosDbSourceResponse:
    itemsTempIn.append( itemOne )
for itemOneIn in itemsTempIn:
    itemOneOut = { .... }
    await destClientContainer.create_item( itemOneOut )
etagLast = sourceClientContainer.client_connection.last_response_headers['Etag']

etagLast contains a GUID result from destClientContainer

Expected behavior

Moving the etagLast statement above "for itemOneIn in itemsTempIn:" captures the proper value from source. This behavior does not seem to happen when there are two clients/accounts in use, only when sharing the client.

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @AbhinavTrips @bambriz @pjohari-ms @simorenoh.

Hi @paisleypark
Thank you for reaching out. Reading last_response_headers directly is not recommended, especially with the async client: the client is a singleton, so it returns the headers of the most recent operation, whichever container performed it. Instead, it is recommended to use a response hook to get the appropriate response headers. Here is an example of how that would look for your scenario:

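from typing import Any, Dict

from azure.cosmos import PartitionKey
from azure.cosmos.aio import CosmosClient

class ResponseHook: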
    def __init__(self):
        self.last_headers = None
        self.last_etag = None

    def __call__(self, headers: Dict[str, Any], results):
        self.last_headers = headers
        self.last_etag = self.last_headers.get('etag')
        
async def examples_async():
    # url and key are placeholders for your account endpoint and key
    async with CosmosClient(url, key) as client:
        # Initialize your response hook
        hook = ResponseHook()

        # Create database if not exists
        database_name = 'testDatabasebob'
        database = await client.create_database_if_not_exists(database_name)

        # Create source container if not exists
        source_container_name = 'sourceContainer'
        source_container = await database.create_container_if_not_exists(id=source_container_name, partition_key=PartitionKey(path="/id"))

        # Upsert 15 items into source container
        for i in range(15):
            item = {'id': str(i), 'value': i}
            await source_container.upsert_item(item)

        # Create destination container if not exists
        destination_container_name = 'destinationContainer'
        destination_container = await database.create_container_if_not_exists(id=destination_container_name, partition_key=PartitionKey(path="/id"))

        # Query items change feed from beginning and create them in the destination container
        change_feed = source_container.query_items_change_feed(is_start_from_beginning=True, response_hook=hook)
        itemsTempIn = []
        async for item in change_feed:
            itemsTempIn.append(item)

        for item in itemsTempIn:
            await destination_container.create_item(body=item)

        # Print the last response header from source container
        print("Last etag From Source:", hook.last_etag, " Wrong Headers from Source: ", source_container.client_connection.last_response_headers['Etag'])

This would output the following:
Last etag: "3480" Wrong Headers from Source: "00000000-0000-0000-95df-a2d0c63001da"
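
To make the cause easier to see without a Cosmos account, here is a minimal simulation of the behavior described above. FakeConnection and FakeContainer are hypothetical stand-ins written for this illustration, not SDK classes, and the ETag values are invented; the only thing carried over from the real client is the idea that both containers share one connection object whose last_response_headers is overwritten by every request, while a response hook is called only for the operation it was passed to.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeConnection:
    """Hypothetical stand-in for the connection shared by all containers."""

    def __init__(self) -> None:
        self.last_response_headers: Dict[str, Any] = {}


class FakeContainer:
    """Hypothetical stand-in container; not the azure-cosmos ContainerProxy."""

    def __init__(self, connection: FakeConnection) -> None:
        self.client_connection = connection

    async def _request(self, etag: str, response_hook=None) -> None:
        headers = {"etag": etag}
        # Every request on the shared connection overwrites this attribute.
        self.client_connection.last_response_headers = headers
        # The hook only sees the headers of the operation it was passed to.
        if response_hook is not None:
            response_hook(headers, None)

    async def read_change_feed(self, response_hook=None) -> List[dict]:
        await self._request('"3480"', response_hook)  # invented ETag value
        return [{"id": "0"}, {"id": "1"}]

    async def create_item(self, body: dict) -> None:
        await self._request("dest-etag-" + body["id"])  # invented ETag value


class ResponseHook:
    def __init__(self) -> None:
        self.last_etag: Optional[str] = None

    def __call__(self, headers: Dict[str, Any], results: Any) -> None:
        self.last_etag = headers.get("etag")


async def demo():
    connection = FakeConnection()
    source = FakeContainer(connection)
    dest = FakeContainer(connection)
    hook = ResponseHook()

    items = await source.read_change_feed(response_hook=hook)
    for item in items:
        await dest.create_item(item)

    # Read through the source container, but the value came from the last write.
    shared = source.client_connection.last_response_headers["etag"]
    return hook.last_etag, shared


hook_etag, shared_etag = asyncio.run(demo())
print("hook:", hook_etag, "shared:", shared_etag)
```

In this sketch the hook still holds the change feed ETag after the writes, while the shared attribute holds the ETag of the last write to the destination, which matches the symptom reported in the issue.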

Hi @paisleypark. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

Hi @paisleypark, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!