sensedeep/dynamodb-onetable

Document Path Invalid - Objects not being created on update

AXSJ opened this issue · 7 comments

AXSJ commented

Describe the bug

If a field of type object is not required (so the field is not created when you create the record), and you then perform an update that attempts to populate subfields in that object, you will get the following error: "OneTable execute failed "update" for "example". The document path provided in the update expression is invalid for update"

To Reproduce

Fairly easy to reproduce: https://gist.github.com/AXSJ/ceb2f3c54c59acea46250245e6a6b8d0

Follow the steps in the above Gist.

Cut/Paste

Output Logs:

error OneTable exception in "update" on "Example" { 
    "err": { 
        "name": "ValidationException", 
        "$fault": "client", 
        "$metadata": { 
            "httpStatusCode": 400, 
            "requestId": "3ILKG8PJ3UUJG77KFS4A4N79V7VV4KQNSO5AEMVJF66Q9ASUAAJG", 
            "attempts": 1, 
            "totalRetryDelay": 0 
        }, 
        "__type": "com.amazon.coral.validate#ValidationException", 
        "message": "The document path provided in the update expression is invalid for update" 
    }, 
    "trace": { 
        "model": "Example", 
        "cmd": { 
            "ConditionExpression": "(attribute_exists(#_0)) and (attribute_exists(#_1))", 
            "ExpressionAttributeNames": { 
                "#_0": "pk", 
                "#_1": "sk", 
                "#_2": "exampleField", 
                "#_3": "exampleSubField", 
                "#_4": "id", 
                "#_5": "updated" 
            }, 
            "ExpressionAttributeValues": { 
                ":_0": { 
                    "S": "test" 
                }, 
                ":_1": { 
                    "S": "01GTT384PEX0SFV5ADW769AS63" 
                }, 
                ":_2": { 
                    "S": "2023-03-06T00:05:15.009Z" 
                } 
            }, 
            "TableName": "", 
            "ReturnValues": "ALL_NEW", 
            "UpdateExpression": "set #_2.#_3 = :_0, #_4 = :_1, #_5 = :_2", 
            "Key": { 
                "pk": { 
                    "S": "Example#01GTT384PEX0SFV5ADW769AS63" 
                }, 
                "sk": { 
                    "S": "Example#" 
                } 
            } 
        }, 
        "op": "update", 
        "properties": { 
            "exampleField": { 
                "exampleSubField": "test" 
            }, 
            "pk": "Example#01GTT384PEX0SFV5ADW769AS63", 
            "sk": "Example#", 
            "id": "01GTT384PEX0SFV5ADW769AS63", 
            "_type": "Example", 
            "updated": "2023-03-06T00:05:15.009Z" 
        } 
    } 
} 

Expected behavior
The update should succeed and create the exampleField field along with any subfields passed in the update payload.

Environment (please complete the following information):

  • OS - MacOS Ventura 13.2.1
  • Node Version - v16.19.0
  • OneTable Version - dynamodb-onetable@^2.6.1

Possibly caused by the partial setting. Have you tried it with the schema partial field set to false? This would probably not fix it for your requirements (the ability to update just specific nested fields) but could help the devs locate the issue faster.

My guess would be that the partial update feature doesn't yet support updating fields in a nested object when the object doesn't yet exist.

Yes. That is a limitation of partial updates. The parent object must exist, otherwise surgical property updates cannot be done.

We typically have the parent use a "default" value of {}. This is minimal cost, but then partial updates always work.
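
As a rough illustration of that workaround, a schema along these lines gives the parent a default of {} so that it exists on every newly created item. The model and field names are taken from the trace in the report; the rest of the schema (index names, key templates, version) is assumed for the sake of the example and is not copied from the gist.

```typescript
// Sketch only: the overall shape is assumed, names follow the trace output above.
const schema = {
    format: 'onetable:1.1.0',
    version: '0.0.1',
    indexes: {
        primary: { hash: 'pk', sort: 'sk' },
    },
    models: {
        Example: {
            pk: { type: String, value: 'Example#${id}' },
            sk: { type: String, value: 'Example#' },
            id: { type: String, generate: 'ulid' },
            exampleField: {
                type: Object,
                // The empty default means the parent map is written on create,
                // so later partial updates of exampleSubField have a valid path.
                default: {},
                schema: {
                    exampleSubField: { type: String },
                },
            },
        },
    },
    params: { partial: true, timestamps: true },
};
```

Items created before the default was added will still lack the parent, so they would need to be patched separately.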

NOTE: you can use {partial: false} or true on each API to override the partial setting on a per-API basis.

AXSJ commented

Thank you for your help. Greatly appreciated. However, my project requires the Partial setting set to true.

For now, I have added a "default" value of {}. However, this is more of a hack and not a solution. OneTable needs to be updated.

Thanks for the feedback. Setting default to {} is the default solution and may be the only possible performant solution.

To work with missing parent objects would require additional conditional checks on all updates which will have a performance impact. The failing request would then need to be caught and the parent would need to be created and then the original request retried.

The way DynamoDB works means that this cannot be done in a single update call, as it's not possible to add attributes to an object that does not exist. So you have to do the first update with a conditional check that the attribute exists.

If the update fails due to this conditional check then you make the update request again but this time adding an object as the attribute rather than trying to set a value on it.

I imagine the framework could be updated to handle this, but as @dev-embedthis said, this would probably have a performance impact. If you feel the suggested solution is too much of a hack for you, then you can implement the above flow through OneTable:

  1. Update with partial true and a conditional check
  2. If it fails due to the conditional check then do the update again but with partial set to false and no conditional check
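
A minimal sketch of that two-step flow follows. It is not part of OneTable: `updateWithFallback`, `UpdateFn` and `isMissingParent` are hypothetical names, and the update call is injected so the helper does not depend on any particular model. Instead of adding an explicit conditional check, this variant simply catches the "document path" error shown in the log and retries. It assumes that message text is present on the thrown error.

```typescript
// Hypothetical helper types; the only OneTable-specific piece is the `partial` param.
type UpdateParams = { partial?: boolean; [key: string]: unknown };
type UpdateFn<T> = (properties: Record<string, unknown>, params: UpdateParams) => Promise<T>;

// Fragment of the DynamoDB message from the log in the report.
const INVALID_PATH = 'document path provided in the update expression is invalid';

// Assumption: the thrown error is an Error whose message contains the DynamoDB text.
function isMissingParent(err: unknown): boolean {
    return err instanceof Error && err.message.includes(INVALID_PATH);
}

async function updateWithFallback<T>(
    update: UpdateFn<T>,
    partialProps: Record<string, unknown>, // only the nested fields to change
    fullProps: Record<string, unknown>,    // the complete top-level property
): Promise<T> {
    try {
        // Step 1: surgical update of the nested field.
        return await update(partialProps, { partial: true });
    } catch (err) {
        if (!isMissingParent(err)) throw err;
        // Step 2: the parent is missing, so write the whole object instead.
        return await update(fullProps, { partial: false });
    }
}
```

With a OneTable model this might be called as `updateWithFallback((p, params) => Example.update(p, params), nested, full)`. The fallback costs a second request only for items where the parent is missing.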
AXSJ commented

Ok thank you everyone! Greatly appreciate the help and the knowledge. I will use the default {} solution. Thank you for explaining @AlexStansfield. And thank you @dev-embedthis for your contribution.

Just a few notes to summarize. You can get the message: "The document path provided in the update expression is invalid for update" two ways:

  • An object with nested properties does not exist
  • In an array with nested objects and properties, the nested object does not exist.

You can address both by using {partial: false} on the specific update API. In that case you must provide the complete top level property with all items.

The issue is that detecting a missing top level property is difficult, as the message from DynamoDB is not specific about which document path is causing the problem. Pre-checking the existence of every top level property with nested items would be prohibitively expensive.

This problem mostly manifests when you ADD a new nested schema without a default value of {}. Best dealt with in migrations when you add the property to ensure all items have the required new property set to {}.

If you are using arrays of nested items, it is more difficult as you can't set a default of [{}] as that would imply an array length of 1. Setting the default to [] is desirable but insufficient as when you do the update, the {} element won't exist. So best to use {partial: false} and update the entire property.
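
One way to prepare the entire property for such a {partial: false} update is to build the full array in application code first. The helper below is only a sketch: `withNestedUpdate` and the `exampleList` attribute mentioned in the comment are hypothetical names, and it assumes the current array has already been read from the item.

```typescript
type Item = Record<string, unknown>;

// Returns a new array where the element at `index` has `patch` merged in.
// Missing elements up to `index` are filled with {} so the target always exists.
function withNestedUpdate(items: Item[], index: number, patch: Item): Item[] {
    if (index < 0) throw new RangeError('index must be non-negative');
    const next = items.map((item) => ({ ...item })); // shallow copy, input untouched
    while (next.length <= index) next.push({});
    next[index] = { ...next[index], ...patch };
    return next;
}

// Hypothetical usage with a OneTable model (exampleList is an assumed attribute):
//   const list = withNestedUpdate(current.exampleList ?? [], 0, { name: 'test' });
//   await Example.update({ id, exampleList: list }, { partial: false });
```

This trades a read before the write for not needing any element to pre-exist in the stored array.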

Unless anyone can think of a better solution.