aaronshaf/dynamodb-admin

DynamoDB seems to be corrupting the shared-local-instance.db file


I'm not sure that the problem is 100% here, but from the searching I did on the web + the testing I did locally, it seems that at least some of it is. So, the problem:

When using dynamodb-admin, PutItem and UpdateItem operations are broken on dynamodb-local in that they create duplicates of the item in question (for PutItem, if you put an item whose key already exists; for UpdateItem, no matter what). This has happened to me and to other people I'm working with, on different systems and different machines (I'll detail this a little later), both when we update an item directly through dynamodb-admin (i.e. add an item, open it up, change the JSON, and save) and when the update is done through code (while testing our API locally).

Whenever an update is attempted, a duplicate of the item is created (with the updated info), and then dynamodb-local breaks, because it now has two items sharing the same partition and sort keys. Deleting the item (at least through dynamodb-admin) is impossible, as is purging the table (because deleting is impossible), so the only options are to delete the table or to delete the entire shared-local-instance.db and repopulate it. We haven't actually tried putting the same item twice, but that case seems to be closely related to the update problem I just described.
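
For concreteness, the update path that triggers it in our code looks roughly like the sketch below (hypothetical table and key names, assuming aws-sdk v2 for JS and dynamodb-local listening on http://localhost:8000 — not our actual schema):

```js
// Minimal sketch of the failing update (hypothetical names: ExampleTable, pk, sk).
const AWS = require('aws-sdk');

const docClient = new AWS.DynamoDB.DocumentClient({
  region: 'local',
  endpoint: 'http://localhost:8000',
  accessKeyId: 'local',      // dummy credentials; dynamodb-local doesn't validate them
  secretAccessKey: 'local',
});

async function updateExistingItem() {
  // The item with this key already exists in the table.
  await docClient.update({
    TableName: 'ExampleTable',
    Key: { pk: 'user#1', sk: 'profile' },
    UpdateExpression: 'SET #v = :v',
    ExpressionAttributeNames: { '#v': 'someValue' },
    ExpressionAttributeValues: { ':v': 42 },
  }).promise();
  // Instead of modifying the item in place, the table ends up with two
  // items sharing the same partition/sort key pair.
}
```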

At first I assumed this was some problem with the dynamodb-local instance, but I found a thread where people described the same problem, and it was said that it usually shows up when people use a third-party viewer to manage the database (such as dynamodb-admin itself), which corrupts the .db file and causes the hash and range keys to be lost. To verify that this was indeed the case, I removed dynamodb-admin from my workflow and did everything through the command line. After that, sequences of API calls that would previously fail (because they would first update an item and then try to access the same item, which due to this very problem had become inaccessible and would crash dynamodb-local) were working, with updates applied correctly. This suggests the answer provided in that thread is correct, at least to some extent.

Some more technical detailing:

  • I'm running on Arch Linux, whereas two of my colleagues have had the same problem on macOS
  • I'm running dynamodb-admin version 4.0.1 installed through npm
  • The dynamodb-local docker instance was set up using this guide from AWS
rchl commented

I've tried on the database I have handy and couldn't reproduce. I think this could depend on the structure of the database. If you have exact steps, including the db structure, then please share.

I hope this helps:

This was a rather simple database, with only 5 tables, most of them having only hash and range keys and no other indexes, except for one, which had a global secondary index with its own hash and range keys.

All the tables were created (locally) through dynamodb-admin, so the "workflow" was to run docker-compose up with the dynamodb-local instance, then jump into dynamodb-admin and create the tables. One of these tables was populated through a JS script, but that was only added later, after the problem was already known to happen. The others were left empty and would only be populated through the testing of the API.

Writing to the database was done through JS, using aws-sdk. Writes worked fine, and a first update worked as well, but as soon as the API tried reading an item that had been updated, dynamodb-local would crash. This also happened if we tried to update an item manually, as I described.
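
The step where it actually blows up is the read that follows the update; something like this (same hypothetical table/key names as in the earlier sketch):

```js
// Sketch: reading back an item that was just updated.
// docClient: an AWS.DynamoDB.DocumentClient pointed at the local endpoint,
// constructed as in the earlier sketch (hypothetical names).
// On an affected instance, dynamodb-local now has two rows for this key and
// crashes with "Given key conditions were not unique" instead of returning the item.
async function readBackUpdatedItem(docClient) {
  const { Item } = await docClient.get({
    TableName: 'ExampleTable',
    Key: { pk: 'user#1', sk: 'profile' },
  }).promise();
  return Item;
}
```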

Since we stopped using dynamodb-admin we haven't had such a problem, despite not changing anything other than that (the code is the same, and the docker instance is the same as well).

Do you think there might be somewhere I can find logs for either dynamodb-admin or dynamodb-local? This way I might provide more/better info.

Other than that, I hope this helps.

I have this issue as well. Any UpdateItem operation duplicates the item, and then the db just stops working. I can literally replicate it with a single table.

  MyOwnTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: MyOwnTable
      TableClass: STANDARD
      AttributeDefinitions:
        - AttributeName: another_id
          AttributeType: S
        - AttributeName: this_id
          AttributeType: S
      KeySchema:
        - AttributeName: another_id
          KeyType: HASH
        - AttributeName: this_id
          KeyType: RANGE
      BillingMode: PAY_PER_REQUEST
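
(If you'd rather reproduce this without the serverless plugin, below is a rough aws-sdk equivalent of that table definition — just a sketch, assuming dynamodb-local on http://localhost:8000.)

```js
// Sketch: creating MyOwnTable directly against dynamodb-local with aws-sdk v2.
const AWS = require('aws-sdk');

const dynamodb = new AWS.DynamoDB({
  region: 'local',
  endpoint: 'http://localhost:8000',
  accessKeyId: 'local',      // dummy credentials; dynamodb-local doesn't validate them
  secretAccessKey: 'local',
});

async function createMyOwnTable() {
  await dynamodb.createTable({
    TableName: 'MyOwnTable',
    AttributeDefinitions: [
      { AttributeName: 'another_id', AttributeType: 'S' },
      { AttributeName: 'this_id', AttributeType: 'S' },
    ],
    KeySchema: [
      { AttributeName: 'another_id', KeyType: 'HASH' },
      { AttributeName: 'this_id', KeyType: 'RANGE' },
    ],
    BillingMode: 'PAY_PER_REQUEST',
  }).promise();
}
```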

Using dynamodb-admin, go to this table and create the following item:

{
  "another_id": "123",
  "this_id": "abc",
  "myVal": 4
}

Click the Save button. Reopen the same item and then change "myVal" to 5.

The UI doesn't react, and the log fills up with:

Mar 31, 2022 3:22:02 AM com.almworks.sqlite4java.Internal log
WARNING: [sqlite] SQLiteDBAccess$14@1991783d: job exception
com.amazonaws.services.dynamodbv2.local.shared.exceptions.LocalDBAccessException: Given key conditions were not unique. Returned: **details on the fields you just changed**
	at com.amazonaws.services.dynamodbv2.local.shared.access.LocalDBUtils.ldAccessFail(LocalDBUtils.java:799)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.SQLiteDBAccessJob.getRecordInternal(SQLiteDBAccessJob.java:224)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.SQLiteDBAccess$14.doWork(SQLiteDBAccess.java:1555)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.SQLiteDBAccess$14.doWork(SQLiteDBAccess.java:1551)
	at com.amazonaws.services.dynamodbv2.local.shared.access.sqlite.AmazonDynamoDBOfflineSQLiteJob.job(AmazonDynamoDBOfflineSQLiteJob.java:117)
	at com.almworks.sqlite4java.SQLiteJob.execute(SQLiteJob.java:372)
	at com.almworks.sqlite4java.SQLiteQueue.executeJob(SQLiteQueue.java:534)
	at com.almworks.sqlite4java.SQLiteQueue.queueFunction(SQLiteQueue.java:667)
	at com.almworks.sqlite4java.SQLiteQueue.runQueue(SQLiteQueue.java:623)
	at com.almworks.sqlite4java.SQLiteQueue.access$000(SQLiteQueue.java:77)
	at com.almworks.sqlite4java.SQLiteQueue$1.run(SQLiteQueue.java:205)
	at java.lang.Thread.run(Thread.java:748)

and finally, we can see that there are two items on the table. Clicking on them triggers more of these logs.
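
For reference, the same steps can be driven through the SDK instead of the UI; on an affected instance this reportedly hits the same duplication (a sketch, assuming aws-sdk v2 for JS and dynamodb-local on http://localhost:8000):

```js
// Sketch: the create-then-edit flow from the UI steps above, done through aws-sdk v2.
const AWS = require('aws-sdk');

const docClient = new AWS.DynamoDB.DocumentClient({
  region: 'local',
  endpoint: 'http://localhost:8000',
  accessKeyId: 'local',      // dummy credentials; dynamodb-local doesn't validate them
  secretAccessKey: 'local',
});

async function reproduce() {
  // Step 1: create the item (the "Save" in dynamodb-admin).
  await docClient.put({
    TableName: 'MyOwnTable',
    Item: { another_id: '123', this_id: 'abc', myVal: 4 },
  }).promise();

  // Step 2: change myVal to 5 (the edit that reportedly duplicates the item
  // on an affected instance).
  await docClient.update({
    TableName: 'MyOwnTable',
    Key: { another_id: '123', this_id: 'abc' },
    UpdateExpression: 'SET myVal = :v',
    ExpressionAttributeValues: { ':v': 5 },
  }).promise();

  // Step 3: reading the key back then fails with the
  // "Given key conditions were not unique" error from the log above.
  await docClient.get({
    TableName: 'MyOwnTable',
    Key: { another_id: '123', this_id: 'abc' },
  }).promise();
}
```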

I just found out something interesting. I downloaded NoSQL Workbench from Amazon and tried it again, and I get the same thing. I think we can safely assume that this doesn't depend on the UI for the DynamoDB local instance, but rather on the way the instance is set up. I'm currently using https://www.serverless.com/plugins/serverless-dynamodb-local, which COULD be the culprit. This plugin uses https://www.npmjs.com/package/dynamodb-localhost as an underlying dependency, and that one uses the official DynamoDBLocal.jar file... Interesting 🤔

I added noStart: true for serverless-dynamodb-local and decided to stand up my own Dockerized local DynamoDB. It works. It does seem there's an issue with how serverless-dynamodb-local or dynamodb-localhost works.

@AugustoQueiroz By any chance, were you using either of those two libraries?

I think I managed to isolate the culprit to the option: -optimizeDbBeforeStartup...

Removing that option worked for me.

Maybe it is related to saving data written with different DynamoDB ConversionSchemas (V1, V2, V2_COMPATIBLE). This can potentially corrupt the data.

Yes, UpdateItem creates a new record and causes a duplicate key for me too. It also happens on both macOS and WSL Ubuntu.