DynamoDB: Support batching on full-load operations
amotl opened this issue · 3 comments
Problem
The DynamoDB Table Loader does not support bulk loading yet. Batching needs to be implemented so that larger amounts of data can be transferred more efficiently.
Details
if key is None:
    response = table.scan(Limit=bulk_size)
else:
    response = table.scan(ExclusiveStartKey=key, Limit=bulk_size)
The snippet above effectively emulates batching manually, by paginating over Scan results. The DynamoDB API also appears to offer a native batch operation, called BatchGetItem, which might be the right choice to use from the beginning.
I did not look into the details yet, so please advise and correct me where I am wrong. Thank you very much. 🍀
Contrary to my previous assessment, the BatchGetItem and BatchExecuteStatement operations are not about reading the full contents of a table in bulk. BatchGetItem fetches specific items by primary key, and BatchExecuteStatement submits multiple PartiQL statements, each within a single request.
Using the Scan operation together with pagination, as shown in the code snippet in the original post, is the right choice.
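For illustration, the paginated Scan approach could be wrapped in a generator roughly like the one below. This is only a minimal sketch, not the loader's actual implementation: the function name `scan_in_batches` and its default `bulk_size` are hypothetical, and `table` is assumed to be any object exposing a boto3-style `scan()` method that returns `Items` and, while more pages remain, `LastEvaluatedKey`.

```python
from typing import Any, Dict, Iterator, List, Optional


def scan_in_batches(table: Any, bulk_size: int = 100) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of items from a table, one Scan page at a time.

    `scan_in_batches` is a hypothetical helper name. `table` is assumed to
    behave like a boto3 DynamoDB Table resource.
    """
    key: Optional[Dict[str, Any]] = None
    while True:
        # Limit caps the number of items evaluated per request.
        kwargs: Dict[str, Any] = {"Limit": bulk_size}
        if key is not None:
            # Resume where the previous page stopped.
            kwargs["ExclusiveStartKey"] = key
        response = table.scan(**kwargs)

        items = response.get("Items", [])
        if items:
            yield items

        # LastEvaluatedKey is absent once the final page has been read.
        key = response.get("LastEvaluatedKey")
        if key is None:
            break
```

With boto3, this might be used as `for batch in scan_in_batches(boto3.resource("dynamodb").Table("my-table"), bulk_size=500): ...`, where `my-table` is a placeholder table name. A page can contain fewer items than `bulk_size`, so downstream code should not rely on a fixed batch length.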