cloudspannerecosystem/wrench

Divide and apply long DML that exceeds the mutation limit (20,000)

110y opened this issue · 3 comments

110y commented

WHAT

  • Divide and apply long DML that exceeds the mutation limit (20,000)
    • Cloud Spanner does not allow DML that exceeds this limit.

WHY

  • Currently, if we want to apply DML that exceeds the limit, we must split it into smaller parts by hand, which is tedious.

110y commented

We would like to do this not only for Partitioned DML but also for INSERT statements.

AlexanderMann commented

@110y we're looking at leveraging this in our infrastructure. Curious as to what work you think is needed to land this feature, as we also think this is a limiting factor.

ie, I'd be happy to help take on some maintenance work, just dunno if you have an idea/plan for how to pull this off.

110y commented

@AlexanderMann

Thank you for commenting.

For now, GCP does not provide any documentation on how to count Spanner mutations. However, there are some experiments on counting mutations, such as this one: https://github.com/sinmetal/mutation_count_playground.

My current idea is:

  • Get the table structure (including indexes) from the information_schema tables.
  • Parse the DML queries to count the number of columns (and indexes) that will be affected.
  • Calculate the number of mutations that will be applied, following the experiments mentioned above.
  • Divide the DML queries so that each one can be applied to Spanner.
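
The steps above could be sketched roughly as follows. This is only an illustration in Python, not wrench code. The cost model (one mutation per written column plus one per affected index, for each row) is an assumption taken from the idea above, not a documented Spanner rule, and the names `mutations_per_row` and `split_rows` are hypothetical. The column and index counts are assumed to have been read from information_schema already.

```python
from typing import Iterator, List, Sequence, TypeVar

Row = TypeVar("Row")

# Per-commit mutation limit cited in this issue.
MUTATION_LIMIT = 20_000


def mutations_per_row(num_columns: int, num_indexes: int) -> int:
    """Assumed cost model: one mutation per written column
    plus one per affected index. Not an official formula."""
    return num_columns + num_indexes


def split_rows(
    rows: Sequence[Row],
    num_columns: int,
    num_indexes: int,
    limit: int = MUTATION_LIMIT,
) -> Iterator[List[Row]]:
    """Yield batches of rows whose estimated mutation count stays within limit."""
    per_row = mutations_per_row(num_columns, num_indexes)
    if per_row <= 0:
        raise ValueError("a row must touch at least one column")
    if per_row > limit:
        raise ValueError("a single row already exceeds the mutation limit")
    # Largest number of rows that fits in one commit under the assumed model.
    batch_size = limit // per_row
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])
```

Each yielded batch would then be rendered as its own INSERT statement and applied in a separate transaction. For example, with 3 columns, 1 index and a limit of 12, each row costs 4 estimated mutations, so ten rows are split into batches of at most 3 rows.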

ie, I'd be happy to help take on some maintenance work

Awesome! If you can help with this issue, I'll review it.

just dunno if you have an idea/plan for how to pull this off.

I've written my current idea above, but let me know if you have any other ideas. Thanks.