Valodim/terraform-provider-desec

Support deploying bigger plans: use bulk requests to work around rate limitations

ashpool37 opened this issue · 2 comments

Initial problem

I was trying to create a domain with 11 records in it from scratch. Upon applying the plan, all records were created normally, except for one. The command output ended with the following message:

Error: failed to call API: Post "https://desec.io/api/v1/domains/[REDACTED]/rrsets/": POST https://desec.io/api/v1/domains/[REDACTED]/rrsets/ giving up after 6 attempt(s)

After issuing terraform apply again, the new plan — which contained the creation of only one record — was applied without errors.

Attempted solution (limit_write)

As I felt I was reaching a rate limit of the deSEC API, I attempted to tweak the limit_write argument in my provider schema as suggested by the documentation. I found that this argument was no longer supported:

Error: Unsupported argument

on main.tf line 23, in provider "desec":
23: limit_write = 4

An argument named "limit_write" is not expected here.

Pinning down the bug

This is a documentation bug at least, as the docs still mention a schema argument that isn't there anymore.

However, I also believe there ought to be a way to influence the rate at which the provider sends requests, given the relatively strict limits imposed by deSEC. I can think of a few options:

  • bring back limit_read and limit_write;
  • implement a retry_max argument and use it to set RetryMax in the options of the desec client;
  • find a way to leverage bulk requests supported by the deSEC API, e.g. by creating/deleting instances of the same resource in bulk.

Having one or more of these enhancements implemented would improve the experience of applying bigger plans using the provider.
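To make the third option concrete, here is a minimal sketch of what a single bulk creation request could look like. It assumes the deSEC API accepts a JSON array of RRset objects on the same rrsets/ endpoint shown in the error above, and that requests authenticate with an "Authorization: Token ..." header. The RRset struct, the newBulkCreateRequest helper, the example.com domain and the token value are all hypothetical and are not taken from the provider's code. The request is only built, not sent.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// RRset is a hypothetical struct mirroring the fields of a deSEC RRset
// object (subname, type, ttl, records). It is not the provider's own type.
type RRset struct {
	Subname string   `json:"subname"`
	Type    string   `json:"type"`
	TTL     int      `json:"ttl"`
	Records []string `json:"records"`
}

// newBulkCreateRequest builds, but does not send, one POST request that
// carries every RRset in a single JSON array.
func newBulkCreateRequest(domain, token string, rrsets []RRset) (*http.Request, error) {
	body, err := json.Marshal(rrsets)
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("https://desec.io/api/v1/domains/%s/rrsets/", domain)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Assumption: token-based authentication via the Authorization header.
	req.Header.Set("Authorization", "Token "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	rrsets := []RRset{
		{Subname: "www", Type: "A", TTL: 3600, Records: []string{"192.0.2.1"}},
		{Subname: "", Type: "MX", TTL: 3600, Records: []string{"10 mail.example.com."}},
	}
	req, err := newBulkCreateRequest("example.com", "hypothetical-token", rrsets)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String(), req.ContentLength, "bytes")
}
```

With this shape, a plan of 11 records would cost one write request instead of 11, which is why bulk requests matter for the rate limit.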

After experimenting with retry_max and some medium-sized zone definitions (30+ records), it's clear that bulk requests are the only workable option for medium-scale deployments. The 30 requests/hour constraint is unworkable otherwise.

I agree this can only really be solved by batching the requests. However, Terraform doesn't offer any mechanism for batching, so this would require a hack in the provider.

We're doing something similar with the rrset cache for reading resources. But with the given CRUD interface I can't even think of a straightforward hack that could be used to implement this for writing resources. We could record CRUD calls and batch execute them on a debounced timer, I guess? That sounds pretty horrible though.

I'll leave this issue open for the future, but I'm not sure there's much we can do here.