Remove configuration from state

Question

Remove configuration from state

Closed this issue 12 days ago · 1 comments

This is speculative issue, something that's been on my mind for well over a year.

Basically I would like adaptor code to make far fewer assumptions about the state object. It shouldn't assume state.configuration and shouldn't assume state.client. These assumptions cause problems when users forget to return state (or return state wrongly).

Ideally, configuration wouldn't be written to the state object at all, and it would feel safer and more responsible if credentials are never actually shared with job code.

So, this issue is an exploration of how we could remove configuration from the state object.

Solutions

1. Do it in the adaptor

The quickest, easiest, most backward-compatible solution is to make adaptor code make fewer assumptions about state.

For a long time adaptors wrote a client object to state, which triggered exactly this same problem. This was fixed by using closures. The execute function will read from state before execution starts, save the client to a closure variable, than then start executing. Operations then have a client in scope.

We could do the same thing with configuration: inside the execute hook, pull any needed values off the config object, save them to local closures, and remove the configuration key from state.

One difficulty with this is, believe it or not, I don't like to be too prescriptive in adaptor code. Have strong standards yes, but hard rules no. If we do have a strong rule, I'd much prefer it was enforced by the runtime (leading on to option two...)

2. Do it in the runtime

The runtime could split state and configuration into two different objects and pass them to each operation as two arguments.

Every operation should have this signature:

fn(state, config) => State

The two benefits to this approach, as I see it:

The adaptor will never assume state.configuration, removing one more runtime dependency on the shape of state
Job code will never see the config object

Does this approach make debugging config harder? How would you ever inspect your configuration object? You can't console.log it from job code. The runtime could log the shape of the config object, including primitive data types and perhaps string lengths. Or the first and last 3 characters or something. We could also have a common printConfig() operation which does the same thing

This is not backwards compatible because configuration will not be on the state object, so old adaptors will break. We could always have a runtime flag to support this, and the runtime can run legacy mode for those adaptors.

3. Do it in a lifecycle hook

A variant of 2: if an adaptor exports a setup function (or init or something), we can call that with state and config as two arguments.

The adaptor can then do whatever it wants to initialize itself with config, writing to state or setting closure variables.

We could also have a teardown function to invert any setup if needed (although if the adaptor doesn't use state, it likely doesn't have anything to do)

The advantage of this approach is that configuration isn't needlessly passed to every operation, which feels like an antipattern. I much prefer a burn-after-reading approach to config where we pass it to the ADAPTOR, not the job, on startup, and then never expose it again.

Again, debugging might be made a bit harder. But we could consider logging some obfuscated summary of the config object at the top of the log. This would be adaptor-specific - just console.log any non-sensitive information, like the user name, in the callback.

Examples

msgraph reads state.configuration for every operation to pull the access token off it. This is super brittle.

Answer 1 · 2024-10-30T09:28:22.000Z

This is subsumed by #779 (where we go with option 2)