Azure/azure-functions-host

Figure out how to support In/Out bindings properly for non C# languages

mathewc opened this issue · 14 comments

There are challenges currently to supporting in/out binding in scripts for types other than C#. The reason is that for all the other script types, values have to get marshalled (e.g. input bindings write to a temp file, output bindings read from temp file). That pipeline of course loses any object identity, which is required by the SDK read/write bindings.

I'd posit this is a language by language thing. It feels like, we have phases of language support:

  1. Can be called via spawning a new process, communicate via STDIN/OUT & Files
  2. Function is called via a cold "host" which supports basic types (string, json, blob), which are 1 way
    • Host enforced data type consistency, including in, out, in/out
  3. Function is called via a hot "host"
  4. Function is called via a hot "host" that supports advanced types (streams, Azure SDK, etc.)

C# is at 4 (and probably actually something a little more advanced). Node is at 2. Everything else is at 1. Some frameworks, like Batch, may not ever advance to later phases.

For phase 1, we may just go with in/out bindings, we have 1 in and 1 out. It's up to the end user to enforce type consistency (or risk failure).

Thoughts?

Do we need this open? We now have some support here, with work left to do, but this issue doesn't capture that work anymore, I think.

Let's leave it open until I have a chance to review where we're at and what remains to do. I'll then update the issue with the details.

One issue we have still is that we don't have proper portal support for direction: "inout" which is required in some cases (e.g. when binding to CloudBlockBlob, etc.) Customers are running into this issue.

I probably need to understand better what the difference between in and inout is. It seems like we could put it under the "in" category in the portal.

We're mapping our { in, out, inout } values to the corresponding values that the core SDK uses in its bindings. It uses the FileAccess enum, { Write, Read, ReadWrite }. There are some existing validations in place for blob and other bindings to error if the wrong FileAccess is specified. Naturally we map "out" to "Write", however the SDK will error on that - it requires ReadWrite when binding to CloudBlockBlob.

There may be changes we can make in the SDK for this, or perhaps restrict changes to Functions. Not sure yet.

Interesting. I remember we had thought about inout way back in the initial designs for UX, etc. We should probably sync how to expose this properly. Maybe a "Blob Read Input", "Blob Read/Write Input" and a "Blob Write Output?

Do we need to keep that concept/restriction in place? I was talking to @brettsam about that and we were trying to understand the value of ReadWrite binding types. Document DB exposes similar types that would work as in or out. Prior to the change we made, we were driving that off of the argument type, so we can make that decision when needed. Just wondering if this is a concept we need users to worry about.

^ 👍 I'd rather limit the number of concepts we have, especially since the bindings concepts can be a bit abstract to grok in the first place. If we can prevent the user from having to think about this, I think that'd be ideal.

Bindings do need a direction, which must be specified by the user. There are also use cases for inout bindings, as in the DocDB/Blob scenarios. So the fact that the user is specifying a direction means that in the case of an inout binding (a binding that in their code they are both reading from and writing to), neither "in" nor "out" really capture it - it's an "inout".

I agree that we want to make this as simple for users as possible. Need to think on it more.

Need a discussion about in, out, and inout. @christopheranderson, @fabiocav and @mathewc

I discovered this issue last night when I was implementing support for "inout" blob and blobTrigger bindings in Rust using the 2.0 language extension protocol. This is similar to what is being observed for other language workers like Python in #2104.

My language worker is returning output data in the gRPC InvocationResponse with the same name as the "inout" binding, so I assumed this would do the trick, but instead I quickly ran into this issue in the WebJobs SDK.

Unfortunately, "inout" blobTrigger bindings ran into a different NotImplementedException coming from TriggerAdapterBindingProvider. The documentation for C# implies that this works for some C#-specific binding types, so it would be nice if we could support "inout" blobTrigger bindings in addition to "inout" blob bindings for other languages as well. Note: I didn't research how "inout" blobTrigger bindings actually work for the supported C# binding types, though.

Doing so would enable pretty clean code to process blobs in some way after being uploaded, without having to have a separate output binding:

use azure_functions::bindings::BlobTrigger;
use azure_functions::func;

#[func]
#[binding(name = "trigger", path = "container/{filename}")]
pub fn modify_blob(trigger: &mut BlobTrigger) {
    info!("Current contents of blob '{}': {:?}.", trigger.path, trigger.blob.as_str());
    trigger.blob = "Blob was modified!".into();
}

Side note: the direction for the binding is inferred to be "inout" in the above example due to the binding being passed by mutable reference (i.e. &mut) and only for bindings that support that direction.

Closing this issue as this work is being tracked as part of the rich binding support being added to the OOP language workers feature.

Hi @fabiocav. Is there an issue that tracks that I could subscribe to? I like to keep track of what's going in so I can update the Rust worker as-needed.