Uniquely identify actors of the same type residing on the same host?
persquare opened this issue · 8 comments
Hi,
Is there any way to distinguish two actors of the same type other than what is shown in the echoserver.rs example, where (as I understand it) the same echo actor is signed twice to give each copy a unique public key?
I guess what I'm looking for is some way to give an actor a unique name or id, much like capability providers can be named.
Just to give some context to the question: I'm experimenting with a generic host where applications can be deployed from manifests (similar to examples/sample_manifest.yaml), but where the host and capability providers are static/long-lived, whereas (multiple) actor applications are dynamically deployed and removed during the lifetime of the host.
For a long-lived host where the capabilities stay and the actors move in and out, the thing you want to use for actor uniqueness is definitely the actor's public key. This is how gantry operates: you upload and download wasm modules based on their public keys.
If what you're looking for is to run two copies of the same actor with the same cryptographically signed identity, then we don't currently support that. I'm assuming you're asking so that you could load actor Mxxxxx multiple times into the same host, each time with a different configuration, for example to run the same logic (actor) on two different ports, or to expose the same actor via both HTTP and NATS?
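To make the constraint concrete, here is a minimal sketch of a host-side registry keyed by public key. It is only an illustration of the behavior described above, not waSCC's actual implementation; `ActorRegistry`, `add_actor`, and `remove_actor` are hypothetical names.

```rust
use std::collections::HashMap;

/// Hypothetical host-side registry: one slot per actor public key.
#[derive(Default)]
struct ActorRegistry {
    // Maps an actor's public key to its wasm module bytes.
    actors: HashMap<String, Vec<u8>>,
}

impl ActorRegistry {
    /// Registers a module under its public key.
    /// A second copy with the same key is rejected.
    fn add_actor(&mut self, public_key: &str, module: Vec<u8>) -> Result<(), String> {
        if self.actors.contains_key(public_key) {
            return Err(format!("actor {} is already running", public_key));
        }
        self.actors.insert(public_key.to_string(), module);
        Ok(())
    }

    /// Removes an actor so the same key can be deployed again later.
    /// Returns true if an actor with that key was present.
    fn remove_actor(&mut self, public_key: &str) -> bool {
        self.actors.remove(public_key).is_some()
    }
}
```

With a registry like this, actors can come and go during the lifetime of the host, but loading the same signed module twice fails until the first copy is removed.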
If you could provide a little more context on the use case, I can let you know if there's a workaround given existing constraints.
My original concern about allowing multiple identical actors (the same unique identity, hash, module bytes) to run within the same host was that waSCC wouldn't be able to safely protect that actor from stepping on its own configuration - e.g. network port conflicts, blocked access to shared resources, etc.
However, that problem exists with or without the ability to run duplicate actors within the same host. As I said, I'd love to hear more about the use case that warrants duplicate actors; then I might be able to provide more useful information.
So, previously I've used a combination of actors and dataflow to write applications that can be deployed across distributed runtimes (hosts) and you can find some info here on the assumptions made on the applications.
The waSCC project piqued my curiosity, and I wanted to see if I could create applications the same way. To give a concrete "hello world"-example:
init : flow.Init(data="Hello World!")
print : io.Print()
voidport > init.in
init.out > print.token
where the first two lines create actors and the final two lines determine the messaging routes between the actor ports. Regardless of which host the init and the print actors reside on, the message from init would find its way to print.
As I understand how waSCC works, there would be no way to make sure that the message arrived at the particular print actor instantiated by this script (manifest), as opposed to another print actor from some other application.
Now, I'm aware that waSCC actors don't have an init, and that they need an external request to activate; but I guess using an http_server cap for init would do the trick here.
I hope I've managed to explain what I'm trying to achieve, even though the example is somewhat contrived.
One more thing. I guess the above would also require that I figure out some way to create per-application topics for messages based on the flow graph.
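One possible way to derive such per-application topics is sketched below. The naming scheme (application ID, then destination actor, then destination port) and the names `Edge`, `topic_for`, and `routes_for` are assumptions made for illustration; nothing here is part of waSCC.

```rust
/// One edge of the flow graph, e.g. `init.out > print.token`.
struct Edge<'a> {
    from_actor: &'a str,
    from_port: &'a str,
    to_actor: &'a str,
    to_port: &'a str,
}

/// Builds a topic scoped to one deployed application (hypothetical scheme).
/// Two applications using the same actor names get different topics.
fn topic_for(app_id: &str, edge: &Edge) -> String {
    format!("{}.{}.{}", app_id, edge.to_actor, edge.to_port)
}

/// Maps each source port ("actor.port") to the topic it should publish on.
fn routes_for(app_id: &str, edges: &[Edge]) -> Vec<(String, String)> {
    edges
        .iter()
        .map(|e| {
            let source = format!("{}.{}", e.from_actor, e.from_port);
            (source, topic_for(app_id, e))
        })
        .collect()
}
```

Because the application ID is part of every topic, a message from init in one deployment cannot land on the print actor of another deployment, provided each deployment is given a distinct ID.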
Looking at the assumptions from the Ericsson Calvin page, I see these two bullets:
- Actor classes are uniquely identifiable
- Actor instances are uniquely identifiable
If you take a philosophical approach, then the waSCC actor public key right now is a unique identifier for an actor class, or in other terms, the prototype. On its own, an actor lacks the ability to uniquely identify an instance (although internally there is a unique, monotonically-increasing hostID variable assigned to each wasm module). Representing two instances of the same prototype basically comes down to a choice of state management:
- Stateless - The actor prototype code is invoked with all of the context necessary to perform its task and it does not maintain data between calls. In some architectures, the data passed to the prototype is a unique instance ID used for "state lookup" during calls, which is what I'm doing in the architecture for a distributed online game engine.
- Stateful - The combination of the actor's prototype code and per-instance data is held in memory, and so every instance is a snapshot of a running wasm interpreter and the VM state.
In the architecture I describe for the game, I deliberately avoid the stateful version because I don't want the overhead of having to maintain two active wasm interpreters for what amounts to exactly the same code that would only otherwise differ by state.
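The stateless variant can be sketched roughly as follows. This is an illustration under assumptions, not the game engine's code: `KeyValueStore` is an in-memory stand-in for a key-value capability, and `Invocation` and `handle` are hypothetical names.

```rust
use std::collections::HashMap;

/// In-memory stand-in for a key-value capability (hypothetical).
#[derive(Default)]
struct KeyValueStore {
    entries: HashMap<String, u64>,
}

impl KeyValueStore {
    fn get(&self, key: &str) -> Option<u64> {
        self.entries.get(key).copied()
    }

    fn set(&mut self, key: &str, value: u64) {
        self.entries.insert(key.to_string(), value);
    }
}

/// Hypothetical message shape: the caller supplies the instance ID.
struct Invocation<'a> {
    instance_id: &'a str,
    increment: u64,
}

/// Stateless handler: the code keeps nothing between calls.
/// All per-instance state is looked up by instance ID and written back.
fn handle(store: &mut KeyValueStore, msg: &Invocation) -> u64 {
    let key = format!("counter:{}", msg.instance_id);
    let next = store.get(&key).unwrap_or(0) + msg.increment;
    store.set(&key, next);
    next
}
```

A single copy of the prototype code serves any number of logical instances here; two instance IDs never see each other's counters because every read and write is keyed on the ID.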
Thanks for the pointer to the MUD blog post, delightful read! I've added "blog-ahead compilation" to my active vocabulary.
I'm new to wasm (and Rust for that matter) and naïvely assumed that all instances shared the same interpreter. The use of prototypes makes perfect sense now. I need to recalibrate my compass, and digest this for a while but I'm pretty sure that that will result in new questions.
Thinking somehow seems to result in more questions than answers...
Just one final check: There is currently no mechanism in waSCC that assists in passing an instance ID during calls – so I'd have to come up with my own solution for that, right?
waSCC, and waPC underneath it, are both pretty strict about remaining ignorant of the shape and manner of the message payloads. It's this agnosticism that allows waSCC to support all kinds of scenarios and be as portable as it is.
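Since the payload is opaque to the host, an instance ID has to travel inside the payload itself. A minimal sketch of one way to do that follows, assuming a made-up wire format (one length byte, then the ID, then the body); `encode` and `decode` are hypothetical helpers, not waSCC or waPC functions.

```rust
/// Prefixes an opaque body with a length-tagged instance ID.
/// Returns None if the ID does not fit in the single length byte.
fn encode(instance_id: &str, body: &[u8]) -> Option<Vec<u8>> {
    if instance_id.len() > u8::MAX as usize {
        return None;
    }
    let mut out = Vec::with_capacity(1 + instance_id.len() + body.len());
    out.push(instance_id.len() as u8);
    out.extend_from_slice(instance_id.as_bytes());
    out.extend_from_slice(body);
    Some(out)
}

/// Splits a payload back into (instance ID, body).
/// Returns None if the payload is truncated or the ID is not valid UTF-8.
fn decode(payload: &[u8]) -> Option<(&str, &[u8])> {
    let (&len, rest) = payload.split_first()?;
    let len = len as usize;
    if rest.len() < len {
        return None;
    }
    let (id, body) = rest.split_at(len);
    Some((std::str::from_utf8(id).ok()?, body))
}
```

The host only ever sees a byte vector; sender and receiver agree on the envelope between themselves, which keeps the approach compatible with the payload agnosticism described above.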
Once I get some of the work done on the MUD, I can share the code with you, but you're right in that I am passing instance IDs around and then inside the wasm code, fetching state from a key value store that's keyed on that instance ID.
My grandfather used to say that "if you have questions, you're doing it right" :) I've always loved that saying.
As a side note on running multiple copies of the interpreter: if your wasm engine is a JITter, then it's actually compiling the wasm code into native code instead of interpreting it on demand. Part of the guarantee of wasm is the isolated sandbox, so the JITters produce isolated/sandboxed code. If two instances of the same module could share code paths, even innocently in the name of optimization, then any successful module impersonator could get scheduled alongside the real module and steal any and all sensitive data that flows through the code paths.
If you want to chat more about WebAssembly and/or waSCC, feel free to come by our community meeting every Friday at 1pm Eastern.