leonoel/missionary

Cloroutine exception

julienfantin opened this issue ยท 9 comments

Below is a repro for an exception materializing in the cloroutine implementation.

I tried to make the repro minimal, so the example is a bit contrived but I believe should be valid?

b-27-SNAPSHOT throws a NPE, however b-26 just hangs.

 (m/?
  ;; Unhandled java.lang.NullPointerException
  ;; Cannot invoke "clojure.lang.IFn.invoke(Object)" because the return value of
  ;;"clojure.lang.RT.aget(Object[], int)" is null
  (m/sp
   (m/? ;; works up till here
    (m/reduce
     (fn [_ x]
       (m/? (m/sp x)))
     [] (m/seed [1])))))

m/? is not valid inside clojure.core/fn. Leo says โ€“ "Parking operators are illegal outside of rewriting macros and also illegal in lambda expressions defined in rewriting macros. The reason is technical, related to platform limitations. It will eventually be documented, it is also planned to improve error messages."

Hi Dustin!

Parking operators are illegal outside of rewriting macros and also illegal in lambda expressions defined in rewriting macros

Does your comment imply that with this labeling:

 (m/? ;; 1.
  (m/sp
   (m/? ;; 2.
    (m/reduce
     (fn [_ x]
       (m/? ;; 3. 
        (m/sp x)))
     [] (m/seed [1])))))
  • 1 is actually illegal, even though it's used at the repl to pull values out
  • only 2 is technically legal because it's inside a rewriting macro and doesn't cross a fn barrier
  • it's the co-occurrence of 1 and 3 that's materializing the issue

I'm not sure I'm understanding this correctly, because pulling the expression at 2 out to the top-level works, even though the 1 and 3 situation occurs:

(m/?
  (m/reduce
   (fn [_ x]
     (m/?
      (m/sp x)))
   [] (m/seed [1]))) ;; => 1

What's the intuition here?

Surprisingly, (1) and (3) is not the same as (2), m/? is defined differently inside and outside of the rewrite macro. Inside the macro rewrite, m/? is a macro cutpoint, but outside, it is a JVM-only thread blocking operation for REPL convenience. So here, only (2) is the "real" async m/? and the others are blocking. Super different machinery under the same name.

(3) in particular seems super bad IIUC as it is blocking the thread in the middle of an async reduce, not actually sure what exactly is happening there.

I personally think (1) should be given a different name to prevent exactly this error which ~all beginners make; Leo disagrees and I've forgotten his reasoning.

Surprisingly, (1) and (3) is not the same as (2), m/? is defined differently inside and outside of the rewrite macro. Inside the macro rewrite, m/? is a macro cutpoint, but outside, it is a JVM-only thread blocking operation for REPL convenience. So here, only (2) is the "real" async m/? and the others are blocking. Super different machinery under the same name.

Right, I think I get that. Similar situation as core.async <! vs <!!?

(3) in particular seems super bad IIUC as it is blocking the thread in the middle of an async reduce, not actually sure what exactly is happening there.

I personally think (1) should be given a different name to prevent exactly this error which ~all beginners make; Leo disagrees and I've forgotten his reasoning.

I think I agree. The previous example, translated to core.async yields a more useful error:

(a/<!!
  (a/reduce
    (fn [_ x]
      (a/<!
        (a/go x)))
    [] (a/to-chan [1])))
;; Exception in thread "async-dispatch-12" java.lang.AssertionError: Assert failed: <! used not in (go ...) block
nil

I personally think (1) should be given a different name to prevent exactly this error which ~all beginners make; Leo disagrees and I've forgotten his reasoning.

Having m/? available outside of the rewriting macro is very useful for repl development, for example I often use scope-capture to capture the scope inside an m/sp block, and can then evaluate the m/? forms directly while developing.

But I agree that is can be confusing - my preference would be for a warning to be displayed when m/? is used outside and have an additional function m/blk to indicate that you intend to block the thread.

@mjmeintjes I would love to see a demo of your use of scope-capture with missionary, can you paste code and/or screenshot of it here? If you do that I can translate it into a notebook like https://nextjournal.com/dustingetz/missionary-relieve-backpressure

Leo disagrees and I've forgotten his reasoning.

  • the mechanical difference is an implementation detail, the operator semantics are the same inside and outside sp/ap : run an effect, wait for result. ? just happens to not be supported everywhere, for technical reasons, which should be better explained in both documentation and error messages.
  • a unique name means you can leverage it in macros that will work in both contexts (cf holding).
  • if virtual threads are available, it is safe to replace (sp ,,,) with (via virtual-thread-executor ,,,), parking becomes blocking but the meaning is the same.
  1. if better errors can fix the beginner experience here โ€“ is there a clear answer? Not obvious to me how to detect the misuse, but if it can be done, the following points are somewhat invalidated.

  2. Setting aside the design elegance, it seems the concrete argument is that blocking m/? saves an if statement in holding. A tiny cost savings relative to the learning curve cost.

  3. it seems like Missionary is trying to unify a lot of different things here. I see two distinct markets/use cases:

    • application programmers who want "just a common portable async solution"
    • framework programmers who want "advanced concurrency library that unifies N concurrency backends under one DSL frontend"

    If the missionary API is just frontend syntax to various concurrency machinery, perhaps there are other ways to opt in to various "modes" for different usecases. E.g. you can set a global flag, or have multiple "api namespace frontends". That way, common devs don't have to understand any advanced ideas (worse actually - your answer requires common devs to understand theoretical future things that don't even exist yet)

  4. P.S. โ€” the use case of blocking the REPL on a task is not even a real use case IMO: I propose that (a) RCF is a more idiomatic way to tap async results to the REPL in a testable way and that (b) this fully supersedes m/? repl use case. If so, the hello world should show beginners how to write idiomatic async tests right from the first moment. Testability matters!

4. P.S. โ€” the use case of blocking the REPL on a task is not even a real use case IMO: I propose that (a) RCF is a more idiomatic way to tap async results to the REPL in a testable way and that (b) this fully supersedes m/? repl use case. If so, the hello world should show beginners how to write idiomatic async tests right from the first moment. Testability matters!

It would be interesting to compare development approaches, as for me being able to use the REPL while developing and debugging async tasks is critical, and one of the big advantages of missionary above core.async.

For example, say you have a function like:

(defn example-test$ []
  (m/sp
   (let [a (m/? (get-first-result$))  ;; (1)
         b (m/? (get-dep-result$ a))  ;; (2)
         c (m/? (get-dep-result$ b))] ;; (3)
     [a b c])))

It is useful to be able to just directly evaluate (1), or do something like (def a 123) and then be able to evaluate (2). I do this all the time when developing - incrementally set up state and then develop the function further.

As I mentioned, I also use this when debugging an error - use scope-capture to set up the scope when the error occurred, and then evaluate the tasks directly to determine the problem.

For example, say (get-dep-result$ a) threw an exception, I could wrap it with (spy-ex), a custom macro that just creates a scope-capture execution point when the form it wraps throws an exception. Then I can use scope-capture to set up the scope as it was when the exception was thrown, and evaluate the forms directly inside the function to examine the problem.

(defn example-test$ []
  (m/sp
   (let [a (m/? (get-first-result$)) 
         b (./spy-ex (m/? (get-dep-result$ a)))  ;; create a "breakpoint" here when an exception is thrown
         c (m/? (get-dep-result$ b))]
     [a b c])))

Because m/? works inside and outside of the rewrite context, I can develop my code in a sync fashion, while running it as async code. This is a big advantage for me, as it makes development much simpler.