sustrik/libmill

Context switching points

paulofaria opened this issue · 12 comments

Hello, Martin.

As you might remember, I maintain https://github.com/Zewo/Venice, a Swift library that wraps libmill.

Unfortunately, we're having a bit of a leakage problem on macOS. Swift uses Foundation under the covers on Darwin, which in turn uses ARC. The problem is that, because of libmill's context switches, the push/pop calls on the autorelease pool get unbalanced.

To deal with this, we basically need one hook that fires right before any resume of a coroutine and another that fires right after it's suspended, plus a guarantee that those hooks are always 1:1.

We tried inspecting the code ourselves, but I'm afraid we couldn't find all the places. Could you help us pinpoint the actual places the context switches occur, or the best place to put those hooks, given we need 1:1?

Just to note, I checked CLibVenice and it uses the setjmp context switching method still. @paulofaria - libmill HEAD uses an optimised context switcher on x64.

@raedwulf yeah! we were talking about updating it just now. haha. awesome work by the way. 😊

Thanks :) not sure if this helps but setjmp occurs when the coroutine is suspended and longjmp performs the actual switching to another coroutine. When a setjmp occurs again it is this latter coroutine being suspended now. Maybe hooking these functions will catch all cases easily?
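To make the setjmp/longjmp description above concrete, here is a minimal sketch in plain C. It is not libmill code: `saved_ctx`, `other_side` and `demo_switch_once` are hypothetical names, and both sides share one stack, so this only shows the control transfer, not a real coroutine switch.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical demo, not libmill code. */
static jmp_buf saved_ctx;      /* context of the "suspended" side */
static volatile int switches;  /* counts completed longjmp transfers */

static void other_side(void) {
    /* longjmp never returns here; control reappears at the setjmp
       in demo_switch_once, which then returns 1 instead of 0. */
    longjmp(saved_ctx, 1);
}

/* Returns the number of context transfers observed (1 on success). */
int demo_switch_once(void) {
    switches = 0;
    if (setjmp(saved_ctx) == 0) {
        /* First return: the context is saved, so this side is
           "suspended" from the point of view of whoever jumps back. */
        other_side();
        return -1; /* not reached */
    }
    /* Second return: this side was switched back in by longjmp. */
    switches++;
    return switches;
}
```

The point of the sketch is that every longjmp lands on exactly one earlier setjmp, which is why those two calls look like natural places for paired hooks.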

I'm not familiar with Swift but I think it might be difficult to wrap my optimised implementation and I think the interop would kill any of the benefits gained. But no worries, the implementation with setjmp and longjmp is not going anywhere.

I think @antonmes had a try with your context switching. Anton, do you remember if there were performance drops?

Or actually, not much gain.

@raedwulf appreciate the tips, using setjmp/longjmp nearly works, but there's still somewhere else where execution drops off that we're missing.

@paulofaria yes, I had the same numbers with this benchmark, but I guess the bottleneck in this case is memory allocation for each coroutine.

Sorry for the delay.

Some background info: mill_suspend() is called to suspend the current coroutine (IIRC it's also called -- without the corresponding resume -- when a coroutine finishes). mill_resume() is called to re-schedule a suspended coroutine. It doesn't mean it will run immediately though.
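The distinction between re-scheduling and actually running can be illustrated with a toy model. This is only a sketch under assumptions: `toy_sched`, `toy_resume` and `toy_suspend` are hypothetical names and do not reflect libmill's real data structures.

```c
#include <assert.h>

#define TOY_MAX 8

/* Toy model only: names and layout are hypothetical, not libmill's. */
struct toy_sched {
    int ready[TOY_MAX]; /* FIFO of coroutine ids waiting to run */
    int head, tail;
    int running;        /* id of the coroutine currently executing */
    int switches;       /* number of actual context switches */
};

/* "Resume": only marks the coroutine as runnable. Nothing switches here. */
void toy_resume(struct toy_sched *s, int id) {
    assert(s->tail < TOY_MAX);
    s->ready[s->tail++] = id;
}

/* "Suspend": the running coroutine gives up control and the next ready
   one is switched in. In this model it is the only place a switch occurs. */
int toy_suspend(struct toy_sched *s) {
    assert(s->head < s->tail);
    s->running = s->ready[s->head++];
    s->switches++;
    return s->running;
}
```

In this model, two calls to `toy_resume` followed by one `toy_suspend` produce a single switch, so hooks placed on resume and suspend calls would not pair up 1:1.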

That being said, I would say the best places to hook into libmill's context switching are the mill_setjmp_ and mill_longjmp_ macros. They are the only way to switch contexts in libmill, so putting the hooks there should catch all context switches.
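A rough sketch of what hooking at the macro level could look like follows. Everything in it is an assumption made for illustration: `demo_setjmp_` and `demo_longjmp_` are stand-ins built on plain setjmp/longjmp (the real mill_setjmp_/mill_longjmp_ definitions are platform specific), and `on_leave`/`on_enter` are hypothetical hooks where a wrapper might pop and push its autorelease pool.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical hooks; here they only count how often they fire. */
static volatile int leave_count;
static volatile int enter_count;
static void on_leave(void) { leave_count++; }
static void on_enter(void) { enter_count++; }

/* Stand-ins for libmill's macros. These are assumptions for the demo,
   not the library's actual definitions. */
#define demo_setjmp_(c)  setjmp(c)
#define demo_longjmp_(c) longjmp((c), 1)

/* Hooked jump: announce that the running context is being left. */
#define hooked_longjmp_(c) do { on_leave(); demo_longjmp_(c); } while (0)

static jmp_buf demo_ctx;

static void switch_away(void) {
    hooked_longjmp_(demo_ctx);
}

/* Performs n round trips and returns 1 if the hooks stayed balanced. */
int demo_hooks_balanced(int n) {
    volatile int i; /* volatile so its value survives the longjmp */
    leave_count = 0;
    enter_count = 0;
    for (i = 0; i < n; i++) {
        if (demo_setjmp_(demo_ctx) != 0) {
            /* Nonzero return: this context was just switched back in. */
            on_enter();
            continue;
        }
        switch_away(); /* does not return; control reappears at setjmp */
    }
    return leave_count == n && enter_count == n;
}
```

The hooks balance here because each jump lands on a saved context. A coroutine that finishes jumps away without ever being re-entered, and a freshly started one may not be entered through a saved context at all, so those two cases would likely need separate handling to keep the count 1:1.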

Thanks! And sorry for the late reply as well!

@sustrik and I have been doing some work on libdill's context switching/stack allocation code. I'm not sure if you've tried using it in Venice yet, but it might be worth a try to see if the problems still exist. However, some functionality is split between libdill and dsock, so even if it does work you might not be able to do a full port just yet.