pure-c/purec

Implement reference counting

felixSchl opened this issue · 10 comments

Experimental: Implement reference counting as an alternative or as a complement to GC.

  • Add a refcount int to purs_any_t
  • Add a retain function to increment refcount (we can worry about atomicity later)
  • Add a release function to decrement the refcount
    • If refcount drops to zero clean up the contained purs_value_t, and free the pointer
  • Update purs_vec_t to properly increment/decrement the reference count
  • Update purs_record_t to properly increment/decrement the reference count
  • Keep retain-ed references in scope structs
  • Keep retain-ed references in _FFI scopes (and release after BODY)
  • Emit release calls for allocated resources before returning from functions
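
To make the checklist above concrete, here is a minimal sketch of retain/release in C. It is not purec's actual runtime code: `demo_any_t`, `demo_retain`, `demo_release`, `demo_vec` and the `demo_live` counter are hypothetical names standing in for `purs_any_t` and friends, and the value type is cut down to an int or a vector. The count is a plain int, so there is no atomicity here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for purs_any_t: an int or a vector of values. */
typedef enum { DEMO_INT, DEMO_VEC } demo_tag_t;

typedef struct demo_any {
    int refcount;             /* plain int: not atomic */
    demo_tag_t tag;
    int int_value;            /* used when tag == DEMO_INT */
    struct demo_any **items;  /* used when tag == DEMO_VEC */
    size_t length;
} demo_any_t;

static int demo_live = 0;     /* live allocations, handy for leak checks */

static demo_any_t *demo_alloc(demo_tag_t tag) {
    demo_any_t *x = calloc(1, sizeof *x);
    if (x == NULL) abort();
    x->refcount = 1;          /* the creator owns the first reference */
    x->tag = tag;
    demo_live++;
    return x;
}

demo_any_t *demo_int(int n) {
    demo_any_t *x = demo_alloc(DEMO_INT);
    x->int_value = n;
    return x;
}

/* Increment the count and hand the same pointer back. */
demo_any_t *demo_retain(demo_any_t *x) {
    x->refcount++;
    return x;
}

/* Decrement the count; at zero, release the contents and free the value. */
void demo_release(demo_any_t *x) {
    assert(x->refcount > 0);
    if (--x->refcount > 0) return;
    if (x->tag == DEMO_VEC) {
        for (size_t i = 0; i < x->length; i++) demo_release(x->items[i]);
        free(x->items);
    }
    free(x);
    demo_live--;
}

/* A vector retains every element it stores. */
demo_any_t *demo_vec(demo_any_t **items, size_t length) {
    demo_any_t *v = demo_alloc(DEMO_VEC);
    v->items = calloc(length ? length : 1, sizeof *v->items);
    if (v->items == NULL) abort();
    for (size_t i = 0; i < length; i++) v->items[i] = demo_retain(items[i]);
    v->length = length;
    return v;
}
```

With this shape the caller can release its own references to the elements right after building the vector. The elements then live exactly as long as the vector does, and releasing the vector frees everything.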

One thing to consider is that you still have to handle cyclical references somehow.

I am trying to think of a PureScript program that would result in cyclic references. Do you have an example off the top of your head?

Cyclical references require mutation, so it can be done with Lazy or a Ref. For example, anything that uses Lazy.fix is cyclical https://github.com/purescript/purescript-control/blob/v4.1.0/src/Control/Lazy.purs#L22-L25.

So a Ref would point to itself either directly or indirectly? I think we could get away with a warning not to do that and information on how to probe for leaks (or assume that knowledge, given the context). I cannot see the cycle using Lazy.fix, however. Do you mind explaining it to me?

The cycle depends on the implementation of defer, which is usually at some point tied with a Data.Lazy. Probably the simplest example with lazy lists is:

xs = defer \_ -> cons 42 xs

This will be a single cons node that points back to itself.

You can tie knots with Ref mutation.

data MutableList a = Nil | Cons a (Ref (MutableList a))

loop = do
  tail <- Ref.new Nil
  let list = Cons 42 tail
  Ref.write list tail
  pure tail
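
For illustration, here is roughly what that knot looks like at the reference-counting level, written as a C sketch. The names (`cell_t`, `cell_set_tail`, `demo_self_cycle`) are made up for this example and do not come from purec. The mutable `tail` field plays the role of the Ref, and the cell points directly at itself, which is the simplest form of the cycle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted cons cell with a mutable tail (the "Ref"). */
typedef struct cell {
    int refcount;
    int head;
    struct cell *tail;  /* owned reference, or NULL */
} cell_t;

static int cells_live = 0;

cell_t *cell_new(int head) {
    cell_t *c = calloc(1, sizeof *c);
    if (c == NULL) abort();
    c->refcount = 1;
    c->head = head;
    cells_live++;
    return c;
}

void cell_release(cell_t *c) {
    assert(c->refcount > 0);
    if (--c->refcount > 0) return;
    if (c->tail != NULL) cell_release(c->tail);
    free(c);
    cells_live--;
}

/* Like Ref.write: retain the new tail, store it, then release the old one. */
void cell_set_tail(cell_t *c, cell_t *tail) {
    cell_t *old = c->tail;
    if (tail != NULL) tail->refcount++;
    c->tail = tail;
    if (old != NULL) cell_release(old);
}

/* Returns how many cells are still alive after the only external
 * reference to a self-referential cell has been dropped. */
int demo_self_cycle(void) {
    cell_t *c = cell_new(42);
    cell_set_tail(c, c);      /* refcount is now 2: us plus the cell itself */
    cell_release(c);          /* refcount is 1 and can never reach 0 */
    int leaked = cells_live;  /* the cell is unreachable but still allocated */

    /* Breaking the cycle by hand lets the count reach zero. Touching c here
     * is only safe because we know the cell has not been freed yet. */
    cell_set_tail(c, NULL);
    return leaked;
}
```

Plain reference counting alone never reclaims that cell. It either needs the cycle broken explicitly, as at the end of `demo_self_cycle`, or some form of cycle detection.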

Thank you for elaborating on this. Given they both require mutation (and therefore FFI), I doubt we can statically pick up on those. I wonder if going hybrid would be possible, such that FFI allocated values could be gc-allocated or require an explicit release function be called on them.

I haven't read it yet, but collecting cycles seems to be a solved problem as well: https://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon01Concurrent.pdf. From a bit of research, many languages take this approach, Python and PHP to name two prominent ones.
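
As a rough idea of what cycle collection on top of reference counts can look like, below is a small C sketch of synchronous "trial deletion", loosely following my understanding of the approach in the linked paper. It is not the paper's concurrent algorithm and not anything purec is known to implement. All names (`node_t`, `node_link`, `collect_cycles`) are hypothetical, and the caller has to pass in a candidate root by hand instead of the collector keeping a buffer of candidates.

```c
#include <assert.h>
#include <stdlib.h>

enum { MAX_KIDS = 2 };
typedef enum { BLACK, GRAY, WHITE } color_t;  /* BLACK = presumed live */

typedef struct node {
    int refcount;
    color_t color;
    struct node *kids[MAX_KIDS];
} node_t;

static int nodes_live = 0;

node_t *node_new(void) {
    node_t *n = calloc(1, sizeof *n);  /* color starts as BLACK (0) */
    if (n == NULL) abort();
    n->refcount = 1;
    nodes_live++;
    return n;
}

/* Store child in an empty slot of parent; the slot retains the child. */
void node_link(node_t *parent, int slot, node_t *child) {
    assert(parent->kids[slot] == NULL);
    child->refcount++;
    parent->kids[slot] = child;
}

/* Ordinary release; on its own it never frees a cycle. */
void node_release(node_t *n) {
    assert(n->refcount > 0);
    if (--n->refcount > 0) return;
    for (int i = 0; i < MAX_KIDS; i++)
        if (n->kids[i] != NULL) node_release(n->kids[i]);
    free(n);
    nodes_live--;
}

/* Phase 1: subtract every reference that originates inside the subgraph. */
static void mark_gray(node_t *n) {
    if (n->color == GRAY) return;
    n->color = GRAY;
    for (int i = 0; i < MAX_KIDS; i++)
        if (n->kids[i] != NULL) { n->kids[i]->refcount--; mark_gray(n->kids[i]); }
}

/* Restore the counts below a node that turned out to be externally reachable. */
static void scan_black(node_t *n) {
    n->color = BLACK;
    for (int i = 0; i < MAX_KIDS; i++) {
        node_t *k = n->kids[i];
        if (k == NULL) continue;
        k->refcount++;
        if (k->color != BLACK) scan_black(k);
    }
}

/* Phase 2: a count above zero means an outside reference exists. */
static void scan(node_t *n) {
    if (n->color != GRAY) return;
    if (n->refcount > 0) { scan_black(n); return; }
    n->color = WHITE;
    for (int i = 0; i < MAX_KIDS; i++)
        if (n->kids[i] != NULL) scan(n->kids[i]);
}

/* Phase 3: free everything that stayed white. */
static void collect_white(node_t *n) {
    if (n->color != WHITE) return;
    n->color = BLACK;  /* guards against revisiting through the cycle */
    for (int i = 0; i < MAX_KIDS; i++)
        if (n->kids[i] != NULL) collect_white(n->kids[i]);
    free(n);
    nodes_live--;
}

void collect_cycles(node_t *candidate) {
    mark_gray(candidate);
    scan(candidate);
    collect_white(candidate);
}
```

The trick is that a garbage cycle is kept alive only by its own internal references. Once those are subtracted, every count in the cycle drops to zero, while anything still referenced from outside keeps a positive count and has its counts restored.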

This feature has been completed! 🎉

I am very happy with the state of the project now. All packages featured in the bundled package-set build fine, and their test suites run fine without leaking. Additionally, all upstream tests are passing, including leak checks. In addition to the reference counting GC, it's also possible to enable the tracing GC instead if required. The resulting binaries are small and perform at least on par with the JS equivalent. There are undoubtedly more optimizations that could be done at the corefn level, the support library level and the code-gen level, but so far the code is reasonably optimized and performs well enough for this to be a useful backend.

Congrats!