TimLariviere/Fabulous-new

[Performance] Implement StackArray of several sizes and use it for storing Attributes and Children

Closed this issue · 0 comments

twop commented

Context

Currently adding an attribute to a widget involves a heap allocation of a new array.

Copy pasting (at the moment of creation of this issue):

static member inline AddScalarAttribute(this: ^T :> IWidgetBuilder, attr: ScalarAttribute) =
    let attribs = this.ScalarAttributes
    let attribs2 = Array.zeroCreate(attribs.Length + 1) // <--- allocating a new one
    Array.blit attribs 0 attribs2 0 attribs.Length
    attribs2.[attribs.Length] <- attr

    let result =
        (^T: (new : ScalarAttribute [] * WidgetAttribute [] * WidgetCollectionAttribute [] -> ^T) (attribs2,
                                                                                                   this.WidgetAttributes,
                                                                                                   this.WidgetCollectionAttributes))

    result

Problems with that approach

  1. We allocate a new array on the heap for each attribute. Note that this gives amazing ergonomics due to immutability that allows safe reuse of widgets, so it is there for a good reason!
  2. Most of the time (like 99.9%) these prev allocated arrays (let attribs = this.ScalarAttributes) are short lived and apply pressure on GC
  3. Cache misses during attribute diffing. E.g. In order to traverse attributes we need to follow the pointer where these attributes are located.

Proposed solution

Use a technique commonly used in high performance computing when you know that arrays are going to be mostly small.

pseudocode:

[<Struct>]
type StackArrayN<'a> = {
  Size: uint
  Item0: 'a // initialized with 'default'
  Item1: 'a
  ....
  ItemN: 'a

  PostN: 'a[] // needs to be initialized with null. e.g. 'default'
}

As you might guessed it is a bit tricky to work with and it needs to be explicitly written for each N. BUT it doesn't allocate anything before we reach N. Which makes traversal and creation very CPU cache friendly (in a happy path)

Luckily we have only a handful of use cases when it is applicable

Usecases

  1. Storing Attributes in a Widget, double win for creation and diffing perf
  2. Children of Widgets, usually we have at most 1 type of children attributes with some exceptions, thus StackArray1 might be a good candidate for those, although I'm not sure if we can actually do that given the recursive nature of types.
  3. Any other small collections

Notes

  1. Exact N needs to be determined from some existing code bases. Ideally we would run a project with a build that can capture usage profile. I would aim for p90 if it is not too big of a number. From what I saw it is probably somewhere between 5 and 10. @TimLariviere you are probably more knowledgeable in that regard.

Downsides

  1. Storing Widgets will consume more memory. Assuming we use N=5(TBD) for Attributes. For 1000 Widgets (which is a pretty large number) 5 * 32 bytes (upper limit of Attribute size) * 1000 = 150Kb. Totally fine by me
  2. It is super cumbersome to work with, how to iterate them with no dynamic dispatch nor allocations?
  3. Sorting is probably going to be a pain

Links/Resources