nicolasstucki/multisets

immutable Bag is too slow in a benchmark (linked)

Opened this issue · 2 comments

Hi @nicolasstucki! You might remember me from previous issues this week, and because we met at some conference (IIRC, Scala 2013).
I'm giving so many comments because I'm experimenting with some implementations of immutable bags, including something I quickly hacked together, and yours, and right now I have an example which appears much slower with your implementation. I don't claim it's necessarily because of your library — I'm using it for atypical needs and in atypical ways, but maybe you're interested in taking a look.

Furthermore, almost all examples are much (100x) slower with bags (either implementation) than with lists, though I hope this can be avoided. That's why I took a look at your library.

I'm still examining the first benchmark I wrote:

  def nestedLoopBags1(coll1: Bag[Int], coll2: Bag[Int]): Bag[Int] =
    for {
      i <- coll1
      j <- coll2
    } yield (i * N + j)

In fact, I'm studying the performance of some equivalent variants, including variants which re-execute incrementally only some operations when the inputs change.
I haven't properly profiled the execution time, so I don't know yet what's the cause of the slowdown, but I expect that #7 is involved.

To reproduce

I'm afraid the code isn't properly commented yet; but anyway:

  1. Checkout https://github.com/inc-lc/ilc-scala
  2. run sbt
  3. at the sbt prompt, run testOnly ilc.examples.handwritten.NestedLoop1Bench.
  4. Notice that nestedLoopBags3Incr is no faster than other bag operations, and that all bag benchmarks (Bag) are 100x slower than non-bag-benchmarks (those without Bag in the name).
  5. Switch to branch topic/homegrown-bag-tune (with a hand-crafted implementation of some Bag operations) and compare performance.

In most cases I would expect this lists to perform better for this particular benchmark because the bag builder is inherently more complex. In general full traversals or constructions of bags will not be as performant as other collections because of the bucket objects and some other boxing of values.

Another reason for them to not have the most performant implementations is the generic equivalence support. In most cases if you would fix the equivalence (in your case the compact representation with equivalence defined as equality) it is possible to have more efficient implementations of the operations. You could try to create a custom implementation of the HashBag where you remove the bagConfiguration as a parameter and override the operations that you require to be performant.

Although thanks to your remark I found a possible improvement on the last implementation of immutable.HashBag where I can remove one source of boxing on every operation.

Thanks for your answer. But have you looked at the quadratic performance in #7?

In most cases I would expect this lists to perform better for this particular benchmark because the bag builder is inherently more complex. In general full traversals or constructions of bags will not be as performant as other collections because of the bucket objects and some other boxing of values.

That matches my experience, but a 100x slowdown still seems somehow worrisome. But let's agree you're right at least until I can offer something more performant.