immutable.Bag.++ takes time linear in the size of both collections.
Opened this issue · 0 comments
While reading your code, I noticed one thing which might explain bad performance I'm seeing in some scenarios.
immutable.Bag.++
creates a completely new bag, which is very expensive. Instead, I expect it should reuse the ++
method of HashTries, which will share unchanged nodes of the trie. The implementation is in scala.collection.BagLike.++
, and not overriden in scala.collection.immutable.BagLike.++
. Indeed, the implementation of immutable.Bag.++
appeared to give correct results even in presence of issue #3.
In particular, this means that the cost of ++
is linear in the size of both arguments: adding 1 element to a bag with N elements takes Θ(N) time. Thus, building a bag of N elements by appending one element at a time takes time Θ(1 + 2 + 3 + ... + N) = Θ(N^2).