immutable.Bag doesn't extend `GenericTraversableTemplate[Bag[T]]`
Opened this issue · 7 comments
This means that the return type of e.g. flatten
is just Iterable[T]
. This should be hopefully easy to fix.
I haven't checked the other Bag variants yet, though I conjecture they need updating as well.
I'm not sure if the semantic of flatten would be correct for any custom equivalence. I'll have to analyse it depth. Could you tell me which is your specific use case.
I'm using a standard equality (compact
), and relying the equivalence b.flatMap(f).flatten == b.map(f)
. (More in particular, I'm saving the results of b.flatMap(f)
, to update it incrementally if b
undergoes a change). I haven't fully understood semantics of other cases, I'm still exploring your library.
Your comment about semantics seems interesting, but since flatten
needn't access implementation details, this seems likely to affect other client code. And indeed, if I consider bags of different sizes equivalent in my bb: Bag[Bag[T]]
, the size of bb.flatten
will be unpredictable with some implementations of flatten.
I currently uglily reimplemented flatten
like in the standard library, which is probably slower than needed.
implicit def config[T] = Bag.configuration.compact[T]
implicit class FlattenOps[T](b: Bag[Bag[T]]) {
def flattenB: Bag[T] = {
val builder = Bag.newBuilder[T]
for (baglet <- b)
builder ++= baglet
builder.result
}
}
You are probably right, extending it will work correctly.
I'm not sure I'm suggesting that. However, for standard equalities it'll work fine, and that's the usecase I'd like. For the rest, I'm not sure what correctly means.
Also, when I hacked together quickly a bag implementation, I actually had a potentially faster version, which multiplies multiplicities instead of readding elements:
case class Bag[A](contents: immutable.Map[A, Int] = immutable.HashMap())
//XXX: Not reusing collection infrastructure :-(
{
//...
def flatten[B](implicit p: A <:< Bag[B]): Bag[B] = {
new Bag(for {
(outerEl, count) <- contents
innerBag = p(outerEl)
(el, count2) <- innerBag.contents
} yield (el, count * count2))
}
}
Could such an implementation be supported in your library?
That may work for the compact representation but will fail on the others. But the implementation of flatten might be just a union
over the bags. Which would internally do something similar to the code you wrote when using the compact representation.
Thanks, I was indeed just supporting a compact representation.
I've tried out your suggestion. However, as currently implemented, that appears to take quadratic time, because union
takes time linear in the size of both collections (see new issue #7), and the accumulator keeps growing. (In fact, I've not waited for my benchmark to terminate).
The complexity is thus equivalent to the one of the code below, which has more visibly quadratic cost:
var acc = Array.empty
for (bag <- bags) {
val newAcc = Array.empty
newAcc += acc //cost: O( | acc | ), which keeps increasing
newAcc += bag //cost: O( | bag | )
acc = newAcc
}
I forgot to write the implementation of flattenB
I tried out:
def flattenB: Bag[T] =
b.fold[Bag[T]](Bag.empty)(_ union _)
Apparently, that fold
resolves to foldLeft
, though foldRight
wouldn't make a difference.