The ensureCapacity implementation
frajt opened this issue · 3 comments
Hi,
We have a change in the Trove ensureCapacity method which takes care that the grow of the underlying tables (_set, _values) is following the normal grow where the capacity is always doubled. Your (and Trove) implementation will always increase the capacity just to fit the new additional (or Trove's extra desired) size which will cause a lot of rehashing (including array re-allocations) when ensureCapacity is called many times for small increments. If the ensureCapacity is going to rehash, it should follow the normal capacity grow (Trove - double it, Koloboke - grow by configured scale).
Reimplemented ensureCapacity for Trove 2.0.x
@Override
public void ensureCapacity(int desiredExtraSize) {
if(desiredExtraSize > (super._maxSize - size())) {
final int desiredCapacity = ((int) ((desiredExtraSize + size()) / super._loadFactor)) + 1;
final int doubleCapacity = this.capacity() << 1;
this.rehash(PrimeFinder.nextPrime(Math.max(desiredCapacity, doubleCapacity)));
final int capacity = capacity();
this.computeMaxSize(capacity);
this.computeNextAutoCompactionAmount(capacity);
} else {
;
}
}
BTW: Do you have a page listing all changes made to the Trove interface? The ensureCapacity is to be called now with the expected total size which is quite a change to the desired size on Trove.
Michal
Currently it seems to me that such logic would make more harm than good. In the case you are performing many addAll/putAll calls, where each merged collection is small, indeed there will be more rehashes than with the approach you suggested. But it seems to be very rare case. On the other hand, if you perform only one addAll/putAll, with your approach there will be unneccesary memory overuse.
Note that with current approach you can workaround this behaviour, by performing ensureCapacity manually with sum of total additions:
totalAdd = collections.map(c -> c.size()).sum();
bigKolobokeCollection.ensureCapacity(bigKolobokeCollection.size() + totalAdd);
collections.forEach(bigKolobokeCollection::addAll);
With you suggestion, there would be NO way to enlarge capacity by a small margin.
Please correct me if I understood you question incorrectly, or you think puttingAll/addingAll many small collections is more common case.
BTW: Do you have a page listing all changes made to the Trove interface? The ensureCapacity is to be called now with the expected total size which is quite a change to the desired size on Trove.
I done this because I have found that I more often type something like c.ensureCapacity(capacityNeeded - c.size())
than c.ensureCapacity(addition)
. Note that ensureCapacity() is already called inside putAll
/addAll
, you shouldn't worry about that.
No, there is no such page yet.
I done this because I have found that I more often type something like c.ensureCapacity(capacityNeeded - c.size()) than c.ensureCapacity(addition). Note that ensureCapacity() is already called inside putAll/addAll, you shouldn't worry about that.
However, no. I thinked explaination why I did that change, but more likely I did that because ensureCapacity
name is expected to accept the total capacity rather than addition. I followed principle of least astonishment in API.