Iterators 'blocking' each other
Opened this issue · 2 comments
I suspect that this may be caused by #35 so don't worry for now.
This is a potential issue that I saw when perf testing the current PRs, but also saw in main.
If I run (on the main branch)
import { ArrayIterator, range } from '../dist/asynciterator.js'
for (let i = 0; i < 5; i++) {
const arr = new range(0, 2000000); // I've tested with both `range` and `ArrayIterator`
const iterator = arr
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item);
const now = performance.now();
iterator.on('data', () => {}).on('end', () => {
const end = performance.now();
console.log('elapsed', end - now);
});
}
then the result is
elapsed 14108.122984007001
elapsed 14112.1332180053
elapsed 14112.153782993555
elapsed 14112.245641991496
elapsed 14112.30153799057
(as opposed to ~3000ms for one)
I suspect the issue is that the end
event is called asynchronously so has to wait for everything else to complete before we get back to it.
I'm not entirely sure about my assumptions here but this should generate a significant amount of CPU contention between iterators as we're still single-threaded (-ish, I know there's some magic happening in V8) and there's no I/O anywhere.
On my machine, the time per item remains constant and the total time scales linearly with the overall number of items, regardless of the number of (quasi-)parallel chains. I get between 8.8 and 9.3 seconds for 10_000_000 items regardless of whether that's due to 1 chain going through 10_000_000 items or 5 chains going through 2_000_000 items each.
Curiously, distributing 10_000_000 items across 100 chains going through 100_000 items each results in roughly 1.25x faster performance, finishing in ~ 6 seconds. I don't think it's something to do with the end event.
Just ran some tests that seem to support what @jacoscaz is saying here
Experiment
import { ArrayIterator, range, AsyncIterator } from 'asynciterator'
for (let i = 0; i < 5; i++) {
const arr = range(0, 2000000); // I've tested with both `range` and `ArrayIterator`
const iterator = arr
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item)
.map((item) => item);
console.time(`End event emitted for iterator ${i}`)
console.time(`First data event emitted for iterator ${i}`)
console.time(`Middle data event emitted for iterator ${i}`)
console.time(`Last data event emitted for iterator ${i}`)
iterator.on('data', (x) => {
if (x === 0) {
console.timeEnd(`First data event emitted for iterator ${i}`)
}
if (x === 1000000) {
console.timeEnd(`Middle data event emitted for iterator ${i}`)
}
if (x === 2000000 - 1) {
console.timeEnd(`Last data event emitted for iterator ${i}`)
}
}).on('end', () => {
console.timeEnd(`End event emitted for iterator ${i}`)
});
}
Results
First data event emitted for iterator 0: 8.585ms
First data event emitted for iterator 1: 9.589ms
First data event emitted for iterator 2: 9.729ms
First data event emitted for iterator 3: 9.942ms
First data event emitted for iterator 4: 10.167ms
Middle data event emitted for iterator 0: 7.531s
Middle data event emitted for iterator 1: 7.531s
Middle data event emitted for iterator 2: 7.530s
Middle data event emitted for iterator 3: 7.531s
Middle data event emitted for iterator 4: 7.531s
Last data event emitted for iterator 0: 14.690s
Last data event emitted for iterator 1: 14.690s
Last data event emitted for iterator 2: 14.690s
Last data event emitted for iterator 3: 14.690s
Last data event emitted for iterator 4: 14.690s
End event emitted for iterator 0: 14.691s
End event emitted for iterator 1: 14.691s
End event emitted for iterator 2: 14.691s
End event emitted for iterator 3: 14.691s
End event emitted for iterator 4: 14.691s