Size hint for internal iterator used by `many*` and `count`
m4rw3r opened this issue · 4 comments
Implementing Iterator::size_hint
will enable more efficient allocation for the containers. According to profiling a lot of time is spent allocating and reallocating in some situations, a better size_hint
would improve performance.
count
should yield(n, Some(n))
since it will always result inn
elements on success.many1
andsep_by1
should yield(1, None)
since they have no upper boundmany
,many_till
andsep_by
can use the default value of(0, None)
It might also be feasible to implement a combinator which is a hybrid of count and many, where specifying both a lower and upper bound is possible. This would make it a lot more efficient to allocate some parts (and by using monadic composition it can even reserve space for a certain known number of elements which was specified earlier in the message).
Spec for bounded many
fn many(Input<I>, R, Parser) -> ParseResult<I, T, E>
where R: BoundedRange,
Parser: FnMut(Input<I>) -> ParseResult<I, U, E>,
T: FromIterator<U>
trait BoundedRange { ... }
impl BoundedRange for Range { ... }
impl BoundedRange for RangeFull { ... }
impl BoundedRange for RangeFrom { ... }
impl BoundedRange for RangeTo { ... }
- Iteration should stop once the max value is reached (if it is specified by the range), no more than
n
items should be emitted unless the range is lacking an upper bound - A
size_hint
based on the range should be provided - If an error or incomplete is encountered outside of the range (ie. if fewer items than the lower bound have been emitted), the error should be propagated
- If an error is encountered inside of the range the parser should be considered complete and return the resulting
FromIterator
value - If an incomplete is encountered inside of the range of the parser it should be considered complete if the input is
END_OF_INPUT
and input.len is 0.
TODO
-
bounded::many
spec -
BoundedRange
trait -
bounded::many
impls- Replace uses of
iter::Iter
-
count
-
many
-
many1
-
sep_by
-
sep_by1
-
- Replace uses of
-
bounded::many_till
- Replace uses of
iter::IterTill
(many_till
)
- Replace uses of
-
bounded::skip_many
Aww, did you decide this wasn't feasible?
It is merged in a local copy of master which I have not yet pushed, along with #18. Currently I am polishing some things and updating all the doctests to use parse_only
instead of Input::new
and unwrap
. So it is not gone at all, just that the feature itself is done :)
The performance is promising, equal or slightly better performance in most cases except for in very small benchmarks.
It is merged in a local copy of master which I have not yet pushed,
Ah, gotcha. I'm just so used to seeing issues automatically closed by Github when the corresponding commit is pushed to master. When I didn't see the "X closed by Y" message, I just assumed that meant that it couldn't be done.
The performance is promising
Great!
@shepmaster You can try it out now if you want, 0.2.0 is out :)