fflorent/nom_locate

Updating a span's fragment

progval opened this issue · 10 comments

Hi, I've just started updating a code base from nom 5 to nom 7 and stumbled upon this line:

    span.fragment = span.fragment.trim_end();

I'm not sure how to do something similar since now the fragment is private. Any idea?

Originally posted by @mpizenberg in #62 (comment)

Is this correct and safe? (our Span has no extra info)

    pub type Span<'a> = LocatedSpan<&'a str>;

    /// Trim the end of the fragment.
    /// This uses Span::new_from_raw_offset() which is marked `unsafe` but I believe this operation is
    /// safe so I didn't mark this function as unsafe too.
    fn trim_fragment_end<'a>(span: &Span<'a>) -> Span<'a> {
        unsafe {
            Span::new_from_raw_offset(
                span.location_offset(),
                span.location_line(),
                span.fragment().trim_end(),
                span.extra,
            )
        }
    }

@mpizenberg you have two solutions:

  1. update the parser so the characters you want to trim are not in the span in the first place, or
  2. use this method to obtain two new LocatedSpans, and discard the second one: https://docs.rs/nom_locate/4.0.0/nom_locate/struct.LocatedSpan.html#method.split_at_position_complete

I think that's correct, but the two methods I listed should work too and avoid unsafe (and should be more future-proof)
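To illustrate why trimming only the end can be considered sound, here is a minimal sketch that uses a hypothetical `MiniSpan` struct as a stand-in for a located span. It is not the nom_locate type, and its fields and methods are assumptions made for illustration. The offset and line describe where the fragment starts, so shortening the end leaves them valid, whereas trimming the start would require recomputing both.

```rust
// Hypothetical stand-in for a located span; NOT the nom_locate type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MiniSpan<'a> {
    offset: usize,     // byte offset of the fragment's first byte in the input
    line: u32,         // 1-based line number of the fragment's first byte
    fragment: &'a str, // the text covered by the span
}

impl<'a> MiniSpan<'a> {
    fn new(input: &'a str) -> Self {
        MiniSpan { offset: 0, line: 1, fragment: input }
    }

    /// Trimming the end leaves the start untouched, so offset and line carry over.
    fn trim_end(self) -> Self {
        MiniSpan { fragment: self.fragment.trim_end(), ..self }
    }

    /// Trimming the start moves the start, so offset and line must be recomputed.
    fn trim_start(self) -> Self {
        let trimmed = self.fragment.trim_start();
        let skipped = &self.fragment[..self.fragment.len() - trimmed.len()];
        let newlines = skipped.bytes().filter(|&b| b == b'\n').count() as u32;
        MiniSpan {
            offset: self.offset + skipped.len(),
            line: self.line + newlines,
            fragment: trimmed,
        }
    }
}

fn main() {
    let span = MiniSpan::new("\n  hello  \n");

    let end = span.trim_end();
    assert_eq!((end.offset, end.line, end.fragment), (0, 1, "\n  hello"));

    let start = span.trim_start();
    assert_eq!((start.offset, start.line, start.fragment), (3, 2, "hello  \n"));
}
```

The same reasoning is what makes a slice-based approach attractive: a library that tracks positions itself can do this bookkeeping, so the caller never has to pass an offset or line by hand.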

I see. I don't think (1) is possible, since this function is called in the first parser, which detects a paragraph block from the raw text. But I suppose (2) could be done by computing the length of the trimmed slice. Thanks, I'll try that.

From the name split_at_position_complete, I assumed that I would just have to give a position, but it seems I need to give a predicate function instead, which is evaluated on every item in the stream. Then I saw that there is a take() function, which keeps only a given length of the slice. That seems to be the one I want, to implement trimming like so:

    fn trim_fragment_end<'a>(span: &Span<'a>) -> Span<'a> {
        let len_after_trimmed = span.fragment().trim_end().len();
        span.take(len_after_trimmed)
    }

But unfortunately, this requires Self: Slice<RangeFrom<usize>> + Slice<RangeTo<usize>>, which I cannot provide since I'm directly using LocatedSpan<&'a str>, a type I don't own. Any idea on how to do this trimming efficiently without unsafe?

Hmm... what about span.slice(..len_after_trimmed)?

Thanks @progval, that seems to work! (It compiles at least; I'll have to run the tests.)

I had not seen that function while looking at the doc. PS: I recently watched a talk about how to improve the discoverability of Haskell type classes, and one of the pieces of advice was to also have named functions on the concrete types. I think this advice also works well for Rust's trait system. We can generally improve the discoverability of functions by adding dedicated functions on the type, and then just call these functions when implementing a trait.
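As a rough sketch of that advice, the example below puts the logic in inherent methods and has the trait implementation delegate to them. The `SliceTo` trait, the `Fragment` type, and the method names are all hypothetical and chosen for illustration; they are not the definitions used by nom or nom_locate.

```rust
// Hypothetical trait and type for illustration; not nom's or nom_locate's definitions.
trait SliceTo {
    fn slice_to(&self, end: usize) -> Self;
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Fragment<'a>(&'a str);

impl<'a> Fragment<'a> {
    /// Inherent method: listed on the type's own documentation page
    /// and callable without importing any trait.
    pub fn truncate_to(&self, end: usize) -> Fragment<'a> {
        Fragment(&self.0[..end])
    }

    /// A named convenience built on top of the inherent method.
    pub fn trim_end(&self) -> Fragment<'a> {
        self.truncate_to(self.0.trim_end().len())
    }
}

impl<'a> SliceTo for Fragment<'a> {
    /// The trait implementation simply delegates to the inherent method.
    fn slice_to(&self, end: usize) -> Self {
        self.truncate_to(end)
    }
}

fn main() {
    let frag = Fragment("paragraph text  \n");
    assert_eq!(frag.trim_end(), Fragment("paragraph text"));
    assert_eq!(frag.slice_to(9), Fragment("paragraph"));
}
```

With this layout, generic code can still go through the trait, while someone reading the type's documentation finds `truncate_to` and `trim_end` directly.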

Thanks again for the quick responses!

Cool!

To be honest, I did not find it in the doc either, I looked at the implementation of take...

btw @progval don't hesitate to close this issue. I cannot myself since you opened it.

Oops, indeed!