kevinmehall/rust-peg

want to specify lifetime bound between `'input` and custom input lifetime `'a`

kingrongH opened this issue · 3 comments

Background

I have wrapped str in my own StrInput and StrInputSlice types and implemented ParseSlice for StrInput, so that the $() operator yields a StrInputSlice, which gives me access to the whole input str. This works nicely, and it is also the reason I chose rust-peg over pest.

pub struct StrInput <'a> {
    pub input: &'a str,
}

pub struct StrInputSlice<'a> {
    pub input: &'a str,
    pub start: usize,
    pub end: usize
}

impl<'a> StrInput<'a> {

    pub fn new(input: &'a str) -> Self {
        Self {input}
    }

    pub fn slice(&self, start: usize, end: usize) -> StrInputSlice {
        StrInputSlice {
            input: self.input,
            start, end
        }
    }
    
}

impl<'a> StrInputSlice<'a> {

    pub fn get_slice(&self) -> &'a str {
        &self.input[self.start..self.end]
    }

}

impl<'a> peg::ParseSlice<'a> for StrInput<'a> {

    type Slice = StrInputSlice<'a>;

    fn parse_slice(&'a self, p1: usize, p2: usize) -> Self::Slice {
        self.slice(p1, p2)
    }
}

Problem is here

For zero-copy reasons, I also want to return the sliced str from StrInput<'a>, but the code below produces a lifetime error. StrInputSlice is produced by ParseSlice::parse_slice with an implicit 'input lifetime, so we can't return a StrInputSlice with the 'a lifetime.

parser!(grammar test_parser<'a>() for StrInput<'a> {

    /// the reason I wrap str with `StrInput` and `StrInputSlice` is here. I want a way that I can access previous chars(look behind?). 
    rule check_previous() = slice: $(['*']+)  {?
        if slice.start == 0 {
            return Ok(());
        }
        // unwrap here is ok: slice.start > 0, so there is at least one char before the match
        let char_before = slice.input[..slice.start].chars().next_back().unwrap();
        if char_before != ' ' {
            return Err("should be preceded by space");
        }
        Ok(())
    }

    // Here we get an error:
    // StrInputSlice is produced by `ParseSlice::parse_slice` with an implicit `'input` lifetime,
    // so we can't return a StrInputSlice with the `'a` lifetime.
    rule parse_slice() -> StrInputSlice<'a> = slice:$([_]*) {
        slice
    }

});
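The look-behind check that `check_previous` is aiming for can be sketched outside the grammar, which makes it easy to test on its own. This is only a minimal sketch: `preceded_by_space` is a hypothetical helper, not part of rust-peg, and it assumes `start` is a byte offset that falls on a char boundary (as parser positions do).

```rust
/// Hypothetical helper (not part of rust-peg): returns true when the match
/// starting at byte offset `start` is at the beginning of `input` or is
/// directly preceded by a space.
fn preceded_by_space(input: &str, start: usize) -> bool {
    // Look at everything before the match; `next_back` yields its last char.
    // Assumption: `start` lies on a char boundary, otherwise slicing panics.
    match input[..start].chars().next_back() {
        None => true, // the match starts at offset 0
        Some(c) => c == ' ',
    }
}

fn main() {
    assert!(preceded_by_space("**bold", 0));
    assert!(preceded_by_space("a **bold", 2));
    assert!(!preceded_by_space("a**bold", 1));
}
```

Inside the rule, the same check would use `slice.input` and `slice.start`, which is exactly the information a plain `&str` slice would not carry.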

Thoughts

  1. Maybe rust-peg could add a way to use the 'input lifetime in the grammar declaration.
parser!(grammar test_parser() for StrInput<'input> {
// ...
});
  2. Add a way to declare a lifetime bound 'input: 'a.
parser!(grammar test_parser2<'a>() for StrInput<'a>
where 'input: 'a
{
    // ...
});

or maybe both 😀

#270 looks the same.
With StrInputSlice's lifetime specified as 'input, the following code actually works.

parser!(grammar test_parser<'a>() for StrInput<'a> {
    rule parse_slice() -> StrInputSlice<'input> = slice:$([_]*) {
        slice
    }
});

In my case, 'input and 'a are the same. The implicit 'input lifetime looks like a nice design: if you need a lifetime, you can just use 'input, and if you don't, you can ignore it as usual. It feels like a zero-cost abstraction. It's elegant, thanks for the excellent work. A little more documentation on use cases for the 'input lifetime would make it even nicer.

And it seems that a lifetime bound 'input: 'a is actually not necessary, because most of the time 'input and 'a will be the same for users.

I think I still need a way to make the following code work as expected, so I am reopening this issue.

    rule parse_slice() -> StrInputSlice<'a> = slice:$([_]*) {
        slice
    }

While you can probably make this work if you get the lifetimes just right (see #295), the implicitly-added &'input adds extra confusion here. #299 will allow you to declare grammar test_parser<'a>() for &'a StrInput<'a> {, or make StrInput impl Copy and not need the outer reference at all. Changing how the input type is declared is a breaking change to every rust-peg grammar, so this will have to wait for other changes to be released as 0.9, though.
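As an illustration of what "getting the lifetimes right" could look like, here is a sketch in plain Rust that keeps the borrow of the wrapper (`'input`) separate from the lifetime of the underlying text (`'a`) and makes `StrInput` `Copy`. The local `ParseSlice` trait only mirrors the shape used earlier in this thread so the example compiles without the peg crate. `first_word` is a hypothetical helper. Whether the code generated by `parser!` accepts such an impl is not verified here, and this is not necessarily the approach described in #295.

```rust
// Local stand-in mirroring the shape of `ParseSlice` as used in this thread
// (assumption: defined here only so the sketch builds without peg).
trait ParseSlice<'input> {
    type Slice;
    fn parse_slice(&'input self, p1: usize, p2: usize) -> Self::Slice;
}

#[derive(Clone, Copy)]
pub struct StrInput<'a> {
    pub input: &'a str,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct StrInputSlice<'a> {
    pub input: &'a str,
    pub start: usize,
    pub end: usize,
}

impl<'a> StrInputSlice<'a> {
    pub fn get_slice(&self) -> &'a str {
        &self.input[self.start..self.end]
    }
}

// Two independent lifetimes: `'input` is how long the wrapper is borrowed,
// `'a` is how long the underlying text lives. The slice only depends on `'a`.
impl<'input, 'a> ParseSlice<'input> for StrInput<'a> {
    type Slice = StrInputSlice<'a>;

    fn parse_slice(&'input self, p1: usize, p2: usize) -> Self::Slice {
        StrInputSlice { input: self.input, start: p1, end: p2 }
    }
}

// Hypothetical helper: the wrapper is a local value, yet the returned &str
// lives as long as `text`, because the slice never borrows the wrapper.
fn first_word<'a>(text: &'a str) -> &'a str {
    let wrapper = StrInput { input: text };
    let end = text.find(' ').unwrap_or(text.len());
    let slice = wrapper.parse_slice(0, end);
    slice.get_slice()
}

fn main() {
    let text = String::from("hello world");
    assert_eq!(first_word(&text), "hello");
}
```

With the single-lifetime impl from the original post (`impl<'a> ParseSlice<'a> for StrInput<'a>`), the wrapper would have to be borrowed for all of `'a`, so `first_word` could not return a slice derived from a local `StrInput`.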