danielhenrymantilla/polonius-the-crab.rs

Lifetime may not live long enough, `'s` must outlive `'static`

spikespaz opened this issue · 3 comments

This is more of a support question, but I've been struggling with this trait for days.

I have two structs: SourceBytes<S: Iterator<Item = u8>>, and SourceChars<S: Iterator<Item = u8>>. SourceBytes is also an Iterator<Item = u8>, and SourceChars is an Iterator<Item = char>.

SourceBytes exists so that I can buffer bytes. It wraps an Iterator<Item = u8> and can buffer the next N bytes:

impl<S> Buffered for SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    type ItemSlice<'a> = &'a [u8] where Self: 'a;

    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        if self.buffer.len() < count {
            self.buffer
                .extend(self.iter.by_ref().take(count - self.buffer.len()));
        }
        self.buffer.get(0..count)
    }
}

Buffered is my own trait. I hope the implementation makes it self-explanatory, but if not, I will include a playground link.
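
For readers without the playground at hand, here is a minimal sketch of what the trait and the wrapper struct might look like. The trait definition, the field names `iter` and `buffer`, and the `new` constructor are assumptions inferred from the impl above; the post does not show them.

```rust
/// Hypothetical reconstruction of `Buffered`; the original definition is not shown.
pub trait Buffered {
    /// A borrowed view of the buffered items (a generic associated type).
    type ItemSlice<'a>
    where
        Self: 'a;

    /// Ensure `count` items are buffered and return a view of them,
    /// or `None` if the source runs out first.
    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>>;
}

/// Assumed layout: the wrapped iterator plus the bytes pulled from it so far.
pub struct SourceBytes<S> {
    iter: S,
    buffer: Vec<u8>,
}

impl<S: Iterator<Item = u8>> SourceBytes<S> {
    /// Hypothetical constructor, added for illustration only.
    pub fn new(iter: S) -> Self {
        Self { iter, buffer: Vec::new() }
    }
}

impl<S> Buffered for SourceBytes<S>
where
    S: Iterator<Item = u8>,
{
    type ItemSlice<'a> = &'a [u8] where Self: 'a;

    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        if self.buffer.len() < count {
            // Pull only the bytes that are still missing.
            let missing = count - self.buffer.len();
            self.buffer.extend(self.iter.by_ref().take(missing));
        }
        // `None` when the source could not supply `count` bytes.
        self.buffer.get(0..count)
    }
}
```

With this sketch, asking for more bytes than the source holds yields None, while repeated calls with smaller counts return prefixes of the same buffer.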

As for SourceChars, its job is to wrap a SourceBytes (or, more generally, a &mut S where S: Iterator<Item = u8> + Buffered<ItemSlice = &[u8]>). SourceChars is an Iterator<Item = char>, implemented like this:

impl<S> Iterator for SourceChars<S>
where
    S: Iterator<Item = u8>,
{
    type Item = char;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = [0; 4];
        // A single character can be at most 4 bytes.
        for (i, byte) in self.0.by_ref().take(4).enumerate() {
            buf[i] = byte;
            if let Ok(slice) = std::str::from_utf8(&buf[..=i]) {
                return slice.chars().next();
            }
        }
        None
    }
}

This reads up to four bytes from S (named for "source"), checking after each byte whether the bytes read so far form valid UTF-8. As soon as they do, it returns the decoded character.
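
The decoding step can be tried in isolation. The sketch below lifts the same loop into a free function; the name `next_char` is hypothetical and not part of the original code.

```rust
/// Hypothetical helper mirroring `SourceChars::next` on any byte iterator.
pub fn next_char<I>(src: &mut I) -> Option<char>
where
    I: Iterator<Item = u8>,
{
    // A single UTF-8 encoded character is at most 4 bytes long.
    let mut buf = [0u8; 4];
    for (i, byte) in src.by_ref().take(4).enumerate() {
        buf[i] = byte;
        // Stop as soon as the bytes read so far form one complete character.
        if let Ok(s) = std::str::from_utf8(&buf[..=i]) {
            return s.chars().next();
        }
    }
    // Either the source is exhausted or the bytes read were not valid UTF-8.
    None
}
```

Note that on invalid input this consumes up to four bytes before giving up, so a single bad byte also swallows the bytes that follow it.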

Now, the problem is implementing Buffered<ItemSlice = &str> for SourceChars<S>, given &mut SourceBytes for S.

impl<'s, S> Buffered for SourceChars<&'s mut SourceBytes<S>>
where
    S: Iterator<Item = u8> + 's,
{
    type ItemSlice<'a> = &'a str where Self: 'a;

    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        let mut src = self.0.by_ref();
        // for bytes in 0.. {
        //     let buf = src.buffer(bytes)?;
        //     if let Ok(slice) = std::str::from_utf8(buf) {
        //         if slice.chars().count() >= count {
        //             return Some(slice);
        //         }
        //     }
        // }
        let mut byte_count = 0;
        polonius_loop!(|src| -> Option<Self::ItemSlice<'polonius>> {
            byte_count += 1;
            let Some(buf) = src.buffer(byte_count) else {
                polonius_return!(None)
            };
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    polonius_return!(Some(slice));
                }
            }
        });
        None
    }
}

Rust shows an error originating from the polonius_loop! macro that says 's must outlive 'static, which I think is the kind of lifetime problem I am using polonius-the-crab to avoid in the first place.

error: lifetime may not live long enough
   --> src/parser/iter.rs:120:9
    |
103 |   impl<'s, S> Buffered for SourceChars<&'s mut SourceBytes<S>>
    |        -- lifetime `'s` defined here
...
120 | /         polonius_loop!(|src| -> Option<Self::ItemSlice<'polonius>> {
121 | |             byte_count += 1;
122 | |             let Some(buf) = src.buffer(byte_count) else {
123 | |                 polonius_return!(None)
...   |
129 | |             }
130 | |         });
    | |__________^ requires that `'s` must outlive `'static`
    |
    = note: this error originates in the macro `polonius_loop` (in Nightly builds, run with -Z macro-backtrace for more info)

Side note: I am not sure if impl<'s, S> is appropriate. It could look like:

impl<S> Buffered for SourceChars<&mut SourceBytes<S>>
where
    for<'s> S: Iterator<Item = u8> + 's,

But then, I think that is a step further away from what I want to accomplish. I don't understand higher-ranked trait bounds well enough to say this with certainty.

Either way, the error is the same, but instead of 's, it is '1.


What am I doing wrong here? Is this Buffered trait even possible in current stable Rust, given that I want <SourceChars<S> as Buffered>::ItemSlice = &str?

Link to the Playground.

FlipB commented

I hit this same problem. It seems to be related to referencing Self::ItemSlice rather than the concrete type in the polonius closure.

So instead of:

        let mut byte_count = 0;
        polonius_loop!(|src| -> Option<Self::ItemSlice<'polonius>> {
            byte_count += 1;
            let Some(buf) = src.buffer(byte_count) else {
                polonius_return!(None)
            };
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    polonius_return!(Some(slice));
                }
            }
        });
        None

Try this:

        let mut byte_count = 0;
        polonius_loop!(|src| -> Option<&'polonius str> {
            byte_count += 1;
            let Some(buf) = src.buffer(byte_count) else {
                polonius_return!(None)
            };
            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    polonius_return!(Some(slice));
                }
            }
        });
        None

spikespaz commented

@FlipB I could have sworn I tried that, but if it must be the latter, that should be considered a bug.

Either way, I eliminated usage of polonius-the-crab altogether (and hope to never need it again):

impl<S> Buffered for SourceChars<&mut S>
where
    for<'a> S: Iterator<Item = u8> + Buffered<ItemSlice<'a> = &'a [u8]> + 'a,
{
    type ItemSlice<'items> = &'items str where Self: 'items;

    // Allowed specifically here because the borrow checker rejects this valid code.
    #[allow(unsafe_code)]
    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        for byte_count in 0.. {
            let buf = self.0.buffer(byte_count)?;
            // SAFETY:
            //
            // This unsafe pointer coercion is here because of a limitation
            // in the borrow checker. In the future, when Polonius is merged as
            // the de-facto borrow checker, this unsafe code can be removed.
            //
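            // Round-tripping through a raw pointer detaches the byte slice
            // from the borrow of `self.0`, so that it can be returned with
            // the lifetime of `&mut self`. This is sound only because the
            // slice is never used after the next call to `self.0.buffer`.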
            //
            // This is referred to as the "polonius problem",
            // or more accurately, the "lack-of-polonius problem".
            //
            // <https://github.com/rust-lang/rust/issues/54663>
            let buf: *const [u8] = buf;
            let buf: &[u8] = unsafe { &*buf };

            if let Ok(slice) = std::str::from_utf8(buf) {
                if slice.chars().count() >= count {
                    return Some(slice);
                }
            }
        }
        unreachable!()
    }
}
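
For comparison, a safe variant seems possible by splitting the work in two: first find the byte count using short-lived borrows, then borrow once more for the value that is returned. This is a sketch under assumptions, not code from this thread; the trait, the `SourceBytes` fields, and `new` are hypothetical reconstructions.

```rust
pub trait Buffered {
    type ItemSlice<'a>
    where
        Self: 'a;
    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>>;
}

// Assumed layout; the original struct definitions are not shown in the thread.
pub struct SourceBytes<S> {
    iter: S,
    buffer: Vec<u8>,
}

impl<S: Iterator<Item = u8>> SourceBytes<S> {
    pub fn new(iter: S) -> Self {
        Self { iter, buffer: Vec::new() }
    }
}

impl<S: Iterator<Item = u8>> Buffered for SourceBytes<S> {
    type ItemSlice<'a> = &'a [u8] where Self: 'a;

    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        if self.buffer.len() < count {
            let missing = count - self.buffer.len();
            self.buffer.extend(self.iter.by_ref().take(missing));
        }
        self.buffer.get(0..count)
    }
}

pub struct SourceChars<S>(pub S);

impl<'s, S> Buffered for SourceChars<&'s mut SourceBytes<S>>
where
    S: Iterator<Item = u8> + 's,
{
    type ItemSlice<'a> = &'a str where Self: 'a;

    fn buffer(&mut self, count: usize) -> Option<Self::ItemSlice<'_>> {
        // Phase 1: find the byte count. Each borrow ends within its iteration,
        // so nothing borrowed here is ever returned.
        let mut byte_count = 0;
        loop {
            let buf = self.0.buffer(byte_count)?;
            match std::str::from_utf8(buf) {
                Ok(s) if s.chars().count() >= count => break,
                _ => byte_count += 1,
            }
        }
        // Phase 2: borrow again, unconditionally, for the returned value.
        let buf = self.0.buffer(byte_count)?;
        std::str::from_utf8(buf).ok()
    }
}
```

The cost is one extra call to buffer and a second UTF-8 validation on the final slice, which may or may not be acceptable compared with the unsafe version.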

I'm not sure if this issue ought to be closed.

FlipB commented

I don't know that this is the best way to solve this, but this is the solution I arrived at for a similar issue.