Can you transmute lifetimes like this?

I have the following struct:

pub struct Deserializer<'de> {
    tape: Vec<&'de str>,
}

It's meant to store zero-copy strings, in theory at least.

To parse I have the following function and traits:

impl<'de> Deserializer<'de> {
    fn parse_to_vec<'s, S, B>(input: S, mut buffer: B) -> Self
    where
        S: Source<'s>,
        B: Buffer<'de>
    {
        Deserializer {
            tape: vec![buffer.append(input.to_source())],
        }
    }
}

pub trait Source<'s> {
    fn to_source(&self) -> &'s str;
}

pub trait Buffer<'b> {
    fn append<'src>(&mut self, src: &'src str) -> &'b str;
}

I have Source and Buffer as traits because I want to abstract over sources and buffers. For example, if I have a borrowed string, I don't need a buffer, so I can simply write:


impl<'b> Buffer<'b> for () {
    fn append<'src>(&mut self, src: &'src str) -> &'b str {
        // THING I'm worried about
        unsafe {transmute(src)}
    }
}

impl<'s> Source<'s> for &'s str {
    fn to_source(&self) -> &'s str {
        self
    }
}

pub fn main() {
    let Deserializer { tape } = Deserializer::parse_to_vec("Hello, world!", ());
    println!("{}", tape[0]); // prints: Hello, world!
}

However, is it OK to implement it like this? Is the transmute trick sound?

No.

impl<'b> Buffer<'b> for () {
    fn append<'src>(&mut self, src: &'src str) -> &'b str {
        // THING I'm worried about
        unsafe {transmute(src)}
    }
}

This allows one to change any lifetime to any other lifetime. What were you attempting to do?

Not sure what you mean by that. The traits are for use inside the crate only.

The idea is simple. Have a zero-copy parser, but allow for buffered readers, without duplicating my code.

  1. If you're given a string slice, as in Deserializer::parse_to_vec("Hello, world!", ()), the tape lives as long as the input string "Hello, world!".
  2. If you're given a reader and a buffer, as in Deserializer::parse_to_vec(reader, &mut Vec<u8>), the tape lives as long as the Vec<u8>.

Yes, I could always bind the lifetime to the &mut Vec<u8>, but then it wouldn't be a zero-copy parser, which is my goal.

It's a use-after-free, a segfault, undefined behavior. Here's the same thing closer to your original main. (If you don't care about that... I don't understand what you're asking.)
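
A minimal sketch of how that signature can be abused, assuming a local String as the short-lived input (the variable names are only for illustration). Nothing ties 'b to 'src, so the caller is free to pick 'static:

```rust
use std::mem::transmute;

pub trait Buffer<'b> {
    fn append<'src>(&mut self, src: &'src str) -> &'b str;
}

impl<'b> Buffer<'b> for () {
    fn append<'src>(&mut self, src: &'src str) -> &'b str {
        // Unsound: the caller chooses 'b, with no relation to 'src.
        unsafe { transmute(src) }
    }
}

fn main() {
    let owned = String::from("short-lived");
    // Accepted by the compiler even though `owned` is a local variable.
    let forged: &'static str = ().append(&owned);
    // Reading `forged` is only fine here because `owned` is still alive.
    println!("{}", forged);
    // Uncommenting the next two lines still compiles, and reads freed memory:
    // drop(owned);
    // println!("{}", forged);
}
```

The borrow checker can't help here because, as far as it can tell, `forged` doesn't borrow from `owned` at all.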

The only way a method body matching this signature can return src in some form and be sound:

pub trait Buffer<'b> {
    fn append<'src>(&mut self, src: &'src str) -> &'b str;
}

Is if 'src is required to outlive 'b, so that 'b cannot be extended beyond it:

pub trait Buffer<'b> {
    //        vvvvvvvv
    fn append<'src: 'b>(&mut self, src: &'src str) -> &'b str;
}

After which you don't need transmute, but do need to connect the lifetimes in parse_to_vec.

 impl<'b> Buffer<'b> for () {
     fn append<'src: 'b>(&mut self, src: &'src str) -> &'b str {
-        unsafe {transmute(src)}
+        src
     }
 }

 impl<'de> Deserializer<'de> {
     fn parse_to_vec<'s, S, B>(input: S, mut buffer: B) -> Self
     where
+        's: 'de,
         S: Source<'s>,
         B: Buffer<'de>
     {

(I haven't thought through your goals so far.)
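
For reference, here is roughly what the whole thing looks like with both changes applied. It's a sketch assembled from the snippets in this thread, not a tested design; the only liberty taken is writing `*self` in to_source to make the copy of the inner reference explicit.

```rust
pub struct Deserializer<'de> {
    tape: Vec<&'de str>,
}

pub trait Source<'s> {
    fn to_source(&self) -> &'s str;
}

pub trait Buffer<'b> {
    // 'src must outlive 'b, so the returned reference can never outlive the input.
    fn append<'src: 'b>(&mut self, src: &'src str) -> &'b str;
}

impl<'b> Buffer<'b> for () {
    fn append<'src: 'b>(&mut self, src: &'src str) -> &'b str {
        // Plain lifetime shortening, no transmute needed.
        src
    }
}

impl<'s> Source<'s> for &'s str {
    fn to_source(&self) -> &'s str {
        // Copy the inner &'s str out from behind &self.
        *self
    }
}

impl<'de> Deserializer<'de> {
    fn parse_to_vec<'s, S, B>(input: S, mut buffer: B) -> Self
    where
        's: 'de, // connects the source lifetime to the tape lifetime
        S: Source<'s>,
        B: Buffer<'de>,
    {
        Deserializer {
            tape: vec![buffer.append(input.to_source())],
        }
    }
}

fn main() {
    let Deserializer { tape } = Deserializer::parse_to_vec("Hello, world!", ());
    println!("{}", tape[0]);
}
```

With this version, trying to get a 'static tape out of a local String is rejected at compile time instead of compiling into a use-after-free.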

Perhaps:

pub trait Buffer<'b> {
    fn append<'src: 'b>(self, src: &'src str) -> &'b str;
}

impl<'b> Buffer<'b> for () {
    fn append<'src: 'b>(self, src: &'src str) -> &'b str {
        src
    }
}

impl<'b> Buffer<'b> for &'b mut String {
    fn append<'src: 'b>(self, src: &'src str) -> &'b str {
        let offset = self.len();
        self.push_str(src);
        &self[offset..]
    }
}

You can't give out references to parts of a String and then call push_str on it! It can crash or corrupt data, since the String may reallocate and invalidate every &str previously borrowed from it.

It needs more care, see:

BTW, you may need to replace &mut self with &self, and use RefCell and such to mutate. &mut references aren't just mutable, they are exclusive, and when a lifetime bound applies to both, both will have the same very restrictive exclusivity requirements.

It may be easier to have two different methods, one for borrowing from the input, one for using the Vec as an arena. Borrowing from the input is an easy case, and can be done entirely in safe Rust. Arenas are tricky and unsafe.

Yeah. It seems my only solution is to return offsets, and then create the nodes in the final step.
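
Something along these lines, as a rough sketch. OffsetBuffer, append and resolve are made-up names for illustration, and a String stands in for the Vec<u8>. Appending only returns byte ranges, so no reference into the buffer exists while it can still reallocate; the slices are created once at the end.

```rust
use std::ops::Range;

/// Hypothetical buffer that hands back offsets instead of references.
#[derive(Default)]
pub struct OffsetBuffer {
    data: String,
}

impl OffsetBuffer {
    /// Copies `src` into the buffer and returns the byte range it occupies.
    pub fn append(&mut self, src: &str) -> Range<usize> {
        let start = self.data.len();
        self.data.push_str(src);
        start..self.data.len()
    }

    /// Final step: once appending is done, turn the ranges into slices
    /// that borrow from the buffer.
    pub fn resolve<'b>(&'b self, ranges: &[Range<usize>]) -> Vec<&'b str> {
        ranges.iter().map(|r| &self.data[r.clone()]).collect()
    }
}

fn main() {
    let mut buffer = OffsetBuffer::default();
    let ranges = vec![buffer.append("Hello, "), buffer.append("world!")];
    // From here on the buffer is only borrowed immutably.
    let tape = buffer.resolve(&ranges);
    println!("{:?}", tape);
}
```

This is all safe Rust. The cost is that the borrowed-input case would need its own path (or its own kind of node) if it is to stay zero-copy.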