Custom iterator lifetime allows use-after-free?

I was helping someone implement a safe wrapper around a C API which allocates and returns an array of strings. We ended up with the following implementation:

#[derive(Debug)]
pub struct StringArrayIter<'a> {
    ptr: *mut *mut c_char,
    len: usize,
    at: usize,
    _phantom: PhantomData<&'a CStr>,
}

impl<'a> StringArrayIter<'a> {
    pub fn new(ptr: *mut *mut c_char, mut len: usize) -> Self {
        // Try to find any early NULLs.
        unsafe {
            for i in 0..len {
                if *(ptr.add(i)) == ptr::null_mut() {
                    len = i;
                    break;
                }
            }
        }
        // Construct iterator.
        Self {
            ptr,
            len,
            at: 0,
            _phantom: PhantomData,
        }
    }
}

impl<'a> Iterator for StringArrayIter<'a> {
    type Item = &'a CStr;

    fn next(&mut self) -> Option<Self::Item> {
        let at = self.at;
        if at >= self.len {
            None
        } else {
            self.at += 1;
            Some(unsafe { CStr::from_ptr(*(self.ptr.add(at))) })
        }
    }
}

impl<'a> Drop for StringArrayIter<'a> {
    fn drop(&mut self) {
        unsafe {
            for i in 0..self.len {
                libc::free(*self.ptr.add(i) as *mut _);
            }
            libc::free(self.ptr as *mut _);
        }
    }
}

However, this iterator allows you to trigger a use-after-free:

let x: Vec<_> = some_func_which_returns_StringArrayIter()
       .filter_map(|s| s.to_str().ok())
       .collect();
// the &strs in x are references to freed memory because StringArrayIter
// has been dropped at this point. If you try to print them you end up
// with garbage:
println!("broken: {:?}", x);

We already fixed this by making the iterator just generate Strings -- but I wanted to ask why does Rust allow the filter_map(|s| s.to_str().ok()) to compile? My understanding of lifetime annotations is that the returned &CStr would have its lifetime tied to the StringArrayIter (after all, they're both 'a) and thus you should get a compilation error when you try to make &str live too long (or the Drop should be executed later). But this doesn't happen...

Here's a playground link. I had to fake the C library portion, but the behaviour is the same as the real code.

You're using unsafe. It exists, because working with FFI is always unsafe, lifetimes are not flexible enough to cover every situation and a ton of other low-level functionality, that requires you to know what you're doing, because Rust can't do that for you.

When you convert a raw pointer to a reference, you are responsible for making sure the type and lifetime produced are correct, not the compiler. unsafe is like a contract between you and the compiler. The compiler lets you do stuff, that could lead to undefined behavior, but you promise the compiler, that this won't happen. The compiler should never be unsound¹.

¹ There are some bugs marked as unsound in the Rust repository, but they are few and will be fixed. Your bug is not the fault of the Rust compiler (team).

1 Like

I understand that I'm using unsafe and that the root issue was that the lifetime of the reference was wrong. I wasn't trying to imply it's a Rust compiler bug or anything like that.

I'm just asking for an explanation of which aspect of my understanding of lifetimes (in particular having an associated lifetime with a struct) is wrong. In particular, why was the lifetime I was returning wrong? Isn't the <'a> associated lifetime enough to indicate to the compiler that "the strings returned by this iterator will only live as long as the iterator"?

The lifetime on struct means "this struct contains references, which live at least as long as the struct itself", not "at most". Compare this with simple struct Ref<'a>(&'a str).

1 Like

Oh. That makes a lot of sense. Thanks. :smile_cat:

If you want to make sure, the references don't escape the struct, you can only make them accessible via a method, that hands them out via closure, instead of returning them. This ensures the struct is as long alive as the reference.

The references returned by the iterator are allowed to live for 'a. What's 'a? Well, it's chosen by the caller of generate_iter, which is defined as follows:

fn generate_iter<'a>() -> StringArrayIter<'a> {

'a is unconstrained. It can be anything. It could be 'static, because it's not related to the lifetime of any input borrows or anything. 'a doesn't really mean anything in this code.

All (non-'static) lifetime parameters are derived from the stack. You need something on the stack to own the strings, and then you can reference them. Here's one way to do it. Note that the lifetime parameter of StringArrayIter is derived from the borrow of StringArray, which tells the compiler not to allow StringArray to be dropped while StringArrayIter exists.

1 Like