NonNull giving seg fault inside struct or tuple

I am implementing some primitive heap data structures for my own language and for that I have been using NonNull and manually drop (as drop would be called by the garbage collector). One strange thing I noticed about it is as below:

#[derive(Clone, Debug)]
struct Nod {
    name: String,
}

#[derive(Clone)]
struct Ptr {
    ptr: NonNull<ManuallyDrop<Nod>>,
}
impl Ptr {
    fn new(name: &str) -> Self {
        let mut x = ManuallyDrop::new(Nod {
            name: name.to_string()
        });
        let x_ptr = &mut x as *mut ManuallyDrop<Nod>;
        let ptr = unsafe {
            match NonNull::new(x_ptr) {
                Some(p) => p,
                None => unreachable!("x_ptr has successful allocation")
            }
        };
        Ptr {
            ptr,
        }
    }
}

Inside main when do something like

let x = Ptr::new("Hello");
let mut z = ManuallyDrop::new(Nod {
        name: "World".to_string()
    });
let z_ptr = &mut z as *mut ManuallyDrop<Nod>;
let n = unsafe {
    NonNull::new_unchecked(z_ptr)
};
unsafe {
    println!("n: {:?}", *(n.as_ptr()));
    println!("x: {:?}", *(x.ptr.as_ptr()));
}

Output in my terminal is:

n: ManuallyDrop { value: Nod { name: "World" } }
zsh: segmentation fault  cargo run

Playground link: Rust Playground

NOTE: The same behaviour is happening when I consider Ptr(NonNull<ManuallyDrop<Nod>>)

ManuallyDrop is not a heap-allocated object. It doesn't have a stable address. It itself can be destroyed and invalidated automatically, it just won't call drop when that happens.

When you do &mut x inside Ptr::new(), you're using temporary address of an on-stack variable that will be invalid as soon as the method exits. The String from name lives on the stack that will be overwritten.

You have to use Box::into_raw to get a valid pointer that outlives the function call.

9 Likes

Thanks @kornel . It worked like a charm!
For reference, made the below change as suggested by you:

fn new(name: &str) -> Self {
        let x = Box::new(ManuallyDrop::new(Nod {
            name: name.to_string()
        }));
        let x_ptr = Box::into_raw(x);
        let ptr = unsafe {
            match NonNull::new(x_ptr) {
                Some(p) => p,
                None => unreachable!("x_ptr has successful allocation")
            }
        };
        Ptr {
            ptr,
        }
    }

You don't need ManuallyDrop if you use into_raw, since the raw pointer won't be freed automatically.

You can use NonNull::new_unchecked relying on the fact that Box can never be a null pointer (in Rust's standard library memory allocations abort the program instead of letting you handle failures).

3 Likes

@kornel if I remove ManuallyDrop then how would I execute custom drop on Nod ? In case of ManuallyDrop we have ManuallyDrop::drop(&mut ManuallyDrop<T>) which calls Drop traits drop but what's the counter method if I only have NonNull<Nod> ?

Use Box::from_raw to convert the inner pointer back into a Box. When the box goes out of scope the destructor will run.

This also ensures the memory allocated by the Box gets deallocated.

3 Likes

@semicoleon no, I want to manually call the drop function on Nod. This basically represents the cleaning up which would be ran by garbage collector.

What's the difference between calling it manually and Box's drop implementation calling it?

You could do

drop(Box::from_raw(x_ptr));

But that's not meaningfully different from

Box::from_raw(x_ptr);

drop gets called the same either way.

Edit: To be clear I mean this is what you would do when you wanted to perform the clean up, not that you should just use Box to manage the lifetime for the whole program.

3 Likes

Are you looking for ptr::drop_in_place? That's what ManuallyDrop::drop calls. But if you got the pointer from a Box, just calling drop_in_place will leak memory. See the documentation on Box::into_raw for more details. In particular, as @semicoleon says, from_raw is how you best both drop the value and have the memory deallocated for pointers obtained from into_raw.

4 Likes

Below in my implementation for a dynamic list. I want to call manual_drop at my will.

use crate::backend::data::Data;
use std::alloc::{self, Layout};
use std::fmt::Display;
use std::marker::PhantomData;
use std::ptr;
use std::{mem::ManuallyDrop, ptr::NonNull};

// This unsafe code is heavily taken from the `Rustonomicon` book.
// See Implementing Vec section in `https://github.com/rust-lang/nomicon` and `https://doc.rust-lang.org/nomicon/` for more information.

#[derive(Clone)]
pub struct CoreListObject {
    ptr: NonNull<Data>,
    len: usize,
    cap: usize,
    _marker: PhantomData<Data>,
}

impl CoreListObject {
    fn pop(&mut self) -> Option<Data> {
        if self.len == 0 {
            None
        } else {
            self.len -= 1;
            unsafe { Some(ptr::read(self.ptr.as_ptr().add(self.len))) }
        }
    }

    fn len(&self) -> usize {
        self.len
    }

    fn cap(&self) -> usize {
        self.cap
    }

    fn grow(&mut self) {
        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<Data>(1).unwrap())
        } else {
            let new_cap = 2 * self.cap;
            let new_layout = Layout::array::<Data>(new_cap).unwrap();
            (new_cap, new_layout)
        };
        assert!(
            new_layout.size() <= isize::MAX as usize,
            "allocation too large"
        );
        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<Data>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };
        self.ptr = match NonNull::new(new_ptr as *mut Data) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }

    fn push(&mut self, elem: Data) {
        if self.len == self.cap {
            self.grow();
        }
        unsafe {
            ptr::write(self.ptr.as_ptr().add(self.len()), elem);
        }
        self.len += 1;
    }
}

impl Display for CoreListObject {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let mut s = "[".to_string();
        if self.len != 0 {
            unsafe {
                s.push_str(&format!("{}", *self.ptr.as_ptr().add(0)));
                for i in 1..self.len {
                    s.push_str(&format!(", {}", *self.ptr.as_ptr().add(i)));
                }
            }
        }
        s.push_str("]");
        write!(f, "{}", s)
    }
}

impl Drop for CoreListObject {
    fn drop(&mut self) {
        if self.cap != 0 {
            while let Some(_) = self.pop() {}
            let layout = Layout::array::<Data>(self.cap).unwrap();
            unsafe {
                alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout);
            }
        }
    }
}

#[derive(Clone)]
pub struct ListObject(NonNull<ManuallyDrop<CoreListObject>>);

impl ListObject {
    pub fn new() -> Self {
        let x = Box::new(ManuallyDrop::new(CoreListObject {
            ptr: NonNull::dangling(),
            len: 0,
            cap: 0,
            _marker: PhantomData,
        }));
        // Below we are using `Box` instead of directly doing `let x_ptr = &mut x as *mut ManuallyDrop<CoreListObject>`
        // because ManuallyDrop<T> is not heap-allocated (it's a local stack variable) and so reference we obtain directly
        // to it would be valid only to this function. So beyond this function ListObject would carry a reference to unallocated memory!
        // `Box` makes sure that `ManuallyDrop<CoreListObject>` is heap-allocated and any reference to it survive even
        // beyond this function.
        let x_ptr = Box::into_raw(x);
        let ptr = unsafe { NonNull::new_unchecked(x_ptr) };
        ListObject(ptr)
    }

    pub fn push(&self, elem: Data) {
        unsafe { (&mut *self.0.as_ptr()).push(elem) }
    }

    pub fn pop(&self) -> Option<Data> {
        unsafe { (&mut *self.0.as_ptr()).pop() }
    }

    pub fn len(&self) -> usize {
        unsafe { (&*self.0.as_ptr()).len() }
    }

    pub fn cap(&self) -> usize {
        unsafe { (&*self.0.as_ptr()).cap() }
    }

    pub fn manual_drop(&self) {
        unsafe {
            ptr::drop_in_place(self.0.as_ptr());
            ManuallyDrop::drop(&mut (*self.0.as_ptr()));
        }
    }
}

impl Display for ListObject {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        unsafe { write!(f, "{}", (*self.0.as_ptr()).to_string()) }
    }
}

Will this implementation also leak memory ? I am cleaning up the pointer I got from Box as well as deallocating the underlying heap memory. @quinedot @semicoleon

You're definitely still leaking the Box, and you're also dropping the inner object twice.

You don't need ManuallyDrop anymore, raw pointers never have any drop cleanup run on them

This is how you would reconstruct the Box to avoid leaking memory.

pub fn manual_drop(&self) {
    let boxed = unsafe {
        // Reconstruct the `Box` from the raw pointer
        Box::from_raw(self.0.as_ptr())
    };

    // The `Box` would be deallocated without the explicit drop here when the function returned.
    // The explicit drop call is just to make it clear when that happens.
    // `Box` will call `drop` on the type it contains.
    drop(boxed);
}
4 Likes

I think I now see the point, I can remove ManuallyDrop as NonNull won't drop the inner part automatically. For manual dropping the value I can do something like

fn manual_drop(&self) {
     unsafe {
            Box::from_raw(self.ptr.as_ptr());
        }
}
1 Like

Yeah, that's exactly what I want ! Thanks @semicoleon for pointing this out and @quinedot for pointing to the docs on drop_in_place.

1 Like

Sort of a side note, but when you're doing unsafe stuff it's good to get familiar with miri, which is a tool that catches (some) undefined behavior. But more relevant here is it complains about memory leaks. It can't run all Rust code, but its very useful for data structure stuff like this!

1 Like

thanks @semicoleon . I will look into that.

And that's one reason to never mix unsafe and Business Logic™.

Make your data structures with unsafe, ideally put into separate crate, test with Miri, unit tests, fuzzing and what-you-can-imagine… then use that crate from safe code in the Business Logic crate.

4 Likes