I am unable to perform this. I believe the reason is that rustc thinks the Option<Pin> might be Some(exists_pin), which is why it refuses the overwrite, even though it is indeed None.
My other question is whether Pin::new_unchecked(&mut *mh) is unsound. I think it is safe, but Google AI insists it is unsound because the local reference will be dropped at the end of the function, and code like static N: i32 = 9; return &N; is a special case hard-coded into the compiler, making it sound, but only for that particular case.
pub fn mbuf_alloc_shared<T: Default>(mut nb: usize) -> Pin<&'static mut MyBuffer<T>> {
let mut mbuf = MyBuffer::<T>::new();
// mbuf.mem_head is None
nb += size_of::<MemHead<T>>();
let layout = Layout::from_size_align(nb, align_of::<MemHead<T>>()).unwrap();
let mem_head: Pin<&'static mut MemHead<T>> = unsafe {
let mh = alloc(layout) as *mut MemHead<T>;
ptr::write(ptr::addr_of_mut!((*mh).handle), -(nb as isize));
ptr::write(
ptr::addr_of_mut!((*mh).u),
U {
mem: ManuallyDrop::new((0, mem_head_addref_1::<T>)),
},
);
ptr::write(ptr::addr_of_mut!((*mh).any), T::default());
Pin::new_unchecked(&mut *mh)
};
mbuf.mem_head = Some(mem_head);
todo!()
}
Edit: the first problem turned out to be that mbuf is itself a Pin; my fault.
Regarding the use of new_unchecked, you cannot make any deduction about the soundness of your function, because creating a pinned value using new_unchecked places very strict requirements on the entire API surface which you provide for your pinned-pointee. For example, providing a bad pin projection for MyBuffer could make this unsound, if that projection lets you move out of a value. In other words, you cannot reason locally about using new_unchecked, only globally.
In this case, your static reference allocation must never be dropped. If this is not the case, i.e. your API somehow lets you drop it, your function is unsound.
I will free the &'static mut ref at some point, when the reference count reaches zero.
It seems there is some danger here, but I can't figure out how to resolve it.
Anyway, I need a reference, and a reference must carry a lifetime bound, or an unbounded lifetime, whatever.
The lifetime can't be deduced, because it is in fact just a C-style buffer:
I can make sure pch is never accessed through a reference once it has been freed, but that is entirely up to my own logic. When I bring this into Rust and turn pch into a &'a ref, what should 'a be? If a &'static ref means pch can never be freed, that is unacceptable.
I don't know how to make this kind of code more Rust-style; well, that is another problem.
My original question is: can I use &'static mut *pch and remain sound, or is the only way Box::leak(Box::new(struct {}))?
I don’t think it should be modeled as a reference / borrow, if I’m understanding this correctly. It should, semantically, own the buffer and be responsible for freeing it. Is the situation analogous to Rust’s Rc<T> or Arc<T>? Those are reference-counted pointers that collectively own a T and drop it once there’s no remaining references. See also Pin<Rc<T>> and Pin<Arc<T>>.
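To make the Rc comparison concrete, here is a minimal sketch of shared, pinned ownership with Pin<Rc<T>>. The Buffer type and its dropped flag are hypothetical and exist only so the demo can observe when the value is destroyed; the real buffer in this thread has a different layout.

```rust
use std::cell::Cell;
use std::pin::Pin;
use std::rc::Rc;

// Hypothetical buffer type; `dropped` only exists so the demo can observe Drop.
struct Buffer<'a> {
    bytes: Vec<u8>,
    dropped: &'a Cell<bool>,
}

impl Drop for Buffer<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns (payload length, dropped after the first handle is gone,
// dropped after the last handle is gone).
fn shared_buffer_demo() -> (usize, bool, bool) {
    let dropped = Cell::new(false);
    // Rc::pin allocates the value on the heap and pins it there.
    let first: Pin<Rc<Buffer<'_>>> = Rc::pin(Buffer {
        bytes: vec![0; 16],
        dropped: &dropped,
    });
    // Cloning the handle only bumps the reference count.
    let second = first.clone();
    let len = second.bytes.len();
    drop(first);
    let after_first = dropped.get();
    drop(second);
    (len, after_first, dropped.get())
}
```

No lifetime appears on the handle itself: the buffer lives exactly as long as at least one handle does, which is the behaviour a &'static mut reference cannot express.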
Though if you want an even easier solution, there’s probably libraries out there that do what you want. Personally, I use yoke for self-referential structs in Rust.
Also, note that &'static T references, if exposed to safe code, mean that the backing T can never be dropped. Types which fulfill a 'static lifetime bound can be kept around arbitrarily long, or dropped at any time (so long as they are not borrowed). The problem is that a &'static T reference could be kept for an arbitrarily long time, so the backing T is always borrowed and cannot be dropped. If the &'static T is restricted to solely unsafe code, the unsafe code may be able to soundly “lie” about the lifetime without causing any use-after-frees or other memory unsafety. But that takes a lot of care, and the reference should be carefully guarded to avoid spreading that lie to something expecting a true 'static lifetime.
If you mean that you’re giving a literal C buffer to Rust with FFI, that’s an entirely different can of worms. Getting multiple languages to interact through FFI can be complicated.
And the logic is complicated: there are at least 6 kinds of allocation and free logic, and the memory is not flat. That is why I must reimplement the logic in Rust.
I don't really understand those struct definitions without further context. For whatever it's worth, translating it to Rust 1-to-1 could be quite difficult, but that doesn't mean you can't make something in Rust with exactly the same API and performance characteristics. Usually it's about reorganizing code and making ownership more clear. On boundaries where normal ownership and borrowing rules don't work well, you can create and use unsafe functions for your internal interfaces.
For the core Pin<_> matter... I think something like Pin<AliasableBox<M>> would be much better than Pin<&'static mut M> or even Pin<&'a mut M>. The owned Pin<AliasableBox<M>> container would let the M be dropped when the whole Pin<_> is dropped. (I mention aliasable::boxed::AliasableBox because the standard library's Box type has stricter aliasing semantics, and its soundness in self-referential structs seems uncertain to me.) However, assuming you need some interesting refcount stuff, presumably you should make your own reference-counting container, say RefcountedMemHead<T>, to use in place of AliasableBox<M>, so that dropping RefcountedMemHead<T> only drops the inner MemHead<T> when it's supposed to. (Storing the refcount in the MemHead might be a good idea; that way, you don't need a separate heap allocation for the refcount of RefcountedMemHead<T>.) RefcountedMemHead<T> could have a NonNull<MemHead<T>> field, and have some method akin to aliasable::boxed::AliasableBox::from_unique_pin for converting a Pin<Box<MemHead<T>>> into a Pin<RefcountedMemHead<T>>.
I'm not sure if alloc() is alloc::alloc (or std::alloc::alloc)... if so, I think you should try something like the below:
// Type annotations included solely for clarity, not out of necessity
let mut mem_head: Box<MaybeUninit<MemHead<T>>> = Box::new_uninit();
// Raw pointer to the still-uninitialized `MemHead<T>`, used for the field writes below
let mh: *mut MemHead<T> = mem_head.as_mut_ptr();
// Initialize **every** field of `MemHead`
// SAFETY: ..
unsafe { ptr::write(ptr::addr_of_mut!((*mh).handle), -(nb as isize)); }
// SAFETY: ..
unsafe {
    ptr::write(
        ptr::addr_of_mut!((*mh).u),
        U {
            mem: ManuallyDrop::new((0, mem_head_addref_1::<T>)),
        },
    );
}
// SAFETY: ..
unsafe { ptr::write(ptr::addr_of_mut!((*mh).any), T::default()); }
// SAFETY: every field of the `MemHead<T>` has been written to and initialized.
let mem_head: Box<MemHead<T>> = unsafe { mem_head.assume_init() };
let mem_head: Pin<Box<MemHead<T>>> = Box::into_pin(mem_head);
let mem_head: Pin<RefcountedMemHead<T>> = RefcountedMemHead::from_pinned_box(mem_head);
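RefcountedMemHead and from_pinned_box in the snippet above are names I made up, so here is one rough, single-threaded sketch of what such a container might look like. The MemHead below is a simplified, hypothetical stand-in (just a refcount and the any field), not the real struct from the question, and ref_count is an extra helper added only for illustration.

```rust
use std::cell::Cell;
use std::ops::Deref;
use std::pin::Pin;
use std::ptr::NonNull;

// Hypothetical, simplified stand-in for the thread's MemHead<T>.
pub struct MemHead<T> {
    refcount: Cell<usize>, // lives in the head, so no separate allocation is needed
    pub any: T,
}

impl<T> MemHead<T> {
    pub fn new(any: T) -> Self {
        MemHead { refcount: Cell::new(0), any }
    }
}

// Owning handle; the last handle to be dropped frees the MemHead.
pub struct RefcountedMemHead<T> {
    ptr: NonNull<MemHead<T>>,
}

impl<T> RefcountedMemHead<T> {
    pub fn from_pinned_box(head: Pin<Box<MemHead<T>>>) -> Pin<Self> {
        // SAFETY: the box is turned straight into a raw pointer; the pointee never moves.
        let boxed = unsafe { Pin::into_inner_unchecked(head) };
        boxed.refcount.set(1);
        // SAFETY: Box::into_raw never returns null.
        let ptr = unsafe { NonNull::new_unchecked(Box::into_raw(boxed)) };
        // SAFETY: handles only hand out shared references, so the pointee
        // stays in place until the last handle frees it.
        unsafe { Pin::new_unchecked(RefcountedMemHead { ptr }) }
    }

    // Extra helper for illustration: current number of live handles.
    pub fn ref_count(this: &Pin<Self>) -> usize {
        this.refcount.get()
    }
}

impl<T> Deref for RefcountedMemHead<T> {
    type Target = MemHead<T>;
    fn deref(&self) -> &MemHead<T> {
        // SAFETY: the allocation stays live while any handle exists.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Clone for RefcountedMemHead<T> {
    fn clone(&self) -> Self {
        self.refcount.set(self.refcount.get() + 1);
        RefcountedMemHead { ptr: self.ptr }
    }
}

impl<T> Drop for RefcountedMemHead<T> {
    fn drop(&mut self) {
        let remaining = self.refcount.get() - 1;
        self.refcount.set(remaining);
        if remaining == 0 {
            // SAFETY: the pointer came from Box::into_raw and this was the last handle.
            drop(unsafe { Box::from_raw(self.ptr.as_ptr()) });
        }
    }
}
```

Because the handle holds a NonNull, it is neither Send nor Sync, which matches the non-atomic Cell counter. A thread-safe variant would need an atomic counter and explicit Send/Sync reasoning.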
When you're fiddling with memory, you're likely to need to stick to raw pointers in internal logic; it may be better to expose guard structs in the public API which store raw pointers and Deref to borrowed references. E.g., RefCell does not hand out plain references, it hands out a struct like Ref or RefMut that internally has a plain pointer (a NonNull pointer), but the guards implement Deref. Rc is sort of similar with its use of raw pointers.
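As a rough illustration of the guard pattern, the sketch below keeps a raw pointer internally and only hands out a guard that implements Deref. Slot, SlotRef, and reader_count are hypothetical names, and this is far simpler than what RefCell actually does (there is no mutable borrow tracking here).

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;

// Hypothetical owner of one heap value, tracked through a raw pointer.
pub struct Slot<T> {
    ptr: NonNull<T>,
    readers: Cell<usize>,
}

// Guard handed out by `Slot::borrow`; the `'a` ties it to the Slot it came from.
pub struct SlotRef<'a, T> {
    ptr: NonNull<T>,
    readers: &'a Cell<usize>,
    _borrow: PhantomData<&'a T>,
}

impl<T> Slot<T> {
    pub fn new(value: T) -> Self {
        // SAFETY: Box::into_raw never returns null.
        let ptr = unsafe { NonNull::new_unchecked(Box::into_raw(Box::new(value))) };
        Slot { ptr, readers: Cell::new(0) }
    }

    pub fn borrow(&self) -> SlotRef<'_, T> {
        self.readers.set(self.readers.get() + 1);
        SlotRef { ptr: self.ptr, readers: &self.readers, _borrow: PhantomData }
    }

    pub fn reader_count(&self) -> usize {
        self.readers.get()
    }
}

impl<T> Deref for SlotRef<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: the guard borrows the Slot, so the allocation is still live.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Drop for SlotRef<'_, T> {
    fn drop(&mut self) {
        self.readers.set(self.readers.get() - 1);
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        // SAFETY: the pointer came from Box::into_raw in `new`, and no guard
        // can outlive the Slot, so nothing else refers to the allocation.
        drop(unsafe { Box::from_raw(self.ptr.as_ptr()) });
    }
}
```

The public API never exposes the raw pointer or a long-lived reference; callers only see the guard, and its lifetime is checked by the compiler.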
Another thing - I don't really ever see union be used in Rust; it's effectively a mem::transmute (which you should probably use instead?). I can't tell what struct list_head is... is it substantially different from a tuple of pointers? Avoiding any transmutes altogether would be nice.
Next, it seems like you're using a usize as an equivalent of uintptr_t. If it's used for pointer offsets or something, that's fine, but be careful to note that it's NOT a good idea to cast a usize to a pointer; if you find the need to do that, use a pointer. (Maybe a type-erased pointer like *mut (), which people sometimes use similarly to C's void*. Though in most cases a raw pointer to a generic parameter is preferable, like *mut T or something. *mut () is used when a generic parameter isn't an option.)
Even though a pointer is just an address at runtime (in realistic scenarios), compilation also considers the "provenance" of the pointer for the purpose of applying optimizations and whatnot. You can always store a usize in a pointer type (so long as you just use it as a number, and don't dereference it), but an address shouldn't be used as a pointer. Pointer provenance is actually relevant in C and C++ too, but given that people in C/C++ play much looser with pointers, C/C++ compilers have no choice but to be much more forgiving with provenance. Rust took the opportunity to have stricter provenance rules.
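A small sketch of what that means in practice, assuming a C-style layout where a header is followed by payload bytes in the same allocation. Header, payload_ptr, and header_then_payload are hypothetical names. The payload pointer is derived by offsetting the original pointer, and addresses are only converted to usize to do arithmetic on them as numbers.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};

// Hypothetical header that sits in front of a byte payload.
#[repr(C)]
struct Header {
    len: usize,
}

/// Pointer to the payload bytes that follow the header in the same allocation.
///
/// # Safety
/// `base` must point into an allocation of at least `size_of::<Header>()` bytes.
unsafe fn payload_ptr(base: *mut Header) -> *mut u8 {
    // Offsetting the original pointer keeps its provenance; no usize round trip.
    unsafe { base.cast::<u8>().add(size_of::<Header>()) }
}

// Returns (stored length, payload offset in bytes, last payload byte).
fn header_then_payload() -> (usize, usize, u8) {
    const PAYLOAD: usize = 4;
    let layout =
        Layout::from_size_align(size_of::<Header>() + PAYLOAD, align_of::<Header>()).unwrap();
    // SAFETY: the layout has non-zero size, and every pointer used below
    // stays inside the allocation it was derived from.
    unsafe {
        let base = alloc(layout) as *mut Header;
        if base.is_null() {
            handle_alloc_error(layout);
        }
        base.write(Header { len: PAYLOAD });
        let payload = payload_ptr(base);
        payload.write_bytes(0xAB, PAYLOAD);
        // Addresses are fine as plain numbers, e.g. for computing an offset.
        let offset = payload as usize - base as usize;
        let result = ((*base).len, offset, payload.add(PAYLOAD - 1).read());
        dealloc(base.cast::<u8>(), layout);
        result
    }
}
```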
In general, if you are assigning the lifetime static to a memory location you will eventually free, your code is definitely unsound. Your function is unsafe, which means that your code can be used correctly, if the caller ensures they uphold some invariants, and if they do not, your program may exhibit undefined behavior. In practice, your function is almost impossible to use correctly, since you have made untrue claims in the type system by misusing an unsafe function, and in particular, ANY safe users of your code trivially exhibit unsoundness if they use your reference in any way after you free it.
Box::leak doesn't help you here. You cannot free memory leaked by Box::leak if the lifetime you pick to leak the memory with is a static one.
Indeed, static is unacceptable since it means you will not free the memory.
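If the allocation has to be freed later, one option is to skip the reference entirely and keep a raw pointer from Box::into_raw, then rebuild the Box with Box::from_raw when it is time to free. A minimal sketch, where Tracked and leak_then_reclaim are hypothetical names used only to observe the drop:

```rust
use std::cell::Cell;

// Hypothetical payload; the flag records that Drop ran.
struct Tracked<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns (dropped while only the raw pointer exists, dropped after reclaiming).
fn leak_then_reclaim() -> (bool, bool) {
    let dropped = Cell::new(false);
    // Hand ownership over to a raw pointer; nothing claims a 'static borrow.
    let raw: *mut Tracked<'_> = Box::into_raw(Box::new(Tracked { dropped: &dropped }));
    let while_leaked = dropped.get();
    // SAFETY: `raw` came from Box::into_raw and is turned back into a Box exactly once.
    drop(unsafe { Box::from_raw(raw) });
    (while_leaked, dropped.get())
}
```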
You really only have a couple feasible choices to resolve the lifetime issue:
mark your function unsafe and have it return a pointer, NOT a reference. Anywhere you believe that you "need" a reference, use a pointer instead.
mark your function unsafe and have it return a reference of some parametric lifetime, NOT of a static lifetime:
then document the requirements of the caller to select a correct lifetime (i.e., one which is strictly smaller than the smallest scope containing the matching mbuff_free or whatever method is responsible for freeing your memory).
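For the second option, a signature along these lines is what I have in mind. This is only a sketch: mbuf_as_mut and scoped_access_demo are hypothetical names, and the Box allocation stands in for whatever your real allocator does.

```rust
/// Hypothetical accessor: turns a raw buffer pointer into a reference whose
/// lifetime `'a` is chosen by the caller instead of being `'static`.
///
/// # Safety
/// - `ptr` must be non-null, aligned, and point to an initialized `T`.
/// - No other reference to the pointee may be used while the result is live.
/// - `'a` must end before the buffer is freed.
pub unsafe fn mbuf_as_mut<'a, T>(ptr: *mut T) -> &'a mut T {
    unsafe { &mut *ptr }
}

// Allocates, mutates through a short-lived reference, then frees.
fn scoped_access_demo() -> i32 {
    let raw: *mut i32 = Box::into_raw(Box::new(1));
    {
        // SAFETY: `raw` is valid and unaliased; `r` dies at the end of this
        // block, before the allocation is freed below.
        let r: &mut i32 = unsafe { mbuf_as_mut(raw) };
        *r += 41;
    }
    // SAFETY: `raw` still points to the live allocation.
    let value = unsafe { *raw };
    // SAFETY: `raw` came from Box::into_raw and is freed exactly once.
    drop(unsafe { Box::from_raw(raw) });
    value
}
```

The compiler cannot check that the caller picked a short enough lifetime, which is why the function stays unsafe and the requirements live in the doc comment.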
Since it seems your type is self-referential in a way which is not easily undone, I would suggest not returning a Pin at all; instead, return a struct which wraps a *mut T and does whatever internal operations you need on this type. Your struct will be responsible for making sure that the object pointed to by the *mut T is never moved from, i.e. that the pointee is in fact pinned. It seems you have your own implementation of refcounting; since you are porting from C, I would suggest keeping your Rust implementation close to your C one for simplicity. Once you have an initial implementation which has a safe interface, consider moving from your own refcount implementation to Rc/Arc.
Here is an example of how you could expose a safe interface to your my_buffer type while keeping most of the C implementation verbatim:
use std::ptr::NonNull;
use std::alloc::{alloc,Layout};
use std::mem::{align_of, size_of};
use std::marker::PhantomData;
use std::pin::Pin;
// I don't know what this does since you have not included the code
// You may or may not need PhantomData
#[repr(C)]
struct MemHead<T>
{
_pd : PhantomData<*mut T>
}
// NOTE: Since this type contains pointers to itself, there is no safe way to manipulate its fields directly
// All of the fields must be private.
pub struct MyBufferInner<T>
{
mem_head : Option<NonNull<MemHead<T>>>
}
unsafe fn mbuf_alloc_shared<T: Default>(mut nb: usize) -> *mut MyBufferInner<T>
{
// You can allocate the MyBufferInner pointer any way you want, but Box::leak is a simple way
let mbuf_ptr = Box::leak(Box::new(MyBufferInner { mem_head : None }));
let mbuf = mbuf_ptr as &mut MyBufferInner<T>;
nb += size_of::<MemHead<T>>();
let layout = Layout::from_size_align(nb, align_of::<MemHead<T>>()).unwrap();
let mem_head: *mut MemHead<T> = unsafe {
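let mh = alloc(layout) as *mut MemHead<T>;
// alloc returns null on failure; abort here rather than wrap a null pointer in NonNull below
if mh.is_null() { std::alloc::handle_alloc_error(layout); }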
// Do whatever pointer manipulation you need here
mh
};
mbuf.mem_head = unsafe { Some(NonNull::new_unchecked(mem_head)) };
mbuf
}
// Refcounting support, I don't know how you implement this, but eventually move to
// Rc/Arc instead. As a starting point, implement the same way as in C.
impl <T> MyBufferInner<T>
{
unsafe fn add_ref(this : *mut Self)
{
todo!()
}
// returns true iff this was the last ref
unsafe fn remove_ref(this : *mut Self) -> bool
{
todo!()
}
}
pub struct MyBuffer<T>
{
value : NonNull<MyBufferInner<T>>
}
impl<T : Default> MyBuffer<T>
{
pub fn new(nb: usize) -> Self
{
// SAFETY: mbuf_alloc_shared cannot fail so NonNull::new_unchecked is okay
// SAFETY: the return value of mbuf_alloc_shared is a pointer to a self referential object,
// but we never expose any method from MyBuffer which allows you to move out of that
// object. Therefore, the pointee (of type MyBufferInner<T>) is effectively pinned.
unsafe {
MyBuffer{ value: NonNull::new_unchecked(mbuf_alloc_shared::<T>(nb)) }
}
}
}
impl<T> MyBuffer<T>
{
// You don't need to provide this function, but you can do so safely if you
// really need Pin<..> somewhere.
pub fn get_inner_pinned(&mut self) -> Pin<&mut MyBufferInner<T>>
{
// SAFETY: we do not expose anything which would allow moving out of
// the inner MyBufferInner object. The memory is effectively pinned
// SAFETY: we are creating a mut ref here to the MyBufferInner object.
// This reference is the only one, since we take a &mut self as input
unsafe { Pin::new_unchecked(&mut *self.value.as_ptr()) }
}
}
// Clone just increases the refcount
impl<T> Clone for MyBuffer<T>
{
fn clone(&self) -> Self
{
unsafe { MyBufferInner::add_ref(self.value.as_ptr()) };
MyBuffer{ value : self.value }
}
}
impl<T> Drop for MyBuffer<T>
{
fn drop(&mut self)
{
let should_drop = unsafe { MyBufferInner::remove_ref(self.value.as_ptr()) };
if should_drop
{
// Drop the T data, whatever else you normally do in C
// ...
// free the allocation
// SAFETY: We allocated this with Box::new, and our refcounting implementation
// ensures that there are no more references to this object.
drop(unsafe { Box::from_raw(self.value.as_ptr()) });
}
}
}
Another reasonable choice is to simply abandon implementing any of this in rust, and use the FFI to interact with your existing C implementation of my_buffer. However, since your question is about implementing mbuf_alloc_shared in Rust, it seems you don't want to do that because you're porting the code to Rust. But that may be the simplest choice, if you "trust" the C implementation.
for situations like that you probably want something like
// Placeholder for whatever the buffer actually holds
struct Data;
struct BufferHandle<'a> {
    data: &'a mut Data,
}
impl<'a> BufferHandle<'a> {
    pub fn new() -> Self {
        // alloc and convert pointer to ref
        todo!()
    }
}
impl<'a> Drop for BufferHandle<'a> {
    fn drop(&mut self) {
        // 'free' the buffer
        // do whatever you need to end this ref, be it removing the allocation or decreasing the ref count
    }
}
This way, data can be accessed only as long as the handle (and consequently the buffer) exists, and the buffer can only be freed once no one is holding onto it.
From there onward, BufferHandle can be used normally, like
let myhandle = BufferHandle::new();
// stuff
drop(myhandle); // explicit or not, it's irrelevant; it will work correctly
The lifetime will be inferred 99% of the time, so you don't have to write one out yourself.
You can even do Rc<BufferHandle<'static>> without issue, because every ref to data will need a ref to the Rc, so the handle will never be dropped while someone has the data.
I just don’t feel like the lifetime is needed at all on the struct, for the same reason that we don’t have Rc<'a, T>: the struct seems to own its data as far as I can tell (even if it shares ownership with refcounted clones). I think raw pointers are better than a meaningless lifetime.
Also, in your example, the 'a lifetime on data would have to be long enough for data to be used in Drop::drop. Rust therefore assumes that the reference is dereferenceable for the whole function body. However, you’d free the backing data within the function. AFAIK this is undefined behavior. You need to use MaybeUninit (or, in some future version of Rust, a MaybeDangling type) to remove the dereferenceability requirement.
Well, you can't have a & without a lifetime, and forcing the struct and the ref to have the same lifetime is the easiest and safest way to make sure the usage of the ref is aligned with the lifetime of the handle.
On the implementation of drop: yes, it would be UB to use the ref after we free it in drop, but it's drop, so you don't use it afterwards.
And if you somehow decide to read the data of the ref inside drop directly after explicitly invalidating it, then, well, that's on you; there is a limit to how much we can protect devs.
I mean that (AFAIK) Rust literally gives the LLVM codegen backend an attribute indicating that data is dereferenceable for the whole function body (regardless of whether you actually try to dereference data near the end of the function body). I think this is codegen-level UB, not just library-level or even rustc frontend-level UB. References are much stricter than raw pointers, so I think "stay away from references when manipulating memory in interesting ways" is a reasonable rule of thumb. When I was looking at the implementation of the yoke crate (which also lies about lifetimes, and needed to contend with dereferenceability requirements), I learned about this issue (and the fix with MaybeUninit / someday MaybeDangling).
And since the handle needs to, well, exist while the ref is live (assuming that RAII is used, to decrement the refcount on drop and free data if necessary, instead of allowing valid not-actively-destructing handles to have a refcount of zero), you can hand out a &'_ M from a function taking &self with some raw pointer to M. The lifetime gets introduced there, so that the ref cannot outlive the handle.