The idea is that you can represent a list of objects by pointing next and prev at the list field of other object_ts. To recover a pointer to the object, you then have to subtract the offset of list from the start of object_t. In C you would use offsetof to do this. Is there an equivalent way to do this in Rust?
Since the C abi is stable and well defined, Rust could give me this information. Whether there is a way to get it or not is another matter.
EDIT Worth noting some features of the above linked list
Doubly linked.
Circular (no beginning and end), although you could use a null pointer if you did want a beginning/end.
There isn't really any way to get the offsets besides constructing a raw pointer to each, casting to usize, and subtracting them from each other. Would it not be simpler to have list_t store pointers directly to the object_t, rather than to its third field?
That being said, if one makes sure a pointer with provenance over the whole Object is used to create the raw pointer to the IntrusiveList, then all is fine:
The decision to use intrusively linked lists is not mine: it is made by the library author. I'm trying to write bindings to the library. I believe intrustive linked lists are used because the code runs hot in a realtime thread, and so removing the extra indirection between the list and its items is worthwhile.
Yes it is important to note that the ABI of rust structs is unstable. You cannot rely on offsets there.
Awesome, thankyou!
Thankyou for your very detailed answer! In addition to learning that these structures are called an "intrusive" linked list, I did not know about provenance of pointers, and need to go and understand that fully. But thanks again for taking the time to write such a detailed answer.
@Yandros For my specific problem, I am handed a pointer to a list_t from a C library and am expected to construct the object_t pointer myself. Are you saying this is impossible to do without UB?
It depends on what you do with the pointer. If you turn a raw pointer to the full thing into a raw pointer to the field, you can turn it back, but if this conversion goes through a reference and back, you may run into trouble because a reference has all sorts of extra guarantees that may prevent you from going back to a pointer to the full thing.
If you get the raw pointer from C, it should be fine to cast to a u8 raw pointer, subtract the right number of bytes, and cast to pointer to the full thing.
If it is an immutable reference, the guarantee you need to uphold is that, between any two uses of the same immutable reference, the value has not been modified.
The ways you would need to use unsafe-cell in are a bit subtle. The way you should think of it is that, in the eyes of the requirement I mentioned, an UnsafeCell<T> is not considered modified even if the T inside it is.
Be aware that this implies that the guarantee is only turned off if the UnsafeCell is between the reference and the value. Creating an &T to a value inside an UnsafeCell does not turn off the guarantee, because the cell does not lie between the reference and the value. It is only turned off if you have an &UnsafeCell<T>.
The safety requirements of threading are unrelated to the guarantees imposed by references in the sense that the rule that says that you may not perform a data race applies even if you are using raw pointers.
use ::core::{cell::Cell as Mut, ptr};
use ::std::os::raw::c_int as int;
#[repr(C)]
struct Object {
data: Mut<int>;
more_data: Mut<int>;
list: IntrusiveList;
}
// where
#[repr(C)]
struct IntrusiveList {
next: Mut< Option<ptr::NonNull<IntrusiveList>> >,
prev: Mut< Option<ptr::NonNull<IntrusiveList>> >,
}
then you can upgrade a *mut Object to a &Object just fine (provided the lifetime is OK (object is not deallocated in between) and no data races occur). Using Cell to wrap each field is the best way to represent C semantics, and avoid aliasing/immutability-related UB.
No, in my example I was talking about creating (and potentially giving it to the FFI) a pointer to a list with provenance over the .next and .prev fields only. But if you are receiving such a pointer, then it was C's job to imbue it with the right provenance (which, in practice, can be assumed to be always the case). So no issues in that regard.
There are semantically 3 types of pointer in rust, *const/*mut, & and &mut. These pointers might be fat, but its not possible to construct fat pointers directly yourself so I'm ignoring this detail. The rules are as follows
For *const/*mut
When dereferencing, the pointer must point to a valid (initialized, aligned, live) object
The data must be a valid instance of the pointer type (note that this doesn't mean the types have to be identical: a u8 can be read as an i8, because all valid u8s are valid i8s).
There must be no chance that, in parallel, (data race):
We write to the pointer while another thread reads or writes through the pointer.
We read the pointer while another thread writes through the pointer.
Extra conditions apply to some pointer methods, like offset.
For &
The data must be a valid instance of the pointer type (see above).
The data must be valid for the lifetime of the reference.
Strictly speaking, it must be valid at time of access, but if you hand a reference to code you don't control, you must guarantee any safe access is valid (i.e. the data must be valid for the lifetime of the reference).
The data must not change between reads, unless there is an UnsafeCell between the changed data and the reference (defined as when an the reference points to a type that contains an UnsafeCell, which contains data that changes). In the UnsafeCell case, there must be no data races.
No &mut references exist for the object.
For &mut
The data must be a valid instance of the pointer type (see above).
The data must be valid for the lifetime of the reference (it cannot change because it is not observed anywhere else).
No other references (either & or &mut) exist for the object.
The data is not observed except through the pointer, this includes pointer::reading through a raw pointer.
Raw pointers to the object can exist, as long as they are not dereferenced (is this correct??).