Reinterpreting a #[repr(C)] struct's bytes as a given type

Suppose that inside a #[repr(C)] struct I hold a pointer returned to me by the global allocator with an alignment of, say, 8 bytes. I want the user to supply a type parameter T, and as long as the alignment of T is compatible with those 8 bytes, I want to return a mutable reference to the value those bytes represent.

This doesn't work:

    unsafe fn cast_unchecked_mut<Type>(&self) -> &mut Type {
        std::ptr::read_volatile::<&mut Type>(self.ptr as *const &mut Type)
    }

nor

    unsafe fn cast_unchecked_mut<Type>(&self) -> &mut Type {
        std::ptr::read::<&mut Type>(self.ptr as *const &mut Type)
    }

I would also like to mention that self.ptr is of type *mut u8

1 Like

First, you should be using &mut self to avoid UB. Next, you could just cast the pointer and dereference it:

unsafe fn cast_unchecked_mut<Type>(&mut self) -> &mut Type {
    assert!(std::mem::align_of::<Type>() <= 8);
    &mut *(self.ptr as *mut Type)
}
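
For context, here is the same thing as a self-contained sketch; the surrounding struct (RawSlot here) is just a guess based on your description of self.ptr:

use std::mem::align_of;

struct RawSlot {
    // 8-byte-aligned pointer handed back by the global allocator
    ptr: *mut u8,
}

impl RawSlot {
    /// Safety: the allocation must be big enough for `Type`, must outlive the
    /// returned reference, and no other reference to it may exist while the
    /// returned `&mut` is alive.
    unsafe fn cast_unchecked_mut<Type>(&mut self) -> &mut Type {
        assert!(align_of::<Type>() <= 8);
        &mut *(self.ptr as *mut Type)
    }
}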
3 Likes

I have mechanisms behind the scenes controlling the access order, so it's actually fine on the UB front (unless there's something weird?). I will check this new way out now, thanks!

Going from a &_ to a &mut _ is UB no matter how you do it, unless you go from a &UnsafeCell<T> to a &mut T, and even that can be UB if done incorrectly.
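
In code, the one sanctioned shape looks roughly like this (a minimal sketch, with a made-up function name):

use std::cell::UnsafeCell;

/// Safety: the caller must guarantee that no other reference to the contents,
/// shared or unique, is alive for as long as the returned `&mut` exists.
unsafe fn shared_to_mut<T>(cell: &UnsafeCell<T>) -> &mut T {
    // &UnsafeCell<T> -> *mut T -> &mut T is the only route the language blesses.
    &mut *cell.get()
}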

1 Like

Really? How so? I have atomically controlled usage of the get_unchecked_mut, so I'd figure handing out a mutable reference through an immutable one is okay?

using AtomicUsize, as an example

AtomicUsize uses UnsafeCell internally.

(lots of macro stuff to build all atomics, but they are defined like so)

#[$stable]
#[repr(C, align($align))]
pub struct $atomic_type {
    v: UnsafeCell<$int_type>,
}

Same with Cell, RefCell, Mutex, RwLock, and many more.

1 Like

Okay, great. Problem solved. For extended discussion: the code below compiles. If I control when rip_mut is called, would this still produce UB? Would there be bad alignment or mis-allocation when the data structure changes?

    /// Converts immutable self to mutable self. This is absurdly unsafe, and Rust won't let me do it in safe code.
    unsafe fn rip_mut(&self) -> *mut Self {
        (&*self as *const Self) as *mut Self
    }

That is not UB by itself; it would be UB to write through that pointer, though. (No &mut _ was made, only a *mut _, which is not special.)

&mut _ is special because it means unique, and going from a shared reference to a unique reference requires black magic (the UnsafeCell lang item).
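
For contrast, a toy sketch: making the raw pointer is fine, writing through it is not (unless the pointee sits in an UnsafeCell):

fn make_raw(x: &u32) -> *mut u32 {
    // Creating the raw pointer is allowed; a *mut _ carries no uniqueness claim.
    x as *const u32 as *mut u32
}

// But writing through it would be UB, because the u32 was never inside an
// UnsafeCell:
// unsafe { *make_raw(&value) = 5; } // UB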

3 Likes

(*self.rip_mut()).my_usize_field = 10;

Would that be UB even if it is access-controlled?

(Access-controlled via AtomicUsize)

Yes, that would write through a pointer derived from a shared reference, without going through an UnsafeCell, so it is UB.

Btw, what do you mean by access controlled by an AtomicUsize?

1 Like

Ah, this has to do with the project we were talking about in our last thread that you helped me in!

Asynchronous data editing 🙂

I make a WriteVisitor that polls Ready once its "ticket" number equals the AtomicUsize. That guarantees that only one WriteVisitor has write access at a time, roughly like the sketch below.
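
Roughly like this (a heavily simplified sketch, not the real code; the names are made up):

use std::sync::atomic::{AtomicUsize, Ordering};
use std::task::Poll;

struct WriteVisitor {
    ticket: usize,
}

impl WriteVisitor {
    // Only the visitor whose ticket matches the shared counter becomes Ready;
    // everyone else stays Pending until the current writer bumps the counter.
    fn poll_ready(&self, current: &AtomicUsize) -> Poll<()> {
        if current.load(Ordering::Acquire) == self.ticket {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}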

Could you please link to it? I don't remember which one.

1 Like

Yeah, this sort of thing would be easiest to build using UnsafeCell directly. Then control access to that UnsafeCell with whatever means you want. Be very careful to make sure that every &mut _ uniquely refers to what it points to.
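
Something along these lines, say (a rough sketch; the field and method names are invented):

use std::cell::UnsafeCell;
use std::sync::atomic::AtomicUsize;

// The data lives in an UnsafeCell; the atomic ticket counter stands in for
// whatever mechanism decides who is currently allowed to touch it.
struct Shared<T> {
    current_ticket: AtomicUsize,
    data: UnsafeCell<T>,
}

// Sound only because access is externally synchronized by the ticket scheme.
unsafe impl<T: Send> Sync for Shared<T> {}

impl<T> Shared<T> {
    /// Safety: must only be called by the one visitor whose ticket matches,
    /// so that the returned `&mut` is unique for its entire lifetime.
    unsafe fn get_mut(&self) -> &mut T {
        &mut *self.data.get()
    }
}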

2 Likes

UnsafeCell looks nice to me. My question about using it: would the compiler lose the common optimizations that come from knowing that safe code treats mutable access as unique? Would many optimizations go away?

1 Like

It would remove any optimization that treats &_ as meaning immutable for the value wrapped in UnsafeCell<_>.

For example,

pub fn foo(x: &UnsafeCell<u32>) {
    unsafe {
        let a = *x.get();
        std::thread::yield_now();
        let b = *x.get();
        assert_eq!(a, b);
    }
}

generates

long assembly with panic handling:
<&T as core::fmt::Debug>::fmt:
	pushq	%r14
	pushq	%rbx
	pushq	%rax
	movq	%rsi, %rbx
	movq	(%rdi), %r14
	movq	%rsi, %rdi
	callq	*core::fmt::Formatter::debug_lower_hex@GOTPCREL(%rip)
	testb	%al, %al
	je	.LBB0_1
	movq	%r14, %rdi
	movq	%rbx, %rsi
	addq	$8, %rsp
	popq	%rbx
	popq	%r14
	jmpq	*core::fmt::num::<impl core::fmt::LowerHex for u32>::fmt@GOTPCREL(%rip)

.LBB0_1:
	movq	%rbx, %rdi
	callq	*core::fmt::Formatter::debug_upper_hex@GOTPCREL(%rip)
	movq	%r14, %rdi
	movq	%rbx, %rsi
	addq	$8, %rsp
	testb	%al, %al
	je	.LBB0_2
	popq	%rbx
	popq	%r14
	jmpq	*core::fmt::num::<impl core::fmt::UpperHex for u32>::fmt@GOTPCREL(%rip)

.LBB0_2:
	popq	%rbx
	popq	%r14
	jmpq	*core::fmt::num::imp::<impl core::fmt::Display for u32>::fmt@GOTPCREL(%rip)

playground::foo:
	pushq	%rbp
	pushq	%rbx
	subq	$104, %rsp
	movq	%rdi, %rbx
	movl	(%rdi), %ebp
	movl	%ebp, (%rsp)
	callq	*std::thread::yield_now@GOTPCREL(%rip)
	movl	(%rbx), %eax
	movl	%eax, 4(%rsp)
	cmpl	%eax, %ebp
	jne	.LBB1_1
	addq	$104, %rsp
	popq	%rbx
	popq	%rbp
	retq

.LBB1_1:
	movq	%rsp, %rax
	movq	%rax, 8(%rsp)
	leaq	4(%rsp), %rax
	movq	%rax, 16(%rsp)
	leaq	8(%rsp), %rax
	movq	%rax, 24(%rsp)
	leaq	<&T as core::fmt::Debug>::fmt(%rip), %rax
	movq	%rax, 32(%rsp)
	leaq	16(%rsp), %rcx
	movq	%rcx, 40(%rsp)
	movq	%rax, 48(%rsp)
	leaq	.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.3(%rip), %rax
	movq	%rax, 56(%rsp)
	movq	$3, 64(%rsp)
	movq	$0, 72(%rsp)
	leaq	24(%rsp), %rax
	movq	%rax, 88(%rsp)
	movq	$2, 96(%rsp)
	leaq	.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.5(%rip), %rsi
	leaq	56(%rsp), %rdi
	callq	*std::panicking::begin_panic_fmt@GOTPCREL(%rip)
	ud2

.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.0:
	.ascii	"assertion failed: `(left == right)`\n  left: `"

.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.1:
	.ascii	"`,\n right: `"

.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.2:
	.byte	96

.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.3:
	.quad	.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.0
	.asciz	"-\000\000\000\000\000\000"
	.quad	.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.1
	.asciz	"\f\000\000\000\000\000\000"
	.quad	.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.2
	.asciz	"\001\000\000\000\000\000\000"

.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.4:
	.ascii	"src/lib.rs"

.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.5:
	.quad	.Lanon.ec0bd99b1b4051f7d320bc576b3d0a71.4
	.asciz	"\n\000\000\000\000\000\000\000\b\000\000\000\t\000\000"

while

pub fn bar(x: &u32) {
    let a = *x;
    std::thread::yield_now();
    let b = *x;
    assert_eq!(a, b);
}

generates almost nothing (the assert is completely optimized away):

playground::bar:
	jmpq	*std::thread::yield_now@GOTPCREL(%rip)

On playground

Note that the only thing between the two gets is a thread yield!

This behavior is correct and wanted. You don't want these optimizations when you have shared mutability, as they will lead to subtle and infuriating bugs.

5 Likes

Very interesting... so, let me see if I understand this: both a and b, which are stored in different places in memory, end up holding the value that x points to.