Why is the write_volatile() call deleted by the compiler?

The documentation says that “Volatile operations are intended to act on I/O memory, and are guaranteed not to be elided or reordered by the compiler across other volatile operations.”, and so I expect this call to always exist, but it seems not to be the case.
I have this code:

unsafe {
    core::arch::asm!("nop");
    core::arch::asm!("nop");
    core::arch::asm!("nop");
    let ptr = 0x12ff000 as *mut u32;
    ptr.write_volatile(kernel_len as u32);
    core::arch::asm!("cli");
    core::arch::asm!("hlt");
}

But the final assembly code turns out to be this:

0x0000000000009383:  nop
0x0000000000009384:  nop
0x0000000000009385:  nop
0x0000000000009386:  mov    %edi,-0x1000
0x000000000000938b:  cli
0x000000000000938c:  hlt

In cases where the call is not deleted, the correct code is generated:

0x00000000000092d4:  nop
0x00000000000092d5:  nop
0x00000000000092d6:  nop
0x00000000000092d7:  addr32 mov %eax,0x12ff000
0x00000000000092de:  cli
0x00000000000092df:  hlt

It looks like the documentation is misleading when it says that the volatile operation is guaranteed not to be deleted.

How are the two different assemblies generated? Can you provide more information on how you got the two different outputs?

Something like a runnable example on Compiler Explorer would help, such as this one.

Are you sure you are not inspecting the output of different code?

In your snippet there is a memory store instruction. It is not deleted; it just writes to a different address (-0x1000 instead of 0x12ff000), with kernel_len in the %edi register.

Is it only me (at the end of a week of work), or do other people also feel that answers like this one are AI-generated and add absolutely nothing to the discussion?

I just commented out a few lines above the block to make this code work. In any case, no matter what I change elsewhere in the code or in the compiler settings, this call MUST exist because the documentation says “intended to act on I/O memory, and are guaranteed not to be elided”.

Not just that, the answer is blatantly wrong. "If the compiler determines that the write operation has no noticeable effect, the write_volatile() call can be optimized." No, it can't; that directly contradicts the documentation and its intent.
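For reference, here is a small sketch of what that guarantee covers. It uses a pointer to an ordinary local variable (an assumption made so the code can run on a hosted target, unlike the fixed address 0x12ff000 in the question), and the function names are hypothetical. Pasting it into Compiler Explorer with optimizations enabled should show both stores in the volatile version, while the first store in the plain version may be dropped.

```rust
use core::ptr;

/// Stores to `slot` twice with volatile writes, then reads it back.
/// Both stores are volatile accesses, so the compiler must emit each one,
/// even though the first is immediately overwritten.
pub fn store_twice_volatile(slot: &mut u32, value: u32) -> u32 {
    let p: *mut u32 = slot;
    unsafe {
        // SAFETY: `p` comes from a live `&mut u32`, so it is valid and aligned.
        ptr::write_volatile(p, 0);
        ptr::write_volatile(p, value);
        ptr::read_volatile(p)
    }
}

/// Same shape with ordinary stores: the optimizer is free to drop the
/// first (dead) store here.
pub fn store_twice_plain(slot: &mut u32, value: u32) -> u32 {
    *slot = 0;
    *slot = value;
    *slot
}

fn main() {
    let mut cell = 0u32;
    // Hypothetical value, chosen so the `as u32` truncation is visible.
    let kernel_len: u64 = 0x1_0000_1234;
    let seen = store_twice_volatile(&mut cell, kernel_len as u32);
    assert_eq!(seen, 0x1234);
    assert_eq!(store_twice_plain(&mut cell, 7), 7);
    println!("{seen:#x}");
}
```

Note that the guarantee is about the access being emitted, not about which address ends up in the final binary after linking or how a disassembler displays it.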

@mrjbom Can you provide an actual complete reproducible example that we can inspect? The snippet you provided isn't complete enough to make any conclusions. I can't reproduce your error:

playground::foo:
	push	rax
	nop
	nop
	nop
	mov	dword ptr [19918848], edi
	cli
	hlt
	pop	rax
	ret

That's why I've placed some nop operations to make sure that I'm checking the right code.
I have no idea why “mov %edi,-0x1000” appears there

Adding nop operations doesn't guarantee anything. You should try to create a minimal reproducible example that is as small as possible and can be reproduced by others (easiest is to share a link to Rust Playground or Compiler Explorer).

Again, it is NOT elided. The instruction is there, but for unknown reasons the address is wrong.

But, as others said, you didn't provide a reproducible case, so we cannot determine where the discrepancy comes from. For example, it might well be your disassembler.
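One arithmetic observation, offered as a hint and not as a diagnosis: the low 16 bits of 0x12ff000 are 0xf000, and 0xf000 read as a signed 16-bit value is exactly -0x1000. The sketch below only checks that arithmetic; the helper name is hypothetical. It would be consistent with the address being truncated to 16 bits somewhere (for example, if the code is built or disassembled in a 16-bit mode), but nothing in this thread confirms that.

```rust
/// Keeps only the low 16 bits of `addr` and reinterprets them as a signed
/// 16-bit value, the way a tool might print a 16-bit displacement.
pub fn low16_as_signed(addr: u32) -> i16 {
    (addr as u16) as i16
}

fn main() {
    // The address from the question.
    let intended: u32 = 0x12ff000;

    // Truncating to 16 bits drops the upper part (0x012f).
    let truncated = intended as u16;
    assert_eq!(truncated, 0xf000);

    // The same 16 bits, read as signed, give the value seen in the
    // unexpected disassembly.
    assert_eq!(low16_as_signed(intended), -0x1000);

    println!(
        "low 16 bits: {:#x}, as signed: {}",
        truncated,
        low16_as_signed(intended)
    );
}
```

If this is what is happening, comparing the raw instruction bytes of the two builds would show it more reliably than the disassembled text.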

Here's similar code on the playground. It gives the correct output, and the mysterious -0x1000 is nowhere to be seen.

input:

pub fn foo(kernel_len: u64) {
    unsafe {
        core::arch::asm!("nop");
        core::arch::asm!("nop");
        core::arch::asm!("nop");
        let ptr = 0x12ff000 as *mut u32;
        ptr.write_volatile(kernel_len as u32);
        core::arch::asm!("cli");
        core::arch::asm!("hlt");
    }
}

output:

playground::foo:
	pushq	%rax
	#APP
	nop
	#NO_APP
	#APP
	nop
	#NO_APP
	#APP
	nop
	#NO_APP
	movl	%edi, 19918848
	#APP
	cli
	#NO_APP
	#APP
	hlt
	#NO_APP
	popq	%rax
	retq

It does guarantee that, since inline assembly blocks cannot be optimized away.
In addition, the sequence nop; nop; nop; ... cli; hlt cannot occur anywhere else in the code.
I am unambiguously checking the right code. Moreover, I checked the address itself in memory, and the right value was not there.
Unfortunately, I am not able to reproduce this situation on Godbolt or the Rust Playground.

In my case, the assembly code and the memory location itself do not have the correct value.

Recreate with docker/podman.

Maybe kernel_len comes from some UB in the earlier code. We don't have enough from you to be able to help.
