Unsafe disclaimer: I'm not writing any code for a program or modifying code in production. I'm just hacking around to learn how memory access works under the hood. I promise not to use this information for bad.
I'm asking for a review of the code blocks below and a correction or validation of my stated assumptions.
The borrow checker won't allow construction of a self-referential object
struct Struct<'a> { view: &'a mut u8, stashed: u8 }
let mut stashed = 1u8;
// rejected: `stashed` is used while it is still mutably borrowed
let s = Struct { view: &mut stashed, stashed };
Let's thwart the borrow checker with a 'static lifetime trap
fn bad_self_referential_struct_moving_var_after_reference() {
    #[derive(Debug)]
    struct Struct { view: &'static mut u8, stashed: u8 }
    let mut stashed = 1u8;
    let val_ref: &'static mut u8 = unsafe { std::mem::transmute(&mut stashed) };
    let mut s = Struct { view: val_ref, stashed };
    dbg!(&s); // Struct { view: 1, stashed: 1 }
    s.stashed += 1;
    dbg!(&s); // Struct { view: 1, stashed: 2 } !!view has wrong reference!!
}
The following bugs are what the checker was stopping
- s.view references the wrong variable
- stashed gets copied, not moved, into Struct { stashed: stashed }, so s.view at least doesn't reference garbage that doesn't exist
- if we return s, then s.view will reference garbage because the stashed var will get destroyed at the end of the fn
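The first two points can be checked without any unsafe code by comparing addresses. This is only a minimal sketch: `Holder` and `copy_lands_at_new_address` are made-up names for illustration, and `Holder` is a stand-in for `Struct` with the `view` field left out.

```rust
// Hypothetical stand-in for `Struct`, without the `view` field.
struct Holder {
    stashed: u8,
}

// Returns true when the struct field lives at a different address than the local.
fn copy_lands_at_new_address() -> bool {
    let stashed = 1u8;
    let local_addr = &stashed as *const u8;

    // `u8` is `Copy`, so this copies the value into a new location inside `h`.
    let h = Holder { stashed };
    let field_addr = &h.stashed as *const u8;

    // Both `stashed` and `h` are still alive here, so they occupy distinct storage.
    let differs = local_addr != field_addr;
    // Keep both values in use past the comparison.
    let _ = stashed + h.stashed;
    differs
}

fn main() {
    println!("addresses differ: {}", copy_lands_at_new_address());
}
```

A reference taken from the local before the copy therefore keeps pointing at the local, not at the field.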
Let's try two-step initialization with the 'static lifetime trap
fn bad_self_referential_struct_with_second_step_init_for_ref() {
    #[derive(Debug)]
    struct Struct { view: Option<&'static mut u8>, stashed: u8 }
    let stashed = 1u8;
    let mut s = Struct { view: None, stashed };
    s.view = unsafe { std::mem::transmute(&mut s.stashed) };
    dbg!(&s); // Struct { view: Some(1), stashed: 1 }
    s.stashed += 1;
    dbg!(&s); // Struct { view: Some(2), stashed: 2 }
}
This appears to work, with the correct values printing. But we haven't moved s yet, so let's test that.
// .....
// .....
dbg!(&s); // Struct { view: Some(1), stashed: 1 }
s.stashed += 1;
dbg!(&s); // Struct { view: Some(2), stashed: 2 }
// now let's move s
let mut s = s;
dbg!(&s); // Struct { view: Some(2), stashed: 2 }
s.stashed += 1;
dbg!(&s); // Struct { view: Some(2), stashed: 3 } // view has wrong reference
Quick notes about safe referencing with Box.
To borrow the box and not borrow the inner value
- &box
- &mut box
- box.borrow()
- box.borrow_mut()
To get references to the inner box data
- box.as_ref()
- box.as_mut()
- [edit+] box.borrow()
- [edit+] box.borrow_mut()
I hope those are correct assumptions?
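One way to test the borrow() and borrow_mut() entries: `Box<T>` implements both `Borrow<T>` and, through the blanket impl, `Borrow<Box<T>>`, so which one a call picks depends on the type the result is assigned to. A small sketch, where `borrow_targets` is a made-up name:

```rust
use std::borrow::{Borrow, BorrowMut};

// Returns (value seen through &Box<u8>, value seen through &u8) after one increment.
fn borrow_targets() -> (u8, u8) {
    let mut boxed: Box<u8> = Box::new(1);

    {
        // With this annotation, `borrow_mut` resolves to `BorrowMut<u8>`: the inner value.
        let inner: &mut u8 = boxed.borrow_mut();
        *inner += 1;
    }

    // With this annotation, `borrow` resolves to `Borrow<Box<u8>>`: the box itself.
    let outer: &Box<u8> = boxed.borrow();
    // With this one it resolves to `Borrow<u8>`: the inner value.
    let inner: &u8 = boxed.borrow();

    (**outer, *inner)
}

fn main() {
    println!("{:?}", borrow_targets());
}
```

So borrow() and borrow_mut() can land in either list depending on the annotation, while as_ref() and as_mut() on a `Box<u8>` only ever give access to the inner `u8`.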
Let's try putting stashed on the heap so it doesn't get to move after it's referenced
fn maybe_bad_self_referential_struct_simple_with_transmute() {
    #[derive(Debug)]
    struct Struct { view: &'static mut u8, stashed: Box<u8> }
    let mut stashed = Box::new(1u8);
    let stashed_ref: &'static mut u8 = unsafe { std::mem::transmute(stashed.as_mut()) };
    let mut s = Struct { view: stashed_ref, stashed };
    dbg!(&s); // Struct { view: 1, stashed: 1 }
    *s.stashed += 1;
    dbg!(&s); // Struct { view: 2, stashed: 2 }
    let mut s = s;
    dbg!(&s); // Struct { view: 2, stashed: 2 }
    *s.stashed += 1;
    dbg!(&s); // Struct { view: 3, stashed: 3 }
}
This looks okay. stashed_ref refers to the inner pointer of Box thanks to stashed.as_mut(). It's also moved-not-copied into s because it is a &mut T, which doesn't implement Copy. stashed is also moved into Box. Drop order of the fields is correct. No garbage is being left behind. Seems like it's all fine now.
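The address-stability part of that reasoning can be checked in safe code. A minimal sketch, with `heap_address_survives_move` as a made-up name; it only shows that the heap location stays put when the Box is moved, and says nothing about whether keeping a `&'static mut u8` alongside the Box is allowed.

```rust
// Returns true if the heap address of the boxed value is unchanged by moving the Box.
fn heap_address_survives_move() -> bool {
    let boxed = Box::new(1u8);
    let before = &*boxed as *const u8;

    // Moving the Box copies the pointer, not the heap allocation it points to.
    let moved = boxed;
    let after = &*moved as *const u8;

    before == after
}

fn main() {
    println!("same heap address: {}", heap_address_survives_move());
}
```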
If I run cargo +nightly miri run, though:
error: Undefined Behavior: trying to retag from <3054> for Unique permission at alloc1643[0x0], but that tag does not exist in the borrow stack for this location
help: <3054> was created by a Unique retag at offsets [0x0..0x1]
let mut s = Struct { view: stashed_ref, stashed };
| ^^^^^^^^^^^
help: <3054> was later invalidated at offsets [0x0..0x1] by a Unique retag
let mut s = Struct { view: stashed_ref, stashed };
The construction of a single-threaded, non-Sync Struct in maybe_bad_self_referential_struct_simple_with_transmute should be sound? I'm not worried about concurrent access yet. I make no assumptions about memory bugs like data races in multi-threaded execution.
But miri complains because it shouldn't be sound, for reasons unrelated to how the program actually executes? The issue is compilation, specifically optimization, right? If the compiler thinks you moved your referenced object, then the references to that object are allowed to be cleaned up.
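For comparison, here is a sketch of the same shape written with raw pointers only, so that no `Box` and no `&mut` ever claims unique access to the allocation while the alias exists. As far as I understand, this is the pattern the aliasing rules are meant to permit, but that is an assumption and this sketch has not been run under miri here. `RawStruct`, `bump`, `read_view` and `run` are made-up names.

```rust
// Hypothetical variant of `Struct` that keeps raw pointers instead of
// a `&'static mut u8` next to a `Box<u8>`.
struct RawStruct {
    view: *mut u8,    // alias used for reads
    stashed: *mut u8, // owning pointer, obtained from Box::into_raw
}

impl RawStruct {
    fn new(value: u8) -> Self {
        // Give up the Box right away; only raw pointers remain.
        let raw = Box::into_raw(Box::new(value));
        RawStruct { view: raw, stashed: raw }
    }

    fn bump(&mut self) {
        // SAFETY: `stashed` came from Box::into_raw and is freed only in Drop.
        unsafe { *self.stashed += 1 };
    }

    fn read_view(&self) -> u8 {
        // SAFETY: `view` points at the same live allocation as `stashed`.
        unsafe { *self.view }
    }
}

impl Drop for RawStruct {
    fn drop(&mut self) {
        // SAFETY: rebuild the Box exactly once to release the allocation.
        unsafe { drop(Box::from_raw(self.stashed)) };
    }
}

// Mirrors the mutate, move, mutate sequence from the examples above.
fn run() -> (u8, u8) {
    let mut s = RawStruct::new(1);
    s.bump();
    let first = s.read_view();
    let mut s = s; // move the struct; the heap allocation stays where it is
    s.bump();
    (first, s.read_view())
}

fn main() {
    println!("{:?}", run());
}
```

The trade-off is that every access goes through `unsafe`, and the struct has to free the allocation itself in `Drop`.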
Bonus: using Box::leak to do the same as above, with the same aliasing issue reported by miri
fn maybe_bad_self_referential_struct_simple_with_leak() {
    #[derive(Debug)]
    struct Struct { view: &'static mut u8, stashed: Box<u8> }
    let stashed = Box::new(1u8);
    let stashed_ref = Box::leak::<'static>(stashed);
    let stashed_ptr = stashed_ref as *mut _;
    let stashed = unsafe { Box::from_raw(stashed_ptr) };
    let mut s = Struct { view: stashed_ref, stashed };
    dbg!(&s); // Struct { view: 1, stashed: 1 }
    *s.stashed += 1;
    dbg!(&s); // Struct { view: 2, stashed: 2 }
    let mut s = s;
    dbg!(&s); // Struct { view: 2, stashed: 2 }
    *s.stashed += 1;
    dbg!(&s); // Struct { view: 3, stashed: 3 }
}
I looked around and discovered that there is a crate called aliasable that "addresses" the aliasing issue.
So I tried it
struct Struct { view: &'static mut u8, stashed: AliasableBox<u8> };
let mut stashed = AliasableBox::from_unique(Box::new(1u8));
let stashed_ref = unsafe { std::mem::transmute(stashed.as_mut()) };
let mut s = Struct { view: stashed_ref, stashed };
dbg!(&s);
s.stashed.add_assign(1);
// dbg!(&s);
Miri won't report UB with this, but if I uncomment the last line to read after the write, it reports UB again. At the least, constructing the self-referential object no longer reports UB, and that was the objective.
How the aliasable crate works, and why miri reports UB when I mutate then read in this case, is a mystery to me. Without AliasableBox, even commenting out the last line in the other examples still reports UB, so AliasableBox is doing something. But I don't know what, and I'm still learning, so some insight would be appreciated.