Is `transmute(usize)` as `Option<&T>` sound?

transmute doesn't require alignment (of the value itself, not about "inner types") to be the same, at least not mentioned in the document https://doc.rust-lang.org/std/intrinsics/fn.transmute.html.

Yes, the value of y can be two different values depending on the layout (also endianness) with my assumption. But it is only an unspecified behavior (the behavior is only in a defined set, in this case, of size two), not an undefined behavior UB (the behavior can be anything). (I'm 80% sure about this).

This is UB at least because nomicon says

Transmuting between non-repr(C) types is UB

TwoInts is not repr(C). Full stop.

In order for something to be defined behaviour, someone must have defined what the behaviour is. Except for certain special cases such as option-like enums, nobody has defined what the layout of a #[repr(Rust)] type is. This makes it undefined behaviour.

Of course, the current implementation of the compiler will certainly pick one of the two field orderings and use that, but this has not been defined, because we wish the retain the ability to change these guarantees in the future.

That said, the compiler would be perfectly justified in noticing that the types are not guaranteed to be equivalent and proceed to assume the main function is never called as that would be undefined behaviour and then optimize it away.

It would also be valid to unconditionally trigger the assert.

1 Like

Sorry, my wording was not precise enough. I'm not saying TwoInt has size 8. Yes, it is perfectly okay for the compiler to optimize my code to an unconditional panic.
It is perfectly possible to TwoInt to has size 16, for example.
But I'm saying that assuming the assertion passed (thus assuming it had size 8), the layout has to be one of two layouts.

Edit: also assuming u32 has align 4 (it is possible to have a smaller alignment).

My reasoning:
Let offset_<i> to be (&x.<i> as *const _ as usize) - (&x as *const _ as usize) for x : TwoInt. offset_<i> is constant regardless of x [1].
&x.0 and &x.1 has to be distinct, 4-byte aligned and for each i,

  • 0 <= offset_<i> && offset_<i> < size_of::<TwoInt>() [1]
  • offset_<i> % 4 == 0
  • offset_0 != offset_1 [1]

But if size_of::<TwoInt>() == 8, only two solutions, namely offset_0 == 0 && offset_1 == 4 and offset_0 == 4 && offset_1 == 0 satisfy the constraints. Thus, the behavior is one of the two possible behaviors.

(A caveat is that we are not taking reference in the code so it is possible that reasoning doesn't apply, I'm not sure about this).

Edit: also requires a of transmute itself is not a UB when such "layout is the same" (meaning that std::mem::transmute actually works as expected and doesn't do more or less works). Although it should be true, I'm skeptical it is currently explicitly specified... (because nomicon contradicts this).

[1]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/layout/structs-and-tuples.md#struct-types although wording here is not in the most precise way, it seems guarantees that

  1. a struct type determines a fixed offset for each field (for each compilation).
  2. Each field doesn't overlap.

Some discussion about this statement e.g. https://github.com/rust-lang/rust/issues/29702.
My stance is that the statement shouldn't be true. For example, if we take the statement literally, transmute::<T, T>(value) is a UB for repr(Rust) struct T, but this is absurd.
It should be noted that nomicon predates unsafe-code-guideline and probably doesn't reflect the most recent consensus.

Yes, this is how it would work now but you don't really know what could happen in a year.

This reference describes things as they are at the moment but it offers no guarantees that it couldn't change out of blue. I know, most of this stuff can be logically reasoned and there's no reason to think this would change yet.

But in a year, some dude notices a bug in codegeneration for unicorn-x71 architecture and says: "You know, this can be fixed quite trivially but we need to change some code in layout distributing, you know, padding bytes and all. This doesn't break anything because layout for repr(Rust) types has always been unspecified, people relying on it are lunatics". And he would be absolutely right.

It's fine to rely on things that are de-facto stable but not documented yet. But when it comes to things the docs explicitly call UB you should stop right here, open an issue in rustc/book/reference/nomicon repo and request folks to clarify things, as did I in proc-macro attributes discussion. Once you start to apply reasoning to things that are explicitly not defined and subject to change in the future - you're screwed.

For example, take #[transparent] attribute: people had been doing

struct Foo(i32);

let i: i32 = transmute(Foo(2));

and it worked just fine... until some dude noticed that when it comes to arm functions' calling conventions they are not the same at all, Foo get passed by ref but i32 is passed by value. Did folks specify layout for single-tuple structs? Nope, they left it undefined and introduced #[transparent] specifically for this purpose.

This is why I came here with my Option<&T> question in the first place - while I'm 100% sure that both usize and Option<&T> are represented the same at bit-level I know nothing about how it would interact with rustc optimiser and the like. Also, I found links that explicitly guarantee that Option<T> is repr(C) as long as T is repr(C) and subject for "non-null" optimization. I'm relieved.

Yes, the doc is clearly incorrect in certain cases but it doesn't mean it's incorrect in all cases.

Also, I don't recall existence of "unspecified behavior" in rust reference, mind sharing a link?

You are correct here. The unsafe-code-guidelines preface states

The goal of the UCG WG is to produce the Unsafe Code Guidelines Reference: a "guide" for writing unsafe code that "recommends" what kinds of things unsafe code can and cannot do, and that documents which guarantees unsafe code may rely on. It is largely a work-in-progress right now.
Unless we state otherwise, the information in the guide is mostly a "recommendation" and still subject to change. The main goal for the WG in 2019 is to RFC parts of the guide into the language specification, and move those parts into the Rust Language Reference proper.

(emphasis mine).

However, we have to base the discussion on some reference. And the most recent, thoroughly-discussed documentation is the unsafe-code-guidelines so I'm referring this here.

If there is discrepanency between rustc / miri to the unsafe-code-guidelines, it should be considered as a bug of either two and should have been reported, hopefully.

The nomicon issue is stale since 2017 for comparison.

This is not a true story. In fact, repr(C) for an enum has a different meaning.

The unsafe-code-guidelines says Option-like enums have special constraints on the layout (value representation). The reference requires Option-like enums to have no explicit repr attribute. Which means it is repr(Rust).

Summary:

  • repr(C): always have a defined layout by the field ordering algorithm.
  • repr(Rust): no constraints of the layout in its own but there can be additional constraints by other means. For example, if it is an Option-like enums.

repr(Rust) is not always have no constraints but it is not always have constraints.

It is a term from C specification but https://github.com/rust-lang/unsafe-code-guidelines/pull/214 has a discussion. My usage of the term should be the same as @RalfJung 's comment there. In fact, the comment specifically refer to the struct field offsets as "unspecified".

If it can fail the assertion then it is definitely UB. The compiler is free to reorder the transmute to before the assert. It is free to make the transmutes machine code an invalid instruction (or something else) if the aligns differ.

Main point; Rust must be valid on the abstract machine. It is not enough to have matching machine code.

No.

fn invoke_ub() { unsafe { (0 as *mut i32).write(1) } }
fn main() {
    panic!();
    invoke_ub();
}

It is a UB if invoke_ub is called but it is a perfectly sound program if invoke_ub is never called. Compiler cannot reorder invoke_ub before the panic!.

Or a more practical example:

fn index<T>(slice: &[T], i: usize) -> &T {
    assert!(i < slice.len());
    unsafe { slice.get_unchecked(i) }
}

Obviously, it should be sound (this is simplified std implementation!).

1 Like

Thanks for the clarification, you are right here.

I would call it "has no constraints unless specified by other means".

I would also call "unspecified behavior" term totally useless because

  • Unspecified behavior always comes with a set of behaviors that the implementation can pick from.

    It doesn't mean you can rely on this set of behaviors staying the same. Like I said, even if it works now it might stop working in the future unless it's well defined.

  • We always need to give a bound on what "unspecified behavior" can do, or we might as well declare it UB.

    I absolutely agree here. Look: it may be a different theoretical concept but if we can't really use the differences between them, well, they are the same from practical point of view.

So, when you considering TwoInts layout and deriving there are "only two ways how the compiler could possibly do that" you might be right for now but in next compiler patch it may break. Don't do such assumptions.

Yes, this is sound because get_unchecked(i) may or may not be sound. Compiler can reason that this is not necessarily UB so it would preserve this assert!().

But in your invoke_ub example compiler could (not necessarily!) notice that there's no non-UB path of execution and replace invoke_ub call with std::hint::unreachable.
Now let's apply this logic to TwoInts snippet (read comments bottom to top)

struct TwoInt(u32, u32);
fn main() {
    // optimized as unused
    let x = TwoInt(0, 1);
    // the next line is unreachable so we deduce that assert always fails
    // let's replace it with panic!
    assert!(size_of::<TwoInt>() == 8 && align_of::<TwoInt>() == 4);
    // compiler can reason this is UB so it replaces it with unreachable
    let y: u64 = unsafe {std::mem::transmute(x)};
    // this is also unreachable so it's also optimized away
    println!("{}", y);
}

Congratulations, main is now reduced to single panic!. This doesn't happen at the moment but this can very well change in next rustc patch, who knows?

Congratulations, main is now reduced to single panic! .

I have been saying that this can happen, right now. No disagreements here. My comments have been with assuming compiler has chosen the particular layout in a particular compilation.

Consider this alternative form:

fn main() {
    if size_of::<TwoInt>() == 8 && align_of::<TwoInt>() == 4 {
        unsafe { transmute(TwoInt(0, 1)) };
        std::process::exit(2);
    } else {
        std::process::exit(1);
    }
}

I'm saying the program above either:

  1. Exits with code 1.
  2. Exits with code 2.

Do you agree this is true? (With respect to the current unsafe-code-guideline).

I wonder if the disagreement point is whether a type layout is always the same for a particular compilation.

fn main() {
   assert!(size_of::<TwoInt>() == size_of::<TwoInt>());
}

The assertion should always pass. A compiler can change a type layout even between two compilations with the same compiler, same flags, etc. of the same code. However, the type layout is fixed for a particular compilation. It is not the same as a layout is nondeterministic.

I would also call "unspecified behavior" term totally useless because

It doesn't mean you can rely on this set of behaviors staying the same. Like I said, even if it works now it might stop working in the future unless it's well defined .

The set is defined in unsafe-code-guidelines. The set of struct field offsets is a subset of integer tuples such that particular constraints are satisfied. The particular constraints are described in unsafe-code-guidelines, such as "no overlaps", "aligned" etc..

No, I don't.

fn main() {
    if size_of::<TwoInt>() == 8 && align_of::<TwoInt>() == 4 {
        unsafe { transmute(TwoInt(0, 1)) }; // UB is turned into unreachable, because why not?
        // std::process::exit(2); // optimized away, because unreachable
    } else {
        std::process::exit(1);
    }
}

And now it either

  • exits with 0 if condition is true and UB is detected
  • exits with 1 if condition is false
  • exits with 2 if condition is true and UB is not detected

Yes, according to the current unsafe-code-guidelines the UB will not be detected but nothing in the guidelines tells us that this behavior will be preserved in the future.

I believe this is one of these "de-facto standard" cases but I'm not a member of rust core team though.

And this is where (I believe) the source of misunderstanding lays in.

You're referring to unsafe-code-guidelines as to a formal and well defined spec while it's a best-effort document. Yes, some parts of it can be counted as a proper spec, while others not.
Layouts of repr(Rust) are the latter.

You are stating that you can rely on current description as if it was standard, but it is explicitly stated that it's subject to change. I'm trying to persuade you from doing that.

In my opinion, this the the same thing as reliance on #[doc(hidden)] pub items in third party crates. Especially, if they are annotated with /// Not public API, subject to change.

If the transmute were UB, then the whole branch would be optimized away, and the program would end up assuming that the condition in the branch is always false and exit with 1. That is, some bugs can still occur from that assumption being wrong (since it does not need to be checked, since UB is assumed unreachable).


That being said, I haven't looked at your example in detail, but know that

mem::transmute::<usize, Option<&Thing>>(arg)

is sound provided that Thing : Sized and arg % mem::align_of::<Thing>() == 0

2 Likes

You're referring to unsafe-code-guidelines as to a formal and well defined spec while it's a best-effort document. Yes, some parts of it can be counted as a proper spec, while others not.

Again, we cannot do a meaningful discussion if there is no definition. Of course, you can say "anything can change tomorrow" but that is trivial and not meaningful. We have to agree on what definition (the model) we use here. I have chosen unsafe-code-guideline because not it is a "proper spec" (yes, it is explicitly not), but the guideline seemed most useful for me.
Do you have an alternative model?

Yeah, absolutely. But I'm not saying "we shouldn't talk about it", I'm saying "we should use it as a mental model only and do not write any code relying on unstable docs".

Yes, it's totally useful stuff when you try to understand how things work today, but don't make you codebase depend on it because it can change out of blue.

No, and I don't need one. This model is fine but, please, do not base you code on it in hope it would stay the same. Well, unless you're rustc contributor, these guys are free to do whatever the hell they please.

No, and I don't need one. This model is fine but, please , do not base you code on it in hope it would stay the same.

I tend to agree with this "philosophy". Still, there are disagreements.

Yeah, absolutely. But I'm not saying "we shouldn't talk about it", I'm saying "we should use it as a mental model only and do not write any code relying on unstable docs".

But in my question

I explicitly stated we want to talk based on unsafe-code-guideline.

Compilation of future compiler or even current compiler is irrelevant here because we are talking about the particular model, namely unsafe-code-guideline. Or, more precisely, unsafe-code-guidelines#49f3c1235a0643fabdbe221bebe90579ddd06bdd.
That is, I'm saying that, even if current rustc disagrees with unsafe-code-guideline somewhat, it is irrelevant and what written in the unsafe-code-guideline is taken (well, not really in practice, but whatever).

No, and I don't need one. This model is fine but, please , do not base you code on it in hope it would stay the same

Well, unsafe-code-guideline is an official guideline. Meaning that unsafe code writers are meant to follow this when writing unsafe codes. Do you imply this is bad? No unsafe codes should be written until the guideline becomes a "specification"?

Edit: Also, unsafe-code-guidelines is maintained by a working group, which includes many core compiler contributes.

No way in hell this is irrelevant.

People don't use unsafe-code-guidelines to compile their code, they use rustc. If rustc and guidelines disagree - rustc is the one who wins here, given the disagreement is not because of a bug in compiler, of course. If docs are not up to date - this is the problem of docs, not compiler.

But yes if you want to put it as "theoretically speaking, based purely on docs" - sure, we can say that transmute is OK in this particular model. Just keep in mind that if a theoretical model is not on par with universe - this is not a problem of universe.

Yes, I am stating this is bad if followed mindlessly.
I already said it's OK to rely on de-facto standards, not everything gets documented in time.
It is also OK to rely on unstable stuff if there's no other, safer way around. Well, not OK but we have no other choice in this case.

But it's definitely not OK to rely on subject-to-change stuff when you have another choice. In TwoInts case we could simply annotate it with repr(C) or use bit shifts to convert it in safe code, it's hardly a difference in most of real-world cases, profiler is your best friend here.

Hell yeah!

Little remark: "No unsafe code should be based upon a unstable guideline"

I genuinely wish people would stop founding treir unsafe code on not-ready-yet documentation, see this for example. To be honest, all of this UB (AFAIK, tell me if I'm wrong) can be rewritten to not-UB without any performance regressions, there's always another way around (again, AFAIK, I have yet to see such a case).

Unfortunately, this wish is on pair with getting a real unicorn.

In rustc they can write whatever they want because rustc is de-facto Rust, it's always up to date, core folks would be ready should anything change. They are the ones who define Rust as it is after all, it's quite logical that they are the ones who write docs.

It doesn't make the doc magically stable. Some parts should not be relied upon unless you are ready to keep your finger on the pulse and react accordingly. The truth is: most people outside of rustc aren't.

I agree that "when you can do without unsafe, you should do without unsafe". Not only because of Rust's rather unspecified semantics, because it is error-prone to justify an unsafe code even if there is a specification (existence of C++ standard doesn't mean you can write valid codes).
TwoInt example is more of a thought-experiment and not meant to be a real code. Obviously, transmute is not needed there. Probably you can output the same assembly with safe code.

Now I'm understanding which point we have a disagreement.
The point is whether unsafe-code-guideline is more of an (a. rather stable to-be-specification) or a (b. rather unstable, "de-fact standard").
The working group should have an answer to this question. My opinion was just an educated guess.

It doesn't make the doc magically stable. Some parts should not be relied upon unless you are ready to keep your finger on the pulse and react accordingly. The truth is: most people outside of rustc aren't.

That is unfortunate. Hopefully, Rust will have stable specification for unsafe things. Rust's stability guarantee is one of Rust's great things.

Don't use transmute when you don't need to. You should just use

((self.0 & !1) as *const Data).as_ref()

Then it's very clearly correct, and still compiles to a NOP: Compiler Explorer

3 Likes

Transmuting between non-repr(C) types is UB

That's not entirely correct. The full statement is that ransmuting between types that don't have a guaranteed layout is relying on unspecified behavior (and I see @pcpthm has been arguing for that, too) .

Miri cannot detect when you are relying on unspecified behavior, that's why you do not get an error. This is different from what we usually refer to as UB, so while it is often a convenient short-hand to say that such transmutes are UB, it can also be confusing.

Do you have a link for that? We should likely fix the Nomicon then.

This seems to indicate an outdated view of what UB is; also see the UCG glossary. "Undefined Behavior", these days (since around C89) is not about things "not being defined" in the spec; it is about things explicitly being defined as never happening -- so that if they happen anyway, that's "causing undefined behavior". If the spec of the Abstract Machine doesn't define something, that's just an incomplete spec and not a license for anyone to do anything.

No, that is wrong. The compiler can never move possibly-UB-causing operations to before a check that might avoid executing them! if false { cause_ub(); } is a perfectly valid thing to do. Dead code never gets executed on the Abstract Machine, and thus there cannot be UB from dead code either.

The compiler is free to reorder UB around operations that will definitely terminate. This leads to the (in)famous "atemporal UB" in C:

printf("Hello\n"); // definitely terminates
cause_ub();

No matter what printf does, this program will definitely always eventually have UB, and that is en9ough to give the compiler license to move the UB "up" and make it happen earlier. But that is not the same as moving UB out of conditionals that guard against UB!

The UCG is gathering information on the de-facto situation of Rust and forming consensus within itself to make recommendations to the lang team. Nothing the UCG does or says has any normative power. The plan is, once we have some amount of material, to start writing RFCs that go through the usual process to eventually become normative.

So, if you view the UCG documents as pre-RFCs, you are not very wrong.


But, to come back to the original question by @CreepySkeleton (man you guys went on a tangent here :wink: ): that code seems fine to me. transmute::<Option<&Data>, usize> is indeed the same as as_ref() (well, at least my reading of this FCP is that we have lang team consensus on that), and the safety docs for that say

When calling this method, you have to ensure that if the pointer is non-NULL, then it is properly aligned, dereferencable (for the whole size of T ) and points to an initialized instance of T .

So as long as you an make sure that this is true whenever as_ref is called (and it seems like you do), I'd say you are good.

5 Likes