Is this UB or not?

axos88 · February 3, 2023, 10:23pm

I had an argument with someone on SO (rust - How do I create a global, mutable singleton? - Stack Overflow) about whether the following block leads to UB, assuming that the architecture guarantees main() runs before any other Rust code and that it is not interrupted until the end of the unsafe block.

Can someone explain whether it is UB or not, and why? Both opinions seem plausible, especially if dark magic optimization happens behind the scenes, but I can't figure it out.

#![feature(const_size_of_val)]
#![feature(const_ptr_write)]

static mut GLOBAL_LAZY_MUT: StructThatIsNotSyncNorSend = unsafe {
    // Copied from MaybeUninit::zeroed() with minor modifications, see below
    let mut u = MaybeUninit::uninit();

    let bytes = mem::size_of_val(&u);
    write_bytes(u.as_ptr() as *const u8 as *mut u8, 0xA5, bytes); //Trick the compiler check that verifies pointers and references are not null.

    u.assume_init()
};

(...)

fn main() {
    unsafe {
        let mut v = StructThatIsNotSyncNorSend::new();
        mem::swap(&mut GLOBAL_LAZY_MUT, &mut v);
        mem::forget(v);
    }
  
}

axos88 · February 3, 2023, 10:27pm

And StructThatIsNotSyncNorSend::new() does not access the global var either of course.

alice · February 3, 2023, 10:39pm

The main source of problems here is that StructThatIsNotSyncNorSend might be a type where filling it with A5 bytes is not allowed. For example, if it has a boolean somewhere, then your code produces a boolean with the value 0xA5, and booleans must have the value 0x00 or 0x01. If that happens, then your code definitely has UB.

How about wrapping the type of your static in MaybeUninit?

The part about the struct not being Send or Sync is less interesting. That only causes UB if you actually use the struct in an invalid way.
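As an illustration of the MaybeUninit suggestion, here is a minimal sketch (the helper names `poisoned_slot` and `checked_bool` are hypothetical, not from the thread). It assumes the point is that bytes inside a `MaybeUninit<bool>` carry no validity requirement, while a `bool` must only be produced from the byte 0x00 or 0x01:

```rust
use std::mem::MaybeUninit;

/// Fills a `MaybeUninit<bool>` with the byte 0xA5 and returns it.
/// No `bool` value is produced here, so no validity invariant applies yet.
fn poisoned_slot() -> MaybeUninit<bool> {
    let mut slot = MaybeUninit::<bool>::uninit();
    // SAFETY: `slot` is valid for a write of one byte (`size_of::<bool>() == 1`).
    unsafe { slot.as_mut_ptr().cast::<u8>().write(0xA5) };
    slot
}

/// Reads the stored byte as a `u8` (every initialized byte is a valid `u8`)
/// and only produces a `bool` for the two valid bit patterns.
///
/// # Safety
/// The byte inside `slot` must have been initialized.
unsafe fn checked_bool(slot: &MaybeUninit<bool>) -> Option<bool> {
    // SAFETY: the caller guarantees the byte is initialized.
    let byte = unsafe { slot.as_ptr().cast::<u8>().read() };
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None, // calling `assume_init` on this slot would be UB
    }
}
```

Calling `assume_init()` on the slot returned by `poisoned_slot` is the step that would produce an invalid `bool`; the sketch avoids it by inspecting the byte first.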

quinedot · February 3, 2023, 11:34pm

Side note: Other Rust code can call main, so if it matters you should have an atomic and panic when entering main a second time.
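A minimal sketch of such a guard, assuming an `AtomicBool` flag is enough (the names `MAIN_ENTERED`, `first_entry` and `guard_reentry` are made up for this example):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Set to `true` by the first caller of `first_entry`.
static MAIN_ENTERED: AtomicBool = AtomicBool::new(false);

/// Returns `true` only for the very first call in the process.
fn first_entry() -> bool {
    // `swap` returns the previous value, so only one caller ever sees `false`.
    !MAIN_ENTERED.swap(true, Ordering::SeqCst)
}

/// Intended to be the first statement of `main`; panics on re-entry.
fn guard_reentry() {
    assert!(first_entry(), "main was entered a second time");
}
```

With `guard_reentry()` as the first line of `main`, a second call to `main` panics before it can touch the global again.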

afetisov · February 4, 2023, 1:01am

The code is not UB because the value is overwritten before it is read

mem::swap obviously performs a read and a write on both of its operands. If reading your static is UB, you have just triggered it.

Also, as Alice says, your "fake initialization" trick doesn't work. There are no validity guarantees on raw pointers (they can be null), and there are way more guarantees on safe references and smart pointer types. More generally, any Rust type can have arbitrary bit patterns as "niches", meaning that those are never a valid representation of some value of that type. bool is a famous example (only 0 and 1 are acceptable), enums are another common source, but unsafe code can produce absolutely any kind of niches.

So if your 0xA5 pattern is a valid value of the type, you should've been able to construct it via normal means, and if it isn't, then you immediately get UB at that exact line. You could have just used mem::zeroed. Again, it's safe if all zeroes is a valid representation of the type, and immediate UB otherwise, but at least the compiler does some sanity checks for some common types which can never be used with mem::zeroed.

If your type is !Send and !Sync, then accessing it from a different thread is immediate undefined behaviour. This means it is confined to a single thread, so why not make it a thread_local!? You can wrap it in Cell or RefCell if you need to mutate it safely. 100% safe and sound, 0 unsafe code, and the performance hit is negligible to nonexistent.
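A minimal sketch of the thread_local! approach, using a hypothetical `State` struct as a stand-in for the real type (the helper names `record` and `recorded` are also invented here):

```rust
use std::cell::RefCell;

// Hypothetical stand-in for the struct discussed in the thread.
struct State {
    log: Vec<String>,
}

thread_local! {
    // Each thread lazily gets its own copy, so no `Send`/`Sync` bound is needed.
    static GLOBAL: RefCell<State> = RefCell::new(State { log: Vec::new() });
}

/// Appends a message to the current thread's state.
fn record(msg: &str) {
    GLOBAL.with(|g| g.borrow_mut().log.push(msg.to_owned()));
}

/// Number of messages recorded on the current thread.
fn recorded() -> usize {
    GLOBAL.with(|g| g.borrow().log.len())
}
```

Because each thread sees its own instance, a freshly spawned thread starts with an empty log regardless of what the main thread recorded.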

alice · February 4, 2023, 7:47am

The code is not UB because the value is overwritten before it is read

It's completely irrelevant whether you use it or not. The rules for UB include the following line:

Producing an invalid value, even in private fields and locals. "Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation.

Behavior considered undefined

So according to the rules, undefined behavior happens immediately when the invalid value is created. When you use it, or whether you use it at all, makes no difference for the UB rules.

You can read more about this phenomenon in this article: Why even unused data needs to be valid.

This is not actually true. For example, reading a raw pointer from two different threads in parallel is perfectly fine. You only get UB once you actually do something that is disallowed.

It's important to distinguish between validity and safety invariants. The requirement that a bool contains only 0x00 or 0x01 bytes is a validity invariant, so breaking it leads to UB immediately. The requirement that you don't access !Send+!Sync values from other threads is a safety invariant, so breaking it just makes it easier to trigger UB later.

steffahn · February 4, 2023, 9:13am

Assuming the context I understand from the linked stackoverflow post, this StructThatIsNotSyncNorSend would be containing a Vec<T> of some sort.

Looking into the definition of Vec<T>, it consists of a usize length, a usize capacity, a unit struct for the allocator, and a pointer of type Unique<T> (and thus a NonNull<T> pointer). So, as far as I can tell, with the current compiler and the current standard library it is not necessarily “language-UB” undefined behavior to initialize a Vec with a lot of 0xA5 bytes, since that would make the NonNull pointer certainly not null.

However, it’s still definitely “library-UB” undefined behavior to initialize a Vec with an invalid pointer. I.e. the standard library is free to change in future compiler versions in ways that turn this code into actual (“language-UB”) undefined behavior.

In fact, it seems very likely with goals such as exploring more niches from aligned pointers, that at least an improperly aligned pointer value in a Vec will become language-UB sooner or later.

I mean… in the unlikely scenario of only caring about the compilation results of a particular piece of code under a particular compiler version, such considerations for future changes might be irrelevant.

In any case, I do not see any reason to prefer this at least very-close-to-UB approach over properly using a MaybeUninit value. (Or if possible, as in case of Vecs, perhaps even just pre-initializing with an empty dummy Vec::new.)
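To sketch the "empty dummy" idea under some assumptions: the struct below is a hypothetical `Registry` holding a `Vec<u32>`, and the program is assumed to be single-threaded while the global is written. `Vec::new` is a `const fn` that does not allocate, so the static starts out as a fully valid value:

```rust
use std::ptr::{addr_of, addr_of_mut};

// Hypothetical stand-in for the struct from the thread.
struct Registry {
    items: Vec<u32>,
}

impl Registry {
    // Usable in a static initializer because `Vec::new` is `const`.
    const fn empty() -> Self {
        Registry { items: Vec::new() }
    }
}

static mut GLOBAL: Registry = Registry::empty();

/// # Safety
/// No other reference to `GLOBAL` may exist while this runs
/// (for example: call it once at the start of a single-threaded `main`).
unsafe fn init_global(items: Vec<u32>) {
    // Plain assignment drops the old (empty but valid) value; no swap or forget needed.
    unsafe { *addr_of_mut!(GLOBAL) = Registry { items } };
}

/// # Safety
/// `GLOBAL` must not be written to while the returned length is computed.
unsafe fn global_len() -> usize {
    let registry: &Registry = unsafe { &*addr_of!(GLOBAL) };
    registry.items.len()
}
```

The accesses still need unsafe because of `static mut`, but at no point does the program hold a value with an invalid bit pattern.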

Of course, even more preferable would always be a solution without using unsafe at all, unless there are some significant problems – e.g. poor performance – with all and even the best possible safe solutions.

axos88 · February 7, 2023, 2:25pm

@alice, @quinedot, @afetisov

Thanks for your answers, you raise good points, let me reflect on them.

Sure, we can add not calling main again to the list of restrictions, I never thought that anyone would ever want to do that, but you are right that it is a valid corner case.

The whole point would be to avoid wrapping it in MaybeUninit, because once the app starts up it is guaranteed to be initialized.

Although writing a value of 0xA5 into the memory layout is not valid for, say, a boolean, if that region of memory is never actually read and interpreted as a boolean (the value is replaced before it is used that way) I have a hard time understanding why it triggers UB.

mem::swap does read the memory region since it copies the data bytes around, but NEVER interprets the read data as a boolean, a Vec, or whatever is there. It basically translates to a call to libc::memcpy.

If we are worried that mem::swap causes UB, we could argue the same way about allocating memory on the heap for any struct. When malloc() returns, the memory region contains random data, which we immediately overwrite with the contents of our struct. So for a split-nanosecond there the data we point to may have other than 0 or 1 for a boolean, and a negative value for a vec length. But that data is never interpreted that way, and it is overwritten before anyone has a chance to do so, the same way that it happens here.

The whole point is to replace the 0xA5-s before anyone else has a chance to interpret the data region as the desired type with a valid instance (which cannot be built at compile time, because say it depends on an environment variable or something).
The issue is that I cannot use mem::zeroed() because the compiler won't let me. Obviously the compiler has no way to understand or verify the assumption that the value will be overwritten before it is accessed, which I as a programmer can do, which is exactly the point of unsafe code.

I'm not saying this is not risky, I'm just saying that by itself this does not trigger UB, as long as we adhere to some (pretty strict) rules.

But I am open to arguments on why it triggers UB. For example, a boolean with a value of 2 may trigger UB, because the compiler assumes a value of 0 or 1. When the value is pattern matched, the compiler assumes that by having a branch for the values 0 and 1 the match is exhaustive, but in reality our value will match neither. That triggers UB, because the compiler might optimize our code in such a way that there is a return only in each branch, and our execution path may fall through past the end of our function and execute whatever data is there in memory as instructions.

But as you see, even in this case the UB is triggered only when the value 2 is interpreted as a boolean. Just by writing 2 to the value we are not triggering UB, IF we can guarantee nobody reads and interprets that value before we overwrite it with a 0 or 1, including possibly threads executing in parallel.

You are right, the type should actually be Send and Sync, not !Send and !Sync. I just wanted to highlight that it will only be accessed from a single thread (well just wanted to formalize the requirement that no parallel thread can access it before main has a chance to initialize it), but I kinda did the exact opposite.

2e71828 · February 7, 2023, 2:58pm

This is not necessarily true. Because the compiler knows the type that it's working with, it's free to make optimizations based on the type's layout. For example, if you have a platform where bit sets are faster than copying whole bytes around for some reason, the compiler is free to just swap the low-order bit inside mem::swap::<bool> and leave the rest alone. So, your invalid initial value remains invalid even after the swap.

Admittedly, this is a bit of a contrived example as I don't know of a platform like this, but it serves to illustrate the sort of thing that might happen with more complicated types as well.

In Rust, std::alloc::alloc() returns *mut u8, and it's then your responsibility to fill it properly before casting it to some other type T.
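A minimal sketch of that order of operations (allocate, write, and only then read as T), using a hypothetical `Flags` struct whose `bool` field has invalid bit patterns; the function name `roundtrip_on_heap` is also made up:

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

// Hypothetical struct with a field (`bool`) that has invalid bit patterns.
#[derive(Debug, PartialEq)]
struct Flags {
    ready: bool,
    count: u32,
}

/// Moves `value` into a fresh heap allocation and back out again.
fn roundtrip_on_heap(value: Flags) -> Flags {
    let layout = Layout::new::<Flags>();
    unsafe {
        // `alloc` hands back an untyped `*mut u8`; the bytes behind it are uninitialized.
        let raw: *mut u8 = alloc(layout);
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        // Casting the pointer produces no `Flags` value, so nothing is invalid yet.
        let typed = raw.cast::<Flags>();
        // `write` initializes the memory without reading or dropping the old bytes.
        typed.write(value);
        // Only now is it sound to read a `Flags` out of the allocation.
        let out = typed.read();
        dealloc(raw, layout);
        out
    }
}
```

The garbage bytes only ever exist behind a raw pointer; a `Flags` value is produced only after `write` has stored a valid one.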

jendrikw · February 7, 2023, 2:59pm

@axos88 Did you read this?

steffahn · February 7, 2023, 3:25pm

That’s an argument I don’t understand. It’s a static mut already anyways, so every access needs unsafe code either way. Also it’s a static, so destructors don’t matter. There’s literally no downside to using a MaybeUninit, except perhaps that you have to explicitly use the MaybeUninit API. But if there’s so many places where the static mut is unsafely accessed the convenience of being more concise than explicit MaybeUninit API usage allows is a concern, feel free to write a wrapper type implementing Deref and maybe even DerefMut.

If all accesses after initialization are read-only and safe anyways (I suppose this assumes a setting where we would not have any non-Sync types involved, otherwise actual “safety” is hard to achieve), there’s even a lot of potential benefit to switching to a non-mutable static with interior mutability. Use a wrapper around UnsafeCell<MaybeUninit<…>> and give it a Deref implementation, and you’re golden. Or even accept the tiny overhead of using safe existing alternatives like once_cell.
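As a sketch of the safe alternative: the example below uses `std::sync::OnceLock`, which is an assumption on two counts, since it requires Rust 1.70 or newer and it is the standard library counterpart of the once_cell crate rather than the crate itself. It also requires the stored type to be Send + Sync. `Config`, `init_config` and `config` are hypothetical names:

```rust
use std::sync::OnceLock;

// Hypothetical configuration that can only be computed at run time.
#[derive(Debug)]
struct Config {
    name: String,
}

// A plain (non-`mut`) static; the cell handles the one-time write internally.
static CONFIG: OnceLock<Config> = OnceLock::new();

/// Stores the value on the first call; later calls leave the first value in place.
/// Returns whether this call performed the initialization.
fn init_config(name: &str) -> bool {
    CONFIG.set(Config { name: name.to_owned() }).is_ok()
}

/// Returns the config, or `None` if `init_config` has not run yet.
fn config() -> Option<&'static Config> {
    CONFIG.get()
}
```

No unsafe code is involved, and reading before initialization yields `None` instead of undefined behavior.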

afetisov · February 7, 2023, 4:08pm

Because constructing an invalid bool is UB, regardless whether you read it or not. You construct invalid representations when you call MaybeUninit::assume_init on invalidly initialized data. It doesn't matter what happens in the rest of the program, at that point you have UB.

Nobody promised you that, that's just your assumptions. The compiler is free to implement mem::swap in whatever way it sees fit. Moreover, the compiler is free to derive logical conclusions from whatever facts you tell it, including "this operation is not UB", "this condition is true in the branch" and "this value conforms to the specified type". This means that even if mem::swap specifically works as you expect, you still get UB and can get miscompilations in a different part of program, because the compiler has derived incorrect conclusions from your invalid operations.

As @2e71828 said, allocation returns an untyped pointer (void * in C terminology, *mut u8 in Rust). There are no assumptions on what data is stored there. You must explicitly fully initialize the contents before you can safely convert it to a different type.

But more importantly, your basic mental model of the compiler is incorrect. "split-nanosecond", "overwritten before anyone has a chance to do so" --- do you think you're playing catch with the compiler? Or that you should and could trick the hardware into overlooking your shenanigans? That's not how it works. If your code is incorrect "for a split-nanosecond", it's just incorrect period, and the behaviour of your entire program is Undefined.

Your "incorrect for a split-nanosecond" means "broken for thousands of users" in practice, because those supposedly "rare" events happen way more often than you think, particularly in multi-threaded code. Even if it wasn't, the compiler doesn't care what you do at each moment. The point of optimizing compilers is that they take your entire program and rewrite it into an entirely different program, with the only restriction being that it must have exactly the same observable behaviour. Instruction timing isn't observable.

If the compiler doesn't let you, you most definitely are introducing Undefined Behaviour. You're not outsmarting it, you're introducing a very nasty, dangerous and hard-to-fix bug.

Unsafe code must obey exactly the same rules as safe code. You don't get to "turn off the borrow-checker" or "turn off the type system", you just get the capability to perform some new dangerous operations, and it's your responsibility to ensure they uphold the same rules as safe code.

Coding-Badly · February 7, 2023, 4:08pm

AVR processors have bit manipulation instructions for low memory addresses.

alice · February 7, 2023, 5:19pm

In my opinion, there is no real value in that. Just do this:

static mut GLOBAL_LAZY_MUT: MaybeUninit<StructThatIsNotSyncNorSend> = MaybeUninit::uninit();

fn get_global() -> &'static StructThatIsNotSyncNorSend {
    // SAFETY: We initialize it in `main` before we call `get_global`.
    unsafe {
        let ptr = GLOBAL_LAZY_MUT.as_ptr();
        &*ptr
    }
}

fn main() {
    // initialize GLOBAL_LAZY_MUT to a valid value
}

Calling get_global will give you a nice and easily usable reference without a MaybeUninit in sight. As long as you make sure to initialize the global before using it, this is completely correct.
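For completeness, a self-contained sketch of the same pattern with the initialization filled in. `Settings` is a hypothetical stand-in for StructThatIsNotSyncNorSend, `init_global` is an invented helper, and both functions are marked `unsafe` here to make the ordering requirement explicit:

```rust
use std::mem::MaybeUninit;
use std::ptr::{addr_of, addr_of_mut};

// Hypothetical stand-in for `StructThatIsNotSyncNorSend`.
struct Settings {
    verbose: bool,
    names: Vec<String>,
}

static mut GLOBAL: MaybeUninit<Settings> = MaybeUninit::uninit();

/// # Safety
/// Call exactly once, before any call to `get_global`, with no other threads running.
unsafe fn init_global(value: Settings) {
    // A raw-pointer `write` stores the value without reading or dropping the old bytes.
    unsafe { addr_of_mut!(GLOBAL).write(MaybeUninit::new(value)) };
}

/// # Safety
/// `init_global` must have completed, and nothing may mutate `GLOBAL` afterwards.
unsafe fn get_global() -> &'static Settings {
    // `MaybeUninit<T>` has the same layout as `T`, so the pointer cast is allowed.
    unsafe { &*addr_of!(GLOBAL).cast::<Settings>() }
}
```

A `Settings` value is only ever produced from memory that `init_global` has already filled with a valid one, so no invalid value exists at any point.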

No, it is not the same at all. Your code creates an actual StructThatIsNotSyncNorSend that is invalid, whereas the malloc example only creates a *mut StructThatIsNotSyncNorSend. The fact that the invalid memory is behind a raw pointer is very important for the UB rules, because raw pointers are not required to point at a valid value.

No, your code that creates an invalid StructThatIsNotSyncNorSend is unambiguously UB. The rules for UB do not care about whether you use the invalid value. End of story. You might want the UB rules to be different, or maybe you don't think they make sense, but that doesn't change the rules. The rules are what they are.

I want to point out that you do not need an example of how something might be optimized incorrectly to say that something is UB. In fact, there are some things that are UB, but which are never miscompiled in practice. (Your situation is likely one of them.) However, that cannot be used to argue that it isn't UB because the relationship between UB and miscompiled is only a one-way relation:

CORRECT: If something is miscompiled, then it has UB. (or there's a compiler bug)
WRONG: If something has UB, then it must be possible for it to get miscompiled.

Due to this one-wayness, discussing optimizations can only ever be used to conclude that something is UB, and not to conclude that something is not UB. To conclude that something is not UB, you must refer to the rules for UB instead.

axos88 · February 8, 2023, 10:10am

TLDR: But as I finished writing up my answers, I think I finally locked in on what the root of the debate here is:

Should a value be considered produced if it's ephemeral and it's replaced before anyone else has a chance to read it? Can a value unknown to all influence the behaviour of the application?
In a more real-world - albeit contrived - example, can me writing down the base64 encoded private key for Satoshi Nakamoto's wallet and hiding it in a backpack in the woods impact your life, IF you don't have knowledge about it? Even if you do have knowledge about it, it doesn't necessarily impact your life, because you may or may not go searching for it, depending on your conscience, but if you don't know about it, it CANNOT impact your life.

@jendrikw

No I have not, and that was an interesting read, thank you. But note that even in the article the thing that makes the code UB is that the code may be optimized and rearranged in a way that the boolean is read and interpreted. In this case the whole argument is that it is overwritten before that can happen. No code accessing the value written after the assignment can be moved before that (data barrier).
I interpret the article as basically saying that although UB is triggered when an invalid value is actually read and interpreted, you might be wrong thinking that some value is not used, because the compiler can rearrange the code in a way that you don't expect, but that doesn't seem to apply to this particular case.

@steffahn

Sure, MaybeUninit can be used in this case, but that does present an (arguably small) performance impact, since the MaybeUninit needs to be assume_init-ed on each call, and assume_init doesn't appear to be a const fn. The impact is even worse when writing an accessor to hide this API. Not sure if .assume_init() can in reality be optimized away so that it would be zero-cost in practice or not, but currently casts cannot be const fns, I think.

There’s literally no downside to using a MaybeUninit, except perhaps that you have to explicitly use the MaybeUninit API. But if there’s so many places where the static mut is unsafely accessed the convenience of being more concise than explicit MaybeUninit API usage allows is a concern, feel free to write a wrapper type implementing Deref and maybe even DerefMut.

While the performance impact is arguably small, and it might be a case of premature optimization, which is the source of all evil, I'm also interested in the theoretical side of this. The original snippet was written as part of an embedded, real-time project where CPU is a scarce resource and every CPU cycle counted. Single CPU, single-threaded operation.

If all accesses after initialization are read-only and safe anyways

Yes, all accesses need to be read-only, that's for sure.

@afetisov

Because constructing an invalid bool is UB, regardless whether you read it or not.

That's a statement, not an argument. And actually the question I'm looking for an answer for. Why would constructing an invalid bool be UB, IF and only IF I can guarantee it won't be read before it's overwritten with a valid value?

Nobody promised you that, that's just your assumptions. The compiler is free to implement mem::swap in whatever way it sees fit.

You are absolutely right. Let's swap out mem::swap with libc::memcpy-s to be on the safe side.

This means that even if mem::swap specifically works as you expect, you still get UB and can get miscompilations in a different part of program, because the compiler has derived incorrect conclusions from your invalid operations.

For example?

But more importantly, your basic mental model of the compiler is incorrect. "split-nanosecond", "overwritten before anyone has a chance to do so" --- do you think you're playing catch with the compiler? Or that you should and could trick the hardware into overlooking your shenanigans? That's not how it works. If your code is incorrect "for a split-nanosecond", it's just incorrect period, and the behaviour of your entire program is Undefined.

That was unnecessarily emotional. The hardware executing the resulting code does not have the notion of types or casts. If no code is generated with the wrong assumptions (do this if there is a 0 in this 8-bit space, and do this if there is a 1 in that 8-bit space, with no other cases, is a good example), it's not going to do anything undefined. I'm not sure what you mean by playing catch with the compiler, but I'm pretty sure that the hardware executes instructions in sequence (assuming single-CPU execution). When calling calloc for example, you will have random data in the memory allocated by the allocator, but that is zeroed out before anyone has a chance to access it (calloc zeroes it out before returning). So it's a similar, but somewhat different case. It has non-zero data for a split nanosecond, but nobody cares, because it's zeroed out before any other part of the codebase "knows" about its existence. Same case here, the invalid value is overwritten with a valid one before any other part of the codebase "knows" about its existence. Any code looking at the variable sees a valid value there.

and rewrite it into an entirely different program with the only restriction being that it must have exactly the same observable behaviour.

That includes the restriction that reads of a value cannot be moved before writes to it. So all reads need to happen after it is initialized. Note: We are single-threaded, but that doesn't even matter, because even in a multithreaded environment, when main() is invoked it is running a single thread only, and it will fork off later on into multiple threads.

Unsafe code must obey exactly the same rules as safe code. You don't get to "turn off the borrow-checker" or "turn off the type system", you just get the capability to perform some new dangerous operations, and it's your responsibility to ensure they uphold the same rules as safe code.

Exactly. That's why the snippet comes with a big fat warning and strict rules that the program needs to adhere to in order to guarantee that any read of the memory region already sees initialized data.

@Coding-Badly

Yep, I was just trying to jog my memory about AVR processors having something like that.

@alice

The fact that the invalid memory is behind a raw pointer is very important for the UB rules, because raw pointers are not required to point at a valid value.

I think that is considered UB because safe code may be optimized in a way that it accesses that invalid region, while *mut can only be accessed in unsafe code, where the compiler can rely on the programmer to guarantee that access only happens if the memory is verified to be valid. (Also I think unsafe code is not rearranged by the compiler?). Or simply that by doing that you will most likely shoot yourself in the leg, because in most cases it cannot be guaranteed that no accesses happen to that variable, even in safe code.

However, that cannot be used to argue that it isn't UB because the relationship between UB and miscompiled is only a one-way relation:

I agree, however I'm starting to have the feeling we are conflating two things. UB-suspicious code as in badly written code itself, and the actual UB, that happens when the hardware is running the UB-suspicious binary, and triggering a code path where it starts exhibiting undefined behaviour ie. doing random stuff, that is not what we would expect by looking at the code.

you must refer to the rules for UB instead.

The rule you must be referring to is this: Producing an invalid value, even in private fields and locals. It is clearly documented, and I guess we all agree on why it's bad.


steffahn · February 8, 2023, 10:19am

No there’s no performance impact, unless we are talking about debug mode (which is usually not a reasonable thing to worry about); otherwise the call will be inlined and optimized away. By the way, the question whether a function is or is not const fn only influences whether a function can be executed at compile time (e.g. to initialize a static variable, or a const value), it does not have any influence over how much or little overhead the function has if it’s called at run-time.
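To illustrate the point about const fn, here is a small sketch with a made-up `checksum` function: the same function is evaluated by the compiler when it initializes a static and called like any ordinary function at run time.

```rust
/// A toy checksum; `const fn` only means it *may* be evaluated at compile time.
const fn checksum(bytes: &[u8]) -> u32 {
    let mut sum = 0u32;
    let mut i = 0;
    // `for` loops are not allowed in `const fn`, so iterate with `while`.
    while i < bytes.len() {
        sum = sum.wrapping_mul(31).wrapping_add(bytes[i] as u32);
        i += 1;
    }
    sum
}

// Evaluated by the compiler while building the static.
static AT_COMPILE_TIME: u32 = checksum(b"abc");

/// The same function called at run time; it compiles like any other function.
fn at_run_time(input: &[u8]) -> u32 {
    checksum(input)
}
```

Whether a function is `const` says nothing about its run-time cost; a non-const function such as `assume_init` was at the time can still be inlined away completely.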

No, the performance impact is zero, and the optimization is not premature but non-existent.

alice · February 8, 2023, 10:35am

This appears to be the crux of the issue. Ultimately, the truth is that, yes, according to how the rules are written today, the value is considered to be produced, even if you replace it before anyone else has a chance to read it.

I understand that lots of people find this to be counter-intuitive, but that's how the rules are written.

The question of "why are the rules written that way?" is an interesting one. The article Why even unused data needs to be valid that I posted earlier is an attempt to answer this question of why.

Now, you make the point that there's a difference between your code and the example from the article:

The article's example uses the boolean in dead code, which could be rearranged so that it is no longer dead code.
Your example never uses it anywhere, even in dead code.

However, the article talks about this point in the last paragraph. Ultimately, writing down a set of UB rules where there's a difference between "used only in dead code" and "never used, not even in dead code" runs into a lot of problems. The authors of the UB rules have decided that attempting to distinguish between these two things is futile.

And that is why the rules also consider your example to be UB: it is too difficult to write a set of rules where your code is allowed, without also making the code from the article allowed, and we do not want the article's example to be allowed because allowing it prevents us from making certain optimizations that we want to make.

steffahn · February 8, 2023, 10:43am

Note that this sounds like an inaccurate understanding of what UB really is. Undefined behavior does not mean “random stuff, that is not what we would expect by looking at the code”. In fact, many cases of undefined behavior manifest in the program doing “stuff that is what we would expect by looking at the code”, which is one of the large problems why UB can be so tricky.

The problem is that unexpected things happening is an option, and that UB cannot be contained. I.e. on the one hand, it’s the case that the behavior can possibly be unexpected very subtly (but significantly), and also it can always switch to the “random stuff” kind of behavior with future compiler versions or seemingly unrelated changes to the code. And on the other hand, it’s nothing where you can write a unit-test or look at the assembly of a single function, to draw the conclusion that everything seems to be “as expected” and then expect that the UB was properly dealt with. Instead, by definition, there is no way to get rid of it; once UB happens, that’s all the program does at that point. There is no further defined behavior at any later point in execution, everything stays undefined.

And e.g. if you call a function whose assembly looks fine to you, the optimizer might re-analyze the function in the context it’s called (aka inlining) and then the UB might manifest in a bad way. Maybe there’s a way out if you compile the function into an object file, then analyze the assembly to make sure the behavior is as expected, and then link to the object file in a way that the compiler can never review the original source code again at the call-site. Though (due to the necessary review step), this approach would be equivalent to writing the assembly yourself; and even then it might still be more straightforward to avoid the UB in order to get the compiler guarantees for the assembly’s correctness and do manual inspection only for performance analysis, relying on optimizations rather than on specific “reasonable” manifestations of undefined behavior.

steffahn · February 8, 2023, 10:58am

Because of the vibe I get from this wording, I want to emphasize this should not be read as any endorsement for an interpretation along the lines of “those lazy rule-writers just didn’t document it well enough, I know better, this case can’t be UB, since I cannot think of any optimization / compiler transformation / etc… that could break it and I don’t see any reason why it should be disallowed”. If it’s defined to be undefined behavior, it truly is. The “UB rules” that are talked about here are not some kind of model of a more complicated underlying truth; instead they are the language. There’s nothing between the rules for undefined behavior, and what truly is or isn’t undefined behavior.

In view of future compilers, the set of optimizations that could mess with your code will likely include optimizations that have not even been invented yet, so there’s no point in trying to reason about all possible compiler optimizations in the first place. The only way to avoid this problem, i.e. the only way to legitimately work with a new, more complicated, more refined (i.e. more lenient; less code is defined to be UB; more behavior is actually defined) set of rules is by getting the rules defining the programming language officially changed, so that the compiler promises to always adhere to the changed rules for future compatibility.

axos88 · February 8, 2023, 11:09am

The difference between the two cases is that in the article the invalid value has been produced AND moved/copied when the function in question has been called, so the value can arguably be considered used even by that. If we take the AVR example, the call might have been made by just copying the lowest bit to the correct place according to the ABI in question, which means the value seen by the function won't be the same as the one we produced. That is the manifested "random stuff": the behaviour of the program depends on whether the invalid value is odd or not.

In my case the invalid value is not used at all - or at least I don't see anything that could be considered usage. With one exception, the u.assume_init() in the initialization block, which could and should be replaced by another piece of code producing the invalid value.

@steffahn

Yeah, random could mean exactly what we expect too. A better expression would have been potentially different.

once UB happens, that’s all the program does at that point.

That's such a scary thing to think about, but is absolutely correct. However note that even when phrased this way the actual UB "needs to happen", and is not something that is globally there or not based on how the code is written. By that I mean that the execution path needs to step on the place where the UB-suspicious code is, and once that is reached the program is considered to exhibit undefined behaviour from that point on, even if in that particular case nothing bad happens, because it cannot be guaranteed not to start exhibiting UB in the future.

I believe basically we've reached a point where we agree on most stuff.

The only thing that still hangs in the air is what the meaning of produced/used is. And I believe that is not as trivial as one might think at first glance, and it would be worth thinking about and giving an exact definition. I still believe that writing a bunch of bytes to a memory area that is overwritten without access to it cannot influence the behaviour of the program, and shouldn't be considered a produced value at all.

Guaranteeing that it is not accessed is another thing and undoubtedly a can of worms - as you say, we cannot rely on the current set of optimizations because new ones will possibly/likely/surely be added later.
