I have a question inspired by Ofek Shilon's CppCon talk. Can rustc assume that the pointer x
passed to
fn f(x: &Cell<i32>)
does not escape, since the function is quantified over all lifetimes?
I think GhostCell
and LocalKey
take advantage of something like this. Does the compiler actually use this to elide loads (if it's sound, which it seems like it should be)?
Rust can assume the reference doesn't escape -- last longer than the call -- yes.
Other data can escape the call though, including the address of the Cell
during the call, so I'm not sure what you can and can't rely on (it's probably not formally decided). For example, you could store the address globally, and then have another (unsafe
) function that accesses the Cell
with the safety preconditions:
&Cell
you last stored is still valid
If you call this unsafe
function having satisfied the preconditions, is it valid for it to reconstruct the &Cell<i32>
and rely on the value of the i32
? If so, escaping still occurs.
Example I threw together out of curiosity. (The fact that it seems to work doesn't mean it's guaranteed to though.)
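For readers without the linked example, here is a minimal sketch of the kind of escape being described. It is not the linked code: the names STASH, remember, poke and escape_demo are hypothetical, and whether this pattern is guaranteed to be well-defined is exactly what the thread leaves open. It only shows that the address can outlive the call even though the reference cannot.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Hypothetical per-thread stash for the address of the last Cell seen.
    static STASH: Cell<*const Cell<i32>> = Cell::new(ptr::null());
}

// The reference `x` cannot outlive this call, but its address can be copied out.
fn remember(x: &Cell<i32>) {
    STASH.with(|s| s.set(x as *const Cell<i32>));
}

/// # Safety
/// The `Cell` whose address was last passed to `remember` on this thread
/// must still be live.
unsafe fn poke(v: i32) {
    let p = STASH.with(|s| s.get());
    // Reconstruct a `&Cell<i32>` from the stashed address (assumed still valid).
    let cell: &Cell<i32> = unsafe { &*p };
    cell.set(v);
}

// Reads the Cell before and after the stashed address is used to write to it.
fn escape_demo() -> (i32, i32) {
    let c = Cell::new(1);
    remember(&c);
    let before = c.get();
    // `c` is still live here, so the stated precondition holds.
    unsafe { poke(2) };
    (before, c.get())
}
```

In a typical build, calling escape_demo() observes the second read picking up the write made through the stashed pointer, which is consistent with the caveat above that "seems to work" is not the same as "guaranteed".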
I can't imagine a way for the escape to happen that doesn't involve unsafe
, but I'm also not sure if the compiler would be able to exploit that here or not.
Do such considerations ever result in eliding loads? I don't know. That might even be more of an LLVM question. Perhaps the better question is "do the LLVM equivalents of pure
, const
, noescape
get emitted?"
There are few guaranteed optimizations, and that question (or questions about what escaping considerations would be considered valid) might be better suited for compiler devs on Zulip or perhaps on IRLO.
As far as I'm aware, lifetime annotations in types should be irrelevant for the actual program behavior, and they just exist for the borrow checking and type checking to be feasible and sound. (Don't quote me on that though, I'm not sure where exactly I got that from. In my opinion, it's sensible though, considering lifetimes being "erased" during compilation.)
Regarding the code testing this in Miri: you might as well simply transmute the &Cell
into &'static Cell
, no need to use a pointer; in my mind, using references directly in the Miri test seems an even stronger indicator that this kind of code is not UB.
Maybe that blog post:
The compiler erases lifetime information prior to monomorphization and code generation, meaning that the generated code simply has no way to depend on lifetimes. That could be changed, but we’d have to work hard to avoid code blowup by generating separate copies of code for each lifetime it was used within, assuming that the behavior didn’t change.
IOW: it's not something that is part of Rust language specification, but it's true for the existing compiler and while the ability to use lifetimes for code generation has pluses and minuses, currently the decision is not to use lifetimes for that.
I spent awhile on dead-ends exploring how maybe a fn foo(&Cell<i32>) -> i32
could imply noescape
; I'm convinced now that's impossible from the API alone (though it could be an optimization based on the body). The exploration is a dead-end because there's no firm connection between the validity of a raw pointer and the lifetime of the reference you created it from.
However, "lifetimes don't affect program behavior" is implicitly a statement about well-defined programs. The question then becomes, are the preconditions I listed adequate to avoid UB? If not, the behavior cannot be guaranteed, naturally.
If we take this documentation to be normative, my program is UB -- references (not raw pointers) have been used to access the underlying value, before I ever use the stored raw pointer. [1] The fact that a Cell
is involved is also irrelevant in my reading (the memory of the Cell
is not itself within an UnsafeCell
, notionally, even though it's the same memory span). [2] Also nothing on that page mentions UnsafeCell
or interior mutability at all.
That said, I think it could be rewritten to conform to that documentation, i.e., perhaps be well-defined.
I think transmuting to &'static
is different from *const
in that having a &
to an invalid value is UB even in unsafe
code; if you create a &'static T
, unsafe
code could never soundly temporarily put some invalid bit pattern in that memory.
(I spent almost no time thinking this one through more thoroughly though.)
Ironically, in no small part due to steps I took to be explicit about intention, making the reference lifetime valid for every call to whateva
(even though the call to some_func
was a reborrow anyway). ↩︎
I haven't thought about or looked for citations to see if it could be relevant for memory that is notionally within an UnsafeCell
, like the i32
itself. ↩︎
I assume you mean
- The result of casting a reference to a pointer is valid for as long as the underlying object is live and no reference (just raw pointers) is used to access the same memory.
Yeah, that sounds off. Sounds correct to me when talking about mutable references, but for shared references and read-only (and in this context, I’d also count turning a *const Cell<…>
into &Cell<…>
and then writing to it through the Cell
API as “read-only”) pointer access, this statement is probably[1] wrong / badly formulated.
AFAIR, depending on what you would or wouldn’t count as “having” (and what you consider “invalid”), that’s not set in stone yet, but I also agree that an approach using *const
is more likely not to be UB. I mentioned the &'static
-transmuting approach only, because in my experience Miri more reliably reports UB for code involving references than raw pointers.
not to say obviously ↩︎
Ah, I see.
I agree the ptr
documentation is lamentably poor, especially considering how it's innately unsafe
to actually make use of them. I also found this for example:
This does not take ownership of the original allocation and requires no resource management later, but you must not use the pointer after its lifetime.
But pointers don't have lifetimes, and the references in the example have lifetimes that end immediately. Probably they meant "value liveness scope" of some sort.
This conversation only solidifies my stance that I can't guarantee it works by the way. It is all just so horribly underspecified.
I think it might have been this thread I had in mind for this statement, btw
Thanks.
I guess that would be sufficient for &'static Cell
as it's !Sync
so you can't observe it from thread B while it's invalid in the unsafe
of a thread A, or so. (Didn't think deeply on it.)
This makes sense but I am not happy about it. It really seems that calling a function f
quantified over all lifetimes with a reference x
shouldn't enable modifications to the memory pointed to by x
after f
returns, but that is evidently not the case.
For a type like &i32
there is a strong immutability guarantee, so the compiler could be smart about reading it.
However, Cell
contains UnsafeCell
inside, which in Rust is a type for "anything could happen", so in this case I would not expect the compiler to elide any reads.
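A small sketch of why the two cases differ, using only safe code. The helper names (read_twice_cell, read_twice_ref, cell_demo, ref_demo) are made up for illustration. With &Cell<i32>, a callback holding another shared reference can change the value between two reads, so the second read cannot simply reuse the first. With &i32, safe code has no way to do that while the shared borrow is live.

```rust
use std::cell::Cell;

// Reads the Cell, runs an arbitrary callback, then reads again.
fn read_twice_cell(x: &Cell<i32>, g: impl Fn()) -> (i32, i32) {
    let a = x.get();
    g(); // may write to the same Cell through another shared reference
    let b = x.get();
    (a, b)
}

// Same shape for a plain shared reference.
fn read_twice_ref(x: &i32, g: impl Fn()) -> (i32, i32) {
    let a = *x;
    g(); // safe code cannot change `*x` while this shared borrow is live
    let b = *x;
    (a, b)
}

fn cell_demo() -> (i32, i32) {
    let c = Cell::new(1);
    // The closure captures `&c` and mutates it via the Cell API.
    read_twice_cell(&c, || c.set(2))
}

fn ref_demo() -> (i32, i32) {
    let n = 1;
    read_twice_ref(&n, || {})
}
```

Here cell_demo() returns (1, 2) and ref_demo() returns (1, 1). Whether the optimizer actually merges the two loads in the &i32 case is a separate question that this sketch does not answer.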
I think there's a misunderstanding about how lifetimes work in here somewhere.
Functions are generic over lifetimes, that's why they use <>
(when not elided). fn foo<'a>(&'a Cell<i32>)
doesn't mean each reference passed to foo
has to be valid for any lifetime. Instead, foo
can accept a reference of any lifetime, including the lifetime that ends as soon as foo
returns.
In other words, it's not foo
takes a reference that is valid for any lifetime 'a
. It's foo<'a>
takes a reference that is valid for the lifetime 'a
.
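To make the "generic over lifetimes" reading concrete, here is a short sketch. The names bump and lifetimes_demo are hypothetical. The same function is called once with a reference that is valid for 'static and once with a reference to a local that dies right after the call; each call picks its own 'a.

```rust
use std::cell::Cell;

// Generic over the lifetime 'a: each call site chooses its own 'a.
fn bump<'a>(x: &'a Cell<i32>) -> i32 {
    x.set(x.get() + 1);
    x.get()
}

fn lifetimes_demo() -> (i32, i32) {
    // A reference that is valid for 'static (the allocation is leaked on purpose).
    let long: &'static Cell<i32> = Box::leak(Box::new(Cell::new(10)));
    let a = bump(long);

    // A reference whose lifetime ends as soon as this block does.
    let b = {
        let short = Cell::new(0);
        bump(&short)
    };

    (a, b)
}
```

Both calls type-check against the one signature, and lifetimes_demo() returns (11, 1). Nothing in the signature requires the caller's reference to be valid for every possible lifetime.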
But I don't think it really matters in the way you're thinking. Instead of doing escape analysis to determine what optimizations apply, Rust just won't let you write code that would be optimized unless it passes the borrow checker. The escape analysis happens in your head, while you're having a conversation with the compiler in the form of error messages. By the time you get to codegen, there's nothing more to learn from doing further escape analysis because it already knows what optimizations it can do.
Note that borrowing in Rust never affects when things are dropped; drop order is purely lexical.
The fact that foo
accepts a reference of any lifetime, including one that ends as soon as foo
returns, should constrain how it behaves with longer-lived references. At least, I thought it should, but apparently it doesn't.
This mechanism is how I thought LocalKey::with
worked. You can't leak the reference &'a T
because the closure you pass has to be generic over the lifetime 'a
. But even though the borrow checker won't let you leak the reference, the compiler must act as if you could have because you can convert the reference to a raw pointer.
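A minimal sketch of the LocalKey::with pattern being described, with a hypothetical COUNTER and bump_counter. The closure receives a reference whose lifetime it cannot name, so the borrow checker rejects any attempt to return that reference, while values computed from it can still be returned.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical per-thread counter.
    static COUNTER: Cell<i32> = Cell::new(0);
}

fn bump_counter() -> i32 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        // Returning a plain i32 is fine; it does not borrow from `c`.
        c.get()
    })
    // By contrast, `COUNTER.with(|c| c)` does not compile: the returned
    // reference would have to outlive the closure call.
}
```

Each call on the same thread returns one more than the previous call. As the post notes, this restriction applies to the reference itself; it says nothing about a raw pointer derived from it inside the closure.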
The situation isn’t quite as bad as that:
dyn
) which means that the outer function will be monomorphized for the actual closure type. This lets the compiler inspect the contents of the closure to determine whether or not any escape happens via raw pointers.
unsafe
promise that the programmer will uphold this guarantee. So the compiler is allowed to make codegen decisions under the assumption that any raw-pointer access won’t alter the referent. Any program that violates the guarantee is exercising UB as the result of an unsound unsafe
block somewhere.
Indeed, Miri considers it UB, even if a *mut
is used.
I assumed the use of Cell
was intentional to be closer to C++ semantics.
Yes, it's not as interesting with a &i32
or &mut i32
.