Why this is not safe?


#1

why the code is not safe ?
I know this is not allowed, and variables freed in opposite order, but is this not safe?

struct Demo<'a> {
    age: &'a mut i32
}

fn main() {
    let mut a  = 10i32;
    let mut x = Demo { age: &mut a };
    let mut age = 10i32;
    x.age = &mut age;
}
src/main.rs:9:18: 9:21 error: `age` does not live long enough
src/main.rs:9     x.age = &mut age;
                               ^~~
src/main.rs:7:38: 10:2 note: reference must be valid for the block suffix following statement 1 at 7:37...
src/main.rs: 7     let mut x = Demo { age: &mut a };
src/main.rs: 8     let mut age = 10i32;
src/main.rs: 9     x.age = &mut age;
src/main.rs:10 }
src/main.rs:8:25: 10:2 note: ...but borrowed value is only valid for the block suffix following statement 2 at 8:24
src/main.rs: 8     let mut age = 10i32;
src/main.rs: 9     x.age = &mut age;
src/main.rs:10 }

#2

Hi there,
I think it is because your “age” is declared after your “x” and so will be droped before (younger variables are droped first when going out of scope). Then your “age” have a lifetime shorter than “x”.

Declare “age” before “x” and this will work.


#3

If you’re in a situation where you can’t just reorder the variables, you might get away with an explicit mem::drop(x) so it dies before age. Or you could rebind it, let x = x; into a smaller scope for similar effect.


#4

Surely the compiler could re-order the variables for you?


#5

No, by the time lifetimes are checked, the execution order is fixed.
It’s important to understand that lifetimes cannot affect code generation*, but rather whether your code compiles or it doesn’t.

*TBH, optimizations could use information from lifetimes in code that passes the borrow-checker, but it’s yet unclear what interactions will be there and it’s possible the information will come from a common source with lifetimes, not from lifetimes themselves.


#6

You could simply use the lifetimes as constraints on instruction ordering. If you cannot satisfy the constraints then you get an error as you currently do. Does Rust not use a proper linear type system? The references to a borrow checker suggest heuristic rules rather than a solid logic?


#7

It’s safe, but would not be safe if both x and age had a destructor, as age's destructor would run first and x's destructor could then access the already-destroyed age.

Interestingly, I believe rustc used to allow this sort of thing. I think this PR from 2015 changed it; see also the related Sound Generic Drop RFC from the same time period.


#8

Thanks everyone, now I know why it’s unsafe.

I must have to get used to the rust rules :slight_smile:


#9

Rust’s type system is affine, not linear.

They are actual rules, not heuristics.


#10

Why not use linear types, which model move semantics exactly?

What is the difference between an actual rule and a heuristic? (Maybe I mean ad-hoc rules)? What I was trying to say is that a linear type system would enforce the semantics through the axioms of the logic itself, rather than a collection of additional rules, which I am guessing have not been proved to be complete and consistent?

Edit: Or maybe that’s all wrong, as an affine type system allows types to be unused, but is otherwise the same as a linear type system. I still don’t understand what the borrow-checker is? Perhaps its just a terminology issue, but why not add additional typing rules to the type system, so that borrow types are first-class?


#11

The region system (“lifetimes” in Rust discourse) and the affine type rules (“ownership” in Rust discourse) are separate systems which work extremely well together. The borrow checker is the pass in the compiler which implements the region analysis; it is a set of typing rules, not heuristic.

The limitations discussed above are not theoretical I don’t think but have to do with the amount of resources Rust has to develop and implement these systems. Right now, lifetimes are strictly nested scopes determined by the point at which the variable is declared and the point at which it becomes inaccessible. This will become more complex with time.


#12

Can’t this be dealt with in terms of the type system logic? Why should lifetime variables become more complex with time? Type variables don’t become more complex, they have clearly defined semantics, why should lifetimes be different?


#13

Damnit, I spoke unclearly. Right now, lifetimes are simplistic - they have a single entry and exit, which are determined based on declaration position rather than use, and they cannot have partial overlap (only complete overlap or no overlap). These rules will become more complex over time as Rust the language is developed and improved, so that lifetimes can be typed more flexibly to accommodate more valid programs.


#14

@keean It’s hard to understand what you’re suggesting.
What the Rust compiler does is it takes syntactical scoping semantics and checks them.
A reference to a variable cannot live longer than that variable. This is how Rust prevents use-after-free, and it’s all encoded in lifetime subtyping.

Are you suggesting to reorder the definitions based on lifetimes? Or to check the lifetimes in a way that is order-agnostic locally, and pick any ordering which would be safe?


#15

Oh, I think I see what you’re saying: put the instructions in a system, add constraints, and solve to get an ordering.

However, you couldn’t reorder anything with side-effects, and the way optimizing compilers actually do reordering is based on alias-analysis and other heuristics, and that’s not what we want for our checks.


#16

I agree, so you would have to track side effects, which is probably the biggest problem for doing it in Rust.

Here you miss the point, the kind of reordering I am talking about is an Abstract Syntax Tree to Abstract Syntax Tree rewrite. If you did this at the beginning of the compiler passes it would be the same as the programmer re-ordering the code. Compilers can be understood as AST to AST rewrites, from one sub-language to another. You can still have hard checks at different points, and even better you can have sub-languages that only permit well-formed programs. With a nice architecture the compiler becomes hundreds of passes, each doing a specific well understood rewrite from one sub-language to another, and this is only a few percent slower than the few-large-pass mess that most compilers tend to be.


#17

@keean: You also need to consider the quality of error messages, and the total complexity of the compiler. Solving for ordering in the manner you describe could easily require significant complexity in the solver, and could allow the programmer to write cases that result in cycles. This would occur after potentially several AST-to-AST rewriting passes, and may present significant difficulties in mapping it back to the mistake the user made.

In comparison, a stricter set of rules (i.e. one that rejects more programs) that mimics the textual program structure has the ergonomic advantage of mimicking the textual program structure: when it says something doesn’t live long enough, that’s visually apparent in the source.


#18

Its actually a simple solver and a few lines of Prolog. It is interesting that a lot of compiler operations are very similar to parts of a Prolog implementation. You tend to find that inside a sufficiently complex compiler is a bad implementation of Prolog. It is probably simpler to start with a good implementation of Prolog :slight_smile: I mean as a library for the compiler implementation to use, not that the compiler should be written in Prolog. Unification is used for type inference, backtracking for trait bounds solving etc. In fact the trait system is already an implementation of the positive fragment of pure Prolog.

Cycles are an obvious error, and the type system already contains the mechanisms to catch cycles as errors (as recursive types are not allowed). This is another example of bits of a Prolog implementation showing up at various points inside a compiler.

Well it could be the first pass, as currently the user does it by hand before the compiler does anything. However there is a more general point here, and that is you need to link the rewritten ASTs back to the original AST for error generation.

I was not suggesting these rules be removed, rather that they be applied after a rewriting stage. However you are right that this is easily visible in the source code. It seems that other lifetime issues like this involving the need for a temporary variable are going to go away in future versions of the compiler, and I see no reason not to make quality-of-life improvements if there are no real drawbacks.


#19

The current set of rules includes that the lifetimes must be strictly nested, at open and close. “Applying the rules after a rewriting stage” is equivalent to replacing the strict-nesting rule with one saying that they must be capable of being strictly nested, at least as far as whether the program is accepted goes (and in the absence of side effects, also in terms of program behavior).

That is a less-strict rule, hence my phrasing.


#20

It is a sufficient rule, and does not allow any incorrect code, so it is sound. Requiring temporary variables for certain operations, for example if ‘f’ borrows mutably then you currently have to write “let x = p.f(); p.g(x)” instead of “p.g(p.f())”. This is also stricter, but as I understand it is already going away, so strictness in itself is obviously not seen as a desirable property.