Free variables after last usage

Currently, variables are freed at the end of a scope/function.
In some situations, this is inefficient memory usage as a variable might not be used anymore but is still in memory.
You can manually drop variables or introduce nested scopes to optimize this.
Would it be possible to do this as part of the compilation?

use std::fs;

fn main() {
    // {
    let var1 = fs::read_to_string("16mb-text-file.txt").unwrap();
    dbg!(&var1.len());
    // }

    let var2 = fs::read_to_string("11mb-text-file.txt").unwrap();
    dbg!(&var2.len());
}

In this example the memory usage adds up to 27 MB, even though var1 is no longer used by the time var2 is initialized.
Introducing a nested scope as indicated changes this behavior as var1 is dropped after its last usage.

This could change the peak memory usage for long (running) functions and recursive functions.
One obvious change would be the order in which the variables are being dropped (now implicit rather than explicit with scope/manual drop).

Is it possible/feasible to implement early dropping for programs where the drop order does not matter?
Thanks

It's certainly possible technically, but it would be a wildly breaking change, so it won't happen.

4 Likes

It's probably technically possible -- after all, NLL calculates something like this for references -- but it's intentionally not done.

Notably, it's reasonable for NLL to cut things short because there's no runtime-observable behaviour when a reference is dropped. The code just either compiles or doesn't compile.
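To make the NLL comparison concrete, here is a minimal sketch (the function name `push_after_last_use` is made up for illustration). The shared borrow `first` ends at its last use rather than at the closing brace, which is why the later `push` is accepted. No code runs when a reference goes away, so shortening it cannot be observed at runtime.

```rust
/// Hypothetical helper, named only for this illustration.
fn push_after_last_use() -> Vec<i32> {
    let mut v = vec![1, 2, 3];

    let first = &v[0]; // shared borrow of `v` starts here
    assert_eq!(*first, 1); // last use of `first`: under NLL the borrow ends here

    // Accepted by the borrow checker even though `first` is still lexically
    // in scope, because `first` is never used again.
    v.push(4);
    v
}

fn main() {
    println!("{:?}", push_after_last_use());
}
```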

However, for anything with a meaningful Drop, it can really matter when it drops. This is perhaps most obviously true with "guard"-type things. Dropping a mutex's lock guard early can be a catastrophic bug.

So it's my understanding that the language is intentionally optimizing for predictability here instead. After all, you can always drop(var1); before the end of a scope if you find out that it would be better for it to live for less time.
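As a rough sketch of the two existing options (a nested scope and an explicit `drop`), the example below uses a made-up `Buffer` type that records its own drop in a shared log, standing in for the large strings from the original post. The `Buffer`, `Log`, and `run` names are assumptions for illustration only.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical stand-in for a large allocation; records when it is dropped.
struct Buffer {
    name: &'static str,
    log: Log,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));

    // Option 1: a nested scope. `var1` is dropped at the closing brace.
    {
        let var1 = Buffer { name: "var1", log: Rc::clone(&log) };
        log.borrow_mut().push(format!("use {}", var1.name));
    }

    // Option 2: an explicit `drop` call right after the last use.
    let var2 = Buffer { name: "var2", log: Rc::clone(&log) };
    log.borrow_mut().push(format!("use {}", var2.name));
    drop(var2);

    log.borrow_mut().push("end".to_string());

    // Copy the events out before `log` itself goes out of scope.
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{event}");
    }
}
```

In both cases the drop point is visible in the source, which is the predictability argument made above.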

10 Likes

I think Rust should support eager drop for types that opt in to it.

Or maybe there could be an optimization pass that checks if the address of the data hasn’t escaped, and the drop has no side effects other than freeing memory. It’s impractical to litter every function with multiple drop() calls.

2 Likes

It's not an issue in practice, though.

6 Likes

There is a side effect, though, for anything that could have a heap allocation, and that is timing. While this is not a big deal for hard realtime (you really shouldn't be interacting with the allocator anyway), I bet the crypto people won't be happy about the potential for side channels. (I'm in the hard-realtime group, so I'm guessing at what the latter group would think of this.)

Also, it usually isn't a problem. As in: I have never seen any code that needed littering with drop calls because it used too much RAM when memory wasn't freed eagerly. The only places I remember seeing explicit drops at all have been for mutex guards or other RAII structures, and the whole point of those is the side effects, so this wouldn't apply anyway.

2 Likes

I think it would be worth experimenting whether it improves things.

For example, in functions that have many places that can return or panic, would it reduce code size thanks to unconditionally dropping before the branchy code?

Could it reduce peak memory usage? Could it reduce memory fragmentation and therefore reduce total memory usage?

I thought about writing a tool that just inserts drops into the source code as early as possible, but that is tricky to do in the presence of loans, and a simple drop(var) can’t be done on partially moved objects.
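A small sketch of the partial-move problem, with a hypothetical `Pair` struct and `consume` function invented for illustration: once one field has been moved out, the whole value can no longer be passed to `drop`, so an automatic tool would have to drop the remaining fields one by one.

```rust
/// Hypothetical pair of buffers. It has no `Drop` impl, so its fields
/// may be moved out individually.
struct Pair {
    a: String,
    b: String,
}

/// Hypothetical consumer that takes ownership of a string.
fn consume(s: String) -> usize {
    s.len()
}

fn partial_move() -> (usize, usize) {
    let pair = Pair { a: "x".repeat(16), b: "y".repeat(11) };

    // Moves `pair.a` out; `pair` is now partially moved.
    let a_len = consume(pair.a);

    // drop(pair); // would not compile: `pair` is partially moved

    // The untouched field is still usable...
    let b_len = pair.b.len();
    // ...and has to be dropped on its own if it should go away early.
    drop(pair.b);

    (a_len, b_len)
}

fn main() {
    println!("{:?}", partial_move());
}
```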

4 Likes

I would want any such thing to be a property of bindings, or perhaps function bodies or other blocks, and not types.

1 Like

let eagerdrop mystuff = fs::read_to_string("16mb-text-file.txt").unwrap();

Hm.. This feels like something I'd use if it were available -- I'm currently working on something that loads chunks of data from files and this code has a few explicit drop()s just to avoid overlapping large:ish memory allocations.

That said, knowing myself, at some point I'd probably turn something that's "eagerdrop" into a drop guard and forget to remove the eagerdrop. If eagerdrop existed and it could trigger an error when used on something that is quite obviously not meant to be eagerly dropped (#[must_bind] was created to be used on guards, right?), I'd be a happy camper.

How? If you're not using the guard any more or borrowing from it, you're not accessing the data inside the mutex any more.

I agree with OP that it seems like it would make more sense for drop to happen when you stop using something rather than at the end of a scope, or end of the statement for temporaries. If you want to drop later for some reason, you can always add an explicit drop later which counts as the last use.

Though it would definitely be a breaking change to change drop timing and so seems impossible.

1 Like

Sometimes you need mutual exclusion but you can't represent that with protected access to data. This is where you'd use Mutex<()>. More generally a semaphore works like this. There's nothing you can "do" with a tokio semaphore guard besides drop it IIRC. Of course you should avoid writing code like that if you can but sometimes it's out of your hands.
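A minimal sketch of the `Mutex<()>` pattern, assuming a hypothetical `critical_section` function: the guard is bound and never mentioned again, so a "drop after last use" rule would release the lock immediately, before the protected work happens. Under today's scoping rules the lock is still held for the rest of the function, which the sketch checks with `try_lock`.

```rust
use std::sync::Mutex;

/// Hypothetical critical section protecting something the type system
/// cannot see (a device, a file, a foreign library, ...).
fn critical_section(lock: &Mutex<()>) -> bool {
    let _guard = lock.lock().unwrap();
    // `_guard` is never used again. An "after last use" rule would
    // release the lock right here.

    // ... work that relies on mutual exclusion would go here ...

    // With today's rules the lock is still held, so a second attempt
    // to take it fails instead of succeeding.
    let still_locked = lock.try_lock().is_err();
    still_locked
} // `_guard` is dropped here and the lock is released

fn main() {
    let lock = Mutex::new(());
    println!("held during the section: {}", critical_section(&lock));
    println!("free afterwards: {}", lock.try_lock().is_ok());
}
```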

6 Likes

Another - non-fatal, but still meaningful - example is the tracing_appender::WorkerGuard; it can't be used explicitly for anything, but if it's dropped prematurely - tracing_appender will stop responding to logging events.

2 Likes

If it was in bindings, I think the default would have to be eager drop, with late drop let = x for the exceptional cases. Otherwise you’d end up having to opt in to eager drop over and over again. Late drop would be a noisy default like #[must_use] (functions with a discardable result are less common, as demonstrated by Swift).

However, I think this is a property of types.

There are basic types that know they can be dropped ASAP (including EagerDrop for Vec<T> where T: EagerDrop).
And there are types like guards that know they need to have predictable scope.
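Purely as an illustration of the type-based idea, an opt-in marker could look roughly like the sketch below. `EagerDrop` is not part of Rust today; the trait, its impls, and the `is_eager_drop` helper are all assumptions, and the sketch only shows how the bound would propagate through containers, not any change in drop timing.

```rust
/// Hypothetical opt-in marker meaning "dropping this value early has no
/// observable effect apart from releasing memory". Not a real Rust trait.
trait EagerDrop {}

impl EagerDrop for u8 {}
impl EagerDrop for String {}

// Containers would opt in only when their contents do.
impl<T: EagerDrop> EagerDrop for Vec<T> {}
impl<T: EagerDrop> EagerDrop for Box<T> {}

// Deliberately no impl for guard types such as `MutexGuard<'_, T>`:
// they keep today's scope-based drop.

/// Compiles only for types that opted in.
fn is_eager_drop<T: EagerDrop>() -> bool {
    true
}

fn main() {
    println!("{}", is_eager_drop::<Vec<Vec<String>>>());
    // is_eager_drop::<std::sync::MutexGuard<'static, ()>>(); // would not compile
}
```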

I see. But it seems like a very rare case and one where you have to be careful anyway: it's already possible to get the scope wrong and have everything compile.

The solution to the OP is to get more memory. :grinning:

Note that dropping has performance implications, so it is often better done at the end, after the work is finished (unless you are taking extra care to manage memory).

As someone who does not want the language to become more complicated, I think any solution should put no burden on the user, so annotations on let bindings would be unwise. But maybe the optimiser could do this under the hood.

Drop has side effects by default. I guess the developers could add a marker trait WithoutDropSideEffects that leaves the decision to drop eagerly up to the compiler. There would be no way to detect incorrectly marked structures, though.

2 Likes

That sounds very reasonable :+1:

I don't think it should be part of the language. That would be too complicated and we already have manual drop.
It should come for free and without developer interaction, just like LTO.
Otherwise it is too much work and many crates/projects would not make use of it.
Maybe every type which does not implement a custom drop method?

Could it be a lint for clippy?

If it was in bindings, I think the default would have to be late drop, with eager drop let = x for the exceptional cases. Otherwise you'd end up having to opt in to predictable drop locations over and over again.


In other words, it depends on what you want or find reasonable. Note also that we already have drop flags; eager dropping would mean that every time drop flag elision gets smarter, your drop locations may change.
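For readers unfamiliar with drop flags, here is a hedged sketch using a made-up `Noisy` type and `maybe_consume` function. The value is moved on only one branch, so at the end of the scope the compiled code has to know at runtime whether a drop is still owed; that bookkeeping is what a drop flag is for.

```rust
use std::cell::Cell;

/// Hypothetical type that counts how many times it has been dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops seen before the scope ends, drops seen after it ends).
fn maybe_consume(consume: bool) -> (u32, u32) {
    let drops = Cell::new(0);
    let before_end;
    {
        let value = Noisy(&drops);
        if consume {
            drop(value); // `value` is moved on this path only
        }
        before_end = drops.get();
        // At the closing brace `value` is dropped only if it was not
        // moved above, which is decided at runtime.
    }
    (before_end, drops.get())
}

fn main() {
    println!("{:?}", maybe_consume(true));
    println!("{:?}", maybe_consume(false));
}
```

Either way the value is dropped exactly once; only the location differs between the two paths.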

Anyway, I also agree with @tczajka that it's a breaking change probably impossible to land. If you have something with a destructor and get a raw pointer to it, you can pass it to FFI and know your data hasn't been deallocated before the scope ends, for example. I.e. the current behavior can be relied upon for soundness.
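The raw-pointer point can be sketched as follows. The function name is hypothetical, and the slice read stands in for an FFI call that would receive the pointer. The owning `String` is last mentioned before the pointer is used, so the code is only sound because the value lives until the end of the scope.

```rust
/// Hypothetical example: reads a buffer through a raw pointer after the
/// last mention of the owning variable.
fn read_through_raw_pointer() -> Vec<u8> {
    let text = String::from("hello");
    let ptr = text.as_ptr();
    let len = text.len(); // last mention of `text` in this function

    // Stand-in for an FFI call that receives `ptr` and `len`.
    // SAFETY: `text` is still alive because it is only dropped at the end
    // of the function. A "drop after last use" rule would have freed the
    // buffer before this read.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) }.to_vec();
    bytes
} // `text` is freed here

fn main() {
    println!("{:?}", read_through_raw_pointer());
}
```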

4 Likes

I understand the need for dropping early to free resources, but I don't understand the need for new syntax or a new feature. We already have braces to control scope. If a variable should be dropped earlier than the end of a scope, declaring it and using it in a nested scope makes sense (it really can't be used outside that scope anyway) and makes the code clearer. It's a good habit. Also, nesting shouldn't be considered negative just because it indents code and might cause long lines to wrap. Embrace the brace (haha).

13 Likes

Exactly. A smaller number of general, orthogonal features is better than encoding every combination and every particular need of every programmer.

3 Likes

I did not mean that in a subjective way, as in “it should default to whatever strategy I/you prefer”, but rather as an observation that almost all types can be dropped eagerly almost all of the time. If taking advantage of that required an extra keyword, it would be very common boilerplate.

There are drops with very important observable side effects, but they are a relatively rare exception, not the rule. You’ll have tons more Strings, Vecs with primitives and Boxes of boring structs, than scope guards for unsafe code, or mutexes used in a more clever way than just for accessing the data they hold.

This is hypothetical, though. I realize that changing let semantics is too dangerous and disruptive for Rust. However, an EagerDrop or some marker trait like that could be added in a backwards-compatible way, and have a positive effect on programs, even existing ones, without any syntax changes or boilerplate.

1 Like