String and memory allocation

Hello! I'm studying the differences between &str and String, and the concept of memory allocation.

The code below fails with the "temporary value dropped while borrowed" error:

use chrono::{DateTime, Local};

#[derive(Debug)]
struct User<'a> {
    timestamp: &'a str,
}

impl<'a> User<'a> {
    fn set_timestamp(&mut self) {
        let local_time: DateTime<Local> = Local::now();
        self.timestamp = &format!(
            "{}-{}",
            local_time.format("%d/%m/%Y"),
            local_time.format("%H:%M")
        );
    }
}

fn main() {
    let mut fred = User { timestamp: "" };
    fred.set_timestamp();
    dbg!(&fred);
}

The error:

Compiling playground v0.0.1 (/playground)
error[E0716]: temporary value dropped while borrowed
  --> src/main.rs:11:27
   |
8  |    impl<'a> User<'a> {
   |         -- lifetime `'a` defined here
...
11 |            self.timestamp = &format!(
   |   _________-_________________^
   |  |_________|
   | ||
12 | ||             "{}-{}",
13 | ||             local_time.format("%d/%m/%Y"),
14 | ||             local_time.format("%H:%M")
15 | ||         );
   | ||         ^- temporary value is freed at the end of this statement
   | ||_________|
   | |__________creates a temporary which is freed while still in use
   |            assignment requires that borrow lasts for `'a`
   |
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

If the struct's member is converted to a String, then it will work. Of course, then there will be memory allocation.

But isn't the format! macro already allocating memory anyway?

So is there any problem with just using String as the member type? Or is there any other solution to this?

Thank you a lot!

1 Like

Use String as the member type. Someone has to own that value, and the User struct is the most logical option. Since the String returned by format! will be moved into the struct, there should be no more than one allocation here.

3 Likes

It's a matter of ownership and borrowing, not allocation. Heap allocation is just an implementation detail of String, and the borrow checker doesn't care about it. You could replace the String type with ArrayString<[u8; 1024]>, which never allocates heap memory, and you would observe identical behavior.

The String type owns a str (via a heap-allocated buffer, but that's not important here), and a &String can be coerced into a &str because String implements the Deref trait with target str. To be explicit, you can call the String's .as_str() method instead of relying on deref coercion.
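To make the coercion concrete, here is a minimal sketch. The function name char_count is made up for illustration; it only needs a &str, and both call styles reach it.

```rust
// Hypothetical helper: takes a borrowed string slice.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let owned: String = String::from("hello");

    // Deref coercion: &String is converted to &str automatically.
    let implicit = char_count(&owned);

    // The same conversion, spelled out.
    let explicit = char_count(owned.as_str());

    assert_eq!(implicit, explicit);
    println!("{implicit}");
}
```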

Within fn set_timestamp(&mut self), a new String is created by the format!() macro. But since nobody takes ownership of it, this String value is dropped at the end of the enclosing statement, the self.timestamp = assignment in this case. This means self.timestamp would hold a dangling reference once that statement ends, and certainly after set_timestamp returns, which is not allowed in Rust. So the compiler rejects this code.

If you change the timestamp field's type to String and change the assignment to self.timestamp = format!(...), the String value is owned by the timestamp field, so it will not be dropped immediately. A value is not dropped while it's owned by something else, such as a variable or another owned value. A reference cannot extend the lifetime of the value it points to, and the compiler rejects any code that might allow access through a dangling reference.
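A minimal sketch of that change follows. To keep it free of external crates, it assumes a stand-in for the chrono formatting from the question: seconds since the Unix epoch from std::time. The ownership story is the same whichever way the text is produced.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

#[derive(Debug)]
struct User {
    // Owned: the String lives as long as the User that holds it.
    timestamp: String,
}

impl User {
    fn set_timestamp(&mut self) {
        // Assumption: a std-only stand-in for the chrono formatting
        // used in the original question.
        let secs = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);

        // format! returns a String, which is moved into the field.
        // No reference is taken, so nothing can dangle.
        self.timestamp = format!("{secs}s since epoch");
    }
}

fn main() {
    let mut fred = User { timestamp: String::new() };
    fred.set_timestamp();
    dbg!(&fred);
}
```

Note that the lifetime parameter disappears from the struct entirely, and String::new() for the initial value does not allocate.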

4 Likes

You could also just store a DateTime. DateTime is a more complex type because it is generic over TimeZone, so your struct would have to either be generic over TimeZone or pick one specific TimeZone. This might be why you chose a string: it simplifies the implementation.

If this is an application or personal project, I would just use DateTime<Utc>. Then you don't have to worry as much about the time zone when storing it or communicating with external systems, and it won't depend on the machine it runs on. I would only convert to DateTime<Local> when displaying it.
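The "store the time value, format only for display" idea can be sketched without chrono. This version assumes std::time::SystemTime as a stand-in for DateTime<Utc>, and the field and method names are made up; with chrono, the field would be a DateTime<Utc> and the formatting would happen in the display method instead.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

struct User {
    // Store the time value itself rather than a formatted string.
    // Assumption: SystemTime stands in for chrono's DateTime<Utc>.
    created: SystemTime,
}

impl User {
    // Format only when the value needs to be shown.
    fn display_timestamp(&self) -> String {
        let secs = self
            .created
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        format!("{secs} s after the Unix epoch")
    }
}

fn main() {
    let fred = User { created: SystemTime::now() };
    println!("{}", fred.display_timestamp());
}
```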

2 Likes

You can think of this as creating a String container on the stack (the bytes of the string content live on the heap, but that's a separate consideration). The assignment is then taking a pointer to a struct on the stack. When this function returns, the stack frame containing the struct will be freed, and (if this code compiled) the timestamp field would be pointing to invalid memory.

When we say the value needs an owner, this is the nitty-gritty detail: the String itself needs to be stored somewhere that outlives the stack frame when you return from a function like this (as written, the String is owned by the stack frame). Setting the field type to String ensures that the User struct owns that String.
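Here is a small sketch of what moving the String into the struct means in practice. The function make_user and the literal date are made up for illustration. Only the small String "container" (pointer, length, capacity) is moved; the heap buffer stays where it is, so there is no second allocation.

```rust
struct User {
    timestamp: String,
}

// Hypothetical helper: builds the string locally, then moves it into the
// returned User. Also returns the address of the heap buffer for checking.
fn make_user() -> (User, *const u8) {
    let s = format!("{}-{}", "01/01/2024", "12:00");
    let heap_ptr = s.as_ptr(); // address of the heap buffer
    (User { timestamp: s }, heap_ptr)
}

fn main() {
    let (user, original_ptr) = make_user();

    // The move handed over ownership; the heap buffer was neither
    // copied nor reallocated.
    assert_eq!(user.timestamp.as_ptr(), original_ptr);
    println!("{}", user.timestamp);
}
```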

2 Likes
1 Like

<'a> on structs makes them temporary and forbids using them outside their original scope. It's an advanced feature, and 99% of the time it's the total opposite of what you want.

Don't use references in structs. Use String or other owned types.
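A short sketch of the difference, with made-up type and function names (UserRef, make_owned). The borrowing struct compiles only while its owner is still alive in the same scope; the owning struct can be returned freely.

```rust
#[derive(Debug)]
struct UserRef<'a> {
    // Borrowed: tied to the scope of whatever owns the text.
    timestamp: &'a str,
}

#[derive(Debug)]
struct User {
    // Owned: free to move anywhere.
    timestamp: String,
}

// Works: the owned struct can leave the scope that built it.
fn make_owned() -> User {
    let local = String::from("01/01/2024-12:00");
    User { timestamp: local }
}

fn main() {
    let owner = String::from("01/01/2024-12:00");

    // Works, but `borrowed` can only be used while `owner` is alive.
    let borrowed = UserRef { timestamp: &owner };

    let owned = make_owned();
    println!("{borrowed:?} {owned:?}");

    // A function that builds a local String and returns a UserRef
    // pointing into it would be rejected by the compiler.
}
```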

3 Likes

Thank you, @mmmmib, @Hyeonu, @asafigan, @parasyte, @RustyYato and @kornel!

All the answers have helped me understand a lot better what's going on.

2 Likes

There is also an old talk that Niko gave, which references this aspect of ownership (using Vec instead of String, but the concept applies directly). I usually recommend this as learning material for colleagues, because it succinctly describes the motivations behind the language.

4 Likes

Great talk, @parasyte . Thanks a lot. The concept of ownership is really strong and Rust made me see things from a different perspective. It is also elegant. The borrow / move terminology is much more appealing for beginners like me.

I could clearly see the dangling pointer was happening because the String created by the format! macro was not being moved to the struct's member.

So, the natural next step was to change the struct's member to a String, which I mentioned in the original post as the solution. But because we keep hearing that heap allocation is always expensive, I felt I was doing something naive and started to wonder whether my assumptions about &str and String were right.

After all the great help each poster has provided, things are really clearer now.

This is a great community.

1 Like

Heap allocation might be expensive, but sometimes it is necessary.

Houses are expensive too, but I wouldn't call anyone naive for buying a house.

6 Likes

Also, "expensive" is kinda relative here. Python heap-allocates all members of lists separately, always, and yet people still use that language to write software. Rust does not hide costly abstractions from us, which sometimes costs us disproportionate amounts of time thinking about whether we reeeeaaally need them, even in cases where it's unlikely to ever matter.

6 Likes

Let me share one cool trick from my physics background which I find very useful in software performance studies too: always keep some orders of magnitude in mind when discussing measurable things like execution times.

In a software performance setting, this means not being satisfied with saying that something is or is not expensive, but being able to roughly quantify how long a certain operation will typically take. You don't need to know the exact time (which is basically unpredictable as it depends on so many factors), only the power of 10 that you would need to type if you guesstimated that time as "about 10^x seconds".

This will greatly help you in circumstances where you need to choose the least of two evils between two "expensive" things (e.g. mutexes vs atomic operations), or to pick between different kinds of expenses (e.g. runtime performance vs development time).


In your particular case, you are lucky because memory allocator developers love performance microbenchmarks, and they always draw gigantic heaps of cool graphs of how allocator performance scales in various situations, like these ones for example. Take some of these, and try to get an idea of what typical number you can expect.

In the benchmark linked above, they say that in a sequential workload doing random small sized allocations, they typically see a state of the art memory allocator perform 5-15 million allocations + deallocations pairs per second. So here the ballpark number would be "I can expect allocation+deallocation to take around 0.1µs in sequential programs".

(Note also that memory allocator performance degrades enormously in multi-threaded programs, because memory allocators manipulate a shared heap resource that must be synchronized between threads. This means that you must be much more careful about memory allocation in multi-threaded programs than single-threaded ones, and may benefit from some tricks like preallocating buffers in the sequential phase of the program rather than during the multi-threaded computation.)

So if you perform around 1000 allocations in a single thread, you can expect that to take around 0.1ms. If you perform a million of them, it may take something like 0.1s.
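The same back-of-the-envelope arithmetic as a tiny sketch. The function name is made up, and the rate of 10 million pairs per second is just the middle of the 5-15 million ballpark quoted above, not a measurement.

```rust
// Rough cost, in seconds, of `count` allocation+deallocation pairs
// at an assumed throughput of `pairs_per_second`.
fn estimate_seconds(count: u64, pairs_per_second: f64) -> f64 {
    count as f64 / pairs_per_second
}

fn main() {
    // Assumed ballpark: about 10 million pairs per second,
    // i.e. roughly 0.1 microseconds per pair.
    let rate = 10_000_000.0;

    for count in [1_000u64, 1_000_000] {
        println!(
            "{count} allocations: about {:e} s",
            estimate_seconds(count, rate)
        );
    }
}
```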

Maybe that's expensive for your use case. Maybe it's irrelevant. With numbers, you can do a quick back-of-the-envelope calculation and get a rough idea of whether your design needs fixing or not, without needing to implement the code and carry out a performance benchmark in order to tell.

8 Likes

Hello, @HadrienG. Great explanation. It makes absolute sense. And from a practical point of view, something that works is much faster and more efficient than something that does not.

The thing is when beginners like me hear that "heap allocation is expensive", we actually do not know what "expensive" really means and start (trying) to over optimize things without even knowing what would be the difference, if any.

For example, I can quote this excellent video about lifetime annotations.

At some point (1:06:20), when the author explains the differences between &str and String, he says, in the context of the example he is discussing, that requiring allocation is not great for performance. He adds that allocation brings up another problem: you may not have an allocator at all, for example on embedded devices.

Since I'm not targeting embedded devices, it was the "not great for performance" part that got my attention.

Please do take note that it's me who does not know the magnitude of these performance matters. I'm not implying in any way that the information out there about heap allocation is wrong.

After all the details you and the others have provided, it is very clear the concept of "expensive" is relative and depends on a lot of variables and conditions.

Thanks a lot.

There's another great partner trick, which is to always figure out how any given bit of code scales. Allocation that happens once is very different from allocation that happens once per byte processed.
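As a rough illustration of the scaling point, here are two made-up functions that produce the same result. The first allocates a fresh String on every iteration; the second reuses one buffer, so it allocates once up front and again only if the buffer has to grow.

```rust
use std::fmt::Write;

// Allocates a new String for every item.
fn labels_fresh(n: usize) -> usize {
    let mut total = 0;
    for i in 0..n {
        let s = format!("item-{i}"); // one allocation per iteration
        total += s.len();
    }
    total
}

// Reuses one buffer across iterations.
fn labels_reused(n: usize) -> usize {
    let mut total = 0;
    let mut buf = String::with_capacity(32);
    for i in 0..n {
        buf.clear(); // empties the string but keeps its capacity
        write!(buf, "item-{i}").expect("writing to a String cannot fail");
        total += buf.len();
    }
    total
}

fn main() {
    assert_eq!(labels_fresh(1000), labels_reused(1000));
}
```

Whether the second form is worth the extra noise depends on how often the loop runs, which is exactly the order-of-magnitude question from earlier in the thread.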

1 Like
