Returning a reference to a String stored in a struct

Hi

I got the following error on some code:

    cannot return reference to local data `self.filename` rustc (E0515) [41, 9]
    returns a reference to data owned by the current function

The code

use crate::Command;
use std::fs::File;
use std::io::Write;

pub(crate) struct CommandLog {
    file: File,       // backing file to store the commands
    filename: String, // name of the file
}

fn upsert_logfile(filename: &String) -> File {
    File::options()
        .append(true)
        .create(true)
        .open(filename)
        .unwrap()
}

impl CommandLog {
    pub fn new(filename: String) -> CommandLog {
        CommandLog {
            file: upsert_logfile(&filename),
            filename,
        }
    }

    pub fn append(&mut self, command: Command) {
        match command {
            Command::Set { key, value } => {
                writeln!(&mut self.file, "{}={}", key, value).unwrap();
            }
            Command::Delete { key } => {
                writeln!(&mut self.file, "DEL {}", key).unwrap();
            }
            _ => panic!("Can't log this command"),
        }
    }

    // The error is on this method
    pub fn filename<'a>(self) -> &'a String {
        &self.filename
    }
}

After a Google search I landed on a previous question on this forum, "Returns a reference to data owned by the current function", where one of the suggested solutions was to change the method to take a mut reference. If I change the filename method to take a mut reference, it does work, but I don't understand why. Could anyone help me understand it?

If you make your method take a self argument, then it will be a by-value argument. That's essentially just a local variable. It will be destroyed before the function returns. Why do you think you should be able to return a reference to it?

If you take self as an argument, it means that you will take ownership of it. Then, this would mean that self would be dropped by the end of the function, making returning any reference to it impossible.

TLDR: Don't receive self as a parameter, use &self instead.
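To make the difference concrete, here is a minimal sketch. The `Log` struct, `into_filename` and `demo` are made-up names for illustration, not part of the original `CommandLog`. A `&self` method can hand out a reference into the struct, while a by-value `self` method can only hand out something owned:

```rust
// Hypothetical stand-in for the struct in the question.
struct Log {
    filename: String,
}

impl Log {
    // Borrows the struct: the returned reference points into data that
    // outlives this call, because the caller still owns the `Log`.
    fn filename(&self) -> &String {
        &self.filename
    }

    // Takes ownership: `self` is a local of this function, so the only
    // thing that can leave is an owned value moved out of it.
    fn into_filename(self) -> String {
        self.filename
    }
}

fn demo() -> (usize, String) {
    let log = Log { filename: String::from("commands.log") };
    let len = log.filename().len(); // `log` is only borrowed here
    let owned = log.into_filename(); // `log` is moved here and is gone afterwards
    (len, owned)
}
```

After the call to `into_filename`, any further use of `log` is rejected by the compiler, which is the same ownership rule that produced E0515 in the original method.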

Thanks for the replies @H2CO3 and @moy2010

If you make your method take a self argument, then it will be a by-value argument

Now that you say it, it's obvious. I was thinking that self in method implementations was always passed by reference. I guess that beyond the issue I was having here, it's always a good rule of thumb to pass &self instead as it would be more performant (no copy involved)?

TLDR: Don't receive self as a parameter, use &self instead.

If I do that then I get another error instead

    lifetime may not live long enough rustc [40, 9]
    method was supposed to return data with lifetime `'a` but it is returning data with lifetime `'1`
    lifetime `'a` defined here rustc [39, 21]
    let's call the lifetime of this reference `'1` rustc [39, 25]

This means that the lifetime of self.filename is different to that of the place from where the filename method is called, correct?

What should I do in a case like this: clone the value in self.filename and return that?

This is not the case.

No, that's a huge misconception.

You need to tie the lifetime of the return value to the lifetime of &self:

pub fn filename<'a>(&'a self) -> &'a String {
    &self.filename
}

Or with some syntax sugar

pub fn filename(&self) -> &String {
    &self.filename
}
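
As a rough illustration of what "tying the lifetimes together" buys you, here is a sketch with a hypothetical `Log` struct and a made-up `longest_name` helper (neither is from the original code). The elided method and the explicitly annotated free function express the same idea: the returned reference cannot outlive the value it borrows from.

```rust
// Hypothetical stand-in for the struct in the question.
struct Log {
    filename: String,
}

impl Log {
    // Elided form. The compiler reads this as
    // `fn filename<'a>(&'a self) -> &'a String`.
    fn filename(&self) -> &String {
        &self.filename
    }
}

// With two reference parameters there is no single obvious source for the
// returned reference, so elision does not apply and the lifetime has to be
// written out.
fn longest_name<'a>(a: &'a Log, b: &'a Log) -> &'a String {
    if a.filename().len() >= b.filename().len() {
        a.filename()
    } else {
        b.filename()
    }
}

fn demo() -> usize {
    let a = Log { filename: String::from("a.log") };
    let b = Log { filename: String::from("longer.log") };
    // The result borrows from `a` and `b`, so both must stay alive
    // for as long as `name` is used.
    let name = longest_name(&a, &b);
    name.len()
}
```

Many APIs would return `&str` rather than `&String` here; that is a style choice and does not change the lifetime reasoning.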

No. Don't think of references as a "magical" performance bullet that will make everything faster, no matter the scenario.

In Rust, what matters is if the operation that you will perform requires exclusive access to self or not. So you can have methods that take "self" in many different ways:

self
&self
mut self
&mut self

And so on.

In general, use the method receiver that affords the functionality you need but not ones that you don't.

  • self if you need ownership
  • &mut self if you need exclusive borrowed access
  • &self if you need shared borrowed access
  • other receivers exist (Box<Self>, Arc<Self>, ...) but are more niche and approximately "if you're not sure if you need it, you don't need it"

In this case, ownership doesn't allow you to return a borrow of a field, but you don't need an exclusive borrow, so &self is the logical choice.

(Of course, "need" is still relative; if you immediately clone &self into a Self and consume it, you should have probably taken self.)
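
A small sketch of the three common receivers side by side, using a hypothetical `Counter` type (all names here are invented for illustration):

```rust
// Hypothetical type used only to show the three common receivers.
struct Counter {
    count: u32,
}

impl Counter {
    // Shared borrow: reading is enough, so `&self`.
    fn get(&self) -> u32 {
        self.count
    }

    // Exclusive borrow: the method mutates in place, so `&mut self`.
    fn increment(&mut self) {
        self.count += 1;
    }

    // Ownership: the counter is consumed and cannot be used afterwards.
    fn into_inner(self) -> u32 {
        self.count
    }
}

fn demo() -> (u32, u32) {
    let mut counter = Counter { count: 0 };
    counter.increment();
    let seen = counter.get();
    let last = counter.into_inner(); // `counter` is moved here
    (seen, last)
}
```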


Aside: mut self is the same receiver as self, there is no difference in type, just the variable binding. It's notionally the same as

fn foo(self) {
    let mut this = self;
    // ...replace all uses of `self` with `this`
}

In contrast with &self and &mut self, which are different types; you can't get the &mut self out of a &self.

To add to that: it's possible, for example, that passing a reference incurs an additional, unnecessary clone.

When you need an argument by value, e.g. because it's in a constructor-like method, you should take it by value, because the caller can then give up ownership. By taking a reference, you would always force the callee to perform a clone (or more generally, a borrowed->owned conversion).

Demonstration:

struct Foo {
    value: String
}

impl Foo {
    fn new_fast(value: String) -> Self {
        Foo { value } // no clone
    }

    fn new_slow(reference: &String) -> Self {
        Foo { value: reference.clone() } // MUST clone
    }
}
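
One way to observe the difference from the caller's side is to compare heap pointers. This is only a sketch: `Foo` is repeated so the block stands alone, and `demo` is a made-up helper.

```rust
struct Foo {
    value: String,
}

impl Foo {
    fn new_fast(value: String) -> Self {
        Foo { value } // no clone, the String is moved in
    }

    fn new_slow(reference: &String) -> Self {
        Foo { value: reference.clone() } // allocates a second buffer
    }
}

fn demo() -> (bool, bool) {
    let name = String::from("commands.log");
    let original_ptr = name.as_ptr();

    // The clone lives in its own allocation while `name` is still alive.
    let slow = Foo::new_slow(&name);
    let slow_is_copy = slow.value.as_ptr() != original_ptr;

    // Moving the String hands over the existing heap buffer.
    let fast = Foo::new_fast(name);
    let fast_reuses_buffer = fast.value.as_ptr() == original_ptr;

    (slow_is_copy, fast_reuses_buffer)
}
```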

Notably the first and third option you list, self and mut self, are identical. The difference is only relevant inside of the function implementation, and even there, mut self could always be replaced by using self together with another let mut this = self; step.

Yes, of course. But from the perspective of the caller, it's better to express your intent directly on the signature.

The mut in mut self is not part of the signature. It will consequently also not show up in the generated docs.
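
A quick way to see this for yourself is a trait whose method is declared with plain `self` and implemented with `mut self`. The `Bump` trait and `Counter` type below are invented for this sketch:

```rust
// Hypothetical trait: the declaration uses plain `self`.
trait Bump {
    fn bump(self) -> Self;
}

struct Counter(u32);

impl Bump for Counter {
    // The impl adds `mut` to the binding. This still matches the trait,
    // because `mut` only affects the local variable inside the body.
    fn bump(mut self) -> Self {
        self.0 += 1;
        self
    }
}

fn demo() -> u32 {
    // The caller writes the same code either way; no `mut` is needed here.
    let counter = Counter(1);
    counter.bump().0
}
```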

The caller shouldn't rely on implementation details. Note that even RustDoc will omit the mut part if there is one.

@steffahn , @quinedot . That's interesting! Didn't know that the mut binding on the signature did that.

By the way, for the full list of possible types there, there’s also self: Box<Self>, self: Rc<Self>, self: Arc<Self>, and pinned versions (theoretically allowed to be nested, too, but realistically you’d only ever use one layer) self: Pin<&mut Self>, self: Pin<&Self>, self: Pin<Box<Self>>, self: Pin<Rc<Self>>, self: Pin<Arc<Self>>. Only self: &mut Self, self: &Self and self: Self have any special shorthand syntax though (&mut self, &self and self, respectively.)
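
For anyone who hasn't seen the longer receiver forms written out, here is a small sketch. The `Node` type and its methods are hypothetical, and the `Pin` example relies on `Node` being `Unpin` so that plain field assignment through the pin is allowed:

```rust
use std::pin::Pin;
use std::rc::Rc;

// Hypothetical type used only to show the receiver syntax.
struct Node {
    name: String,
}

impl Node {
    // Callable only on a `Box<Node>`; consumes the box.
    fn into_name(self: Box<Self>) -> String {
        let node = *self; // move the value out of the box
        node.name
    }

    // Callable only on an `Rc<Node>`; consumes that one handle.
    fn name_len(self: Rc<Self>) -> usize {
        self.name.len()
    }

    // Callable only on a `Pin<&mut Node>`.
    fn rename(mut self: Pin<&mut Self>, name: &str) {
        self.name = name.to_string();
    }
}

fn demo() -> (String, usize, String) {
    let boxed = Box::new(Node { name: String::from("boxed") });
    let from_box = boxed.into_name();

    let shared = Rc::new(Node { name: String::from("abc") });
    let len = Rc::clone(&shared).name_len(); // `shared` itself stays usable

    let mut node = Node { name: String::from("old") };
    Pin::new(&mut node).rename("new");

    (from_box, len, node.name)
}
```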

Interestingly, this list is different from the list of types that allow you to implement (external) traits on them; the latter is only &T, &mut T, Box<T>, Pin<…>, and all nestings of those (so each list technically contains entries the other doesn’t, but most notably Rc and Arc are not covered in the latter).

mut has nothing to do with the type of the argument. Mutability is not a property of types, but of bindings (and therefore patterns). And the pattern that binds the parameters isn't, and shouldn't be, part of the signature. The following functions have exactly the same signature:

fn foo1((a, b): (i32, String)) -> u64
fn foo2(arg: (i32, String)) -> u64

because they accept exactly the same shape of arguments. What the callee does with them is none of the business of the caller.
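
One way to check this is to coerce both functions to the same function pointer type. The bodies below are made up for the sketch (the signatures above have none), and `both` is a hypothetical helper:

```rust
// Destructures its argument in the parameter pattern.
fn foo1((a, b): (i32, String)) -> u64 {
    a as u64 + b.len() as u64
}

// Binds the whole tuple to a single name.
fn foo2(arg: (i32, String)) -> u64 {
    arg.0 as u64 + arg.1.len() as u64
}

// Both coerce to the same function pointer type, so they can share an array.
fn both() -> [fn((i32, String)) -> u64; 2] {
    [foo1, foo2]
}

fn demo() -> (u64, u64) {
    let fns = both();
    // The caller passes the same shape of argument to either function.
    let first = (fns[0])((1, String::from("ab")));
    let second = (fns[1])((1, String::from("ab")));
    (first, second)
}
```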

FYI, the value semantics of Rust mean that the caller doesn’t care what the function does with a value passed by value. Mutations of that value will have no effect on anything you can observe anyway, because you have given up ownership. (Or you’ve passed a copy, in the case of a Copy type, but even then the callee only modifies its own copy.)

The only way operations on a passed value can influence what you-the-caller observes is via interior mutability, but then, for interior mutability, you typically don’t even need to mark values as mutable, as those APIs work with &self references anyway.
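
A minimal sketch of that interior-mutability case, assuming a shared counter built from `Rc<Cell<u32>>` (the `bump` and `demo` functions are invented names). The callee receives its argument by value, nothing is marked `mut`, and the caller still observes the change:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Takes its argument by value and never marks anything `mut`.
// `Cell::set` works through a shared reference.
fn bump(counter: Rc<Cell<u32>>) {
    counter.set(counter.get() + 1);
}

fn demo() -> u32 {
    let counter = Rc::new(Cell::new(0));
    // A second handle to the same cell is passed by value.
    bump(Rc::clone(&counter));
    // The caller still sees the change through its own handle.
    counter.get()
}
```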

Edit: Well… actually…[1]


  1. …normal mutable access also fits the picture… of being able to influence your state. A function like fn apply_callback<T>(x: T, cb: impl FnOnce(T)) can mean that the action of apply_callback passing x to cb might mutate your state, e.g. if you call it with apply_callback(&mut my_variable, |r| *r += 1). But even in that case, inside apply_callback, nothing needed to be marked mutable.

     

    It’s interesting that from the perspective of interpreting, with a functional programming hat on, something like fn(&mut T) to be (essentially) equivalent to fn(T) -> T, the consequence is that in the case above, a generic fn(T, fn(T)) can be special-cased into fn(&mut S, fn(&mut S)) which then I’d interpret to have a meaning like fn(S, fn(S) -> S) -> S, even though that’s an entirely different-looking signature from the original fn(T, fn(T)).

     

    Maybe an interesting thing to think about: how to fit this into a (in my view very reasonable) view of interpreting Rust programs/functions as pure (if interior mutability and global side-effects are forbidden).
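
The footnote's example can be sketched as follows, assuming the callback parameter is written as `impl FnOnce(T)` so that the signature compiles. The functions `inc_in_place`, `inc` and `demo` are added here purely for illustration:

```rust
// Generic over `T`; nothing inside is marked `mut`.
fn apply_callback<T>(x: T, cb: impl FnOnce(T)) {
    cb(x);
}

// The two "equivalent" shapes from the functional-programming reading:
// mutate in place through `&mut`, or take a value and return the new one.
fn inc_in_place(x: &mut i32) {
    *x += 1;
}

fn inc(x: i32) -> i32 {
    x + 1
}

fn demo() -> (i32, i32, i32) {
    let mut my_variable = 1;
    // Here `T` is `&mut i32`, so the callback mutates the caller's state.
    apply_callback(&mut my_variable, |r| *r += 1);

    let mut in_place = 1;
    inc_in_place(&mut in_place);

    (my_variable, in_place, inc(1))
}
```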

@H2CO3 , @steffahn , yeah, I get that. I was just thinking if there was any scenario where mut self would make more sense than self beyond the early binding, and speculated that it might matter from an API-design perspective.

Thanks all for this very instructive thread!