Return value referencing local variable

Very new to Rust here...

// begin somewhere in an external crate...
pub struct MyData<'a> {
    something: &'a str,
}

impl<'a> MyData<'a> {
    pub fn new(s: &'a str) -> Self {
        MyData { something: s }
    }

    pub fn get_something(&self) -> &'a str {
        self.something
    }
}
// end somewhere in an external crate...

// ...somewhere in my code trying to create an instance...
pub fn get_my_data<'a>(data_piece: &'a str) -> MyData<'a> {
    let full_data = format!("My data piece is here ->{}<-", data_piece);
    MyData::new(&full_data)
}

Seems like a basic common pattern - collect some data, modify it somewhat, pass it to create an instance.

I do understand the problem!

The question is - what is the proper way to handle it?
Am I totally out of luck, because of how MyData is implemented?

If possible, could you tell us what crate you're trying to use and which struct? It might be easier to come up with a potential solution if we knew the use case.
You could instead return a String from get_my_data and construct the MyData in the calling code:

// begin somewhere in an external crate...
#[derive(Debug)]
pub struct MyData<'a> {
    something: &'a str,
}

impl<'a> MyData<'a> {
    pub fn new(s: &'a str) -> Self {
        MyData { something: s }
    }

    pub fn get_something(&self) -> &'a str {
        self.something
    }
}
// end somewhere in an external crate...

// ...somewhere in my code trying to create an instance...
pub fn get_my_string(data_piece: &str) -> String {
    format!("My data piece is here ->{}<-", data_piece)
}

// Use the data somewhere else
fn main() {
    let data_piece = "foo bar baz";
    let my_string = get_my_string(data_piece); 
    let my_data = MyData::new(&my_string);
    // Do various things with my_data:
    dbg!(my_data);
}

This code is wrong. Your function body allocates a new string as a local variable (full_data), which will be destroyed when the function returns, but your signature says the contents of MyData live as long as the data_piece argument. That would require full_data to outlive the function it was defined in. rustc spots the contradiction (a "dangling reference") and reports it as a compile error.

You've actually created a new string here (full_data) and aren't modifying the original. Your MyData object will need to take ownership of this newly allocated String.

pub struct MyData {
    something: String,
}

I think his issue is that MyData is a struct defined in some other crate, not a struct that he defines in his code. He can't modify the definition, so I think the next best thing would be to return and pass around the String, and construct a MyData right before it's used.

Well that makes things more interesting...

I had a similar issue with fluent-rs where a parsed translation would borrow from the original text it came from. When trying to modify the translations (I wanted to use google translate to generate translations for other languages) I ran into the same issue the OP has.

My solution was similar to this, create an object which outlives the original MyData and use that to store the modified data. This then "extends" the lifetime so that your modified objects live long enough.

For what it's worth, not having a Cow<str> AST isn't a show stopper if you want to create your own tooling... It just makes things a bit more annoying because you've got to thread the needle to make the borrow checker happy.

I created a StringPool that extends a string reference's lifetime to that of the string pool (it's just a fancy arena). From there you can traverse the AST returning copies of the original nodes or swapping in your own text which is owned by the string pool.

The lifetime annotations get a bit hectic because we're using variance to find a lifetime which satisfies both the 'ast from our original AST nodes and the StringPool. It also doesn't help that the example above has a &HashMap<&str, String> which maps some strings from the original AST nodes to a translated version from, for example, Google Translate.
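The core idea (park the modified text in something that outlives the returned value) can be sketched without a full arena. This is only a minimal illustration, not the StringPool described above: the MyData stand-in is copied from the question, while the storage parameter and demo_storage are hypothetical names invented for this example.

```rust
// Stand-in for the external crate's type, copied from the question.
pub struct MyData<'a> {
    something: &'a str,
}

impl<'a> MyData<'a> {
    pub fn new(s: &'a str) -> Self {
        MyData { something: s }
    }

    pub fn get_something(&self) -> &'a str {
        self.something
    }
}

// Hypothetical helper: the caller lends a String that outlives the result,
// so the returned MyData can borrow from it instead of from a local.
pub fn get_my_data<'a>(data_piece: &str, storage: &'a mut String) -> MyData<'a> {
    // Overwrite the caller's buffer with the modified text.
    *storage = format!("My data piece is here ->{}<-", data_piece);
    // Borrow from the buffer, which is valid for the whole of 'a.
    MyData::new(storage.as_str())
}

// Hypothetical usage: `storage` is declared before `my_data`, so it lives longer.
pub fn demo_storage() -> String {
    let mut storage = String::new();
    let my_data = get_my_data("foo", &mut storage);
    my_data.get_something().to_owned()
}
```

The trade-off is that each call needs its own buffer for as long as the resulting MyData is alive, which is the bookkeeping an arena would otherwise take over.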

To recap:

  1. Don't use a helper/factory function to create instances. That's fine, though not what I was looking for in a general sense. My example is overly simple. It can be a lot messier.
  2. Copy the local value to a longer living "container", and use a reference to the copy instead. Feels like something that should be a built-in in the language or the standard library instead of having to roll it by hand, doesn't it?

Thanks.

It's perfectly fine to use helper functions to create things.

Often this lets you simplify setup when you are working with something complex (e.g. an HTTP client using TLS) where the vast majority of users will want the same thing (I don't care how you do encryption, just use my OS certificates and "good enough" algorithms).

Not exactly... My solution was a workaround because the library I was using was quite awkward and not designed to be used the way I was trying to use it. It is not how one would typically write Rust and the extra lifetime restrictions were imposed because that code loads translations whenever Firefox starts, so they had to make ergonomics sacrifices in the name of performance.

Normally, if your object needs to control a resource then it should own that resource. The feature built into the language that you are talking about is the ability for a struct to own the data it contains.

It is. You just need to make sure that when the data is modified you either modify it in-place (therefore allowing you to reuse the reference), or you create a modified copy and your new types take ownership of this modified version.

You need to take ownership because otherwise the modified version will be destroyed when you leave your function, and your MyData will be left with a reference to destroyed data (which rustc's borrow checker rejects as an error).
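For a struct you do control, std::borrow::Cow expresses both cases (reuse the original reference, or own a modified copy) in one field. The sketch below is hypothetical: CowData, get_cow_data and demo_cow are illustrative names, and the MyData in this thread holds a plain &str and cannot be changed this way.

```rust
use std::borrow::Cow;

// Hypothetical variant of MyData that can either borrow or own its text.
pub struct CowData<'a> {
    something: Cow<'a, str>,
}

impl<'a> CowData<'a> {
    // Accepts a &'a str (stored as Cow::Borrowed) or a String (stored as Cow::Owned).
    pub fn new(s: impl Into<Cow<'a, str>>) -> Self {
        CowData { something: s.into() }
    }

    pub fn get_something(&self) -> &str {
        &self.something
    }
}

// The modified copy is moved into the struct, so nothing refers to a local.
pub fn get_cow_data(data_piece: &str) -> CowData<'static> {
    CowData::new(format!("My data piece is here ->{}<-", data_piece))
}

// Hypothetical usage showing the borrowed and the owned case side by side.
pub fn demo_cow() -> (String, String) {
    let borrowed = CowData::new("unchanged");
    let owned = get_cow_data("foo");
    (
        borrowed.get_something().to_owned(),
        owned.get_something().to_owned(),
    )
}
```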


Speaking as a Rust newbie myself, I think that is the wrong conclusion to draw from this discussion. The "builder pattern" is useful in Rust as well: https://en.wikipedia.org/wiki/Builder_pattern (Replace the word "class" with "struct" when reading that.)

That does not sound like a good conclusion either.

The question is: What do you want your struct to do?

  1. If it is going to be a container for data then it should actually contain and "own" the data, so that the data it contains naturally lives as long as the struct. To that end one's struct would contain String, which itself contains the text, rather than &str, which does not and is only a reference to data somewhere else, owned by someone else. Similarly for types other than String.

  2. Or perhaps you want your struct to be some kind of record of interesting data elsewhere, which it does not itself contain or own. In that case feel free to use references, but be sure the data the references refer to lives at least as long as the struct.

Not to me. I don't want a systems programming language copying stuff around behind my back without me asking for it.


I would likely just create a wrapper type and implement From or Deref, depending on how you'll use it.
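One possible reading of that suggestion, as a minimal sketch: a wrapper owns the String and hands out a MyData view on demand. OwnedMyData, as_my_data and demo_wrapper are hypothetical names. Deref here targets str rather than MyData, because returning a &MyData would require storing a MyData that borrows from the same wrapper.

```rust
use std::ops::Deref;

// Stand-in for the external crate's type, copied from the question.
pub struct MyData<'a> {
    something: &'a str,
}

impl<'a> MyData<'a> {
    pub fn new(s: &'a str) -> Self {
        MyData { something: s }
    }

    pub fn get_something(&self) -> &'a str {
        self.something
    }
}

// Hypothetical wrapper that owns the text the external type borrows from.
pub struct OwnedMyData {
    text: String,
}

impl From<String> for OwnedMyData {
    fn from(text: String) -> Self {
        OwnedMyData { text }
    }
}

// Deref to str so the wrapper can be used wherever a &str is expected.
impl Deref for OwnedMyData {
    type Target = str;

    fn deref(&self) -> &str {
        &self.text
    }
}

impl OwnedMyData {
    // Build a short-lived MyData that borrows from the wrapper.
    pub fn as_my_data(&self) -> MyData<'_> {
        MyData::new(&self.text)
    }
}

// The helper can now return an owned value instead of a dangling borrow.
pub fn get_my_data(data_piece: &str) -> OwnedMyData {
    OwnedMyData::from(format!("My data piece is here ->{}<-", data_piece))
}

// Hypothetical usage: the view is created right where MyData is needed.
pub fn demo_wrapper() -> (String, usize) {
    let owned = get_my_data("foo");
    let view = owned.as_my_data();
    (view.get_something().to_owned(), owned.len())
}
```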

@ZiCog I'm guessing you missed the parts where I said that I understand the issue (i.e. no need to explain it to me) and that the struct in question comes from another crate (i.e. it is what it is, I can't change it).

@ekuber
This is a really neat technique. Thanks!

Sorry, yes, likely I did.

All is well then.