Is it better for functions to own their parameters?

I was told that when defining struct fields, it's better to own the data (to avoid lifetime issues):

#[derive(Debug)]
struct Keymap<'a> {
    key: char,
    message: &'a str
}

vs.

#[derive(Debug)]
struct Keymap {
    key: char,
    message: String
}

Does this advice apply to function arguments?

#[derive(Debug)]
struct Keymap {
    key: char,
    message: String,
}

fn print_message(message: String) {
    // do other things with message
    println!("{:?}", message);
}

fn main() {
    let keymaps = vec![Keymap {
        key: 't',
        message: "Test".to_string(),
    }];
    let key = 't';
    let found_keymap = keymaps.iter().find(|k| k.key == key).unwrap();

    print_message(found_keymap.message.clone());
}

vs.

#[derive(Debug)]
struct Keymap {
    key: char,
    message: String,
}

fn print_message(message: &str) {
    // do other things with message
    println!("{:?}", message);
}

fn main() {
    let keymaps = vec![Keymap {
        key: 't',
        message: "Test".to_string(),
    }];
    let key = 't';
    let found_keymap = keymaps.iter().find(|k| k.key == key).unwrap();

    print_message(&found_keymap.message);
}

I don't think so.

If the function only needs to read some data from a struct you pass into it then it only needs a reference.

If the function is going to update some data in a struct you pass into it then it needs a mutable reference.

Both the above ensure the caller still has ownership of the struct.

If you pass ownership of the struct into the function the caller cannot use it anymore. Not unless you also return it back to the caller, which seems clunky to me but useful sometimes. Or clone a copy to give to the function.
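A minimal sketch of the three cases above, reusing the Keymap struct from the question. The function names describe, shout, and into_message are made up for illustration.

```rust
#[derive(Debug)]
struct Keymap {
    key: char,
    message: String,
}

// Read-only access: a shared reference is enough.
fn describe(keymap: &Keymap) -> String {
    format!("{} => {}", keymap.key, keymap.message)
}

// In-place update: an exclusive (mutable) reference.
fn shout(keymap: &mut Keymap) {
    keymap.message = keymap.message.to_uppercase();
}

// Takes ownership: the caller cannot use the value afterwards.
fn into_message(keymap: Keymap) -> String {
    keymap.message
}

fn main() {
    let mut keymap = Keymap {
        key: 't',
        message: "Test".to_string(),
    };

    println!("{}", describe(&keymap)); // caller still owns `keymap`
    shout(&mut keymap); // caller still owns `keymap`
    let message = into_message(keymap); // `keymap` is moved here
    // println!("{:?}", keymap); // would not compile: use of moved value
    println!("{message}");
}
```

Only the last call ends the caller's access to the struct; the first two leave ownership where it was.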

The advice doesn't really stand for types, either.

It's a 0-th order approximation for beginners who aren't yet familiar with lifetime declaration syntax, correct patterns and idioms, and may have an incorrect mental model of references keeping their referent alive (which they do in GC languages but not in Rust).

It's perfectly fine to put references into structs that are meant to be temporary. Those are not your typical "data structures" (which should indeed, in contrast, own their data), though. Usually, they are wrappers or RAII guards of some sort, creating of which is a somewhat advanced-ish use case.
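As a rough illustration of such a temporary struct, here is a sketch of a short-lived view that borrows text owned elsewhere. The Words type and its methods are hypothetical names, not something from the standard library.

```rust
// A short-lived view over text that is owned by someone else.
struct Words<'a> {
    rest: &'a str,
}

impl<'a> Words<'a> {
    fn new(text: &'a str) -> Self {
        Words {
            rest: text.trim_start(),
        }
    }

    // Returns the next whitespace-separated word, borrowed from the original text.
    fn next_word(&mut self) -> Option<&'a str> {
        if self.rest.is_empty() {
            return None;
        }
        let end = self
            .rest
            .find(char::is_whitespace)
            .unwrap_or(self.rest.len());
        let (word, tail) = self.rest.split_at(end);
        self.rest = tail.trim_start();
        Some(word)
    }
}

fn main() {
    let owned = String::from("open the file");
    let mut words = Words::new(&owned); // `words` must not outlive `owned`
    while let Some(word) = words.next_word() {
        println!("{word}");
    }
}
```

The lifetime parameter 'a ties the view to the String it borrows from, so the compiler rejects any use of the view after that String is gone.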

Functions are more frequently such that they only need temporary views of the passed data, so passing references to functions is itself more frequent than putting references into user-defined types. If you don't need ownership in a function for implementing its body, then most definitely do not force the caller to give up ownership.

"Better" is too strong a word. As a general starting point, treat struct fields as owned, parameters as shared references, and return values as owned. ('static refs are essentially owned, but you give up the ability for the program to create or modify the data while running.)

To avoid repeating what others have written, I will just add: parameters that are Copy are ones you are more likely to take by value than by reference.
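A small sketch of that point, assuming a hypothetical is_bound helper: the char is taken by value because it is Copy, while the list of bindings is borrowed.

```rust
// `char` is Copy, so passing it by value leaves the caller's copy usable.
fn is_bound(key: char, bindings: &[char]) -> bool {
    bindings.contains(&key)
}

fn main() {
    let bindings = vec!['t', 'q'];
    let key = 't';
    println!("{}", is_bound(key, &bindings));
    println!("{key}"); // still usable: `key` was copied, not moved
}
```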

I might get dinged on the strict technicalities of what I'm about to describe. Notwithstanding, here is my take on the question of whether it's good practice, or a good starting point in the design, to have your functions take ownership of the parameters.

taking ownership ~ "pass by value" with a Rust twist

My mental model is that taking ownership is equivalent to "passing by value". This is where a copy* of the value is made during the "hand-off" from the caller's scope to that of the function (where it is consumed in the body of the function). So once the body of the function falls out of scope, unless the function returns it, the value is dropped.

There are other "side-effects" of moving ownership over to the function that are unique to Rust**. Once the caller passes the value to such a function, the caller can no longer use it; the value was "moved" out of the caller's scope. Another view of this is that you have given the function "all rights" to that data (read/write and consume). However, this transfer of "rights" can be made temporary by having your function return the value to the caller: fn tmp_ownership(input: MyData) -> MyData.

borrow ~ read-access

The model of "rights" is useful because it focuses on the ultimate utility, the motivation, and less on "performance" and optimization which I only consider later in the design. With that said, when your function only needs read access to the data, pass it a non-exclusive borrow reference: &MyData. I think it's a given that we all understand why it's inherently cheaper and easier to manage "read-only" access to the data (perhaps name the required lifetime 'data).

exclusive borrow only when taking ownership is not an option

The last choice is my least preferred, but I sometimes "capitulate" to it based on convention or usability: the "exclusive borrow". The exclusivity (only one) gives you the right to mutate the underlying data. This is my least preferred not because I don't believe in mutation, but because I can get mutation by transferring ownership, the first option. I like transferring ownership as a "first goto" for mutation because it comes with a built-in accounting system that requires I make my intention explicit: give ownership, mutate, return ownership. It avoids the less explicit "side-effect" of fn foo(input: &mut MyData) -> (). The implied return value of () is my point: it looks like the function doesn't do anything, yet "behind the scenes" it mutates your data. I don't want to overstate this point; I make it only to provide a way to prioritize what you need and how to implement it without introducing more complexity than is required. Having &mut references in code requires extra attention.
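To make the contrast concrete, here is a sketch of both styles side by side. MyData and its count field are placeholders; tmp_ownership and foo follow the signatures mentioned above.

```rust
#[derive(Debug, PartialEq)]
struct MyData {
    count: u32,
}

// Ownership round trip: the signature shows a value going in and coming back.
fn tmp_ownership(mut input: MyData) -> MyData {
    input.count += 1;
    input
}

// Exclusive borrow: same effect, but the change happens through the reference
// and nothing is returned.
fn foo(input: &mut MyData) {
    input.count += 1;
}

fn main() {
    let data = MyData { count: 0 };
    let mut data = tmp_ownership(data); // rebind the returned value
    foo(&mut data); // mutated in place
    println!("{:?}", data);
}
```

Which style reads better is a matter of taste; the &mut version is the more common convention, and it avoids having to rebind the result at every call site.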

a copy event always happens when calling a function, so what are we actually talking about

* I used the word "copy". I'm using it for what it means "in english", but there is an important contrast to Copy in Rust. Whatever you put in the parameter of the function is copied, so the use of that word is correct. The question is what is copied: a borrow ref, an exclusive borrow ref, or the value itself. They all involve a bit-wise copy and transfer of ownership; it's just a question of "of what", and of whether the memory of the original remains "valid" (see below). For instance, "I own the borrow ref" before and after I used it to call the function. From a performance perspective, the only concern might be copying the value itself. The good news is that bit-wise copying is generally cheap and is thus only a secondary, downstream optimization concern.

optional toggle to a more intuitive "pass by value" behavior

** You can "opt-out" of the default "move" semantic for MyData by having it implement Copy [1]; doing so signals to the compiler to use a "copy" semantic, resulting in two independent, bitwise copies. This latter behavior is more the "pass by value" behavior you might expect/intuit. The Rust primitive types (u32, char, u8) and the shared reference type & all implement Copy and thus "opt-out" of the "move" semantic [2]. The exclusive reference type &mut does not implement Copy; it is moved (or implicitly reborrowed) when passed. MyData, like Vec<T>*** and "everything else", follows the default move semantic in Rust. Again though, whether or not we "opt-out" of the move semantic [3], we aren't changing what gets copied when [4].
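A short sketch of the difference, using a hypothetical Point type that derives Copy and a MyData type that does not:

```rust
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

#[derive(Debug)]
struct MyData {
    values: Vec<i32>, // holds a heap allocation, so MyData cannot be Copy
}

fn sum_point(p: Point) -> i32 {
    p.x + p.y
}

fn sum_data(d: MyData) -> i32 {
    d.values.iter().sum()
}

fn sum_ref(d: &MyData) -> i32 {
    d.values.iter().sum()
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", sum_point(p));
    println!("{}", sum_point(p)); // fine: `p` was copied, not moved

    let d = MyData { values: vec![1, 2, 3] };
    let r = &d;
    println!("{}", sum_ref(r));
    println!("{}", sum_ref(r)); // fine: shared references are Copy

    println!("{}", sum_data(d)); // `d` is moved here
    // println!("{}", sum_data(d)); // would not compile: use of moved value
}
```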

*** `Copy` and `Drop` are mutually exclusive

Vec<T> and the like do not implement Copy, nor can they be made to do so, because a Vec<T> holds an internal pointer to memory located on the heap; creating a bit-wise copy would effectively create an alias, defeating the purpose of the accounting. More generally, the copies would point to the same, now shared, resource. This applies equally to any resource that needs to be formally de-allocated or closed. That cleanup is specified with the Drop trait. Hence the mutual exclusivity between Drop and Copy implementations: a type can have one or the other, not both. Finally, Vec<T> does implement Clone (provided T does). The implementation makes a new heap allocation, so the clone's internal pointer differs from the original's and points to the new memory. When the clone is dropped, its own call to drop releases its own separate resources.
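One way to observe this, as a sketch: compare the buffer pointers of a non-empty Vec and its clone. The helper name is_independent_copy is made up for the example.

```rust
// True when the two slices have equal contents but live in separate buffers.
fn is_independent_copy(a: &[i32], b: &[i32]) -> bool {
    a == b && a.as_ptr() != b.as_ptr()
}

fn main() {
    let original = vec![1, 2, 3];
    let mut cloned = original.clone(); // new heap allocation

    println!("{}", is_independent_copy(&original, &cloned));

    cloned.push(4); // does not affect `original`
    println!("{:?} {:?}", original, cloned);

    // Adding `#[derive(Clone, Copy)]` to a struct containing a Vec, or to a
    // type that implements Drop, is rejected by the compiler.
}
```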


  1. unless you need to implement Drop, see below ↩︎

  2. the ergonomics of using & would be horrible without Copy ↩︎

  3. (by implementing the Copy trait, a signaling/flagging trait like Send and Sync) ↩︎

  4. Last time I did a deep dive here, the only difference in what happens "under the hood" was how the memory that hosts the original value is tagged ("move semantic" -> invalid memory, "copy semantic" -> valid memory). Again, the key is that in both scenarios memory is bit-wise copied ↩︎

Structs are more like function return values than function parameters.

Reusing one of my previous posts on this question:

  • Struct fields: almost always owned.
  • Function arguments: almost always borrowed.

Borrowing gives a temporary permission to access a value, and that fits perfectly what functions need to do — you use the arguments during the function call, and not longer.

The exceptions may be in setters and constructors like new that may be slightly more efficient if you give them owned values (or a generic Into<owned>) for values they keep or return.
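A sketch of such a constructor, assuming the Keymap struct from the question. Taking impl Into<String> lets callers pass either a &str or a String they already own.

```rust
#[derive(Debug)]
struct Keymap {
    key: char,
    message: String,
}

impl Keymap {
    // The constructor keeps `message`, so it asks for something it can own.
    fn new(key: char, message: impl Into<String>) -> Self {
        Keymap {
            key,
            message: message.into(),
        }
    }

    // Readers only need a borrowed view.
    fn message(&self) -> &str {
        &self.message
    }
}

fn main() {
    let a = Keymap::new('t', "Test"); // &str is converted to a String
    let owned = String::from("Quit");
    let b = Keymap::new('q', owned); // String is moved in, no extra copy

    println!("{} => {}", a.key, a.message());
    println!("{} => {}", b.key, b.message());
}
```

If new took &str instead, a caller who already had a String would pay for a second allocation inside the constructor.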

Closures are a kind of function, and closures in various callbacks may need to be move || closures, which own their context.
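For illustration, a sketch of two move closures: one returned from a hypothetical make_printer helper, and one handed to std::thread::spawn. In both cases the closure may outlive the scope that created it, so it has to own what it captures.

```rust
use std::thread;

// Builds a callback that owns its message.
fn make_printer(message: String) -> impl Fn() -> String {
    move || format!("pressed: {message}")
}

fn main() {
    let printer = make_printer("Test".to_string());
    println!("{}", printer());

    let message = String::from("from thread");
    // The spawned thread may outlive this scope, so the closure must own `message`.
    let handle = thread::spawn(move || message.len());
    println!("{}", handle.join().unwrap());
}
```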
