Do 'fluent with_xxx' constructions copy mut self?

Am I correct in assuming that in this function self is not copied, and that the function only takes ownership and gives it back when returning?
With large structs I need to be sure of that.

    pub fn with_resolution(mut self, resolution: u8) -> Self
    {
        self.resolution = resolution;
        self
    }

Changing ownership is a move. So self is moved or copied, depending on whether its type is Copy. Either way its bits are copied when the fn is called and when it returns, although it is a shallow copy.
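To make the "shallow copy" point concrete, here is a minimal sketch. The Level struct and its tiles field are hypothetical, made up for illustration. It checks that moving the struct through a with_xxx method leaves the heap buffer where it was; only the struct's own bytes (here a u8 plus the Vec's pointer, length and capacity) are notionally copied.

```rust
// Hypothetical struct for illustration; the field names are assumptions.
struct Level {
    resolution: u8,
    // Heap-allocated data: only the Vec's pointer, length and capacity
    // live inside the struct itself.
    tiles: Vec<u8>,
}

impl Level {
    fn new() -> Self {
        Level { resolution: 1, tiles: vec![0; 1024] }
    }

    // Takes ownership of `self`, modifies it, and hands ownership back.
    fn with_resolution(mut self, resolution: u8) -> Self {
        self.resolution = resolution;
        self
    }
}

fn main() {
    let level = Level::new();
    let heap_before = level.tiles.as_ptr();

    // `level` is moved into the method and moved back out.
    let level = level.with_resolution(4);

    // The struct's own bytes may have been copied, but the heap buffer was not.
    assert_eq!(heap_before, level.tiles.as_ptr());
    assert_eq!(level.resolution, 4);
}
```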

If this is your API and you don't want this copying, you can pass and return &mut self instead:

    pub fn with_resolution(&mut self, resolution: u8) -> &mut Self

I thought of using &mut self as the argument. But then a construction like

    let s = MyStruct::new().with_xxx().with_yyy().with_zzz();

returns a reference instead of a 'clean' struct instance (which is not directly returnable by a function).

My struct is 128 bytes and does not auto-implement Copy.
So when using mut self we have:
new() -> "create" 128 bytes
with_xxx() -> copy 128 bytes
with_yyy() -> copy 128 bytes
with_zzz() -> copy 128 bytes
?

I mistakenly thought you were using the builder pattern, in which case you would convert the builder to the final output with a build, output or similar method. For example, Command.

no builder pattern, simply setting fields.

This is probably obvious, but with &mut self you can separate the creation, setup, and return:

    let mut level = Level::new();
    level.with_resolution(2).with_lemming_count(42);
    level
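For completeness, a self-contained sketch of that approach. The Level fields and the make_level function are assumptions for illustration; only the method names come from the snippet above.

```rust
// Hypothetical `Level` type; field names are assumptions for illustration.
#[derive(Debug, Default, PartialEq)]
struct Level {
    resolution: u8,
    lemming_count: u32,
}

impl Level {
    fn new() -> Self {
        Level::default()
    }

    // Borrows mutably and returns the same borrow, so calls can be chained.
    fn with_resolution(&mut self, resolution: u8) -> &mut Self {
        self.resolution = resolution;
        self
    }

    fn with_lemming_count(&mut self, lemming_count: u32) -> &mut Self {
        self.lemming_count = lemming_count;
        self
    }
}

fn make_level() -> Level {
    let mut level = Level::new();
    level.with_resolution(2).with_lemming_count(42);
    level // returned by value; the borrows above have already ended
}

fn main() {
    let level = make_level();
    assert_eq!(level, Level { resolution: 2, lemming_count: 42 });
}
```

The cost is the one already mentioned: a chain that starts from a temporary yields a reference to that temporary, so it cannot be bound with let and used later. Creation and setup have to be separate statements.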

Another couple of options are

  • self: Box<Self>
  • a newtype around Box<InnerType>
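A rough sketch of the first option, assuming a hypothetical 128-byte struct called Big. With self: Box<Self>, only the box (one pointer) is moved in and out of the method, and the struct stays at the same heap address. The price is a heap allocation and an extra indirection, so whether this is a win would need measuring.

```rust
// Hypothetical 128-byte struct; the layout is an assumption for illustration.
struct Big {
    resolution: u8,
    payload: [u8; 127],
}

impl Big {
    fn new() -> Box<Self> {
        Box::new(Big { resolution: 0, payload: [0; 127] })
    }

    // Only the Box (a single pointer) is moved in and out of this method.
    fn with_resolution(mut self: Box<Self>, resolution: u8) -> Box<Self> {
        self.resolution = resolution;
        self
    }
}

fn main() {
    let big = Big::new();
    let addr_before: *const Big = &*big;

    let big = big.with_resolution(3);
    let addr_after: *const Big = &*big;

    // The struct itself never moved; only the pointer changed hands.
    assert_eq!(addr_before, addr_after);
    assert_eq!(big.resolution, 3);
    assert_eq!(big.payload.len(), 127);
    assert_eq!(std::mem::size_of::<Big>(), 128);
}
```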

It’s exceedingly unlikely that the compiler ends up emitting any copies given that the methods are trivially inlinable. If you want to be more sure (especially with use from other crates), slap #[inline] on them and stop worrying. Plus, according to most calling conventions, aggregates larger than a couple of usizes get passed by pointer anyway, and the compiler won’t emit defensive copies unless they’re actually required to not lose information.
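As a sketch, marking the setter for inlining would look roughly like this. The Level type and the resolution getter are assumptions for illustration. The attribute is a hint to the compiler, not a guarantee.

```rust
// Hypothetical `Level` type, for illustration only.
pub struct Level {
    resolution: u8,
}

impl Level {
    pub fn new() -> Self {
        Level { resolution: 1 }
    }

    // `#[inline]` is a hint that also makes the body available for
    // inlining into other crates; it does not guarantee inlining.
    #[inline]
    pub fn with_resolution(mut self, resolution: u8) -> Self {
        self.resolution = resolution;
        self
    }

    pub fn resolution(&self) -> u8 {
        self.resolution
    }
}

fn main() {
    let level = Level::new().with_resolution(8);
    assert_eq!(level.resolution(), 8);
}
```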

I think you meant to say #[inline(always)].

I think in short functions inlining is almost guaranteed without adding #[inline].
(ow, i wish i could read this assembler output...)

Anyway: parameter passing in rust is still a bit confusing to me.
For any parameter bigger than a u64, I use a reference (param: &MyStruct),
because of "not-knowing-for-sure-what-happens-so-be-sure-to-avoid-copy-parameter".

Plus according to most calling conventions, aggregates larger than a couple of usizes get passed by pointer anyway, and the compiler won’t emit defensive copies unless they’re actually required to not lose information.

ok...

If you pass by value, you're notionally copying the value. Optimizations may elide the copies, but they are not guaranteed to do so. So it ultimately comes down to how sure you want to be.

Rarely worth it IMO, but if you really want to be sure, why not. Normal inline is enough to enable link-time inlining AFAIK, which is what matters the most.

This has the feel of a premature optimization; the rule should always be to write clear code first, and only engage in performance optimizations when you're confident that they make a difference.

For something like this, that implies that you've written the clear version (moving/copying/shared reference/mutable reference as appropriate), and you've got profiling data or a benchmark showing that changing from the clear and obvious version to a shared reference actually makes a difference in your use case. If you can read the compiler's assembly output, you could also identify copies it's made that it "should" have elided, and show that you get better assembly if you use a shared reference.

Now, there may well be cases where you still write param: &NonZeroU128, because it really is meant to be a shared reference to a single instance, and that's fine - that's how you make that need clear. But using shared references just to avoid copies runs the risk of pessimising your code by having things be cache-unfriendly when letting the compiler move/copy would have resulted in cache-friendly code.

OK, I will dive into that a bit. I can barely read the asm output in the Rust Playground.
And yes: param: MyType is nicer to see than param: &MyType.

It also came from some alien experiences I had in the beginning with self parameters:

    impl Rect
    {
        pub fn offset_no_ref(mut self, delta_x: i32, delta_y: i32)
        {
            // change self fields.
        }
    }

    let mut r: Rect = Rect::new(0, 0, 10, 10);
    r.offset_no_ref(1, 1);

when to my surprise mut self had eaten itself, and r had not changed!
Which is totally counter-intuitive when having done other languages for years.
So I learned to always use &mut self.
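For anyone hitting the same surprise, here is a small runnable sketch of the difference. It assumes Rect derives Copy (otherwise r would be moved and unusable after the by-value call); the method names offset_by_value and offset are made up for illustration.

```rust
// Hypothetical `Rect`; deriving `Copy` is an assumption that keeps the
// caller's variable usable (and unchanged) after the by-value call.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x: i32,
    y: i32,
    w: i32,
    h: i32,
}

impl Rect {
    fn new(x: i32, y: i32, w: i32, h: i32) -> Self {
        Rect { x, y, w, h }
    }

    // Takes `self` by value: this mutates the method's own copy only.
    fn offset_by_value(mut self, delta_x: i32, delta_y: i32) -> Self {
        self.x += delta_x;
        self.y += delta_y;
        self
    }

    // Takes `&mut self`: this mutates the caller's rectangle in place.
    fn offset(&mut self, delta_x: i32, delta_y: i32) {
        self.x += delta_x;
        self.y += delta_y;
    }
}

fn main() {
    let mut r = Rect::new(0, 0, 10, 10);

    let shifted = r.offset_by_value(1, 1);
    assert_eq!(r, Rect::new(0, 0, 10, 10)); // `r` itself is untouched
    assert_eq!(shifted, Rect::new(1, 1, 10, 10));

    r.offset(1, 1);
    assert_eq!(r, Rect::new(1, 1, 10, 10)); // now `r` has changed
}
```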

I know, I know, I should really study this stuff...
Around 1/2 year of writing / learning now and still feel like an idiot sometimes.
On the other hand: I have an insanely fast chess-engine and a running game on-screen :slight_smile:

Are there some guidelines regarding passing struct arguments?

For a method that is simply setting a field, you probably want that to be as flexible as possible. Therefore you shouldn't require that self is moved/consumed just to set one of its fields. Moving a struct is not possible, for example, when it is itself a field in another struct. Therefore using &mut self for setter methods makes more sense, in general at least.

If the setter methods are only used to create an object, not modify it later, then the builder pattern is probably a better choice. And in that case you can choose whether to use self or &mut self as described in API Guidelines.

It's because Rust is unusual in having ownership semantics encoded in the language.

Unless you're writing out raw bytes for the processor to consume, you should be writing your code with a view to giving the maintenance programmer as much clarity as possible; this has the nice side effect that the compiler is more likely to "understand" the clear version and transform it to the optimal machine code, but that is strictly a side effect.

If you need to take ownership of your parameter (so it's now yours, and the caller loses access to their version of it), take self, or foo: Foo. This tells me that the function takes ownership of its parameter, and that I'll need to make arrangements to have a copy to hand if I need the parameter's value later. Note as a special case of this that if you unconditionally call clone() on a parameter, you should always take ownership, and minimise the number of clone() calls.

If you need exclusive access to the caller's version of the parameter, leaving ownership in the hands of the caller and preventing other code from having access, take &mut self or foo: &mut Foo; this tells me that you are borrowing it from the caller, but you're taking an exclusive borrow and hence can mutate it.

If you want shared access to the caller's version of the parameter, leaving ownership in the hands of the caller but allowing other code to have access, take &self or foo: &Foo; this tells me that you're borrowing it from the caller, and you're taking a shared borrow so you're OK with other code having access at the same time.

And those semantics are the things you care about when writing the code - let the compiler fuss about the implementation details (noting that it can, for example, notice that it's doing wasted memory copies, and simply not copy a variable from the caller's stack frame to the callee's stack frame).

Remember, too, that you need to benchmark or examine assembly in release mode, where the optimizer's had a chance to spot wasted stack-to-stack copies, and not debug mode where the optimizer isn't run.
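The three signatures described above can be sketched as follows. Foo and the function names are hypothetical, chosen only to show what each parameter type communicates to the caller.

```rust
// Hypothetical type and functions, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Foo {
    name: String,
}

// Takes ownership: the caller can no longer use its `Foo` afterwards.
fn consume(foo: Foo) -> String {
    foo.name
}

// Exclusive borrow: the caller keeps ownership, but nobody else may
// access the value while this function mutates it.
fn rename(foo: &mut Foo, name: &str) {
    foo.name = name.to_string();
}

// Shared borrow: read-only access; other shared borrows may coexist.
fn name_len(foo: &Foo) -> usize {
    foo.name.len()
}

fn main() {
    let mut foo = Foo { name: String::from("a") };

    assert_eq!(name_len(&foo), 1);
    rename(&mut foo, "abc");
    assert_eq!(name_len(&foo), 3);

    let name = consume(foo);
    assert_eq!(name, "abc");
    // `foo` has been moved; using it here would be a compile error.
}
```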

Ok. I got that. Benchmarking and assembler... I am also reading the API guidelines.
Now thinking: why would you ever write a function with a mut self argument (except in this question's case, chained field setting)?
The ownership of the argument is taken and nothing happens.

It's not uncommon when using self (transferring ownership) that the new owner needs to change it before doing something else with it, such as passing its reference to another fn, storing it in a struct field, etc.

The mut here is just like the mut in let mut. It is not that significant, in either case, which is why people say it's just a lint.
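A minimal sketch of that situation, with made-up types (Scores and Leaderboard are assumptions, not from any real API): the method takes ownership, changes the value, and then stores it in another struct.

```rust
// Hypothetical types, for illustration only.
struct Scores(Vec<u32>);

struct Leaderboard {
    sorted: Vec<u32>,
}

impl Scores {
    // `mut self`: we own the value, so we may change it before handing it on.
    fn into_leaderboard(mut self) -> Leaderboard {
        self.0.sort_unstable_by(|a, b| b.cmp(a)); // highest score first
        Leaderboard { sorted: self.0 }
    }
}

fn main() {
    let board = Scores(vec![3, 9, 1]).into_leaderboard();
    assert_eq!(board.sorted, vec![9, 3, 1]);
}
```

Writing fn into_leaderboard(self) and then let mut this = self; in the body would behave the same, which is why the mut here does not affect callers.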

In contrast with reference types (&mut self versus &self),[1] the mut marker on a binding (mut self versus self)[2] doesn't have any impact on the API. Non-mut bindings prevent overwriting or taking a &mut _, but from an API point of view, that tells you nothing since the body can just do this:

    fn method(self) {
        let mut this = self;
        // ...
    }

For this reason, as I recall, Rustdoc just doesn't show mut binding markers.


Why would you ever take mut self? You already identified one use case: ownership based builders. Here's another example from Iterator: using the &mut self API of self to perform some task.
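A sketch of that second use case, using a hypothetical Countdown iterator and a made-up total method rather than anything from the standard library: the method consumes the iterator, and needs the mut binding only so it can call next, which takes &mut self.

```rust
// Hypothetical iterator that yields n, n-1, ..., 1.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}

impl Countdown {
    // Consumes the iterator; `mut` lets the body call the `&mut self`
    // method `next` on the value it now owns.
    fn total(mut self) -> u32 {
        let mut sum = 0;
        while let Some(n) = self.next() {
            sum += n;
        }
        sum
    }
}

fn main() {
    // 3 + 2 + 1
    assert_eq!(Countdown(3).total(), 6);
}
```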


  1. which are different types ↩︎

  2. which are the same type ↩︎
