Traits and references

I'm trying to wrap my head around why you might want to make a trait take &self vs &mut self vs self, and I'm having trouble understanding how these choices interact with the ability to implement a trait for a reference.

Let's take Into for example:

pub trait Into<T> {
    fn into(self) -> T;
}

Why does Into consume self? Some objects may own memory on the heap (such as a Vec<T>), or more generally hold non-Clone/Copy values (such as a &mut T), so consuming self means we can transfer ownership of that memory to the new object without cloning the inner Vec<T> (another allocation). This also provides some flexibility -- we can implement Into<T> for &'a CopyableStruct if there's nothing 'unique' contained in the struct.
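To make that concrete, here's a minimal sketch (Wrapper is a made-up type, not from the thread): the conversion hands over the existing heap allocation rather than cloning it.

```rust
// Hypothetical type for illustration: converting Wrapper into Vec<i32>
// transfers ownership of the existing heap buffer -- no new allocation.
struct Wrapper {
    data: Vec<i32>,
}

impl Into<Vec<i32>> for Wrapper {
    fn into(self) -> Vec<i32> {
        self.data // move the Vec out of the consumed self
    }
}

fn main() {
    let w = Wrapper { data: vec![1, 2, 3] };
    let v: Vec<i32> = w.into(); // w is consumed here
    assert_eq!(v, vec![1, 2, 3]);
}
```

(Idiomatically you'd implement From and get Into for free via the std blanket impl, but implementing Into directly matches the discussion here.)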

Question 1: for a struct whose members are all Copy types, do we lose anything by implementing Into for a reference to this struct but not for the struct directly? I tried looking at a few scenarios, and it looks like the answer is no...but I'm sure I've missed something.

struct Test{ //our struct w/ non-expensive Copys
    inner: i32
}

impl<'a> Into<i32> for &'a Test{ //impl on the reference to the struct
    fn into(self) -> i32 { self.inner } //implicit copy of the int
}

fn uses_into<T: Into<i32>>(arg: T) -> i32 {
    arg.into()
}

struct StoresInto<T: Into<i32>>{
    into: T
}

fn main(){  
    let mut test = Test{inner: 5};
    {
        //if we have the struct
        //any function that takes T: Into as an argument can be used by passing a reference to our struct
        uses_into(&test);
        //any struct that stores T: Into can be used by passing a reference to our struct
        let stores_into = StoresInto{into: &test};
    }
    
    {
        //if we only have a &mut reference to this struct...
        let mut_ref : &mut Test = &mut test;
        //any function that takes T:Into as an argument can be used by deref/ref pattern
        {
            uses_into(&*mut_ref);
            //any struct that stores T: Into can be used by deref/ref pattern
            let stores_into = StoresInto{into: &*mut_ref};
        }
        //I find it weird that mutable references aren't automatically coerced/reborrowed into regular references...

        mut_ref.inner = 6; //mutable reference is still usable as long as the other borrows are dropped
    }
} 

Question 2: When writing a new trait, why would you ever have a function signature that takes &self as an argument? Wouldn't it be better to always write traits to take self and let the implementor decide what 'level' of ownership was required to use the trait for that struct? (E.g. do you need the memory owned by the struct? Do you need a unique reference to the struct? Or is a regular reference good enough?). The above example appears to demonstrate that the implementor does have this level of control....

I guess any discussion/links to discussion around these topics with a focus on best practices is what I'm looking for here...bonus points for using AsRef and friends to smooth out the edges of this stuff


I think you should reason about it based on the semantics of the operation and the trait itself. Into is a conversion function - it's intended to take a T and return a U, as a conversion. The conversion bit implies that you no longer care about the T once you have the U. As far as the trait Into is concerned, the T has been consumed.

Now comes the part where you decide to implement Into for some type. Since into takes self (i.e. owns it now), you have more flexibility in some cases (i.e. ability to move fields/values out of T and into U). In the &'a Test example you have above this isn't material because you're moving an i32, but that's a Copy type and so you can "convert" from &'a Test into i32 without needing ownership of Test. But, if Test had, say, a String field that you wanted to move as part of the conversion, that wouldn't work with &'a Test.
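A sketch of the String case (Test2 is a made-up type for illustration): the owned impl can move the field out, while the reference impl is forced to clone.

```rust
struct Test2 {
    name: String,
}

// With ownership we can move the String out -- the heap data is reused.
impl Into<String> for Test2 {
    fn into(self) -> String {
        self.name
    }
}

// From a reference we can't move the field out; the best we can do is clone.
impl<'a> Into<String> for &'a Test2 {
    fn into(self) -> String {
        self.name.clone() // extra allocation, forced by only having &Test2
    }
}

fn main() {
    let t = Test2 { name: String::from("hi") };
    let borrowed: String = (&t).into(); // t still usable afterwards
    let owned: String = t.into();       // t consumed here
    assert_eq!(borrowed, owned);
}
```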

But the important bit is to think of the trait's semantics without considering all possible types that may decide to implement it (which is impossible since that set is unbounded and not known to you if this is part of a library that you're writing).

Another way to think of it is via ordinary functions. Say you have:

fn foo<T>(x: T) { ...}

This function says it consumes x. Of course if you end up passing some Copy type to it, the source isn't actually consumed. But again, that's the caller's choice - you express intent in the signature.
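A quick illustration of that point, with a trivial foo:

```rust
fn foo<T>(_x: T) {
    // consumes _x as far as the signature is concerned
}

fn main() {
    let n = 5_i32; // i32 is Copy
    foo(n);
    println!("{}", n); // still usable: a copy was moved in, not n itself

    let s = String::from("hello"); // String is not Copy
    foo(s);
    // println!("{}", s); // error: s was moved into foo
}
```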

Often I want to define a trait with multiple methods, so taking &self allows one method to neither consume nor modify the object, while another method does modify the object.

However, in addition, I would always use &self for a method that is not intended to modify or consume an object. Otherwise it is just confusing to use, particularly in a generic context. eg

trait HasColor {
  fn is_green(&self) -> bool;
  fn is_blue(&self) -> bool;
}
fn is_watery<T: HasColor>(t: T) -> bool {
    t.is_green() || t.is_blue()
}

If is_green and is_blue took self, we couldn't write this function without adding the restriction that &T implements HasColor. This would inhibit writing general code that uses the trait.


Thanks @vitalyd, I think I get what you're saying...I think there's a good counterexample in Add to this piece here, though:

But the important bit is to think of the trait’s semantics without considering all possible types that may decide to implement it (which is impossible since that set is unbounded and not known to you if this is part of a library that you’re writing).

Add consumes self, presumably to afford implementors the same sort of flexibility you mentioned. I don't think there's anything about the semantics of Add that suggest that we no longer care about the thing we're adding (although AddAssign might qualify here). What I'm wondering is - why not always leave it up to the implementors? Would it be a good rule of thumb to write traits that take self unless there's explicitly a reason not to? What are some examples of things that would preclude taking self as an argument in a trait?
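For what it's worth, the standard library does leave that choice with implementors here: the primitive types implement Add both by value and for references (e.g. impl Add<&i32> for &i32), so callers who don't want to give up ownership can add through borrows. A sketch of the same pattern on a custom type:

```rust
use std::ops::Add;

struct Test {
    inner: i32,
}

// By-value impl: consumes both operands.
impl Add for Test {
    type Output = Test;
    fn add(self, other: Test) -> Test {
        Test { inner: self.inner + other.inner }
    }
}

// Reference impl: borrows both operands, so the caller keeps them.
impl<'a, 'b> Add<&'b Test> for &'a Test {
    type Output = Test;
    fn add(self, other: &'b Test) -> Test {
        Test { inner: self.inner + other.inner }
    }
}

fn main() {
    let a = Test { inner: 1 };
    let b = Test { inner: 2 };
    let c = &a + &b; // a and b are still usable afterwards
    assert_eq!(c.inner, 3);
    assert_eq!(a.inner + b.inner, 3);
}
```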

Aside: I don't understand why AddAssign exists at all. Can't AddAssign be derived automatically for every type that implements Add? In other words, why doesn't this blanket impl exist:

impl<T, U> AddAssign<U> for T where T : Add<U, Output=T>{
    fn add_assign(&mut self, rhs: U){ //rhs: U, not T, to match the Add<U> bound
        *self = self.add(rhs); //note: this move out of *self wouldn't actually compile as written, but it conveys the idea
    }
}

I think you answered this yourself - AddAssign is the variant that cares because it allows in-place updates. It exists because it may be expensive to move/copy temporaries (i.e. using just Add in the manner you described).
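A sketch of why the in-place version matters on a larger type (Samples is a made-up type): add_assign updates the existing buffer element by element, so no temporary value the size of Samples is built and moved.

```rust
use std::ops::AddAssign;

// A made-up "big" type: the interesting part is the heap-allocated Vec.
struct Samples {
    values: Vec<f64>,
}

impl<'a> AddAssign<&'a Samples> for Samples {
    fn add_assign(&mut self, rhs: &'a Samples) {
        // Element-wise add in place: self's existing allocation is reused,
        // unlike `x = x + y`, which would construct a whole new Samples.
        for (a, b) in self.values.iter_mut().zip(&rhs.values) {
            *a += *b;
        }
    }
}

fn main() {
    let mut x = Samples { values: vec![1.0, 2.0] };
    let y = Samples { values: vec![10.0, 20.0] };
    x += &y;
    assert_eq!(x.values, vec![11.0, 22.0]);
}
```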

Sorry I'm being a stickler here - I'm pretty new to systems programming and I've decided to start learning Rust a little more seriously since I'm convinced you guys have really got something here...so forgive my ignorance on this one: is this hypothetical implementation of AddAssign any slower than the manually written, mutable version? E.g. are these equivalent:

struct Test{
    inner: i32
}

impl Add<Test> for Test{
    type Output = Test;
    fn add(mut self, other: Test) -> Test{
        self.inner = self.inner + other.inner; //note the mutation, reuse of memory
        self        //return value optimization?
    }
}

impl<T, U> AddAssign<U> for T where T : Add<U, Output=T>{
    fn add_assign(&mut self, rhs: U){ //rhs: U, to match the Add<U> bound
        *self = self.add(rhs);
    }
}

vs the canonical version:

impl AddAssign<Test> for Test{
    fn add_assign(&mut self, rhs: Test){
        self.inner = self.inner + rhs.inner
    }
}

In this example, there likely won't be a difference because Test is just an i32. But, this can have material performance impact on large types. RVO is an optimization that may or may not kick in, so you can't 100% rely on it.

That's a good one...somehow I was thinking about the ability to use generic code, but not how the generic code itself would be written...stupid question: how do you write the restriction that &T has color? Like this?

trait HasColor {
  fn is_green(self) -> bool;
  fn is_blue(self) -> bool;
}
fn is_watery<T>(t: T) -> bool 
    where for<'a> &'a T : HasColor
{
    let ref_to_t = &t;
    ref_to_t.is_green() || ref_to_t.is_blue()
} 

Damn that's ugly :) I see what you mean here...although I kinda like the semantics. HasColor has additional functionality if your implementation doesn't need ownership of the memory in its methods...hmm

Yeah, HRTB would be the way to specify that. I suspect you also wanted to modify HasColor methods to take self in that example?

Note that your is_watery now takes ownership of t away from the caller. In generic methods, you generally want the caller to decide whether to transfer ownership or give a borrow, if your function doesn't care.
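Concretely, keeping the &self methods and borrowing in the generic function leaves that choice with the caller (Ocean is a made-up implementor for illustration):

```rust
trait HasColor {
    fn is_green(&self) -> bool;
    fn is_blue(&self) -> bool;
}

// Borrow instead of taking ownership: the caller keeps t either way.
fn is_watery<T: HasColor>(t: &T) -> bool {
    t.is_green() || t.is_blue()
}

struct Ocean;

impl HasColor for Ocean {
    fn is_green(&self) -> bool { false }
    fn is_blue(&self) -> bool { true }
}

fn main() {
    let o = Ocean;
    assert!(is_watery(&o));
    assert!(is_watery(&o)); // o was only borrowed, so we can ask again
}
```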

That is correct, good catch! Thanks guys

I realize we've omitted another difference between &self/&mut self and self, which is to support impls for unsized types. You cannot implement a trait function taking self for an unsized type.


Cool...so slices and traits?

trait A {
  fn test(&self) -> i32;
}
trait B {
  fn test_2(&self) -> i32;
}

impl B for dyn A {
    fn test_2(&self) -> i32 { //self is a &dyn A trait object. Can't take self by value here, since Self: ?Sized
        self.test()
    }
}

Yeah, pretty much.
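And the same constraint shows up for slices -- [i32] is an unsized type, so a trait impl on it only works because the method goes through a reference. A small sketch (Total is a made-up trait):

```rust
trait Total {
    fn total(&self) -> i32; // &self works even when Self: ?Sized
}

// [i32] is unsized; this impl is only possible because the
// method takes &self rather than self by value.
impl Total for [i32] {
    fn total(&self) -> i32 {
        self.iter().sum()
    }
}

fn main() {
    let v = vec![1, 2, 3];
    let slice: &[i32] = &v;
    assert_eq!(slice.total(), 6);
}
```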

why would you ever have a function signature that takes &self as an argument? Wouldn’t it be better to always write traits to take self and let the implementor decide what ‘level’ of ownership was required to use the trait for that struct?

I think constraining all your traits and their methods to only take self (instead of also allowing &self or &mut self) would unnecessarily restrict what you are doing. You often use traits to provide a common interface and in all but the most trivial of cases you won't want to be consuming things the first time they're used.

Likewise if people only implemented traits for references instead of directly on the type (impl<'a> SomeTrait for &'a Foo vs impl SomeTrait for Foo) then it'd be a real pain because of lifetimes and all that.

for a struct whose members are all Copy types, do we lose anything by implementing Into for a reference to this struct but not for the struct directly?

As for Copy types, I think you'll find that in practice they aren't overly common. So you don't exactly gain anything by implementing Into for a reference instead of the struct itself. Actually, it's probably going to have a net negative effect because it affects readability, forces you to unnecessarily borrow when using Foo::from(...) (something that gets picked up by clippy's lints by the way), and is probably unnecessary for "performance" reasons because the compiler will be immediately dereferencing and copying the thing across anyway.
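To illustrate that point: implementing From on the type itself gives you Into for free through the std blanket impl, and callers don't need the borrow (Test here mirrors the struct from earlier in the thread):

```rust
#[derive(Clone, Copy)]
struct Test {
    inner: i32,
}

// Implement From on the type directly; the std blanket impl
// `impl<T, U> Into<U> for T where U: From<T>` then provides Into.
impl From<Test> for i32 {
    fn from(t: Test) -> i32 {
        t.inner
    }
}

fn main() {
    let t = Test { inner: 5 };
    let n = i32::from(t);  // no `&` needed
    let m: i32 = t.into(); // Test is Copy, so t is still usable here
    assert_eq!(n + m, 10);
}
```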

Just as a datapoint to show how frequently you use Copy structs, I've currently got the mdbook repo open in a terminal and of the 21 types with the #[derive(...)] attribute (according to grep '#\[derive' -r src), only 2 also derive Copy.