This question has been asked before by another user but didn't receive the discussion it deserved.
I have a type that implements Copy:
#[derive(PartialEq, Copy, Clone)]
pub struct Scalar<'a> {
    data: &'a [u8],
}
And this type can be expensively converted to other types:
impl<'a> Scalar<'a> {
    pub fn to_bool(self) -> Result<bool, ScalarError> {
        to_bool(self.data)
    }
}
The method above takes self by value to match the naming convention that an expensive conversion between two Copy types should have a method name prefixed with to_. This is similar to methods on primitives like i32 and f32: all of their methods take ownership of self.
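A minimal sketch of what the by-value convention looks like from the caller's side. The `new` constructor, the unit-struct `ScalarError`, and the "yes"/"no" parsing are assumptions made only so the example runs; they stand in for the real conversion, which is not shown in this post.

```rust
#[derive(Debug, PartialEq)]
pub struct ScalarError; // hypothetical error type

#[derive(PartialEq, Copy, Clone)]
pub struct Scalar<'a> {
    data: &'a [u8],
}

impl<'a> Scalar<'a> {
    // Hypothetical constructor, added for the example only.
    pub fn new(data: &'a [u8]) -> Self {
        Scalar { data }
    }

    // Takes `self` by value; the parsing here is a stand-in for the
    // real (expensive) conversion.
    pub fn to_bool(self) -> Result<bool, ScalarError> {
        match self.data {
            b"yes" => Ok(true),
            b"no" => Ok(false),
            _ => Err(ScalarError),
        }
    }
}

fn main() {
    let s = Scalar::new(b"yes");
    let first = s.to_bool();
    // `s` is still usable here: the call above copied it instead of moving it.
    let second = s.to_bool();
    assert_eq!(first, Ok(true));
    assert_eq!(first, second);

    // The primitive analogue: `i32::to_be_bytes` also takes `self` by value.
    let n: i32 = 1;
    assert_eq!(n.to_be_bytes(), [0, 0, 0, 1]);
    assert_eq!(n, 1);
}
```

Because the type is Copy, a by-value receiver costs the caller nothing in ergonomics when using method-call syntax.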
Initially I thought this convention for Copy types was widely used, but looking at the implementors of Copy in stdlib, I don't see that self ownership is widespread. For instance, the following types are Copy but take self by reference in their methods:
SocketAddr
Duration
Instant
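One way to check receiver types without reading the docs is to coerce a method to a function pointer with an explicit signature; the binding only type-checks if the receiver matches. The sketch below picks one method per type (`Duration::as_secs`, `SocketAddr::port`, `Instant::elapsed`), which is a sample chosen for illustration rather than a survey, and reflects the signatures as I understand them in recent stable releases.

```rust
use std::net::SocketAddr;
use std::time::{Duration, Instant};

fn main() {
    // Copy types from std whose methods take `&self`.
    let as_secs: fn(&Duration) -> u64 = Duration::as_secs;
    let port: fn(&SocketAddr) -> u16 = SocketAddr::port;
    let elapsed: fn(&Instant) -> Duration = Instant::elapsed;

    // A primitive whose `to_` method takes `self` by value.
    let to_be_bytes: fn(i32) -> [u8; 4] = i32::to_be_bytes;

    let addr: SocketAddr = "127.0.0.1:8080".parse().unwrap();
    assert_eq!(as_secs(&Duration::from_secs(3)), 3);
    assert_eq!(port(&addr), 8080);
    assert_eq!(to_be_bytes(1), [0, 0, 0, 1]);

    // `elapsed` depends on the clock, so it is only called, not asserted on.
    let _ = elapsed(&Instant::now());
}
```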
Taking a copyable self by reference seems to be common in crates as well (like chrono), and this is the point where I started second-guessing myself. Maybe the best practice is for only primitives to always take self by value. Or maybe it's the other way around, and stdlib and the crates I sampled predate the best practice.
Perhaps the answer is that it doesn't matter from an API standpoint (as long as one ensures that the object will always impl Copy).
Interestingly, I found that it does matter from a performance standpoint. Having our Copy type always take ownership of self resulted in a real-world performance increase of 7-8%, which I had to triple-check because the result seemed implausible. It holds up (the code is open source, and I can link the repo if anyone is interested in investigating).
So my question is: should I go through all the methods on my Copy types and make them take self by value, so that they match the naming convention guidelines and are (anecdotally) more conducive to performance? I'm assuming such a change may be considered a breaking change (it's sometimes unclear with implicit copies).
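A sketch of where such a change can be observable to callers. Method-call syntax (`s.method()`) keeps working either way, but code that names the method as a path, for example to pass it to `map`, depends on the receiver type. The method names `len_by_ref` and `len_by_value` and the `new` constructor are hypothetical; they model the "before" and "after" versions of a single method side by side.

```rust
#[derive(PartialEq, Copy, Clone)]
pub struct Scalar<'a> {
    data: &'a [u8],
}

impl<'a> Scalar<'a> {
    // Hypothetical constructor, added for the example only.
    pub fn new(data: &'a [u8]) -> Self {
        Scalar { data }
    }

    // "Before": borrowed receiver.
    pub fn len_by_ref(&self) -> usize {
        self.data.len()
    }

    // "After": owned receiver.
    pub fn len_by_value(self) -> usize {
        self.data.len()
    }
}

pub fn total_by_ref(items: &[Scalar<'_>]) -> usize {
    // `iter()` yields `&Scalar`, which matches a `&self` method path directly.
    items.iter().map(Scalar::len_by_ref).sum()
}

pub fn total_by_value(items: &[Scalar<'_>]) -> usize {
    // `items.iter().map(Scalar::len_by_value)` would not compile, because the
    // iterator yields `&Scalar` while the method expects `Scalar`.
    // Callers written against the `&self` version would need `.copied()`.
    items.iter().copied().map(Scalar::len_by_value).sum()
}

fn main() {
    let items = [Scalar::new(b"ab"), Scalar::new(b"cde")];
    assert_eq!(total_by_ref(&items), 5);
    assert_eq!(total_by_value(&items), 5);

    // Method-call syntax is unaffected by the receiver change.
    let s = items[0];
    assert_eq!(s.len_by_ref(), s.len_by_value());

    // Fully qualified calls are affected: the argument must match the receiver.
    assert_eq!(Scalar::len_by_ref(&s), 2);
    assert_eq!(Scalar::len_by_value(s), 2);
}
```

How often downstream code uses the path form is a separate question, but it is the case where switching `&self` to `self` stops compiling.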
To muddy the waters further, what about a method that exposes the inner data, e.g.:
impl<'a> Scalar<'a> {
    pub fn as_bytes(self) -> &'a [u8] {
        self.data
    }
}
- The conversion cost is free so as_ seems good
- But as_ is for borrowed self, which would go against the other methods on the type
- Thankfully owned vs borrowed self for this function shows a negligible difference in benchmarks (but would it matter if there was a difference?)
- So either it'll be the odd method out to have &self or it will go against the naming convention.
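For what it's worth, the two receiver choices can return the same thing as long as the return type is spelled `&'a [u8]` rather than left to elision. A minimal sketch, where `bytes_by_value` is a hypothetical twin of `as_bytes` and `payload` is a hypothetical caller:

```rust
#[derive(PartialEq, Copy, Clone)]
pub struct Scalar<'a> {
    data: &'a [u8],
}

impl<'a> Scalar<'a> {
    // Borrowed receiver, matching the `as_` convention. The explicit `'a`
    // ties the result to the underlying data, not to the borrow of `self`.
    pub fn as_bytes(&self) -> &'a [u8] {
        self.data
    }

    // Hypothetical by-value twin with the same return type.
    pub fn bytes_by_value(self) -> &'a [u8] {
        self.data
    }
}

// The local `Scalar` goes away at the end of the function, yet the returned
// slice stays valid because its lifetime is `'a` in both versions.
pub fn payload<'a>(input: &'a [u8]) -> &'a [u8] {
    let scalar = Scalar { data: input };
    scalar.as_bytes()
}

fn main() {
    let input = b"abc";
    let scalar = Scalar { data: input };
    assert_eq!(scalar.as_bytes(), scalar.bytes_by_value());
    assert_eq!(payload(input), &b"abc"[..]);
}
```

With `&self` and an elided return type (`-> &[u8]`), the slice would instead be tied to the borrow of `self`, and `payload` would no longer compile, so the explicit lifetime matters more here than the receiver does.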
Why does this matter / why am I agonizing over naming? Clippy recently started warning on wrong_self_convention, and I want to ensure that the APIs I'm creating now stand the test of time and avoid breaking changes. If I go against the naming convention, I risk a future warning that I'd then have to decide whether to silence.