On a GH issue, @trentj mentioned:
[…] the common recommendation to use `Arc::clone(&x)` instead of simply `x.clone()`. I have mixed feelings about that convention
This post is not gonna revolve exclusively around this, nor is it about calling out trentj: on the contrary, I always appreciate when something "given" is questioned every now and then, to make sure we don't operate off sheer inertia.
And it turns out that I've been having "mixed feelings" about the `clone` situation myself (although maybe for different reasons):
- On the one hand, writing `.clone()` is handy (shorter to type than `Arc::clone(&`), accessible (in the prelude, `Arc` is not), and actually plays a bit better with some corner-case situations where type coercion or subtyping is involved (cf. the aforementioned issue);

- On the other hand, `.clone()` does not let one know whether the operation is trivial / cheap, such as a counter increment (even an atomic one), or whether one is actually duplicating a long string, a long vector, or some other big collection, at least not until looking at the actual types in play and what their `Clone` implementations are about. I believe it's a similar sentiment to the one that motivated the addition of the `.copied()` adaptor: one feels "better" when knowing they'll just be performing `Copy`s of the elements rather than `Clone`s (assuming the type sizes are not too big, of course!).

  I believe some people have asked for a `CheapClone` (marker?) trait. But I don't think that's really the meaningful point here. Indeed, besides the "performance" point of view, there is also the point of view of semantics (although the shared vs unique access distinction in Rust makes this aspect less error-prone, which, btw, is one of the excellent things about Rust!):

  - while a "classic" `.clone()` duplicates memory, so that mutations of the original are not visible on the newly obtained owned handle,

  - a "shared ownership"-clone, that is, a retain shared ownership operation, yields an owned handle which refers to the same entity as the clonee's, so that mutations through either owned handle can be observed by the other.

  I believe it is this aspect which motivates writing `{Ar,R}c::clone(&…)` rather than `….clone()` when possible.
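To make the semantic difference concrete, here is a minimal sketch using only the standard library. The function name `duplicate_vs_retain` is made up for illustration; the point is simply that clearing a duplicated `Vec` leaves the original untouched, whereas clearing through a retained `Arc<Mutex<…>>` handle is observable through the original handle.

```rust
use std::sync::{Arc, Mutex};

/// Returns (length of the original after clearing a duplicated clone,
///          length of the original after clearing through a retained handle).
fn duplicate_vs_retain() -> (usize, usize) {
    // "Classic" clone: the heap buffer is duplicated.
    let original: Vec<i32> = vec![1, 2, 3];
    let mut duplicate = original.clone();
    duplicate.clear(); // only affects the copy

    // "Retain" clone: only the reference count is incremented.
    let shared = Arc::new(Mutex::new(vec![1, 2, 3]));
    let retained = Arc::clone(&shared);
    retained.lock().unwrap().clear(); // visible through `shared` too

    let shared_len = shared.lock().unwrap().len();
    (original.len(), shared_len)
}

fn main() {
    let (after_duplicate, after_retain) = duplicate_vs_retain();
    println!("original after clearing the duplicate: {after_duplicate}");
    println!("original after clearing the retained handle: {after_retain}");
}
```

Writing `Arc::clone(&shared)` at the call site is what signals the second behavior to the reader; `shared.clone()` would compile to exactly the same operation.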
So, at this point, two things transpire:
- method syntax can be a bit more convenient than associated-function syntax;

- there is a semantic difference between duplicating an entity and retaining shared ownership of it.
And in between those two points lingers a rather thorny situation. Indeed, consider:
```rust
#[derive(Clone)]
struct NewType(Arc<…>);
```

or any similar thing.

This pattern is idiomatic for a number of reasons, and yet it plays quite poorly w.r.t. the "semantics of `.clone()`" situation:
- Since this is a new type, I don't think anybody will be expecting to see `NewType::clone(&…)` syntax, that would be weird.

  So `my_new_type_instance.clone()` it is.

  And yet… we are not duplicating data here, we are retaining ownership of it!

  This means that if we see, in the middle of code, something like:

  ```rust
  let my_instance2 = my_instance.clone();
  …
  my_instance2.clear();
  …
  stuff(my_instance);
  ```

  Without knowing the actual semantics of `my_instance`'s type's `Clone` implementation, we don't know if `my_instance` will be in a cleared state or not: if it is some kind of wrapper around shared retained ownership, such as `NewType`, then it will be cleared (through some interior mutability, obviously), and if it is, on the contrary, something like a `DashMap`, then it won't be.

  Some people may blame the interior mutability here (especially when named like that: such mutability, imho, ought to be rather named shared mutability) since, indeed, Rust is about exploiting the beauty of aliasing NAND mutation (not xor, btw: you can have neither), but the reality is that, for instance, `async` codebases that `.spawn()` stuff lead to a lot of ownership (either duplicated or retained), and this, in turn, leads to shared mutability being necessary: let's not over-demonize shared mutability, please.

  So `my_thing.clone();` is an ambiguous thing to write, when `my_thing` may be featuring retained ownership.

- There is also an API issue: is the retained clone part of the public API? Technically, it isn't, and besides documentation, there is nothing that can express this property. This means that if a downstream user of `NewType` needs, on their own end, to have retained ownership semantics, they don't have a choice but to re-wrap `NewType` within their own `Arc`, to guarantee those semantics. That seems a bit silly, at least for the situation where the author of `NewType` was intending to keep featuring the retained ownership implementation.
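The ambiguity can be sketched with two hypothetical types, `SharedList` and `OwnedList` (both invented for this illustration, as are their `clear` / `len` methods). Both derive `Clone` and are cloned and cleared in the same way, yet the original ends up in a different state depending on what the derive actually does under the hood.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical newtype: cloning it retains shared ownership of the inner list.
#[derive(Clone)]
struct SharedList(Arc<Mutex<Vec<u32>>>);

/// Hypothetical type with a similar surface API, but an owned buffer.
#[derive(Clone)]
struct OwnedList(Vec<u32>);

impl SharedList {
    fn new(items: Vec<u32>) -> Self { Self(Arc::new(Mutex::new(items))) }
    fn clear(&self) { self.0.lock().unwrap().clear() }
    fn len(&self) -> usize { self.0.lock().unwrap().len() }
}

impl OwnedList {
    fn clear(&mut self) { self.0.clear() }
    fn len(&self) -> usize { self.0.len() }
}

/// `.clone()` retained ownership: clearing the clone clears the original too.
fn shared_len_after_clearing_clone() -> usize {
    let my_instance = SharedList::new(vec![1, 2, 3]);
    let my_instance2 = my_instance.clone();
    my_instance2.clear();
    my_instance.len()
}

/// `.clone()` duplicated the data: the original is left untouched.
fn owned_len_after_clearing_clone() -> usize {
    let my_instance = OwnedList(vec![1, 2, 3]);
    let mut my_instance2 = my_instance.clone();
    my_instance2.clear();
    my_instance.len()
}

fn main() {
    println!("shared: {}", shared_len_after_clearing_clone());
    println!("owned:  {}", owned_len_after_clearing_clone());
}
```

Nothing at either call site, besides the `mut` binding, hints at which of the two behaviors the reader should expect.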
I hope that by this point, I've managed to "spread around" this feeling of uneasiness around `Clone`'s over-terse semantics / contract / API.
Note that I am not criticizing `Clone` per se: such a trait, and the usage made of it in generic contexts such as the `.cloned()` iterator adaptor, `iter::repeat()`, or the `.resize()` method on `Vec`, is indeed necessary. But the key word here is generics, and, from there, the general abstract contract that `Clone` is about.
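As a small illustration of that generic contract, `Vec::resize` only asks for `T: Clone` and is oblivious to what cloning means for `T`. In the sketch below (function names are made up for the example), resizing a vector of `Arc<String>` yields handles to one shared allocation, while resizing a vector of `String` yields independent buffers.

```rust
use std::sync::Arc;

/// With `Arc<String>` elements, `resize` retains: every slot points to the
/// same allocation, so the strong count grows with the vector.
fn strong_count_after_resize() -> usize {
    let value = Arc::new(String::from("shared"));
    let mut handles: Vec<Arc<String>> = Vec::new();
    handles.resize(3, Arc::clone(&value));
    // One handle held by `value`, plus the three stored in `handles`.
    Arc::strong_count(&value)
}

/// With `String` elements, `resize` duplicates: each slot owns its own buffer.
fn resized_strings_are_distinct() -> bool {
    let mut copies: Vec<String> = Vec::new();
    copies.resize(3, String::from("owned"));
    copies[0].as_ptr() != copies[1].as_ptr()
}

fn main() {
    println!("strong count: {}", strong_count_after_resize());
    println!("distinct buffers: {}", resized_strings_are_distinct());
}
```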
And this is the "problematic" part: either `Clone` is not the best name for the functionality at play here, or `Arc` / `Rc` implementing `Clone` are kind of abusing the contract. Whichever it may be, since `Arc` / `Rc` (and newtypes wrapping them) implement `Clone`, this leads to an empirical definition of the actual `Clone` contract:
- A type is `Clone` if it can feature a `&Self -> Self` operation.

  - I'll add that there may be kind of expected subcontracts, especially for `Eq` types, such as `thing.clone() == thing`, but even that is quite tacit (although conveyed by the `Clone` name, and compatible with retained ownership, so let's not dismiss it).

And that's it.

`Clone` is nowadays rather an `OwnedFromRef`, or, in more Rusty parlance, `ToOwned<Owned = Self>`. I like this `ToOwned<Owned = Self>` vision way more than the `Clone` name (although I'm not at all advocating to actually change the names, obviously! I just want the conceptual view to broaden a bit): we no longer name / assume "how" the ref-to-owned operation is achieved.
So, let's assume that conceptual `OwnedFromRef` API (the generic APIs would conceptually be built around that trait): `Clone` would then be a "marker" (sub)trait that would "taint" the `owned_from_ref` operation with "duplication" semantics.
And while this is conceptual, we'd now be able to feature a sibling of `Clone`: another subtrait which would, instead, "taint" the `owned_from_ref` operation with "retain" semantics.
Conceptual view / what the Rust stdlib could have been
```rust
//! Conceptual view: nevermind the names

/// Non-derivable (mainly to avoid overlapping impls issues with generic derives).
trait OwnedFromRef : Sized {
    fn owned_from_ref(&self) -> Self;
}

trait Clone : OwnedFromRef {
    fn clone(&self) -> Self;
}
impl<T : Clone> OwnedFromRef for T {
    fn owned_from_ref(&self) -> Self {
        self.clone()
    }
}

/// Derivable.
trait Retain : OwnedFromRef {
    fn retain(&self) -> Self;
}
impl<T : Retain> OwnedFromRef for T {
    fn owned_from_ref(&self) -> Self {
        self.retain()
    }
}

impl<T> Vec<T> {
    fn resize (self: &'_ mut Vec<T>, new_len: usize, value: T)
    where
        T : OwnedFromRef,
    {
        // …
    }
}
```
Again, I'm not advocating for this design, nor claiming that the stdlib should have used it if we had a time machine: I just want people to keep an open mind w.r.t. engineering APIs and the semantics that go with them.
trait Retain : Clone
Now, back to the real world, that `Retain` trait is the one which I think deserves to be more than a conceptual / theoretical view of the mind: I believe that a `Retain` trait could be a nice addition to improve the quality / readability of our APIs, according to the code > documentation principle:
```rust
/// "Marker" trait to express shared ownership semantics
/// (_e.g._, through reference-counting).
trait Retain : Clone { // also added to the prelude.
    fn retain(&self) -> Self;
}

/// Prevent the `.retain()` implementation from being overridden,
/// giving it "marker" trait semantics.
partial // <- current keyword in nightly is `default`, but it's misleading.
impl<T : Retain> Retain for T {
    #[inline]
    fn retain(&self) -> Self {
        self.clone()
    }
}

impl<T> Retain for Arc<T> {}

#[derive(
    Clone,
    Retain, // publicly express that this inner `Arc` is part of the public contract
)]
struct NewType(Arc<…>);
```
- Indeed, if such a trait were to be added to the prelude, we'd have all the advantages of `.clone()` syntax, and none of its ambiguity.

- It also scales well for other newtypes around `Retain`ed types, such that they are, themselves, `Retain`ed as well, but only if the author opts into that, through that `Retain` derive (equivalent to `impl Retain for NewType {}` + assertions that every field is `Retain`).
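For what it's worth, something close to this can be approximated on stable Rust, without the `partial impl` pseudo-syntax. The following is only a sketch under my own assumptions: it splits the idea into a marker trait `Retain` and a hypothetical extension trait `RetainExt` whose blanket impl is the only possible one (any other impl would overlap with it), so `.retain()` cannot be overridden. `Handle` is a made-up newtype, and the manual `impl Retain for Handle {}` stands in for the derive.

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical marker trait: "cloning this type retains shared ownership".
trait Retain: Clone {}

/// Hypothetical extension trait carrying the method. Since every implementor
/// must be `Retain`, the blanket impl below covers all of them, and no
/// downstream impl can override `.retain()`.
trait RetainExt: Retain {
    fn retain(&self) -> Self;
}

impl<T: Retain> RetainExt for T {
    #[inline]
    fn retain(&self) -> Self {
        self.clone()
    }
}

impl<T: ?Sized> Retain for Arc<T> {}
impl<T: ?Sized> Retain for Rc<T> {}

/// Made-up newtype opting into the public "shared ownership" contract.
#[derive(Clone)]
struct Handle(Arc<str>);
impl Retain for Handle {} // what a `#[derive(Retain)]` would expand to

/// Returns whether both handles share one allocation, and the strong count.
fn retain_demo() -> (bool, usize) {
    let handle = Handle(Arc::from("config"));
    let other = handle.retain();
    (Arc::ptr_eq(&handle.0, &other.0), Arc::strong_count(&handle.0))
}

fn main() {
    let (same_allocation, strong_count) = retain_demo();
    println!("same allocation: {same_allocation}, strong count: {strong_count}");
}
```

A downstream user can then bound their generics on `T: Retain` to require shared-ownership semantics, instead of re-wrapping the type in their own `Arc`.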
I personally find the idea quite aesthetically pleasant, and will soon be publishing a PoC crate of the trait and the derive. I just wanted to gather some thoughts from the community, to see if they shared those feelings or not, so as to see if this would be (pre?)-RFC-worthy, and everything else: even though it's easy for a third-party crate to feature this abstraction, I find that not having it in the prelude, and not even in the standard library, defeats like 90% of the reasons to use this crate to begin with, which means that, as a third-party crate, it is quite doomed to fail.
Be that as it may, the actual discussions w.r.t. a stdlib addition ought to be deferred to that potential follow-up (pre?)-RFC. **The topic here is rather to gather feelings about `.clone()` vs. `Arc::clone()`, feelings about this `Retain` marker trait idea (I could imagine name bikeshedding, such as a `RefCounted` trait with an `.increment_ref_count()`), maybe even feelings about the tangentially related `CheapClone` trait (I personally find that such a trait would come with many more corner cases than the very simple `Retain` abstraction).**
So, what are your thoughts about all this?
Updates:
- The name `.retain()` conflicts too much with what is used in the standard library, so methods named like `.retained()` or `.reown()` could be more suitable:

  - ```rust
    trait RetainOwnership { fn retained(&self) -> Self; }
    impl<T> RetainOwnership for Arc<T> {}
    ```

  - ```rust
    trait SharedOwnership { fn reown(&self) -> Self; }
    impl<T> SharedOwnership for Arc<T> {}
    ```