Is it ok to use clone when creating a struct from another one?

So I have a struct with a lot of fields, and I would like to create another struct with a subset of some of the fields from the former.

Basically

struct ManyFields {
  ...
  copy_me: bigdecimal::BigDecimal
  ...
}

and I have the subset struct as this

struct SomeFields {
  copy_me: bigdecimal::BigDecimal
}

trying to create an implementation for ManyFields to create SomeFields like this:

impl ManyFields {
    pub fn to_some_fields(&self) -> SomeFields {
        SomeFields {
            copy_me: self.copy_me
        }
    }
}

fails with the error

move occurs because `self.copy_me` has type `BigDecimal`, which does not implement the `Copy` trait

So I fix this by cloning. Hence this worked

impl ManyFields {
    pub fn to_some_fields(&self) -> SomeFields {
        SomeFields {
            copy_me: self.copy_me.clone()
        }
    }
}

My question is, is this the most idiomatic way to go about it?

I tried making the field copy_me: &bigdecimal::BigDecimal, but this throws errors relating to lifetimes, and I have a feeling that going with the lifetimes approach will complicate things. I also think it is not a good thing to have SomeFields tied to ManyFields via a borrow like that.

So the question is: is the clone approach ok? (I always have doubts about clone and feel it might not be the most resource-efficient approach. Is this also unfounded?) If not, what is the best way to go about it?


Ok, so in my use case, which is about converting a ManyFields struct to SomeFields to be returned as part of an API call, I realise I do not need to keep ManyFields around, so I can actually move it. So instead of pub fn to_some_fields(&self) -> SomeFields { I went with pub fn to_some_fields(self) -> SomeFields {. That is, not borrowing self, but moving self. This works just fine for me and makes sense too.
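For reference, a minimal sketch of that by-value conversion. Assumptions: String stands in for bigdecimal::BigDecimal so that the snippet needs only the standard library, the field `other` is a hypothetical placeholder for the remaining fields, and the method is renamed to into_some_fields because consuming conversions are conventionally named into_* in Rust. The From impl is an optional extra, not something from the original post.

```rust
// `String` stands in for `bigdecimal::BigDecimal` (assumption): both are
// `Clone` but not `Copy`, so the ownership rules behave the same way.
type Decimal = String;

struct ManyFields {
    copy_me: Decimal,
    other: u32, // hypothetical placeholder for the many other fields
}

struct SomeFields {
    copy_me: Decimal,
}

impl ManyFields {
    /// Takes `self` by value, so the field is moved out instead of cloned.
    pub fn into_some_fields(self) -> SomeFields {
        SomeFields { copy_me: self.copy_me }
    }
}

/// The same conversion expressed through the standard `From` trait,
/// which also gives callers `.into()` for free.
impl From<ManyFields> for SomeFields {
    fn from(many: ManyFields) -> Self {
        many.into_some_fields()
    }
}
```

After calling into_some_fields (or .into()), the original ManyFields value can no longer be used, which is exactly the trade-off described above.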

But still curious about the general thoughts about clone


Yes, using clone in Rust is idiomatic. In case it actually is too expensive for you to always do so, there are of course ways to avoid it nonetheless. I will address two ways to do so without changing the &self argument of the method. Of course, in your use-case, where moving self is possible, that is probably simply the best approach.


E.g. in case you do keep the original struct around, a struct containing references to the fields is a possibility in principle; but one has to be cautious when defining structs with lifetime arguments not to paint oneself into the wrong corner, so only do so if it really works for the use-case.
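A rough sketch of what such a borrowing struct could look like. Assumptions: String again stands in for bigdecimal::BigDecimal, and the names SomeFieldsRef and as_some_fields are made up for illustration.

```rust
// `String` stands in for `bigdecimal::BigDecimal` (assumption).
type Decimal = String;

struct ManyFields {
    copy_me: Decimal,
}

/// A borrowed view of the subset. The lifetime `'a` ties it to the
/// `ManyFields` it was created from, so it cannot outlive that value.
struct SomeFieldsRef<'a> {
    copy_me: &'a Decimal,
}

impl ManyFields {
    /// No clone and no move: the view only holds a reference.
    pub fn as_some_fields(&self) -> SomeFieldsRef<'_> {
        SomeFieldsRef { copy_me: &self.copy_me }
    }
}
```

The catch is the one mentioned above: a SomeFieldsRef cannot be returned from a function that owns the ManyFields locally, or stored somewhere that outlives it, so this only fits short-lived views.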


If cloning of some particular field is just too expensive, using an Arc is also an option. The original struct would get a copy_me: Arc<bigdecimal::BigDecimal> field, and then cloning of that Arc becomes cheap. If you still need mutable access in some cases anyways, the correct way to do that would then usually not be to introduce RefCell or similar, since that would share changes between the original ManyFields and the created SomeFields struct, so the behavior would be different from the cloning approach; instead you could employ a “clone on write” strategy, using the method Arc::make_mut. This method then executes a .clone operation on the contained data anyways, but only if it’s necessary, i.e. only

  • if the Arc is still shared, and
  • if you need a mutation after all (otherwise you wouldn’t have called Arc::make_mut yet in the first place)

So it’s a way to delay the need for .clone()ing until later when (and only if) it’s absolutely necessary. If the original ManyFields were to be dropped in such a use-case, the Arc would become unique again and cloning wouldn’t be necessary even with mutation.

(Of course, as always, if your use-cases are only single-threaded, Rc would be a feasible and slightly more efficient alternative to Arc.)
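To make the clone-on-write idea concrete, here is a small sketch. Assumptions: String stands in for bigdecimal::BigDecimal, and the accessor name copy_me_mut is hypothetical. Arc::clone and Arc::make_mut are the standard library functions discussed above.

```rust
use std::sync::Arc;

// `String` stands in for `bigdecimal::BigDecimal` (assumption).
type Decimal = String;

struct ManyFields {
    copy_me: Arc<Decimal>,
}

struct SomeFields {
    copy_me: Arc<Decimal>,
}

impl ManyFields {
    pub fn to_some_fields(&self) -> SomeFields {
        // Only increments the reference count; the value itself is shared.
        SomeFields { copy_me: Arc::clone(&self.copy_me) }
    }
}

impl SomeFields {
    /// Mutable access with clone-on-write semantics: the inner value is
    /// cloned only if another `Arc` still points at the same allocation.
    pub fn copy_me_mut(&mut self) -> &mut Decimal {
        Arc::make_mut(&mut self.copy_me)
    }
}
```

With this, a mutation through copy_me_mut on the SomeFields side leaves the value seen by ManyFields unchanged, the same observable behavior as the plain .clone() version, while the read-only path never copies the data.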


Are there heuristics for determining the cost of doing a clone? My guess would be the sum total of the types involved in the data structure being cloned. Is there anything else to take into consideration when thinking about how expensive a clone will be?

In my readings I have come across the Cow type, and from my cursory understanding of what it does, it seems similar to what you described. Will the Cow type also be an alternative?


Scratch that, `Cow` is probably just _copy_ on write... and what you described is _clone_ on write. That should be the difference.

The most general rule is: it’s too expensive when optimizing it gives a relevant performance benefit. This of course strongly depends on your use-case, and also you can only really objectively determine whether or not something was “too expensive” by that measure after you’ve done the optimization, and thus had the possibility to improve. The art then naturally is to learn to determine ahead of time whether a more optimized approach will give relevant performance benefits.

(I’m using “relevant” instead of “significant” because, especially in the context of statistically measuring something, a “significant” performance improvement would be any performance improvement that you are able to measure to be for sure a performance benefit at all, no matter how small the benefit actually is.)

Some indicators might be: would the cloning be done often, or only once or a handful of times throughout the life of the program? Or, for an indefinitely running program, does it happen only a handful of times a second/minute/… or really frequently? And even for really frequent operations, the smaller the time saved by avoiding the clone is, the more often per second it would need to happen for an optimization to be relevant.

Another indicator: can the data being cloned become really large? Are you only using BigDecimal because your numbers become a bit larger than machine values, perhaps 2 times or 10 times as large (in terms of number of bytes)? Then cloning is probably negligible. If they become 100s or 1000s of times larger (in terms of number of bytes!) then one should start reconsidering the cost of the clone.

If you’re writing library code, it can be hard to anticipate how users will use the code, so it can make sense to optimize more aggressively. Of course, while developing, one should still avoid premature optimization, especially in nontrivial situations, and focus on getting stuff to work at all first. Especially since sometimes, sufficient optimizations can be drop-in solutions. E.g. the “turn it into Arc<BigDecimal>” approach I discussed above would not require any logical changes to your code at all if added later. You would just need to add the necessary make_mut calls everywhere mutation happens, but that’s a straightforward refactoring: apply the change everywhere the compiler tells you that you can no longer (implicitly) get mutable access to the BigDecimal through the Arc.


No, the Cow type has a bit of a different use-case. It is meant for cases where you do want something like an ordinary reference with a lifetime but occasionally need to modify the contents anyway, whereas Arc solves the problem of getting rid of the lifetimes. Also, Arc’s clone-on-write optimizations are a bit more powerful; with a Cow, you statically never know whether the reference you got might be unique after all, so even in cases where you’re fine with using something with a lifetime, there are situations where Arc can avoid some copying that Cow cannot.

Typical use-cases for Cow are the ones in the standard library, e.g. as the return type of String::from_utf8_lossy, the method turning &[u8] into a string lossily; the common case receives valid UTF-8 and can return a &str without any new allocations, while the invalid UTF-8 case needs to allocate a new String since it has to replace invalid data with replacement characters. The Cow return type then allows the caller to decide whether they want to create a short-lived borrow of the result, or unconditionally turn it into an owned String anyway, or possibly keep the Cow for a while, in each case still achieving almost optimal performance.
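A quick sketch of that standard library case. String::from_utf8_lossy and Cow are real std items; the helper names decode and is_borrowed are made up for the example.

```rust
use std::borrow::Cow;

/// Thin wrapper around the standard lossy conversion.
fn decode(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

/// True when no allocation was needed, i.e. the `Cow` just borrows the input.
fn is_borrowed(text: &Cow<'_, str>) -> bool {
    matches!(text, Cow::Borrowed(_))
}
```

Valid UTF-8 such as b"hello" comes back as Cow::Borrowed, while input containing an invalid byte comes back as Cow::Owned with U+FFFD in its place. Either way the caller can read it through Deref as a &str, or call .into_owned() to get a String.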


No, the terminologies mean the same. I used the more Rusty term “clone” because that’s what it is doing in Rust, whereas “copy” in Rust describes operations commonly so cheap that there’s rarely a reason to avoid or delay them at all. The analogue of “Clone” in C++ for example is called a “copy constructor”, and “cloning” is simply not a word used at all AFAIK.

