When profiling my application, I found that a lot of time is spent on clones. I added these as a means to get the code to compile; and now that it works I'd like to audit these clones to see if some can be removed.
Are there any strategies you've discovered for finding and removing unnecessary copies? Or is it going to be a fairly manual affair?
I did find https://github.com/Manishearth/rust-clippy/issues/17 but it doesn't look like it's merged. I see now as I write this that there are some other relevant clippy lints actually so I'll check if any of those highlight problems.
Thanks in advance,
For finding clones, how about a simple git grep ".clone()"? This will work on all types which are not explicitly marked as Copy, a trait which one should use sparingly anyway.
Once you've found them, the solution is bound to be situation-specific, because it depends on the semantics of your code, but here are some ideas of things to look at, from first to last.
- Borrowing. Sometimes, a reference is all you need.
- Moving. Cheaper than clone for some types, and will also clarify the semantics of your code, which is useful for the next steps.
- Boxing. Makes moves cheap for all types, at the cost of some dynamic memory allocation and pointer indirection.
- Refcounting (Rc, Arc...). Useful when you want an immutable borrow, but your codebase's design does not allow you to do it at compile time. Two typical examples are graph data structures and data shared between multiple threads. Beware cyclic references.
- Copy-on-write (std::borrow::Cow). Can help in situations where simple immutable borrows do not apply because you want to mutate the data from time to time. Standard library implementation is only useful for single-threaded code.
- Read-copy-update, aka RCU. Basically copy-on-write for multiple threads, implemented using clever lock-free black magic. Not available in the standard library, but I think implementations exist on crates.io.
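To make the first few options concrete, here is a minimal sketch of borrowing, moving, refcounting and copy-on-write. The `Record` type and the function names are made up for illustration and are not from the original poster's code; treat it as one possible shape, not a recipe.

```rust
use std::borrow::Cow;
use std::rc::Rc;

// Hypothetical record type, used only for illustration.
#[derive(Clone, Debug, PartialEq)]
struct Record {
    name: String,
    values: Vec<u32>,
}

// Borrowing: reading only needs `&Record`, so the caller never has to clone.
fn total(record: &Record) -> u32 {
    record.values.iter().sum()
}

// Moving: takes ownership and hands the same allocations back; no deep copy
// of `values` is made.
fn rename(mut record: Record, name: &str) -> Record {
    record.name = name.to_string();
    record
}

// Refcounting: `Rc::clone` bumps a counter instead of copying the data.
fn share(record: Record) -> (Rc<Record>, Rc<Record>) {
    let first = Rc::new(record);
    let second = Rc::clone(&first);
    (first, second)
}

// Copy-on-write: allocate a new string only when the input needs changing.
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}
```

A call site that used to read `total(record.clone())` can usually become `total(&record)` once the function signature takes a reference, which is often the cheapest fix of the lot.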
You're looking at the wrong strategy. It should be centred around speeding up your code. First decide whether the code is fast enough and, if not, run a profiling tool to see where it is spending its time. The profiler should pinpoint which calls to clone are actually expensive.
I think the useful strategy here is to try to fix the problem globally, and not on a case-by-case basis.
In my experience, there is always a design with very few Arcs, and a lot of references.
I usually try to think hard about what the actual data my program operates on is. Then I define the main structs that own this data and live throughout the program. After that I can define a lot of smaller structs which hold references pointing into the main structure and provide a convenient view of the data for particular parts of the program.
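As a rough sketch of that layout, assuming a made-up `Library` that owns all the data and a `ByAuthor` view that only borrows from it (both names are hypothetical and chosen purely for illustration):

```rust
// Hypothetical domain types, used only for illustration.
// `Library` is the long-lived owner of all the data.
struct Library {
    books: Vec<Book>,
}

struct Book {
    title: String,
    author: String,
}

// A lightweight view: it holds references into `Library` and clones nothing.
struct ByAuthor<'a> {
    author: &'a str,
    books: Vec<&'a Book>,
}

impl Library {
    // Build a view over the books by one author; the lifetime `'a` ties the
    // view to the library it borrows from.
    fn by_author<'a>(&'a self, author: &'a str) -> ByAuthor<'a> {
        ByAuthor {
            author,
            books: self.books.iter().filter(|b| b.author == author).collect(),
        }
    }
}

impl<'a> ByAuthor<'a> {
    // Titles are returned as borrowed `&str`, so no `String` is copied.
    fn titles(&self) -> Vec<&'a str> {
        self.books.iter().map(|b| b.title.as_str()).collect()
    }
}
```

The compiler then enforces that no view outlives the `Library` it points into, which is what lets the design get away without clones or reference counting.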
Thanks all for the good recommendations.
I've tried a few options for profiling and timing (perf, flame). That helped to identify the biggest problem areas. Now I can see what can be done to speed it up...