Pattern for "RefOrMove" arguments?

Hi,

Say I have a function that takes T, and usually only needs &T, but on occasion it needs an owned T.

This signature means internally I'll need to clone() / to_owned() T in the situations where I need an owned copy.

fn my_func(t: &T)

But this signature means the caller will often need to clone() T before calling the function if the caller happens to only have &T

fn my_func(t: T)

This signature lets the caller pass whichever they have: &T or T, but internally it's still necessary to call t.borrow().to_owned() to get an owned version, even if the owned object was passed.

fn my_func<R: Borrow<T>>(t: R)

I understand it's largely a moot point because the compiler can eliminate the superfluous clone (I proved that to myself again here: Compiler Explorer)

But is there a nicer pattern that doesn't lean so heavily on the compiler to do the efficient thing, and makes it very clear in the code that there isn't an unnecessary clone happening?

Thank you.

It sounds like you're describing Cow, which lets you pass either a reference or an owned value.

Same idea, but not exactly. Cow is a runtime type. The caller would need to create a Cow<>, pass it, and then the implementation would pay a branch to determine if the Cow enum's contents need to be cloned.
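To make that runtime cost concrete, here is a minimal sketch of what the Cow version might look like. The function keep_if_long and its length rule are made up for illustration; only Cow and Cow::into_owned come from std.

```rust
use std::borrow::Cow;

// Hypothetical function: stores `name` only when it is longer than 3 bytes.
fn keep_if_long(store: &mut Vec<String>, name: Cow<'_, str>) -> bool {
    if name.len() > 3 {
        // `into_owned` matches on the enum at run time:
        // `Owned` is unwrapped, `Borrowed` is cloned.
        store.push(name.into_owned());
        true
    } else {
        false
    }
}

fn main() {
    let mut store = Vec::new();
    // The caller has to wrap the argument explicitly (or use `.into()`).
    keep_if_long(&mut store, Cow::Borrowed("hello"));
    keep_if_long(&mut store, Cow::Owned(String::from("world")));
    keep_if_long(&mut store, Cow::from("hi"));
    println!("{store:?}");
}
```

The match inside into_owned is the branch in question, and the explicit Cow::Borrowed / Cow::Owned at the call site is the wrapping the caller has to do.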

I'm wondering if there is a canonically used trait or set of trait bounds that would lead to optimal versions of the function for both T and &T to be created monomorphically, and the correct version being called (statically) depending on whether the caller had a T or &T.

The other pattern is

fn foo<T: DoStuff, U: AsRef<T> + Into<T>>(thing: U)

But that requires types to manually implement

impl AsRef<MyType> for MyType { /* ... */ }
impl From<&MyType> for MyType { /* ... */ }

Borrow can avoid the need for the first one, but ToOwned and Clone don't avoid the need for the second one.
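A minimal sketch of that pattern, using a concrete MyType instead of the T: DoStuff bound to keep it short. The keep flag and the name field are assumptions standing in for whatever decides at run time that ownership is needed.

```rust
#[derive(Clone, Debug, PartialEq)]
struct MyType {
    name: String,
}

// The two manual impls the pattern relies on.
impl AsRef<MyType> for MyType {
    fn as_ref(&self) -> &MyType {
        self
    }
}

impl From<&MyType> for MyType {
    fn from(value: &MyType) -> MyType {
        value.clone()
    }
}

// `keep` is a hypothetical stand-in for the runtime condition.
fn foo<U: AsRef<MyType> + Into<MyType>>(thing: U, keep: bool) -> Option<MyType> {
    if keep {
        // For U = MyType this is the identity conversion (a move);
        // for U = &MyType it calls the From impl above, which clones.
        Some(thing.into())
    } else {
        println!("only looked at {}", thing.as_ref().name);
        None
    }
}

fn main() {
    let a = MyType { name: String::from("a") };
    let b = MyType { name: String::from("b") };
    assert_eq!(foo(&a, true), Some(a.clone()));
    assert_eq!(foo(b, true).map(|m| m.name), Some(String::from("b")));
    assert_eq!(foo(&a, false), None);
}
```

Each call is monomorphized separately, so the owned call never reaches the cloning From impl.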


Cow is also less ergonomic because callers have to construct the Cow (unless they again write more implementations so they can use .into()). I played around for awhile to see if there was a way around that, and arrived here.

I had to add the DoStuff bounds to the MyCow implementations, or inference fell apart for bar(&nc). If there's an impl<T: DoStuff> DoStuff for &T, you could probably make it work with a single type parameter instead of two.

It's not at all idiomatic.

For my needs, I made a compile-time Cow:

deref_owned::GenericCow

There has been some debate about the right way to describe this pattern and whether Deref, Borrow, and/or AsRef are best to use. I'm not sure I made the best choice, but I gave a bit of reasoning in the docs (requiring Deref is mostly meant to improve ergonomics).

Also, GenericCow isn't part of std, so it's not really a common/idiomatic pattern. I still feel like something like GenericCow (in whichever form) is missing in Rust.

From the docs:

fn into_owned(self) -> <Self::Borrowed as ToOwned>::Owned

Convert into owned type

Opposed to ToOwned::to_owned, this method consumes the receiver, which allows avoiding unnecessary clones in some cases. I.e. using this method on an Owned value (or on a Cow::Owned enum variant) will simply unwrap it. In case of using the Owned struct of this crate (instead of Cow), this operation is zero-cost.
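The difference can be observed with std's Cow alone. This is only a sketch: the helper name compare_buffers is made up, and it compares heap pointers to show which call allocates a new String.

```rust
use std::borrow::Cow;

// Hypothetical helper: returns (to_owned made a new buffer, into_owned reused the old one).
fn compare_buffers(original: String) -> (bool, bool) {
    let buffer = original.as_ptr();
    let cow: Cow<'_, str> = Cow::Owned(original);

    // ToOwned::to_owned only sees a `&str`, so it has to allocate a new String.
    let copy: String = str::to_owned(&cow);
    let copy_is_new = copy.as_ptr() != buffer;

    // Cow::into_owned consumes the Cow and hands back the String it held.
    let unwrapped: String = cow.into_owned();
    let reused = unwrapped.as_ptr() == buffer;

    (copy_is_new, reused)
}

fn main() {
    let (copy_is_new, reused) = compare_buffers(String::from("hello"));
    println!("to_owned allocated: {copy_is_new}, into_owned reused buffer: {reused}");
}
```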

Generally, I'd say if you need to call clone() or to_owned() on the argument eventually it is better to require ownership in the first place. This makes it more obvious for the caller where allocations happen.

Maybe you can specify your use case a bit more?

Yes! That line of discussion was the reason I started this thread. I was talking with a colleague about this and I thought... maybe there is an elegant use of the existing building blocks to express this pattern, and that's the reason Rust doesn't have an explicit type.

But it appears that's not the case.

Generally, I'd say if you need to call clone() or to_owned() on the argument eventually it is better to require ownership in the first place

Trouble with that approach is that 90%+ of calls won't need ownership. So it forces the caller to do a bunch of unnecessary cloning if they only have a ref to start out.

Right, I forgot the point that ownership is often not required. I guess you're talking about only requiring ownership when a certain condition is met, e.g.

use std::collections::HashSet;

fn insert_if_matches(map: &mut HashSet<String>, insert: String, contains: &str) -> Result<(), String> {
    if insert.contains(contains) {
        map.insert(insert);
        Ok(())
    } else {
        // Hand the unused String back to the caller.
        Err(insert)
    }
}

So if the condition is not met, the owned value is returned to the caller. This makes ownership handling more complicated, but sometimes it is exactly what you want. tokio's channels are a good example of this pattern: you get the value back if sending fails.
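The same "give the value back on failure" shape also exists in std, so here is a small sketch using std::sync::mpsc instead of tokio (a swapped-in stand-in, not tokio's API). The helper try_send is a made-up name.

```rust
use std::sync::mpsc::{self, SendError, Sender};

// Hypothetical helper: returns the message to the caller if it could not be sent.
fn try_send(tx: &Sender<String>, msg: String) -> Result<(), String> {
    match tx.send(msg) {
        Ok(()) => Ok(()),
        // SendError wraps the value that could not be delivered.
        Err(SendError(returned)) => Err(returned),
    }
}

fn main() {
    let (tx, rx) = mpsc::channel::<String>();
    // With the receiver gone, every send fails.
    drop(rx);
    match try_send(&tx, String::from("payload")) {
        Ok(()) => println!("sent"),
        Err(msg) => println!("got the value back: {msg}"),
    }
}
```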

I think @LukeTPeterson also described a situation where callers sometimes provide an owned value and sometimes a borrowed one.

The insert_if_matches example always requires an owned String in the first place. You cannot call it with a &str if you don't happen to have an owned String.

Translating the insert_if_matches example to GenericCow, it would look like this:

pub fn insert_if_matches(
    map: &mut HashSet<String>,
    insert: impl GenericCow<Borrowed = str>,
    contains: &str,
) {
    if insert.contains(contains) {
        map.insert(insert.into_owned());
    }
}

(Playground)

But using GenericCow here isn't really necessary because it's never required to borrow the value (so returning the value on failure is a feasible alternative). Moreover, GenericCow requires you to use the Owned wrapper (see example) when calling the function with an owned value, so ergonomics isn't optimal either.

GenericCow shows its advantages when the callee sometimes requires an owned value and at other times only a borrowed one, when callers sometimes have an owned value and sometimes a borrowed one, and when which of the two they have isn't always decided at run time. (If it is always decided at run time, you could simply use std::borrow::Cow.)

If your concrete type is Clone, then Borrow<Concrete> is enough. Playground.

But that performs the potentially unnecessary clone the OP was talking about:


use std::borrow::Borrow;

#[derive(Clone)]
struct Foo {
    _field: String,
}

fn bar<T>(value: T)
where
    T: Borrow<Foo>,
{
    let borrowed: &Foo = value.borrow();
    // Clones unconditionally, even when `value` is an owned Foo.
    let _owned: Foo = borrowed.clone();
    println!("Clone happens here!");
}

fn main() {
    let f = Foo { _field: String::from("hi") };
    println!("We need the clone here:");
    bar(&f);
    println!("But we don't need it here:");
    bar(f);
}

(Playground)

Output:

We need the clone here:
Clone happens here!
But we don't need it here:
Clone happens here!

I don't think there's a way to avoid that purely at compile-time, given that whether the clone is required is only decided at runtime.

Cow::into_owned() would work, but OP states that Cow is unacceptable because of its runtime cost. That's pretty strange to me, given that a runtime decision must be taken upon each invocation anyway.

You can avoid the clone in cases where the caller passes an owned version:

use std::borrow::{Borrow, Cow};
use std::ops::Deref;

#[repr(transparent)]
pub struct Owned<B>(pub <B as ToOwned>::Owned)
where
    B: ?Sized + ToOwned;

impl<B> Deref for Owned<B>
where
    B: ?Sized + ToOwned,
{
    type Target = B;
    fn deref(&self) -> &B {
        self.0.borrow()
    }
}

impl<B> Borrow<B> for Owned<B>
where
    B: ?Sized + ToOwned,
{
    fn borrow(&self) -> &B {
        self.0.borrow()
    }
}

pub trait GenericCow
where
    Self: Sized,
    Self: Borrow<Self::Borrowed>,
    Self: Deref<Target = Self::Borrowed>,
{
    type Borrowed: ?Sized + ToOwned;
    fn into_owned(self) -> <Self::Borrowed as ToOwned>::Owned;
}

impl<'a, B> GenericCow for &'a B
where
    B: ?Sized + ToOwned,
{
    type Borrowed = B;
    fn into_owned(self) -> <B as ToOwned>::Owned {
        println!("Possible clone here.");
        self.to_owned()
    }
}

impl<'a, B> GenericCow for Cow<'a, B>
where
    B: ?Sized + ToOwned,
{
    type Borrowed = B;
    fn into_owned(self) -> <B as ToOwned>::Owned {
        println!("Possible clone here.");
        Cow::into_owned(self)
    }
}

impl<B> GenericCow for Owned<B>
where
    B: ?Sized + ToOwned,
{
    type Borrowed = B;
    fn into_owned(self) -> <B as ToOwned>::Owned {
        println!("This doesn't clone.");
        self.0
    }
}

#[derive(Clone, Debug)]
struct Foo {
    _field: String,
}

fn bar<T>(value: T)
where
    T: GenericCow<Borrowed = Foo>
{
    let _borrowed: &Foo = value.borrow();
    let _owned: Foo = value.into_owned();
}

fn main() {
    let f = Foo { _field: String::from("hi") };
    println!("We need the clone here:");
    bar(&f);
    println!("But we don't need it here:");
    bar(Owned(f));
}

(Playground)

Output:

We need the clone here:
Possible clone here.
But we don't need it here:
This doesn't clone.

To clarify, some code paths through the function would clone, and others wouldn't. So that decision happens at runtime, but it happens based on totally different factors from whether the argument was owned.

If the monomorphized version of the function taking a ref is called, and we end up down the code path requiring an owned object, then there would be an unavoidable clone.

However, if the monomorphized version of the function taking an owned object is called, that same code path can just use the object it was passed via an argument.

But doesn't this mean exactly that your function will clone internally anyway, even if an owned value is passed?

I understand that pattern, but that's not what I am talking about.

My point is that if the callee literally clones (as opposed to calling something like your or std::borrow::Cow's .into_owned()), and this is conditional at runtime, then that decision cannot be propagated to compile-time. OP wrote that a clone happens conditionally (decided at runtime). Maybe that isn't true, after all? Frankly, I'm quite confused as to what OP actually wants or is currently doing at this point.

Ah I see what I failed to articulate. I'm not married to Clone. In fact, I was hoping there was a trait like From that could just provide the value if it was already an owned value, or do the Clone if that was necessary.

For example, I believe ToOwned would do what I want except for the fact that the ToOwned impl for a given type has only one associated type, as opposed to an implementation for each target type, the way From / Into has.
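For illustration, a From-like trait along those lines could be sketched as follows. IntoOwnedValue, its methods, my_func's need_owned flag, and Foo are all hypothetical names; nothing like this exists in std. Because the target type is a trait parameter rather than an associated type, the impl for T and the impl for &T can coexist.

```rust
// Hypothetical trait, not part of std: "give me a T, cloning only if needed".
trait IntoOwnedValue<T> {
    fn borrowed(&self) -> &T;
    fn into_owned_value(self) -> T;
}

// An owned T is handed over as-is.
impl<T> IntoOwnedValue<T> for T {
    fn borrowed(&self) -> &T {
        self
    }
    fn into_owned_value(self) -> T {
        self
    }
}

// A &T has to clone to produce a T.
impl<'a, T: Clone> IntoOwnedValue<T> for &'a T {
    fn borrowed(&self) -> &T {
        *self
    }
    fn into_owned_value(self) -> T {
        T::clone(self)
    }
}

#[derive(Clone, Debug, PartialEq)]
struct Foo {
    field: String,
}

// `need_owned` stands in for the runtime condition that requires ownership.
fn my_func<R: IntoOwnedValue<Foo>>(t: R, need_owned: bool) -> Option<Foo> {
    let view: &Foo = t.borrowed();
    println!("looking at {}", view.field);
    if need_owned {
        Some(t.into_owned_value())
    } else {
        None
    }
}

fn main() {
    let owned = Foo { field: String::from("owned") };
    let buffer = owned.field.as_ptr();
    let kept = my_func(owned, true).unwrap();
    // Same heap buffer, so the value was moved rather than cloned.
    assert_eq!(kept.field.as_ptr(), buffer);

    let shared = Foo { field: String::from("shared") };
    let cloned = my_func(&shared, true).unwrap();
    // Different heap buffer, so this path cloned.
    assert_ne!(cloned.field.as_ptr(), shared.field.as_ptr());
    assert_eq!(my_func(&shared, false), None);
}
```

Callers pass either Foo or &Foo without a wrapper, and the clone only exists in the &Foo instantiation.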

ToOwned::to_owned always takes a shared reference. Thus it cannot consume the receiver (but Into or into_owned[1] can).


[1] Both Cow::into_owned and GenericCow::into_owned.
