Smart pointer which owns its target

:man_facepalming:

Of course, it was totally useless then.

This whole issue seems super complex. It took me a lot of time to understand it, and throughout this discussion I keep recalling fragments of what I learned (like how deref is transitive, etc.), which helps me get a better understanding. But I still feel like I don't have a good overview of the issue yet.

I didn't have time to try it yet, but I have looked at it and it inspired me to get rid of the Owned wrapper. My code compiled, but it wasn't useful (because IntoOwned<String> doesn't help me where I need IntoOwned<str>, for example). I suspect your code has the same issue.

Maybe you fix it with all these implementations of IntoOwnedBorrow<Pointee> for Pointer (IOB<[T]> for Vec<T>, IOB<str> for String, etc). This is what I had way earlier in version 0.3.0 of deref_owned, see here, but I don't like that much because newly added smart pointers would not work unless I provide an implementation in a private module. (Perhaps I misunderstand it though.)

Also auto traits or negative impls didn't help (because they don't affect overlapping rules if I understand right).

In the end, I gave up trying to get rid of the wrapper.

However, I was able to simplify deref_owned drastically in version 0.8.0 (see also changelog and documentation):

  • Struct Owned is now a simple wrapper with one type argument (of its inner value).
  • The associated type GenericCow<B>::Owned has been removed in favor of using <B as ToOwned>::Owned.

Moreover, I removed the terminology "smart pointer" as you and @8573 suggested.

Here is an example test case:

#[test]
fn test_generic_fn() {
    fn generic_fn(arg: impl GenericCow<str>) {
        let reference: &str = arg.borrow();
        assert_eq!(reference, "Echo");
        let owned: String = arg.into_owned();
        assert_eq!(owned, "Echo".to_string());
    }
    generic_fn(Owned("Echo".to_string()));
    generic_fn(Cow::Owned("Echo".to_string()));
    generic_fn(Cow::Borrowed("Echo"));
    generic_fn("Echo");
}

(Playground) (source on docs.rs)

mmtkvdb now uses this new version and it seems fine. Nonetheless, the API requires some cleanup, I think. But at least it works for now.

P.S.: deref_owned is now mostly doing the same as what "conradludgate" showed on IRLO.


What I mean is this:

// implementation details that are only pub because of the private_in_public lint
mod details {
    /* … */
    impl<T: Clone> IntoOwnedBorrow<[T]> for Vec<T> {
        fn into_owned(this: Self) -> Vec<T> {
            this
        }
    }
    impl IntoOwnedBorrow<str> for String {
        fn into_owned(this: Self) -> String {
            this
        }
    }
    /* … */
}

What about Path and PathBuf or third-party type-pairs like that? They won't be supported unless implementations for them are added into the module details, which is private though :face_with_diagonal_mouth:.

I'm not sure if I fully understand your comment though.


You write:

I don't think it's only a formality. It would be if Storable was a sealed trait. But third-party crates may implement it too (and it's intended to be implemented by external crates).

Actually Owned<T> is already an implementation detail. Users of mmtkvdb's API don't need to know about it. They can rely on the fact that the aligned reference-like type implements GenericCow<Self>:

pub unsafe trait Storable: Ord + ToOwned {
    /* … */
    type AlignedRef<'a>: GenericCow<Self>;
    /* … */
}

This bound using GenericCow is sufficient to get a borrow of the storable type or to retrieve an owned form of it (edit: see also fn generic_fn() in test case above). No knowledge about Owned is needed.

Edit: I added some links now that docs.rs processed the package.


I guess most people still frown upon the idea of a wrapper:

pub struct Owned<T>(pub T);

If there are more positive or negative opinions, I'm happy to hear them. I would like to understand better if what I was doing is plain nonsense or is reasonable to do. So far no other solution has been presented that I find reasonable (see my comments above regarding your proposal @CAD97). Maybe the Owned wrapper can be made superfluous in a future version of Rust or with specialization or similar features, but I think that's all very very unstable yet, right?

I guess the central problem is that we have no clear definition of what's "owned" in Rust:

  • String is an owned String
  • i32 is an owned i32
  • &String is an owned reference to a String
  • &str is an owned reference to a string slice
  • Rc<T> is an owned Rc pointing to a T

So while that all may be true, it is not helping us. In that sense: EVERYTHING IS OWNED.

Consequently, if we have a trait say OwnedType, and we implement

impl<T> OwnedType for T { /* … */ }

then this would be in conflict with any other implementation.

I tried to do things like:

impl<B> OwnedType<B> for <B as ToOwned>::Owned { /* … */ }

to restrict the definition from "any owned type" to "an owned type for a particular borrowed type B", but apparently that doesn't satisfy the compiler either because it potentially collides. (See also this post on IRLO.)

We can help ourselves by using a wrapper Owned<T> to explicitly mark a type as "owned". (Here T is the type that's passed as "owned".)

This doesn't look much different from borrowing or making a Cow:

  • Owned(x)
  • Cow::Owned(x)
  • Cow::Borrowed(&x)
  • &x

Say x is our String, then in each of these four variants, we wrap this String in some way.

  1. as owned (determined at compile-time)
  2. as dynamically chosen to be owned
  3. as dynamically chosen to be borrowed
  4. as borrowed (determined at compile-time)

Thus the transformation x ↦ Owned(x) isn't much different from x ↦ &x, or x ↦ Cow::Owned(x).

But perhaps someone with a better background in type theory can explain this better. This is merely how I feel about it, without a deep understanding of the mathematical theory behind it.

Is this a good approach or a bad one? And if it's a bad one, what else should I do? A simple use case to discuss this could be the following function:

fn generic_fn(arg: impl GenericCow<str>) {
    let reference: &str = arg.borrow();
    assert_eq!(reference, "Echo");
    let owned: String = arg.into_owned();
    assert_eq!(owned, "Echo".to_string());
}

(from the Playground above)

This function might sometimes (depending on run-time choices) require a borrow and sometimes an owned value. When it requires an owned value and the caller has an owned value that it doesn't need anymore, this can save an unnecessary clone. Using Cow here has runtime-overhead because it always dynamically determines if a value is borrowed or owned. In contrast GenericCow can determine it at compile-time (when possible).
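To make the "saves an unnecessary clone" claim concrete, here is a minimal, self-contained sketch. It re-declares a local GenericCow and Owned as stand-ins modeled on the snippets in this thread (they are assumptions, not the exact definitions from deref_owned), and the helper names describe, owned_reuses_buffer and borrowed_allocates are made up for illustration. The two pointer checks compare the heap buffer before and after into_owned.

```rust
use std::borrow::{Borrow, Cow};

// Local stand-in for the trait discussed here (assumption, not the crate's code).
pub trait GenericCow<B>: Sized + Borrow<B>
where
    B: ?Sized + ToOwned,
{
    fn into_owned(self) -> <B as ToOwned>::Owned;
}

// Local stand-in for the wrapper marking a value as statically "owned".
pub struct Owned<T>(pub T);

impl<B> Borrow<B> for Owned<<B as ToOwned>::Owned>
where
    B: ?Sized + ToOwned,
{
    fn borrow(&self) -> &B {
        Borrow::<B>::borrow(&self.0)
    }
}

impl<B> GenericCow<B> for Owned<<B as ToOwned>::Owned>
where
    B: ?Sized + ToOwned,
{
    fn into_owned(self) -> <B as ToOwned>::Owned {
        self.0 // statically owned: just move the value out
    }
}

impl<'a, B> GenericCow<B> for &'a B
where
    B: ?Sized + ToOwned,
{
    fn into_owned(self) -> <B as ToOwned>::Owned {
        B::to_owned(self) // statically borrowed: has to create a new owned value
    }
}

impl<'a, B> GenericCow<B> for Cow<'a, B>
where
    B: ?Sized + ToOwned,
{
    fn into_owned(self) -> <B as ToOwned>::Owned {
        Cow::into_owned(self) // owned or borrowed is decided at run time
    }
}

// Hypothetical consumer: borrows first, then takes ownership.
fn describe(arg: impl GenericCow<str>) -> (usize, String) {
    let len = Borrow::<str>::borrow(&arg).len();
    (len, arg.into_owned())
}

// True if `into_owned` handed back the very same heap buffer.
fn owned_reuses_buffer() -> bool {
    let s = String::from("Echo");
    let before = s.as_ptr();
    let after = GenericCow::<str>::into_owned(Owned(s));
    before == after.as_ptr()
}

// True if `into_owned` on a plain reference produced a fresh allocation.
fn borrowed_allocates() -> bool {
    let s = String::from("Echo");
    let copy = GenericCow::<str>::into_owned(s.as_str());
    s.as_ptr() != copy.as_ptr()
}

fn main() {
    println!("{:?}", describe(Owned("Echo".to_string())));
    println!("{:?}", describe(Cow::Borrowed("Echo")));
    println!("{:?}", describe("Echo"));
    println!("{} {}", owned_reuses_buffer(), borrowed_allocates());
}
```

The choice between "move" and "clone" is made by impl selection here, so no enum discriminant needs to be inspected for the Owned and & cases.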


P.S.: I feel like Cow is such a "basic" data type, that the proposal to add something like GenericCow to std is not entirely crazy. There was the argument that Cow doesn't have much overhead, but I responded on IRLO:

Still, we have no GenericCow in std yet.

GenericCow is a fine "noun trait" name. IntoBorrowOwned is I think the correct "verb trait" name.

I think we now have essentially agreed on the shape of GenericCow.

Reiterating for my own benefit:

GenericCow exists to lift the Cow enum into the type system. Fully reified, it would be

use std::borrow::{Borrow, Cow};
use std::marker::PhantomData;

pub trait IntoBorrowOwned<Borrowed>: Sized + Borrow<Borrowed>
where
    Borrowed: ?Sized + ToOwned,
{
    fn into_owned(this: Self) -> Borrowed::Owned {
        this.borrow().to_owned()
    }
}

// definitely owned (Cow::Owned)

pub struct LiftOwned<Borrowed>
where
    Borrowed: ?Sized + ToOwned,
{
    owned: Borrowed::Owned,
}

impl<Borrowed> Borrow<Borrowed> for LiftOwned<Borrowed>
where
    Borrowed: ?Sized + ToOwned,
{
    fn borrow(&self) -> &Borrowed {
        self.owned.borrow()
    }
}

impl<Borrowed> IntoBorrowOwned<Borrowed> for LiftOwned<Borrowed>
where
    Borrowed: ?Sized + ToOwned,
{
    fn into_owned(this: Self) -> Borrowed::Owned {
        this.owned
    }
}

// definitely borrowed (Cow::Borrowed)

pub struct LiftBorrowed<Borrower, Borrowed>
where
    Borrower: Borrow<Borrowed>,
    Borrowed: ?Sized,
{
    borrowfn: PhantomData<fn(&Borrower) -> Borrowed>,
    borrower: Borrower,
}

impl<Borrower, Borrowed> Borrow<Borrowed> for LiftBorrowed<Borrower, Borrowed>
where
    Borrower: Borrow<Borrowed>,
    Borrowed: ?Sized,
{
    fn borrow(&self) -> &Borrowed {
        self.borrower.borrow()
    }
}

impl<Borrower, Borrowed> IntoBorrowOwned<Borrowed> for LiftBorrowed<Borrower, Borrowed>
where
    Borrower: Borrow<Borrowed>,
    Borrowed: ?Sized + ToOwned,
{
}

// optionally, provide default impl for any type with the right shape

// impl<T, Borrowed> IntoBorrowOwned<Borrowed> for T
// where
//     T: Borrow<Borrowed>,
//     Borrowed: ?Sized + ToOwned,
// {
//     default fn into_owned(this: Self) -> Borrowed::Owned {
//         this.borrow().to_owned()
//     }
// }

// specialize for cow-like types

impl<Borrowed> IntoBorrowOwned<Borrowed> for Cow<'_, Borrowed>
where
    Borrowed: ?Sized + ToOwned,
{
    fn into_owned(this: Self) -> Borrowed::Owned {
        this.into_owned()
    }
}

You can then get to deref_owned@0.8's design by noting the following simplifications:

  • LiftBorrowed<Borrower, Borrowed> is fully redundant; the blanket impl on Borrower behaves identically.
  • The only Borrower type of any interest is &T (why?), so we can provide that impl directly rather than using the blanket impl. (This removes the reliance on specialization to provide the optimized impl for Cow.)
  • LiftOwned can be simplified to just Owned<T>(T) if
    • It borrows through T: Borrow rather than borrowing T itself and
    • The lift to T::Owned is done by the impl IntoBorrowOwned.

LiftOwned cannot be made redundant, though; it conflicts with the blanket impl, as it is perfectly valid for some crate to provide a Borrowed: IntoOwned<Owned=&Borrowed> type. We either have to keep the reification, drop the semantic blanket impl (this is effectively what my previous playground did), or make the Owned type a parameter of IntoBorrowOwned and use a really ugly specialization lattice that can't even be compiled yet.

... for someone receiving impl GenericCow, but for someone providing impl GenericCow, they still need to know about Owned. This is I think a point of design disagreement where you're focusing primarily on the fn() -> impl GenericCow case used by mmtkvdb... but the whole discussion started with the design of GenericCow/Owned, and that cares more about calling fn(impl GenericCow). This is what I keep referring to as input/required (fn(impl _)) and output/provided (fn -> impl _).

The main point of my playground wasn't the exact implementation choice of IntoOwnedBorrow, though. The main point was instead illustrating Packed and Realigned: these structs are the exact thing which mmtkvdb is actually providing to callers.

With -> impl GenericCow<'_, Self>, users of mmtkvdb still need to understand the GenericCow trait, even if they don't have to know about the Owned type. With -> Realigned<'_, Self>, this provides a type encapsulating the use of GenericCow in the implementation, and as a normal type it

  • has a documentation page listing what you can do with the type, and
  • holds a documentation comment to explain what "realigning" is and why mmtkvdb has to do it.

Even if implementers of Realign need to know GenericCow and Owned, callers are completely isolated from your use of deref_owned, making whatever type hackery it has to do a moot point. (And since Pack/Realign are extremely derivable traits, implementers who just use the derive don't need to know about the details either.)

To that end I made a small update to the previous [playground] to use deref_owned. I also switched out the traits to use GATs rather than pseudo-GATs via the for<'a> Trait<'a> pattern.

If you do function_call(Cow::Owned(x)) or function_call(Cow::Borrowed(x)), then there's a high chance there isn't runtime overhead as inlining can trivially strip back to just the code to deal with the one enum arm.

Minor `deref_owned` maybe-bug report

Owned<T> is currently for<B: ?Sized+ToOwned<Owned=T>> Borrow<B> and Deref<Target=T>. This is probably a violation of the "behaves identical to" requirement of Borrow, and it should Borrow and Deref to the same object.

That said, this is likely a non-problem in practice.

Yeah, that's why we can simply pass a &str to generic_fn without needing a wrapper here, right?


Not exactly right, I think. Let me show an example:

fn generic_fn(arg: impl GenericCow<str>) {
    let reference: &str = arg.borrow();
    assert_eq!(reference, "Echo");
    let owned: String = arg.into_owned();
    assert_eq!(owned, "Echo".to_string());
}
struct Tmp<T>(T);
impl<B> Borrow<B> for Tmp<<B as ToOwned>::Owned>
where
    B: ?Sized + ToOwned,
{
    fn borrow(&self) -> &B {
        self.0.borrow()
    }
}
impl<B> GenericCow<B> for Tmp<<B as ToOwned>::Owned>
where
    B: ?Sized + ToOwned,
{
    fn into_owned(self) -> <B as ToOwned>::Owned {
        self.0
    }
}
generic_fn(Tmp("Echo".to_string()));
generic_fn(Cow::Owned("Echo".to_string()));
generic_fn(Cow::Borrowed("Echo"));
generic_fn("Echo");

(Playground)

We don't need the particular deref_owned::Owned wrapper for using that interface. We can just define any other wrapper ourselves which fulfills the interface of GenericCow.

Thus GenericCow (or IntoBorrowOwned as you said) would be what we need in std (or a well-known de-facto standard crate). The Owned wrapper could be defined where needed. It's not part of the interface. Do you agree on this or am I thinking about this wrong?


Okay, I will look at them again later when I have some more time. But I feel like they are not really needed and only complicate things, because:

  • I need some sort of trait like GenericCow to be able to support third-party crate smart pointers (something like str/String, Path/PathBuf, but externally defined).

This is what I said here:

So you made an update to use deref_owned, which fixes this concern:


But:

  • If I have GenericCow, then Owned is just an implementation detail anyway.

So I don't see the gain of the Realign and Pack structs (yet).

Edit: Now I understand what you wrote about "callers are completely isolated from your use of deref_owned". So yeah, I guess I could isolate callers from GenericCow, but implementors must still know about it. What's so bad about GenericCow? It isn't "type hackery". The type hackery happens in Owned, not in GenericCow.


Interesting point. Do you think this always holds, also in case of GATs? (I really don't know.) Still, this might be a compiler detail that might differ on different platforms?


Note that dereferencing is just provided for convenience by Owned, a particular wrapper that can be replaced by another wrapper. The Deref trait is no longer used by GenericCow at all (due to your advice!). Thus, if the "behaves identical requirement" of Borrow is needed, .borrow() must be used instead of &*. Note that ToOwned::Owned: Borrow<Self>, i.e. .into_owned() doesn't violate the "behaves identical requirement" of Borrow (I think).

Do you agree?
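To illustrate the difference between &* and .borrow() on such a wrapper, here is a small sketch. The Owned type below is a local stand-in (an assumption modeled on the snippets in this thread, not the crate's actual source), and via_deref / via_borrow are hypothetical helper names. For an Owned<Vec<i32>>, dereferencing stops at the Vec, while borrowing can go to the slice.

```rust
use std::borrow::Borrow;
use std::ops::Deref;

// Local stand-in for the wrapper (assumption, not the crate's code).
pub struct Owned<T>(pub T);

// Convenience only: dereferences to the wrapped value itself.
impl<T> Deref for Owned<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

// Borrows as the type `B` whose owned form is the wrapped value.
impl<B> Borrow<B> for Owned<<B as ToOwned>::Owned>
where
    B: ?Sized + ToOwned,
{
    fn borrow(&self) -> &B {
        Borrow::<B>::borrow(&self.0)
    }
}

// `&*` on the wrapper yields the inner `Vec<i32>`.
fn via_deref(wrapped: &Owned<Vec<i32>>) -> &Vec<i32> {
    &**wrapped
}

// `.borrow()` can yield the slice `[i32]`, because `Vec<i32>` is `<[i32] as ToOwned>::Owned`.
fn via_borrow(wrapped: &Owned<Vec<i32>>) -> &[i32] {
    <Owned<Vec<i32>> as Borrow<[i32]>>::borrow(wrapped)
}

fn main() {
    let wrapped = Owned(vec![2, 7, 4]);
    println!("{:?}", via_deref(&wrapped));
    println!("{:?}", via_borrow(&wrapped));
}
```

Both functions end up looking at the same elements; only the static type of the reference differs, which is why a Deref<Target = Self> bound on the generic side fixes one particular dereferencing level.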

Now this brings me to a most intriguing issue. Remember how I said on IRLO:

I just noticed that with my current definition of Storable::AlignedRef (compare with my previous definition), I no longer have impl Deref<Target = Self>. Thus I cannot use dereference anymore when I can just rely on impl GenericCow<Self>.

However, adding a Deref<Target = Self> bound isn't good because I sometimes own an Owned<Vec<i32>>, for example, and that has Vec<i32> as target and not [i32]. (This previously was the reason for the weird OwnedRef wrapper, which performs double dereferencing).

What I ideally need is more something like:

type AlignedRef<'a>: GenericCow<Self> + TransitiveDeref<T>;

or even better:

pub trait GenericCow<B>: Sized + Borrow<B> + TransitiveDeref<T> { /* … */ }

But apparently we cannot express this in Rust, as shown here:

use std::ops::Deref;

struct A;
struct B;
struct C;

impl Deref for A {
    type Target = B;
    fn deref(&self) -> &B {
        &B
    }
}

impl Deref for B {
    type Target = C;
    fn deref(&self) -> &C {
        &C
    }
}

impl C {
    fn c(&self) {
        println!("Hello, my name is C.")
    }
}

fn foo() -> A {
    A
}

// We cannot express:
/*
fn bar() -> impl TransitiveDeref<Target = C> {
    A
}
*/

// We must use:
fn bar() -> impl Deref<Target = impl Deref<Target = C>> {
    A
    // But then this won't work:
    // B
}


fn main() {
    foo().c();
    bar().c();
}

(Playground)

Now what to do!?

  • Add a Deref<Target = T> bound where I want deref-ergonomics ("transitively" to T) but then having to require implementors/callers to use the right wrapper with the exact dereferencing-level?
  • Forget about ergonomics like in the following example?
fn main() {
    let a = foo();
    let c: &C = a.borrow();
    c.c();
    let a = bar();
    let c: &C = a.borrow();
    c.c();
}

(Playground)

:face_with_diagonal_mouth:


I think this is structurally the right thing to do.

Let me show you why I think it's a bad idea to use extra wrappers just to avoid using the trait:

use std::path::Path;

// This seems straightforward:

fn returns_impl_as_ref_path() -> impl AsRef<Path> {
    "some/path.exe"
}

// This seems unnecessarily complex:

struct AsRefPath<T: AsRef<Path>>(T);

impl<T: AsRef<Path>> AsRefPath<T> {
    fn as_ref_path(&self) -> &Path {
        self.0.as_ref()
    }
}

fn returns_as_ref_path_wrapper() -> AsRefPath<impl AsRef<Path>> {
    AsRefPath("some/path.exe")
}

// So how do we use it?

fn main() {
    // Here `AsRef` needs to be in scope:
    let _: &Path = returns_impl_as_ref_path().as_ref();
    // Here `AsRef` doesn't need to be in scope:
    let _: &Path = returns_as_ref_path_wrapper().as_ref_path();
}

(Playground)

Why would we use returns_as_ref_path_wrapper and make things so complicated, instead of just using returns_impl_as_ref_path?

We could use the wrapper approach if AsRef was some weird/unusual trait from a third-party crate. But structurally it's superfluous. Structurally the trait approach (i.e. returns_impl_as_ref_path) is better, I believe.

This is what I previously called MethodReceiver<T>, with the addition that MethodReceiver<T> also includes T; i.e. any type which you can use method syntax to call the methods defined on T.

Returning Realigned<'_, impl GenericCow> instead allows you to re-provide Deref :wink:

Structurally, yes, -> impl GenericCow is all that is needed. But the thing is, you want to provide more than is just structurally needed (namely, the Deref impl that calls .borrow()). But also...

AsRef is in the prelude (and the edition2015 prelude at that) so is always in scope. My point here is effectively that GenericCow is

And as such encapsulating its use so callers don't have to deal with it is beneficial. Returning a type is always a less complex API surface for callers than returning some impl Trait.

Additionally, if custom user implementations are structural, they can be derived, and then those implementers also don't need to know about GenericCow.
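As a rough sketch of what "returning a type that re-provides Deref" could look like: the Handle type below is hypothetical (it only plays the role that Realigned plays in the playground, whose actual definition is not reproduced here), and it is bounded on plain Borrow rather than GenericCow to keep the example self-contained. greeting and greeting_len are likewise made-up names.

```rust
use std::borrow::Borrow;
use std::marker::PhantomData;
use std::ops::Deref;

// Hypothetical concrete return type that hides which borrow-capable value it holds.
pub struct Handle<B: ?Sized, C> {
    inner: C,
    target: PhantomData<B>,
}

impl<B: ?Sized, C: Borrow<B>> Handle<B, C> {
    pub fn new(inner: C) -> Self {
        Handle { inner, target: PhantomData }
    }
}

// Deref is implemented once, on the concrete type, by going through `Borrow`.
impl<B: ?Sized, C: Borrow<B>> Deref for Handle<B, C> {
    type Target = B;
    fn deref(&self) -> &B {
        Borrow::<B>::borrow(&self.inner)
    }
}

// The caller sees a `Handle<str, _>` and never has to name `Borrow`.
pub fn greeting() -> Handle<str, impl Borrow<str>> {
    Handle::new(String::from("Hello"))
}

// Method syntax works through deref-coercion to `str`.
pub fn greeting_len() -> usize {
    greeting().len()
}

fn main() {
    println!("{} has length {}", &*greeting(), greeting_len());
}
```

The price is an extra type in the API; the gain is that methods of the target type are reachable with ordinary method syntax regardless of what is stored inside.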

Finally,

My point here is that using fn(impl GenericCow) requires type hackery (e.g. in Owned or another custom lookalike), because a standard usecase of "T as Cow::Owned" requires knowledge of a) why this isn't already the case, requiring knowing b) how to implement a LiftOwned to provide the obvious impl. Thus while the definition of GenericCow is itself straightforward, its use is not.

Optimizations are never (well, very very rarely[1]) guaranteed. In the case of fn() -> Cow, this would require either

  • the inlining of the function, and using that to note that only one variant is constructed, or
  • some optimization pass refining the return type to note it is always a single variant, and using that knowledge to optimize callers.

“enum variants are types” would potentially allow overconstraining of impls to return -> Cow::Owned or -> Cow::Borrowed, which provides the second avenue for the optimization in the type system.


  1. e.g. copy/move elision in C++ is defined as an allowed optimization over the source semantics, and is guaranteed to occur in some cases.

Ah, now I understand that part. Yes, that makes sense.

:roll_eyes:

Okay, but at what price?

I think both your approach and mine have downsides. Perhaps it's ultimately a matter of taste?

Exactly my point.

Using a trait is sometimes painful because it requires the user of a crate to bring all necessary traits in scope. See also: Pub use Trait as _ for more hygiene.

That's why I said earlier:

With non-sealed traits (like GenericCow), however, it can be a hygiene problem, as they might be implemented by other crates.

On the other hand:

  1. It's possible that mmtkvdb re-exports deref_owned::GenericCow, thus GenericCow can be seen as part of mmtkvdb. Of course, then this argument still strikes:
  2. But regarding "straightforwardness", I feel like
#[derive(Clone, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Owned<T>(pub T);

impl<B> Borrow<B> for Owned<<B as ToOwned>::Owned>
where
    B: ?Sized + ToOwned,
{
    fn borrow(&self) -> &B {
        self.0.borrow()
    }
}

is less trickery than what was needed in your playground here:

(Even if this can be encapsulated/hidden from the user. But a maintainer of mmtkvdb or a fork of it would still have to understand it.)

My mind has almost exploded several times already. Understanding this issue was several days of work (but I learned a lot during that process, admittedly).

I think I was violating everything that this video would have taught me if I had seen it earlier (quote from another thread):

I feel like adding yet another (non-generic) wrapper structure makes things even worse for me.

  3. In practice, the lack of MethodReceiver<T> might not be so bad. Consider this:
struct Fancy;

impl Fancy {
    fn fancy(&self) {
        println!("YAY!")
    }
}

struct SomeType;

trait Abstract {
    type Retval; // no bounds at all
    fn foo(&self) -> Self::Retval;
}

impl Abstract for SomeType {
    type Retval = Fancy;
    fn foo(&self) -> Self::Retval {
        Fancy
    }
}

fn main() {
    let v = SomeType;
    // we can call `.fancy()`, even if `Abstract::Retval` has no bounds
    v.foo().fancy();
}

(Playground)


(Preliminary) Conclusion

I would like to give my own (preliminary) conclusion from this discussion:

  • My original approach (e.g. deref_owned version 0.2.0) was flawed because it "abused" Deref (see IntoOwned in version 0.2.0). (Many thanks to you for helping me figure that out and being so patient with me.)
  • This has been solved by using Borrow instead (see GenericCow in version 0.8.0).
  • Ergonomics of GenericCow are somewhat limited. In the generic case, deref-coercion won't work and we need the weird Owned wrapper when providing an always-owned GenericCow value.
  • However, ergonomics aren't totally bad:
    • In the concrete case, deref-coercion still works (see also Playground above) because for example Owned still implements Deref (without abusing it for wrong reasons and without impeding transitivity of dereferencing).
    • Writing Owned(x) instead of Cow::Owned(x) isn't really that hard. In fact it's shorter. :wink:
  • In the future, the Owned wrapper might or might not become superfluous.
  • Sometimes, providing concrete types instead of impl Trait may be beneficial.
  • Most people will just use Cow and not go crazy like I almost did. :face_with_spiral_eyes:

I think I agree with everything here :smiley:

The simple solution is of course to just use Cow and accept the minimal (and hopefully optimizable) cost to using it. The reason I've gone into the more complex implementation options is that we're trying to provide a better API that communicates what's being done more precisely than just using Cow.

But there is always a correlation between precision and API/implementation complexity; a tradeoff between statically eliminating unnecessary copies/clones and approachability.

I'm in agreement here now that the rest is down mostly to design philosophy and taste.

To this point, this is generally understood as desired to be provided by some future feature #[inherent] impl Trait for Type, sometimes discussed under the umbrella of delegation. The idea of #[inherent] is that you mark a trait impl as fundamental to the use of a type, and when doing method/name lookup on the type the trait is treated as in scope (for just the purpose of name resolution). Providing an #[inherent] impl would be allowed for any type you are permitted to add inherent functions to. (This has nothing to do with the trait being sealed.)

This is why this is sometimes lumped under the more general delegation feature, which would make actually adding inherent functions which just serve to call the trait implementation easier.
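Until something like #[inherent] exists, the same effect can be had by writing the forwarding function by hand. A minimal sketch, with made-up names (shapes, Area, Square, square_area) that have nothing to do with deref_owned or mmtkvdb:

```rust
mod shapes {
    pub trait Area {
        fn area(&self) -> f64;
    }

    pub struct Square(pub f64);

    impl Area for Square {
        fn area(&self) -> f64 {
            self.0 * self.0
        }
    }

    impl Square {
        // Hand-written delegation: an inherent function that only forwards
        // to the trait implementation. The fully qualified call avoids
        // recursing into this inherent function.
        pub fn area(&self) -> f64 {
            <Self as Area>::area(self)
        }
    }
}

// `shapes::Area` is deliberately not imported here; method syntax still works
// because the inherent function is found during name resolution.
fn square_area(side: f64) -> f64 {
    shapes::Square(side).area()
}

fn main() {
    println!("{}", square_area(3.0));
}
```

This is exactly the boilerplate that a delegation feature would generate for you.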

And yep, if an associated type is fully resolved, you can call any inherent functionality. This is why I asked about generic consumers of Txn/Storable; if you know the concrete type you know the concrete type projection and the provided associated type bounds are unnecessary.

Interesting parallel:

Note that both Into and AsRef are in std::convert.

I believe that GenericCow, just like Borrow, belongs to std::borrow.

(Yeah, I know, most people don't want to see something like that in std at all.)

Now it's more difficult to say where Owned should go. As it's a static variant of Cow::Owned, it might go to std::borrow too. :see_no_evil:


Do you think it's a bad idea to re-open that issue on IRLO again? After all:

  • AsRef has a contrapositive (which is Into) in std::convert.
  • Borrow has no contrapositive in std::borrow (note that ToOwned is not generic enough, because <T as Into<T>>::into is zero-cost while <T as ToOwned>::to_owned is not always zero-cost but often involves cloning)

Or am I mistaken here?


After thinking about this again, I'm not sure if AsRef and Borrow are easily comparable. We have impl<T: ?Sized> Borrow<T> for T but not every T implements AsRef<T>. So maybe Into isn't the contrapositive of AsRef either?
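The asymmetry can be seen in a few lines. This is only a sketch with hypothetical helper names (borrow_as_self, str_as_ref): the first function is generic over every T thanks to the blanket Borrow impl in std, while the AsRef counterpart can only be written for types that happen to have a reflexive impl, such as str.

```rust
use std::borrow::Borrow;

// Works for every `T`, sized or not, because std provides
// `impl<T: ?Sized> Borrow<T> for T`.
fn borrow_as_self<T: ?Sized>(value: &T) -> &T {
    <T as Borrow<T>>::borrow(value)
}

// Works only because std happens to provide `impl AsRef<str> for str`.
fn str_as_ref(value: &str) -> &str {
    <str as AsRef<str>>::as_ref(value)
}

// There is no generic counterpart for `AsRef`: for example, `i32` does not
// implement `AsRef<i32>`, so `<i32 as AsRef<i32>>::as_ref(&5)` would not compile.

fn main() {
    println!("{}", borrow_as_self(&5_i32));
    println!("{}", borrow_as_self("abc"));
    println!("{}", str_as_ref("abc"));
}
```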

I still believe something like GenericCow is missing in std. But discussing this likely is very time consuming and… I have a working solution for my own problem now.

It just makes me mad how I over and over have bits and pieces missing in Rust's type system. :slightly_frowning_face: Trying to find the best possible representation is an endless task that can consume days, weeks, or months, I believe.


Update:

Since I believe every T should implement AsRef<T> (see my post here), I think GenericCow indeed is the contrapositive of Borrow just like Into should be the contrapositive of AsRef (but isn't due to orphan rules / lack of specialization).

Thus the lack of specialization may be the reason for:

  • Having no impl<T: ?Sized> AsRef<T> for T in std
  • ToOwned::to_owned doing unnecessary clones (instead of having GenericCow or IntoOwned which doesn't)

I think these two issues are actually related :bangbang:

to_owned takes &self, so it has to clone. Adding an into_owned(self) where Self: Sized to the trait would be interesting, though...

But I think I agree that std having more/blanket impls is effectively blocked on having those impls be specializable, yeah.

I don't think it should be added to ToOwned. It is what GenericCow does. Ideally, IntoOwned (aka GenericCow) would replace ToOwned and ToOwned::to_owned.

This is what I demonstrated here:

edit: I just noticed this Playground still has the flawed use of Deref.

Here is a non-flawed example:

use std::borrow::Borrow;

pub trait GenericCow<B>: Sized + Borrow<B>
where
    B: ?Sized + ToOwned,
{
    fn into_owned(self) -> <B as ToOwned>::Owned;
}

impl<'a, B> GenericCow<B> for &'a B
where
    B: ?Sized + ToOwned,
{
    fn into_owned(self) -> <B as ToOwned>::Owned {
        self.to_owned()
    }
}

fn main() {
    let hello: &str = "Hello World!";
    let owned: String = hello.into_owned();
    println!("{owned}");
}

(Playground)

edit: If this were to replace ToOwned, then the associated type ToOwned::Owned would have to be moved to GenericCow/IntoOwned, of course.

Thus my hypothesis is:

Lack of specialization leads to:

  • Having ToOwned instead of IntoOwned/GenericCow
  • Having no generic impl<T: ?Sized> AsRef<T> for T

.... wait, we're charging headfirst into the &T => { &T, T } split again. I'm not convinced that lifting the trait to being implemented as self: &T rather than &self: &T changes this.

Additionally, w.r.t. replacing via time machine:

  • only taking self instead of &self prevents the trait from being dyn-safe
  • having a projective owned type is still beneficial; note what happens when you remove the implication of ToOwned in the definition [playground]

I don't understand what the "&T => { &T, T } split" is?

That's a technical implementation issue, yes. Also see my very first post in this thread, in which I mention issue #20671:

Regarding …

… that is because &T is an "owned" &T too, I think? It's this ambiguity:

fn main() {
    let hello: &str = "Hello World!";
    let copied_reference: &str = GenericCow::<&str>::into_owned(&hello);
    let owned: String = GenericCow::<str>::into_owned(hello);
    println!("{copied_reference}");
    println!("{owned}");
}

(Playground)

It's not a structural problem though. More a matter of picking the right method in case of such an ambiguity. Maybe it could be solved.

But even if it was solved, there is still issue #20671 (edit: and dyn-safety). I didn't want to imply we could replace ToOwned with GenericCow now (edit: or ever), but that GenericCow is not only a generalization of Cow but also a generalization of ToOwned (working by-value and not by-reference).

You do, I just overcompressed notation; it's exactly

And while

inference limitations are still important limitations to consider, and I'm fairly certain that this isn't a case where inference could decide the correct impl via information backpropagation; AIUI the applicable trait impl needs to be known just using the type information before the method call because of how auto(de)ref works.

(c.f. E0283 and E0284 on same expression uselessly duplicate identical help · Issue #98891 · rust-lang/rust · GitHub)

Ah :sweat_smile:.

I understand too little to judge whether it could be solved, but will trust you on this.

But notation issues / verbose notation aside (which is the only thing that inference limitations would cause, right?), would you agree that GenericCow is a generalization of ToOwned?

My feeling is that:

  • AsRef<T> relates to Into<T>
    (except that we lack impl<T: ?Sized> AsRef<T> for T for technical reasons)

the same as

  • Borrow<T> relates to GenericCow<T>
    (except that we require an Owned wrapper sometimes for technical reasons).

… but it doesn't! So I had to make some changes:

-impl<T, U> AsRef<U> for Owned<T>
-where
-    T: AsRef<U>,
-    U: ?Sized,
-{
-    fn as_ref(&self) -> &U {
-        self.0.as_ref()
+impl<T> AsRef<T> for Owned<T> {
+    fn as_ref(&self) -> &T {
+        &self.0
     }
 }

This is such that the following tests run fine:

#[test]
fn test_vec_as_ref() {
    let wrapped: Owned<Vec<i32>> = Owned(vec![2, 7, 4]);
    let vec_ref: &Vec<i32> = wrapped.as_ref();
    assert_eq!(vec_ref, &vec![2, 7, 4]);
    let slice_ref: &[i32] = wrapped.as_ref();
    assert_eq!(slice_ref, &[2, 7, 4] as &[i32]);
}
#[test]
fn test_int_as_ref() {
    let wrapped: Owned<i32> = Owned(5);
    let reference: &i32 = wrapped.as_ref();
    assert_eq!(reference, &5);
}

The new implementation of AsRef<T> for Owned<T> is akin to impl<T: ?Sized + ToOwned> AsRef<T> for Cow<'_, T>. But note how there is also impl AsRef<Path> for Cow<'_, OsStr>.

The more I dig into this, the uglier it gets!

:confounded:


Update:

I decided it's best to revert the above diff, i.e. to keep:

impl<T, U> AsRef<U> for Owned<T>
where
    T: AsRef<U>,
    U: ?Sized,
{
    fn as_ref(&self) -> &U {
        self.0.as_ref()
    }
}

This means you cannot (generally) use .as_ref() to go from Owned<T> to &T. But the implementation of AsRef<U> for Owned<T> (where T implements AsRef<U>) is needed to be able to use .as_ref() for a "cheap reference-to-reference conversion" (which is what AsRef is provided for).
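To make the trade-off concrete, here is a minimal sketch. The `Owned` struct below is a hypothetical stand-in for `deref_owned::Owned` (the real type has more impls), and the helper function names are made up for illustration:

```rust
use std::path::Path;

// Hypothetical stand-in for `deref_owned::Owned`; not the crate's actual definition.
pub struct Owned<T>(pub T);

// The generic impl discussed above: forward to whatever `AsRef` impls `T` has.
impl<T, U> AsRef<U> for Owned<T>
where
    T: AsRef<U>,
    U: ?Sized,
{
    fn as_ref(&self) -> &U {
        self.0.as_ref()
    }
}

// Works because `String: AsRef<str>`.
pub fn as_str(wrapped: &Owned<String>) -> &str {
    wrapped.as_ref()
}

// Works because `String: AsRef<Path>`.
pub fn as_path(wrapped: &Owned<String>) -> &Path {
    wrapped.as_ref()
}

// `wrapped.as_ref()` cannot yield `&String` here, because `String` has no
// `AsRef<String>` impl. Access the field (or use `Deref`/`Borrow`) instead.
pub fn inner(wrapped: &Owned<String>) -> &String {
    &wrapped.0
}
```

So the wrapper inherits every cheap conversion of the wrapped type, at the price of not having a reflexive one.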

In that sense, I believe that impl<T: ?Sized + ToOwned> AsRef<T> for Cow<'_, T> in std is wrong! It should be impl<T: ?Sized + ToOwned + AsRef<U>, U: ?Sized> AsRef<U> for Cow<'_, T> instead. However, then .as_ref() cannot be used to go from Cow<'_, T> to &T anymore. But .as_ref() cannot be used to go from T to &T either, so that would just be consistent! .borrow() can be used for that.

I think it's too late though to fix std.
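For illustration, this is roughly what the current std impl means in practice for a `Cow<'_, str>`. It is only a sketch; `as_path` and `cow_to_path` are hypothetical helper names:

```rust
use std::borrow::Cow;
use std::path::Path;

// A function that accepts anything convertible to `&Path` by reference.
fn as_path<P: ?Sized + AsRef<Path>>(p: &P) -> &Path {
    p.as_ref()
}

pub fn cow_to_path<'a>(cow: &'a Cow<'_, str>) -> &'a Path {
    // `as_path(cow)` is rejected: `Cow<'_, str>` implements `AsRef<str>`
    // (the reflexive-style impl), but not `AsRef<Path>`, even though
    // `str: AsRef<Path>` holds.
    //
    // Going through the inner `str` first works:
    let s: &str = cow.as_ref();
    as_path(s)
}
```

With the impl proposed above, the detour through `&str` would presumably not be needed.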

:frowning_face:


We often wrongly use impl AsRef<Path>, where, in fact, we mean B: Borrow<P> where P: ?Sized + AsRef<Path>.

Demonstration:

use std::borrow::{Borrow, Cow};
use std::ffi::{OsString, OsStr};
use std::path::{Path, PathBuf};

#[derive(Clone)]
struct CurDir;

impl AsRef<Path> for CurDir {
    fn as_ref(&self) -> &Path {
        ".".as_ref()
    }
}

fn foo(_: impl AsRef<Path>) {}
fn bar<P: ?Sized + AsRef<Path>, B: Borrow<P>>(_: B) {}

fn main() {
    foo(OsString::from("."));
    foo(&OsString::from(".") as &OsString);
    foo(&OsString::from(".") as &OsStr);
    foo(PathBuf::from("."));
    foo(&PathBuf::from(".") as &PathBuf);
    foo(&PathBuf::from(".") as &Path);
    foo(CurDir);
    foo(&CurDir);
    // foo(Cow::<'_, OsString>::Owned(OsString::from(".")));
    // foo(Cow::<'_, OsString>::Borrowed(&OsString::from(".")));
    foo(Cow::<'_, OsStr>::Borrowed(&OsString::from(".") as &OsStr));
    // foo(Cow::<'_, PathBuf>::Owned(PathBuf::from(".")));
    // foo(Cow::<'_, PathBuf>::Borrowed(&PathBuf::from(".")));
    foo(Cow::<'_, Path>::Borrowed(&PathBuf::from(".") as &Path));
    // foo(Cow::<'_, CurDir>::Owned(CurDir));
    // foo(Cow::<'_, CurDir>::Borrowed(&CurDir));
    bar::<OsString, _>(OsString::from("."));
    bar::<OsString, _>(&OsString::from(".") as &OsString);
    bar::<OsStr, _>(&OsString::from(".") as &OsStr);
    bar::<PathBuf, _>(PathBuf::from("."));
    bar::<PathBuf, _>(&PathBuf::from("."));
    bar::<Path, _>(&PathBuf::from(".") as &Path);
    bar::<CurDir, _>(CurDir);
    bar::<CurDir, _>(&CurDir);
    bar::<OsString, _>(Cow::<'_, OsString>::Owned(OsString::from(".")));
    bar::<OsString, _>(Cow::<'_, OsString>::Borrowed(&OsString::from(".")));
    bar::<OsStr, _>(Cow::<'_, OsStr>::Borrowed(&OsString::from(".") as &OsStr));
    bar::<PathBuf, _>(Cow::<'_, PathBuf>::Owned(PathBuf::from(".")));
    bar::<PathBuf, _>(Cow::<'_, PathBuf>::Borrowed(&PathBuf::from(".")));
    bar::<Path, _>(Cow::<'_, Path>::Borrowed(&PathBuf::from(".") as &Path));
    bar::<CurDir, _>(Cow::<'_, CurDir>::Owned(CurDir));
    bar::<CurDir, _>(Cow::<'_, CurDir>::Borrowed(&CurDir));
}

(Playground)


Unfortunately, type inference chokes on that.

I believe all this could be solved if there was a generic impl<T: ?Sized> AsRef<T> for T. Which cannot exist, unfortunately.
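As far as I understand it, the reflexive impl cannot exist because it would overlap with the impl that lifts AsRef over references. A sketch with a local copy of the trait (`MyAsRef` and `len_of` are hypothetical names, not std items):

```rust
// Local stand-in for `AsRef`, so that we are allowed to write blanket impls.
pub trait MyAsRef<T: ?Sized> {
    fn my_as_ref(&self) -> &T;
}

// Same shape as std's "As lifts over &" impl.
impl<T: ?Sized + MyAsRef<U>, U: ?Sized> MyAsRef<U> for &T {
    fn my_as_ref(&self) -> &U {
        <T as MyAsRef<U>>::my_as_ref(*self)
    }
}

// Adding the reflexive impl is expected to be rejected as a conflicting
// implementation: for `Self = &X` and target `&X`, both impls could apply
// if some `X: MyAsRef<&X>` existed, and coherence cannot rule that out.
//
// impl<T: ?Sized> MyAsRef<T> for T {
//     fn my_as_ref(&self) -> &T { self }
// }

impl MyAsRef<str> for String {
    fn my_as_ref(&self) -> &str {
        self.as_str()
    }
}

pub fn len_of(s: impl MyAsRef<str>) -> usize {
    let r: &str = s.my_as_ref();
    r.len()
}
```

The lifting impl on its own is fine: `len_of` accepts `String`, `&String`, `&&String`, and so on.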


Sorry for the incremental updates.

Actually using impl AsRef<Path> would be okay if there was impl<T: ?Sized + ToOwned + AsRef<U>, U: ?Sized> AsRef<U> for Cow<'_, T> in std, because impl AsRef<Path> for Path. But …

  • This doesn't hold in the general case of impl AsRef<T>.
  • We have no impl<T: ?Sized + ToOwned + AsRef<U>, U: ?Sized> AsRef<U> for Cow<'_, T> in std.

:crazy_face:

I filed an issue (#98905) on GitHub.

Either inference is going to choke on that, or the compiler will have to start making relatively arbitrary decisions in ambiguous situations (perhaps driven by "insider knowledge" of std, making std more magical / not something a Rust programmer could achieve themselves).

Why? Language design philosophy aside, there's no language-level guarantee that these do the same thing:

let os_str    = os_string.borrow(); let path =    os_str.as_ref();
let os_string = os_string.borrow(); let path = os_string.as_ref();

And even if OsString and OsStr became baked into the language (vs. std), so that case is known [1], I could have my own struct with a Borrow<OsStr> implementation that has side-effects, for example.
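A contrived sketch of such a struct: here the two routes to a `&Path` give different answers, so the compiler could not pick one silently. `Weird`, `via_borrow` and `direct` are made-up names:

```rust
use std::borrow::Borrow;
use std::ffi::OsStr;
use std::path::Path;

pub struct Weird;

// Nothing forces `Borrow<OsStr>` and `AsRef<Path>` to agree with each other.
impl Borrow<OsStr> for Weird {
    fn borrow(&self) -> &OsStr {
        OsStr::new("via-borrow")
    }
}

impl AsRef<Path> for Weird {
    fn as_ref(&self) -> &Path {
        Path::new("via-as-ref")
    }
}

// Route 1: borrow as `OsStr` first, then convert to `Path`.
pub fn via_borrow(w: &Weird) -> &Path {
    let os_str: &OsStr = w.borrow();
    os_str.as_ref()
}

// Route 2: convert directly.
pub fn direct(w: &Weird) -> &Path {
    w.as_ref()
}
```

Such an impl arguably violates the spirit of Borrow, but nothing at the language level prevents it.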

"Maximal reach" with generics tends to throw inference under the bus, because Rust doesn't like to make arbitrary decisions in ambiguous situations; the more wide-reaching your generics, the more chance for ambiguity. Incidentally, your playground does have one (I think just the one) unambiguous case:

    // works
    bar(CurDir);

Adding any Borrow implementation seems to break this [2], which shows there is room for inference improvement.

Any reference is going to be ambiguous due to the two blanket Borrow impls. [3]
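To spell out the ambiguity for references: with `B = &CurDir`, both `impl<T: ?Sized> Borrow<T> for T` and `impl<T: ?Sized> Borrow<T> for &T` yield a valid `P`. A sketch (this `bar` returns a `PathBuf` only so the result can be inspected; `demo` is a hypothetical helper):

```rust
use std::borrow::Borrow;
use std::path::{Path, PathBuf};

pub struct CurDir;

impl AsRef<Path> for CurDir {
    fn as_ref(&self) -> &Path {
        Path::new(".")
    }
}

pub fn bar<P: ?Sized + AsRef<Path>, B: Borrow<P>>(b: B) -> PathBuf {
    // Name `P` explicitly; a plain `b.borrow()` would itself be ambiguous.
    let borrowed: &P = Borrow::<P>::borrow(&b);
    let path: &Path = borrowed.as_ref();
    path.to_path_buf()
}

pub fn demo() -> (PathBuf, PathBuf) {
    // P = CurDir, using `impl Borrow<T> for &T`.
    let via_ref_impl = bar::<CurDir, _>(&CurDir);
    // P = &CurDir, using `impl Borrow<T> for T` plus AsRef lifting over `&`.
    let via_reflexive_impl = bar::<&CurDir, _>(&CurDir);
    // bar(&CurDir); // rejected: the compiler cannot choose between the two
    (via_ref_impl, via_reflexive_impl)
}
```

Both choices happen to give the same path here, but the compiler has no way of knowing that in general.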


  1. to not matter which implementation is chosen ↩︎

  2. i.e. even if the implementation is for something which is not AsRef<Path> ↩︎

  3. And if you think about it, accepting owned structs when (as per the API) you can only act on borrowed data is somewhat of an anti-pattern (or at least a speed bump for your users), so this knocks out most of the ergonomic utility of the function (as borrows are most frequently references). ↩︎


I was (intentionally) not very precise when I used the phrase "choke on that" because I had not fully thought this through. You are right that there are ambiguous "paths" (sorry for the pun) when going from OsString via .borrow() and .as_ref() to Path.

I totally agree, you are right. After reading your footnote, I would conclude that bar should rather be defined like that:

fn bar(_: &impl AsRef<Path>) {}

But remember #98905! We have no impl<T: ?Sized + ToOwned + AsRef<U>, U: ?Sized> AsRef<U> for Cow<'_, T> like we ought to have! Thus the "corrected" version of bar will fail:

use std::borrow::Cow;
use std::ffi::{OsString, OsStr};
use std::path::{Path, PathBuf};

#[derive(Clone)]
struct CurDir;

impl AsRef<Path> for CurDir {
    fn as_ref(&self) -> &Path {
        ".".as_ref()
    }
}

fn foo(_: impl AsRef<Path>) {}
fn bar(_: &impl AsRef<Path>) {}

fn main() {
    foo(OsString::from("."));
    foo(&OsString::from(".") as &OsString);
    foo(&OsString::from(".") as &OsStr);
    foo(PathBuf::from("."));
    foo(&PathBuf::from(".") as &PathBuf);
    foo(&PathBuf::from(".") as &Path);
    foo(CurDir);
    foo(&CurDir);
    // foo(Cow::<'_, OsString>::Owned(OsString::from(".")));
    // foo(Cow::<'_, OsString>::Borrowed(&OsString::from(".")));
    foo(Cow::<'_, OsStr>::Borrowed(&OsString::from(".") as &OsStr));
    // foo(Cow::<'_, PathBuf>::Owned(PathBuf::from(".")));
    // foo(Cow::<'_, PathBuf>::Borrowed(&PathBuf::from(".")));
    foo(Cow::<'_, Path>::Borrowed(&PathBuf::from(".") as &Path));
    // foo(Cow::<'_, CurDir>::Owned(CurDir));
    // foo(Cow::<'_, CurDir>::Borrowed(&CurDir));
    bar(&OsString::from("."));
    bar(&(&OsString::from(".") as &OsString));
    bar(&(&OsString::from(".") as &OsStr));
    bar(&PathBuf::from("."));
    bar(&&PathBuf::from("."));
    bar(&(&PathBuf::from(".") as &Path));
    bar(&CurDir);
    bar(&&CurDir);
    //bar(&Cow::<'_, OsString>::Owned(OsString::from(".")));
    //bar(&Cow::<'_, OsString>::Borrowed(&OsString::from(".")));
    bar(&Cow::<'_, OsStr>::Borrowed(&OsString::from(".") as &OsStr));
    //bar(&Cow::<'_, PathBuf>::Owned(PathBuf::from(".")));
    //bar(&Cow::<'_, PathBuf>::Borrowed(&PathBuf::from(".")));
    bar(&Cow::<'_, Path>::Borrowed(&PathBuf::from(".") as &Path));
    //bar(&Cow::<'_, CurDir>::Owned(CurDir));
    //bar(&Cow::<'_, CurDir>::Borrowed(&CurDir));
}

(Playground)

:frowning_face:

So …

… that could be fixed by using &impl AsRef<T>.

(edit: The struck-out text above was wrong.)

But still …

… will keep us from using the "right" solution.


Example: Using deref_owned::Owned (with the correct implementation of AsRef) doesn't exhibit this problem:

    bar(&Owned(OsString::from(".")));
    bar(&Owned(PathBuf::from(".")));
    bar(&Owned(CurDir));

(Playground)


Actually we can just use the original foo in that case:

fn foo(_: impl AsRef<Path>) {}

fn main() {
    foo(&OsString::from("."));
    foo(&(&OsString::from(".") as &OsString));
    foo(&(&OsString::from(".") as &OsStr));
    foo(&PathBuf::from("."));
    foo(&&PathBuf::from("."));
    foo(&(&PathBuf::from(".") as &Path));
    foo(&CurDir);
    foo(&&CurDir);
    foo(&Owned(OsString::from(".")));
    foo(&Owned(PathBuf::from(".")));
    foo(&Owned(CurDir));
}

(Playground)

Why is that? Because "As lifts over &". From std:

// As lifts over &
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_const_unstable(feature = "const_convert", issue = "88674")]
impl<T: ?Sized, U: ?Sized> const AsRef<U> for &T
where
    T: ~const AsRef<U>,
{
    #[inline]
    fn as_ref(&self) -> &U {
        <T as AsRef<U>>::as_ref(*self)
    }
}

(source)
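A small sketch of what that lifting buys the caller: any depth of `&` around an `AsRef<Path>` type is accepted. `file_name_of` is a hypothetical helper:

```rust
use std::path::Path;

pub fn file_name_of(p: impl AsRef<Path>) -> Option<String> {
    let path: &Path = p.as_ref();
    path.file_name().map(|n| n.to_string_lossy().into_owned())
}
```

For a `PathBuf` named `owned`, all of `file_name_of(&owned)`, `file_name_of(&&owned)` and `file_name_of(owned)` type-check, as does passing `&&&"dir/a.txt"`.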


Concluding: If #98905 were fixed, then using impl AsRef<Path> would be perfectly fine. But that was what I said earlier here:

I mostly agree, or rather,

// The addition of `?Sized` is the important part
fn bar<P: ?Sized + AsRef<Path>>(_: &P) {}

More on that below.

The practical fix is to dereference.
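A sketch of both points, assuming a `bar` that returns a `PathBuf` purely so the result can be checked (`demo` is a hypothetical helper):

```rust
use std::borrow::Cow;
use std::path::{Path, PathBuf};

// `?Sized` allows `P = str` and `P = Path`, not just sized types.
pub fn bar<P: ?Sized + AsRef<Path>>(p: &P) -> PathBuf {
    let path: &Path = p.as_ref();
    path.to_path_buf()
}

pub fn demo() -> Vec<PathBuf> {
    let cow: Cow<PathBuf> = Cow::Owned(PathBuf::from("a"));
    vec![
        bar("a"),            // P = str
        bar(Path::new("a")), // P = Path
        // bar(&cow) is rejected: Cow<'_, PathBuf> is not AsRef<Path>.
        bar(&*cow),          // P = PathBuf, after dereferencing the Cow
    ]
}
```

Without `?Sized`, the first two calls would need an extra `&` so that `P` becomes `&str` or `&Path`.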


Written before at least some of your comment edits, and mostly just a big long exploration for the sake of a thought experiment -- nothing useful pertaining to your problems or practical for the future of Rust. Feel free to ignore everything below.

As a mental exercise, it is interesting to imagine how things could have been implemented differently. AsRef has its blanket implementation for APIs like File::open, and AsRef<Path> in particular was a significant motivator. Could we have done something different and gotten AsRef<T> for T instead of the &-nesting it overlaps with? [1]

An aside about `Into`:

One interesting note in the RFC is the section Why the reference restrictions?. There we can see that the author explicitly wanted to avoid HRTBs [2] such as:

fn bar<P: ?Sized>(_: &P) where for<'any> &'any P: Into<&'any Path>

And later they say that AsRef implies Into. However, the blanket implementation making AsRef imply Into was later removed as it conflicts with Into being reflexive.

This theme of wanting to avoid complex signatures comes up again below. That said, I feel it's a worthy separation of concerns in this case anyway; I'd hate to have to write (&x).into() and hope I got it right.


Now, if we look at File::open and friends, we can see that they use the pattern I asserted was an anti-pattern:

pub fn open<P: AsRef<Path>>(path: P) -> Result<File>

Which I guess I should back up:

  • Open can't do anything but create the &Path so it has no need for ownership (of a PathBuf, say)
  • as_ref takes a reference as well, so you're not really losing much by requiring a reference as an input, just...
  • ...the ability to pass something owned into open [which] is something you usually don't want to do
    • Because then you, the caller, can't use it anymore [3]
    • A hint to use AsRef would help, but there is none in this case, so the unwary will probably .clone unnecessarily upon getting a "used after move" error
    • (Or a hint to just use & for that matter, given the &-nesting support)

And one more pertinent quality:

  • If you use this pattern, you need to support some level of &-nesting in the implementations, because you can't have P = Path (or any other unsized type) -- you need AsRef<Path> for &Path so you can pass in the P = &Path.
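A sketch of the speed bump, using a hypothetical `name_len` with the same signature shape as `File::open`:

```rust
use std::path::{Path, PathBuf};

// Same by-value `P: AsRef<Path>` shape as `File::open`.
pub fn name_len<P: AsRef<Path>>(path: P) -> usize {
    let path: &Path = path.as_ref();
    path.as_os_str().len()
}

pub fn demo() -> (usize, usize) {
    let owned = PathBuf::from("some/file.txt");
    // Passing a reference works thanks to the `&`-nesting impl: P = &PathBuf.
    let first = name_len(&owned);
    // Passing the value also compiles, but moves `owned` into the call.
    let second = name_len(owned);
    // name_len(&owned); // rejected: `owned` was moved on the previous line
    (first, second)
}
```

Note that `P = Path` is impossible here because `P` must be sized; `name_len(Path::new("x"))` only works because `P` is inferred as `&Path`.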

We can't change those patterns now, because those functions accept owned types directly. Bummer. If we had instead

pub fn open<P: ?Sized + AsRef<Path>>(path: &P) -> Result<File>

We, arguably at least, would not have needed the nested & implementation on AsRef. [4] As it turns out though, that impl was included from the start. And hey, if you read the RFCs closely, they actually did use this ?Sized and &P pattern in the Path reform discussion (though often forgetting to write ?Sized). So what gives?

PR 23316:

it's much more ergonomic to not deal with ?Sized at all and simply require an argument P instead of &P.

This change is aimed at removing unsightly ?Sized bounds while retaining the same level of usability as before.

So because it's unsightly :roll_eyes:, we lost the more correct (IMO) implementations forever :-1:. [5]

That said, again, the PR is not what prompted the &-nesting implementation, so further insight would have been needed to avoid it. And maybe it justifies its existence in other ways I haven't thought of. [6]

Well then, should we add more conversion traits and deprecate all the functions with the anti-pattern? No, the current situation works well enough in practice, flawed though it may be. A change like that would be massively disruptive to the ecosystem.


One more anti-pattern-esque thing I'll note which applies to both of

pub fn open<P: ?Sized + AsRef<Path>>(path: &P) -> Result<File>
pub fn open<P: AsRef<Path>>(path: P) -> Result<File>

They both potentially save you some typing to call over taking a &Path (no .as_ref()), but as a result you get a monomorphized version of the entire function for every type used. You can mitigate this by doing a one-line as_ref() call and then passing off to a non-generic private method that takes &Path, but I suspect most people copying std conventions don't do so. [7]
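The mitigation looks roughly like this; `component_count` is a made-up example function, not anything from std:

```rust
use std::path::Path;

// Thin generic shim: only this one-liner is monomorphized per caller type.
pub fn component_count<P: AsRef<Path>>(path: P) -> usize {
    component_count_inner(path.as_ref())
}

// The real work lives in a single non-generic function.
fn component_count_inner(path: &Path) -> usize {
    path.components().count()
}
```

Callers still get the convenience of passing `&str`, `String`, `PathBuf` and so on, while the body is compiled once.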

In summary, optimizing for writing is a bane upon programming. :wink: [8]


OK -- congrats on making it this far by the way -- I contemplated the above exploration in the context of your AsRef<T> as T thread after seeing your filed issue, because it reminded me about how the P: AsRef<Path> pattern is common but sub-optimal. In particular, I wasn't really taking the context of this thread into account; it ended up here by accident. [9]

But as a post-script, would having AsRef<T> for T have solved the supposed problem in this thread? Well, we can't have both of

impl<X: ?Sized> AsRef<X> for X { /* ... */ }
impl<U: ?Sized, T: ?Sized + ToOwned + AsRef<U>> AsRef<U> for Cow<'_, T> { /* ... */ }

because T might implement AsRef<Cow<'static, T>>, in which case they overlap. So without specialization, I don't think it would have really helped you out. Also, apparently, you found some arrangements of traits you do like which rely on the &-nesting (in a recent comment edit), so that's sort of amusing. It still must lose out on AsRef<T> for Cow<'_, T>.

Question though -- and again, sorry if this is just me not reading the thread thoroughly -- is there a practical reason you care, or is this all an "ideal design" exercise? I see a lot of types and calls in the example that just don't make a lot of sense to care highly about to me (though they can come up due to metaprogramming sort of easily I suppose). Having a Cow<'_, PathBuf> is like taking a &PathBuf instead of a &Path, or a &String instead of a &str, for example.
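To illustrate the comparison: a `Cow<'_, Path>` already stores a `PathBuf` in its owned variant, and it works with `impl AsRef<Path>` today. A sketch with hypothetical helper names:

```rust
use std::borrow::Cow;
use std::path::Path;

// Borrow when nothing needs to change, allocate a PathBuf only when needed.
pub fn with_default_ext(input: &Path) -> Cow<'_, Path> {
    if input.extension().is_some() {
        Cow::Borrowed(input)
    } else {
        Cow::Owned(input.with_extension("txt"))
    }
}

pub fn takes_path(p: impl AsRef<Path>) -> String {
    let path: &Path = p.as_ref();
    path.display().to_string()
}
```

Both `takes_path(cow)` and `takes_path(&cow)` compile for a `Cow<'_, Path>`, which is part of why `Cow<'_, PathBuf>` rarely seems worth the trouble.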

And particularly, in the context of the anti-pattern, in what situation do you own a Cow that you're okay throwing away in a call to something that wants AsRef<Path>? It can happen, but I think it's pretty rare, for the same reasons accepting PathBuf is a stumbling block -- if you have a (potentially) owned version, you're probably going to need it later. And you seemed okay entertaining the idea of avoiding the anti-pattern.

So if we discard that use-case [10], what are the remaining situations -- those where you don't want to give away the Cow? I think the top few would be

  • You own the cow and you're going to have to pass &cow or cow.as_ref() anyway so as to not give it away
    • So no great gain over having to do &*cow or the like
  • Your Cow is a field and you can't just pass self.cow either
    • So similar to the previous situation
  • You own cow_ref: &Cow<'_, T>
    • but that should probably be a &T instead
  • You own cow_mut: &mut Cow<'_, T>
    • but that should probably be a &mut <T as ToOwned>::Owned (or &T) instead

I think this isn't a bigger deal in the ecosystem because it just doesn't come up much, and when it does it's easy to work around. The most frequent requests do seem to be of the AsRef<Path> for Cow<'_, str> variety specifically. Maybe you're hitting it because you're trying to be super generic?


  1. I use "&-nesting" as a synonym for "As lifts over &". ↩︎

  2. and where clauses more generally ↩︎

  3. unless it's Copy ↩︎

  4. And the "oops I gave away ownership" speed bump would be removed, though you might need to explicitly deref where you don't today. ↩︎

  5. Ergonomics of the std writer shouldn't outweigh good design for the std consumer. ↩︎

  6. So I guess the takes-ownership pothole is the main fallout from that PR, combined with the fact that others (including myself) often copy std's patterns, so the anti-pattern spreads. ↩︎

  7. Related RFE. ↩︎

  8. Less cheekily, using generics for ergonomics is hard to get right, and also hard to change without breaking something. ↩︎

  9. And I've only skimmed this one, so pardon me if I retread old ground. ↩︎

  10. i.e. let's say you don't use the anti-pattern so you can't take owned values ↩︎
