A design problem with references vs lifetimed types

Here's another reason. Suppose I've got a type that looks like this:

struct S<A,B,C> {
  a: A,
  b: B,
  c: C,
}

Suppose there are a bunch of traits I'd like to use with S. The traits are TA,TB,TC,TAB,TAC,TBC. I could just impl each of these for S, but really

  • TA only depends on the a attribute,
  • TB only depends on the b attribute,
  • TC only depends on the c attribute,
  • TAB only depends on the a and b attributes,
  • TAC only depends on the a and c attributes, and
  • TBC only depends on the b and c attributes.

So impling each of them for S would actually lead to bloat, because now for instance the implementation of TAB is unnecessarily duplicated for each C. If the implementation of TAB is large and there are a lot of types C in use, this could be a really large amount of bloat.

For 3 of the traits there's an easy solution. I can just impl TA for A, impl TB for B, and impl TC for C, and then whenever I have an S, I can just do for instance s.a.some_function_in_ta().

But what about TAB? What type should impl it? I would like to be able to define a new type

struct Composite<'a, A, B> {
  a: &'a A,
  b: &'a B,
}

and indeed this can work if TAB doesn't use &mut self. I can just do s.produce_composite().some_function_in_tab().
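To make the idea concrete, here is a minimal sketch of such a borrowed view. The method name describe and the Debug bounds are invented for illustration; TAB here is only a stand-in for whatever the real trait would be. The point is that the impl is written once per (A, B) pair and C never appears in it.

```rust
// Hypothetical stand-in for the thread's TAB trait.
trait TAB {
    fn describe(&self) -> String;
}

struct S<A, B, C> {
    a: A,
    b: B,
    c: C,
}

// Borrowed view over just the `a` and `b` fields.
struct Composite<'s, A, B> {
    a: &'s A,
    b: &'s B,
}

impl<A, B, C> S<A, B, C> {
    // Cheap to build: two pointer copies, and `C` drops out of the type.
    fn produce_composite(&self) -> Composite<'_, A, B> {
        Composite { a: &self.a, b: &self.b }
    }
}

// One instantiation per (A, B) pair, shared by every choice of C.
impl<A: std::fmt::Debug, B: std::fmt::Debug> TAB for Composite<'_, A, B> {
    fn describe(&self) -> String {
        format!("a={:?}, b={:?}", self.a, self.b)
    }
}
```

Two S values that differ only in C go through the same Composite<u8, &str> code path when calling s.produce_composite().describe().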

If TAB does use &mut self, I could potentially instead define Composite like this:

struct CompositeMut<'a, A, B> {
  a: &'a mut A,
  b: &'a mut B,
}

but this only works when I either own the S or have a mutable reference to it. What if I only have a &S?

It's really surprising to me others don't discuss this issue; I bump up against this all the time.

You aren't alone here; even in the standard library you see loads of Iter<'a> and IterMut<'a> pairs.

It can be a pain, but I've found you don't run into it that often unless you are designing containers.

Normally I'd try to X-Y the problem by not having this sort of combinatorial explosion of traits/view structs/mutability in the first place.

In Go they have the proverb, "The bigger the interface, the weaker the abstraction". That may or may not be relevant to your use case, but it's what I first thought when I saw your TA,TB, ..., TBC example.

I'm not sure what the problem is here or what you are trying to solve and how you think it could be solved.

Mutable and immutable references are also two different types. You can't mutate stuff through an immutable reference, nor can you do that through a type which contains only immutable references. You have to separate mutating and pure code. Not doing so (and/or the language allowing you to violate this constraint) would lead to bugs or memory corruption; that is the reason it is forbidden.

This is as fundamental a difference as, e.g., the difference between two unrelated types T and U. In the general case, you surely wouldn't expect to get the implementation of U by writing T.


I think I've explained the problem pretty clearly. As @Michael-F-Bryan points out, this isn't only a problem when you have multiple references. Don't you think it's a bit silly that you see loads of duplication with distinct Iter and IterMut types even in a case where the two types are conceptually almost identical? Imagine a case where the implementation of Iter is large, and essentially identical to that of IterMut except for a few differences between & and &mut or something. It strikes me as really unfortunate to have that duplication, both in source and compiled code.

How it could be solved is either Rust somehow hoisting mutability of a reference to be an attribute of the object containing the reference, and allowing that object to contain references of any mutability status... or possibly by a bunch of silly hacks around the type system.

Here's a really simple illustration. Suppose I have to deal with two APIs. The first API provides me with a u32, and also sometimes a &S or a &mut S. I never get to own an S. The second API has a trait that I need to implement; it looks like this:

trait T {
  fn f(&self);
  fn g(&mut self);
}

I need to implement this trait using the data and functionality of the u32 and references to S provided by the first API. So the question is... what type should I make to implement this trait?

I can't use a type like this:

struct T0 {
  s: S,
  i: u32,
}

because I never own the S. I can't use a type like this:

struct T1<'a> {
  s: &'a S,
  i: u32,
}

because I won't be able to implement g (which needs a mutable S). I can't use a type like this

struct T2<'a> {
  s: &'a mut S,
  i: u32,
}

because sometimes I have a &S and still need an implementation of f.

The only solution that's not completely terrible that I can think of at the moment is this:

  • Implement both a T1 and a T2. The T1 implementation of g should just be unimplemented!() (but note that this makes calling <T1 as T>::g a runtime error when conceptually it should simply not be callable).

Or, if the interface to the u32 and the S that f and g need is straightforward, it might be possible to extract the data from them ahead of time, put it into a new type R, and impl T for R, but in many cases that won't be feasible.

In this case, f and g should be in different traits, so that their implementations can be separate:

trait TF {
    fn f(&self);
}

trait TG {
    fn g(&mut self);
}

trait T: TF + TG {}
impl<X: TF + TG> T for X {}

You can use generics to help reduce the amount of code repetition you’d need:

use std::borrow::{Borrow, BorrowMut};

struct AnnotatedRef<Ptr> { s: Ptr, i: u32 }
type T1<'a> = AnnotatedRef<&'a S>;
type T2<'a> = AnnotatedRef<&'a mut S>;

impl<Ptr: Borrow<S>> TF for AnnotatedRef<Ptr> {
    fn f(&self) {
        let s: &S = self.s.borrow();
        /* ... */
    }
}

impl<Ptr: BorrowMut<S>> TG for AnnotatedRef<Ptr> {
    fn g(&mut self) {
        let s: &mut S = self.s.borrow_mut();
        /* ... */
    }
}

I don't own f and g. They are part of the trait T, which is in someone else's API that I'm using.

There are plenty of traits which have both &self functions and &mut self functions; are they all broken?

If you are sometimes provided with a &T and sometimes a &mut T, but your interface always contains a function that takes &mut T, then you simply can't implement this interface, full stop. (Well, that is, without interior mutability primitives like RefCell.) If you control neither side, you'll have to ask one or both of the API maintainers to do something about it.
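For reference, a minimal sketch of what the RefCell escape hatch could look like. It assumes, hypothetically, that the first API could hand out a &RefCell<S> instead of a plain &S; the counter field, the Glue name and the u32 return type on f are all invented here so that the effect is observable.

```rust
use std::cell::RefCell;

// Stand-in for the third-party trait from the thread; the return type on `f`
// is added here only so the effect can be observed.
trait T {
    fn f(&self) -> u32;
    fn g(&mut self);
}

struct S {
    counter: u32, // hypothetical field
}

// Glue type: holds only a shared reference, yet can still mutate through it.
struct Glue<'a> {
    s: &'a RefCell<S>,
    i: u32,
}

impl T for Glue<'_> {
    fn f(&self) -> u32 {
        self.s.borrow().counter + self.i
    }

    fn g(&mut self) {
        // Checked at runtime: panics if another borrow is live.
        self.s.borrow_mut().counter += 1;
    }
}
```

The trade-off is that the exclusivity check moves from compile time to run time, so a conflicting borrow shows up as a panic rather than a type error.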

The fact that you arrived at a dummy implementation using unimplemented!() is, by the way, a telltale sign of trying to do something that you are not supposed to do.

Usually, a trait includes &self and &mut self methods because it is mutating in nature (i.e. meant to be used exclusively with mutable references), but does not need mutable access in some cases. For example, Iterator is pretty much useless unless you can mutate state in order to call next(&mut self), but size_hint(&self) doesn't need mutable access, so it takes a shared reference. So you can't usefully implement Iterator for something you can't mutate, but that's not a problem, because doing so wouldn't make sense anyway.


If you control neither side, you'll have to ask one or both of the API maintainers to do something about it.

Well... yeah. You understand sometimes this isn't an option though?

I mean I understand the intention of Rust's type system, I understand that a dummy implementation using unimplemented!() is ugly, etc, I understand the intent of & and &mut. I think maybe my listing this in the "help" category or something about the way I phrased this made it sound like I'm asking for elementary help with Rust.

What I'm saying is: I know Rust is preventing me from doing this, I know why it's doing it, but I think it's dumb and blocking me from something almost every mainstream language can do easily. It's completely reasonable to want to glue two APIs together in the way I described or to not want to duplicate code for a plain and a mut version of a type. So what I was looking for was not basic explanations, I was looking for a hacky way around the type system. Actually, I'll post a solution of mine to give you an idea of what I was after.

Here's the kind of thing I had in mind. You can create a struct R that implements T, but you can only use it inside a closure. As long as the implementations of use_r and use_r_mut are correct, this is safe for the closure h.

However needing to put this inside the closure is really silly.

EDIT: I should also clarify I'm not 100% sure this is right with the use of PhantomData and whatnot; it's just an idea I had, not something I've carefully thought through or tested.

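use std::marker::PhantomData;

struct R<'a> {
    // Ties R to the borrow of S it was created from.
    phantom: PhantomData<&'a mut S>,
    s: *mut S,
    i: u32,
}

impl<'a> T for R<'a> {
    fn f(&self) {
        // ... may only read through self.s
    }

    fn g(&mut self) {
        // ... may write through self.s
    }
}

fn use_r<'a, F: FnOnce(&R<'a>)>(s: &S, i: u32, h: F) {
    // The pointer is cast from a shared reference, so it must never be
    // written through; h only receives &R and therefore cannot call g.
    let r = R { phantom: PhantomData, s: s as *const S as *mut S, i };
    h(&r)
}

fn use_r_mut<'a, F: FnOnce(&mut R<'a>)>(s: &mut S, i: u32, h: F) {
    let mut r = R { phantom: PhantomData, s: s as *mut S, i };
    h(&mut r)
}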

If you want help doing something unusual, it’s best to explicitly signal that by calling out the “usual” solution and describing why it’s unsuitable for your use case. Especially when you formulate things in the abstract, it’s hard to tell the difference between someone who’s trying to understand the basics and someone who’s trying to handle a genuinely tricky edge case— on this forum, the assumption is generally the former.


It feels like this shouldn’t require unsafe; here’s what I came up with:

use std::ops::{Deref,DerefMut};

pub struct HoistMut<Ptr>(Ptr);

impl<Ptr> Deref for HoistMut<Ptr>
where Ptr: Proxy, Ptr::InnerPtr: Deref {
    type Target=Ptr;
    fn deref(&self)->&Ptr { &self.0 }
}

impl<Ptr> DerefMut for HoistMut<Ptr>
where Self:Deref<Target=Ptr>, Ptr: Proxy, Ptr::InnerPtr: DerefMut {
    fn deref_mut(&mut self)->&mut Ptr { &mut self.0 }
}

pub trait Proxy {
    type InnerPtr;
}

pub struct S;
pub struct R<Ptr> {
    s: Ptr,
    i: u32,
}

impl<Ptr:Deref> R<Ptr> {
    fn as_ref(&self)->R<&'_ Ptr::Target> {
        R { s:self.s.deref(), i: self.i }
    }

    pub fn new(s:Ptr, i: u32)->HoistMut<Self>
    where Ptr: Deref<Target=S> {
        HoistMut ( R { s, i } )
    }
}

impl<Ptr> Proxy for R<Ptr> {
    type InnerPtr=Ptr;
}

trait T {
    fn by_ref(&self);
    fn by_mut(&mut self);
}

impl<'a> T for R<&'a S> {
    fn by_ref(&self) { dbg!("&'a by_ref"); }
    fn by_mut(&mut self) {
        // module only provides HoistMut<R<_>> to outside
        unreachable!();
    }
}

impl<'a> T for R<&'a mut S> {
    fn by_ref(&self) { self.as_ref().by_ref() }
    fn by_mut(&mut self) { dbg!("&'a mut by_mut"); }
}

(Playground)


But they are not identical at all! Taking the Iter{,Mut} pair example, see how implementing Iter is trivial:

struct Iter<'slice> {
    slice: &'slice [u8],
    cursor: usize,
}

impl<'slice> Iter<'slice> {
    fn next (self: &'_ mut Iter<'slice>)
      -> Option<&'slice u8>
    {
        let ret = self.slice.get(self.cursor)?;
        self.cursor += 1;
        Some(ret)
    }
}

and yet when doing the "conceptually-identical" transposition…

  struct Iter<'slice> {
-     slice: &'slice     [u8],
+     slice: &'slice mut [u8],
      cursor: usize,
  }
  
  impl<'slice> Iter<'slice> {
      fn next (self: &'_ mut Iter<'slice>)
-       -> Option<&'slice     u8>
+       -> Option<&'slice mut u8>
      {
-         let ret = self.slice.get    (self.cursor)?;
+         let ret = self.slice.get_mut(self.cursor)?;
          self.cursor += 1;
          Some(ret)
      }
  }

…the code does not compile anymore.

This is because for the shared references case, if one "forgets" to put the self.cursor += 1 line, all they get is a logic error whereby the iterator is infinite and always yields the same element, but it won't lead to Undefined Behavior.

Whereas for the unique references case, if it compiled (e.g., by transmute-laundering the lifetime of the obtained reference), forgetting the self.cursor += 1 line would lead to unsound code.
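For contrast, here is one way the exclusive version can be written in safe code. This is a sketch, not how the standard library does it: instead of a cursor, it moves the slice out of self and keeps only the tail, so an element that has been handed out can never be reached again. The name IterMut is just the obvious counterpart to the Iter above.

```rust
struct IterMut<'slice> {
    slice: &'slice mut [u8],
}

impl<'slice> IterMut<'slice> {
    fn next(&mut self) -> Option<&'slice mut u8> {
        // Move the slice out of `self`, leaving an empty one behind, so the
        // full `'slice` lifetime is available rather than a reborrow of `self`.
        let slice = std::mem::take(&mut self.slice);
        let (first, rest) = slice.split_first_mut()?;
        // Keep only the untouched tail; `first` can never be yielded again.
        self.slice = rest;
        Some(first)
    }
}
```

Note how the "advance" step is no longer an optional bookkeeping line: if self.slice = rest is forgotten, the iterator simply ends early, rather than aliasing a &mut.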


:grimacing: in that situation you are pretty much screwed (pardon my language): the very point of APIs as abstraction boundaries, enforced by type-level shenanigans is to force a certain way of using it, and since that does not mean the API itself was well designed, it can definitely lead to unusable APIs :sweat_smile:.

So it seems to me that the root problem here, the one causing most grievance, is how ill-designed the APIs you are working with seem to be.

Given the "losing" starting position that you have had the misfortune to find yourself in, using runtime-panicking paths to express that the API can be used in more ways than it was intended (mainly, IIUC, the fact that the trait T can be used without calling g in some cases) is one of the main ways to circumvent overly strict type-level designs. In other words: using unimplemented!() may not be pretty, but it is actually the most sensible stand-alone way out you have at your disposal.

  • Basically, the API author of the trait T expressed that it was "paramount" for the behavior expressed in T to feature both f() and g() capabilities. Rust trusts that, and does not let you implement f() only, since "g() may be called". That is the case even if g() isn't! So using unimplemented!() (or unreachable!(), we are in between those two concepts) is a way to tell Rust: "hey, don't worry, I'm willing to bet the control flow of my program that g() is actually not called". And Rust is then like "Oh, if you are willing to sacrifice your control flow should you be wrong, then go ahead".

  • If we assume that the trait T should have been written as:

    trait TRef {
        fn f(&self);
    }
    trait TMut : TRef {
        fn g(&mut self);
    }
    use TMut as T;
    

    then, one way to fix the abstraction from your "weak" downstream user position, is to try and recreate that pattern (still using unimplemented! or unreachable! to soothe Rust) with your own custom trait and newtype wrapper:

    #[derive(::ref_cast::RefCast)]
    #[repr(transparent)]
    struct ImplTRef<X : TRef>(X);
    
    impl<X : TRef> T for ImplTRef<X> {
        fn f(&self) { self.0.f() }
    
        fn g(&mut self) { unimplemented!("No `&mut`s here") }
    }
    // where
    trait TRef {
        fn f(&self);
    }
    

    And you can then feed ImplTRef::ref_cast(your_ref) (where your_ref: &(impl TRef) is a shared reference to a type that implements TRef) to APIs that expect an &(impl T).

  • If you don't want to be that general, but are fine with using unimplemented! and just want to reduce the boilerplate, then go for that suggested solution instead.


What I'd personally do, however, since I find bad APIs unforgivable (see how much grievance this one is causing you), is to:

  1. fork the repo of the crate with the design issue,

  2. fix the issue (e.g., split T into two traits here),

  3. submit a PR so as to hopefully get the fix implemented upstream,

  4. use the patch section of the Cargo.toml file to use your fork in the meantime.

    • should the PR never be accepted, and should the issue be big enough (this API case seems to be one example), then publishing the fork as its own stand-alone crate would be the next logical step.

This way you don't have to use unimplemented!(), and you have potentially helped future users of that library avoid the issue altogether :slightly_smiling_face:


One thing that I should probably call out about my solution here is that the unreachable! is actually enforced by the privacy and type system— it should be sound to replace it with unsafe { unreachable_unchecked() } as there’s no way to get an &mut R<&S> instance.

But that seems unnecessary— if the function isn’t ever called, the compiler can skip the codegen all on its own without the help of unsafe.


How about this as an example. Imagine you are implementing a collection (e.g. BTreeMap) and now you need to implement immutable and mutable iterators.

The logic for retrieving successive buckets is going to be almost exactly identical, except your methods use & instead of &mut. BTreeMap is one of the most complex pieces of unsafe code in the standard library, so for the sake of maintainability you don't want to blindly copy-paste your Iter struct and replace all the &self's with &mut self to create your IterMut.

The point being made is that there must be a way to abstract over mutability so you don't need to duplicate code.

I can see why the OP isn't happy with this response. In a more flexible statically typed language like Java or C# you could probably work around this by casting to Object or using reflection to access private data/methods, but Rust's strong type system means that if someone has designed their API to not let you do certain things, then you'll need to move heaven and earth to make it happen anyway.

I wouldn't blame the language though. If a library has been designed to not let you do something then either there's a legitimate reason (in which case you've got a square peg and round hole scenario) or you need to submit a bug report/PR upstream so they can relax the constraints.


I'll be blunt about this. If you depend on something that doesn't fulfill your requirements, then do it yourself. In many cases, cloning the project and editing it to suit your own needs is a possibility.


Anyway, the problem you're trying to solve is to be able to generically define the mutability property of the reference. This would require being able to name partial types / type constructors. It's the same problem as wanting to accept a generic container type while deciding yourself what goes in the container, without leaking that implementation detail to the outside, yet still letting the user choose the type of container.

Example:

struct A<T<internal U>>
where T: …, U: …
{
    a: T<u8>,
}

This would enable some neat library designs, but only a few languages have a meta (meta) type system to support this. Rust is not one of them, and I don't know of any well-performing languages that do.

In Rust ownership and mutability have to be known precisely and statically. You can't abstract over them. Rust operates at lower level, without a GC, so Rust-specific designs need to tone down the level of abstraction. If you really want design patterns like in Java or Python where ownership and mutability don't matter, you'll have to use something like Arc<Mutex<T>> which is the equivalent of what other languages call a reference.
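A minimal sketch of that Arc<Mutex<T>> style, with made-up names (Handle, counter, read, bump): every clone of the handle can read and write, and mutation only needs &self, because exclusivity is enforced by the lock at run time instead of by the borrow checker.

```rust
use std::sync::{Arc, Mutex};

struct S {
    counter: u32, // hypothetical field
}

// A handle that can be cloned freely; every clone can both read and write.
#[derive(Clone)]
struct Handle {
    s: Arc<Mutex<S>>,
}

impl Handle {
    fn new(counter: u32) -> Self {
        Handle { s: Arc::new(Mutex::new(S { counter })) }
    }

    fn read(&self) -> u32 {
        self.s.lock().unwrap().counter
    }

    // Mutation needs only `&self`; exclusivity is enforced by the lock.
    fn bump(&self) {
        self.s.lock().unwrap().counter += 1;
    }
}
```

With this shape there is only one handle type, so the Iter/IterMut style of duplication goes away, at the cost of reference counting and locking.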

I really really strongly recommend avoiding putting temporary borrows in structs altogether until you're very proficient in Rust, and even then use them in a very limited fashion when it's proven to be unavoidable.

I rant for a bit
S
Composite
CompositeMut
A
B
C
TA
TB
TC
TAB
TAC
TBC
T0
T1
T2
R
T

These are all the names of types and traits you have used in your examples. Not a single one is a concrete noun that describes what the thing does or is.

This is not real code. This is an exploration of the language, which is all right in its own way, but you can't solve architectural problems by throwing toy examples at them because the architecture of the solution depends on the nature of the problem by definition.

Given any piece of code stripped of context, you can keep saying "but what if X, Y and Z?" until there are no possible solutions left. But that's not a productive line of questioning. I see the same thing happening with people who think Rust needs inheritance: they construct a toy problem such that the only possible solution is exactly inheritance, then say "How do you solve this in Rust?" And people in threads like this give some helpful suggestions for reframing the problem, or transform the code in such a way that it technically works but isn't very pretty, and the asker gets frustrated and says "Why can't Rust do this one simple thing?" when the reality is that the problem they're imagining doesn't exist in real life. People solve vastly complicated problems in Rust without needing inheritance. (Serde is a great example.)

I'm not saying you've completely imagined your problem. What I am saying is that even if you have a real problem, it's not evidenced in the alphabet soup of toy examples in the thread so far. If you have real world code that you think isn't well designed because you can't abstract over mutability, post that and we'll see if there's a way to organize it better.

I will agree that the inability to abstract over mutability is occasionally a pain point. However, in all the cases I am currently thinking of, the commonality between the mut and non-mut versions is purely on the syntactic level. Abstracting common syntax, when the types are not held in common, is just what declarative macros are great for. Maybe this is the kind of problem that actually should just be solved with a macro.
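As a rough illustration of the macro route, here is a sketch that generates the shared and the exclusive accessor from a single description. The Pair struct and the accessors! macro are invented for this example; real cases would need more care (generics, visibility, docs).

```rust
struct Pair {
    left: u32,
    right: u32,
}

// Generates a shared and an exclusive accessor from one description.
macro_rules! accessors {
    ($ty:ty { $($get:ident, $get_mut:ident => $field:ident : $fty:ty;)* }) => {
        impl $ty {
            $(
                fn $get(&self) -> &$fty { &self.$field }
                fn $get_mut(&mut self) -> &mut $fty { &mut self.$field }
            )*
        }
    };
}

accessors!(Pair {
    left, left_mut => left: u32;
    right, right_mut => right: u32;
});
```

The two generated methods are still distinct functions as far as the type checker is concerned; only the source text is shared, which matches the "commonality is purely syntactic" observation.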


That is a fair point, however, as @kornel said, abstracting over mutability is not quite possible in Rust, due to them (&T and &mut T) being very different beasts from the borrow checker's PoV.

In the case of BTreeMap and most other collections, I believe the iterator duplication problem could plausibly be solved because they are already using raw pointers and unsafe all around to begin with. So I could imagine that it's only a matter of re-using one common implementation and casting between *const T and *mut T (or vice versa) after having ensured the required invariants.

(I'm not sure if this is actually how it is done, though.)


I'll just re-state that example, since it seems to have been overlooked: there can be such a gigantic semantic gap between the looser pre-condition and looser post-condition of a shared input → shared output w.r.t. exclusive input → exclusive output, that trying to merge both implementations as one will be over-restrictive at best (if people start from the exclusive case, and then s/&mut/&/g-"copy-paste" the implementation to the shared case), and error/UB-prone at worst (if people start the other way around).

So even if cumbersome, Rust has made the right choice, here.

  • For very trivial cases, such as field accessors, macros are the go-to tool to reduce the boilerplate.