A design problem with references vs lifetimed types

mrb · October 22, 2020, 12:34am

In some ways a type with a lifetime is like a reference, but in one major irritating way they are different:

I can just implement a struct once and use both immutable and mutable references to it, but for a type with a lifetime, I have to implement two types to use it as both mutable and immutable.

Let me illustrate. If I have the following struct, I can easily implement the interface on it I want:

struct S {
    // ...
}

impl S {
    fn f(s: &S) {
        // ...
    }

    fn mut_f(s: &mut S) {
    }
}

But suppose I realize that actually, my interface needs other components, so the type it should be operating on is not just an S. So I create a new type

struct Composite<'a> {
    s: &'a S,
    integer: &'a i32,
}

impl<'a> Composite<'a> {
    fn f(self) {
        // ...
    }
}

But it's impossible to also implement mut_f on Composite<'a>; I need a whole other type:

struct CompositeMut<'a> {
    s: &'a mut S,
    integer: &'a mut i32,
}

impl<'a> CompositeMut<'a> {
    fn mut_f(self) {
        // ...
    }
}

It's really silly to have to split up my one type into two in this way. This is a recurring problem for me, but I haven't seen much discussion of it in the Rust community. Does anyone else run into this problem? Does anyone have a solution? I have some ideas for getting around the type system, but they feel a little sloppy to me.

Hyeonu · October 22, 2020, 2:30am

The problem is that you're storing references within the composite type. In most cases a composite type owns the values it is composed of rather than borrowing them.

struct Composite {
    s: S,
    integer: i32,
}

impl Composite {
    fn f(&self) {
        // ...
    }

    fn mut_f(&mut self) {
        // ...
    }
}

mrb · October 22, 2020, 3:11am

Of course I understand the composite type could own the objects. But there are a whole host of reasons why that's not always what one wants. An obvious reason is maybe I'm dealing with an API that just gives me references, and I can't own the objects.

mrb · October 22, 2020, 3:46am

Here's another reason. Suppose I've got a type that looks like this:

struct S<A,B,C> {
  a: A,
  b: B,
  c: C,
}

Suppose there are a bunch of traits I'd like to use with S. The traits are TA,TB,TC,TAB,TAC,TBC. I could just impl each of these for S, but really

TA only depends on the a attribute,
TB only depends on the b attribute,
TC only depends on the c attribute,
TAB only depends on the a and b attributes,
TAC only depends on the a and c attributes, and
TBC only depends on the b and c attributes.

So impling each of them for S would actually lead to bloat, because now for instance the implementation of TAB is unnecessarily duplicated for each C. If the implementation of TAB is large and there are a lot of types C in use, this could be a really large amount of bloat.

For 3 of the traits there's an easy solution. I can just impl TA for A, impl TB for B, and impl TC for C, and then whenever I have an S, I can just do for instance s.a.some_function_in_ta().

But what about TAB? What type should impl it? I would like to be able to define a new type

struct Composite<'a, A, B> {
  a: &'a A,
  b: &'a B,
}

and indeed this can work if TAB doesn't use &mut self. I can just do s.produce_composite().some_function_in_tab().

If TAB does use &mut self, I could potentially instead define Composite like this:

struct CompositeMut<'a, A, B> {
  a: &'a mut A,
  b: &'a mut B,
}

but this only works when I either own the S or have a mutable reference to it. What if I only have a &S?
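The view-struct idea described above can be sketched as follows. This is only an illustration under assumptions: the traits Tab and TabMut, their methods, and produce_composite_mut are hypothetical names (only produce_composite appears in the post), and concrete field types are picked just to keep the example short.

```rust
// Stand-in for the three-field type from the post.
#[allow(dead_code)]
struct S<A, B, C> {
    a: A,
    b: B,
    c: C,
}

// Shared view over `a` and `b` only; it does not mention `C` at all.
struct Composite<'a, A, B> {
    a: &'a A,
    b: &'a B,
}

// Mutable counterpart; it has to be a separate type.
struct CompositeMut<'a, A, B> {
    a: &'a mut A,
    b: &'a mut B,
}

impl<A, B, C> S<A, B, C> {
    fn produce_composite(&self) -> Composite<'_, A, B> {
        Composite { a: &self.a, b: &self.b }
    }

    // Hypothetical mutable variant; needs `&mut S`, so it is unavailable with only `&S`.
    fn produce_composite_mut(&mut self) -> CompositeMut<'_, A, B> {
        CompositeMut { a: &mut self.a, b: &mut self.b }
    }
}

// Hypothetical traits standing in for the read-only and mutating parts of TAB.
trait Tab {
    fn total(&self) -> i64;
}

trait TabMut {
    fn negate(&mut self);
}

// Implemented once for the view, so the code is not duplicated per `C`.
impl Tab for Composite<'_, i32, i64> {
    fn total(&self) -> i64 {
        i64::from(*self.a) + *self.b
    }
}

impl TabMut for CompositeMut<'_, i32, i64> {
    fn negate(&mut self) {
        *self.a = -*self.a;
        *self.b = -*self.b;
    }
}
```

With an owned value one would call s.produce_composite().total() or s.produce_composite_mut().negate(); the second call is exactly the one that cannot be made from a &S.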

It's really surprising to me others don't discuss this issue; I bump up against this all the time.

Michael-F-Bryan · October 22, 2020, 4:38am

You aren't alone here, even in the standard library you see loads of Iter<'a> and IterMut<'a> pairs.

It can be a pain, but I've found you don't run into it that often unless you are designing containers.

Normally I'd try to X-Y the problem by not having this sort of combinatorial explosion of traits/view structs/mutability in the first place.

In Go they have the proverb, "The bigger the interface, the weaker the abstraction". That may or may not be relevant to your use case, but it's what I first thought when I saw your TA,TB, ..., TBC example.

H2CO3 · October 22, 2020, 5:08am

I'm not sure what the problem is here or what you are trying to solve and how you think it could be solved.

Mutable and immutable references are also two different types. You can't mutate stuff through an immutable reference, nor can you do that through a type which contains only immutable references. You have to separate mutating and pure code. Not doing so (and/or the language allowing you to violate this constraint) would lead to bugs or memory corruption; that is the reason it is forbidden.

This is as fundamental a difference as, e.g., the difference between two unrelated types T and U. In the general case, you surely wouldn't expect to get the implementation of U by writing T.

mrb · October 22, 2020, 5:44am

I think I've explained the problem pretty clearly. As @Michael-F-Bryan points out, this isn't only a problem when you have multiple references. Don't you think it's a bit silly that you see loads of duplication with distinct Iter and IterMut types even in a case where the two types are conceptually almost identical? Imagine a case where the implementation of Iter is large, and essentially identical to that of IterMut except for a few differences between & and &mut or something. It strikes me as really unfortunate to have that duplication, both in source and compiled code.

How it could be solved is either Rust somehow hoisting mutability of a reference to be an attribute of the object containing the reference, and allowing that object to contain references of any mutability status... or possibly by a bunch of silly hacks around the type system.

mrb · October 22, 2020, 6:07am

Here's a really simple illustration. Suppose I have to deal with two APIs. The first API provides me with a u32, and also sometimes a &S or a &mut S. I never get to own an S. The second API has a trait that I need to implement; it looks like this:

trait T {
  fn f(&self);
  fn g(&mut self);
}

I need to implement this trait using the data and functionality of the u32 and references to S provided by the first API. So the question is... what type should I make to implement this trait?

I can't use a type like this:

struct T0 {
  s: S,
  i: u32,
}

because I never own the S. I can't use a type like this:

struct T1<'a> {
  s: &'a S,
  i: u32,
}

because I won't be able to implement g (which needs a mutable S). I can't use a type like this

struct T2<'a> {
  s: &'a mut S,
  i: u32,
}

because sometimes I have a &S and still need an implementation of f.

The only solution that's not completely terrible that I can think of at the moment is this:

Implement both a T1 and a T2. The T1 implementation of g should just be unimplemented!() (but note that this makes calling <T1 as T>::g a runtime error when conceptually it should just not be callable).

Or, if the conceptual interface needed with the u32 and the S to implement f and g is straightforward, it might be possible to get data from them ahead of time and shove that into a new type R and impl T for R or something, but in many cases that won't be feasible.

2e71828 · October 22, 2020, 6:20am

In this case, f and g should be in different traits, so that their implementations can be separate:

trait TF {
    fn f(&self);
}

trait TG {
    fn g(&mut self);
}

trait T: TF + TG {}
// Blanket impl: anything that implements both halves is automatically a T.
impl<X: TF + TG> T for X {}

You can use generics to help reduce the amount of code repetition you’d need:

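use std::borrow::{Borrow, BorrowMut};

struct AnnotatedRef<Ptr> { s: Ptr, i: u32 }
type T1<'a> = AnnotatedRef<&'a S>;
type T2<'a> = AnnotatedRef<&'a mut S>;

// Available for both T1 and T2: &S and &mut S both implement Borrow<S>.
impl<Ptr: Borrow<S>> TF for AnnotatedRef<Ptr> {
    fn f(&self) {
        let s: &S = self.s.borrow();
        /* ... */
    }
}

// Available for T2 only: &S does not implement BorrowMut<S>.
impl<Ptr: BorrowMut<S>> TG for AnnotatedRef<Ptr> {
    fn g(&mut self) {
        let s: &mut S = self.s.borrow_mut();
        /* ... */
    }
}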

mrb · October 22, 2020, 6:25am

I don't own f and g. They are part of the trait T, which is in someone else's API that I'm using.

There are plenty of traits which have both &self functions and &mut self functions; are they all broken?

H2CO3 · October 22, 2020, 6:32am

If you are sometimes provided with a &T and sometimes a &mut T, but your interface always contains a function that takes &mut T, then you simply can't implement this interface, full stop. (Well, that is, without interior mutability primitives like RefCell.) If you control neither side, you'll have to ask one or both of the API maintainers to do something about it.

The fact that you arrived at a dummy implementation using unimplemented!() is, by the way, a telltale sign of trying to do something that you are not supposed to do.

Usually, a trait includes &self and &mut self methods because it is mutating in nature (i.e. meant to be used exclusively with mutable references), but does not need mutable access in some cases. For example, Iterator is pretty much useless unless you can mutate state in order to call next(&mut self), but size_hint(&self) doesn't need mutable access, so it takes an immutable pointer. Yet, you can't implement Iterator for something you can't mutate, but that's not a problem because it doesn't make sense, either.
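For reference, the interior-mutability escape hatch mentioned above (RefCell) could look roughly like the sketch below. It rests on an assumption that does not hold in the original poster's scenario as described: whoever owns the S must be willing to keep it inside a RefCell and hand out &RefCell<S>. The struct Shared and the count field are hypothetical; T mirrors the trait from the thread.

```rust
use std::cell::RefCell;

// Hypothetical payload type with a single field to mutate.
struct S {
    count: u32,
}

// Mirrors the third-party trait from the thread.
trait T {
    fn f(&self);
    fn g(&mut self);
}

// Holds only a shared reference, yet can still implement `g`.
struct Shared<'a> {
    s: &'a RefCell<S>,
    i: u32,
}

impl<'a> T for Shared<'a> {
    fn f(&self) {
        // Shared borrow, checked at runtime.
        let s = self.s.borrow();
        println!("count = {}, i = {}", s.count, self.i);
    }

    fn g(&mut self) {
        // Exclusive borrow, checked at runtime; panics if `s` is
        // already borrowed somewhere else at this moment.
        self.s.borrow_mut().count += self.i;
    }
}
```

The borrow check moves from compile time to run time, so a conflicting borrow shows up as a panic instead of a compile error.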

mrb · October 22, 2020, 6:42am

If you control neither side, you'll have to ask one or both of the API maintainers to do something about it.

Well... yeah. You understand sometimes this isn't an option though?

I mean I understand the intention of Rust's type system, I understand that a dummy implementation using unimplemented!() is ugly, etc, I understand the intent of & and &mut. I think maybe my listing this in the "help" category or something about the way I phrased this made it sound like I'm asking for elementary help with Rust.

What I'm saying is: I know Rust is preventing me from doing this, I know why it's doing it, but I think it's dumb and blocking me from something almost every mainstream language can do easily. It's completely reasonable to want to glue two APIs together in the way I described or to not want to duplicate code for a plain and a mut version of a type. So what I was looking for was not basic explanations, I was looking for a hacky way around the type system. Actually, I'll post a solution of mine to give you an idea of what I was after.

mrb · October 22, 2020, 6:49am

Here's the kind of thing I had in mind. You can create a struct R that implements T, but you can only use it inside a closure. As long as the implementations of use_r and use_r_mut are correct, this is safe for the closure h.

However needing to put this inside the closure is really silly.

EDIT: I should also clarify I'm not 100% sure this is right with the use of PhantomData and whatnot; it's just an idea I had, not something I've carefully thought through or tested.

use std::marker::PhantomData;

struct R<'a> {
    phantom: PhantomData<&'a mut S>,
    s: *mut S,
    i: i32,
}

impl<'a> T for R<'a> {
    fn f(&self) {
        // ...
    }

    fn g(&mut self) {
        // ...
    }
}

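fn use_r<'a, F: FnOnce(&R<'a>)>(s: &'a S, i: i32, h: F) {
    // Casts away constness. Assumption: nothing reachable through `&R`
    // (i.e. `f`) ever writes through `r.s`; otherwise this is unsound.
    let r = R { phantom: PhantomData, s: s as *const S as *mut S, i };
    h(&r)
}

fn use_r_mut<'a, F: FnOnce(&mut R<'a>)>(s: &'a mut S, i: i32, h: F) {
    // Here the pointer really does come from a unique reference.
    let mut r = R { phantom: PhantomData, s: s as *mut S, i };
    h(&mut r)
}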

2e71828 · October 22, 2020, 6:50am

If you want help doing something unusual, it’s best to explicitly signal that by calling out the “usual” solution and describing why it’s unsuitable for your use case. Especially when you formulate things in the abstract, it’s hard to tell the difference between someone who’s trying to understand the basics and someone who’s trying to handle a genuinely tricky edge case— on this forum, the assumption is generally the former.

2e71828 · October 22, 2020, 8:04am

It feels like this shouldn’t require unsafe; here’s what I came up with:

use std::ops::{Deref,DerefMut};

pub struct HoistMut<Ptr>(Ptr);

impl<Ptr> Deref for HoistMut<Ptr>
where Ptr: Proxy, Ptr::InnerPtr: Deref {
    type Target=Ptr;
    fn deref(&self)->&Ptr { &self.0 }
}

impl<Ptr> DerefMut for HoistMut<Ptr>
where Self:Deref<Target=Ptr>, Ptr: Proxy, Ptr::InnerPtr: DerefMut {
    fn deref_mut(&mut self)->&mut Ptr { &mut self.0 }
}

pub trait Proxy {
    type InnerPtr;
}

pub struct S;
pub struct R<Ptr> {
    s: Ptr,
    i: u32,
}

impl<Ptr:Deref> R<Ptr> {
    fn as_ref(&self)->R<&'_ Ptr::Target> {
        R { s:self.s.deref(), i: self.i }
    }

    pub fn new(s:Ptr, i: u32)->HoistMut<Self>
    where Ptr: Deref<Target=S> {
        HoistMut ( R { s, i } )
    }
}

impl<Ptr> Proxy for R<Ptr> {
    type InnerPtr=Ptr;
}

trait T {
    fn by_ref(&self);
    fn by_mut(&mut self);
}

impl<'a> T for R<&'a S> {
    fn by_ref(&self) { dbg!("&'a by_ref"); }
    fn by_mut(&mut self) {
        // module only provides HoistMut<R<_>> to outside
        unreachable!();
    }
}

impl<'a> T for R<&'a mut S> {
    fn by_ref(&self) { self.as_ref().by_ref() }
    fn by_mut(&mut self) { dbg!("&'a mut by_mut"); }
}

(Playground)

Yandros · October 22, 2020, 10:10am

Oh, but they are not identical at all! Taking the Iter{,Mut} pair example, see how implementing Iter is trivial:

struct Iter<'slice> {
    slice: &'slice [u8],
    cursor: usize,
}

impl<'slice> Iter<'slice> {
    fn next (self: &'_ mut Iter<'slice>)
      -> Option<&'slice u8>
    {
        let ret = self.slice.get(self.cursor)?;
        self.cursor += 1;
        Some(ret)
    }
}

and yet when doing the "conceptually-identical" transposition…

  struct Iter<'slice> {
-     slice: &'slice     [u8],
+     slice: &'slice mut [u8],
      cursor: usize,
  }
  
  impl<'slice> Iter<'slice> {
      fn next (self: &'_ mut Iter<'slice>)
-       -> Option<&'slice     u8>
+       -> Option<&'slice mut u8>
      {
-         let ret = self.slice.get    (self.cursor)?;
+         let ret = self.slice.get_mut(self.cursor)?;
          self.cursor += 1;
          Some(ret)
      }
  }

…the code does not compile anymore.

This is because for the shared references case, if one "forgets" to put the self.cursor += 1 line, all they get is a logic error whereby the iterator is infinite and always yields the same element, but it won't lead to Undefined Behavior.

Whereas for the unique references case, if it compiled (e.g., by transmute-laundering the lifetime of the obtained reference), forgetting the self.cursor += 1 line would lead to unsound code.
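For comparison, here is a minimal sketch of one way the mutable variant can be written in safe code. It is not the standard library's implementation; the name IterMut and the decision to drop the cursor field are choices made for this illustration. The idea is to move the slice out of self and split it, so each yielded element is provably disjoint from whatever remains.

```rust
use std::mem;

struct IterMut<'slice> {
    slice: &'slice mut [u8],
}

impl<'slice> Iterator for IterMut<'slice> {
    type Item = &'slice mut u8;

    fn next(&mut self) -> Option<&'slice mut u8> {
        // Move the slice out of `self`, leaving an empty slice behind, so we
        // hold the full `'slice` borrow rather than a reborrow through `self`.
        let slice = mem::take(&mut self.slice);
        // Split into the first element and the rest; the two halves are disjoint.
        let (first, rest) = slice.split_first_mut()?;
        // Advancing is no longer optional: the yielded element is gone from `self`.
        self.slice = rest;
        Some(first)
    }
}
```

Unlike the cursor version, there is no line that can be forgotten: the iterator gives up its access to an element in the same step that hands the element out.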

In that situation you are pretty much screwed (pardon my language): the very point of APIs as abstraction boundaries, enforced by type-level shenanigans, is to force a certain way of using them, and since that does not mean the API itself was well designed, it can definitely lead to unusable APIs.

So it seems to me that the root problem here, the one causing most grievance, is how ill-designed the APIs you are working with seem to be.

Given the "losing" starting position that you have had the misfortune to find yourself in, using runtime-panicking paths to express that the API can be used in more ways than it was intended (mainly, IIUC, the fact that the trait T can be used without calling g in some cases) is one of the main ways to circumvent overly strict type-level designs. In other words: using unimplemented!() may not be pretty, but is actually the most sensible stand-alone out you have at your disposal.

Basically, the API author of the trait T expressed that it was "paramount" for the behavior expressed in T to feature both f() and g() capabilities. Rust trusts that, and does not let you implement f() only, since "g() may be called". That is the case even if g() isn't! So using unimplemented!() (or unreachable!(), we are in between those two concepts) is a way to tell Rust: "hey, don't worry, I'm willing to bet the control flow of my program that g() is actually not called". And Rust is then like "Oh, if you are willing to sacrifice your control flow should you be wrong, then go ahead".

If we assume that the trait T should have been written as:
```
trait TRef {
    fn f(&self);
}
trait TMut : TRef {
    fn g(&mut self);
}
use TMut as T; //
```
- Examples: Index{,Mut}, Borrow{,Mut}, Deref{,Mut};
- Or the unordered version, where TRef and TMut are independent of each other, and T is just an alias for having both, as @2e71828 showcased, with As{Ref,Mut} as the standard library example.

then, one way to fix the abstraction from your "weak" downstream user position, is to try and recreate that pattern (still using unimplemented! or unreachable! to soothe Rust) with your own custom trait and newtype wrapper:
```
#[derive(::ref_cast::RefCast)]
#[repr(transparent)]
struct ImplTRef<X : TRef>(X);

impl<X : TRef> T for ImplTRef<X> {
    fn f(&self) { self.0.f() }

    fn g(&mut self) { unimplemented!("No `&mut`s here") }
}
// where
trait TRef {
    fn f(&self);
}
```
And you can then feed ImplTRef::ref_cast(your_ref) (where your_ref: &(impl TRef) is a shared reference to a type that implements TRef) to APIs that expect an &(impl T).
If you don't want to be that general, but are fine with using unimplemented! and just want to reduce the boilerplate, then go for that suggested solution instead.

What I'd personally do, however, since I find bad APIs unforgivable (see how much grievance this one is causing you), is to:

- fork the repo of the crate with the design issue,
- fix the issue (e.g., split T into two traits here),
- submit a PR so as to hopefully get the fix implemented upstream,
- use the patch section of the Cargo.toml file to use your fork in the meantime.
  - should the PR never be accepted, and should the issue be big enough (this API case seems to be one example), then publishing the fork as its own stand-alone crate would be the next logical step.

This way you don't have to use unimplemented!(), and you have potentially helped future users of that library avoid the issue altogether.

2e71828 · October 22, 2020, 10:29am

One thing that I should probably call out about my solution here is that the unreachable! is actually enforced by the privacy and type system— it should be sound to replace it with unsafe { unreachable_unchecked() } as there’s no way to get an &mut R<&S> instance.

But that seems unnecessary— if the function isn’t ever called, the compiler can skip the codegen all on its own without the help of unsafe.

Michael-F-Bryan · October 22, 2020, 10:49am

How about this as an example. Imagine you are implementing a collection (e.g. BTreeMap) and now you need to implement immutable and mutable iterators.

The logic for retrieving successive buckets is going to be almost exactly identical, except your methods use & instead of &mut. BTreeMap is one of the most complex pieces of unsafe code in the standard library, so for the sake of maintainability you don't want to blindly copy-paste your Iter struct and replace all the &self's with &mut self to create your IterMut.

The point being made is that there must be a way to abstract over mutability so you don't need to duplicate code.
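One workaround for the duplication described above is to generate both variants from a single macro_rules! template. The sketch below is an assumption-laden illustration, not something proposed in the thread: the macro cursor!, the types Cursor and CursorMut, and their methods are all hypothetical names.

```rust
// One template, instantiated twice: once without `mut`, once with it.
macro_rules! cursor {
    ($name:ident, $get:ident, $($mutability:tt)?) => {
        struct $name<'a> {
            items: &'a $($mutability)? [u32],
            pos: usize,
        }

        impl<'a> $name<'a> {
            // Expands to `&self -> Option<&u32>` or `&mut self -> Option<&mut u32>`.
            fn current(& $($mutability)? self) -> Option<& $($mutability)? u32> {
                self.items.$get(self.pos)
            }

            // Shared logic is written once and copied into both types.
            fn advance(&mut self) {
                self.pos += 1;
            }
        }
    };
}

cursor!(Cursor, get,);
cursor!(CursorMut, get_mut, mut);
```

This removes the duplication from the source only. The compiler still sees two unrelated types, so the compiled code is duplicated just as before, and the macro cannot help where the two variants genuinely differ in their borrow structure.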

I can see why the OP isn't happy with this response. In a more flexible statically typed language like Java or C# you could probably work around this by casting to Object or using reflection to access private data/methods, but Rust's strong type system means that if someone has designed their API to not let you do certain things, then you'll need to move heaven and earth to make it happen anyway.

I wouldn't blame the language though. If a library has been designed to not let you do something then either there's a legitimate reason (in which case you've got a square peg and round hole scenario) or you need to submit a bug report/PR upstream so they can relax the constraints.

Phlopsi · October 22, 2020, 10:50am

I'll be blunt about this. If you depend on something that doesn't fulfill your requirements, then do it yourself. In many cases, cloning the project and editing it to suit your own needs is a possibility.

Anyway, the problem you're trying to solve is to be able to generically define the mutability property of the reference. This would require being able to name partial types / type constructors. It's the same problem, if you want to accept a generic container type and you want to decide what's in the container and not leak that implementation detail to the outside, but let the user choose the type of container.

Example:

struct A<T<internal U>>
where T: …, U: …
{
    a: T<u8>,
}

This would enable some neat library designs, but only a few languages have a meta (meta) type system to support this. Rust is not one of them, and I don't know of any well-performing languages that do.

kornel · October 22, 2020, 12:02pm

In Rust ownership and mutability have to be known precisely and statically. You can't abstract over them. Rust operates at lower level, without a GC, so Rust-specific designs need to tone down the level of abstraction. If you really want design patterns like in Java or Python where ownership and mutability don't matter, you'll have to use something like Arc<Mutex<T>> which is the equivalent of what other languages call a reference.
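A minimal sketch of the Arc<Mutex<T>> approach mentioned above, assuming the code is free to own S through a shared handle (which the original poster says is not always the case). Handle, read and bump are hypothetical names chosen for this example.

```rust
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct S {
    count: u32,
}

// Owns a shared, lockable handle to `S` instead of borrowing it,
// so the struct needs no lifetime parameter.
struct Handle {
    s: Arc<Mutex<S>>,
    i: u32,
}

impl Handle {
    // Reading only needs `&self`.
    fn read(&self) -> u32 {
        self.s.lock().unwrap().count
    }

    // Mutating also only needs `&self`; exclusivity is enforced by the lock.
    fn bump(&self) {
        self.s.lock().unwrap().count += self.i;
    }
}
```

Two handles created with Arc::clone on the same inner value observe each other's changes, which is the reference-like behaviour of other languages, paid for with a reference count and a lock on every access.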

I really really strongly recommend avoiding putting temporary borrows in structs altogether until you're very proficient in Rust, and even then use them in a very limited fashion when it's proven to be unavoidable.
