Generalized references and unsized coercion

For some reason (related to this and that), I require to be generic over reference-like types. I have something like this (with some extra bounds that I omit for the sake of simplicity for this example):

trait GenericRef<'a, T: ?Sized + 'a>: Sized + Copy {}
impl<'a, T: ?Sized> GenericRef<'a, T> for &'a T {}

Now I refactored some functions in my code which previously took a &'a T to take an R: GenericRef<'a, T> as argument, instead:

fn takes_ref_to_int_slice(_reference: &[i32]) {}
fn takes_generic_ref_to_int_slice<'a, R>(_reference: R)
where
    R: GenericRef<'a, [i32]>,
{}

But then I run into trouble regarding unsized coercions:

fn main() {
    // This works due to dereference (from `Vec<i32>` to `[i32]`):
    takes_ref_to_int_slice(&vec![1, 2, 3]);
    // This works due to unsized-coercion (from `[i32; 3]` to `[i32]`):
    takes_ref_to_int_slice(&[1, 2, 3]);

    // This does NOT work as `&[i32; 3]` does not unsize-coerce to a `GenericRef<'_, [i32]>`
    takes_generic_ref_to_int_slice(&[1, 2, 3]);

    // I have to do this:
    takes_generic_ref_to_int_slice(&*vec![1, 2, 3]);
    // Or better this:
    takes_generic_ref_to_int_slice(&[1, 2, 3] as &[_]);
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0277]: the trait bound `&[{integer}; 3]: GenericRef<'_, [i32]>` is not satisfied
  --> src/main.rs:17:36
   |
17 |     takes_generic_ref_to_int_slice(&[1, 2, 3]);
   |     ------------------------------ ^^^^^^^^^^ the trait `GenericRef<'_, [i32]>` is not implemented for `&[{integer}; 3]`
   |     |
   |     required by a bound introduced by this call
   |
note: required by a bound in `takes_generic_ref_to_int_slice`
  --> src/main.rs:7:8
   |
5  | fn takes_generic_ref_to_int_slice<'a, R>(_reference: R)
   |    ------------------------------ required by a bound in this
6  | where
7  |     R: GenericRef<'a, [i32]>,
   |        ^^^^^^^^^^^^^^^^^^^^^ required by this bound in `takes_generic_ref_to_int_slice`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `playground` due to previous error

I tried to look into std::marker::Unsize and std::ops::CoerceUnsized in nightly Rust, but I'm not sure if these traits can help me. I feel like I would run into overlapping implementations.

Can anyone help me with making takes_generic_ref_to_int_slice(&[1, 2, 3]); work?

You would have to implement GenericRef for things like &[T; N] and &Vec<T> to do this.

1 Like

Ah, cool, this works:

 impl<'a, T: ?Sized> GenericRef<'a, T> for &'a T {}
+impl<'a, T, const N: usize> GenericRef<'a, [T]> for &'a [T; N] {}

(Playground)

Now I wonder if maybe I can use Unsize, CoerceUnsized, or perhaps Deref to be more generic and cover the Vec case (and others) too, as the following still won't work in the Playground:

takes_generic_ref_to_int_slice(&vec![1, 2, 3]);

You could just use Borrow.

1 Like

This would put an additional bound on my generic function, which would be a problem for me, I think, so that it won't work for me.

The reason is, in my real life example, I want to accept an (&i32, &[u8]) in addition to &(i32, Vec<u8>), but the first won't support Borrow<(i32, Vec<u8>)>.

Well, in that case, you'll have to implement GenericRef for every concrete type you want to use it with.

But maybe I can use my own BorrowStorable (as mentioned here) instead of Borrow, so perhaps your proposal can lead me into the right direction. I'll think about that.

Edit: I don't think it will work, because I still cannot get a &(i32, Vec<u8>) from a (&i32, &[u8]) easily.

For completeness: takes_generic_ref_to_int_slice(&[1, 2, 3] as &_); appears to work as well.

1 Like

I feel like if [T] would implement Unsize<[T]> (which apparently it doesn't), then perhaps the problem could be solved generically. But I'm not sure. Unsize isn't stabilized yet, so maybe there's still a possibility to fix/allow this prior to stabilization.

P.S.: I mentioned this thread in issue #27732 as perhaps it might be relevant for stabilization.

I know I sometimes oversimplify my examples :sweat_smile:. So let me show you the full code (still simplified but not that much simplified anymore):

#![feature(generic_associated_types)]

use std::mem::size_of;
use std::ops::Deref;
use std::slice;

trait Storable {
    type BytesRef<'a>: Deref<Target = [u8]>
    where
        Self: 'a;
    fn to_bytes(&self) -> Self::BytesRef<'_>;
}

impl Storable for i32 {
    type BytesRef<'a> = &'a [u8];
    fn to_bytes(&self) -> &[u8] {
        unsafe { slice::from_raw_parts(self as *const Self as *const u8, size_of::<Self>()) }
    }
}

impl Storable for [u8] {
    type BytesRef<'a> = &'a [u8];
    fn to_bytes(&self) -> &[u8] {
        &self
    }
}

// I could implement it for `(i32, [u8])`
// but that's difficult to handle as it's a custom DST,
// so I use `(i32, Vec<u8>)` instead:
impl Storable for (i32, Vec<u8>) {
    type BytesRef<'a> = Vec<u8>;
    fn to_bytes(&self) -> Vec<u8> {
        let mut result = Vec::new();
        result.extend_from_slice(self.0.to_bytes());
        result.extend_from_slice(self.1.to_bytes());
        result
    }
}

trait StorableRef<'a, T>
where
    Self: Sized + Copy,
    T: ?Sized + Storable + 'a,
{
    fn ref_to_bytes(self) -> T::BytesRef<'a>;
}

impl<'a, T> StorableRef<'a, T> for &'a T
where
    T: ?Sized + Storable,
{
    fn ref_to_bytes(self) -> T::BytesRef<'a> {
        self.to_bytes()
    }
}

impl StorableRef<'static, (i32, Vec<u8>)> for (&i32, &[u8]) {
    fn ref_to_bytes(self) -> Vec<u8> {
        let mut result = Vec::new();
        result.extend_from_slice(self.0.to_bytes());
        result.extend_from_slice(self.1.to_bytes());
        result
    }
}

fn store<'a, T, R>(reference: R)
where
    T: ?Sized + Storable + 'a,
    R: StorableRef<'a, T>,
{
    let data = reference.ref_to_bytes();
    let slice: &[u8] = &data;
    println!("Storing {} bytes", slice.len());
}

fn some_slice() -> &'static [u8] {
    &[1u8, 2u8]
}

fn main() {
    // I want to avoid allocation here:
    store(&(999, vec![1u8, 2u8]));
    // Which is why I can use this:
    store((&999, some_slice()));
    // But now I have this ugly `as &[_]`:
    store(&[1u8, 2u8, 3u8] as &[_]);
}

(Playground)

Output:

Storing 6 bytes
Storing 6 bytes
Storing 3 bytes

I want to get rid of the as &[_], ideally with a generic implementation.

Hmm, interesting! In the full example (Playground above) it doesn't work though.

With all your help and hints, I have come this far: New Playground.

But I would like to turn these three concrete implementations into something generic:

impl<'a, T> StorableRef<'a, T> for &'a T
where
    T: ?Sized + Storable,
{
    fn ref_to_bytes(self) -> T::BytesRef<'a> {
        self.to_bytes()
    }
}

impl<'a, T, const N: usize> StorableRef<'a, [T]> for &'a [T; N]
where
    [T]: Storable,
{
    fn ref_to_bytes(self) -> <[T] as Storable>::BytesRef<'a> {
        self.to_bytes()
    }
}

impl<'a, T> StorableRef<'a, [T]> for &'a Vec<T>
where
    [T]: Storable,
{
    fn ref_to_bytes(self) -> <[T] as Storable>::BytesRef<'a> {
        self.to_bytes()
    }
}

See how they all do the same? But I don't know how to express this in a generic way. Ideally that would later also cover going from String to str.


I tried to generalize the middle implementation like this:

impl<'a, T, U> StorableRef<'a, [T]> for &'a U
where
    T: 'a, 
    [T]: Storable,
    U: ?Sized + Unsize<[T]>,
{
    fn ref_to_bytes(self) -> <[T] as Storable>::BytesRef<'a> {
        todo!();
        // How to perform the unsized coercion here?
        //self.to_bytes()
    }
}

But then self doesn't support .to_bytes(). Also, it will overlap with the other two implementations. Moreover, it wouldn't even cover [T] as can be shown here:

(Playground with Unsize)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0277]: the trait bound `[u8]: Unsize<[u8]>` is not satisfied
   --> src/main.rs:112:11
    |
112 |     store(&*vec![1u8, 2u8, 3u8]);
    |     ----- ^^^^^^^^^^^^^^^^^^^^^ the trait `Unsize<[u8]>` is not implemented for `[u8]`
    |     |
    |     required by a bound introduced by this call
    |
    = note: all implementations of `Unsize` are provided automatically by the compiler, see <https://doc.rust-lang.org/stable/std/marker/trait.Unsize.html> for more information
    = help: the following other types implement trait `StorableRef<'a, T>`:
              <&'a U as StorableRef<'a, [T]>>
              <(&i32, &[u8]) as StorableRef<'static, (i32, Vec<u8>)>>
note: required because of the requirements on the impl of `StorableRef<'_, [u8]>` for `&[u8]`
   --> src/main.rs:60:16
    |
60  | impl<'a, T, U> StorableRef<'a, [T]> for &'a U
    |                ^^^^^^^^^^^^^^^^^^^^     ^^^^^
note: required by a bound in `store`
   --> src/main.rs:97:8
    |
94  | fn store<'a, T, R>(reference: R)
    |    ----- required by a bound in this
...
97  |     R: StorableRef<'a, T>,
    |        ^^^^^^^^^^^^^^^^^^ required by this bound in `store`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `playground` due to previous error


I think I finally found a solution based on @H2CO3's proposal:

/* … */

impl<'a, T, U> StorableRef<'a, T> for &'a U
where
    T: ?Sized + Storable + 'a,
    U: Borrow<T>,
{
    fn ref_to_bytes(self) -> T::BytesRef<'a> {
        self.borrow().to_bytes()
    }
}

/* … */

fn main() {
    store(&(999, vec![1u8, 2u8]));
    store((&999, some_slice()));
    store::<[u8], _>(&[1u8, 2u8, 3u8]);
    store::<[u8], _>(&vec![1u8, 2u8, 3u8]);
}

(Playground)

Output:

Storing 6 bytes
Storing 6 bytes
Storing 3 bytes
Storing 3 bytes

Unfortunately, I need now explicit type parameters for the function, but I don't think that would be an issue for my application.

I might be oversimplifying things, but instead of using GATs and other advanced constructs that often have a negative effect on maintainability and developer experience, couldn't you just return a Cow<'_, [u8]> from the to_bytes() method and skip T::BytesRef<'a> completely?

That way you have the option of returning a reference to borrowed bytes if you already have them, and if you need to serialize things (e.g. like the (i32, Vec<u8>) impl) you can return a Cow::Owned(vec![...]) that was populated in the method.

1 Like

I really went through hell with all these constructs, so I totally get your point :hot_face:.

If I had resorted to Cow from the beginning, my coding would likely have been muuuuch easier. My colleague told me today, "how comes you almost go insane when coding Rust, is it really that difficult? Maybe you're doing something wrong?"

Perhaps most programmers would just go for Cow, and then things are easy. I guess the potential runtime overhead made me want to find a nicer solution: Some types won't require a copy ever, so using Cow would come with the extra cost of checking whether the value is owned or referenced at runtime (while it's known at compile-time).

That's why I use an associated type to determine whether copying is needed (i.e. when values need to be re-aligned in memory), or when it's not needed (because it's a memory-mapped string on the file system).

I paid the price of the code becoming real complex. For example, my current draft for Txn::get is:

pub trait Txn {
    /// Get reference to value in database
    fn get<'kr, K, V, C, KRef>(
        &self,
        db: &Db<K, V, C>,
        key: KRef,
    ) -> Result<Option<V::AlignedRef<'_>>, io::Error>
    where
        K: ?Sized + Storable + 'kr,
        V: ?Sized + Storable,
        C: Constraint,
        KRef: StorableRef<'kr, K>;
    /* ... */
}

I wonder if I can simplify it yet, maybe split up the Storable trait into two traits, one for storing and one for restoring, but not sure.

Anyway, I see how Cow would make things easier, but I felt like pushing myself to see whether I can achieve the same thing at compile-time. And it does seem possible indeed!

Maybe pushing Rust to the edge is a real time-sink. But I think I learned a lot during that process. (And I would like to thank everyone helping me out!)

I saw other wrappers for LMDB used Supercow :cow: :pill:. I'm not sure if this allows to avoid some runtime overhead by constraining one of the type parameters (I really didn't understand Supercow yet, to be honest). I refrained from using Supercow due to its complexity, and thought I'd just use GATs. It looked not tooooo difficult in the beginning. But then things got more and more complex :sweat_smile:.

It began to become difficult when I wanted to automatically implement Storable on tuples, including types like (i32, [u8]), which are custom DSTs. It looked like they can't do what I need, so I started to implement Storable on Vec<u8> to support storing (i32, Vec<u8>), for example. But then, I had to allocate Vecs in order to store these types, even if the data is static. That's why I introduced the ability to pass (&some_int, &some_slice) instead of in addition to &(some_int, some_vec).

I think I found a solution now, with BorrowStorable and StorableRef (as found in the end of my previous post). But I see how my code got significantly more complex.


P.S.: Returning Cow instead of T::BytesRef<'_> wouldn't fix my problem of copying data (edit: twice instead of only once) when storing a tuple with variable sized parts.

For anyone interested in how my current solution looks like, it can be found here:

Not sure if it's particularly beautiful, but it seems to do what I want:

  • In standard cases, only one trait (Storable) needs to be implemented to allow using a type for keys or values in databases ⇒ "easy" :sweat_smile:
  • Types which do not require alignment, such as [u8], don't need to be copied when reading them from a database ⇒ fast when reading :star_struck:)
  • Types which require alignment are supported by returning a weird smart-pointer (owning_pointer::Owned), which holds the pointed-to value (basically a Cow which is always Cow::Owned) ⇒ known at compile-time :upside_down_face:
  • Automatic implementations of Storable for tuples (T1, T2, …) are provided where all Tn's implement BorrowStorable and all but the last element have a BorrowStorable::Stored type which has a constant size when being stored ⇒ handy :smiley:
  • Storing values can be done by referencing the value or, in case of tuples, providing a tuple of references to the inner values (see doc of StorableRef), avoiding to create an intermediate tuple when references to the inner values exist ⇒ not unnecessary slow when writing :relieved:

One problem for sure, however, is overall complexity.

:face_with_spiral_eyes:

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.