Specializing a closure twice


#1

Hi!

I have a case where a function takes a closure parameter that is generic over a trait. I’d then like to apply that closure twice, specialized over two different types implementing that trait.

For example, here’s something that doesn’t work:

trait Protocol {
    type Final;

    fn new() -> Self;
    fn with_capacity(sz: usize) -> Self;

    fn put_slice(&mut self, s: &[u8]);
    // ...

    fn finish(self) -> Self::Final;
}

trait Serializer {
    type Sizer: Protocol<Final=usize>;
    type Encoder: Protocol;

    fn serializer<F, P>(func: F) -> <Self::Encoder as Protocol>::Final 
    where F: Fn(&mut P) -> P::Final,
          P: Protocol,
    {
        let mut sizer = Self::Sizer::new();
        let cap = func(&mut sizer);
        let mut enc = Self::Encoder::with_capacity(cap);
        func(&mut enc)
    }
}

This doesn’t work because P doesn’t have a specific type - I want it to be both Sizer and Encoder.

I could do something like:

    fn serializer<SZ, ENC>(sizer: SZ, encoder: ENC) -> <Self::Encoder as Protocol>::Final 
    where SZ: Fn(&mut Self::Sizer) -> usize,
          ENC: Fn(&mut Self::Encoder) -> <Self::Encoder as Protocol>::Final,
    {
        let mut sz = Self::Sizer::new();
        let sz = sizer(&mut sz);
        let mut enc = Self::Encoder::with_capacity(sz);
        encoder(&mut enc)
    }

This works, but is cumbersome - the caller has to supply two independent copies of the same closure so that they can be independently typed.

So I’m looking for a way to either:

  • defer type specialization for a closure so that it can be applied twice with different types, or
  • make a copy of a closure before the types have been resolved

I can’t think of anything in current Rust which would allow either of these, but I’m wondering if there’s anything in the works that might address this?

(This isn’t a big deal right now because it’s all generated code - emitting two generated copies of the closure is no problem, it just feels messy.)


#2

In rayon we have a ProducerCallback trait to solve a similar problem. In an ideal world, we’d just use closures that would be generic over Producer types. Instead, the explicit callback acts basically like a desugared closure, but the method gets to be generic.

It’s cumbersome, to say the least. If you find a better way, I’d love to improve this!


#3

One option (and it’s not that great, or even necessarily better than passing the same closure twice) is to define an enum that holds either the sizer or encoder, and pass a closure that takes it. Something like:

enum Either<'a, A: 'a, B: 'a> {
    A(&'a mut A),
    B(&'a mut B)
}

trait Serializer {
    type Sizer: Protocol<Final = usize>;
    type Encoder: Protocol<Final = <Self::Sizer as Protocol>::Final>;

    fn serializer<F>(func: F) -> <Self::Encoder as Protocol>::Final
    where
        F: Fn(Either<Self::Sizer, Self::Encoder>)
            -> <Self::Encoder as Protocol>::Final,
    {
        let mut sizer = Self::Sizer::new();
        let cap = func(Either::A(&mut sizer));
        let mut enc = Self::Encoder::with_capacity(cap);
        func(Either::B(&mut enc))
    }
}

This is obviously very specific - what if we had 3 types we wanted to pass to the closure? So maybe the missing language feature here is anonymous enums. It’s also unclear whether this is an improvement - would need to see how the body of the closure is implemented.


#4

Yeah, I don’t think that helps - it just pushes the problem into the closure itself. The closure looks something like:

|p: &mut P| { // where P: Protocol
    p.put_slice(b"hello");
    p.put_i32(123);
    // everything else to be serialized...
}

(I actually got the original example wrong - the closure is -> () and the caller calls .finish() on the Protocol.)


#5

If the trait is object safe, you could have the closure take &mut dyn Protocol instead. Or since it’s generated code, something like what I mentioned for rayon might not be too bad.
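A rough sketch of the trait-object variant, under assumptions not stated in the thread: the object-safe methods are split into a hypothetical `Put` trait (the constructors on `Protocol` would otherwise need `where Self: Sized`), and `Sizer`, `Encoder` and `serialize_dyn` are illustrative names only. The same closure is applied to both types because it only ever sees `&mut dyn Put`.

```rust
// Hypothetical object-safe subset of the thread's `Protocol` trait.
trait Put {
    fn put_slice(&mut self, s: &[u8]);
}

// Illustrative "measuring" implementation: only counts bytes.
struct Sizer(usize);

impl Put for Sizer {
    fn put_slice(&mut self, s: &[u8]) {
        self.0 += s.len();
    }
}

// Illustrative real encoder: appends bytes to a buffer.
struct Encoder(Vec<u8>);

impl Put for Encoder {
    fn put_slice(&mut self, s: &[u8]) {
        self.0.extend_from_slice(s);
    }
}

// One closure, called twice through a trait object:
// first to measure, then to encode into a pre-sized buffer.
fn serialize_dyn<F>(func: F) -> Vec<u8>
where
    F: Fn(&mut dyn Put),
{
    let mut sizer = Sizer(0);
    func(&mut sizer);

    let mut enc = Encoder(Vec::with_capacity(sizer.0));
    func(&mut enc);
    enc.0
}
```

A caller would pass a single closure, for example `serialize_dyn(|p| p.put_slice(b"hello"))`. The trade-off is that every call goes through dynamic dispatch, which may make it harder for the compiler to inline the sizing pass.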


#6

So the API/design of the serializer function seems a bit odd to me. I think I understand what you’re trying to do: you want a “measuring” output stream that “serializes” the data by simply counting how much space is needed, then use that size to create a real serializer with the needed space allocated up front, and then do the real serialization through it.

If the above is true, I think I would just work with the two types of serializers/streams (or Protocol in your terms), measuring and real one, in the callers and not try to abstract over that with the serializer function that you have. Or maybe some other arrangement is better, but something about your current one seems off :)


#7

For a “measuring” protocol (i.e. the sizer), you almost certainly want inlining since it’s possible the compiler can constant fold a whole bunch of the “serialization” and not actually do any (or very minimal) work at runtime.


#8

Yep, precisely. I was initially going to have a second trait for it, but I realized it’s pretty much identical to the normal encoder aside from the type of the final result (hence the Final assoc type). So the net result would be the same - duplicated serialization code - but generic over two traits rather than two types implementing the same trait.

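As for taking a trait object: well, it’s object safe, but the associated type can be different (&dyn Protocol<Final=usize> vs &dyn Protocol<Final=Bytes>), so a single trait-object type can’t cover both the sizer and the encoder.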

Precisely. The implementation of the “sizer” is very liberal with its use of #[inline], and I’m hoping that constant sized structures will be compiled to a single constant and even arrays of constant-sized structures will be strength-reduced to an O(1) size calculation.
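To illustrate the kind of sizer being described, here is a minimal sketch. The `Sizer` struct, the `Point` record and the helper functions are assumptions made for the example, not the actual generated code. Each fixed-size field adds a compile-time constant, so after inlining the optimizer has a chance (but no guarantee) of reducing the loop to a multiplication.

```rust
use std::mem::size_of;

// Illustrative sizer: it never touches the data, it only adds up lengths.
struct Sizer {
    len: usize,
}

impl Sizer {
    #[inline]
    fn put_i32(&mut self, _v: i32) {
        // Fixed-size value: contributes a constant.
        self.len += size_of::<i32>();
    }

    #[inline]
    fn finish(self) -> usize {
        self.len
    }
}

// Hypothetical constant-sized structure.
struct Point {
    x: i32,
    y: i32,
}

#[inline]
fn size_point(p: &Point, sz: &mut Sizer) {
    sz.put_i32(p.x);
    sz.put_i32(p.y);
}

// Sizing a slice of constant-sized records; ideally this ends up
// equivalent to `points.len() * 8` after optimization.
fn size_points(points: &[Point]) -> usize {
    let mut sz = Sizer { len: 0 };
    for p in points {
        size_point(p, &mut sz);
    }
    sz.finish()
}
```

Whether the compiler actually performs this reduction depends on the optimization level and on everything being visible for inlining, so it is worth checking the generated code rather than relying on it.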


#9

What you’re looking for seems to me to be equivalent in expressive power to

where F: for<P: Protocol> Fn(&mut P) -> P::Final,

however, I don’t expect we’ll be able to write that anytime soon…


#11

So lacking generic (or higher-rank?) closures, as @ExpHP mentioned, I’d probably replace the closure with a concrete type that has a method generic over the Protocol type. That lets your serializer fn “reuse” the type, since you can vary the protocol on each call to it. The concrete type would contain the serialization code that is currently inside the closure body.

I didn’t see the rayon code but this is probably what @cuviper was suggesting as well.
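The approach described in this post could be sketched roughly as follows. This is a simplified illustration, not rayon’s actual API: `SerializeCallback`, `Greeting`, `serialize` and the two `Protocol` implementations are hypothetical names, and `Protocol` is trimmed down to what the example needs. Following the correction in #4, the callback returns () and the caller invokes `finish`.

```rust
// Trimmed-down stand-in for the thread's `Protocol` trait.
trait Protocol {
    type Final;
    fn with_capacity(sz: usize) -> Self;
    fn put_slice(&mut self, s: &[u8]);
    fn finish(self) -> Self::Final;
}

// Hypothetical "desugared closure": the method, not the trait,
// is generic over the protocol, so one value can be called
// with several protocol types.
trait SerializeCallback {
    fn call<P: Protocol>(&self, p: &mut P);
}

// Measuring protocol: counts bytes only.
struct Sizer(usize);

impl Protocol for Sizer {
    type Final = usize;
    fn with_capacity(_sz: usize) -> Self { Sizer(0) }
    fn put_slice(&mut self, s: &[u8]) { self.0 += s.len(); }
    fn finish(self) -> usize { self.0 }
}

// Real protocol: writes into a pre-sized buffer.
struct Encoder(Vec<u8>);

impl Protocol for Encoder {
    type Final = Vec<u8>;
    fn with_capacity(sz: usize) -> Self { Encoder(Vec::with_capacity(sz)) }
    fn put_slice(&mut self, s: &[u8]) { self.0.extend_from_slice(s); }
    fn finish(self) -> Vec<u8> { self.0 }
}

// The same callback is applied twice, specialized to two types.
fn serialize<C: SerializeCallback>(cb: &C) -> Vec<u8> {
    let mut sizer = Sizer::with_capacity(0);
    cb.call(&mut sizer);
    let cap = sizer.finish();

    let mut enc = Encoder::with_capacity(cap);
    cb.call(&mut enc);
    enc.finish()
}

// The former closure body lives here; captured state becomes fields.
struct Greeting<'a> {
    name: &'a [u8],
}

impl<'a> SerializeCallback for Greeting<'a> {
    fn call<P: Protocol>(&self, p: &mut P) {
        p.put_slice(b"hello ");
        p.put_slice(self.name);
    }
}
```

With this shape, `serialize(&Greeting { name: b"world" })` runs the body once against `Sizer` and once against `Encoder`, and both calls are statically dispatched. The cost is the boilerplate of a struct plus an impl per former closure, which matters less when the code is generated.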


#12

For reference here is rayon’s ProducerCallback, and some discussion in the README. And although that suggests that associated type constructors could help in our case, we probably wouldn’t want to make the producers part of the public API, as associated types would require. Those types stay totally hidden with the current callback scheme, as they would with generic/higher-rank closures too.