Explosive type sizes ("reached type-length limit") - once and for all

I'm writing code with async and quite a lot of generics. The amount of code is actually rather small, but I ran into this compiler error: error: reached the type-length limit while instantiating ..... At the time, I managed to fix it with a single .boxed() call.
Now I have been rewriting my code in an attempt to make it even more generic.
This time I cannot seem to get rid of the error, no matter where I put .boxed().
It's on this branch: s3-algo/copy.rs at forum/rewrite · openanalytics/s3-algo · GitHub, that file in particular. If you build with cargo build --tests, you get the error.

Now, I don't merely want to ask for advice on this particular instance of the issue. In fact, I might readily abandon that branch if it turns out to be impractical.
But I'm tired of hitting this issue. I don't want to add the attribute to increase the type-length limit, because all dependent crates would have to do the same.

I would like to ask, and I hope someone knows the answer: What language construct should we look out for to avoid this problem?
We are clearly facing some kind of exponential growth, since the compiler suggests increasing the limit to, say, 5 million. So what is relevant is anything that causes such exponential growth, not anything linear. We can use the linked branch to discuss some language constructs:

Does async/await syntax improve things (compared to e.g. manually .and_then(..))?

It would be helpful if I could have any idea what to look for when I try to fix this problem. Whether it's about putting .boxed() in the right places, or just redesign.


I investigated this problem for Rayon, and found that in many cases it was little closures that were carrying all generic parameters of their scope, even when they don't need to.

For example, if you're in a function with a generic iter: I, and you call iter.map(|item| ...), then the expanded type Map<I, F> will have I repeated -- as itself and as a component of the type of the closure F. Repeat that a few times and you have an exponential effect on type length.
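To see that compounding concretely, here's a small stdlib-only sketch. The Step and Adapter types are stand-ins I made up: Step<I> models a closure defined inside a function generic over I (which carries I even without capturing anything of that type), and Adapter<I, F> models Map<I, F>. Measuring the length of type_name as a rough proxy for type length shows each layer roughly doubling it:

```rust
use std::any::type_name;
use std::marker::PhantomData;

// Stand-in for a closure inside `fn f<I>`: parameterized by I
// even though it captures nothing of type I.
#[allow(dead_code)]
struct Step<I>(PhantomData<I>);

// Stand-in for `Map<I, F>`: holds both the iterator type and the closure type.
#[allow(dead_code)]
struct Adapter<I, F>(PhantomData<(I, F)>);

// Each layer mirrors `iter.map(closure)`: I appears twice in the result,
// once as itself and once inside the closure's type.
fn wrap<I>(_: I) -> Adapter<I, Step<I>> {
    Adapter(PhantomData)
}

// Length of a value's full type name, as a rough proxy for type length.
fn tn<T>(_: &T) -> usize {
    type_name::<T>().len()
}

fn main() {
    let a1 = wrap(0i32);
    let a2 = wrap(wrap(0i32));
    let a3 = wrap(wrap(wrap(0i32)));
    // l(n+1) = 2 * l(n) + constant, i.e. exponential in the nesting depth.
    println!("{} {} {}", tn(&a1), tn(&a2), tn(&a3));
}
```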

In rust#62429 and rayon#673, I mitigated this by replacing a lot of raw closures with nested functions with tighter generic types. When those closures do need to capture values, I wrote "factory" functions instead to create and return tighter-bound closures.

Maybe Stream needs to get the same treatment. That kind of change is very tedious though, and I hope the compiler can someday do better, like rust#46477.


Thanks very much for the insightful reply! Could you elaborate a bit on how you write functions that return tighter-bound closures? If it's not too much trouble.

Does anyone know whether async blocks and functions contribute to the exponential type growth? When writing futures by hand, you would also use combinators like and_then.

Edit: I just solved my own problem by putting a .boxed() after the call to s3_request. I might try something you suggested to reduce the number of boxes.
I wonder, wouldn't it be possible for some code analysis tool to show the sizes of types, so that we can see where it starts to build up?
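For anyone reading along, here's a minimal stdlib-only sketch of why .boxed() stops the growth. The layer function is a made-up combinator, not anything from the repo; .boxed() from the futures crate is essentially Box::pin behind a trait object, and the trait object erases the inner future's concrete type, so type length stops compounding at that point:

```rust
use std::any::{Any, TypeId};
use std::future::Future;
use std::pin::Pin;

type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

// Hypothetical combinator: wraps an inner future, then boxes the result.
// Whatever `inner` is, the return type is always `Pin<Box<dyn Future<...>>>`.
fn layer<F: Future<Output = i32> + 'static>(inner: F) -> BoxFuture<i32> {
    Box::pin(async move { inner.await + 1 })
}

fn type_id<T: Any>(_: &T) -> TypeId {
    TypeId::of::<T>()
}

fn main() {
    let once = layer(async { 0 });
    let twice = layer(layer(async { 0 }));
    // Without boxing, each layer would nest the previous future's type
    // inside its own; with boxing, every layer has the same concrete type.
    assert_eq!(type_id(&once), type_id(&twice));
    println!("ok");
}
```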

As I understand it, async blocks do not contribute to exponential growth, although playing around with your example, they don't seem to eliminate it like boxed() does either.


Here's a playground example -- in multiply the closure will depend on the entire I, and in multiply2 it will only depend on T, instantiated by the particular I::Item.

fn multiply<I>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator,
    I::Item: Into<f64>,
{
    iter.map(move |item| x * item.into())
}

fn multiply2<I>(iter: I, x: f64) -> impl Iterator<Item = f64>
where
    I: Iterator,
    I::Item: Into<f64>,
{
    fn mul<T: Into<f64>>(x: f64) -> impl Fn(T) -> f64 {
        move |item| x * item.into()
    }
    iter.map(mul(x))
}

fn iter() -> impl Iterator<Item = i32> {
    (0..10).map(|i| i * 42)
}

pub fn foo() {
    let _ = multiply(iter(), 2.0);
    let _ = multiply2(iter(), 2.0);
}

Show LLVM IR, then compare multiply::{{closure}} and multiply2::mul::{{closure}}.

; playground::foo
; Function Attrs: nonlazybind uwtable
define void @_ZN10playground3foo17hbce2de427f35bc00E() unnamed_addr #1 !dbg !161 {
start:
  %_3 = alloca %"core::iter::adapters::Map<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>, multiply2::mul::{{closure}}<i32>>", align 8
  %_1 = alloca %"core::iter::adapters::Map<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>, multiply::{{closure}}<core::iter::adapters::Map<core::ops::range::Range<i32>, iter::{{closure}}>>>", align 8
...

I believe they act exactly like closures in this regard. Here's another playground:

use std::future::Future;

fn bar<T>(_: T) -> impl Future<Output = i32> {
    async { 42 }
}

pub fn foo() {
    let _ = bar(0..10);
}

In theory that block doesn't need to be generic at all, but in LLVM IR you'll see:

; playground::bar
; Function Attrs: nonlazybind uwtable
define internal i32 @_ZN10playground3bar17h8a18e5c6d80f79acE(i32, i32) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality !dbg !55 {
start:
  %2 = alloca i32, align 4
  %3 = alloca { i8*, i32 }, align 8
  %_2 = alloca %"bar::{{closure}}<core::ops::range::Range<i32>>", align 4
  %4 = alloca %"std::future::GenFuture<bar::{{closure}}<core::ops::range::Range<i32>>>", align 4
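And here's a hedged sketch of the same factory-function fix applied to this async example (bar2 and make_fut are hypothetical names, not from the repo): hoisting the async block into a non-generic helper makes the future's type independent of T, which you can observe with TypeId:

```rust
use std::any::{Any, TypeId};
use std::future::Future;

// As above: the async block is parameterized by T even though it
// captures nothing, so the future's type changes with every T.
fn bar<T: 'static>(_: T) -> impl Future<Output = i32> + 'static {
    async { 42 }
}

// Fix: hoist the async block into a non-generic helper...
fn make_fut() -> impl Future<Output = i32> {
    async { 42 }
}

// ...so the future returned here no longer depends on T.
fn bar2<T: 'static>(_: T) -> impl Future<Output = i32> + 'static {
    make_fut()
}

fn type_id<T: Any>(_: &T) -> TypeId {
    TypeId::of::<T>()
}

fn main() {
    // bar's future type differs per instantiation; bar2's does not.
    assert_ne!(type_id(&bar(0..10)), type_id(&bar("hi")));
    assert_eq!(type_id(&bar2(0..10)), type_id(&bar2("hi")));
    println!("ok");
}
```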
