Generic resolution

Is there any compiler option that dumps what the generics are resolved to when a function is invoked? I would love to see how lifetime generic parameters are resolved during invocation. I think it will help a lot in understanding lifetimes. I see -Znll-facts which produces a detailed nll-facts directory. But, I don't know how to use it.

Maybe you mean cargo rustc -- -Z dump-mir=main, which generates MIR artifacts for the main function.

fn max<'a>(i1: &'a i32, i2: &'a i32) -> &'a i32 {
    if i1 > i2 {
        i1
    } else {
        i2
    }
}

fn main() {
    let v1 = 20;
    {
        let v2 = 30;
        println!("{} {} {}", v1, v2, max(&v1, &v2));
    }
}

I ran it in the above code and didn't find anything that helps. Where should I look?

There have been some lifetime visualization programs but I don't think any are complete. Here's one example. Some projects did their own analysis and other projects hooked into compiler internals or MIR. You could try seeing how the latter type got the information.


As for learning the basics from more of an outside/language-user perspective...[1]

The lifetime-generic functions themselves aren't monomorphized with specific lifetimes. All outside lifetimes (lifetime parameters on the function) are longer than the function body, so the only thing the function body cares about with regard to non-local lifetimes (lifetime parameters) is preserving bounds and the like (no creating references that outlive their referents, etc.). The function definition has to work for all input lifetimes that meet the API bounds; the lifetimes don't affect the compiled code.
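To illustrate (a small sketch, not something the compiler dumps for you): the same compiled max serves call sites whose inferred lifetimes differ, because the lifetimes only constrain the borrow checker, not codegen.

```rust
fn max<'a>(i1: &'a i32, i2: &'a i32) -> &'a i32 {
    if i1 > i2 { i1 } else { i2 }
}

fn main() {
    let a = 1;
    let b = 2;
    // Here 'a is inferred to span most of main.
    let long_lived = max(&a, &b);
    {
        let c = 3;
        // Same function, same machine code, but 'a is inferred
        // much shorter here, ending inside this block.
        let short_lived = max(&a, &c);
        println!("{short_lived}");
    }
    println!("{long_lived}");
}
```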

The API of such functions also describes a constraint for callers. In your code, for example, both inputs have to remain borrowed as long as the output value is alive. The borrow checker takes this into consideration at the call site.
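You can see that caller-side constraint with a small variation of your example (a sketch; the commented-out line is the part that would be rejected):

```rust
fn max<'a>(i1: &'a i32, i2: &'a i32) -> &'a i32 {
    if i1 > i2 { i1 } else { i2 }
}

fn main() {
    let v1 = 20;
    let m;
    {
        let v2 = 30;
        m = max(&v1, &v2);
        println!("{m}"); // OK: the last use of `m` is inside v2's scope
    }
    // Uncommenting this fails: `m` keeps *both* inputs borrowed,
    // and v2 no longer lives long enough.
    // println!("{m}"); // error[E0597]: `v2` does not live long enough
}
```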

In general you can think of the intrafunction (callsite) analysis as finding the shortest possible lifetimes that fulfill all the bounds, annotations, and uses of values (modulo compiler limitations).

But really it's more like a constraint satisfaction problem, where the compiler just needs to prove there's a sound solution. So it doesn't necessarily assign a specific lifetime to a generic function call. Even defining a lifetime gets tricky as the compiler gets more advanced.
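The "shortest possible lifetime" behavior is easiest to see with non-lexical lifetimes (NLL), where a borrow ends at its last use rather than at the end of its enclosing scope. A minimal sketch:

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    let first = &v[0];   // shared borrow of `v` begins here...
    println!("{first}"); // ...and ends here, at its last use,
    v.push(4);           // so this mutable use of `v` is accepted
    println!("{:?}", v);
}
```

Under the older lexical model, the shared borrow would have lasted to the end of the block and the push would have been rejected.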


  1. I'm not sure if any of this is of interest to you or not ↩


You said "In general you can think of the intrafunction (callsite) analysis as finding the shortest possible lifetimes that fulfill all the bounds, annotations, and uses of values (modulo compiler limitations).".

Given that the compiler has to find the "shortest possible lifetimes" at the call site for each of the lifetimes in the function signature, I was hoping there would be some diagnostic option for the compiler to dump them. Looks like there isn't. It might be something that would be useful.

An unrelated question for similar code. The code is:

fn max<'a>(i1: &'a i32, i2: &'a i32) -> &'a i32 {
    if i1 > i2 {
        i1
    } else {
        i2
    }
}

fn main() {
    let v1 = 10;
    let v2 = 20;
    let v12 = max(&v1, &v2);
}

I generated the MIR with "rustc -Z mir-opt-level=0 -Z dump-mir=main main.rs". Here is the MIR for the main function:

fn main() -> () {
    let mut _0: ();
    let _1: i32;
    let mut _4: &i32;
    let _5: &i32;
    let mut _6: &i32;
    let _7: &i32;
    scope 1 {
        debug v1 => _1;
        let _2: i32;
        scope 2 {
            debug v2 => _2;
            let _3: &i32;
            scope 3 {
                debug v12 => _3;
            }
        }
    }

    bb0: {
        StorageLive(_1);
        _1 = const 10_i32;
        FakeRead(ForLet(None), _1);
        StorageLive(_2);
        _2 = const 20_i32;
        FakeRead(ForLet(None), _2);
        StorageLive(_3);
        StorageLive(_4);
        StorageLive(_5);
        _5 = &_1;
        _4 = &(*_5);
        StorageLive(_6);
        StorageLive(_7);
        _7 = &_2;
        _6 = &(*_7);
        _3 = max(move _4, move _6) -> [return: bb1, unwind: bb2];
    }

    bb1: {
        StorageDead(_6);
        StorageDead(_4);
        FakeRead(ForLet(None), _3);
        StorageDead(_7);
        StorageDead(_5);
        _0 = const ();
        StorageDead(_3);
        StorageDead(_2);
        StorageDead(_1);
        return;
    }

    bb2 (cleanup): {
        resume;
    }
}

If you look at the two arguments to the call site of max in bb0, the first one comes from bb0[9] and bb0[10]:

        _5 = &_1;
        _4 = &(*_5);

I don't quite understand what these two lines are doing. Why can't we just use _5? Why do we have to dereference it and then take a reference again? What exactly is it accomplishing? Also, note that _5 is declared as an immutable &i32 whereas _4 is declared as a mutable &i32. Not sure if that has anything to do with it.

The same thing is done even for the second argument.

I'm not an expert at MIR. But when you assign (or pass) a reference in the presence of a reference annotation, there's an implicit reborrow, which allows

  • the lifetime to be shorter
  • not giving away ownership of &muts (which are not Copy)

So I think what's going on is that _5 is the expression of the argument within the call expression, and _4 is the implicit reborrow which is actually passed by value ("move _4").

It would matter more if the argument expression was more complicated.
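A small surface-level sketch of the reborrow the MIR is expressing (using a hypothetical helper, bump, rather than your max, since the effect is clearest with &mut):

```rust
// Hypothetical helper to show implicit reborrowing of `&mut`.
fn bump(x: &mut i32) {
    *x += 1;
}

fn main() {
    let mut n = 0;
    let m = &mut n;
    // `&mut i32` is not Copy, so passing `m` by value would move it.
    // Instead, the compiler inserts an implicit reborrow, as if you
    // had written `bump(&mut *m)` -- the same `&(*_5)` shape seen in
    // the MIR above.
    bump(m);
    bump(m); // still usable: only temporary reborrows were moved
    println!("{}", *m);
}
```

If the reborrow were not inserted, the first call would consume m and the second call would fail to compile.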


At even more of a guess, the let mut _4 and the like could be because they're temporaries, and/or so that inlining can succeed when max modifies its arguments.
