When are generic instances in a dead branch of the code pruned?

bluss · April 29, 2020, 5:14pm

If we have code that's structured like this:

fn foo<T>() {
    if !std::mem::needs_drop::<T>() { return; }
    
    // More code here, using T.
    let mut v = Vec::<T>::with_capacity(1024);
    // ...
}

For a type that exits early from this function, for example i32 (which does not need drop), at what stage of compilation are the generic items that follow in foo pruned?

Judging from debug-mode compilation, quite a lot of the code in the dead branch is instantiated and code-generated, even though it is unreachable.

Does anyone know in more detail when during compilation this code is removed? I'd like to avoid emitting it to LLVM for optimization, so that the compiler doesn't have to do a lot of extra work, for example optimizing Vec::with_capacity for a type that is never going to be used at that point.

I know that if we had a trait for needs_drop, it would be possible to statically avoid this extra code generation. Does anyone know other tricks for avoiding this problem in practice?
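To make the trait idea concrete, here is a minimal sketch. It assumes a hypothetical trait, FillStrategy, that is implemented by hand for each element type; nothing like it exists in std, and the returned labels exist only so the chosen path can be observed. Because trait selection picks the impl, the i32 instance never names the Vec-based path at all.

```rust
/// Hypothetical trait (not part of std): each element type picks its own path.
trait FillStrategy: Sized {
    /// Returns a label naming the path that ran, so the choice is observable.
    fn fill(n: usize) -> &'static str;
}

// Plain-data type: the drop-tracking path is never mentioned in this impl,
// so nothing belonging to it is instantiated for i32.
impl FillStrategy for i32 {
    fn fill(_n: usize) -> &'static str {
        "trivial"
    }
}

// Type with drop glue: only this impl instantiates Vec::<String>.
impl FillStrategy for String {
    fn fill(n: usize) -> &'static str {
        let mut v = Vec::<String>::with_capacity(n);
        v.push(String::from("x"));
        "tracked"
    }
}

/// Generic entry point: the branch is chosen by trait selection,
/// not by a runtime `if`.
fn foo<T: FillStrategy>() -> &'static str {
    T::fill(1024)
}
```

Calling foo::<i32>() takes the "trivial" path and foo::<String>() the "tracked" one. The obvious cost is that every element type needs an explicit impl, which is why this does not work for a fully generic R.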

The question comes from the following (more complicated) code (original code is here), which also involves significant code generation behind the same needs_drop conditional.

/// Apply and collect the results into a new array, which has the same size as the
/// inputs.
///
/// If all inputs are c- or f-order respectively, that is preserved in the output.
pub fn apply_collect<R>(self, f: impl FnMut(P1::Item, P2::Item) -> R) -> Array<R, D>
    where P1: NdProducer<..>, P2: NdProducer<..>
{
    // Make uninit result
    let mut output = self.uninitalized_for_current_layout::<R>();
    if !std::mem::needs_drop::<R>() {
        // For elements with no drop glue, just overwrite into the array
        self.apply_assign_into(&mut output, f);
    } else {
        // For generic elements, use a proxy that counts the number of filled elements,
        // and can drop the right number of elements on unwinding
        unsafe {
            PartialArray::scope(output.view_mut(), move |partial| {
                debug_assert_eq!(partial.layout().tendency() >= 0, self.layout_tendency >= 0);
                self.apply_assign_into(partial, f);
            });
        }
    }

    unsafe {
        output.assume_init()
    }
}

alice · April 29, 2020, 5:23pm

It is removed by the same thing that would remove the call to bar here:

fn foo() {
    if true { return; }
    bar();
}

bluss · April 29, 2020, 5:25pm

Can you be more specific? Also, your example does not seem to involve instantiation of generic items.

alice · April 29, 2020, 5:26pm

I don't know if it is LLVM that removes it, but it probably is. After the generics have been instantiated, the function would be equivalent to my example.
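A rough illustration of that equivalence, written out by hand. This is only a source-level sketch: the real transformation happens on the compiler's intermediate representation, and the return values (the reserved capacity, or 0 on early exit) are an addition so the behaviour can be checked.

```rust
use std::mem::needs_drop;

/// Generic version: returns the reserved capacity, or 0 on the early exit.
fn foo<T>() -> usize {
    if !needs_drop::<T>() {
        return 0;
    }
    let v = Vec::<T>::with_capacity(1024);
    v.capacity()
}

/// Hand-written stand-in for the i32 instance: needs_drop::<i32>() is false,
/// so the condition is a constant `true`. The Vec::<i32> code below is still
/// present in this function body until something removes the dead branch.
fn foo_i32_by_hand() -> usize {
    if true {
        return 0;
    }
    let v = Vec::<i32>::with_capacity(1024);
    v.capacity()
}
```

Both foo::<i32>() and foo_i32_by_hand() return 0, yet each still mentions Vec::<i32>::with_capacity, which is the code the question is about.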

RustyYato · April 29, 2020, 5:28pm

Yes, LLVM does all the dead code elimination. To the best of my knowledge, rustc doesn't do any.

wesleywiser · April 29, 2020, 5:58pm

rustc can do basic dead code elimination in simple cases:

https://github.com/rust-lang/rust/blob/36d13cb01ba6a0a9b7c13ca2b9461a111cb3e395/src/librustc_mir/transform/simplify_branches.rs

bluss · April 29, 2020, 7:22pm

The only way I see to "improve" my code here would be either with specialization (trait selection is the only way to make sure only needed generic items are instantiated) or more aggressive constant propagation in rustc. needs_drop is a const fn, but that doesn't at the moment - if I understand correctly - mean anything special in terms of compile time evaluation in non-const context. If anyone has other ideas, I'm interested.
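For reference, one way to force the compile-time evaluation is to route needs_drop through an associated const. This is a minimal sketch, assuming a hypothetical helper type DropInfo (not part of std or ndarray). It only guarantees that the condition is computed in a const context; whether rustc then skips instantiating the items after the early return is not something this sketch demonstrates.

```rust
use std::marker::PhantomData;
use std::mem::needs_drop;

/// Hypothetical helper type; it exists only to carry the associated const.
#[allow(dead_code)]
struct DropInfo<T>(PhantomData<T>);

impl<T> DropInfo<T> {
    /// Computed in a const context for each T that is used.
    const NEEDS_DROP: bool = needs_drop::<T>();
}

/// Same shape as the original foo, but branching on the associated const.
/// Returns the reserved capacity, or 0 on the early exit.
fn foo<T>() -> usize {
    if !DropInfo::<T>::NEEDS_DROP {
        return 0;
    }
    let v = Vec::<T>::with_capacity(1024);
    v.capacity()
}
```

The observable behaviour is the same as with a direct needs_drop call: foo::<i32>() exits early, and foo::<String>() reserves the buffer.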

itemchenko · April 29, 2020, 7:34pm

It seems that there is a difference for debug vs release build in this regard:

This might not make a difference in release builds (where a loop that has no side-effects is easily detected and eliminated), but is often a big win for debug builds.

From: the docs for std::mem::needs_drop

RustyYato · April 29, 2020, 9:16pm

@itemchenko I think @bluss is worried about code-size, not performance in debug builds.

@wesleywiser it looks like that pass only works on literal true and false, so it doesn't evaluate needs_drop in a debug build, which is unfortunate.

bluss · July 28, 2020, 9:16pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
