Inline functions with enable feature

Hi, I want to do something like this:

#[target_feature(enable = "fma")]
#[inline(always)]
pub unsafe fn hadd_pd(v: __m256d) -> f64 {
    let vhigh = _mm256_extractf128_pd(v, 1);
    let vlow = _mm256_castpd256_pd128(v);
    let vsum = _mm_add_pd(vlow, vhigh);
    let h64 = _mm_unpackhi_pd(vsum, vsum);
    _mm_cvtsd_f64(_mm_add_sd(vsum, h64))
}

but I get a compiler error saying #[inline(always)] can’t be used with #[target_feature(enable)]. Will this compile to the proper instructions if it is inlined into a function that itself has a target_feature(enable) annotation, or should I remove the inline(always) so that I can keep target_feature?

#[target_feature(enable = "fma")] indicates to the compiler that you are going to check, at run-time, whether the CPU supports FMA and then decide whether or not to call this version of the function. The decision not to allow inlining is kind of arbitrary, but kind of makes sense – you’ve told the compiler that you might use other code than this, so it’s not sure what to inline.

On the other hand, #[cfg(target_feature = "fma")] (cfg, without the enable part) says “This function requires FMA; compile it only if the feature is enabled for the whole build, and leave it out otherwise.” In this case, inlining makes more sense.
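For what it's worth, a minimal sketch of the compile-time route might look like this. The name mul_add_f64 is made up for illustration, and the first version is only selected if the crate is built with FMA enabled (for example with -C target-feature=+fma); otherwise the fallback is compiled instead.

```rust
// Hypothetical helper, not from the original question.
// This version exists only when FMA is enabled for the whole build,
// so #[inline(always)] is accepted here: no per-function feature is involved.
#[cfg(all(target_arch = "x86_64", target_feature = "fma"))]
#[inline(always)]
pub fn mul_add_f64(a: f64, b: f64, c: f64) -> f64 {
    // With FMA enabled at compile time this can lower to a fused instruction.
    a.mul_add(b, c)
}

// Fallback used when the build does not enable FMA (or on other architectures).
#[cfg(not(all(target_arch = "x86_64", target_feature = "fma")))]
#[inline(always)]
pub fn mul_add_f64(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}
```

Exactly one of the two definitions survives in any given build, so callers just use mul_add_f64 and never do a run-time check.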

If you want to use dynamic detection of the FMA feature, I’d try just removing the inline annotation and looking at the compiled code. I very rarely have to use #[inline(always)]; the compiler is pretty good at inlining on its own. (Though I guess it’s possible that target_feature(enable) blocks it.)


This isn’t quite right. The reason is that a function with a #[target_feature] attribute cannot always be inlined, because you can’t be sure that the function it will be inlined into has the requisite target features enabled at compile time. In this sense it is not arbitrary at all: it is an error because the inline(always) directive cannot be upheld in all cases. A plain #[inline] annotation, which doesn’t come with a guarantee that the function is inlined, is still permitted.
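As a rough sketch of what that looks like in practice (not necessarily the only way to do it): hadd_pd below is the function from the question with #[inline] instead of #[inline(always)], while sum4_avx and sum4 are hypothetical names added for illustration. It enables "avx" rather than "fma" on the assumption that only the AVX intrinsics shown are needed.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

// Same body as in the question. #[inline] is only a hint, so it is
// allowed together with #[target_feature]; #[inline(always)] is rejected.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
#[inline]
unsafe fn hadd_pd(v: __m256d) -> f64 {
    let vhigh = _mm256_extractf128_pd(v, 1); // upper two lanes
    let vlow = _mm256_castpd256_pd128(v); // lower two lanes
    let vsum = _mm_add_pd(vlow, vhigh);
    let h64 = _mm_unpackhi_pd(vsum, vsum);
    _mm_cvtsd_f64(_mm_add_sd(vsum, h64))
}

// Hypothetical caller with the same feature enabled, so the compiler
// is free (but not obliged) to inline hadd_pd into it.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn sum4_avx(data: &[f64; 4]) -> f64 {
    hadd_pd(_mm256_loadu_pd(data.as_ptr()))
}

// Hypothetical safe entry point: check the CPU at run time, then dispatch.
pub fn sum4(data: &[f64; 4]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // Sound because AVX support was just confirmed.
            return unsafe { sum4_avx(data) };
        }
    }
    // Portable fallback for CPUs without AVX and for other architectures.
    data.iter().sum()
}
```

The run-time check lives in one safe wrapper, and everything below it carries the same target_feature attribute, which is the situation where inlining is possible at all. Whether it actually happens is still up to the compiler, so checking the generated assembly is the only way to be sure.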


Thanks for your answers. I get why this is not allowed, but I want to make sure this is inlined and the proper instructions are used; I don’t want the compiler to emit placeholder function calls for these intrinsics. I am using this in a function that has target_feature(enable) itself, so does that guarantee that it will use the proper instructions?

minimal example:

#[inline(always)]
fn something() {
    __mm256_...();
}

#[target_feature(enable = "fma")]
fn big_func() {
    something();
}
