Partial evaluation on demand where possible


#1

In the next weeks/months I’ll explain four features I’d like in Rust (and I think no one of them is a breaking change). In this post I present a feature that’s not very important, I don’t know if it can be added to a quite impure language as Rust, and I don’t know if it’s a good idea to have in Rust, but it looks nice in theory.

In the Idris language if one function argument is annotated with [static] (or more than one), and you call that function using a compile-time constant for that argument, the Idris compiler creates a partially evaluated (specialised) version of that function for that call point:

That Idris feature is quite powerful, as shown at the bottom of that page it can even specialize away a little interpreter:

data Expr : Vect n Ty -> Ty -> Type where
     Var : HasType i gamma t -> Expr gamma t
     Val : (x : Int) -> Expr gamma TyInt
     Lam : Expr (a :: gamma) t -> Expr gamma (TyFun a t)
     App : Lazy (Expr gamma (TyFun a t)) -> Expr gamma a -> Expr gamma t
     Op  : (interpTy a -> interpTy b -> interpTy c) -> Expr gamma a -> Expr gamma
           Expr gamma c
     If  : Expr gamma TyBool -> Expr gamma a -> Expr gamma a -> Expr gamma a

dsl expr
    lambda = Lam
    variable = Var
    index_first = stop
    index_next = pop

eMult : Expr gamma (TyFun TyInt (TyFun TyInt TyInt))
eMult = expr (\x, y => Op (*) x y)

eFac : Expr gamma (TyFun TyInt TyInt)
eFac = expr (\x => If (Op (==) x (Val 0))
            (Val 1)
            (App (App eMult (App eFac (Op (-) x (Val 1)))) x))

interp : (env : Env gamma) -> [static] (e : Expr gamma t) -> interpTy t

This means that if we write an Idris program to calculate a factorial by calling interp on eFac, the resulting definition will be specialised, partially evaluating away the interpreter:<

runFac : Int -> Int
runFac x = interp [] eFac x

You can also disable this Idris feature:

Partial evaluation is switched on by default. It can be disabled with the --no-partial-eval flag.

A similar feature in Rust could use some annotation at the call site to disable the partial evaluation, to avoid too much specialization bloat in the binary and compilation times where you don’t need a specialized version of the function you are calling with a constant value.

The D language doesn’t have partial evaluation. At best you pass the compile-time known value as template argument (note this code needs to use a static if to avoid recursive instantiation of the template):

ulong myPow(uint n)(in ulong k) {
    static if (n == 0)
        return 1;
    else
        return k * myPow!(n - 1)(k);
}

void main() {
    import std.stdio;
    myPow!3(5).writeln;
}

In D you can also use the n argument with a “static foreach” loop to avoid recursion:

import std.stdio, std.meta;

template staticIota(uint stop) {
    static if (stop <= 0)
        alias staticIota = AliasSeq!();
    else
        alias staticIota = AliasSeq!(staticIota!(stop - 1), stop - 1);
}

ulong myPow(uint n)(in ulong k) {
    ulong result = 1;
    /*static*/ foreach (immutable _; staticIota!n)
        result *= k;
    return result;
}

void main() {
    myPow!3(5).writeln;
}

With that Idris feature in Rust you should be able to write something like:

fn my_pow(k: u64, static n: u32) -> u64 {
    if n == 0 {
        1u64
    else {
        k * my_pow(k, n - 1)
    }
}

As in Idris, such my_pow should be usable both when n is a compile-time computable value and when n is a value known at run-time only.

(If Rust gains D-style CTFE then things get a bit more complex because it’s less easy to define what a compile-time computable value is. In practice I think Rust CTFE will be more explicit, so this problem will be absent).


#2

It might be worth pointing out that LLVM already does this to some degree. Even with #[inline(never)], it will sometimes specialise functions to remove particular arguments entirely.

More generally, humans are really bad at making optimisation decisions; this seems like something the optimiser should be handling… which it already is.


#3

Right, but if in Rust you want something like the example I’ve shown (specialize away a little interpreter with partial compilation on demand), or you want more control on the optimizations done on your code, or you want more control across different compiler back-ends, then an argument annotation could be useful.

More generally, humans are really bad at making optimisation decisions;

I think a partial compilation annotation could be good if you know the semantics of your function is special enough for that need. The compiler knows less than the programmer about the semantics of the function.

After years of resistance against the idea, recently they have added a forced inline even in D (I think the compilation gives an error if the requested inlining isn’t possible).