Will Rust implement compile-time function evaluation?

I'm aware of lazy_static, but it doesn't evaluate at compile time, and I would be really happy if Rust implemented a native way to declare constants whose value depends on the result of a function.

Has the Rust team spoken about potentially adding this feature? I know it's far from trivial, but the compiler already performs extensive analysis of the validity of data at compile time.

1 Like

Rust already has compile-time function evaluation. Mark your functions as const fn and you can use them in constants.

const fn foo() -> i32 {
    let mut i = 100;
    while i % 2 == 0 {
        i /= 2;
    }
    i
}

const C: i32 = foo();

fn main() {
    println!("{C}");
}

Not every feature of Rust is supported (notably, traits and floating-point arithmetic are currently missing), so it can sometimes be difficult to find a way to express your code as const fn. Work is being done, though, to expand the abilities of const evaluation and the number of functions in the standard library that are marked const fn.

13 Likes

Rust already has const fns. The const story is not complete, but non-trivial value-level compile-time computation is already possible today.

2 Likes

The const value I was trying to generate was a HashMap that translated letter chars to the corresponding integers, built by defining an alphabet constant &str and collecting its enumeration. I'm guessing the first roadblock is that the Iterator trait isn't available during const evaluation, so it can't iterate over the alphabet.

It's good to know that this will probably be addressed, thank you.

1 Like

I forgot to disambiguate: I wasn't talking about defining const fns, but rather about using the results of standard functions like .chars() at compile time to generate more complex constants.

A HashMap allocates memory, so it's very hard to make it const-compatible. Are you sure u32::from(the_char) - u32::from('a') isn't good enough?
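As a rough sketch of that arithmetic approach (the function name letter_index and the restriction to lowercase ASCII are assumptions made for illustration, not something stated in the thread):

```rust
/// Maps a lowercase ASCII letter to its position in the alphabet ('a' -> 0).
/// Returns None for any character outside 'a'..='z'.
/// `letter_index` is a hypothetical helper name.
fn letter_index(c: char) -> Option<u32> {
    if c.is_ascii_lowercase() {
        // Subtract the code point of 'a' from the code point of `c`.
        Some(u32::from(c) - u32::from('a'))
    } else {
        None
    }
}
```

Calling letter_index('c') would give Some(2), and no table or allocation is involved.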

If a table is needed, one could use an array whose indices are character codes (chars converted to usize).
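A minimal sketch of such a table, built entirely at compile time. The names ALPHABET, NOT_FOUND, build_table and lookup are made up for this example, and it assumes the alphabet is ASCII-only. Because Iterator can't be used in a const fn, the loop walks the bytes with while instead of for:

```rust
// Hypothetical alphabet; assumed to contain only ASCII characters.
const ALPHABET: &str = "abcdefghijklmnopqrstuvwxyz";
// Sentinel marking ASCII codes that are not in the alphabet.
const NOT_FOUND: u8 = u8::MAX;

/// Builds a 128-entry table indexed by ASCII code. Each slot holds the
/// position of that character in `alphabet`, or NOT_FOUND.
const fn build_table(alphabet: &str) -> [u8; 128] {
    let bytes = alphabet.as_bytes();
    let mut table = [NOT_FOUND; 128];
    let mut i = 0;
    // `for` needs Iterator, so a `while` loop is used in const context.
    while i < bytes.len() {
        // A non-ASCII byte would index out of bounds here, which turns
        // into a compile error when this runs during const evaluation.
        table[bytes[i] as usize] = i as u8;
        i += 1;
    }
    table
}

// Evaluated by the compiler; no work is left for run time.
const LETTER_TABLE: [u8; 128] = build_table(ALPHABET);

/// Run-time lookup into the precomputed table.
fn lookup(c: char) -> Option<u8> {
    let code = c as usize;
    if code < LETTER_TABLE.len() && LETTER_TABLE[code] != NOT_FOUND {
        Some(LETTER_TABLE[code])
    } else {
        None
    }
}
```

The table costs 128 bytes of static data, and unlike a HashMap it needs no allocator, so it is fine as a const.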

In the end I used the char-to-Unicode cast, but I initially tried to avoid it simply because I found the HashMap attempt more elegant.

Why can't a constant object be put on the stack?
EDIT: Ah, it's probably because it's allocated with more space than necessary, so you would also need a standard interface to ask it to "prune" itself to the minimum size to avoid potentially enormous memory inefficiency, isn't it?

HashMap can't just use the stack:

  • it needs its internal storage to grow dynamically, and doing that on the stack is either impossible or pretty ugly.

  • Drop of HashMap will always call the deallocator, and the deallocator always expects to get a pointer to the heap. So even if you swizzled HashMap's storage to be on the stack, it would crash when it tries to free the stack. There's no "is this a stack?" check on that path, because that would be a small runtime overhead for all of the cases when it isn't.

  • The heap/allocator running at compile time isn't the same as the one running at run time. So if a compile-time HashMap was frozen and left in the executable until it's run, its pointers would be bogus, pointing to the wrong heap in the wrong program. That's crashy again.

4 Likes

Take a look at phf
For other specific cases you always have the option of adding to build.rs

4 Likes

Another way of restating kornel's point:

At const-evaluation-time, there's nothing fundamentally wrong with allowing allocation. (In fact, recent C++ standards have constexpr allocation.) What cannot happen is that the const allocations make it into runtime code.

In theory, you could do this by just taking the heap at const-time and splatting it into the static data section of the executable. Then any read-only use of the memory will work just fine.

The problem comes when you try to modify, say, the HashMap at runtime: it doesn't know that you've replaced its allocations with static memory, so it will try to modify and/or reallocate it.

You might say that as a static, you can only use it by-&, so it's not mutable. But shared mutability exists (e.g. Mutex), so that isn't a sufficient guarantee to replace dynamic heap pointers with static data pointers. And even if no shared mutability at all is made available, the type could still try to deallocate the heap pointer, because it created it as a heap pointer, so that's completely valid.

The way around this[1] is essentially what lazy_static and OnceCell do: when the instance is first requested, just create it into the normal program heap.


  1. without essentially rewriting the type
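A hedged sketch of that "create it on first request" pattern. It uses std::sync::OnceLock from the standard library rather than the lazy_static or once_cell crates named above, and ALPHABET and letter_map are names invented for the example:

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Hypothetical alphabet used to fill the map.
const ALPHABET: &str = "abcdefghijklmnopqrstuvwxyz";

/// Returns a shared map from letter to index. The map is built on the
/// ordinary run-time heap the first time this is called, then reused.
fn letter_map() -> &'static HashMap<char, usize> {
    // OnceLock::new is a const fn, so the empty cell can live in a static.
    static MAP: OnceLock<HashMap<char, usize>> = OnceLock::new();
    MAP.get_or_init(|| {
        // Runs at most once, at run time, where Iterator is available.
        ALPHABET.chars().enumerate().map(|(i, c)| (c, i)).collect()
    })
}
```

Only the empty cell is a compile-time value here; the HashMap itself never exists during const evaluation, which sidesteps the allocator problems described above.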

7 Likes

Most of my previous experience with compile-time execution has been with Lisp and Clojure macro systems. Both have the benefit of having a "VM".

One thing I am curious about for Rust compile time execution is: how do people handle functions that return:

pub struct Foo {
  x: usize,
  y: u64
}

in situations where (1) we are compiling on x86_64 where usize = 64 bits and (2) runtime is on wasm, where usize is 32 bits.

Doesn't const-evaluation use the same target as normal runtime code? I don't think it matters what usize is on x86-64 if you are targeting WASM. It'll just be 32 bits wide in const code, too.

Ah, I see, I think I had an incorrect model of Rust compilation. Put another way, are you stating:

At the stage where we do compile-time evaluation, the programmer has already had to specify the target arch, and therefore the compiler can do the evaluation as if it were on that arch?

Yes. Constant evaluation is very carefully designed to always give the same answer as run-time evaluation.

(And also, since code can have #[cfg(target_arch = "...")], the compiler needs to know the architecture to even decide which code to compile.)
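A small sketch of what this means in practice, reusing the Foo struct from the question. The constant names are invented for the example, and it uses target_pointer_width as the cfg key instead of target_arch:

```rust
use std::mem::size_of;

pub struct Foo {
    x: usize,
    y: u64,
}

// Both constants are computed by the compiler using the TARGET's layout,
// so they match what the same expressions yield at run time on that target.
const USIZE_BYTES: usize = size_of::<usize>();
const FOO_BYTES: usize = size_of::<Foo>();

// The compiler picks exactly one of these based on the target, before
// any const evaluation happens.
#[cfg(target_pointer_width = "64")]
const POINTER_WIDTH: u32 = 64;
#[cfg(target_pointer_width = "32")]
const POINTER_WIDTH: u32 = 32;
#[cfg(target_pointer_width = "16")]
const POINTER_WIDTH: u32 = 16;
```

When cross-compiling from x86_64 to a 32-bit wasm target, USIZE_BYTES would be evaluated as 4, not 8, because the host's usize never enters the picture.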

4 Likes

Note that build.rs is a step before compile time, and it is an executable run on the host's platform (as opposed to the target).

2 Likes

Yeah, I think the thing that confused me was mental analogy to procedural macros:

procedural macros: compile on HOST, target HOST, run on HOST, generate an AST that is fed back into the compile process

const fn: compile on HOST, target TARGET, run simulating TARGET?, generate 'values'

I think one point of clarification/nuance is that const fns (when running to generate a const value) don't really get "compiled" - they get put into a MIR interpreter in the compiler, but there's no assembly or even low-level IR produced for that execution. The MIR interpreter behaves the same (for a given TARGET) regardless of HOST.

2 Likes

Sorry if this question sounds dumb -- this is new to me. So the MIR interpreter runs with options like mir --usize=32 and mir --usize=64? (I've just never seen an 'interpreter' allow such config options.)

Unfortunately I am not sure how the internals work, but yes, for instance Miri can take the target to emulate as a command-line parameter:

https://github.com/rust-lang/miri#cross-interpretation-running-for-different-targets

(The const fn interpreter in rustc shares code with Miri.)