Little bit of evil == rustc nightmare?

4 billion if statements | Blabbin’ inspired me to do similar with rust.

It's terrible, and with some broken logic in it because I'm just experimenting a little on the fly (i.e if you're outside the generated bounds, it assumes the number isn't even, without throwing a warning)

What I'm finding interesting is this seems to be somewhat of a pathological case for the compiler.
I wrote the first attempt as a chain of if / else if, which rustc barfed on the stack created, which fine, sure. I'd guess singular if would be fine, but haven't tried that approach yet, because I figured match would make more sense.

What is interesting to me is rustc seems to be really struggling with the compilation. I've had it compiling for close on 2 days on an (admittedly not wonderfully fast) VM. I don't imagine there are many (any?) legitimate use cases for a compilation time match on over 4 million items, but who knows.

Continuing on the pointless exercise, I'm going to whip up a long line of "if"s and a hashmap form under separate branches.

edit: Mostly I did this because I'd been meaning to look at build.rs for a while. I haven't had any legitimate need for compilation time code generation for the things I've been working on.

3 Likes

hashset and "long series of if statement" approaches added to branches. Both OOM on the 16GB RAM machine that I have to hand for testing!

Same but using recursive types.

My first attempt used const_generics but I could't get it to compile.

This could be made nicer but I got lazy:

(just realized that this doen't work for n == 0 but oh well)

Some very quick compilation performance, in terms of increasing number of entries to limit (ignoring the _ entry)

limit debug release
0 0.21599388699542 0.243635614002415
10000 0.356581996995374 2.63057388699963
20000 0.633751568995649 8.95849192499736
30000 1.0519599649997 20.2605597779984
40000 1.65016355199623 36.0358153490015
50000 2.57371079400036 61.2215769510003
60000 3.85266107099596 87.0177086340045
70000 5.55039826399297 110.905351900001
80000 7.81826143500075 155.121035224998
90000 10.4216318859981 185.115920327
100000 13.2823753760022 254.703925580994
110000 16.6019436899951 314.851370926997
120000 20.0404017239998 368.066561163003
130000 23.8103672759971 405.179086242999
140000 28.0492201300003 473.558604172002
150000 32.3968182820026 541.930442707999
160000 38.0172443449992 646.058881256002

Debug is just straight "rustc" invocation, Release is "rustc -O". Wonder if there is something quadratic in there?

Generated via rough rustc compilation timer for https://users.rust-lang.org/t/little-bit-of-evil-rustc-nightmare/104629/3 · GitHub ... I didn't think it was worth continuing to the full 1,000,000, I'd got enough data.

Every doubling of match count is resulting in a quadrupling of compilation time (give or take)
10000 -> 20000 = 3.4x
20000 -> 40000 = 4.02x
40000 -> 80000 = 4.3x
80000 -> 160000 = 4.16x

Dataflow analysis is typically quadratic in function length. Borrow checking is an example of a dataflow analysis, so it's likely also quadratic in the general case. This specific function is simple enough that one could expect it to be faster than quadratic to compile, but also it's artificial enough that it may not be worth to optimize for.

Also, afaik many of the more complex LLVM optimization passes are also quadratic.

You can try gathering the compiler profiling data, see documentation.

2 Likes

Since you're using pattern matching, have you considered a version that emits the following:

pub fn is_even(number: u32) -> bool {
    match number {
        0 | 2 | 4 | 6 | 8 | 10 | ... => true,
        _ => false
    }
}

Just a thought...

That's interesting. Didn't expect that to make a difference, but it does. Optimised compilation is comparable with non-optimised compilation when I use the 0 | 2 | 4... approach.