Unit-only enum to string slice in Rust 1.86

Goals

I have a unit-only enum, and I want a simple, lightweight way to convert each variant to a string slice that is known at compile time. By "lightweight," I mean things like this:

  • Easy for the compiler to inline.
  • In terms of code size and execution speed, comparable to or better than the explicit lookup table implementation described below.

My particular setting has two extra features, which you can ignore if you want to give a more general answer:

  • The string slice is used for string formatting, and perhaps will only be used for that.
  • The variants' discriminants can serve as the indices of an array.

Existing suggestions

I've found a few suggestions for how to do this:

However, I can't tell how lightweight these implementations are (in the sense described above), and I'm curious about whether new Rust features might allow for nicer implementations.

Example implementations

These implementations all have the following code in common.

use SugarLevel::*;

enum SugarLevel {
    NoSugar = 0,
    HalfSugar = 1,
    FullSugar = 2,
}

If you paste the common code followed by the example-specific code into the Rust Playground and hit Run, the program should print:

Buckwheat tea, no sugar
Tamarind tea, half sugar
Milk tea, full sugar

Matching method

This is based on the existing suggestions described above. I'm not sure how lightweight it is, but it's nice and simple from the programmer's perspective.

use std::fmt;

impl SugarLevel {
    fn as_str(&self) -> &'static str {
        match self {
            NoSugar => "no sugar",
            HalfSugar => "half sugar",
            FullSugar => "full sugar",
        }
    }
}

impl fmt::Display for SugarLevel {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

fn main() {
    println!("Buckwheat tea, {NoSugar}");
    println!("Tamarind tea, {HalfSugar}");
    println!("Milk tea, {FullSugar}");
}

Explicit lookup table

This was suggested to me as something that might be especially lightweight, although it's less simple than I'd like from the programmer's perspective.

impl SugarLevel {
    const STR: [&str; 3] = [
        "no sugar",
        "half sugar",
        "full sugar",
    ];
}

fn main() {
    println!("Buckwheat tea, {}", SugarLevel::STR[NoSugar as usize]);
    println!("Tamarind tea, {}", SugarLevel::STR[HalfSugar as usize]);
    println!("Milk tea, {}", SugarLevel::STR[FullSugar as usize]);
}

You can implement the as_str method or the Display impl in terms of the array of strings; there's no reason to require the user to index it manually. That said, it's very likely that the match version generates essentially identical code to the array version when optimizations are enabled.
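As a minimal sketch of that suggestion, the array can stay private to the impl while as_str and Display do the indexing. The derive of Clone and Copy is an assumption added here so that as_str can take self by value; everything else reuses the names from the original post.

```rust
use std::fmt;

// Copy is an assumption of this sketch; it lets `self as usize` work by value.
#[derive(Clone, Copy)]
enum SugarLevel {
    NoSugar = 0,
    HalfSugar = 1,
    FullSugar = 2,
}

impl SugarLevel {
    // Indexed by discriminant, so the order must match the enum definition.
    const STR: [&'static str; 3] = ["no sugar", "half sugar", "full sugar"];

    fn as_str(self) -> &'static str {
        Self::STR[self as usize]
    }
}

impl fmt::Display for SugarLevel {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

fn main() {
    println!("Buckwheat tea, {}", SugarLevel::NoSugar);
    println!("Tamarind tea, {}", SugarLevel::HalfSugar);
    println!("Milk tea, {}", SugarLevel::FullSugar);
}
```

Callers then never see the array or the cast, so the table could later be swapped for a match without touching call sites.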


For this kind of question it’s good to view the generated assembly code. This gets you a (somewhat) definitive answer to how your particular code is getting compiled. The Playground can show it, and so can cargo-show-asm, though the results can vary with context such as the optimization level, the target, and the surrounding code.

I would generally write this kind of function with a match, but I did once have a case where such a match, whose results were small structs (a Vector3 kind of thing) rather than &str, was compiled into a jump table (jumping to code which constructs the result value) rather than a lookup table. This was inefficient, and it improved when I replaced it with an explicitly constructed lookup table (derived from a match so as to still get exhaustiveness checking, but one which only executes at compile time).
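One way such a table might be derived from a match is sketched below. This is not the code from that case; the names ALL, name, and TABLE are hypothetical, and the Copy derive is an assumption. The match runs only during constant evaluation, and as_str is a plain array index at run time.

```rust
#[derive(Clone, Copy)]
enum SugarLevel {
    NoSugar = 0,
    HalfSugar = 1,
    FullSugar = 2,
}

impl SugarLevel {
    // Hypothetical helper: every variant, maintained by hand.
    const ALL: [SugarLevel; 3] = [
        SugarLevel::NoSugar,
        SugarLevel::HalfSugar,
        SugarLevel::FullSugar,
    ];

    // Exhaustive match: adding a variant without a name is a compile error.
    const fn name(self) -> &'static str {
        match self {
            SugarLevel::NoSugar => "no sugar",
            SugarLevel::HalfSugar => "half sugar",
            SugarLevel::FullSugar => "full sugar",
        }
    }

    // Built once at compile time by running the match for each variant.
    const TABLE: [&'static str; 3] = {
        let mut table = [""; 3];
        let mut i = 0;
        while i < Self::ALL.len() {
            let level = Self::ALL[i];
            table[level as usize] = level.name();
            i += 1;
        }
        table
    };

    // At run time this is only an index into the precomputed table.
    fn as_str(self) -> &'static str {
        Self::TABLE[self as usize]
    }
}

fn main() {
    println!("Tamarind tea, {}", SugarLevel::HalfSugar.as_str());
}
```

In this sketch the exhaustiveness check covers only the match; the ALL list and the array length are still kept in sync manually, and a variant missing from ALL would leave an empty string in its slot.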


I never knew the Rust Playground could show the generated assembly! I tried the Wasm tool, since that's the target in my particular case. The outputs I get from the two comparison implementations below do look remarkably similar to my eyes (which have no experience with reading Wasm, and could easily miss glaring differences).

The func $test definitions in the two outputs differ only in the values of integer constants, which could reflect code organization differences. In particular, they each contain a single call instruction, which is a call to $core::fmt::write that I'm guessing comes from the format! macro in the test function. Maybe that means the code that converts the enums to string slices is getting inlined the way I was hoping for? The test code is so simple that I could imagine the conversion being done at compile time. If that's what's happening, I guess it could be obscuring differences that would appear in more realistic programs.

Implementations for compiler output comparison

Playground settings

Tool: Wasm
Profile: release
Rust version: 1.90.0
Rust edition (under advanced settings): 2024

Common code

#![crate_type = "cdylib"]

use std::ffi::CString;
use std::os::raw::c_char;

use SugarLevel::*;

enum SugarLevel {
    NoSugar = 0,
    HalfSugar = 1,
    FullSugar = 2,
}

Matching method

use std::fmt;

impl SugarLevel {
    fn as_str(&self) -> &'static str {
        match self {
            NoSugar => "no sugar",
            HalfSugar => "half sugar",
            FullSugar => "full sugar",
        }
    }
}

impl fmt::Display for SugarLevel {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

#[unsafe(no_mangle)]
extern "C" fn test() -> *mut c_char {
    let output = format!("Marmalade: {NoSugar}, {HalfSugar}, {FullSugar}");
    CString::new(output).unwrap().into_raw()
}

Explicit lookup table

impl SugarLevel {
    const STR: [&str; 3] = [
        "no sugar",
        "half sugar",
        "full sugar",
    ];
}

#[unsafe(no_mangle)]
extern "C" fn test() -> *mut c_char {
    let output = format!(
        "Croissant: {}, {}, {}",
        SugarLevel::STR[NoSugar as usize],
        SugarLevel::STR[HalfSugar as usize],
        SugarLevel::STR[FullSugar as usize],
    );
    CString::new(output).unwrap().into_raw()
}

Yes, this is a risk — that test code with constant inputs will be constant-folded into not having the algorithm left at all. The way I like to avoid that is to make sure that the function I'm examining takes the relevant data as parameters. In this case:

#[unsafe(no_mangle)]
fn test(s: SugarLevel) -> &'static str {
    s.as_str()
}

It doesn’t matter that this no_mangle function can't be readily used as part of your Wasm ABI surface — all that matters is that it gives non-constant input to as_str, so you can see how the actual algorithm is compiled.
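To compare the two implementations under the same approach, both can be placed side by side in one file, each behind its own parameter-taking function. This is a rough sketch: the function names test_match and test_table are made up for illustration, the Copy derive is an assumption, and the #[unsafe(no_mangle)] spelling needs a reasonably recent compiler (it is the form the 2024 edition requires).

```rust
#[derive(Clone, Copy)]
pub enum SugarLevel {
    NoSugar = 0,
    HalfSugar = 1,
    FullSugar = 2,
}

// Lookup table indexed by discriminant.
const STR: [&str; 3] = ["no sugar", "half sugar", "full sugar"];

// Match version: the input is a parameter, so it cannot be constant-folded away.
#[unsafe(no_mangle)]
pub fn test_match(s: SugarLevel) -> &'static str {
    match s {
        SugarLevel::NoSugar => "no sugar",
        SugarLevel::HalfSugar => "half sugar",
        SugarLevel::FullSugar => "full sugar",
    }
}

// Table version of the same conversion, for comparison in the same output.
#[unsafe(no_mangle)]
pub fn test_table(s: SugarLevel) -> &'static str {
    STR[s as usize]
}

fn main() {
    // Sanity check that both versions return the same strings.
    for s in [SugarLevel::NoSugar, SugarLevel::HalfSugar, SugarLevel::FullSugar] {
        assert_eq!(test_match(s), test_table(s));
    }
    println!("both versions agree");
}
```

Searching the generated output for the two symbol names then shows each conversion in isolation. The main function is only there to make the sketch runnable; for inspection in the Playground it can be dropped in favor of the crate type attribute used earlier in the thread.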

Or, instead, you can use std::hint::black_box() to discourage optimization based on a value. But then you still have to figure out which code in the function is the code you care about and which code isn't.
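A minimal sketch of the black_box variant follows, again with the Copy derive as an assumption. Note that std::hint::black_box is documented as best-effort, so it discourages constant folding rather than guaranteeing that none happens.

```rust
use std::hint::black_box;

#[derive(Clone, Copy)]
enum SugarLevel {
    NoSugar = 0,
    HalfSugar = 1,
    FullSugar = 2,
}

impl SugarLevel {
    fn as_str(self) -> &'static str {
        match self {
            SugarLevel::NoSugar => "no sugar",
            SugarLevel::HalfSugar => "half sugar",
            SugarLevel::FullSugar => "full sugar",
        }
    }
}

fn main() {
    // Hide the constant input so the optimizer treats it as unknown.
    let level = black_box(SugarLevel::HalfSugar);
    // Also mark the result as used so the conversion is not discarded.
    let text = black_box(level.as_str());
    println!("Tamarind tea, {text}");

    // The other variants go through the same path.
    for level in [SugarLevel::NoSugar, SugarLevel::FullSugar] {
        println!("{}", black_box(level).as_str());
    }
}
```

Here the conversion sits inside main alongside the formatting code, which is the downside mentioned above: the instructions of interest still have to be picked out from everything else in the function.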

Personally, I find Wasm much more difficult to skim than x86 assembly. Partly this is having less experience, but also, Wasm is a stack machine, so you have to keep track of what is on top of the stack at all times, rather than having named registers which keep the same value until overwritten by a later instruction.
