Char::to_digit musings

leonardo · June 6, 2020, 12:07pm

Hi, this is the source of char::to_digit():

pub fn to_digit(self, radix: u32) -> Option<u32> {
    assert!(radix <= 36, "to_digit: radix is too high (maximum 36)");

    // the code is split up here to improve execution speed for cases where
    // the `radix` is constant and 10 or smaller
    let val = if radix <= 10 {
        match self {
            '0'..='9' => self as u32 - '0' as u32,
            _ => return None,
        }
    } else {
        match self {
            '0'..='9' => self as u32 - '0' as u32,
            'a'..='z' => self as u32 - 'a' as u32 + 10,
            'A'..='Z' => self as u32 - 'A' as u32 + 10,
            _ => return None,
        }
    };

    if val < radix { Some(val) } else { None }
}

I'd like to know why it isn't more like this:

pub fn to_digit(self, radix: u8) -> Option<u8> {
    if radix > 36 { return None; }

    // the code is split up here to improve execution speed for cases where
    // the `radix` is constant and 10 or smaller
    let val = if radix <= 10 {
        match self {
            '0' ..= '9' => self as u8 - b'0',
            _ => return None,
        }
    } else {
        match self {
            '0' ..= '9' => self as u8 - b'0',
            'a' ..= 'z' => self as u8 - b'a' + 10,
            'A' ..= 'Z' => self as u8 - b'A' + 10,
            _ => return None,
        }
    };

    if val < radix { Some(val) } else { None }
}

That can be used as:

fn main() {
    println!("{:?}", '8'.to_digit(10));
    let a = [10, 20];
    if let Some(d) = '1'.to_digit(10) {
        println!("{:?}", a[usize::from(d)]);
    }
}
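
An inherent method cannot be added to char outside the standard library, so one way to try the proposal is an extension trait. This is a minimal sketch: the trait name ToDigitU8 and the method name to_digit_u8 are hypothetical (a separate name is needed because the inherent char::to_digit would otherwise take precedence), and the body follows the proposed version above.

```rust
// Hypothetical extension trait; `ToDigitU8` and `to_digit_u8` are made-up
// names for illustration, not part of the standard library.
trait ToDigitU8 {
    fn to_digit_u8(self, radix: u8) -> Option<u8>;
}

impl ToDigitU8 for char {
    fn to_digit_u8(self, radix: u8) -> Option<u8> {
        // Unlike the standard method, an out-of-range radix yields None
        // instead of a panic.
        if radix > 36 {
            return None;
        }

        // Every arm matches ASCII characters only, so `self as u8`
        // never truncates here.
        let val = if radix <= 10 {
            match self {
                '0'..='9' => self as u8 - b'0',
                _ => return None,
            }
        } else {
            match self {
                '0'..='9' => self as u8 - b'0',
                'a'..='z' => self as u8 - b'a' + 10,
                'A'..='Z' => self as u8 - b'A' + 10,
                _ => return None,
            }
        };

        if val < radix { Some(val) } else { None }
    }
}

fn main() {
    println!("{:?}", '8'.to_digit_u8(10)); // prints Some(8)
    let a = [10, 20];
    if let Some(d) = '1'.to_digit_u8(10) {
        // u8 -> usize is lossless, so `usize::from` works without `as`.
        println!("{:?}", a[usize::from(d)]); // prints 20
    }
}
```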

There are a few differences:

Instead of panicking, it returns None if radix > 36.
It contains fewer "as" casts.
It returns an Option<u8>. This is handy because user code can convert the digit safely and losslessly to usize, u32 and some other types with ::from(), without "as" casts.
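
To make the conversion point concrete, here is a small sketch contrasting the two return types. The helper names widen_u8 and convert_u32 are hypothetical and exist only for this illustration.

```rust
// In the prelude since the 2021 edition; imported explicitly for older editions.
use std::convert::TryFrom;

// Hypothetical helper: a u8 digit widens losslessly, so `From` is available
// for all three targets and no `as` cast is needed.
fn widen_u8(d: u8) -> (usize, u32, i32) {
    (usize::from(d), u32::from(d), i32::from(d))
}

// Hypothetical helper: a u32 digit has no `From` impl into usize (usize may be
// narrower than 32 bits on some targets) or into i32 (large values would not
// fit), so the choices are an `as` cast or a fallible `TryFrom`.
fn convert_u32(d: u32) -> (usize, i32) {
    let index = d as usize;
    let signed = i32::try_from(d).expect("a digit is at most 35, so it fits in i32");
    (index, signed)
}

fn main() {
    // The standard char::to_digit returns Option<u32>.
    let d = '7'.to_digit(10).unwrap();
    println!("{:?}", convert_u32(d));
    println!("{:?}", widen_u8(7));
}
```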

What do you think?

TomP · June 6, 2020, 12:15pm

Shouldn't to_digit2() be try_to_digit()?

leonardo · June 6, 2020, 1:06pm

(I've renamed to_digit2 to to_digit.) I don't see why you suggest a different name, but in any case I was asking about the semantics/API.

TomP · June 6, 2020, 1:41pm

My mistake. I was still half-asleep and thought that you had switched the signature from returning an int to returning an Option<int>. That refactoring transformation is usually accomplished by prefixing try_ to the name of the function that panics.

trentj · June 6, 2020, 1:44pm

Returning None instead of panicking seems like a bad idea, because passing a radix greater than 36 to to_digit is a mistake you should want to be warned of as soon as possible.
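
For reference, a small sketch of the current behaviour being defended: the standard char::to_digit panics on a radix above 36, so the mistake surfaces at the call site. The helper std_to_digit_panics is a hypothetical name used only to observe the panic.

```rust
use std::panic;

// Hypothetical helper: reports whether the standard `char::to_digit`
// panics for the given radix.
fn std_to_digit_panics(c: char, radix: u32) -> bool {
    panic::catch_unwind(|| c.to_digit(radix)).is_err()
}

fn main() {
    // A valid radix returns normally.
    println!("{}", std_to_digit_panics('1', 10)); // prints false
    // A radix above 36 trips the assertion shown in the source above
    // (the panic message is also written to stderr).
    println!("{}", std_to_digit_panics('1', 37)); // prints true
}
```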

I'd guess it's very rare to use to_digit with a non-constant radix. 99% of the time it will be either 10 or 16, and probably 99% of the rest of the time it will be 8, 12, 20, or some other constant value. In my ideal world, writing char::to_digit(c, 37) would be a compile-time failure, but since Rust doesn't support that at the moment, panicking is the next best thing.
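
Const generics and compile-time assertions were stabilized after this reply was written, so a check of this kind can now be sketched in user code. The names RadixCheck and to_digit_const below are hypothetical, not a standard-library API, and the error is reported when the function is instantiated during a full build.

```rust
// Hypothetical helper type that carries the radix as a const parameter.
struct RadixCheck<const RADIX: u32>;

impl<const RADIX: u32> RadixCheck<RADIX> {
    // Evaluated at compile time for each concrete RADIX that is referenced.
    const OK: () = assert!(RADIX <= 36, "to_digit: radix is too high (maximum 36)");
}

// Hypothetical wrapper: the radix is a const generic instead of an argument.
fn to_digit_const<const RADIX: u32>(c: char) -> Option<u32> {
    // Referencing the constant forces the assertion to be evaluated,
    // so an invalid RADIX becomes a build error rather than a runtime panic.
    let () = RadixCheck::<RADIX>::OK;
    c.to_digit(RADIX)
}

fn main() {
    println!("{:?}", to_digit_const::<16>('f')); // prints Some(15)
    println!("{:?}", to_digit_const::<10>('f')); // prints None
    // to_digit_const::<37>('1'); // uncommenting this line should fail the build
}
```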

You have a point about the return type. Converting to i32 is probably pretty common and is awkward. It's possible that the API designers were thinking of eventually supporting larger radices that could handle larger digits, such as 'ↂ' (U+2182 ROMAN NUMERAL TEN THOUSAND), which would not fit in a u8. Of course, it's also possible this is simply an oversight.
