Is it safe to cast `char.len_utf8()` to `u8`? If it is, why is the return type `usize` in the first place?

If I'm not mistaken, UTF-8 encoding allows a code point to take at most 6 bytes, and that number does not require a whole `usize`.

Yes, that's safe. The documentation also specifies that the value will always be between 1 and 4, inclusive. (Not 6, actually.) As to why it's a `usize`, I would assume the reason is that code which needs the length of a `char` in UTF-8 is very often some kind of parsing code, or similar code, that needs the length of the `char` in order to increase or decrease some index of type `usize` into a string or byte buffer. If it returned `u8`, such use cases would need to convert it to `usize` before being able to add the result to (or subtract it from) another `usize`.
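To make that concrete, here is a minimal sketch of both situations. The function names `byte_offset_after` and `utf8_len_as_u8` are made up for illustration and are not part of the standard library; only `char::len_utf8` and `str::chars` are real std APIs.

```rust
/// Hypothetical helper: returns the byte offset just past the first `n`
/// chars of `s`, or `s.len()` if `s` has fewer than `n` chars.
fn byte_offset_after(s: &str, n: usize) -> usize {
    let mut offset: usize = 0;
    for c in s.chars().take(n) {
        // `len_utf8()` already returns `usize`, so it can be added to the
        // index directly, with no conversion.
        offset += c.len_utf8();
    }
    offset
}

/// Hypothetical helper: narrows the encoded length to `u8`.
/// The cast cannot truncate because the value is always in 1..=4.
fn utf8_len_as_u8(c: char) -> u8 {
    c.len_utf8() as u8
}
```

The offset returned by `byte_offset_after` always lands on a char boundary, so it can be used to slice the string, e.g. `&s[..byte_offset_after(s, 3)]`. If you only want to store the length compactly, the `as u8` cast in `utf8_len_as_u8` is fine.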

