Applying arithmetic operations to the character data type

In programming languages like Java and C++, you can apply arithmetic operations like addition directly to the character data type.

e.g., in Java:

char addition = 'a' + 1; // 'a' + 1 is an int constant that fits in char, so implicit narrowing is allowed
System.out.println(addition); // prints "b"

In Java, char is an integral type (the language even defines it as a subtype of int), so arithmetic on it works directly.

In C++:

#include <iostream>

int main() {
    char addition = 'a' + 1; // the int result is narrowed back to char
    std::cout << addition << std::endl; // prints "b"
    return 0;
}

But when it comes to Rust:

fn main() {
    let addition: char = 'a' + 1;
    println!("{}", addition);
}

the compiler reports an error:

2 |     let addition:char = 'a'+1;
  |                         ---^- {integer}
  |                         |
  |                         char

"you need to convert to binary for incrementing, then convert back to char "
eg:

fn main() {
    let addition: char = (b'a' + 1) as char;
    println!("{}", addition); // prints "b"
}
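(A minimal sketch of the same trick applied to a char variable, assuming the character is ASCII so the cast to u8 doesn't truncate:)

fn main() {
    let c = 'a';
    // char -> u8 truncates non-ASCII chars, so this assumes ASCII input;
    // u8 -> char, on the other hand, is always a valid conversion
    let next = (c as u8 + 1) as char;
    println!("{}", next); // prints "b"
}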

Why can we simply increment a character by adding 1 to it in languages like Java and C++, while in Rust we need to convert to bytes? Is it for some safety reason that the Rust compiler doesn't allow this, or is it because of using "let" before the variable?
Or is it just that the Rust compiler doesn't automatically convert between data types, so we have to do it manually?


Can anyone explain clearly how this works?

Because other languages don't really care what the characters are and gladly allow you to do any kind of nonsense. Rust, on the other hand, wants you to be explicit whenever you want to do something strange and/or potentially erroneous.
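As a minimal sketch of what that explicitness looks like in practice (assuming the goal is just to step through successive characters): a char range works, because char implements Step and the iterator skips over the surrogate gap.

fn main() {
    // char ranges yield only valid Unicode scalar values;
    // the Step impl for char steps over the surrogate range entirely
    for c in 'a'..='e' {
        print!("{} ", c); // a b c d e
    }
    println!();
}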


A Rust char is a Unicode scalar value, so if you could do unrestricted arithmetic, you could generate an invalid value. Example:

fn main() {
    let letter_a = 'a';
    let value = letter_a as u32;
    let add_some = value + 0xD800;
    let back_to_char_maybe = char::from_u32(add_some);
    // No good: Invalid `char` value
    assert_eq!(back_to_char_maybe, None);
}
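Building on that, a minimal sketch of a checked increment: go through u32 and let the Option returned by char::from_u32 report when the result would be invalid.

fn next_char(c: char) -> Option<char> {
    // None for surrogates (0xD800..=0xDFFF) and for values above char::MAX
    char::from_u32(c as u32 + 1)
}

fn main() {
    assert_eq!(next_char('a'), Some('b'));
    // U+D7FF + 1 lands exactly on the surrogate range, so there is no valid successor
    assert_eq!(next_char('\u{D7FF}'), None);
}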

However, every value that a u8 (byte) can take on is a valid Unicode scalar value, so this works out:

fn main() {
    // ASCII (< 128) not required
    let byte_value = b'a' + 128;
    // Convert to Unicode scalar value
    let scalar_value = byte_value as char as u32;
    // Still equal
    assert_eq!(byte_value as u32, scalar_value);
}

(But beware: String and str use UTF-8, the UTF-8 encoding of bytes above 127 is two bytes long, and other scalar values can encode to 3-4 bytes. A String is not a collection of chars.)
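A small sketch making that caveat concrete: byte length and char count diverge as soon as a character falls outside ASCII.

fn main() {
    let s = String::from("é"); // U+00E9, a single char
    assert_eq!(s.chars().count(), 1);
    assert_eq!(s.len(), 2); // String::len counts UTF-8 bytes, not chars
}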

