Allow Unicode superscripts in identifiers


#1

Hi all,

Currently Rust allows Unicode subscripts in identifiers when the non_ascii_idents feature is enabled, i.e. xᵢ = 2 is allowed. I’ve noticed that Unicode superscripts are not allowed, i.e. σ² = 1 is rejected. This is because superscript digits such as ² do not have the XID_Continue property.

I’m in the process of porting over some math-heavy code from Haskell, and Unicode subscripts and superscripts make the code much prettier and more readable. As my first contribution to the great open source project that is Rust, I would like to add support for Unicode superscripts by allowing an identifier to be extended by any character that either has the XID_Continue property or is a superscript [1].
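To make the proposed rule concrete, here is a minimal sketch of the check, not the actual rustc lexer code. All function names are hypothetical, the XID_Continue test is only a rough stand-in built from std's `char` methods (a real implementation would use the Unicode tables), and for brevity only the superscript digits ⁰ to ⁹ are covered rather than the whole block from [1].

```rust
/// Rough stand-in for the XID_Continue property (assumption: a real
/// lexer would consult the Unicode derived-property tables instead).
fn is_xid_continue_approx(c: char) -> bool {
    c == '_' || c.is_alphabetic() || c.is_ascii_digit()
}

/// Superscript digits ⁰ to ⁹. Note that ¹, ² and ³ live in Latin-1
/// Supplement, separate from the rest of the superscript block.
fn is_superscript_digit(c: char) -> bool {
    matches!(
        c,
        '\u{00B9}' | '\u{00B2}' | '\u{00B3}' | '\u{2070}' | '\u{2074}'..='\u{2079}'
    )
}

/// Proposed rule: a continuation character is XID_Continue OR a superscript.
fn is_ident_continue_proposed(c: char) -> bool {
    is_xid_continue_approx(c) || is_superscript_digit(c)
}

/// Checks a whole identifier using the given continuation predicate.
/// The start rule is simplified to "underscore or alphabetic".
fn is_ident(s: &str, is_continue: fn(char) -> bool) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) if first == '_' || first.is_alphabetic() => chars.all(is_continue),
        _ => false,
    }
}

fn main() {
    for name in ["xᵢ", "σ²", "²σ"] {
        println!(
            "{name}: current-ish = {}, proposed = {}",
            is_ident(name, is_xid_continue_approx),
            is_ident(name, is_ident_continue_proposed)
        );
    }
}
```

Under these assumptions xᵢ is accepted by both rules, σ² only by the proposed one, and ²σ by neither, since a superscript would only be allowed to continue an identifier, not start one.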

What do you think about this change? Are you in support or do you have any concerns?

[1] https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts


#2

My fundamental problem with this sort of thing is that it yields code that’s incredibly hard for some people to actually work with.

I mean, I can’t type that… the lowercase sigma, the subscript i, the superscript 2; I have no way of doing that outside of manual copy and paste.

I wrote a language once that allowed it, and I really regretted it; it just made life harder.

That said, there are some cases (like transcribing algorithms) where it can be useful. Perhaps if there was a separate “only for local bindings and non-pub items” setting? Then, at least, you wouldn’t have public APIs with this in them. 😛


#3

I agree that it makes the code harder to write, but it can make it much easier to read, which is arguably more important.


#4

I’m sure the designers of APL had similar thoughts. That thinking basically killed the language until someone came along and replaced all the lovely symbols with stuff you didn’t need a special keyboard for.


#5

We have an open issue in rust-clippy regarding Unicode in strings, but we did not even think about identifiers.