Allow Unicode superscripts in identifiers

Hi all,

Currently, Rust allows Unicode subscripts in identifiers when non_ascii_idents is turned on, e.g. xᵢ = 2 is allowed. I've noticed that Unicode superscripts are not allowed, e.g. σ² = 1 is rejected. This is because superscript characters such as ² do not have the XID_Continue property.

I'm in the process of porting over some math-heavy code from Haskell, and Unicode subscripts and superscripts make the code much prettier and more readable. As my first contribution to the great open source project that is Rust, I would like to add support for Unicode superscripts by allowing an identifier to be extended by any character that either has the XID_Continue property or is a superscript [1].
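To make the proposed rule concrete, here is a rough sketch of the check I have in mind. This is not how rustc's lexer is written. The function names are made up for illustration, and because the standard library has no XID tables I approximate XID_Start and XID_Continue with `char::is_alphabetic` plus ASCII digits and the underscore. The superscript set is also an assumption and only covers the digits ⁰ to ⁹.

```rust
/// Hypothetical helper: true for the superscript digits ⁰ through ⁹.
/// Assumption: the real proposal might cover more superscript characters.
fn is_superscript_digit(c: char) -> bool {
    matches!(
        c,
        '\u{2070}' | '\u{00B9}' | '\u{00B2}' | '\u{00B3}' | '\u{2074}'..='\u{2079}'
    )
}

/// Approximation of XID_Start (assumption: std does not expose XID tables).
fn is_start_approx(c: char) -> bool {
    c == '_' || c.is_alphabetic()
}

/// Approximation of XID_Continue (same assumption as above).
fn is_continue_approx(c: char) -> bool {
    c == '_' || c.is_alphabetic() || c.is_ascii_digit()
}

/// Returns true if `s` looks like an identifier under the approximations
/// above. With `allow_superscripts` set, superscript digits are also
/// accepted in any position after the first character.
fn is_identifier(s: &str, allow_superscripts: bool) -> bool {
    let mut chars = s.chars();
    // The first character must be a valid start; an empty string is rejected.
    match chars.next() {
        Some(first) if is_start_approx(first) => {}
        _ => return false,
    }
    // Every remaining character must be a valid continuation.
    chars.all(|c| is_continue_approx(c) || (allow_superscripts && is_superscript_digit(c)))
}
```

With these approximations, `is_identifier("xᵢ", false)` is accepted either way, while `is_identifier("σ²", false)` is rejected and `is_identifier("σ²", true)` is accepted. A superscript is still not allowed as the first character.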

What do you think about this change? Are you in support or do you have any concerns?

[1] Unicode subscripts and superscripts - Wikipedia

My fundamental problem with this sort of thing is that it yields code that's incredibly hard for some people to actually work with.

I mean, I can't type that... the lowercase sigma, the subscript i, the superscript 2; I have no way of doing that outside of manual copy and paste.

I wrote a language once that allowed it, and I really regretted it; it just made life harder.

That said, there are some cases (like transcribing algorithms) where it can be useful. Perhaps if there were a separate "only for local bindings and non-pub items" setting? Then, at least, you wouldn't have public APIs with this in them. :stuck_out_tongue:

I agree that it makes the code harder to write, but it can make it much easier to read, which is arguably more important.

I'm sure the designers of APL had similar thoughts. That thinking basically killed the language until someone came along and replaced all the lovely symbols with stuff you didn't need a special keyboard for.

We have an open issue in rust-clippy regarding Unicode in strings, but we had not even thought about identifiers.