std::ops precedence change

I'm overloading std::ops::{Add, Mul, BitXor}. I want:

  • BitXor to represent power
  • Add and Mul to represent addition and multiplication
  • BitXor to take precedence over Mul when both appear in the same expression without parentheses

#[test]
fn precedence() {
    assert_eq!(1 + 2 * (3 ^ 2), 19); /* succeeds */
    assert_eq!(1 + 2 * 3 ^ 2, 19); /* fails with: 37 != 19 */
}

How can I force this precedence?

I'm pretty sure you can't override the precedence order Rust uses by default, because it would require the compiler to parse expressions differently depending on information from the trait system. rustc definitely does not interleave parsing and type inference in this way.
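To make the parsing concrete, here is a minimal sketch, assuming a hypothetical wrapper type `Num` (not from the original post) whose `BitXor` impl performs integer exponentiation. Whatever `bitxor` does, Rust groups `a + b * c ^ d` as `(a + b * c) ^ d`, because `^` binds more loosely than both `+` and `*`.

```rust
use std::ops::{Add, BitXor, Mul};

/// Hypothetical wrapper type, introduced only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Num(i64);

impl Add for Num {
    type Output = Num;
    fn add(self, rhs: Num) -> Num {
        Num(self.0 + rhs.0)
    }
}

impl Mul for Num {
    type Output = Num;
    fn mul(self, rhs: Num) -> Num {
        Num(self.0 * rhs.0)
    }
}

impl BitXor for Num {
    type Output = Num;
    // `^` is repurposed as exponentiation; the exponent is assumed non-negative.
    fn bitxor(self, rhs: Num) -> Num {
        Num(self.0.pow(rhs.0 as u32))
    }
}

fn main() {
    let (one, two, three) = (Num(1), Num(2), Num(3));

    // With explicit parentheses the power is applied first.
    assert_eq!(one + two * (three ^ two), Num(19));

    // Without them, the grouping is fixed by the grammar, not by the impls.
    assert_eq!(one + two * three ^ two, (one + two * three) ^ two);
    assert_eq!(one + two * three ^ two, Num(49));
}
```

In this sketch the unparenthesized form evaluates to 49, since the sum and product are computed before the power is applied.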

This is the first time I've faced an issue where the parser limits the versatility of the language.

Well, I think parentheses are okay, and legible enough.

Out of curiosity, do you know whether the same thing happens in other languages?

Haskell is the only language I know of that allows defining custom infix operators and specifying their precedence for parsing; it has a special form for this, which looks like infixr 7 +^ (this declares +^ as a right-associative infix binary operator with precedence 7 on a scale that runs from 0 to 9). The standard operators +, *, etc. are declared this way in the standard library, so (I think) if you don't import them from there you can declare them independently with different precedences in your own module.

Bash is an example of a language implementation that interleaves parsing and even tokenization with execution of the parsed code, see e.g. http://www.oilshell.org/blog/2016/10/20.html and http://www.aosabook.org/en/bash.html (section 3.4). I think this is generally viewed as a bad thing.

You mean it limits your ability to mask what your code is doing via redefining operator precedence, thereby making it harder to conceal malware? :sweat_smile: If you consider the statistics that reading code, by you and others, occurs much more frequently than writing it, and that others will likely eventually need to maintain your code unless it's personal throw-away, then Rust's position that you can't trivially obfuscate what you are writing by redefining operator precedence makes total sense.

I'm not sure I got your point entirely.

You're saying that preventing precedence redefinition is a security matter tied to the legibility of the language, because the people who prevent exploits are the ones who review the code. The syntax should have a single fixed reading, so that reading the code is as unambiguous as possible.

That is totally acceptable to me.

I won't argue. But I can't see how operator overloading is so different from redefining precedence. If operators can be overloaded, and precedence could hypothetically be overloaded too, then whoever reviews the former would also review the latter when necessary.

But I can take it.

By the way, is overloading BitXor for integers or floats a common vector for exploits? Did I pick a bad example? :sweat_smile:

Because + is just a shorter way to write Add::add, so operator overloading isn't any weirder than implementing Clone::clone or Default::default. Sure, you can do silly things in the implementation of the method, but that's true of every single method. A + that divides isn't any weirder than a fn add that divides.
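As a small illustration of that equivalence, the sketch below uses a hypothetical `Vec2` type (an assumption for this example, not something from the thread) and checks that the operator form and the explicit trait-method call produce the same value.

```rust
use std::ops::Add;

/// Hypothetical 2D vector type, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: i32,
    y: i32,
}

impl Add for Vec2 {
    type Output = Vec2;
    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

fn main() {
    let a = Vec2 { x: 1, y: 2 };
    let b = Vec2 { x: 3, y: 4 };

    // The operator form and the explicit method call are the same operation.
    assert_eq!(a + b, Add::add(a, b));
    assert_eq!(a + b, Vec2 { x: 4, y: 6 });

    // The same holds for primitives.
    assert_eq!(1_i32 + 2_i32, Add::add(1_i32, 2_i32));
}
```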

Obviously.

The question I posed is: if operators can be overloaded, why should security concerns rule out overloading precedence?

Was this line read as ironic, as if I disliked the parser? That was not my intention. What I mean is that it is not every day that we find the parser interfering with how we code. Most of the time, the compiler goes unnoticed. That's all.

The practical reason you can't overload precedence is that the parser (the code for reading text into an AST) would then need to include part of the type system. This prevents you from creating a clean separation between the phases of compilation, greatly complicating the compiler's internals. It means that if I just want to analyse some code, I can't just use rustc's parser but must also pull in half the compiler.

The security comment was an off-the-cuff remark that being able to change precedence whenever you want is a great way to obfuscate code.

Being able to change precedence also breaks the principle of least surprise and makes the language a lot harder to learn. I'd be quite annoyed if, just by including someone else's crate, my operator precedences were switched and now the math I'm using for bit masking and register bashing is silently broken.

Overloading precedence is very different from overloading operators because, similar to how for .. in ... is converted to IntoIterator::into_iter(), operators are converted to calls to Add::add() and friends by the compiler after parsing. The parser doesn't actually know you've overloaded the operator; that is picked up during the type-checking phase.

On the other hand, precedence is built into a language's grammar and is as integral to the language as semantic things like borrowing and move.

Rust uses slightly different precedence rules to what you may be used to from C and I'm sure there would have been very good reasoning for this decision. You can always use parentheses to force precedence, and in these sorts of ambiguous situations I'd recommend it anyway.
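One concrete difference from C, shown here as a small sketch on plain integers: in Rust the bitwise operators bind more tightly than the comparison operators, whereas in C `==` binds more tightly than `&`. The same example also shows where built-in `^` (XOR) sits relative to `+` and `*`.

```rust
fn main() {
    // Rust groups this as (6 & 3) == 2, which is 2 == 2.
    // C would group the same tokens as 6 & (3 == 2).
    assert!(6 & 3 == 2);

    // `^` binds more loosely than `+` and `*`:
    // (1 + 2 * 3) ^ 2 is 7 XOR 2, which is 5.
    assert_eq!(1 + 2 * 3 ^ 2, 5);

    // Parentheses make the intended grouping explicit:
    // 3 XOR 2 is 1, so the result is 1 + 2 * 1.
    assert_eq!(1 + 2 * (3 ^ 2), 3);
}
```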

As an aside, and although I'm opposed to the idea like many others: if you really want custom operators with custom precedence, you can write, for example, an attribute proc-macro that parses the function body and re-emits the desired code.

Woah, what?

Far from unnoticed, the compiler slaps me black and blue every minute of every day for my misbehavior. Sometimes I want to cry and run away to a gentler guardian, like crazy uncle Javascript. But I know my compiler loves me really, and it's all for my own good. It just wants me to be a better programmer and keep myself out of trouble. I have read comments from many who are in a similar situation.

Personally I think operator overloading should be used in such a way that the operators keep their usual meanings whilst extending the notation available for your own types. Anything else causes unnecessary confusion and makes room for mistakes by those who later have to read and modify your code.

Redefining operator precedence is right out for the same reasons but more so. I'm glad to hear the compiler does not support it.

Python does not allow precedence overloading either. One suggestion I found is to write a custom interpreter that establishes the custom precedence, as an additional layer over the operators:

Override all operators so that they don't do the calculations but instead create a list of instructions wrapped in some object.
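Translated to Rust, that suggestion might look like the sketch below, where a hypothetical `Expr` enum (my own naming, not from the thread) records operations instead of computing them, and `^` is taken to mean power. Note that the tree the operators build already has the shape chosen by Rust's parser, so any custom-precedence layer would have to work on that tree after the fact.

```rust
use std::ops::{Add, BitXor, Mul};

/// Hypothetical expression tree: the operators build nodes instead of computing.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Pow(Box<Expr>, Box<Expr>),
}

impl Add for Expr {
    type Output = Expr;
    fn add(self, rhs: Expr) -> Expr {
        Expr::Add(Box::new(self), Box::new(rhs))
    }
}

impl Mul for Expr {
    type Output = Expr;
    fn mul(self, rhs: Expr) -> Expr {
        Expr::Mul(Box::new(self), Box::new(rhs))
    }
}

impl BitXor for Expr {
    type Output = Expr;
    // `^` records a power node.
    fn bitxor(self, rhs: Expr) -> Expr {
        Expr::Pow(Box::new(self), Box::new(rhs))
    }
}

impl Expr {
    /// Evaluate the recorded tree; exponents are assumed non-negative.
    fn eval(&self) -> i64 {
        match self {
            Expr::Lit(v) => *v,
            Expr::Add(a, b) => a.eval() + b.eval(),
            Expr::Mul(a, b) => a.eval() * b.eval(),
            Expr::Pow(a, b) => a.eval().pow(b.eval() as u32),
        }
    }
}

/// Shorthand constructor for a literal node.
fn lit(v: i64) -> Expr {
    Expr::Lit(v)
}

fn main() {
    let e = lit(1) + lit(2) * lit(3) ^ lit(2);

    // The recorded tree is Pow(Add(1, Mul(2, 3)), 2): Rust's grouping.
    assert_eq!(e, (lit(1) + lit(2) * lit(3)) ^ lit(2));
    assert_eq!(e.eval(), 49);

    // Explicit parentheses give the mathematical reading.
    assert_eq!((lit(1) + lit(2) * (lit(3) ^ lit(2))).eval(), 19);
}
```

Because the parentheses written in the source are not recorded in the tree, an interpreter that regroups it cannot tell `1 + 2 * 3 ^ 2` apart from `(1 + 2 * 3) ^ 2`.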

Well, I don't know if it was clear or not, but the point is not to overload the operators and precedence for primitives, but to overload them for some struct that carries mathematical meaning. What I had in mind was bringing programming-language syntax closer to mathematical syntax.

The idea fails.

Not only because the compiler/parser has its requirements, as @Michael-F-Bryan and @cole-miller said. It is also infeasible to reconcile the two syntaxes fully; otherwise Rust would be a different language.

A better approach is to extend the std::primitive interface with methods, which is simpler and more extensible. Precedence is then settled, because a method call on self binds more tightly than any std::ops operator.

#[test]
fn precedence() {
    assert_eq!(1.0 + 2.0 * (3.0_f64.powf(2.0)), 19.0); /* succeeds */
    assert_eq!(1.0 + 2.0 * 3.0_f64.powf(2.0), 19.0); /* succeeds */
}

Since Rust allows operators to be implemented for combinations of different types that yield yet another result type (e.g. A: Add<B, Output = C>), it would not even be clear which type(s) the precedence is supposed to depend on. Seeing an expression x + y * z, you can be sure that x is the left operand of + and z is the right operand of *. So you know the left argument type for + and the right argument type for *. Anything else? Perhaps you can infer the overall return type, too. Whichever operation turns out to be the outer one would need to return that.

Now please tell me, with this little information, how to find out where the parentheses go. I’m having doubts there is even any straightforward way to do this in theory, while avoiding ambiguities. Hence, you don’t even need to start with any practical considerations like giving up the separation of parsing and type-checking.

Imagine the primitive number from the sample code is the proper struct and all operations return that same struct, possibly allowing other data types like primitives as inputs.

I wanted to overload BitXor to implement power. The problem is that BitXor does not take precedence over multiplication, so the maths comes out wrong, and I can't force BitXor to precede Mul.

If I were to keep parentheses it would be like this.

#[test]
fn precedence() {
    assert_eq!(1 + 2 * (3 ^ 2), 19); /* succeeds */
    assert_eq!(1 + 2 * 3 ^ 2, 19); /* fails with: 37 != 19 */
}

The struct used in the maths operations generically and recursively records each operation in an enum without performing the calculation up front. Because the applied operations are kept in the data structure, it was possible to re-evaluate their precedence there, at the cost of another layer of abstraction. However, that doesn't seem to be the best option, since I can simply extend the std::primitive interface with methods for the same functionality, which solves the precedence issue.

#[test]
fn precedence() {
    assert_eq!(1.0 + 2.0 * (3.0_f64.powf(2.0)), 19.0); /* succeeds */
    assert_eq!(1.0 + 2.0 * 3.0_f64.powf(2.0), 19.0); /* succeeds */
}
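The same method-call approach carries over to a custom struct. The sketch below assumes a hypothetical `Scalar` wrapper with a `pow` method (both names are illustrative, not from the post); because a method call binds more tightly than any binary operator, no parentheses are needed.

```rust
use std::ops::{Add, Mul};

/// Hypothetical numeric wrapper, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Scalar(f64);

impl Scalar {
    /// Power as an ordinary method instead of an overloaded `^`.
    fn pow(self, exp: Scalar) -> Scalar {
        Scalar(self.0.powf(exp.0))
    }
}

impl Add for Scalar {
    type Output = Scalar;
    fn add(self, rhs: Scalar) -> Scalar {
        Scalar(self.0 + rhs.0)
    }
}

impl Mul for Scalar {
    type Output = Scalar;
    fn mul(self, rhs: Scalar) -> Scalar {
        Scalar(self.0 * rhs.0)
    }
}

fn main() {
    let (one, two, three) = (Scalar(1.0), Scalar(2.0), Scalar(3.0));

    // `three.pow(two)` is evaluated before `*` and `+` are applied.
    assert_eq!(one + two * three.pow(two), Scalar(19.0));
    assert_eq!(one + two * (three.pow(two)), Scalar(19.0));
}
```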