Comparison operator output, generalise?


#1

Would it be possible to generalize the output of the comparison operators?
The use case I had in mind was vectorized comparisons yielding select masks.

Of course, it is open to question how other generic code that uses comparisons expecting a bool would work.
Would it be possible via Rust's type inference to allow both scenarios, i.e. fn gt(A,B) -> C and fn gt(A,B) -> bool could be two distinct functions selectable by their context? (Or similarly, trait bounds select what you want.)

(For my scenario I can of course just name some comparison functions… I was just going on what I was able to do back in C++.)
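
A minimal sketch of the "just name some comparison functions" route, to make the use case concrete. `F32x4`, `Mask4`, `simd_gt` and `select` are hypothetical names invented for this illustration (plain arrays stand in for vector registers); they are not taken from any real SIMD crate.

```rust
/// Hypothetical 4-lane float vector (an array stands in for a register).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4(pub [f32; 4]);

/// Hypothetical select mask: one bool per lane.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mask4(pub [bool; 4]);

impl F32x4 {
    /// Lane-wise `self > other`, returned as a mask instead of a single bool.
    pub fn simd_gt(self, other: F32x4) -> Mask4 {
        let mut lanes = [false; 4];
        for i in 0..4 {
            lanes[i] = self.0[i] > other.0[i];
        }
        Mask4(lanes)
    }
}

impl Mask4 {
    /// Lane-wise select: take `a` where the mask is true, `b` elsewhere.
    pub fn select(self, a: F32x4, b: F32x4) -> F32x4 {
        let mut out = [0.0; 4];
        for i in 0..4 {
            out[i] = if self.0[i] { a.0[i] } else { b.0[i] };
        }
        F32x4(out)
    }
}
```

With these, `x.simd_gt(y).select(x, y)` computes a lane-wise maximum without ever producing a single `bool`.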


#2

No, comparison operators are restricted to give Boolean results, see

https://doc.rust-lang.org/std/cmp/trait.PartialOrd.html

This reins in the worst of the operator overloading abuse, in my opinion. Masks are nice in numpy when I use them, but they consistently confuse my students.
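
To illustrate the restriction: `a < b` is sugar for `PartialOrd::lt(&a, &b)`, and that method's return type is fixed to `bool`. The sketch below uses a hypothetical `F32x4` wrapper (not a real SIMD type) and assumes a lexicographic ordering purely for demonstration.

```rust
use std::cmp::Ordering;

/// Hypothetical 4-lane vector used only for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4(pub [f32; 4]);

impl PartialOrd for F32x4 {
    // The trait fixes the return types: `partial_cmp` yields
    // `Option<Ordering>`, and `lt`/`le`/`gt`/`ge` yield `bool`.
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        // Lexicographic comparison of the lanes (arrays already provide it).
        self.0.partial_cmp(&other.0)
    }
}

/// `a < b` resolves to `PartialOrd::lt`, so the result is one `bool`,
/// never a per-lane mask.
pub fn less(a: F32x4, b: F32x4) -> bool {
    a < b
}
```

There is no associated `Output` type on `PartialOrd` to override, unlike the arithmetic operator traits such as `std::ops::Add`.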


#3

I’m alluding to the range of uses of the language: its ambition is to be a C++ replacement, which would mean making it suitable for writing efficient vectorized code.

It’s also about versatility: possibilities, being able to handle the unknown. C++ has lasted decades… who knows what demands we will have decades from now?


#4

I’d say that in the context of a systems language, having the semantics of operators be consistent and clear is even more important.


#5

“having the semantics of operators be consistent and clear is even more important.”

In making a SIMD library, having all the operators vectorized is more consistent and clear.

By applying this one piece of dogma across the board (sure, returning a bool is better most of the time), you are making certain use cases much harder and will have people heading back to C++.

Those vector operators would cease to work with simple comparisons, but isn’t the whole point of the trait system that you get nice clear error messages as to why? (Might the traits even offer a means for library writers to educate through error messages, e.g. a macro that gives specific warnings?)

Getting a boolean vector then having to write “.any()” or “.all()” might make sense in many situations, whilst in others you want to avoid branchy code to allow SIMD to work well… that’s the whole point of a SIMD library.
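
A rough sketch of the two styles, assuming a hypothetical `Mask4` type and helper names (`lanes_lt`, `lanes_min`) invented here. `any()` and `all()` collapse the mask to one `bool` for branching, while the select-style function keeps everything lane-wise.

```rust
/// Hypothetical lane-wise comparison result: one bool per lane.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mask4(pub [bool; 4]);

impl Mask4 {
    /// True if at least one lane is set.
    pub fn any(self) -> bool {
        self.0.iter().any(|&lane| lane)
    }
    /// True only if every lane is set.
    pub fn all(self) -> bool {
        self.0.iter().all(|&lane| lane)
    }
}

/// Lane-wise `a < b` on plain arrays standing in for vector registers.
pub fn lanes_lt(a: [f32; 4], b: [f32; 4]) -> Mask4 {
    Mask4(std::array::from_fn(|i| a[i] < b[i]))
}

/// Lane-wise minimum driven by the mask. This scalar code only models
/// what a hardware blend/select instruction would do per lane.
pub fn lanes_min(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mask = lanes_lt(a, b);
    std::array::from_fn(|i| if mask.0[i] { a[i] } else { b[i] })
}
```

Note that `any()` and `all()` can disagree on the same mask, and neither carries the per-lane information that `lanes_min` uses.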

Alternatively, you might want to use the operators to generate information through propagated types.


#6

“I’d say that in the context of a systems language,”

One situation I’ve dealt with is a platform where we represented values in vector registers in wrapped classes, but the platform in question had extremely poor performance when converting those to bools for branching, to the extent that eliminating branches was the main point of using the library. If you wanted to make the costs explicit, you needed to emphasise that moving the value from the vector register to something branch-testable was an expensive operation.

In this case the type ceasing to inter-operate with ‘straightforward’ branchy code is actually very helpful (i.e., “only use this wrapped type if you really understand the implications”). The ability to overload bool actually helped the library writer guide the user in a helpful way.

That’s just one example from the past. Who knows what variations of processors will appear in the coming decades? How, for example, will the demands of AI change the processor landscape?


#7

There is a (very) long thread on the Internals forum about SIMD: https://internals.rust-lang.org/t/getting-explicit-simd-on-stable-rust/4380


#8

This question is specifically about overriding PartialOrd? I don’t think this can be done in a backwards compatible way. Maybe I’m wrong but I can’t think of any way.


#9

If associated types had defaults, the output could default to bool? (I see they are work-in-progress, unstable.)

However, the bigger question I have is: does ‘PartialOrd’ really imply ‘suitable for a sort algorithm using if (x>y)’, rather than just ‘comparable’? I can see that would make a mess of things.

Perhaps that could still be done by making it a type parameter, e.g. trait PartialOrd<DesiredOutput>? The sort algorithms that need it could just be changed to use 'T:PartialOrd<bool>' rather than plain 'T:PartialOrd'.

I would then argue a ‘vectorized comparison’ is still a form of ‘orderable type’.

Would that do it?
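
A sketch of how that could look, using a made-up `Compare<Out>` trait rather than the real `std::cmp::PartialOrd` (which cannot be changed from user code). All names here (`Compare`, `less_than`, `F32x4`, `Mask4`, `smallest`) are assumptions for illustration only.

```rust
/// Hypothetical comparison trait, generic over its output type.
/// This is NOT `std::cmp::PartialOrd`, whose methods always return `bool`.
pub trait Compare<Out> {
    fn less_than(&self, other: &Self) -> Out;
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4(pub [f32; 4]);

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mask4(pub [bool; 4]);

// Scalars compare to a single bool.
impl Compare<bool> for f32 {
    fn less_than(&self, other: &Self) -> bool {
        self < other
    }
}

// Vectors compare lane-wise to a mask; there is no `Compare<bool>` impl.
impl Compare<Mask4> for F32x4 {
    fn less_than(&self, other: &Self) -> Mask4 {
        let (a, b) = (self.0, other.0);
        Mask4([a[0] < b[0], a[1] < b[1], a[2] < b[2], a[3] < b[3]])
    }
}

/// Branchy generic code asks for `Compare<bool>` explicitly, so calling
/// it with `F32x4` is rejected at compile time by the trait bound.
pub fn smallest<T: Compare<bool> + Copy>(items: &[T]) -> Option<T> {
    let mut best = *items.first()?;
    for &item in &items[1..] {
        if item.less_than(&best) {
            best = item;
        }
    }
    Some(best)
}
```

Under this design, `smallest` accepts `&[f32]`, while `&[F32x4]` fails with an unsatisfied `Compare<bool>` bound instead of silently branching on a mask.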


#10

It wasn’t clear to me if you were talking about the semantics of PartialOrd in the first paragraph or questioning how to make this backwards compatible. But I think your second paragraph really nails the issue with using default associated types to solve this problem. Using a default associated type might make it backwards compatible for people who are implementing PartialOrd, but users of PartialOrd still expect a bool which would make this a breaking change, no?


#11

Edited for clarity.

but users of PartialOrd still expect a bool which would make this a breaking change, no?

Misformatting meant my earlier post didn’t show… what I mean is that PartialOrd<bool> and PartialOrd<AnythingElse> are distinct bounds… if a user writes plain T:PartialOrd, that implies the output must be bool, by default.

EDIT again: hmm.
There is a difference between PartialOrd and PartialOrd<Output=X>, where Output is an associated type specified inside. I think this might require that the implementation sets type Output=X, such that the user can explicitly request that.

// definer
trait PartialOrd<X=bool> {
    type Output = X;  // defaults to bool
}

// legacy user
impl .... where T: PartialOrd {  // so far so good? that will demand X=bool, Output=bool?
}

// new user..
// if PartialOrd< Vec4<bool> > exists, how to do that..
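
For what it's worth, part of this can be tried on stable Rust with a custom trait, since default type parameters on traits are stable (only associated type defaults are not). The sketch below is an assumption-laden illustration: `Compare`, `less_than`, `F32x4`, `Mask4`, `in_order` and `lanes_less` are hypothetical names, and the lexicographic `bool` impl is chosen arbitrarily.

```rust
/// Hypothetical comparison trait whose output type defaults to `bool`.
pub trait Compare<Out = bool> {
    fn less_than(&self, other: &Self) -> Out;
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4(pub [f32; 4]);

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mask4(pub [bool; 4]);

// "Legacy" impl: plain `Compare` means `Compare<bool>`.
impl Compare for F32x4 {
    fn less_than(&self, other: &Self) -> bool {
        self.0 < other.0 // lexicographic array comparison
    }
}

// "New" impl on the same type: lane-wise result.
impl Compare<Mask4> for F32x4 {
    fn less_than(&self, other: &Self) -> Mask4 {
        Mask4(std::array::from_fn(|i| self.0[i] < other.0[i]))
    }
}

/// Legacy user: the bare bound `T: Compare` demands the `bool` output.
pub fn in_order<T: Compare>(a: &T, b: &T) -> bool {
    !b.less_than(a)
}

/// New user: explicitly requests the mask-producing comparison.
pub fn lanes_less<T: Compare<Mask4>>(a: &T, b: &T) -> Mask4 {
    a.less_than(b)
}
```

Outside generic code, the expected type picks the impl, much like `Into::into`: `let m: Mask4 = x.less_than(&y);` and `let b: bool = x.less_than(&y);` both compile for `F32x4` values `x` and `y`.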

#12

Oh right, I see. That would be interesting. So from what you said before, I’ve also imagined your situation is for comparing types like (f32,f32,f32,f32), where you would want to do something like:

let x: F32x4 = F32x4(1.0, 2.0, 3.0, 4.0);
let y: F32x4 = F32x4(0.0, 4.0, 2.0, 6.0);
let ord = x < y;
assert_eq!(ord, Orderingx4(false, true, false, true));

#13

Yes, exactly that; and in turn the (x < y) output could be used with a ‘vector select method’ mirroring the behaviour of the SIMD vector select instructions.

I can’t quite follow in my head exactly if the bounds will work OK (*), but if you write a code path that does indeed do ‘vector selects’, you can explicitly request the PartialOrd<Vec<bool>> yourself? … rather than just relying on extracting PartialOrd<Output=Whatever> as with most operator-overloading cases.

//write this sort of thing..
fn update_bounding_box<V:PartialOrd<Vec4<T>>, T:..>( box_min:&mut V,  box_max:&mut V,  value:&V){
     *box_min=(value < box_min).vselect(box_min, value);
     *box_max=(value > box_max).vselect(box_max, value);
}

// trivial example handled by vectorized min, max, but of course vectorized bools allow any general component wise logic.

( * ) - the haziness here is the difference between the extraction of Output=… in trait bounds, and passing it in as a parameter to demand it. Can trait bounds express either scenario seamlessly?
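
As a point of comparison, here is a sketch of the same bounding-box update in today's Rust, using named lane-wise comparisons instead of `<` and `>`. The types and method names (`Vec4`, `Mask4`, `lanes_lt`, `lanes_gt`) are hypothetical, and the argument order of `vselect` (first argument taken where the mask is true) is an assumption, since the post above does not spell it out.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec4(pub [f32; 4]);

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mask4(pub [bool; 4]);

impl Vec4 {
    /// Lane-wise `self < other`.
    pub fn lanes_lt(self, other: Vec4) -> Mask4 {
        Mask4(std::array::from_fn(|i| self.0[i] < other.0[i]))
    }
    /// Lane-wise `self > other`.
    pub fn lanes_gt(self, other: Vec4) -> Mask4 {
        Mask4(std::array::from_fn(|i| self.0[i] > other.0[i]))
    }
}

impl Mask4 {
    /// Assumed convention: `if_true` where the mask is set, else `if_false`.
    pub fn vselect(self, if_true: Vec4, if_false: Vec4) -> Vec4 {
        Vec4(std::array::from_fn(|i| {
            if self.0[i] { if_true.0[i] } else { if_false.0[i] }
        }))
    }
}

/// Grow the box so that it contains `value`, one lane at a time.
pub fn update_bounding_box(box_min: &mut Vec4, box_max: &mut Vec4, value: Vec4) {
    *box_min = value.lanes_lt(*box_min).vselect(value, *box_min);
    *box_max = value.lanes_gt(*box_max).vselect(value, *box_max);
}
```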


#14

Interesting. I’m not overtly against this, considering that the user would either be restricting themselves to a custom type (like F32x4 for instance, where this comparison should seem natural), or would be explicitly annotating it in generics like V: PartialOrd<Orderingx4>. I don’t think this would suffer from the same issues that numpy has, since Rust has a fairly explicit type system (even though there are times that type inference can make manually tracking types a little bit harder).

That being said, I would also be at least a little surprised by the return from <, at least the first time. On the other hand, I don’t know why I would have expected F32x4 < F32x4 to be a meaningful statement a priori. Personally, I would prefer statements like min_x4 = lhs.lt(rhs).select(lhs, rhs).


#15

My take is that operators are very intuitive and ‘discoverable’ to the user; I might be interested in expressing ‘vector gather’ this way as well: result: Vec4<X> = base:Vec<Vec4<X>> [ iiii:Vec4<index> ]
I guess it’s just down to documentation. “Here’s a SIMD library which implements all the operator overloads for intuitive use, e.g.: …”
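
One caveat for the gather idea: `base[idx]` goes through `std::ops::Index`, whose `index` method returns a reference (`&Self::Output`), so it cannot hand back a freshly assembled vector by value. A named function is one way around that. The sketch below is a simplified assumption (a flat slice and four `usize` indices, with the made-up name `gather4`), not the nested `Vec<Vec4<X>>` layout from the post.

```rust
/// Gather four elements of `base` at the given indices.
/// Panics if any index is out of bounds, like ordinary slice indexing.
pub fn gather4<T: Copy>(base: &[T], indices: [usize; 4]) -> [T; 4] {
    indices.map(|i| base[i])
}
```

For example, `gather4(&table, [4, 0, 0, 2])` returns the elements at positions 4, 0, 0 and 2 as a new array.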


#16

I agree that the standard operators are discoverable, but dispute that they are intuitive when used to represent, as in your proposal, something other than their standard meaning.

Your argument from before was that the error message from an operator like this can be a sort of documentation that could direct users towards the right way to do things. That certainly hasn’t worked for numpy. I have never seen a student (in seven years of teaching computational physics) who was helped by the error message you get when you try to use an array of booleans in an if statement. Of course, the problem is that the message suggests using .all() or .any(), which, while it could make the code run and can be used in correct code, is never what my students intend when they make the mistake of comparing an array.

You could argue the same applies to all operator overloading. The distinction is that non-overloaded comparison always ends up in an if or a while, whereas arithmetic, for instance, never does. So you can redefine arithmetic for another type without changing the meaning or correctness of code which uses it, but you cannot do so when making comparisons return something other than booleans.

Breaking programmers’ intuition about what operators do is not worthwhile for a little bit of syntactic sugar.


#17

I’m flabbergasted that you’re setting expectations for Rust from Python.
Rust is strongly typed,
Rust is not being targeted at the use-cases of Python.

Rust is being targeted at the use-cases of C++, and C++ users are used to doing this sort of thing. I don’t think this qualifies as ‘abuse’ compared to the way the C++ stdlib handles file I/O…

And I don’t think this is breaking intuition at all. It’s still ‘a function comparing values, returning a bool’… comparing an array, why would I expect to get anything other than an array of bools?

(in seven years of teaching computational physics)

hehe.
One of the most enthusiastic operator overloaders I worked with was a trained physicist. He pushed the use of ‘bitwise or’, |, for dot products. It did get a lot of complaints… but in the end it made sense. It was an amazingly common operation, and allowing it as an infix operator led to writing expressions very easily (mapping from how you think about something to how you write it).
I did actually originally complain that I wanted ‘bitwise or’ available for the select masks… but ultimately it still worked with that, because the types could always get the correct meaning.
Ultimately, as well as types, you have variable names… and the surrounding context. ((a>b) | (c>d)).vselect(e,f) … there it’s clearly bitwise or. Other times it was clearly dot.
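
In Rust terms, this part is already expressible, because `std::ops::BitOr` has an associated `Output` type that each impl chooses freely. A sketch with hypothetical `Vec4` and `Mask4` types and made-up `lanes_gt` / `vselect` methods: `|` on vectors is a dot product, `|` on masks is a lane-wise or.

```rust
use std::ops::BitOr;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec4(pub [f32; 4]);

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mask4(pub [bool; 4]);

// `|` on two vectors: dot product, so `Output` is a scalar.
impl BitOr for Vec4 {
    type Output = f32;
    fn bitor(self, rhs: Vec4) -> f32 {
        (0..4usize).map(|i| self.0[i] * rhs.0[i]).sum()
    }
}

// `|` on two masks: lane-wise logical or.
impl BitOr for Mask4 {
    type Output = Mask4;
    fn bitor(self, rhs: Mask4) -> Mask4 {
        Mask4(std::array::from_fn(|i| self.0[i] | rhs.0[i]))
    }
}

impl Vec4 {
    /// Named stand-in for `>`, which cannot return a mask in Rust.
    pub fn lanes_gt(self, rhs: Vec4) -> Mask4 {
        Mask4(std::array::from_fn(|i| self.0[i] > rhs.0[i]))
    }
}

impl Mask4 {
    /// Assumed convention: `a` where the mask is set, else `b`.
    pub fn vselect(self, a: Vec4, b: Vec4) -> Vec4 {
        Vec4(std::array::from_fn(|i| if self.0[i] { a.0[i] } else { b.0[i] }))
    }
}
```

So `(a.lanes_gt(b) | c.lanes_gt(d)).vselect(e, f)` type-checks as a mask operation, while `a | b` on two `Vec4` values yields an `f32`; only the comparison step needs a named method.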

If we could handle that back then in C++, I’m sure that in the modern day, with much better IDEs, Q&A sites like Stack Overflow, and this updated language with the whole feature of traits aimed at improving error messages, it will be fine.