Why doesn't Rust provide several custom operators?

Neutron3529 · December 23, 2021, 6:23am

When using yew, I found a function called neq_assign which returns a bool.
When using ndarray, a similar trait called AssignElem exists.
When using rug, an Assign trait exists.
What's more, when we write something like let mut a=1;let b=&2;a=b, rustc generates an error that forces us to use a=*b or a=b.into() instead, since the types of a and b are different.
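The mismatch described above can be reproduced in a few lines of stable Rust. This is only a minimal sketch; the helper name `copy_from_ref` is made up for illustration and does not come from any of the crates mentioned.

```rust
// Hypothetical helper: copies the value behind a shared reference into `dst`.
fn copy_from_ref(dst: &mut i32, src: &i32) {
    // `*dst = src;` would be rejected (E0308, mismatched types): `i32` vs `&i32`.
    *dst = *src; // explicit dereference; works because `i32` is `Copy`
}

fn main() {
    let mut a = 1;
    let b = &2;
    println!("{}", a);
    // a = b; // does not compile: `a` is an integer, `b` is a reference to one
    a = *b;
    println!("{}", a);
    copy_from_ref(&mut a, &5);
    println!("{}", a);
}
```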

I was surprised to find that if Rust allowed a CustomAssign trait and bound it to an unused symbol (such as ~ (meaning "similar"), ~~ (also meaning "similar", but possibly with slightly different semantics from ~, e.g., allowing an uninitialized LHS), or := (which is used in Pascal)), things would become better:

#![feature(min_specialization)]
pub trait CustomAssign<Rhs=Self>{
    fn custom_assign(&mut self, rhs:Rhs);
}
impl<T> CustomAssign<T> for T{
    #[inline(always)]
    default fn custom_assign(&mut self, rhs:T){
        *self=rhs
    }
}
impl<T:Copy> CustomAssign<&T> for T{
    #[inline(always)]
    default fn custom_assign(&mut self, rhs:&T){
        *self=*rhs
    }
}
impl<'a,T:CustomAssign<&'a T>> CustomAssign<&'a mut T> for T{
    #[inline(always)]
    default fn custom_assign(&mut self, rhs:&'a mut T){
        self.custom_assign(rhs as &T)
    }
}
fn main(){
    let mut a=0;
    let b=[1,3,5,7,9];
    b.iter().for_each(|x|{
        // `.iter()` yields `&i32`, so a plain `a=x` would require an explicit deref (`a=*x`).
        a.custom_assign(a+x);
        // a~a+x might be more beautiful.
        println!("{}",a)
    })
}

If we could write a.custom_assign(b) as a~b, we would have some awesome code:

struct BuilderConfig(i64,i64);
struct Builds(i128);
impl Builds{
    fn new()->Self{Self(0)}
    fn assign_new_config(&mut self,cfg:BuilderConfig){
        self.0=((cfg.0 as i128)<<64)+(cfg.1 as i128) // just for example.
    }
}
impl CustomAssign<BuilderConfig> for Builds{
    fn custom_assign(&mut self,rhs:BuilderConfig){
        self.assign_new_config(rhs)
    }
}
fn main(){
    let a=BuilderConfig(1,-1);
    let mut b=Builds::new();
    b.custom_assign(a); // would be `b~a`
    // actually, if we allow using `:=` to initialize variables, we would have:
    // let b:Builds := a;
    println!("{}",b.0);
}

IMHO, adding := and ~ (or ~=, ~~) would make Rust more concise and more beautiful.
If we know a and b have the same type, we would most likely choose a=b, since = is easy to type;
if we know a and b have different types (e.g., rug::Integer and i32), using := or ~ would let us code faster.

Here are my questions:

Are there any disadvantages?
Is it worth discussing on internals.rust-lang.org?
Is it worth a pre-RFC?

Some progress: building on an idea from @苦瓜小仔 on the Chinese Rust forum, a macro may help achieve the goal, though it seems to be a shift-reduce hell.

    #[macro_export]
    macro_rules! cai {
        (@expr_block ($($ex:tt)+) =>  { $($b1:tt)* } else  { $($b2:tt)* } $($tail:tt)*) =>{
            $($ex)+ {
                cai!($($b1)*);
            }else{
                cai!($($b2)*);
            }
            cai!($($tail)*)
        };
        (@expr_block ($($ex:tt)+) => { $($b1:tt)* } $($tail:tt)*) =>{
            $($ex)+ {
                cai!($($b1)*);
            }
            cai!($($tail)*)
        };
        ({ $($b1:tt)* } $($tail:tt)*) =>{
            {
                cai!($($b1)*);
            };
            cai!($($tail)*)
        };
        (@split_exp_block ($($ex:tt)+) => { $($b:tt)* } $($tail:tt)*) => {
            cai!(@expr_block ($($ex)+) => {$($b)*} $($tail)*)
        };
        (@split_exp_block ($($ex:tt)+) => $t:tt $($tail:tt)*) => {
            cai!(@split_exp_block ($($ex)+ $t) => $($tail)*)
        };
        (if $t:tt $($tail:tt)*) => {
            cai!(@split_exp_block (if $t) => $($tail)*)
        };
        (while $t:tt $($tail:tt)*) => {
            cai!(@split_exp_block (while $t) => $($tail)*)
        };
        (break { $($b:tt)* } $($tail:tt)*) => {
            break { cai!($($b)*); };
            cai!($($tail)*)
        };
        (loop { $($b:tt)* } $($tail:tt)*) => {
            loop { cai!($($b)*); } ;
            cai!($($tail)*)
        };
        (for $t:tt $($tail:tt)*) => {
            cai!(@split_exp_block (for $t) => $($tail)*)
        };
        ({ $($b:tt)* } $($tail:tt)*) => {
            { cai!($($b)*) }
            cai!($($tail)*)
        };
        ($id:ident ~ $ex:expr; $($tail:tt)*) => {
            $id.custom_assign(cai!($ex));
            cai!($($tail)*)
        };
        ($($id:ident)+ $(: $type:ty)? : = $ex:expr; $($tail:tt)*) => {
            $($id)+ $(: $type)? = CustomInitialize::custom_initialize(cai!($ex));
            cai!($($tail)*)
        };
        ($st:stmt; $($tail:tt)*) => {
            $st
            cai!($($tail)*)
        };
        ($ex:expr) => {
            $ex
        };
        () => {};
    }
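To see the core idea without the full statement-walking machinery, a much smaller macro can rewrite a single `lhs ~ rhs` form into a method call. This is only a sketch on stable Rust: the `assign!` macro and the `build` helper are hypothetical names, and `CustomAssign` is redeclared here without the specialization-based blanket impls.

```rust
// Stand-in for the `CustomAssign` trait from the post (no specialization needed).
trait CustomAssign<Rhs = Self> {
    fn custom_assign(&mut self, rhs: Rhs);
}

struct BuilderConfig(i64, i64);
struct Builds(i128);

impl CustomAssign<BuilderConfig> for Builds {
    fn custom_assign(&mut self, rhs: BuilderConfig) {
        // Same packing as in the post: high word shifted left, low word added.
        self.0 = ((rhs.0 as i128) << 64) + (rhs.1 as i128);
    }
}

// Hypothetical macro: rewrites `lhs ~ rhs` into `lhs.custom_assign(rhs)`.
macro_rules! assign {
    ($lhs:ident ~ $rhs:expr) => {
        $lhs.custom_assign($rhs)
    };
}

// Hypothetical helper so the result is easy to inspect.
fn build(hi: i64, lo: i64) -> i128 {
    let mut b = Builds(0);
    assign!(b ~ BuilderConfig(hi, lo));
    b.0
}

fn main() {
    println!("{}", build(1, -1));
}
```

The macro only handles one assignment per invocation; handling whole blocks of statements is what makes the larger `cai!` macro so intricate.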

RedDocMD · December 23, 2021, 6:49am

This is definitely a better place to discuss.

Neutron3529 · December 23, 2021, 6:59am

I'm just afraid that I'm proposing a rather silly idea for which a better alternative already exists that I don't know about.

H2CO3 · December 23, 2021, 7:31am

Proposals like this are full of disadvantages, usually without any real, significant benefit. Specifically:

Custom operators in general aren't easy to do correctly, since they make it impossible to parse (or even lex!) the source code without first interpreting it to some degree. This tangles the first few stages of compilation together, from lexing to rudimentary type checking, which in turn makes the compiler more complicated, causes it to have more bugs, and it will be harder to maintain.
For the same reason, code will be harder to read for humans as well. Apart from the long-established mathematical symbols, I have no idea what the code means when I see Haskell or Swift code that is full of <<~!^%>#>> operators. It's just plain inferior to typing out a reasonably descriptive method name. Even if you aren't planning to support custom operators in their full generality (which I can't tell because the title and the post body are in contradiction here), ~ is not something that is immediately recognizable as "custom assignment". At most it could lead to associations of negation (from C, Lua, etc.).
Apparently, you want to customize assignment. That is not something that Rust can reasonably support, either. The very memory model of the language (and lots of existing unsafe code!) is based on the observation that assignments (and moves in general) are a trivial memcpy(). Custom assignment would break this fundamental assumption badly.
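The point about assignment can be illustrated on stable Rust. In the sketch below (the `Tracked` type and the `overwrite` function are made up for this example), plain `=` drops the old left-hand value and moves the right-hand value in; the only user code that runs is `Drop`, and there is no hook for customizing the move itself.

```rust
use std::cell::Cell;

// Hypothetical type that counts how many times a value is dropped.
struct Tracked<'a> {
    id: u32,
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Overwrites `a` with `b` using plain `=` and returns (final id, drops so far).
fn overwrite(drops: &Cell<u32>) -> (u32, u32) {
    let mut a = Tracked { id: 1, drops };
    let b = Tracked { id: 2, drops };
    // The old value of `a` is dropped, then `b` is moved in.
    // No user-defined code runs for the move; `b` is unusable afterwards.
    a = b;
    (a.id, drops.get())
}

fn main() {
    let drops = Cell::new(0);
    let (id, so_far) = overwrite(&drops);
    println!("id = {}, drops inside = {}, drops total = {}", id, so_far, drops.get());
}
```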

To be perfectly honest, I highly doubt that. First of all, writing speed is not the bottleneck in getting things done. Second, it just doesn't matter all that much. Code is read much more than it is written, and a language should optimize for the understandability of the code, and not for making it easy to hammer out undecipherable blobs of symbols as "fast" as possible.

Neutron3529 · December 23, 2021, 7:46am

Firstly, "custom operators" here means that we have no guarantee about what the operator actually does: it would mainly be another "assign" method, but Rust should not stop the code from blowing up the computer when the programmer and the crate maintainer want to blow up the computer.

(I edited the title, so the aim is now more precise: add several operators (with no guarantee about what they do), rather than infinitely many operators.)

Here, only a few fixed symbols are considered: ~, ~~, ~= and :=, so no parsing or lexing issue arises.
This keeps the operator flexible to use. IMHO, compared to the various kinds of Assign-like traits, a custom assign may be better.

The intuition behind suggesting a custom op is that every op has its own semantic meaning (e.g., you can't repurpose += as a custom assign, since nobody would read += as =), whereas CustomAssign could provide something flexible (compared to the unchangeable = op).
E.g., when we encounter the problem "generate a random number", we could write

let a=gaussian_distribution.sample(&mut thread_rng);

let a=gaussian_distribution.next();// if we write a wrapper around the code above

or

let a:f64 :=&mut gaussian_distribution;
// the definition of CustomAssign could be
// `fn custom_assign(&mut self, rhs:&'a mut T)`
// which means that rhs could also be changed.

I think the latter one is better, since next, sample and generate are all acceptable names for generating a new random number.

I agree that <<~!^%>#>> is quite ugly, but at least := is not.

Maybe ~~ could work, since we "negate the variable twice".

Mostly, yes. I don't know the appropriate name for "custom assign", since it could do more than assign:

Besides Rust, I use R. In R, or at least in statistics, ~ means something like "similar" or "is",
e.g., lm(y~x) produces the linear regression coefficients that minimize sum((y-cbind(1,x)%*%coefficients)^2)

We would also write X~N(0,1) to show that a random variable X follows the standard normal distribution.

Yes, customizing = would break this fundamental assumption badly, but customizing := would not, since it is just syntactic sugar and, what's more, no one uses that symbol now.

This is also the reason why I chose the symbol ~.

I strongly agree with that.
But IMHO, custom ops help us read and write faster.

When you want to assign a value to a variable of a different type, which method would you use? .set()? .copy_from()? .assign()? .into()?
All of them might be correct in some situations, but none of them is the best answer.

If we had ~, things could be different: crate maintainers would be able to make the best choice for you and, more importantly, you would know what they have done.

H2CO3 · December 23, 2021, 9:20am

Neutron3529:

or

let a:f64 :=&mut gaussian_distribution;
// the definition of CustomAssign could be
// `fn custom_assign(&mut self, rhs:&'a mut T)`
// which means that rhs could also be changed.

I think the latter one is bette

Categorically no. Please don't do/suggest that. It's not clear at all what that should do. By assigning one thing to another, I (and probably most people) would expect the left-hand side to become equal to the right-hand side. That's what assignment is. Disguising random sampling as assignment would do everyone (except perhaps yourself) a disservice.

I don't follow. What does double negation have to do with assignment?

I would use the already existing From::from() trait and method.
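As a rough illustration of that suggestion, the `BuilderConfig` to `Builds` conversion from the first post can be written with the standard `From` trait. This is a sketch that reuses the struct names and the packing formula from that post; nothing here is specific to any crate.

```rust
struct BuilderConfig(i64, i64);
struct Builds(i128);

impl From<BuilderConfig> for Builds {
    fn from(cfg: BuilderConfig) -> Self {
        // Same packing as in the original example.
        Builds(((cfg.0 as i128) << 64) + (cfg.1 as i128))
    }
}

fn main() {
    // `Into` comes for free once `From` is implemented.
    let b: Builds = BuilderConfig(1, -1).into();
    let c = Builds::from(BuilderConfig(0, 7));
    println!("{} {}", b.0, c.0);
}
```

Note that this builds a new `Builds` value rather than updating an existing one in place, which is the distinction the rest of the thread goes on to discuss.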

This is equally true (if not even more pronounced!) for operators. Introducing a new sigil wouldn't resolve the problem that naming things and concepts is hard. You have already recommended two different symbols (~ and :=) and stated that you are not sure which one is better. This should be strongly illustrative of the issue.

Anyway, feature requests are off-topic here, and I'm not going to engage in this pointless debate anymore. Please do some serious research before proposing radical changes that are clearly unfit for the language. If you have done so, you can open a formal pre-RFC over at IRLO.

Neutron3529 · December 23, 2021, 9:31am

Thank you for the long reply.

Where could I do some serious research?

BTW, From::from might not be the best choice if the value does not implement Copy. This is why something like CustomAssign is needed, and why it already exists in some well-known crates.

Cerber-Ursi · December 23, 2021, 9:47am

Not sure if I follow this. How would CustomAssign allow you to use a value without moving it? If it's by implicit cloning, then this is highly unlikely to be approved.

Neutron3529 · December 23, 2021, 10:22am

The story is a little bit long.

The Rug crate already has some documentation on this:

Sometimes we need to assign the result to an object that already exists. Since Rug types require memory allocations, this can help reduce the number of allocations. (While the allocations might not affect performance noticeably for computationally intensive functions, they can have a much more significant effect on faster functions like addition.)

The expression I suggest is a~b.
Firstly, b need not have the same type as a, which means that b could be a set of instructions rather than a real value that occupies memory,
e.g.,

let mut a=Vec::with_capacity(100);
// ... many other code
a~0..100;

could be the same as

let mut a=Vec::with_capacity(100);
// ... many other code
a=(0..100).collect::<Vec<_>>();// this code may allocate new memory for the collected value and drop the old memory used by a, which might be slow compared to operating on the existing memory of a directly.

(Unexpectedly, a=From::from(0..100); generates an error, since only the FromIterator trait is implemented.)

(We may further discuss let a:Vec<i32> := 0..100;, which could be syntactic sugar that also allows a custom initialization.)

VorfeedCanal · December 23, 2021, 2:24pm

Frankly, I think this discussion is a bit premature. Like: five years or so premature.

You are already discussing a complex and visually cluttering way of doing some obscure things without allocating memory, but currently stable, safe Rust can't even create an object on the heap without first producing it on the stack and moving it to the heap.

And I would argue that this operation happens about 100 times more often than these curious and interesting cases which you are discussing in that thread.

Never forget quarter-century old RFC: it is always possible to aglutenate multiple separate problems into a single complex interdependent solution, but in most cases this is a bad idea.

I would rather see someone tackle the very simple and concrete “Box implies move” problem before any attempt to aglutenate all these different crates' needs and wants.
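For readers unfamiliar with the problem, here is a small sketch of it and of one common workaround. The helper name `zeroed_on_heap` is hypothetical; the workaround relies only on `Vec` and `into_boxed_slice` from the standard library.

```rust
// Hypothetical helper: builds an `n`-byte zeroed buffer without a stack temporary.
fn zeroed_on_heap(n: usize) -> Box<[u8]> {
    // `Box::new([0u8; N])` semantically creates the array first and then moves it
    // into the box; whether that temporary is optimized away is up to the compiler.
    // Going through `Vec` allocates on the heap from the start.
    vec![0u8; n].into_boxed_slice()
}

fn main() {
    let buf = zeroed_on_heap(1 << 20);
    println!("{} bytes on the heap", buf.len());
}
```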

scottmcm · December 23, 2021, 3:02pm

You're going to have an up-hill battle convincing people that a symbol is worth it for things like this, vs the existing a.extend(0..100); (maybe with a.clear(); first) or whatever.
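A minimal sketch of that existing alternative, assuming a hypothetical helper named `refill`: clearing and then extending keeps the vector's buffer as long as the capacity is sufficient, which is the allocation-saving behavior the `a~0..100` proposal was after.

```rust
// Hypothetical helper: replaces the contents of `v` with `0..n`, reusing its buffer.
fn refill(v: &mut Vec<i32>, n: i32) {
    v.clear();      // drops the elements but keeps the allocation
    v.extend(0..n); // needs to reallocate only if `n` exceeds the current capacity
}

fn main() {
    let mut a: Vec<i32> = Vec::with_capacity(100);
    a.extend([7, 8, 9]);
    let before = a.as_ptr();
    refill(&mut a, 100);
    // The buffer was large enough, so the same allocation is expected to be reused.
    println!("same buffer: {}", before == a.as_ptr());
    println!("len = {}", a.len());
}
```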

Note that even .await didn't get a new symbol, despite it being very common.

Neutron3529 · December 23, 2021, 7:44pm

The examples mainly show "what custom ops could do", rather than "what custom ops must do".

At least for now, maintainers cannot provide something like "custom assign", which is why assign-like methods have so many different names.

For example, .await is at least a common method: we all know that using .await is the way to perform async operations. When we want to assign, things become a real mess:

a=From::from(b);  // From could solve many problems, but not all problems.
a.assign(b);  // assign might be the most common function.
a=b.complete(); // at least in Rug, it is `complete` that convert results to 
a.set(b); // why don't crate maintainers use `set` instead of `assign`?

If we had the following methods, which actually mean the same thing but are defined in different crates:

ops.async_wait(); // we are waiting for the async job
ops.wait(); // we will wait until the job is finished
ops.finish();// we would wait the program finish
ops.result();// we want the result.

then assigning a new symbol to .await might be acceptable.

For me, the aim of inventing new symbols is not just "adding a new way"; it is more about "unifying the code".

bluss · December 23, 2021, 8:48pm

I think one could entertain the thought of adding just one more operator: The infix function call operator, let's call it \ for the purposes of the discussion.

So that we can write:

use ndarray::op;

let c = A \op B;

And this A \op B becomes equivalent to op(A, B). op is just a function!

It's not a complete suggestion yet - taking references or not, autoref, is tricky. But I always thought it would be very powerful to just re-use functions this way. A properly defined function can be an infix operator with the new \ operator. It would make it very easy for crates to define their own operators (and traits for them!) with the minimum of language machinery.

scottmcm · December 23, 2021, 9:09pm

Note that starting from symbols is then making things harder than you need. The first step would be to just have a naming convention. And if there's a word that works well, maybe there'd be a way to have a common trait for it to help -- though TBH the generic usage of such would be limited, so std still might not want one.

You'd probably have better luck starting with something more like https://github.com/rust-lang/rfcs/pull/2677 was -- not that it was accepted either.

Yeah, there have been many forms of that discussed over time, including just using . for it, so a.foo(b) can be used to call foo(a, b).
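In today's Rust, a similar effect for one specific function can be approximated with an extension trait. The sketch below is only an illustration of that workaround, not of any proposed language feature; the `dot` function and the `DotExt` trait are hypothetical names.

```rust
// A free function we would like to call with method-like syntax.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// Hypothetical extension trait that exposes `dot(a, b)` as `a.dot(b)`.
trait DotExt {
    fn dot(&self, other: &[f64]) -> f64;
}

impl DotExt for [f64] {
    fn dot(&self, other: &[f64]) -> f64 {
        // Forwards to the free function of the same name.
        dot(self, other)
    }
}

fn main() {
    let a: [f64; 3] = [1.0, 2.0, 3.0];
    let b: [f64; 3] = [4.0, 5.0, 6.0];
    println!("{}", a.dot(&b));
}
```

The cost is one trait and one impl per function, which is the boilerplate a general `a.foo(b)` or infix-call feature would remove.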

I think it'd be better in a different thread than "several custom operators" though.

system · March 23, 2022, 9:09pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
