The one thing I don't get about rust

But [u8; 1000] is still a memcpy(): it's much cheaper than incrementing X Rcs.

Also, I see Copy not as "cheaply copyable" but as "memcpy copyable", i.e. a POD datatype with no resources held. This view is somewhat supported by Copy's docs, which suggest that you should impl Copy everywhere you can, with no mention of type size. You definitely should not actually copy big things by accident (I think there is a lint for that?) but you still should implement it, because maybe someone will need it.
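To make the "memcpy copyable" reading concrete, here is a minimal sketch. The type `Big` and the function `first_byte` are names made up for illustration; the point is only that a 1000-byte type can be Copy, and that the copies are then silent.

```rust
// Hypothetical POD type: 1000 bytes, no resources held, so `Copy` is allowed.
#[derive(Clone, Copy)]
struct Big {
    bytes: [u8; 1000],
}

// Taking `Big` by value copies all 1000 bytes at the call site, with no
// syntax marking the copy.
fn first_byte(b: Big) -> u8 {
    b.bytes[0]
}

fn main() {
    let a = Big { bytes: [7; 1000] };
    let mut b = a; // implicit bitwise copy; `a` stays usable
    b.bytes[0] = 1;

    assert_eq!(first_byte(a), 7); // the original is untouched
    assert_eq!(first_byte(b), 1);
    assert_eq!(std::mem::size_of::<Big>(), 1000);
}
```

Whether such a copy is actually cheaper than a handful of reference-count increments will depend on the workload, so treat this as an illustration of the semantics, not a benchmark.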

2 Likes

That's... a bizarre argument. Then again, moves are a memcpy and those are implicit even for big types.

I don't see a lint, not even from Clippy: (1) Rust Playground, (2) Rust Playground

Is this true? I imagine the answer is going to depend very much on CPU caches so I'm not even sure this is answerable in general?

It's not yet finalized or stabilized -- see issue 83518.

Example on nightly.

6 Likes

C# and Swift are not systems programming languages. They are slow; Rust is not.

Not totally wrong, but a pretty serious oversimplification (you can write allocation-free, high performance JavaScript if you really want to, but it’s a pain in the neck because JavaScript makes none of this stuff explicit). If you can do it in JS, you can do it in C#.

In any case, please be specific. Don’t just make unfounded assertions. Offer evidence, make relevant caveats evident, and spell out what you mean by “slow,” because that’s relative.

7 Likes

Agreed. I find Rust pushes code to be beautiful at the large design level, but has some ugly warts at the shallow syntax level. If a tradeoff has to be made it's the right one, but I think in some cases it's an unnecessary tradeoff, and the syntax quirks could be fixed.

Copy/Clone

There are a few relevant cases for what copying means:

  1. Types that are just a bitwise copy, e.g. i32.

  2. Types that are logically trivial to copy, but just slightly more expensive than a simple bitwise copy. E.g. String or Arc<T> where T doesn't have interior mutability.

  3. Types that are logically trivial to copy, but significantly more expensive than a simple bitwise copy. E.g. Vector<T> or Matrix.

  4. Types for which a copy is semantically important, like FileHandle or Arc<Cell<T>>.

C++ spells all of these as... nothing. Rust spells #1 as nothing, #2-4 as .clone().

That's an improvement over C++, but .clone() is just a very noisy way to indicate such a common operation, and conflating #4 with the other cases reduces the value with a bunch of false alarms. I'd rather be able to either indicate semantically trivial copies, so e.g. Arc<T> could be silent, or at least make .clone() less noisy.
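A rough sketch of the four cases in today's Rust, to show where the spelling differs and where it doesn't. The function names are invented for this example, and Arc<String> and Vec<u8> are stand-ins for the categories above.

```rust
use std::cell::Cell;
use std::sync::Arc;

// #1: bitwise copy; nothing in the syntax marks the copy.
fn copy_i32(x: i32) -> (i32, i32) {
    let y = x; // `x` is still usable afterwards
    (x, y)
}

// #2: cheap handle copy (a reference-count bump), spelled `.clone()`.
// Returns the strong count observed while the second handle is alive.
fn clone_arc(shared: &Arc<String>) -> usize {
    let second = shared.clone();
    Arc::strong_count(&second)
}

// #3: deep copy of the whole buffer, also spelled `.clone()`.
// The copy is independent, so mutating it leaves the original alone.
fn clone_vec(v: &Vec<u8>) -> Vec<u8> {
    let mut copy = v.clone();
    copy[0] = 9;
    copy
}

// #4: the copy aliases shared mutable state, and it is still `.clone()`.
// A write through the new handle is visible through the old one.
fn clone_shared_cell(cell: &Arc<Cell<i32>>) -> i32 {
    let alias = cell.clone();
    alias.set(42);
    cell.get()
}

fn main() {
    assert_eq!(copy_i32(5), (5, 5));

    let shared = Arc::new(String::from("hi"));
    assert_eq!(clone_arc(&shared), 2);

    let v = vec![0u8; 4];
    assert_eq!(clone_vec(&v)[0], 9);
    assert_eq!(v[0], 0);

    let cell = Arc::new(Cell::new(0));
    assert_eq!(clone_shared_cell(&cell), 42);
}
```

Only the last case changes what the program means if the clone is added or removed by accident, yet it reads exactly like the two before it.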

Arrays

The Python numerics ecosystem (numpy, scipy, pytorch, numba, taichi, etc.) is a great example of how a flexible array syntax can enable all sorts of great things.

Rust's array syntax is limited in a couple of ways:

  1. Single argument, so multi-dim arrays must use a[(i,j)]. This is minor, just one of those little papercuts that makes multi-dim array stuff annoying.

  2. index() and index_mut() return references. So array-like objects that return a computed value must actually have it stashed away somewhere. That means e.g. you can't do something like have matrix[(0..3, 0..3)] return a submatrix. The only real workaround is to just not use array syntax, and instead use matrix.idx(0..3, 0..3) or similar. Not the end of the world, but ugly.

#1 would be relatively simple to fix, by auto-tupling multiple args. I.e. make a[i,j] mean the exact same thing as a[(i,j)]. Fortunately this is currently a syntax error, so such a change would be backwards compatible.

#2 could be fixed with a new Index trait, that returns by-value. It's even possible to do this backwards compatibly, with a blanket impl to convert between the two variations. (Some Deref shenanigans are needed to work around the current syntax sugar definition of a[i] as *a.index(i), but it's doable.)
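For anyone who hasn't run into this, a small sketch of both limitations with a toy row-major `Matrix` (the type and its `from_fn` / `sub` methods are invented here, not taken from any particular crate). Element access works through Index with a tuple, but the submatrix has to be a plain method because it is computed by value.

```rust
use std::ops::{Index, Range};

// Toy row-major matrix, for illustration only.
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

impl Matrix {
    fn from_fn(rows: usize, cols: usize, f: impl Fn(usize, usize) -> f64) -> Self {
        let mut data = Vec::with_capacity(rows * cols);
        for i in 0..rows {
            for j in 0..cols {
                data.push(f(i, j));
            }
        }
        Matrix { rows, cols, data }
    }

    // The submatrix is a freshly computed value. `Index::index` must return
    // a reference to something that already exists, so this cannot be
    // written as `m[(1..3, 2..4)]` and ends up as an ordinary method.
    fn sub(&self, r: Range<usize>, c: Range<usize>) -> Matrix {
        Matrix::from_fn(r.len(), c.len(), |i, j| self[(r.start + i, c.start + j)])
    }
}

// Papercut #1: one index argument only, so two dimensions need a tuple.
impl Index<(usize, usize)> for Matrix {
    type Output = f64;

    fn index(&self, (i, j): (usize, usize)) -> &f64 {
        assert!(i < self.rows && j < self.cols, "index out of bounds");
        &self.data[i * self.cols + j]
    }
}

fn main() {
    let m = Matrix::from_fn(4, 4, |i, j| (i * 4 + j) as f64);
    assert_eq!(m[(1, 2)], 6.0);

    let s = m.sub(1..3, 2..4);
    assert_eq!(s[(0, 0)], 6.0); // same element as m[(1, 2)]
    assert_eq!(s[(1, 1)], 11.0); // same element as m[(2, 3)]
}
```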

4 Likes

If you're thinking String is reference counted, it isn't (so it should be in the same category as Vec). Then there's the afore-mentioned SmartString which... might fit in either category.
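A quick way to see the difference, as a sketch (the two helper functions are just for this example): cloning a String gives you a second heap buffer, while cloning an Arc<str> gives you a second handle to the same one.

```rust
use std::sync::Arc;

// `String::clone` allocates a new buffer and copies the bytes into it,
// so the two strings never point at the same memory.
fn string_clone_shares_buffer() -> bool {
    let a = String::from("hello");
    let b = a.clone();
    a.as_ptr() == b.as_ptr()
}

// `Arc::clone` only bumps the reference count; both handles point at the
// same allocation.
fn arc_clone_shares_buffer() -> bool {
    let a: Arc<str> = Arc::from("hello");
    let b = Arc::clone(&a);
    Arc::ptr_eq(&a, &b)
}

fn main() {
    assert!(!string_clone_shares_buffer());
    assert!(arc_clone_shares_buffer());
}
```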

Agreed, which is where `autoclone` / `AutoClone` may come in...


Not sure why you started talking about array/index syntax here? It's a different topic.

If you're thinking String is reference counted, it isn't (so it should be in the same category as Vec).

But strings are typically short, and in programs doing a lot of small-object allocation, they'll typically be hitting the fast paths in the allocator. (But unlike Arc the time is unbounded, so yes, it would be fair to put them in #3 instead.) I don't think there's a sharp line in general; the ideal distinction is "what does the person reading this code right now care about", which varies between programs, people, and even moment to moment.

The comments about arrays were more about the general issue of Rust making certain things syntactically annoying, especially when trying to make nice numerics libraries. [edit: also I could have sworn that the post I was replying to also mentioned the array thing. Did I reply to the wrong post or just imagine that?]

While you could make Vec only implement CheapClone where T: Copy, and then have LessCheapClone otherwise, that just keeps stacking, why not require LessCheapClone only where T: CheapClone? And EvenLessCheapClone where T: LessCheapClone?

And an Arc clone is potentially just as semantic as something like FileHandle, in the simple case of its T being something like FileHandle (presumably other than this T not being clonable).

It just doesn't seem super useful to draw these (specific!) distinctions around Clone. It's not quite right feeling, but I don't think the language can (or could) obviously do something better right now.

Absolutely agreed on Index though. That was a misstep. Maybe we can get a BetterIndex in some later Rust edition, with arbitrary signatures like the Fn traits.

1 Like

That is exactly why Rust is correct. Clone is one operation because there is no clear border between the categories you gave, but there is one between bitwise/non-bitwise copies, and it is also very useful.

However, I'm afraid this discussion is going off-topic.

2 Likes

"Make costs explicit" was one of the things that drew me to Rust in the first place, as someone who formerly stuck to garbage-collected languages for their memory safety, and it is one of the reasons I'm looking forward to the stabilization of the large_assignments lint.

Likewise, the extreme ease of accidentally falling off the fast path because you don't have a good enough understanding of how the abstract model will translate into the machine model is one reason I bounced off Haskell despite "strong type system for enforcing invariants at compile time" being probably the number-one thing that drew me to Rust.

...though I suppose that could be some kind of GCed-language survivorship bias.

6 Likes

Consts to the rescue!

pub trait Clone {
    const CHEAPNESS: usize = usize::MAX;
    fn clone(&self) -> Self;
}

impl<T: Copy> Clone for T {
    const CHEAPNESS: usize = 0;
    #[autoclone(cheapness = 0)] // Default
    fn clone(&self) -> Self { *self }
}

impl<T: Clone> Clone for Vec<T> {
    const CHEAPNESS: usize = T::CHEAPNESS.saturating_add(1);
    fn clone(&self) -> Self { ... }
}

#[autoclone(cheapness = 2)]
fn foo() {
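    let _copy0: i32 = i32;                                 // cheapness 0: implicit
    let _copy1: Vec<i32> = veci32;                         // cheapness 1: implicit
    let _copy2: Vec<Vec<i32>> = vecveci32;                 // cheapness 2: implicit
    let _copy3: Vec<Vec<Vec<i32>>> = vecvecveci32.clone(); // cheapness 3 > 2: explicit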
}

Just joking :stuck_out_tongue:
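For what it's worth, the const part of the joke can be approximated on stable Rust today, as long as it lives in a separate trait instead of in Clone itself. This is only a sketch: `Cheapness` and `clone_if_cheap` are made-up names, there is no `#[autoclone]` attribute, and the impls are per-type because a blanket impl for every T: Copy would conflict with the Vec impl under the coherence rules.

```rust
// Hypothetical stand-in for the associated const on `Clone` in the joke.
trait Cheapness {
    const CHEAPNESS: usize;
}

impl Cheapness for i32 {
    const CHEAPNESS: usize = 0;
}

// Each level of `Vec` nesting makes the clone one step less cheap.
impl<T: Cheapness> Cheapness for Vec<T> {
    const CHEAPNESS: usize = T::CHEAPNESS.saturating_add(1);
}

// Clones only when the type's cheapness is within the given budget; this
// runtime check is a rough analogue of `#[autoclone(cheapness = N)]`.
fn clone_if_cheap<T: Clone + Cheapness>(value: &T, budget: usize) -> Option<T> {
    if T::CHEAPNESS <= budget {
        Some(value.clone())
    } else {
        None
    }
}

fn main() {
    assert_eq!(<i32 as Cheapness>::CHEAPNESS, 0);
    assert_eq!(<Vec<Vec<i32>> as Cheapness>::CHEAPNESS, 2);

    // Within a budget of 2, two levels of nesting clone, three do not.
    assert!(clone_if_cheap(&vec![vec![1i32]], 2).is_some());
    assert!(clone_if_cheap(&vec![vec![vec![1i32]]], 2).is_none());
}
```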

5 Likes

As someone who came from Python, PHP, JavaScript, etc., I only realized that assignment of arbitrarily large classes was a deep copy by default in C++ maybe a year ago when I read a post which explained, as context for explaining C++'s decision to have non-destructive move, how such a design emerged as a natural side-effect of extending C's approach to the world of user-defined types step by incremental step.

...and, to borrow something Alan Cox said about the EsounD audio server according to the XMMS website, my response to that realization was "mostly unprintable".

5 Likes

It's upsetting to me how close this is to a good idea. Like, I could see something like this implemented in a language. I hate it, but can't look away.

3 Likes

It completely escapes me how a method call can be an "ugly wart" when the identifier is "clone" and OK otherwise (assuming, of course, that you are not suggesting that every method call is an ugly wart, which would be equally unreasonable).

2 Likes

This seems like an ugly wart to me:

Maybe there's a way to do it better w/o having copy types?

1 Like

I personally feel like a variable "autoclone" marker is the perfect solution for this kind of code. It should be a marker shorter than the full "autoclone" word; maybe a 3-letter abbreviation like cln?[1] Well, at least I'll use that keyword below. (This could be a weak, context-dependent keyword, so you can still name your variables cln if you want to.) The code example would become

fn body(&self) -> impl View {
    let cln text = self.text.clone();
    focus(move |has_focus| {
        state(TextEditorState::new(), move |cln state| {
            let cursor = state.with(|s| s.cursor);
            canvas(move |rect, vger| {
                vger.translate([0.0, rect.height()]);
                let font_size = 18;
                let break_width = Some(rect.width());

                let rects = vger.glyph_positions(&text.get(), font_size, break_width);
                let lines = vger.line_metrics(&text.get(), font_size, break_width);

                vger.text(&text.get(), font_size, TEXT_COLOR, break_width);

                if has_focus {
                    let glyph_rect_paint = vger.color_paint(vger::Color::MAGENTA);
                    let r = rects[cursor];
                    vger.fill_rect(
                        LocalRect::new(r.origin, [2.0, 20.0].into()),
                        0.0,
                        glyph_rect_paint,
                    );
                }

                state.get().glyph_info.borrow_mut().glyph_rects = rects;
                state.get().glyph_info.borrow_mut().lines = lines;
            })
            .key(move |k| {
                if has_focus {
                    state.with_mut(|s| s.key(&k, &text))
                }
            })
        })
    })
}

which is IMO very readable and doesn't contain too much noise at all. Doing .clone() implicitly on every by-value usage of the variable, including move closures, even multiple times in nested closure cases (achieving essentially the same flexibility as Copy types), should be the main/killer application of such a feature. It would also save some typing in cases without closures where you just need to .clone() some variable a lot for some reason or another.


  1. I guess we could also discuss auto to make C++ programmers feel more at home? /s

5 Likes

For further comparison, here is the code with some explicit "capture these variables by cloning" marker on the move closures (stand-in syntax, no idea what the best actual syntax for something like this would be). Note that this version of the code is just as explicit about every single clone operation as the original code, but syntactically a lot more lightweight. It saves you from typing the same identifier twice for each clone operation, and it avoids the additional lines as well as the variable renaming or extra indentation (the latter would be necessary if you used shadowing to avoid the renaming).[1]

fn body(&self) -> impl View {
    let text = self.text.clone();
    focus(move |has_focus| {
        state(TextEditorState::new(), move[cloning: text] |state| {
            let cursor = state.with(|s| s.cursor);
            canvas(move[cloning: text, state] |rect, vger| {
                vger.translate([0.0, rect.height()]);
                let font_size = 18;
                let break_width = Some(rect.width());

                let rects = vger.glyph_positions(&text.get(), font_size, break_width);
                let lines = vger.line_metrics(&text.get(), font_size, break_width);

                vger.text(&text.get(), font_size, TEXT_COLOR, break_width);

                if has_focus {
                    let glyph_rect_paint = vger.color_paint(vger::Color::MAGENTA);
                    let r = rects[cursor];
                    vger.fill_rect(
                        LocalRect::new(r.origin, [2.0, 20.0].into()),
                        0.0,
                        glyph_rect_paint,
                    );
                }

                state.get().glyph_info.borrow_mut().glyph_rects = rects;
                state.get().glyph_info.borrow_mut().lines = lines;
            })
            .key(move[cloning: text] |k| {
                if has_focus {
                    state.with_mut(|s| s.key(&k, &text))
                }
            })
        })
    })
}

  1. Essentially, move[cloning: foo, bar, baz] |...| { ... } (for the most part) directly translates into { let foo = foo.clone(); let bar = bar.clone(); let baz = baz.clone(); move |...| { ... } }.
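For reference, this is roughly what that translation looks like when written by hand in today's Rust. It is a small sketch with invented names (`make_handlers`, and an Rc<RefCell<String>> standing in for the shared text state), not the code from the example above.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Builds two closures that both need their own handle to `text`.
fn make_handlers(text: Rc<RefCell<String>>) -> (impl Fn(char), impl Fn() -> usize) {
    let push = {
        // The hand-written form of `move[cloning: text]`: shadow the
        // variable with a clone inside a block, then move the clone in.
        let text = text.clone();
        move |c: char| text.borrow_mut().push(c)
    };
    // The last user can take the original handle without cloning.
    let len = move || text.borrow().len();
    (push, len)
}

fn main() {
    let text = Rc::new(RefCell::new(String::from("ab")));
    let (push, len) = make_handlers(text.clone());

    push('c');
    assert_eq!(len(), 3);
    assert_eq!(text.borrow().as_str(), "abc");
}
```

The extra block and the repeated identifier are exactly the noise the proposed marker would remove; the clone itself stays just as visible.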

5 Likes

Instead of an abbreviation "cln", how about "twin"?

I think both of those (but especially "twin") are "too clever"... like writing code that prioritizes brevity over maintainability.

During the RFC process, one of the things that comes up is something's effect on the cognitive budget for a new learner. We recognize that Rust is already a big language and there's an explicit effort to limit its ability to become C++, learnability-wise... which is why introducing unique new keywords to remember is a hard sell. (This was also an argument against adding priv to allow some things to be pub by default.)

That's why I think letting clone join move as a keyword in closure syntax is the best option so far.

4 Likes