Borrow/move/closure semantics are driving me to my wit's end

dobkeratops · August 11, 2017, 12:20pm

"But I’m sure you would end up with two very different solutions thanks to how the languages work."

... nope.

the 'C++ way' will be the most efficient, since it is merely a code generator for C, which in turn is a code generator for ASM (sometimes derided as a 'super-assembler' rather than a high level language). you are thinking first and foremost, how to solve the problem on computer hardware (not really caring if it's C++ , Rust, FORTRAN, raw ASM or whatever).

Rust will be able to match this: and if not best tell me now and I'll ditch this. But as far as I know, the claim is 'no loss of performance', the ability to match C/C++.

The 'rust way' is a subset of techniques that are 'provably' safe (in the context of the Rust compiler's current ability to analyse): but to match C++ performance, rust must evolve, over time, to match all the use cases we have seen in 30-40+ years of C/C++ implementations.

Note that 'building an unsafe abstraction' isn't really doing something in some magic new way. It's just assigning a name to a pattern , and claiming "ok, this has been verified". We can also write such abstractions in C++ and say "hey, this is safe!".
.. but you have the same issue of discovery, it takes time to find the right name for the pattern.
The underlying implementation will be the same.

What I want is all Rusts other syntactic/semantic tweaks (enums, match, no header files etc) but with the same shortcuts I have in C++.

the languages aren't as different as you claim. Even Enums are doable in C, they're just more verbose.

zen3ger · August 11, 2017, 1:28pm

I was pointing out that the solutions will differ thanks to the different semantics and abstractions each language provides. I wasn't arguing that "because of the safety features the rust binary will be full of black magic", and it wasn't a bit-by-bit analysis of the binaries.

So far we were talking about the higher level abstractions, like lifetimes - ownership... with the extra 'headache' they come with.

And if those abstractions really don't make any difference because all native code could be written in ASM anyway, then why do we bother...

I don't see how your suggestion of opt-in unsafe has any relevance to your statements.

dobkeratops · August 11, 2017, 1:56pm

There are times when people will revert to C++ (or something else, e.g. jonathan blow building his language) - where unsafety is a better tradeoff.

For the sake of a simple option, we could make 95% of the great work done in Rust already useful to these cases, and avoid further fragmentation of the wider programming world.
Even unsafe use cases will still have safe subsets that still contribute to the safe ecosystem. (even in C++ there are some types of safe header-only libraries that don't deal with any allocations, they just take inputs and return values).

you were arguing that 'my lack of familiarity with rust' is the issue, I am retorting by explaining my perspective that the real issue here is "how to make real computer hardware do something". What rust does do is introduces some formal syntax for things we are intuitively already familiar with .. we may infer from context.. we didn't get as far as we did with C, C++ for decades being completely hopeless.

We never had lambdas for a long time in C++ but we did abstractions of sorts with #defines .. limited, and clumsy, but nonetheless capable of factoring out many repeated patterns. Most of the time I'm after lambdas it is for 'internal iterator' style use cases and I don't need to think about lifetimes: the values passed in are non-escaping temporaries. It takes some additional markup to state that to the compiler. Similarly most of the time the return values fall into one of 2 simple cases.. a 'very short'(accessors , like operator) or 'very long' ('any pointers in the result are dealt with by RAII') (there's another thread asking for 'opposite of 'static).

I know there's a middle ground where lifetime markup will allow checkable safety in more complex examples. (we are just shifting the boundary between library and user code I guess. Any complex use of 'pointers' should really be encapsulated in some sort of system. we do this in C++.)

Over time, the defaults , assumptions , assists available in the Rust compiler and surrounding tooling may change such that this markup is simplified or even automated. (e.g, from your example implementations it may be able to figure them out for you and suggest more accurately). Whatever. The point is that for different tasks, the priorities may vary.

As you can see from RFCs like this (non lexical lifetimes), the whole idea of safety is a work in progress. rust's picture of safety is not complete.

I also keep mentioning how the mere idea of array indexing introduces another dimension. You might have made your program 'unhackable' with bounds checks (I understand this is a specific goal of rust) , but the fact you have a bounds check indicates you still do not know if your program is correct.

if you (or your compiler) were confident in the validity of the indices (e.g. you know you have no glitchy polygons or whatever)... those bounds checks could be eliminated. To me, the use of bounds checks constitutes a 'debug build'.

My point is if I'm burdened with this correctness through other empirical means anyway (if it panics, thats still a bug that I have to fix), the 'safety' isn't helping me as much as you might think.

Conversely, the real reason I'm here is I'm sick to death of header files and other syntactic nuisances in C++. One of my favourite things about rust is the simple tuples. you can't retrofit this to C++, you need to introduce words ('tuple()' 'pair()') because the comma is used sub optimally. I also like the type-inference; I notice this in lambda based code, there's times when you can't avoid presenting some of the types up-front in C++). we also have the premium '' chars wasted in the type syntax on a fairly useless type of array. (type signatures are extremely important because they're the first thing you read).. and so on.

vitalyd · August 11, 2017, 3:18pm

Please provide an example of this.

True, but "complete" is a high bar for any aspect of any language. NLL has workarounds for vast majority, if not all, practical cases. The workarounds are a bit annoying and a wart but shouldn't be onerous.

Even if you don't care about security exploits, debugging corruption is a huge PITA. You're going to lose massive amount of time, certainly cancelling out any speedup you got by turning off borrow checker.

If you want to remove bounds checks, you can write non-index based code or use the explicit unsafe indexers.
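As a minimal sketch of those two routes (the function names `sum_iter` and `gather_unchecked` are made up for illustration): an iterator has no index to check, and `get_unchecked` is the standard-library unsafe accessor that skips the check when you already know the indices are valid.

```rust
/// Sums a slice through an iterator: no index is involved, so there is
/// no bounds check to pay for.
fn sum_iter(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// Gathers `values[i]` for each `i` in `indices` without bounds checks.
///
/// # Safety
/// Every element of `indices` must be less than `values.len()`.
unsafe fn gather_unchecked(values: &[f32], indices: &[usize]) -> Vec<f32> {
    indices
        .iter()
        // SAFETY: the caller promises every index is in range.
        .map(|&i| unsafe { *values.get_unchecked(i) })
        .collect()
}
```

The `unsafe fn` keeps the obligation visible at the call site: whoever calls `gather_unchecked` has to write `unsafe { ... }` and is the one vouching for the indices.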

It's a bug, yes, but it's likely easier to identify the problem because you shouldn't have too many panic points. If you do, it's a code smell. Rust makes you consider error/edge cases upfront. If you panic, you can include helpful troubleshooting info into the panic message; you may even get a backtrace if you have that enabled.

Again, just getting something to compile quickly with little care of correctness is not what Rust is about. This may seem counterproductive for quick and dirty experiments. But if those experiments turn into real code you have less code to sanitize and clean up and worry about bugs, and you make up for the "time lost" on the back end of this process.

But as mentioned upthread, let's talk about concrete usability issues with safe code. Everyone is interested in making that easier to write.

dobkeratops · August 11, 2017, 4:03pm

Please provide an example of this.

https://www.reddit.com/r/rust/comments/6kr7o5/lifetimes_nothing_escapes/

(consider in parallel the need for a trait as contributing to the overall 'bloat'.. the 2 things add up, whereas in C++ thats one direct implementation)

... coming from C/C++ i'm used to workarounds .. and I'm suggesting a #[unsafe(borrowcheck=warnings)] as one such thing, an option to catch more use-cases.

thats why I do have bounds checks in my debug-builds, and many other checks , like NaN checks. The debug build will be used in stress tests, pushing a system to its limits with pathological cases; and you need to do that sort of thing to gain insights for performance as well. e.g. we made something to automate frame rate testing ('which areas of a scene drop below 60fps'). Some people make a game play itself to train the AI.

to restate the point:
When there are other issues beyond lifetimes to check for: if the code passes those checks, the probability of lifetime problems is vastly lower. As such, front loading one set of issues is not always a win.

it's almost as if the 'full definition' of the program is somewhere between the code itself , and the tests.

This is why I keep talking about future tooling. It might be the case that an advanced tool or compiler could figure out more of the constraints from broader context, i.e. including the tests. (not just 'here's a function', but 'here's a set of examples of how it can be used.')

doesn't relate to the issue of indices (taking the rust philosophy to its extreme, foo[i] would have to return an 'Option', to make you think about the fact the index could be out of range..)

I'm not saying what Rust does is useless, (and far from it, I fully support making globals unsafe), but what I'm seeing here is a bit like the 'pure OOP' / 'pure FP' zealotry where proponents of one aspect of a language see it as the be all and end all, rather than just one extra tool that , blended with others, provides an incremental step forward.

But as mentioned upthread, let’s talk about concrete usability issues with safe code. Everyone is interested in making that easier to write.

I know that provable safety is going to take markup; thats why (as explained above) sometimes I might want to dodge it. Other times defaults could be changed

One practical suggestion is the idea of a 'temp, a kind of 'opposite of' 'static: [Short lifetime, opposite of 'static?] This is doable now with for<'a> .. but that IMO is less ergonomic than it could be .. you have to look to one side, you've had to introduce another named symbol and you must read the context of usage to determine the intent, from a general purpose construct which is clearly intended to enable more complex cases.
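For reference, a rough sketch of what the for<'a> form looks like today for an 'internal iterator' whose argument is a non-escaping temporary. The function name `for_each_upper` is hypothetical; the bound itself is ordinary higher-ranked trait bound syntax.

```rust
/// Calls `f` once per whitespace-separated word, passing an upper-cased copy.
/// The copy is a local temporary, so the caller cannot name its lifetime;
/// `for<'a>` says `f` must accept a borrow of any lifetime, which also
/// means `f` cannot stash the reference anywhere that outlives the call.
fn for_each_upper<F>(text: &str, mut f: F)
where
    F: for<'a> FnMut(&'a str),
{
    for word in text.split_whitespace() {
        let upper = word.to_uppercase(); // temporary owned by this iteration
        f(upper.as_str());
    }
}
```

Lifetime elision lets the same bound be written as `F: FnMut(&str)`, which is probably the closest existing spelling of 'the argument is just a temp'.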

Another suggestion I might make would be to free up the foo[i] syntax for safe or unsafe use, at the minute you'd have to hack it badly with something leaky. Something like this: make the language primitives fn safe_index(&self,i)->&T, unsafe fn unsafe_index(&self,i)->&T , and make the use of the operator user selectable, so if we're doing unsafe indexing most of the time, we can have it... if you really think about it, with [i] being a panic-point it might be something you want to abstract away anyway, so you might want to do some refactoring.. 'let's redefine it as unsafe to help track down all the uses..'
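To make the 'leaky hack' concrete, here is one way it could be sketched in today's Rust. The wrapper type `Unchecked` is hypothetical, not a standard-library type: the unsafe promise is made once at construction, after which [i] is checked only in debug builds.

```rust
use std::ops::Index;

/// Hypothetical wrapper whose `[i]` skips the bounds check in release builds.
struct Unchecked<'s, T>(&'s [T]);

impl<'s, T> Unchecked<'s, T> {
    /// # Safety
    /// The caller promises that every index later used with `[]` on this
    /// wrapper is less than `slice.len()`.
    unsafe fn new(slice: &'s [T]) -> Self {
        Unchecked(slice)
    }
}

impl<'s, T> Index<usize> for Unchecked<'s, T> {
    type Output = T;

    fn index(&self, i: usize) -> &T {
        // Checked in debug builds only, like a C++ assert.
        debug_assert!(i < self.0.len(), "index {} out of range", i);
        // SAFETY: guaranteed by whoever called `Unchecked::new`.
        unsafe { self.0.get_unchecked(i) }
    }
}
```

It is leaky in exactly the sense described: the `unsafe` sits at the point where the wrapper is created, far from the `v[i]` expressions that can actually go wrong.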

vitalyd · August 11, 2017, 5:21pm

I will read that Reddit post a bit later and see if I have any comments. For now a few quick ones regarding some things you mentioned.

I wasn't talking about debug builds only. It's nice to have better diagnostics in production/release builds because, let's be honest, a test of a complex system will rarely if ever cover 100% of the code.

Lifetimes, or what distills to it, is a major source of bugs in C/C++. A lot of that code is well tested by any definition of "well" - they're just too complex to get 100% coverage. A lot of those turn into security exploits, but fine, perhaps you don't care about that aspect. Debugging them is painful and a huge time sink.

You keep talking about testing, and I agree testing is needed no matter what. But having a compiler eliminate an entire class of problems makes the job easier.

That's essentially what Rust offers - the safe get() accessor returns a reference in an Option, so you have to think about the possibility you went out of bounds (plain [i] indexing panics instead). If you want to skip the safe indexing, there are unsafe APIs to do that.
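A small sketch of the difference, with made-up function names: `get` hands back an `Option` that has to be handled, while `[i]` gives the value directly and panics if the index is out of range.

```rust
/// Returns the element at `i`, or `fallback` when `i` is out of range.
/// `slice::get` returns `Option<&T>`, so the out-of-range case must be handled.
fn get_or(values: &[i32], i: usize, fallback: i32) -> i32 {
    match values.get(i) {
        Some(v) => *v,
        None => fallback,
    }
}

/// Plain indexing: returns the element, panics if `i >= values.len()`.
fn get_or_panic(values: &[i32], i: usize) -> i32 {
    values[i]
}
```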

dobkeratops · August 11, 2017, 5:31pm

in some scenarios maybe.
In some scenarios such runtime failures are unacceptable, i.e. you only release the product after you have tested it to a level where you are 99.99% confident.

Also I've worked on platforms where branches cripple performance (not even if taken; the mere presence prevents other scheduling optimisations). For me to consider rust as a 'main language' it must have been capable of the full range of niches I've experienced.

but this isn't magic. We're doing work up front. I keep saying , the tests would have caught those errors as well. There's other nontrivial logic problems.

Lifetimes are not really a big deal for me, as I explain in the other link given above I have 2 simple cases most of the time, and any more complex pointer use is built into re-useable abstractions/systems.

what is wrong with having an option?

people can discover this for themselves, in their own niches. The lifetimes help -> great, i'll use them. The lifetimes don't help, ok, I'll disable them but still get the benefit of type-inference, no-header files, tuples, better-macros, enum/match, transitive-const-by-default.

without the option.. I have to stick with C++. (despite those losses, the end result combined with mature tools is still faster to use.. and there is an incoming modules feature which will kill the header files). Or wait for JAI.

dobkeratops · August 11, 2017, 5:37pm

Lifetimes, or what distills to it, is a major source of bugs in C/C++.

nowhere near my biggest problem; my debugging time has always gone on actual behaviour.. states, logic, mathematics, figuring out how APIs and file formats behave.. and almost always requires writing some sort of experimental/diagnostic/visualisation code. ( Thats why I also think having a 'productive language' embedded right inside the 'systems language' would be a big win, but that relates to slightly different requests )

vitalyd · August 11, 2017, 9:22pm

Ok, so I read through that reddit thread. It sounds like the biggest issue there is the type inference engine isn't working as one would expect. A workaround is to tickle the compiler to infer more eagerly, or provide a type ascription that identifies the real type (&usize in that case). I've seen this with closures before, particularly ones that receive reference arguments. So, to some degree, that's a known limitation/annoyance specific to closures.
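A minimal sketch of that closure limitation and the two workarounds, not taken from the Reddit thread itself; `count_matching`, `constrain` and `demo` are hypothetical names. A closure passed inline picks up its argument type from the bound, whereas one bound to a variable first typically needs either the `&usize` annotation or a helper that supplies the expected signature.

```rust
/// Counts the items for which `pred` returns true.
fn count_matching<F>(items: &[usize], pred: F) -> usize
where
    F: Fn(&usize) -> bool,
{
    items.iter().filter(|x| pred(*x)).count()
}

/// Identity helper whose only job is to give the closure an expected
/// signature, so inference happens at the point the closure is written.
fn constrain<F: Fn(&usize) -> bool>(f: F) -> F {
    f
}

fn demo() -> (usize, usize, usize) {
    let items = [1usize, 2, 3, 4];
    // Inline: the bound on `pred` tells the compiler `x` is `&usize`.
    let inline = count_matching(&items, |x| *x > 2);
    // Stored first: annotate the argument type explicitly.
    let annotated = |x: &usize| *x > 2;
    // Stored first: let the helper's bound drive the inference instead.
    let helped = constrain(|x| *x > 2);
    (
        inline,
        count_matching(&items, annotated),
        count_matching(&items, helped),
    )
}
```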

I think you got a good answer there with regards to the lifetime bounds though (and you had more than necessary in the original code). Writing generic code that's dealing with references is going to require lifetime annotations. Just like generic code requires type constraints/bounds when certain functionality needs to be required of the types, lifetimes need to be specified in similar vein. I don't think much can be elided/inferred there by the compiler since, by definition, there's very little information to go by there - no concrete types are known. So in some sense, one can make the argument that Rust should also switch to dynamic typing since putting type annotations is annoying and one "knows" that the type is correct at runtime anyway .

dobkeratops · August 11, 2017, 9:50pm

I wouldn't call it dynamic typing: If i've understood correctly, the behaviour of a scripting language with dynamic types is different to 'type inference with static types', e.g. what can happen with types in sequences and structures - a truly dynamic language would keep things as unions/dictionaries etc?

something they suggest is 'use macros' but I find that off-putting, because it's a syntactic (and semantic) change vs functions. (again the JAI videos explain the utility in having different cases syntactically close so it's easy to evolve code back and forth..generalize/specialise.. extract a block..)

it would be great to have these options for more inference with the Rust syntax suiting it (there's proposals for C++ using 'auto' in arguments to turn them into typeparams, but not quite as slick as rust where you could just 'omit the trailing type').

vitalyd · August 11, 2017, 9:56pm

No, I meant someone can come along and say they like some of the syntax/sugar but don't like the strong typing - they want to write x.foo on any type because they "know" it'll be there in practice. It's basically a similar argument to what you're proposing, albeit more drastic but the analogy is fundamentally correct I think - someone asking to change a core tenet of the language because it gets in the way for their use case but they like the syntax better than, eg, Python.

And to be clear, I'm not saying there should be no escape hatch for cases where type system/compiler doesnt allow expressing a construct. I like that unsafe exists, but I love that it's scoped and not the default. I'm merely saying I don't think global opt-out makes sense in the spirit of the language. If you're comfortable with Rust, there shouldn't be too many places you need unsafe (unless you're writing low level code/lib and need to squeeze perf, interface with raw memory allocation, hardware, and so on).

dobkeratops · August 11, 2017, 10:06pm

my idea isn't as drastic as the gulf to python: because in python any function can add the fields (the 'objects' are dictionaries?) ... under my vision you'll still get type errors .. structs would still have to be explicitly created; you just wouldn't need all the trait bounds all the time. .. you'd go back in and add them as soon as you found the error messages getting out of hand (you figure out on a case by case basis what the bigger problem is).

I have dabbled with haskell and that's basically how I find it works out. you can start with few types, you do have to add more to get it to work , but you still didn't need them everywhere;

I've got another suggestion to 'lighten things up' inspired by haskell, i.e. not requiring trait-impls to re-list the types (infer from the trait def) ; see here: Infer function signatures from trait declaration into 'impl's by dobkeratops · Pull Request #2063 · rust-lang/rfcs · GitHub

it's troubling how this gets 2 thumbs up, 9 thumbs down .. it suggests I'm in a community whose preferences are vastly different and the language is unlikely to move in the direction I'm after, and that's probably one of my most 'tame' suggestions

vitalyd · August 11, 2017, 10:35pm

I was using the Python/dynamic language as a dramatization of what you're asking for, at a high level. You like some aspects of Rust, but find some of its core aspects problematic/counterproductive/useless(?) to your use case/etc. I'm just saying that someone can raise the bar on that thought and want to throw out or make optional even more things because they also like a bunch of things about Rust, but less of them than you. I'm just throwing a straw man out there for illustration.

Rust, as it is, is trying to walk a very fine line between being explicit yet ergonomic and productive; and it threw the GC approach out in favor of a fairly novel type system, at least for mainstream languages. It takes a very strong stance on ownership and mutability and error handling. It's built on certain principles, and it'll appeal to people that share them. I think they've done an admirable job of trying to balance that triple (perf, ergonomics, safety). But it's not a fast forgiving prototyping language - it never will be. It'll get better usability, ergonomics, error reporting, and so on but it won't be as "easy" to prototype as some other languages. And I personally think it's good - it's a general purpose but focused/opinionated language.

If you like most things about Rust, it's more productive to brainstorm with the community how to make the other things better but stick to core principles and goals. It's hard - turning off borrow checking globally is easy (conceptually), but it's conceding to what the language is against - that would be sad.

dobkeratops · August 11, 2017, 10:44pm

and it threw the GC approach out in favor of a fairly novel type system, at least for mainstream languages. It takes a very strong stance on ownership and mutability and error handling

thats all fine, the 'core language engine' is fantastic. the mutability part is one of the strong draws; for optimisation in C++ we sometimes need to mark things as 'restrict' to fudge 'no aliasing'. I don't think they pass the hints through to LLVM yet but in principle rust should allow that almost everywhere by virtue of knowing things are 'really immutable', not just 'not mutated through this pointer'.

one issue is we're losing a middle ground between the fully annotated safety, and the raw pointer; C++'s reference types aren't as safe as rust, but they're definitely more safe than the raw pointer. It's like we have to go one way or the other, and they're both more verbose here.

vitalyd · August 11, 2017, 11:03pm

Correct. I think LLVM has some bugs when given the noalias attribute but I don't recall all the details offhand. If/when that's fixed, look forward to better codegen by LLVM

The middle ground for Rust is the scoped unsafe block that allows you to read/write raw pointers and turns off borrow checking on said pointers (amongst a few other things it allows in there, but these are the relevant bits for this discussion). As mentioned, if/when you're comfortable with Rust, you should be able to express most scenarios using safe code. For those where you can't or want to eke out more performance, there's the unsafe escape hatch. It's clearly marked which serves as a reminder to anyone looking at the code that "you're on your own" here and should be extra vigilant about ensuring safety/soundness.
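As a small illustration of what 'scoped' means here (the function name `swap_raw` is made up): the signature and the range checks are safe code, and only the few lines that touch raw pointers sit inside the unsafe block.

```rust
/// Swaps two elements of a slice through raw pointers.
/// Returns false, leaving the slice untouched, if either index is out of range.
fn swap_raw(values: &mut [i32], a: usize, b: usize) -> bool {
    if a >= values.len() || b >= values.len() {
        return false;
    }
    let base = values.as_mut_ptr();
    // SAFETY: both offsets were checked against the length above, and
    // `ptr::swap` is allowed to be given the same pointer twice.
    unsafe {
        std::ptr::swap(base.add(a), base.add(b));
    }
    true
}
```

Callers never write `unsafe` themselves; the block marks the one place a reviewer has to check by hand.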

What I'm having a hard time digesting in this lengthy thread is the notion that most of Rust is unwieldy and hard to work with. This may be true for beginners, no doubt - it's a complex language even without lifetimes. Pervasive type inference can be unsettling in the beginning. Etc. But, once comfort level is built up, I don't think this thread portrays the correct picture. IMO only, of course. There are warts and head scratchers here and there, but they are few and are dwarfed by all the positive things. Not to say those warts should be left alone - they're known and people are working on them. But, let's look at the big picture and stop asking for perfection.

dobkeratops · August 11, 2017, 11:27pm

C++ "&" does tell me 'it's not owned, it's not null' etc. Thats the middle ground I'd like to get here.

It could be an issue of 'what to use the default syntax for'. What if we had another 'unsafe reference type' , carrying the same meaning as the C++ "&". That would fix it; however it would probably take some verbose syntax to achieve because the syntax space is already taken up.

sure, I know rust can do anything through a combination of safety and unsafe blocks as a catch all; the issue is the amount of markup to get there. You lose the in unsafe, and so on. And when you combine it with the markup for traits (I've always enjoyed ad hoc overload in C++ .. I'm very happy with the way you can do maths there) thats when it crosses a pain threshold (that has me thinking 'better the devil you know').

I think with some tweaks you could get something that was more pleasant 100% of the time (because the underlying language engine is so good)

Heh. similarly what I find surprising is the way everyone here talks as if unsafety makes C++ impossible to work with . The pain points in C++ come from silly issues in team scenarios (my big request there is UFCS, r.e. fixing all the stresses over 'what should go in the class..', it would even give more leeway with what gets exposed in headers in some situations)

vitalyd · August 12, 2017, 2:02am

C++ allows bypassing the non-nullness - it's just UB if it is null; optimizer can assume it's non null.

https://doc.rust-lang.org/core/nonzero/struct.NonZero.html would be the comparable construct today (on nightly channel).
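The linked NonZero type was a nightly-only API at the time. As a rough sketch of the same idea using types that are in stable Rust now (an assumption about the reader's toolchain, not something from the thread): a reference or a `std::ptr::NonNull` can never be null, so `Option` can use the null value for `None` at no extra size cost. The function names are hypothetical.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// True when `Option<&u32>` takes no more space than a raw pointer.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u32>>() == size_of::<*const u32>()
}

/// True when `Option<NonNull<u32>>` takes no more space than a raw pointer.
fn option_nonnull_is_pointer_sized() -> bool {
    size_of::<Option<NonNull<u32>>>() == size_of::<*mut u32>()
}

/// Turns a possibly-null raw pointer into an explicit `Option`.
fn wrap(p: *mut u32) -> Option<NonNull<u32>> {
    NonNull::new(p)
}
```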

It's not impossible of course, just very difficult to not cut yourself in a big and complex application. Modern C++ (11 and onwards) and its practices are better than days past, but essentially you get very little help from the compiler in terms of memory and thread safety. And it's the Wild West where features can be dropped, kind of like you want in Rust . Can't make your code work with const? const_cast it away. Want to make a null reference? static_cast away. Want to violate strict aliasing? Go for it and turn off optimizations that assume no such thing. And on and on and on. It's really hard to reason about correctness in such situations on large codebases. It makes refactoring undesirable and in fact probably leads to less efficient code in some places because defensive copies/clones are made because the ownership/lifetime story is obfuscated behind layers of code and perhaps even threads.

Basically, C++ requires a lot of discipline and constant attention to lots of minutiae. Who do you trust to be more disciplined and not slip up, ever, in focus/attention? You and your coworkers or the compiler?

dobkeratops · August 12, 2017, 3:38am

but that's not what I'm after .. I'm just after the 'safe-ish' case of c++ usage.. the middle ground.

the main thing IMO is not so much the safety , but the syntactic help for writing safe code; if the 'good' way to do thing is laborious, people will take shortcuts.
e.g. I see 'immutable by default' as good because it saves markup, compared to having to write 'const' (and having to do more for the fact it isn't transitive).

The 'expression based syntax' makes it easier to write variables that are initialised, again thats great. (declaring and setting something is safe, and even saves a bit of typing compared to declaring something then setting it elsewhere.).
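A tiny sketch of the point (the function name `classify` is made up): because `if` is an expression, the binding is initialised at its declaration and never needs to be mutable.

```rust
/// Labels a number; `label` is assigned exactly once, at its declaration.
fn classify(n: i32) -> &'static str {
    // No `let mut label; if ... { label = ... }` dance: the whole `if`
    // chain is a single expression that produces the value.
    let label = if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    };
    label
}
```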

Some types of functional abstraction reduce the number of temporaries you need to use.
and so on.

Basically, C++ requires a lot of discipline and constant attention to lots of minutiae. Who do you trust to be more disciplined and not slip up, ever, in focus/attention?

sure but you need to test it for other purposes. Every project I've worked on has required shaving off every last byte and every last cycle toward the end.. really pointer bugs are nothing compared to re-working things for cache efficiency, and the more difficult scenarios are unsafe anyway (e.g. streaming system.. file DMA and low level graphics API overlapping.. beyond the CPU) and actually the process of optimization is re-ordering things to make traversals and use of memory simpler (allocations = pointer chasing = cache misses).

Rust would help me by virtue of those features I keep listing, but is crippled by front-loading that one set of issues that I don't really care about all the time.

other features
    A
    |          WHAT I NEED
    |             /
    |           X           
    |     C++                 
    |
    |                
    |            Rust
    |
    +- - - - - - - - - - - - ->
                  some features

this is basically the situation. On some measures, rust demonstrates a set of things I want, hence shows me they're possible. On some other measures (which are orthogonal), Rust becomes a step backwards.

If you just literally took C++ and added the tweaks, the result would be superior for me.

as it stands today, C++ remains the best option.

HadrienG · August 12, 2017, 9:47am

If your ASCII plot is an accurate representation of your feelings, then by virtue of Euclidean distance, you would be better off looking for what you want in the C++ community rather than here.

dobkeratops · August 12, 2017, 10:01am

, then by virtue of Euclidean distance,

The frustrating thing is,

If you were to measure 'what I need' / 'C++' / 'Rust' in terms of whats in the compiler codebase, and the syntax, rust is much closer, e.g. 95% perfect. It's just a few superficial options/choices (disabling things the underlying engine is capable of) that prevents it being suitable.

X= Rust + unsafe module option, whole program inference enabled, recover the sigils, and optional traitless generics (put the traits in when errors are too big).

or
X= C++ with... new CFG syntax, no headers, UFCS, 2-way type-inference retrofit, 'immutable by default' option, ADTs, nicer lambdas, (and maybe concepts, and a borrow checker option is definitely useful as well)

It would be 5x more work to modify C++ to get what I want versus modifying rust

Rust is also a far younger language with vastly less momentum , hence much easier to steer.
(e.g. I have to fight 1000 entrenched opponents here, versus fighting 100,000 entrenched opponents in C++, whatever it is )

look how long it took to get Lambdas in C++, look how the arguments over UFCS went, how long people resisted 'auto', etc..

You can't make breaking changes to the C++ syntax, whereas everything I'm after is 100% compatible with the existing Rust syntax.
