Safe/unsafety of operators


#1

Would there be any path to generalise the safety/unsafety of operators? Are there any RFCs for this, e.g.:

  • unsafe array indexing / implementing c-like offsetting of raw pointers
  • maths types which need special attention
  • ‘deref’ ?

(As I understand it, at the moment the safety is baked into the meaning of the operator itself, e.g. trait Index { fn index(..) }.)

One workaround suggested in a recent thread was ‘make the constructor of unsafe types unsafe’, so as to control their use. That’s an interesting idea, but it might go too far (e.g. an unsafe type could still be used safely with iterators that bypass indexing). I’m actually happy with the concept of the safe/unsafe divide; it’s just that I’d like the option of unsafe code being more ergonomic when needed.

Another idea was a procedural macro to rewrite [] calls as something else…

I know you could also just make an unsafe [] implementation, but that would probably preclude one’s code from being shareable on crates.io. (Conversely, I’d imagine it being useful to have some community-moderated ‘confidence level’/‘verification’ of crates using unsafe blocks, and special flagging of 100% safe crates.)
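
A minimal sketch of the ‘unsafe constructor’ workaround combined with an unchecked [] implementation. The type name `Unchecked` and its contract are assumptions made for illustration, not an existing library API; the only standard pieces are `std::ops::Index` and the slice method `get_unchecked`.

```rust
use std::ops::Index;

/// Hypothetical wrapper: `[]` on this type skips the bounds check.
pub struct Unchecked<'a, T>(&'a [T]);

impl<'a, T> Unchecked<'a, T> {
    /// # Safety
    /// The caller promises that every index later passed to `[]`
    /// is less than `slice.len()`.
    pub unsafe fn new(slice: &'a [T]) -> Self {
        Unchecked(slice)
    }
}

impl<'a, T> Index<usize> for Unchecked<'a, T> {
    type Output = T;

    fn index(&self, i: usize) -> &T {
        // No bounds test here: soundness rests entirely on the promise
        // made when `new` was called.
        unsafe { self.0.get_unchecked(i) }
    }
}

fn main() {
    let data = [10, 20, 30];
    // The single `unsafe` block marks where the obligation is taken on.
    let view = unsafe { Unchecked::new(&data) };
    println!("{}", view[0] + view[1] + view[2]);
}
```

The trade-off is the one described above: the `unsafe` keyword appears once at construction, and every later `[]` looks like ordinary safe code even though it is not checked.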


#2

First of all, why do you want a special unsafe operator? I mean, Rust tends to be used in a rather safe manner, and if you really want to do something unsafe, you are forced to mark it explicitly. That is why I don’t think it makes sense to add any sugar for unsafe code. If you really want unchecked indexing for your struct, just write a method called “unchecked_at” or something like that. The standard containers actually have such things, e.g. get_unchecked on slices (and therefore on Vec). Also, I don’t see any reason to preclude anyone from publishing anything on crates.io; as far as I understand, it’s just an open place to share your Rust libraries, like GitHub but more specialized. However, searching crates by their safety may be a good idea.
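
For comparison, a short sketch of the three access styles the standard library already offers on slices and Vec. The helper names `read_checked` and `read_unchecked` are made up for this example; `get` and `get_unchecked` are the real slice methods.

```rust
/// Checked access: returns None instead of panicking when `i` is out of range.
fn read_checked(v: &[i32], i: usize) -> Option<i32> {
    v.get(i).copied()
}

/// Unchecked access (hypothetical helper name).
///
/// # Safety
/// The caller must guarantee `i < v.len()`; otherwise this is undefined behaviour.
unsafe fn read_unchecked(v: &[i32], i: usize) -> i32 {
    unsafe { *v.get_unchecked(i) }
}

fn main() {
    let v = vec![1, 2, 3];
    // `v[1]` is checked and panics when the index is out of range.
    println!("{}", v[1]);
    // `get` is checked and yields None when the index is out of range.
    println!("{:?}", read_checked(&v, 7));
    // `get_unchecked` performs no test at all, so the call site must be `unsafe`.
    println!("{}", unsafe { read_unchecked(&v, 2) });
}
```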


#3

“I mean, Rust tends to be used in a rather safe manner”

Someone, somewhere, still has to write the unsafe code. I find that Rust makes unsafe code less ergonomic.

“you are forced to explicitly mark it.”

I’m not challenging that. I’m just suggesting that you should be able to choose what the convenient syntax is targeted at, e.g. a predominantly unsafe module (such as the internals of a memory allocator or collection class) would benefit from a source-file-wide ‘mode switch’ of sorts. (But that makes it sound too radical; what I’m proposing is more along the lines of use )

“If you really want unchecked indexing for your struct, just make a function, like “unchecked_at””

Yeah, that’s what’s annoying if unchecked array indexing is your majority case.
Checked array indexing represents code which has not yet been debugged: the need for a check is an admission that you may still have logical errors. In some domains, this means your program is still incomplete and must go through further empirical testing until you are confident the logic is sound.

“I don’t see any reason for preclude anyone from publishing anything on crates.io”

I agree: I wouldn’t suggest precluding publishing or sharing, just making the status of what is shared clearer. Given the intent of the safe/unsafe divide, I think it’s natural to question the safety of code you haven’t verified. Some people objected, saying ‘crates shouldn’t be penalised for using unsafe’, but I’d argue it’s more ‘rewarding crates for being 100% safe’.


#4

But it is (or at least I guess it should be) a relatively small amount of code. I don’t see any good reason to add syntax sugar for corner cases. And I am saying this while working on a highly unsafe crate.

But Rust tends to be very explicit everywhere. This is a very big advantage: when I am looking at some Rust code, I see exactly what happens. I won’t be surprised by side effects or by some unsafe behaviour. Under your proposal, Rust code would behave differently based on some compiler flag, and the difference would be in one of the most important things in Rust: safety.

Disagree. Checked array indexing represents indexing an array with data from an unknown source. If you are indexing an array with data from stdio/file/network without a check, it’s definitely a major safety issue. Most indexing is such a case; even if not directly, the indices are computed from such data. At some step you have to check it, and the best is to do it as late as possible, so you are sure you really need it. The best place is the moment when you are indexing. However, I can imagine cases where I am perfectly sure the index is correct, but in such cases:
a) it is easily deducible from the code, e.g.:

if myvec.len() > 5 {
    let x = myvec.len() - 1;
    let y = myvec[x];
}

In such a case the bounds-check elimination should be done by the compiler. I am not sure if it’s done now (however, as far as I know LLVM, it should do this almost for free), but if it’s not, I would recommend working on the optimizer instead of the proposed syntax;
b) the container was indexed with this index before and hasn’t changed since; however, I don’t see any reason why you cannot keep a reference to the indexed element instead of an index;
c) it is based on complicated logic, or on programmer knowledge which cannot be deduced from the code, which is unsafe from the language’s POV, and it should be done loudly.
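
Cases a) and b) can be sketched in safe code as follows. The function names are invented for the example, and whether the optimizer really removes every remaining check depends on the compiler version and surrounding code, so treat this as an illustration of the idea rather than a performance guarantee.

```rust
/// Case a): do one bounds check up front, then work on the sub-slice.
/// The slice expression panics if `n > v.len()`; iterating `head`
/// afterwards needs no per-element index check.
fn sum_first(v: &[u32], n: usize) -> u32 {
    let head = &v[..n];
    head.iter().sum()
}

/// Case b): keep a reference to the element instead of indexing repeatedly.
fn bump_twice(v: &mut [u32], i: usize) {
    let slot = &mut v[i]; // the only bounds check
    *slot += 1;
    *slot += 1;
}

fn main() {
    let mut data = [1, 2, 3, 4];
    println!("{}", sum_first(&data, 3));
    bump_twice(&mut data, 1);
    println!("{:?}", data);
}
```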


#5

Unsafe array indexing is the most important use case for me; similarly the most important numeric type is an ‘unsafe’ subset of ‘float’ which is really assumed to be ‘never NaN’ (whole-program logic must ensure no NaN values are generated).
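
A rough sketch of what such a ‘never NaN’ float could look like as a newtype. `NotNan`, `new` and `assume` are hypothetical names chosen for this example; the invariant is only verified in debug builds by `assume`, which mirrors the ‘whole-program logic must ensure it’ assumption.

```rust
use std::cmp::Ordering;

/// Hypothetical newtype: an f32 that the program promises is never NaN.
#[derive(Clone, Copy, Debug, PartialEq, PartialOrd)]
struct NotNan(f32);

impl NotNan {
    /// Checked constructor, for values arriving from outside the program.
    fn new(x: f32) -> Option<Self> {
        if x.is_nan() { None } else { Some(NotNan(x)) }
    }

    /// Trusting constructor: the NaN test only exists in debug builds.
    fn assume(x: f32) -> Self {
        debug_assert!(!x.is_nan(), "NotNan invariant violated");
        NotNan(x)
    }
}

impl Eq for NotNan {}

impl Ord for NotNan {
    fn cmp(&self, other: &Self) -> Ordering {
        // Without NaN, partial_cmp on the inner floats always returns Some.
        self.0.partial_cmp(&other.0).expect("NotNan invariant violated")
    }
}

fn main() {
    let mut xs: Vec<NotNan> = [3.0, 1.0, 2.0].iter().map(|&x| NotNan::assume(x)).collect();
    xs.sort(); // available because NotNan implements Ord
    println!("{:?}", xs);
}
```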

“Disagree. Checking array indexing represents indexing array with data of unknown source.”

So that’s data read in: you check it and pass it on to the rest of your program.
You might be writing IO-heavy code; my use cases are all pipelines of transformations.

“At some step you have to check it, and the best is to do it as late as possible,”

Strongly disagree: test as early as possible, then the rest of your pipeline/program internals can take advantage of the assumptions.

Anyway

why the hell are you trying to tell me things about my use cases???

I have other reasons to test. The best scenario FOR ME is extra tests in a debug build. There’s way more besides the indices merely being in-range that I’d like to know about.
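
One way to get ‘extra tests in a debug build’ with today’s Rust is `debug_assert!`, which is compiled out when debug assertions are disabled (the default for release builds). The helpers `at` and `gather` below are hypothetical names for a sketch of that approach.

```rust
/// Hypothetical helper: bounds-checked in debug builds, unchecked otherwise.
///
/// # Safety
/// The caller must guarantee `i < v.len()`.
#[inline]
unsafe fn at<T>(v: &[T], i: usize) -> &T {
    debug_assert!(i < v.len(), "index {} out of range for length {}", i, v.len());
    unsafe { v.get_unchecked(i) }
}

/// A pipeline stage that trusts indices validated by an earlier stage.
///
/// # Safety
/// Every element of `indices` must be less than `values.len()`.
unsafe fn gather(values: &[f32], indices: &[usize]) -> Vec<f32> {
    indices.iter().map(|&i| unsafe { *at(values, i) }).collect()
}

fn main() {
    let values = [1.0, 2.0, 3.0];
    let picked = unsafe { gather(&values, &[2, 0]) };
    println!("{:?}", picked);
}
```

Other debug-only checks (sortedness, value ranges and so on) can be added with further `debug_assert!` lines without affecting release builds.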

There are many niches in software. Don’t assume too much about what I want to do, or why, or what experience I have that I’m projecting into the future.

A systems language needs to give precise control - ‘don’t pay for what you don’t use’. You profile, but short of trying every architectural decision (which would mean developing multiple versions of the same program in parallel) there’s a rule of thumb that ‘no work is cheaper than work’. Checks are a form of work. I’ve worked on platforms that are utterly intolerant of branches, and I know the reasons behind that may recur in up and coming niches (like accelerators for AI)


#6

For context, see the recent discussion: https://www.reddit.com/r/rust/comments/6whm7s/an_interesting_view_on_graphics_programming_rust/

So as I understand it, this is about making it possible to write unsafe very high performance code conveniently, for those who can’t tolerate any runtime checks at all in critical parts of the code (which I guess are of large enough extent to make it worth the bother of adapting the language), and where traditional testing must be used to prove safeness, rather than Rust’s current combination of compile-time and runtime checks. It’s about seeing whether Rust can be made into a language that is comfortable to use for this purpose.

(I don’t know enough to comment on the suggestion itself.)


#7

The safety has to be baked into the operator’s trait, because otherwise, in a generic context (where you don’t know what the types are), how would the compiler know whether something is safe or not? The trait defines the interface, including any potential unsafety. Making a trait whose implementation is sometimes unsafe and sometimes not means the compiler has to conservatively assume everything is unsafe.

I would go so far as to say that unsafe code is intended to be unergonomic, because Rust’s main job is to promote safe code.

One possibility is that of rebindable syntax: the idea would be to allow the user to pick what operators such as [], etc. mean in a given source file. So rather than desugaring them to std::ops::Index all the time, it could be overridden to desugar to self::Index or something.

Anyway, that’s pretty far-fetched. If you do use pervasively unsafe code, I recommend a shorthand like ptr.o(index) for ptr.offset(index).
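
Such a shorthand can be written today as an extension trait. `PtrExt` and `o` are names invented for this sketch; `offset` is the real raw-pointer method it forwards to.

```rust
/// Hypothetical extension trait giving raw pointers a short `.o(i)` method.
trait PtrExt: Sized {
    /// # Safety
    /// Same requirements as the raw-pointer `offset` method: the result
    /// must stay within (or one past the end of) the same allocation.
    unsafe fn o(self, i: isize) -> Self;
}

impl<T> PtrExt for *const T {
    unsafe fn o(self, i: isize) -> Self {
        unsafe { self.offset(i) }
    }
}

impl<T> PtrExt for *mut T {
    unsafe fn o(self, i: isize) -> Self {
        unsafe { self.offset(i) }
    }
}

/// Reads the second element through a raw pointer.
fn second(v: &[i32]) -> i32 {
    assert!(v.len() >= 2);
    let p = v.as_ptr();
    unsafe { *p.o(1) }
}

fn main() {
    println!("{}", second(&[7, 8, 9]));
}
```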


#8

That to me is crazy.

What I do like about rust is where it makes safe code easier to write (e.g. the expression based syntax, inbuilt ADTs etc), but there’s no need to cripple the unsafe blocks to achieve that.

C++ would just live on for the use cases I describe.

“One possibility is that of rebindable syntax: the idea would be to allow the user to pick what operators such as [], etc mean in a given source file.”

Yes: that’s what I’m after. Would there be another use case with wrapping/nonwrapping arithmetic?
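
For wrapping versus non-wrapping arithmetic, the standard library currently selects the behaviour per type or per method rather than per source file. A small sketch (the function names are made up; `std::num::Wrapping`, `checked_add` and `saturating_add` are real):

```rust
use std::num::Wrapping;

/// `+` on `Wrapping<u8>` wraps around on overflow instead of panicking.
fn add_wrapping(a: u8, b: u8) -> u8 {
    (Wrapping(a) + Wrapping(b)).0
}

/// Explicit method: reports overflow as None.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Explicit method: clamps at the maximum value.
fn add_saturating(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}

fn main() {
    println!("{}", add_wrapping(250, 10));
    println!("{:?}", add_checked(250, 10));
    println!("{}", add_saturating(250, 10));
}
```

`Wrapping` is the closest existing analogue to rebinding an operator: the choice of semantics is carried by the operand type, so `+` keeps its usual syntax.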

“I recommend a shorthand like ptr.o(index) for ptr.offset(index).”

To me that wouldn’t really help: once you’ve got .ident() in place, you’re not really saving much on the character count, although I would certainly do a ‘.get(i)’ as a compromise for the moment.


#9

I wasn’t clear. Obviously, if your code can take advantage of such assumptions, an early check is better. But in that case, if the code is written properly, the compiler should be able to optimise the repeated checks out. It’s such a basic optimisation that I can’t believe rustc can’t do it, but I will run some tests on it tomorrow, just out of curiosity. Obviously this becomes more difficult when not everything can be inlined, but in that case good design helps, and if there is no obvious design, it’s probably also not obvious to someone reading the code, so being explicit is good.

Sorry if you feel offended; that wasn’t my intention. I’m not saying anything about your particular case; I was talking in general. I don’t know what exactly you are doing, and I don’t know anything about your experience, and to be honest, I don’t care much about it. You made a proposal on a public forum; I commented on it based on my knowledge and experience, and I gave some arguments. I also said that I know there actually ARE rare cases where such behaviour is needed, but I claim that it should be explicit, not nicely decorated. However, unless you specify your case, the only thing I can do is talk in general. And in general, such problems can be optimised out by the compiler, or come from bad design.

You have precise control. It’s just very expressive. And that’s good, because code is read much more often than it is written.

I actually believe that it’s one of the main assumptions of Rust.


#10

“I actually believe that it’s one of the main assumptions of Rust.”

surely ‘unsafe’ syntax highlighted in red is enough


#11

I also missed you bringing up floats and NaNs. Rust doesn’t check whether a float is NaN or not: (5.0 / 0.0) + 1.0 is well-defined arithmetic and it’s just done by the processor. And yes, I checked it. And if you are complaining about the lack of Ord, I also miss it, but it’s actually correct not to provide it by default. If you really need it, there is a crate, https://docs.rs/ord_subset/2.0.0/ord_subset/, which does the job.
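
A short standard-library-only illustration of these float semantics. The function names are invented for the example, and `total_cmp` is used here as a stand-in for a total order; it is not the approach taken by the ord_subset crate.

```rust
use std::cmp::Ordering;

/// Division by zero on floats does not panic; the result here is infinite.
fn div_then_add() -> f64 {
    (5.0 / 0.0) + 1.0
}

/// Floats only have a partial order: comparisons involving NaN give None.
fn compare(a: f64, b: f64) -> Option<Ordering> {
    a.partial_cmp(&b)
}

/// Sorting needs a total order, so one has to be supplied explicitly.
fn sort_total(v: &mut [f64]) {
    v.sort_by(|a, b| a.total_cmp(b));
}

fn main() {
    println!("{}", div_then_add());
    println!("{:?}", compare(1.0, f64::NAN));
    let mut v = [2.0, f64::INFINITY, 1.0];
    sort_total(&mut v);
    println!("{:?}", v);
}
```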


#12

OK, I’ll take a look at that.