Adding safe abstraction into unsafe global raw pointer

droundy · June 28, 2021, 9:15pm

It sounds like your question translates approximately to "can I write a safe allocator?" The answer is "yes" but the details of doing so will depend on what you intend to use it for. If you want to support threads, for instance, then any access to global variables will need to be synchronized. If you want to create objects of different types (particularly types that are Drop), then that introduces other subtleties.
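As a rough sketch of the synchronization point only, assuming the global is a single pointer and that `core` is available: the names `G_ADDRESS`, `set_base`, and `base` below are hypothetical. Storing and loading the pointer can be made safe with `AtomicPtr`; dereferencing what comes back is still unsafe and still depends on who owns the data.

```rust
use core::ptr::{self, NonNull};
use core::sync::atomic::{AtomicPtr, Ordering};

// Hypothetical global, standing in for an address handed over by foreign code.
static G_ADDRESS: AtomicPtr<u8> = AtomicPtr::new(ptr::null_mut());

/// Publishes a new base address. This is safe because storing a pointer
/// never touches the memory it points to.
pub fn set_base(p: *mut u8) {
    G_ADDRESS.store(p, Ordering::Release);
}

/// Returns the current base address, or `None` while it is still null.
/// The caller still needs `unsafe` to read or write through the result.
pub fn base() -> Option<NonNull<u8>> {
    NonNull::new(G_ADDRESS.load(Ordering::Acquire))
}
```

This only removes data races on the pointer value itself. It says nothing about the lifetime, type, or size of the data behind it.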

Can you give us any hint as to your goal?

JamesG · June 28, 2021, 9:17pm

In my first message in this thread, I mentioned that the standard library is not available. Your reply reminds me that I need to specify further that 'panic' is not available either. In a kernel-mode program, if something goes wrong the OS will take care of crashing it. The availability of panic has minimal impact on my question, though.

2e71828 · June 28, 2021, 9:25pm

That's why I restricted my example to APIs available in core rather than std. As far as I understand, the core library is generally expected to be present for all Rust environments, even extremely restricted ones. I think that no-core programming is technically possible, but that gets into pretty deep-magic territory.

jameseb7 · June 28, 2021, 9:40pm

Panic is generally still available in no_std programs, but you need to write your own panic handler, annotated with #[panic_handler], to handle it since the standard library panic handling code is not available. Information on writing panic handlers can be found in the nomicon. In your case, you could notify the OS to crash it somehow. A panic handles the failure a lot more cleanly and reliably than just allowing out-of-bounds accesses with no checks.

JamesG · June 28, 2021, 10:09pm

@jameseb7, yup, I need to override the panic handler. It just happens that I am not there yet. This is tangential to my main question, so let's assume that I have a low-level panic handler.

ZiCog · June 28, 2021, 10:56pm

That is not true. If you are writing regular Rust code, without any unsafe you cannot overrun a buffer. In C/C++ you can.

I will generalize.

When I arrive to work on a project, with thousands of files and 10's, 100's of thousands or a million lines of code, a global variable makes me nervous. How do I know what happens if I tweak that variable? No idea unless I spend a year tracking all uses of it down. Or trusting what the guys who have been working on the code base for a decade tell me.

No.

The way I read what you have said is that `gAddress` contains the address of some unknown data of some unknown type. `PVOID` as you say.

Not only that, we don't know who owns that data. Who will deallocate it, and when?

I see no way that the Rust language can provide any safety guarantees in the face of so many unknowns.

JamesG · June 28, 2021, 11:37pm

You cannot compare only "safe Rust" with general "C/C++" and draw a conclusion.

Developers inherit programming environments, and those may contain global variables spread across millions of lines of code. When system programs interact with multiple different systems, local reasoning can be a theoretical wish.

I wrote "unknown type", but it should be some generic type T, where T is a program-specific data type. My curiosity here is this: if Rust can provide some "memory safety" (I don't know exactly what this includes yet) on a global raw pointer, it may be possible to guarantee the same in C/C++.

JamesG · June 29, 2021, 12:00am

I was wrong to assume that core was part of the standard library. How about core::slice? For example, the documentation page "from_raw_parts in core::slice - Rust" (rust-lang.org) mentions std::slice. Has core::slice been separated from std::slice?

ZiCog · June 29, 2021, 1:30am

When it comes to accessing arrays ("Vec", "vector" whatever you call them) out of bounds I am 100% sure regular safe Rust will not allow it and your program will fail immediately. Meanwhile C and C++ will compile and run it with whatever random data corruption and crashes that may, or may not, occur.

I have enough experience of that to make a conclusion.

I think you are right.

However I don't think we should just shrug our shoulders and give up saying "That is how the world is and will always be". At least not when it is clear that much can be done to make the situation better.

Hmm... Rust does not provide any memory safety for the use of global raw pointers. Neither does C++. The difference being that in Rust you have to announce such potentially dangerous actions by using unsafe.

Interestingly, the creator of C++, Bjarne Stroustrup, has said recently that he would like to see Rust-like memory safety guarantees in C++: ideas of ownership, the borrow checker, etc. Not only that, he claimed that C++ could do it better.

All sounds great to me. Let's see how it goes....

Cerber-Ursi · June 29, 2021, 4:06am

std reexports everything from core. For example, if you go to the documentation of from_raw_parts in std::slice - Rust and click src, you'll see the core source code - raw.rs - source.
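A small sketch of what the re-export means in practice, assuming a build where both `core` and `std` are available. The helper name `rebuild_both` is made up for illustration; it rebuilds the same slice through both paths.

```rust
/// Rebuilds `data` twice from its raw parts, once through the `core` path
/// and once through the `std` path. Both paths name the same function.
fn rebuild_both(data: &[u32]) -> (&[u32], &[u32]) {
    let (ptr, len) = (data.as_ptr(), data.len());
    // SAFETY: `ptr` and `len` come from a live slice that stays borrowed
    // for as long as the returned slices are in use.
    let via_core = unsafe { core::slice::from_raw_parts(ptr, len) };
    // SAFETY: same argument as above.
    let via_std = unsafe { std::slice::from_raw_parts(ptr, len) };
    (via_core, via_std)
}
```

In a no_std program only the `core::slice` path is usable, but the behavior is the same.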

JamesG · June 30, 2021, 5:53pm

Based on Cell information page, I couldn't locate how Rust detects the overrun. Can you point me to the corresponding information page?

2e71828 · June 30, 2021, 6:07pm

The slice type [T] contains an embedded length field that's used for bounds checking. The Cell is there to allow changes through the shared reference.
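A minimal sketch of both points, assuming an ordinary local byte buffer in place of the foreign memory. The function names `store` and `demo` are made up for illustration.

```rust
use core::cell::Cell;

/// Writes `value` at `index` through a shared slice of cells.
/// Returns `false` instead of writing when `index` is out of bounds.
fn store(cells: &[Cell<u8>], index: usize, value: u8) -> bool {
    match cells.get(index) {
        Some(cell) => {
            cell.set(value); // mutation through a shared reference
            true
        }
        None => false, // the slice length caught the overrun
    }
}

/// Views a 4-byte buffer as a slice of cells and tries two writes.
fn demo() -> ([u8; 4], bool, bool) {
    let mut buf = [0u8; 4];
    let cells: &[Cell<u8>] = Cell::from_mut(&mut buf[..]).as_slice_of_cells();
    let in_bounds = store(cells, 1, 7);
    let past_end = store(cells, 4, 9);
    (buf, in_bounds, past_end)
}
```

Here the write at index 1 lands in the buffer, while the write at index 4 is rejected because the slice knows it holds only four elements.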

alice · June 30, 2021, 6:14pm

It doesn't really have anything to do with Cell. All slices have bounds checks.
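To illustrate with a plain slice and no Cell involved, here is a short sketch with hypothetical helper names. On current compilers a slice reference is two machine words wide (pointer plus length), and that length is what `get` and indexing check against.

```rust
use core::mem::size_of;

/// Width of a `&[u32]` in machine words. On current compilers this is 2:
/// one word for the data pointer and one for the length.
fn slice_ref_words() -> usize {
    size_of::<&[u32]>() / size_of::<usize>()
}

/// Bounds-checked read: `None` when `index` is past the end.
/// Plain indexing (`data[index]`) performs the same check but panics instead.
fn checked_read(data: &[u32], index: usize) -> Option<u32> {
    data.get(index).copied()
}
```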

JamesG · June 30, 2021, 7:53pm

Say there are N cell blocks and k <= N. It looks like the solution protects against an iterative buffer overrun (walking beyond N). Memory unsafety (corruption) can come in other ways, though. One I can think of is a developer who starts writing something into the k-th cell but mistakenly overflows into the (k+1)-th block. In reality, some developer can even mistakenly make k greater than N behind hundreds, thousands, or millions of lines of code abstraction.

2e71828 · June 30, 2021, 10:03pm

There are ways to solve, or at least mitigate, all of those problems. One option is to break up the buffer into subslices that are each the length of a single chunk, thus turning a chunk-boundary overrun into a slice overrun that will be checked in the emitted code.
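One possible shape for the subslice idea, as a sketch only: `CHUNK_LEN` and `write_chunk` are hypothetical names, and the buffer is assumed to be an ordinary byte slice whose length is a multiple of the chunk size.

```rust
/// Hypothetical chunk size in bytes.
const CHUNK_LEN: usize = 4;

/// Copies `data` into the start of the `k`-th chunk of `buffer`.
/// Fails, without writing anything, if `k` is past the last chunk or if
/// `data` would spill over the end of that chunk.
fn write_chunk(buffer: &mut [u8], k: usize, data: &[u8]) -> Result<(), &'static str> {
    // Each item of the iterator is a subslice of exactly CHUNK_LEN bytes.
    let chunk = buffer
        .chunks_exact_mut(CHUNK_LEN)
        .nth(k)
        .ok_or("chunk index out of range")?;
    // The chunk's own length bounds the write.
    let dest = chunk
        .get_mut(..data.len())
        .ok_or("data longer than one chunk")?;
    dest.copy_from_slice(data);
    Ok(())
}
```

With a 12-byte buffer, writing two bytes into chunk 1 succeeds, while writing five bytes into chunk 1 or anything into chunk 3 returns an error and leaves the buffer untouched.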

There’s no one-size-fits-all solution here: Every choice to help with one problem impacts the solutions you might choose for the others. Designing the abstraction you want has to start with a comprehensive review of the requirements, whether they come from the foreign API or expected usage within your program.

If this happens, it will trigger a panic which will in turn shut down the program. While not ideal, it’s significantly better than the alternative, which is writing into some arbitrary memory location and continuing.

JamesG · June 30, 2021, 10:24pm

@chrefr, @Michael-F-Bryan, @ZiCog, @mejrs, @droundy, @2e71828, and @alice: thank you for sharing your thoughts.

system · September 28, 2021, 10:24pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
