Adding a safe abstraction over an unsafe global raw pointer

Say there is a global mutable variable that is a raw pointer to a memory location. The raw pointer gets its value from an FFI call (say VirtualAlloc or ExAllocatePool2). I think it is common practice in win32k or kernel-mode driver programming under Windows for such global heap data (the raw pointer) to be referenced and modified in many different places. But in Rust this has to happen inside an 'unsafe' block (Ch19 of the Rust book and the Rustonomicon). The biggest motivation for migrating from native C/C++ code to Rust is memory safety, so I want to check whether there is any way to build a safe abstraction over a global raw pointer variable without sacrificing program performance. If someone authoritative on Rust says 'no, there is no such way', that would be helpful as well. For my question, please assume that Rust's standard library is not available.

Do you want lazy_static or once_cell?

Before studying lazy_static or once_cell more deeply: don't they require the Rust standard library? Please assume that neither the standard library nor any external crate is available. But I am happy to write something myself for a safe abstraction, if one exists.

@chrefr, can you add more context on how it solves the memory unsafety of a global raw pointer? For example, doesn't the memory location miss out on move and borrow safety because it sits behind an unsafe raw pointer?

I'm not quite sure what you want. If you'll describe the situation in greater detail, I may be able to help you more. The crates I suggested are used for initializing statics at run time.

Regarding std, no, both of them are no_std (once_cell has the std feature for enabling synchronized initialization, but it isn't required).

In general, yes it is very much possible to create safe abstractions around unsafe code (i.e. global variables and raw pointers). There is a big caveat, though - your Rust code can only be as safe as the rest of the system.

For example, if you've got some driver which is able to mutate your object at the same time as your program and the two aren't synchronizing their access, then you're gonna have a bad time and Rust can't help you here. We usually use the word "unsound" when you create an abstraction which says it is safe while the code it wraps is doing dodgy things.

That reiterates another point - Rust's safety system isn't magic. It works by encoding information about how something is used into the type system, meaning if you (for example) forget to take a lock before mutating something it'll be a compile error instead of a bug you discover 6 months down the track. In general this can lead to performance improvements because you can be certain something is in a particular state and know that runtime checks or error handling are unnecessary (e.g. no need to add an error path and a null pointer check if you know a particular pointer can never be null).
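As a small sketch of the "no null check needed" point, assuming only `core`: `core::ptr::NonNull` records in the type that a pointer has already been checked. The helper names `non_null_or_none` and `increment` are made up for illustration.

```rust
use core::ptr::NonNull;

/// Hypothetical boundary helper: the null check happens exactly once, here.
fn non_null_or_none(raw: *mut u32) -> Option<NonNull<u32>> {
    NonNull::new(raw)
}

/// Code that takes a `NonNull<u32>` has no error path for null.
///
/// # Safety
/// The type only rules out null. The caller must still guarantee that the
/// pointer is valid, aligned, and not accessed by anyone else during the call.
unsafe fn increment(p: NonNull<u32>) -> u32 {
    unsafe {
        *p.as_ptr() += 1;
        *p.as_ptr()
    }
}
```

`Option<NonNull<T>>` has the same size as a plain raw pointer, so recording the check in the type costs nothing at run time.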

For your example of using VirtualAlloc() to allocate memory instead of the Rust heap it'd just be a case of creating your own Box type which calls VirtualFree() in its Drop implementation. Creating a safe abstraction for interacting with a kernel driver would be more complicated, but still very doable (and it has already been done by many people).
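To make the "own Box type" idea concrete, here is a minimal sketch. `fake_alloc` and `fake_free` are stand-ins I made up so the example runs on a hosted target (they use `std::alloc` and a counter); in a real driver they would be the `extern` declarations for the actual allocator, and `FfiBox` itself only needs `core`.

```rust
use core::mem::{align_of, size_of};
use core::ops::{Deref, DerefMut};
use core::ptr::NonNull;
use std::alloc::{alloc, dealloc, Layout};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts live allocations so the effect of `Drop` can be observed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the allocator FFI (assumption, not the Windows API).
/// `bytes` must be non-zero.
unsafe fn fake_alloc(bytes: usize) -> *mut u8 {
    let p = unsafe { alloc(Layout::from_size_align(bytes, 16).unwrap()) };
    if !p.is_null() {
        LIVE.fetch_add(1, Ordering::SeqCst);
    }
    p
}

/// Stand-in for the matching free function (assumption).
unsafe fn fake_free(ptr: *mut u8, bytes: usize) {
    unsafe { dealloc(ptr, Layout::from_size_align(bytes, 16).unwrap()) };
    LIVE.fetch_sub(1, Ordering::SeqCst);
}

/// Owning pointer to a `T` stored in FFI-allocated memory.
pub struct FfiBox<T> {
    ptr: NonNull<T>,
}

impl<T> FfiBox<T> {
    /// Returns `None` if the allocation fails or `T` does not fit the stand-in.
    pub fn new(value: T) -> Option<Self> {
        if size_of::<T>() == 0 || align_of::<T>() > 16 {
            return None;
        }
        let raw = unsafe { fake_alloc(size_of::<T>()) } as *mut T;
        let ptr = NonNull::new(raw)?;
        // The memory is uninitialized, so write without dropping old contents.
        unsafe { ptr.as_ptr().write(value) };
        Some(FfiBox { ptr })
    }
}

impl<T> Deref for FfiBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> DerefMut for FfiBox<T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { self.ptr.as_mut() }
    }
}

impl<T> Drop for FfiBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Run T's destructor first, then hand the memory back.
            core::ptr::drop_in_place(self.ptr.as_ptr());
            fake_free(self.ptr.as_ptr() as *mut u8, size_of::<T>());
        }
    }
}
```

All of the `unsafe` lives inside this one type. Code that uses an `FfiBox<T>` gets the usual move and borrow checking, and the memory is released when the value goes out of scope.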

Not having access to the standard library isn't an issue. The standard library just provides conveniences like std::sync::Mutex (a wrapper around your OS's mutex implementation) and things like that.

The OS is probably written in C and/or C++. So if my Rust program that interfaces with such a system is at most as safe as the system, what is the point of writing the program in Rust instead of C/C++? It is acceptable that memory shared with the OS is risky, but even allocations exclusive to my program have to be global raw pointers. Sure, we can have our own Drop implementation that calls VirtualFree. But this sort of abstraction isn't Rust-specific, and a lot of native code developers already put this kind of safety measure in place.

Imagine a hypothetical application in Rust with a million lines of code running on an OS written in C with some more millions of lines of code. Well, it seems to me that Rust's memory safety guarantees have just reduced the amount of care and checking one has to do on the whole system to ensure correctness by about half.

Also, that operating system has been in development for decades and is used by millions on all kinds of platforms. So we can have some confidence in it not misbehaving. Meanwhile the application is fairly new, has far fewer users, and is therefore prone to all kinds of memory use errors.

But it does not because Rust won't allow it.

It's all about probabilities. Think of a long, winding road up a steep mountainside. Most of the road has a safety barrier to stop you falling off the mountain when things go wrong. But here and there, there are gaps in the safety barrier where you could just drive through and plummet to your death.

I think you will agree that mountain road is safer for having the safety barrier. One would not say that having those few gaps makes the whole barrier useless. One would probably just drive more slowly and carefully along the barrier free stretches.

This is similar to how in a large Rust code base one knows one has to pay special attention to those few places one has used "unsafe".

This all sounds like a good proposition to me.

Does such a guard exist for my question? Going back to my original question: I have a global raw pointer to memory allocated through FFI (ExAllocatePool2) by an OS written in C/C++. Rust's standard library can't be used (which just means I need to write my own library if needed). The Rust Programming Language book says the following:

Different from references and smart pointers, raw pointers:

  • Are allowed to ignore the borrowing rules by having both immutable and mutable pointers or multiple mutable pointers to the same location
  • Aren’t guaranteed to point to valid memory
  • Are allowed to be null
  • Don’t implement any automatic cleanup

By opting out of having Rust enforce these guarantees, you can give up guaranteed safety in exchange for greater performance or the ability to interface with another language or hardware where Rust’s guarantees don’t apply.

The global memory pointer is really the only memory safety concern in my program. But the Rust book says it doesn't provide a safety guard for it.

You can, for example, put that raw pointer in a struct with a method that provides safe access to the content by converting it to references that safe code can use.

You're good if you can guarantee that that abstraction is sound. I'm not aware of anything off-the-shelf like that but it is something you can make yourself if you want.
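A rough sketch of that idea, using only `core`. The `Counters` struct and `CountersHandle` type are invented for illustration. The only `unsafe` entry point is the constructor, where the caller promises the pointer is good; everything after that is safe code.

```rust
use core::marker::PhantomData;
use core::ptr::NonNull;

/// Hypothetical data living behind the raw pointer.
#[repr(C)]
pub struct Counters {
    pub reads: u32,
    pub writes: u32,
}

/// Wraps the raw pointer; the lifetime ties the handle to the allocation.
pub struct CountersHandle<'a> {
    ptr: NonNull<Counters>,
    _borrow: PhantomData<&'a mut Counters>,
}

impl<'a> CountersHandle<'a> {
    /// Returns `None` for a null pointer.
    ///
    /// # Safety
    /// A non-null `raw` must point to a valid, initialized `Counters` that
    /// nothing else reads or writes for as long as the handle exists.
    pub unsafe fn from_raw(raw: *mut Counters) -> Option<Self> {
        NonNull::new(raw).map(|ptr| CountersHandle { ptr, _borrow: PhantomData })
    }

    /// Shared access: any number of these may coexist.
    pub fn get(&self) -> &Counters {
        unsafe { self.ptr.as_ref() }
    }

    /// Exclusive access: the borrow checker rejects overlapping uses.
    pub fn get_mut(&mut self) -> &mut Counters {
        unsafe { self.ptr.as_mut() }
    }
}
```

Whether this is sound depends entirely on the promise made at `from_raw`. If foreign code keeps writing to the same memory behind your back, the references handed out here are lies.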

I have no idea how it is done, but it seems to me the answer is almost certainly yes.

All over my Rust code I have things like println!() that do output. It's a global thing. That means they must at some point use FFI to make the system call to the OS to do the I/O, which is then handled in C or whatever, and likely assembler as well. All of which is "unsafe" in the technical sense of Rust.

Should I conclude that Rust's memory safety checks are a waste of time because much of my code has a println!() or the like in it? I think not.

My questions are: Why do you need this thing to be global? Why do you need it to be mutable from anywhere? These are generally considered bad practice in any language, and have certainly led to all kinds of problems in code bases I have worked on over the years.

My feeling is that, whatever that raw pointer is for, it should not be available for use any which way globally. Its use should be wrapped in some little unsafe blocks, and any access should go through those. That ensures your code has Rust's guard rails everywhere. Only those little unsafe blocks need careful scrutiny for memory misuse errors.

The 'println!()' macro is part of the standard library and is mainly read-only. Your example isn't comparable to my question. The memory has to be both readable and writable, which is standard practice for memory allocation.

One needs a global raw pointer so that the entire program can access the memory whenever it wants (for reading or writing), either directly or by calling a function that has access to the global memory. Don't assume that you can pass the raw pointer to every function. For example, some functions are called by the system without any parameters, and that is where you need to free the memory. Not being able to have a mutable global memory pointer is a deal breaker in this situation.
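For the "called by the system without any parameters" case, one possible sketch is a global `core::sync::atomic::AtomicPtr` instead of a `static mut`. It lives in `core`, so it needs no standard library. The names `G_ADDRESS`, `init`, `on_unload`, `demo_alloc` and `demo_free` are all made up here, and the two `demo_*` functions are stand-ins (backed by `Box`) for the real FFI allocator so the example can run.

```rust
use core::ptr;
use core::sync::atomic::{AtomicPtr, Ordering};

/// The global pointer. Reading and writing it needs no `unsafe`.
static G_ADDRESS: AtomicPtr<u8> = AtomicPtr::new(ptr::null_mut());

/// Stand-in for the allocator FFI (assumption): a 64-byte zeroed block.
unsafe fn demo_alloc() -> *mut u8 {
    Box::into_raw(Box::new([0u8; 64])) as *mut u8
}

/// Stand-in for the matching free function (assumption).
unsafe fn demo_free(p: *mut u8) {
    drop(unsafe { Box::from_raw(p as *mut [u8; 64]) });
}

/// Allocates and publishes the block. Returns `false` if the allocation
/// failed or a block was already published.
pub fn init() -> bool {
    let fresh = unsafe { demo_alloc() };
    if fresh.is_null() {
        return false;
    }
    match G_ADDRESS.compare_exchange(
        ptr::null_mut(),
        fresh,
        Ordering::AcqRel,
        Ordering::Acquire,
    ) {
        Ok(_) => true,
        Err(_) => {
            // Somebody else got there first; give our block back.
            unsafe { demo_free(fresh) };
            false
        }
    }
}

/// Parameterless callback, e.g. an unload routine invoked by the system.
pub extern "C" fn on_unload() {
    // Take the pointer out atomically so the block is freed at most once.
    let old = G_ADDRESS.swap(ptr::null_mut(), Ordering::AcqRel);
    if !old.is_null() {
        unsafe { demo_free(old) };
    }
}
```

This only makes the hand-off of the pointer itself safe (no double free, no torn reads of the pointer). It does not stop another thread from still using the block while `on_unload` frees it; that needs a lock or a rule in your design about when the callback can run.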

I thought about doing that. But given that those raw memory pointers are the only safety concern in my program, why not do it in my existing C/C++ code? That would save the great cost of migrating everything from C/C++ to Rust.

You will be the one providing the safety guard.

The C++ code already has some system in place for properly interacting with this global raw pointer (it needs to, otherwise it's full of UB) and your Rust code just provides an API which follows that system.

It is an improvement because you can be sure that, if you've written the safe abstraction correctly, every bit of Rust code that uses it is doing so correctly. That means if you encounter memory issues when your program interacts with the resource, the piece of code at fault is probably your C++ and not the Rust code base.

Being able to say "I know for sure that everything in this chunk of code is correct" is incredibly useful when you are wanting to troubleshoot a problem or prove that your system is correct.

There's no reason why you need to migrate everything to Rust, and there are actually several reasons why this is a bad idea (unnecessary code churn, you already have perfectly good code, re-introduce bugs that you spent the last 10 years ironing out, etc.).

Both Firefox and Windows contain a lot of Rust code even though they are traditionally C++ code bases. Their approach was to slowly write new code in Rust and have it slot into the existing C++ code.

This often works really well when working with high-level components where there is a natural "seam" - for example, Firefox was able to replace its previous CSS engine with Stylo because it takes in a reference to your DOM and stylesheet and returns instructions to the renderer for rendering a web page to the screen.

I don't think both programming language cannot provide, for example, buffer overrun protection for global raw pointer. So, what improvement are you referring to (pls note that whole this topic is about global raw pointer)?

Confusing double negatives there. Is that "Both programming language can provide, for example, buffer overrun protection for global raw pointer." or "Not both programming language can provide, for example, buffer overrun protection for global raw pointer."

Makes no difference either way because neither language can provide buffer overrun protection given only a pointer. To check buffer overrun one needs:

  1. The bytes that are the content of the buffer somewhere in memory.
  2. A pointer to that array of bytes, the raw pointer.
  3. The length of the available buffer space.

When you have all that wrapped up together so that buffer use can be checked you have some kind of "smart pointer" of which the raw pointer is only one part. For example Vec and String.
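A bare-bones version of such a "smart pointer" could look like the sketch below, which keeps the pointer and the length together and checks every access. `RawBuffer` and its methods are names I made up; only `core` is used.

```rust
use core::ptr::NonNull;

/// Pointer and length travel together, so every access can be checked.
pub struct RawBuffer {
    ptr: NonNull<u8>,
    len: usize,
}

impl RawBuffer {
    /// Returns `None` for a null pointer.
    ///
    /// # Safety
    /// A non-null `ptr` must be valid for reads and writes of `len`
    /// initialized bytes, and nothing else may touch that memory while
    /// the `RawBuffer` exists.
    pub unsafe fn from_raw_parts(ptr: *mut u8, len: usize) -> Option<Self> {
        NonNull::new(ptr).map(|ptr| RawBuffer { ptr, len })
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Reads one byte, or returns `None` if `index` is out of bounds.
    pub fn read(&self, index: usize) -> Option<u8> {
        if index < self.len {
            Some(unsafe { self.ptr.as_ptr().add(index).read() })
        } else {
            None
        }
    }

    /// Writes one byte; returns `false` (and writes nothing) when out of bounds.
    pub fn write(&mut self, index: usize, value: u8) -> bool {
        if index < self.len {
            unsafe { self.ptr.as_ptr().add(index).write(value) };
            true
        } else {
            false
        }
    }
}
```

The check costs one comparison per access. Converting to a slice with `core::slice::from_raw_parts_mut` gives the same guarantee and lets the compiler drop checks it can prove redundant, for example when iterating.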

I think I have lost sight of the original question in this thread. Globals are commonly discouraged as they lead to code that is hard to reason about. Raw pointers cannot give the memory use correctness that safe Rust can. Interaction with any foreign language or hardware is necessarily unsafe.

Still, we can wrap those unsafe things with functions that ensure they are used safely and proceed to use them from our safe Rust code.

How cool is that?

Indeed it is confusing. I meant what you said: neither language can provide buffer overrun protection.

I think the discouragement of 'globals' should not be generalized. Often, constraints of the program's environment make them necessary.

Say we have a global variable 'static mut gAddress: PVOID' and assume that the memory was allocated by a foreign function. Could you show an example of how to wrap this unsafe global raw pointer in safe Rust?

You haven't provided enough information yet to answer this question. The foreign function should have some documentation saying what you can and can't do with that address. Writing a safe abstraction requires translating those prose instructions into something the Rust compiler can enforce automatically.

I did provide it at the top, but here it is again: ExAllocatePool2. I don't have control over it as Microsoft owns it. Once the memory has been allocated for read-write use, I can do whatever I want within its bounds and release it at some point.

The first thing you probably want to do is convert the raw pointer into something with compiler-enforced guarantees. One option is something like this:

use core::cell::Cell;
use core::slice;

// stand-in for FFI
unsafe fn ffi_alloc(bytes: usize)->*mut u8 {
    unimplemented!();
}

// Note: this will never deallocate the memory!
// Note2: &[Cell<u8>] can't be shared between threads (Cell is not Sync)
// Note3: assumes the FFI returns initialized (e.g. zeroed) memory
fn safe_ffi_alloc(bytes: usize)->Option<&'static [Cell<u8>]> {
    unsafe {
        let ptr = ffi_alloc(bytes);
        if ptr.is_null() { None }
        else {
            Some(
                Cell::from_mut(slice::from_raw_parts_mut(ptr, bytes))
                    .as_slice_of_cells()
            )
        }
    }
}

The &[Cell<u8>] will make sure that buffer overruns result in clean panics, but still allow a single thread to make arbitrary changes within the buffer. You could store this in some kind of thread-local storage, or in a global that only permits one thread to access it.

I'm pretty sure that synchronizing the global variable access between multiple threads will require something other than core, either std::sync or a platform-specific library.
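For what it's worth, `core::sync::atomic` is part of `core`, so a basic spin lock can be sketched without `std`. The `SpinLock` and `SpinGuard` types, `G_COUNTER` and `bump` below are invented for illustration. Whether spinning is acceptable in a given driver context is a separate question that I'm not trying to answer here; a platform lock may well be the better choice.

```rust
use core::cell::UnsafeCell;
use core::marker::PhantomData;
use core::ops::{Deref, DerefMut};
use core::sync::atomic::{AtomicBool, Ordering};

pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Safety: at most one guard exists at a time, so access to `value` is exclusive.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Busy-waits until the lock is free, then returns a guard.
    pub fn lock(&self) -> SpinGuard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            core::hint::spin_loop();
        }
        SpinGuard { lock: self, _not_send_sync: PhantomData }
    }
}

pub struct SpinGuard<'a, T> {
    lock: &'a SpinLock<T>,
    // Conservatively keeps the guard on the thread that took the lock.
    _not_send_sync: PhantomData<*mut ()>,
}

impl<T> Deref for SpinGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.lock.value.get() }
    }
}

impl<T> DerefMut for SpinGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<T> Drop for SpinGuard<'_, T> {
    fn drop(&mut self) {
        // Releasing the lock publishes the writes made while it was held.
        self.lock.locked.store(false, Ordering::Release);
    }
}

/// A mutable global without `static mut`.
static G_COUNTER: SpinLock<u32> = SpinLock::new(0);

/// Safe to call from any thread; forgetting the lock is a compile error.
pub fn bump() -> u32 {
    let mut guard = G_COUNTER.lock();
    *guard += 1;
    *guard
}
```

To put a pointer wrapper inside such a lock, the wrapper type has to be `Send`, which for a type holding a raw pointer means writing an `unsafe impl Send` and being able to justify it.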
