Runtime detection of a raw pointer passed into generic function or compile time bounds

Hey,

I'm currently writing a crate for idiomatic Rust-like mutex's for Windows kernel driver development, called wdk-mutex. I'm working on a feature at the moment to differentiate wrapping a T in a mutex with new(), where the mutex type owns the T; vs taking a raw pointer to a pool (heap) allocation.

It's becoming a sticking point when freeing memory and looking at lifetimes of allocations that passing in a pointer to new() wont properly free all memory.

Essentially, I'd l ike:

new() to only take in T, and not *mut / *const T, and from_pool() to only take in a *mut T.

From pool is a simple one as my function definition is:

pub unsafe fn from_pool(ptr: *mut T) -> Result<Self, DriverMutexError> { }

However, the new function as you can see will allow a pointer to be passed in:

pub fn new(data: T) -> Result<Self, DriverMutexError> { }

My options seem to be:

  1. Introduce an additional layer of abstraction by defining an enum that contains the T that is well described so the user of the crate knows which type to use (owned vs pointer) enforcing compile time checks. So the enum could be pub enum DataType<T> { Owned(T), Pointer(*mut T) } - I'd rather avoid this as its an additional wrapper for the person calling the constructor.
  2. Use traits, though I cannot find a trait that will disallow a pointer being passed in?
  3. Runtime reflection (to return an error if type = raw pointer) to determine whether the type is a pointer, the only thing I found was TypeId::of::(), which isn't a good candidate as it requires the argument to have a 'static lifetime.

I'd be grateful for any ideas as to how this can be approached in rust, failing that I may have to look back at my memory model of the work I have done on my local branch.

Does the Self type include T, or is there some type erasure going on?

  • Addendum: looking at your docs, it does seem to be the former: KMutex<T>, which is good.

While it is true that <T> genericity can encompass the T = *{const,mut} U case(s), by doing so the user ought to end up with YourWrapper<*{const,mut} U> rather than YourWrapper<U>, so hopefully that severly limits the "damage" of such a mistake.


If what you're worried about are lifetimes, then you probably want is a T : 'static clause/bound.

Indeed, this literally means T : UsableFor<'forever>, as in, any (owned) instance of the type T is safe to use/handle (e.g., move around!) for the duration 'static = 'forever

  • note that this is a weaker property than witnessing a &'forever T reference, since that one means that the pointee/referee behind that reference:

    • is not only of a type which is : UsableFor<'forever>,
    • but also that that specific pointee will remain there, "without moving or getting deällocated", 'forever.

    The typical example being String, which is indeed : 'static i.e. : UsableFor<'forever>, since any (owned) instance of type String is safe to use/handle 'forever. It is "independent" from other lifetimes. This is different from T = &'short str, which is not safely usable beyond the 'short duration. It is also weaker than T = &'static String = &'forever String, which would mean having a specific String instance located somewhere in memory, 'forever, thus kind of "pinned down" there (thus very likely to stem from leaked memory).

    Philosophical questions aside, a type T is : 'static if and only if any lifetime parameter appearing within T is itself 'static. So either the type has no lifetime parameters whatsoever inside, such as T = String, or if it does, they must be 'static, such as Cow<'static, str> or &'static str itself (a &'forever reference is itself : UsableFor<'forever>).

You may still be worried about the case of T = *const i32, for instance, which is : [Safely]UsableFor<'forever>, because:

  • in practical terms, there are indeed no lifetime parameters occuring within the type *const i32;
  • from the PoV of safe API usage, there is no non-unsafe API with which you can do anything dangerous when having a *const i32 instance around: if you do want to do something (useful but) dangerous with such a value, such as dereferencing that pointer, then unsafe { … } is to be used, thereby waiving any liability from the compiler's type system safety guarantees.

Which, as a library author, also waives any liability in your regard: if a user wants to shoot themself in the foot by storing a *const i32 inside your KMutex<*const i32> type, to later try to fetch it, dereference it, and end up triggering UB from a user-after-free, then that's on them, not on you (you are free to try to document lifetime invariants/properties or lack thereof in the docs, but cannot be held accountable for such mistakes).


However, you are dealing with kernel drivers, which is a rather unusual/exotic situation, not all unlike dynamically loading libraries and whatnot. So maybe you'll have to be even stricter than just : 'static, since maybe a &'static i32 or &'static str, alone, can trigger memory unsafety in your API.

If that's the case, do tell us so, because more effort will be needed. But be aware that in this area Rust is quite bad / has a rather poor type system, because Rust assumes that static memory of the process, or leaked memory (including the backing memory of an allocator), both get to live 'forever (which is witnessed, from instance, by that amalgamation of static and 'forever notions in the poorly named 'static keyword).

And I know that at least for dynamic library loading, this is not true. "We" may be a dynamically loaded plugin which may end up unloaded before the end of the life-time of the whole process (the "host").

In such situations, the inadequacy of 'static makes the whole type system unsound. This is known, and the current cop-out is just to say that unloading a dynamic library is unsafe (or, in an even more convoluted and contrived manner, that loading it is, because of it ultimately ending up unloaded), and that any non-type-system-checked use-after-frees resulting from using 'static memory from the to-be-unloaded library is the fault of whoever called that unsafe API.

1 Like

Thanks for your detailed reply, I’m not sure in this case (if it simplifies it) lifetimes are too important, when a new mutex is instantiated one pool (heap) allocation happens which allocates room for an inner type which contains the kernel mutex object, and a second field for a pointer to the T.

The second allocation is of size T, which is pointed to by the previous allocation. So, if I understand what you’ve said correctly, dealing with lifetimes shouldn’t matter as the wdd-mutex type will take ownership of the T as the lifetime is managed internally by the mutex itself.

Once the mutex is initialised, if it’s initialised with a pointer in the first place the original memory won’t get automatically freed when RAII cleans up the outer mutex. I.e. the memory which was allocated prior to constructing a mutex and is alien to the mutex and won’t be dropped by drop_in_place() during Drop. Perhaps your comment regarding documentation is a good way to approach it and just be more explicit about this? I’d like to be able to prevent a pointer being passed in, in keeping with the ethos of Rust’s type safety.

I'm not sure what is wrong with allowing to pass a *mut T to new? It would represent a mutex with a pointer, without an ownership to the value it points to - e.g. it is just a pointer, some numbers with provenance (like if T is hypothetical Newtype(usize, PhantomData<T>, ProvenanceZST). But from_pool would assume that *mut T it got is from pool and would apply different logic. Maybe an enum can be as a private member, but I see no sense in exposing it.

2 Likes

Hey, yeah very good point - perhaps I'm over thinking the issue. From a design perspective there are two behaviours that would happen:

  1. The mutex owns the T and drops the T when the outer mutex (not the guard) calls drop
  2. The mutex does not own the T, so would be constructed from *mut T.

The disparity between the 'ownership model' of the two pathways in my head seems too wide; however, perhaps that is acceptable. I may strip the problem back and go with just new.

If you pass a *mut T into new, which itself looks like:

pub struct KMutex<T> {
    inner: *mut KMutexInner<T>,
}

/// The underlying data which is non-page pool allocated which is pointed to by the `KMutex`.
struct KMutexInner<T> {
    /// A KMUTEX structure allocated into KMutexInner
    mutex: KMUTEX,
    /// The data for which the mutex is protecting
    data_ptr: *mut T,
}

impl<T> KMutex<T> 
where T: Sized {
    pub fn new(data: T) -> Result<Self, DriverMutexError> {
        //
        // Non-Paged alloc for all struct data required for KMutexInner
        //
        let sz_for_inner = size_of::<KMutexInner<*mut T>>();
        let inner_pool_alloc = unsafe {
            ExAllocatePool2(POOL_FLAG_NON_PAGED, sz_for_inner as u64, u32::from_be_bytes(*b"kmtx"))
        } as *mut KMutexInner<T>;
        if inner_pool_alloc.is_null() {
            return Err(DriverMutexError::PagedPoolAllocFailed);
        }

        //
        // Non-Paged alloc for the data we are taking ownership of
        //
        let sz_for_data = size_of::<T>();
        let data_ptr = unsafe {
            ExFreePool(inner_pool_alloc as *mut _); // prevent memory leak
            ExAllocatePool2(POOL_FLAG_NON_PAGED, sz_for_data as u64, u32::from_be_bytes(*b"kmtx"))
        } as *mut T;
        if data_ptr.is_null() {

            return Err(DriverMutexError::PagedPoolAllocFailed);
        }

        //
        // Write the data to the pool
        //

        // SAFETY: This raw write is safe as the pointer validity is checked above.
        unsafe {
            ptr::write(
                data_ptr, 
                data,
            );
        }


        //
        // Write the inner structure which consists of the KMUTEX Object, and a pointer to the data
        //

        // SAFETY: This raw write is safe as the pointer validity is checked above.
        unsafe {
            ptr::write(
                inner_pool_alloc,
                KMutexInner { 
                    mutex: KMUTEX::default(), 
                    data_ptr,
                }
            );

            // Initialise the KMUTEX object via the kernel
            KeInitializeMutex(&(*inner_pool_alloc).mutex as *const _ as *mut _, 0);
        }

        Ok(Self { inner: inner_pool_alloc})
    }
}

You would be dealing with a double pointer for data_ptr? Perhaps I am being overly verbose with my model

Can you please give a couple of use cases and expected outputs? I probably do not understand the goals that we want to achieve with an api.


For now, I would expect that Mutex<Foo>, Mutex<*mut Foo> and Mutex<*mut *mut Foo> behave the same.

Mutex::<Foo>::from_ptr would take *mut Foo, Mutex::<*mut Foo>::from_ptr would take *mut *mut Foo and Mutex::<*mut *mut Foo>::from_ptr would take *mut *mut *mut Foo.

Your code for new would remain unchanged and will not deal with indirection.

1 Like

Your comment has given me food for thought, I'm going to stash my changes and go back to the original state and redesign some tests and use those tests to figure out if passing a pointer in is a problem. Talking it through has helped! If I can reproduce some undefined behaviour by passing in a raw pointer, I'll drop a minimal example which reproduces the issue..

1 Like

Great. From what I can see, your mutex is only working with a pointer as a generic value, never dereferencing it. So only user can dereference it or convert to reference, so any ub would be caused by users's unsafe block. And even if you somehow specialize (with feature(min_specialization) for example) it to work with pointers specifically, user can just cast it to usize when using mutex, and casting it back when he wants.

1 Like