Raw pointer ergonomics

scottjmaddox · February 1, 2023, 5:57am

I'm currently working on a VM inspired by interaction nets (a graph-based model of computation), and I need to use a lot of cyclic data structures with manual memory management, i.e. a lot of unsafe and raw pointers. And I'm finding that Rust is extremely painful for this use case. I think a lot of the pain would be alleviated by providing an ergonomic pointer-offset mechanism, which could be as simple as a field access. For example:

use std::ptr;

#[derive(Debug, Clone, Copy)]
struct Node(*mut Node, *mut Node);

fn main() {
    let mut node = Node(ptr::null_mut(), ptr::null_mut());
    let node_ptr = &mut node as *mut Node;
    
    // let node_0_ptr = node_ptr.0; // Why not let this be pointer offset?
    
    // Instead, I have to do this:
    let node_0_ptr = unsafe { &mut (*node_ptr).0 as *mut _ };
}

When I want to get an interior pointer, I would like to be able to just do node_ptr.0. Is there a reason this can't be supported?

It would also be nice if you could directly take a pointer using *mut node rather than needing to do &mut node as *mut Node, but I'm guessing supporting *mut node would considerably complicate parsing (if not make it ambiguous). Luckily, that pain is at least alleviated by automatic coercion from &mut _ to *mut _, and can be further alleviated by a user-defined macro.
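For reference, a minimal sketch of both options mentioned here: the implicit `&mut _` to `*mut _` coercion, and a small user-defined macro. The macro name `raw_mut!` and the two helper functions are hypothetical; the macro only forwards to the standard `std::ptr::addr_of_mut!`.

```rust
use std::ptr;

#[derive(Debug, Clone, Copy)]
struct Node(*mut Node, *mut Node);

// Hypothetical macro standing in for the wished-for `*mut node` syntax.
macro_rules! raw_mut {
    ($place:expr) => {
        ::std::ptr::addr_of_mut!($place)
    };
}

// `&mut Node` coerces to `*mut Node` wherever the target type is known,
// here at the function's return site.
fn coerced(node: &mut Node) -> *mut Node {
    node
}

// The same pointer, obtained through the macro instead of a coercion.
fn via_macro(node: &mut Node) -> *mut Node {
    raw_mut!(*node)
}

fn main() {
    let mut node = Node(ptr::null_mut(), ptr::null_mut());
    let a = coerced(&mut node);
    let b = via_macro(&mut node);
    let c = raw_mut!(node); // works on a local directly, no `&mut` involved
    assert_eq!(a, b);
    assert_eq!(b, c);
}
```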

scottmcm · February 1, 2023, 6:01am

Go for https://doc.rust-lang.org/stable/std/ptr/macro.addr_of.html.

That gives you a raw pointer directly, and will help you avoid Stacked Borrows issues, where creating an intermediate mutable reference invalidates other pointers you still want to use.
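A minimal sketch of how this looks when all you have is a raw pointer, assuming the Node type from the original post. The helper name `field0_ptr` is made up for illustration; `addr_of_mut!` accepts a place expression that goes through a raw pointer dereference, and only the address of that place is taken.

```rust
use std::ptr;

#[derive(Debug, Clone, Copy)]
struct Node(*mut Node, *mut Node);

/// Hypothetical helper: projects a `*mut Node` to a raw pointer to its
/// field 0 without creating an intermediate `&mut` reference.
///
/// # Safety
/// `node_ptr` must point to a live `Node`.
unsafe fn field0_ptr(node_ptr: *mut Node) -> *mut *mut Node {
    // `(*node_ptr).0` is never read here; only its address is computed.
    unsafe { ptr::addr_of_mut!((*node_ptr).0) }
}

fn main() {
    let mut node = Node(ptr::null_mut(), ptr::null_mut());
    let node_ptr = ptr::addr_of_mut!(node);

    let node_0_ptr = unsafe { field0_ptr(node_ptr) };

    // Write through the interior pointer to create a self-link.
    unsafe { node_0_ptr.write(node_ptr) };
    assert_eq!(unsafe { (*node_ptr).0 }, node_ptr);
}
```

The dereference inside the macro still has to sit in an `unsafe` block, so this shortens the expression but does not remove the `unsafe` keyword.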

H2CO3 · February 1, 2023, 6:02am

The question is not whether it could, but whether it should be allowed. IMO, it should not. Raw pointers were never designed to be "ergonomic"; they were designed to be unambiguous and explicit so as to avoid memory safety problems.

(Anyway, I deeply despise conflating ergonomics with arbitrarily cooked-up syntactic sugar. That's what people most often actually mean, but there is so much more to ergonomics than that. Having your raw pointers be offset explicitly is probably more ergonomic in the bigger picture of things, because it will pre-empt many kinds of hard-to-debug abuse.)

Back to the topic: if you are trying to describe arbitrary graphs, use indices/node IDs/adjacency matrices/etc., basically any other representation instead of raw pointers. Or just use an existing graph library that abstracts away the boilerplate for you.

scottjmaddox · February 1, 2023, 6:02am

That does look handy, but if I'm not mistaken, it doesn't help with the case where you only have a pointer and you want to just offset that pointer. Am I mistaken?

2e71828 · February 1, 2023, 6:03am

This is something that is better discussed on IRLO as it’s an idea for improving Rust, rather than asking for help using Rust as it stands today.

scottjmaddox · February 1, 2023, 6:07am

Okay, thanks, I'll do that. I suppose it's possible that I'm just missing something, though, so perhaps someone will point out a more ergonomic way to do this here.

scottjmaddox · February 1, 2023, 6:12am

In most instances I would use indices or node IDs, but this is a case where that would make the code even more opaque and less type-safe. I need graphs of nodes with non-uniform size, with pointers to the interiors of other nodes.

H2CO3: "Or just use an existing graph library that abstracts away the boilerplate for you."

I'm writing a low level VM with strict performance considerations. I use inlined functions to abstract away the pointer offset boilerplate, but I end up needing a separate function for each node and field.
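One way to cut down the per-node, per-field functions might be a single projection macro. This is only a sketch: the macro name `field_ptr!`, the `Dup` struct, and the `link` function are hypothetical, and the macro simply wraps `std::ptr::addr_of_mut!`. It deliberately leaves the `unsafe` block to the caller, so the dereference stays visible at the call site.

```rust
use std::ptr;

struct Node(*mut Node, *mut Node);

// A second, differently sized node kind (hypothetical, for illustration).
struct Dup {
    label: u32,
    out: *mut Node,
}

// Hypothetical macro: `field_ptr!(p, f)` yields a raw pointer to `(*p).f`.
// `$field:tt` accepts both tuple indices (`0`) and named fields (`out`).
macro_rules! field_ptr {
    ($ptr:expr, $field:tt) => {
        ::std::ptr::addr_of_mut!((*$ptr).$field)
    };
}

/// Links `a.0 -> b` and `b.1 -> a`, a bi-directional pair of links.
///
/// # Safety
/// Both pointers must point to live `Node`s.
unsafe fn link(a: *mut Node, b: *mut Node) {
    unsafe {
        field_ptr!(a, 0).write(b);
        field_ptr!(b, 1).write(a);
    }
}

fn main() {
    let mut a = Node(ptr::null_mut(), ptr::null_mut());
    let mut b = Node(ptr::null_mut(), ptr::null_mut());
    let mut d = Dup { label: 7, out: ptr::null_mut() };
    let (pa, pb, pd) = (
        ptr::addr_of_mut!(a),
        ptr::addr_of_mut!(b),
        ptr::addr_of_mut!(d),
    );
    unsafe {
        link(pa, pb);
        field_ptr!(pd, out).write(pa); // same macro, different node type
        assert_eq!((*pa).0, pb);
        assert_eq!((*pb).1, pa);
        assert_eq!((*pd).out, pa);
        assert_eq!((*pd).label, 7);
    }
}
```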

scottjmaddox · February 1, 2023, 6:15am

Also perhaps worth noting that the current situation requires sprinkling unsafe everywhere, even though the operation, i.e. pointer offset, is not actually unsafe. This makes it more difficult to minimize the use of the unsafe keyword, which will make future auditing of the unsafe code more difficult.

Aiden2207 · February 1, 2023, 6:22am

Pointer offsets are unsafe. Same thing as in C, too, it's just not obvious there.

2e71828 · February 1, 2023, 6:26am

Can you describe a bit more about your memory management strategy, such as how you're enforcing Rust's no-aliasing for &mut rules and deciding when to deallocate Nodes?

Some existing unsafe-based abstractions, like qcell::TCell, might reduce or eliminate the need for you to write your own unsafe code. Without more information, though, it's hard to make a good recommendation.

scottjmaddox · February 1, 2023, 6:28am

That's only because the offset function is unconstrained. If you have a valid node_ptr: *mut Node, there would be no way for node_ptr.0 or node_ptr.1 (i.e. offsetting to the given field) to wrap.

Note that wrapping_offset is not marked unsafe.
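To make the distinction concrete: `add` and `offset` on raw pointers are `unsafe fn`s because the result must stay in bounds of the same allocation, whereas `wrapping_add` and `wrapping_offset` are safe to call, and only the later dereference is unsafe. A sketch of a field offset done that way follows. It assumes `#[repr(C)]` is added to Node (not in the original post) so that field 1 is guaranteed to sit one pointer-width after field 0, and the helper name `field1_ptr` is hypothetical.

```rust
use std::ptr;

#[repr(C)] // assumption: fixed layout, field 0 at offset 0, field 1 right after it
#[derive(Debug, Clone, Copy)]
struct Node(*mut Node, *mut Node);

/// Safe: computes the address of field 1 without dereferencing anything.
fn field1_ptr(node_ptr: *mut Node) -> *mut *mut Node {
    // View the node as an array of two `*mut Node` slots and step to slot 1.
    node_ptr.cast::<*mut Node>().wrapping_add(1)
}

fn main() {
    let mut node = Node(ptr::null_mut(), ptr::null_mut());
    let node_ptr = ptr::addr_of_mut!(node);

    // No `unsafe` needed to compute the interior pointer...
    let node_1_ptr = field1_ptr(node_ptr);

    // ...but reading or writing through it is still unsafe.
    unsafe { node_1_ptr.write(node_ptr) };
    assert_eq!(unsafe { (*node_ptr).1 }, node_ptr);
    assert!(unsafe { (*node_ptr).0 }.is_null());

    // Calling it on a null pointer is also fine, as long as the result
    // is never dereferenced.
    let _dangling = field1_ptr(ptr::null_mut());
}
```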

scottjmaddox · February 1, 2023, 6:37am

Nodes are never exposed to safe Rust directly; only a safe, owning TermGraph wrapper is. Nodes are manually allocated and deallocated according to the semantics of interaction nets. In short: interaction nets are linear, with explicit dup nodes; all interactions between nodes are local; and when a node is no longer reachable, it is manually deallocated. Deallocation has to be dynamic, based on the links in the graph.

Some links in the graph require bi-directional pointers to enable local updates, a bit like a doubly linked list.

2e71828 · February 1, 2023, 6:57am

My first instinct here is to use something like Rc<QCell<Node>> for the forward links, Weak<QCell<Node>> for the backlinks, and store the QCellOwner inside TermGraph.
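A rough sketch of the shape of this suggestion, using only the standard library. `RefCell` is swapped in for `QCell` here so the example runs without the qcell crate, which means borrows are checked per node at run time rather than through a `QCellOwner` stored in TermGraph. The `Node` fields and the `link` function are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

#[derive(Default)]
struct Node {
    next: Option<Rc<RefCell<Node>>>,   // forward link, owning
    prev: Option<Weak<RefCell<Node>>>, // back-link, non-owning
}

/// Connects `a -> b` with a strong link and `b -> a` with a weak one,
/// so the pair does not form a reference-counting cycle.
fn link(a: &Rc<RefCell<Node>>, b: &Rc<RefCell<Node>>) {
    a.borrow_mut().next = Some(Rc::clone(b));
    b.borrow_mut().prev = Some(Rc::downgrade(a));
}

fn main() {
    let a = Rc::new(RefCell::new(Node::default()));
    let b = Rc::new(RefCell::new(Node::default()));
    link(&a, &b);

    // `b` is kept alive by both the local handle and `a.next`;
    // `a` is only weakly referenced from `b.prev`.
    assert_eq!(Rc::strong_count(&b), 2);
    assert_eq!(Rc::strong_count(&a), 1);
    assert_eq!(Rc::weak_count(&a), 1);

    // Following the back-link requires upgrading the weak pointer.
    let back = b.borrow().prev.as_ref().and_then(Weak::upgrade).unwrap();
    assert!(Rc::ptr_eq(&back, &a));
}
```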

Once that's working, if it's not performant enough in practice, look to replacing the Rcs and Weaks with your own unsafe abstraction that can take advantage of other invariants in your system.

scottjmaddox · February 1, 2023, 7:28am

Thanks for the suggestion! The overhead of Rc (or really Arc in this case) would be unacceptable for this application, but I'll keep these options in mind in the future!

scottjmaddox · February 1, 2023, 8:18am

I got some helpful feedback from the internals forum: Raw pointer ergonomics - #10 by scottjmaddox - Rust Internals

