Understanding Rust raw pointers

I am currently reading the Rust source code and am interested in the low-level details of raw pointers. When I searched for information, though, most of what I found was general and did not get to the essence of raw pointers. How can I understand raw pointers, and the way Rust encapsulates them, from the ground up?

I have a little experience with C, and my understanding is that Rust raw pointers are more complex than C pointers.

Rust's raw pointers are not significantly different from C's.

Can you be more specific? The provenance story is still evolving, for example.

  1. What is the internal memory structure of a Rust primitive pointer?
  2. What is the relationship between Rust primitive pointers and manual memory management?
  3. What is the relationship between Rust raw pointers and Layout?
  4. How do Rust primitive pointers associate allocated memory blocks with the type system?

The same as any other language with raw pointers. They're the same size as usize, (64 bits on 64 bit machines, 32 bits on 32 bit machines). They're literally just an address into memory.
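A minimal sketch of that claim, assuming a mainstream target where pointers are plain addresses (the helper name `address_of` is made up for illustration):

```rust
use std::mem::size_of;

/// Returns the address stored in a thin raw pointer as a plain integer.
fn address_of(p: *const u32) -> usize {
    p as usize
}

fn main() {
    // A thin raw pointer occupies exactly as many bytes as `usize`.
    assert_eq!(size_of::<*const u32>(), size_of::<usize>());
    assert_eq!(size_of::<*mut u8>(), size_of::<usize>());

    let value = 7u32;
    let p: *const u32 = &value;
    // Creating and inspecting a raw pointer is safe; only dereferencing needs `unsafe`.
    println!("address = {:#x}", address_of(p));
}
```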

Well, raw pointers let you do things that would otherwise not be allowed, such as keeping two points of mutable access to the same object. They also have a specified ABI when their pointee is Sized, so you can pass them across FFI boundaries.
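As a rough illustration of the "two points of mutable access" part, here is a sketch in which two raw pointers alias one `i32`. The function name `bump_twice` is hypothetical, and the code stays sound only because both pointers come from the same mutable reference and that reference is not used while they are live:

```rust
/// Increments a counter through two raw pointers that alias the same `i32`.
fn bump_twice(counter: &mut i32) {
    // Both raw pointers are derived from the same mutable reference.
    let a: *mut i32 = counter;
    let b: *mut i32 = a; // raw pointers are `Copy`, so this is a second handle

    // SAFETY: `a` and `b` point to a live, aligned, initialized `i32`, and
    // `counter` itself is not used while the raw pointers are in use.
    unsafe {
        *a += 1;
        *b += 1;
    }
}

fn main() {
    let mut n = 0;
    bump_twice(&mut n);
    assert_eq!(n, 2);
}
```

Holding two `&mut i32` to the same value at once would be rejected by the borrow checker; with raw pointers the compiler stops checking and the obligation moves to the programmer.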

It's UB to dereference a raw pointer if it's not as valid as a reference would be.
In essence this means:

  • Your pointer is correctly aligned for the pointee.
  • Your pointer points to an object in an initialized state.
  • Your pointer is not null.
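Two of these three conditions can be checked at run time; the third (the pointee being initialized and live) cannot. A sketch of such a guarded read, with `read_checked` being a made-up helper rather than a standard library function:

```rust
use std::mem::align_of;

/// Reads a `u32` through `ptr` only if the pointer is non-null and aligned.
///
/// # Safety
/// The caller must still guarantee that `ptr` points to a live, initialized
/// `u32`; that part cannot be checked at run time.
unsafe fn read_checked(ptr: *const u32) -> Option<u32> {
    if ptr.is_null() {
        return None;
    }
    if (ptr as usize) % align_of::<u32>() != 0 {
        return None;
    }
    // SAFETY: non-null and aligned were checked above; the validity of the
    // pointee is the caller's promise.
    Some(unsafe { ptr.read() })
}

fn main() {
    let x = 5u32;
    assert_eq!(unsafe { read_checked(&x) }, Some(5));
    assert_eq!(unsafe { read_checked(std::ptr::null()) }, None);
}
```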

You can allocate memory using the allocator in Rust with a particular size and alignment (that's what the Layout struct is for), and that will give you back a pointer. You're then guaranteed to have access to that data without:

  • A segfault
  • Overwriting other data you care about

(Although of course you must initialize it properly)
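A small sketch of that workflow using the global allocator functions in `std::alloc`. The function name `roundtrip` is just for illustration; real code would normally reach for `Box` instead:

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates room for one `u64`, initializes it, reads it back, and frees it.
fn roundtrip(value: u64) -> u64 {
    // The layout records the size and alignment the allocation must satisfy.
    let layout = Layout::new::<u64>();

    // SAFETY: `layout` has a non-zero size.
    let raw: *mut u8 = unsafe { alloc(layout) };
    if raw.is_null() {
        handle_alloc_error(layout);
    }

    let p = raw as *mut u64;
    // SAFETY: `p` is non-null, aligned for `u64`, and points to 8 bytes that
    // this function exclusively owns. The memory is written before it is read,
    // and it is freed with the same layout it was allocated with.
    unsafe {
        p.write(value);
        let out = p.read();
        dealloc(raw, layout);
        out
    }
}

fn main() {
    assert_eq!(roundtrip(42), 42);
}
```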

I'm not quite sure what you mean... if you're asking if memory has an inherent type then the answer is no but things can get tricky:

  • Transmuting objects by means of reading the wrong type from a raw pointer is usually UB.
  • Reading padding bytes is also usually UB.
  • Accessing specific fields of a #[repr(Rust)] type through hand-computed offsets is UB if I recall correctly, since the field layout is unspecified (structs, enums, and unions are #[repr(Rust)] by default, without the attribute being written out). You can only get a field's address by going through the pointee itself; offset methods aren't guaranteed to land on the field.

Note that in Rust, memory itself is quite definitely untyped. If the bits, provenance, and initializedness are correct, you can legally read/write it regardless of how that memory was originally obtained.

(This is different from things like C++ that have TBAA, and thus certain kinds of reads are UB regardless of the bit pattern behind it.)
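To make the "untyped memory" point concrete, here is a sketch that stores a `u32` and reads the same bytes back as an `f32` through a cast raw pointer. The function name `bits_as_f32` is made up, and idiomatic code would simply call `f32::from_bits`:

```rust
use std::mem::{align_of, size_of};

/// Reinterprets the bits of a `u32` as an `f32` by reading through a cast pointer.
fn bits_as_f32(bits: u32) -> f32 {
    // The read below relies on both types having the same size and alignment.
    assert_eq!(size_of::<u32>(), size_of::<f32>());
    assert_eq!(align_of::<u32>(), align_of::<f32>());

    let p = &bits as *const u32 as *const f32;
    // SAFETY: the memory is initialized, the pointer is aligned for `f32`,
    // and every 32-bit pattern is a valid `f32`.
    unsafe { p.read() }
}

fn main() {
    // 0x3f80_0000 is the IEEE 754 encoding of 1.0.
    assert_eq!(bits_as_f32(0x3f80_0000), 1.0);
}
```

The memory was created as a `u32`, yet the `f32` read is fine because only the bits, alignment, provenance, and initialization matter, not the type the memory was "originally" given.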

  1. It's just a pointer; holds a memory address[1], bitwidth depends on the platform.

  2. a. You typically don't do manual memory management; you use Box or the like
    b. Or, for in-place initialization, a MaybeUninit that you move into a Box (until we have new_uninit)
    c. You get a *mut u8 from the allocator, which you'd probably want to convert into a Box anyway.

  3. Layout is documented here. The main relationship is how various pointer arithmetic is done, and whether or not a pointer is aligned. You can read a bunch on this page and also here, where some operations are byte based but most are size based. (Yeah there's a lot.) You can do unaligned access through raw pointers. References have stricter requirements.

  4. The provenance of pointers can't cross allocated objects. Maybe that's what you meant, but I'm not sure. The allocator hands out pointers that one eventually hands back. What exactly the allocator does can vary per allocator. But again, a typical Rust programmer won't be working on this level, they'll be working with Box (or Vec or...), or perhaps with MaybeUninit.
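A sketch of points 2b and 3 together, assuming the pattern meant is "box a MaybeUninit, fill it through a raw pointer, then convert it to an initialized Box". The function name `boxed_squares` is hypothetical:

```rust
use std::mem::MaybeUninit;

/// Builds a heap-allocated array by writing each element through a raw pointer.
fn boxed_squares() -> Box<[u32; 4]> {
    // The Box owns the allocation; its contents start out uninitialized.
    let mut slot: Box<MaybeUninit<[u32; 4]>> = Box::new(MaybeUninit::uninit());
    let base = slot.as_mut_ptr() as *mut u32;

    for i in 0..4 {
        // `add(i)` is size based: it advances by `i * size_of::<u32>()` bytes.
        // SAFETY: `i < 4`, so the write stays inside the 4-element allocation.
        unsafe { base.add(i).write((i * i) as u32) };
    }

    // SAFETY: all four elements were written above, and `MaybeUninit<T>` has
    // the same size and alignment as `T`, so the cast pointer is valid.
    unsafe { Box::from_raw(Box::into_raw(slot) as *mut [u32; 4]) }
}

fn main() {
    assert_eq!(*boxed_squares(), [0, 1, 4, 9]);
}
```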


  1. mumble mumble CHERI, but that's not a Rust specific thing either ↩︎


(You do have to initialize it.)


@scottmcm @quinedot: I edited my answer to better reflect the intent :smile: -- thanks for catching those.


Rust does have one additional quirk, though: raw pointers to unsized types, like *mut [u8] or *mut dyn Trait, are fat pointers that carry some metadata in addition to the memory address. I believe that all of these are double-width at the moment, but the specifics depend on the target type. It's possible that some types could have even wider pointers sometime in the future.

Generally, the metadata will either be a usize that describes the length of a slice or a pointer to a vtable used for dynamic method dispatch.
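The current widths can be observed directly. This sketch reports pointer sizes in units of `usize`; the function name `pointer_widths` is made up, and the "double-width" result reflects present-day behaviour rather than a promise for every future pointee type:

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns the widths, in machine words, of a thin pointer, a slice pointer,
/// and a trait-object pointer.
fn pointer_widths() -> (usize, usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<*const u8>() / word,        // address only
        size_of::<*const [u8]>() / word,      // address + slice length
        size_of::<*const dyn Debug>() / word, // address + vtable pointer
    )
}

fn main() {
    let (thin, slice, dynamic) = pointer_widths();
    println!("thin = {thin}, slice = {slice}, dyn = {dynamic}");
}
```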


To consolidate discussions into a single thread, let me quote the relevant answer from @CAD97 in the IRLO crosspost.

