When to use raw pointers?

Hi,

A couple of weeks ago I asked a question about a multi-mut hash map, and the answer was to either use raw pointers or use a third-party crate that uses raw pointers internally.

Rust has syntax sugar that allows creating multiple mutable references to the fields of a struct or the elements of an array in a safe way. However, for any abstraction, like the hash map in my question above, the only way to get multiple mutable references is to use raw pointers, which are not safe.
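
For reference, the safe case I mean looks roughly like the sketch below. The `Player` struct and its fields are made up for illustration; the point is only that the borrow checker accepts simultaneous mutable borrows when it can see that the places are disjoint.

```rust
// Hypothetical struct, used only to show disjoint field borrows.
struct Player {
    health: u32,
    score: u32,
}

fn demo() -> (u32, u32, [i32; 3]) {
    let mut p = Player { health: 10, score: 0 };
    // Two mutable borrows of *different* fields may be live at the same time.
    let health = &mut p.health;
    let score = &mut p.score;
    *health -= 1;
    *score += 5;

    let mut arr = [1, 2, 3];
    // Destructuring a fixed-size array yields one `&mut` per element.
    let [a, b, c] = &mut arr;
    std::mem::swap(a, c);
    *b *= 10;

    (p.health, p.score, arr)
}

fn main() {
    let (health, score, arr) = demo();
    println!("{} {} {:?}", health, score, arr);
}
```

The same thing does not work through a method such as `HashMap::get_mut`, because each call borrows the whole map mutably.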

The truth is that most algorithms in real-world programming require multiple references, and there is an article that describes why unsafe code is still required.

Questions that I have:

  1. In which situation should I use raw pointers?
  2. In which situation should I stop using raw pointers?

I've checked a couple of the available docs, and the only use case they give for raw pointers is FFI. To me it looks like that is not the only case.
https://doc.rust-lang.org/book/first-edition/unsafe.html
https://doc.rust-lang.org/book/first-edition/raw-pointers.html

I believe that the safe approach should be used by default.
3) Are there any existing guidelines that describe in which situations the use of raw pointers is justified?

#1: Data structures (when implemented from scratch), and FFI.

Even then, most data structures can get away with using something like a Vec or array for their backing storage rather than needing raw pointers, but not all.
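
As a rough sketch of what Vec-backed storage can look like, here is a small graph that refers to nodes by index instead of by pointer. The `Graph` type and its methods are hypothetical names for this example, not an existing crate's API.

```rust
/// A tiny directed graph that stores nodes in a Vec and refers to them by index.
struct Graph {
    labels: Vec<String>,
    /// edges[i] holds the indices of node i's neighbours.
    edges: Vec<Vec<usize>>,
}

impl Graph {
    fn new() -> Self {
        Graph { labels: Vec::new(), edges: Vec::new() }
    }

    /// Adds a node and returns its index, which acts as a handle.
    fn add_node(&mut self, label: &str) -> usize {
        self.labels.push(label.to_string());
        self.edges.push(Vec::new());
        self.labels.len() - 1
    }

    /// Adds a directed edge; panics if either index is unknown.
    fn add_edge(&mut self, from: usize, to: usize) {
        assert!(from < self.labels.len() && to < self.labels.len());
        self.edges[from].push(to);
    }

    fn neighbours(&self, node: usize) -> Vec<&str> {
        self.edges[node].iter().map(|&i| self.labels[i].as_str()).collect()
    }
}

fn main() {
    let mut g = Graph::new();
    let a = g.add_node("a");
    let b = g.add_node("b");
    g.add_edge(a, b);
    g.add_edge(b, a); // a cycle, with no Rc, RefCell, or raw pointers
    println!("{:?}", g.neighbours(a));
}
```

The trade-off is that a stale index is a logic bug (or a panic) rather than a compile error, but it is never undefined behaviour.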

#2: Always find an encapsulation boundary. This is easy for data structures; the data structure is your encapsulation boundary. It may use raw pointers internally but your public API should be safe.

Multiple access such as in your problem can be encapsulated in a number of ways; find the one that's right for your use case:

  • If you can provide a thread-safe API for mutating the container, that API can take &self. (& in Rust really means "shared" or "thread-safe," not necessarily immutable!)
  • Provide methods that provide multiple disjoint "&mut views" of the same object, tuned for your use cases.
  • If all else fails, try to factor out a small core of code that takes care of all of the unsafe mutation in a manner such that UB is impossible. For instance, some kind of job manager, like rayon's ThreadPool. Then you can build a standard safe API on top of it.
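
A minimal sketch of the second option, assuming a Vec-backed container: `Slots` and `pair_mut` are names invented for this example. The raw pointers stay inside one small method, and the checks before the unsafe block are what make the public API safe.

```rust
/// Hypothetical container; only `pair_mut` touches raw pointers.
struct Slots<T> {
    items: Vec<T>,
}

impl<T> Slots<T> {
    fn new(items: Vec<T>) -> Self {
        Slots { items }
    }

    /// Returns mutable references to two *different* slots, or None if the
    /// indices are equal or out of bounds.
    fn pair_mut(&mut self, i: usize, j: usize) -> Option<(&mut T, &mut T)> {
        let len = self.items.len();
        if i == j || i >= len || j >= len {
            return None;
        }
        let base = self.items.as_mut_ptr();
        // SAFETY: both indices are in bounds and distinct, so the two
        // references point at different elements and cannot alias. Both
        // borrow from `&mut self`, so the Vec cannot be resized or accessed
        // through any other path while they are alive.
        unsafe { Some((&mut *base.add(i), &mut *base.add(j))) }
    }
}

fn main() {
    let mut s = Slots::new(vec![1, 2, 3]);
    if let Some((a, b)) = s.pair_mut(0, 2) {
        std::mem::swap(a, b);
    }
    // Asking for the same slot twice is refused instead of aliasing.
    assert!(s.pair_mut(1, 1).is_none());
    println!("{:?}", s.items);
}
```

For slices specifically, `split_at_mut` can give the same result with no unsafe at all; the raw-pointer version is shown only because it generalizes to containers that have no such helper.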

To answer #1, my rule of thumb is that you hardly ever need to use unsafe, but you'll typically know when you need it.

In less vague terms, this is typically when working with raw memory, so things like creating your own primitive data structures, or when working with complex data structures which don't fit well with Rust's memory model (graphs and self-referencing types come to mind).

You also need to use raw pointers when working with hardware. On a microcontroller, the way you interact with the outside world (e.g. by setting a pin high/low) is by writing values to a particular memory location. One way you can do this is let led_1 = 0xdead_beef_usize as *mut u8; unsafe { led_1.write_volatile(0xFF); }. The write has to sit in an unsafe block, and a volatile write keeps the compiler from optimizing it away.
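
Here is a sketch of that idea that can run on a normal machine. A local variable stands in for the register, since a real address would come from the chip's datasheet; `set_register` is a hypothetical helper, not part of any HAL crate.

```rust
use std::ptr;

/// Writes `value` to a register and reads it back.
///
/// # Safety
/// `reg` must be properly aligned and valid for reads and writes.
unsafe fn set_register(reg: *mut u8, value: u8) -> u8 {
    unsafe {
        // Volatile accesses are never elided or merged by the compiler,
        // which matters when the "memory" is really a hardware register.
        ptr::write_volatile(reg, value);
        ptr::read_volatile(reg)
    }
}

fn main() {
    // Assumption: this local stands in for a memory-mapped register.
    let mut fake_register: u8 = 0;
    let reg: *mut u8 = &mut fake_register;
    let read_back = unsafe { set_register(reg, 0xFF) };
    println!("{:#04x}", read_back);
}
```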

A lot of people say you can always drop to unsafe and raw pointers for super high performance stuff, but in all the time I've been writing Rust I can't say I've ever needed to do this. LLVM generates really fast code by default, so choosing a better algorithm will usually give a bigger performance benefit than trying to skip a couple of bounds checks.
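
As a small illustration (function names are made up), the usual safe alternative to skipping bounds checks by hand is to avoid indexing altogether. Whether the checks in the first version are actually removed depends on the optimizer, so measure before reaching for unsafe.

```rust
/// Index-based loop: each `a[i]` and `b[i]` is bounds-checked unless the
/// optimizer can prove the index is in range.
fn dot_indexed(a: &[f64], b: &[f64]) -> f64 {
    let n = a.len().min(b.len());
    let mut sum = 0.0;
    for i in 0..n {
        sum += a[i] * b[i];
    }
    sum
}

/// Iterator version: there is no indexing, so there is nothing to check.
/// `zip` stops at the end of the shorter slice.
fn dot_iter(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    println!("{} {}", dot_indexed(&a, &b), dot_iter(&a, &b));
}
```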

FFI code is probably the major reason why you'll need to use raw pointers. Having the ability to reuse existing C/C++ libraries or interact with the OS is super useful, although you'll typically find people have already created safe abstractions for you (e.g. the nix crate).
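
To give a feel for what such a safe abstraction does, here is a sketch. `fill_bytes` is a stand-in written in Rust for a C function (in real code it would be declared in an extern block and linked from a library), and `filled` is a hypothetical safe wrapper around it.

```rust
use std::os::raw::c_int;

/// Stand-in for a C function with the signature
/// `int fill_bytes(uint8_t *buf, size_t len, uint8_t value);`
/// It returns 0 on success and -1 if `buf` is null.
unsafe extern "C" fn fill_bytes(buf: *mut u8, len: usize, value: u8) -> c_int {
    if buf.is_null() {
        return -1;
    }
    unsafe { std::ptr::write_bytes(buf, value, len) };
    0
}

/// Safe wrapper: callers never see the raw pointer or the C error code.
fn filled(len: usize, value: u8) -> Result<Vec<u8>, c_int> {
    let mut buf = vec![0u8; len];
    // SAFETY: the pointer and the length come from the same live Vec, so the
    // callee writes only inside the allocation.
    let rc = unsafe { fill_bytes(buf.as_mut_ptr(), buf.len(), value) };
    if rc == 0 { Ok(buf) } else { Err(rc) }
}

fn main() {
    println!("{:?}", filled(3, 7));
}
```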

#2: You should probably stop using raw pointers if you're doing hacky stuff like pointer casts without understanding 100% of what you're doing. Also, as the recent actix-web incident shows, it's a good idea to look back and ask yourself "do I actually need to use unsafe here?" every now and then.

If it's still necessary to use raw pointers, then you'll want to try to expose a safe API which encapsulates the use of unsafe. The best example of this is the Rust standard library. Vec and BTreeMap use loads of unsafe under the hood, yet they've made sure to expose a safe interface which people can consume without worrying about memory bugs.

To answer #3, we don't really have any concrete set of guidelines around writing unsafe code just yet. The nomicon is a great resource, as well as Learning Rust With Entirely Too Many Linked Lists.

Other than that, it's a good idea to ask for feedback on your code (or even a proper audit if necessary) either here or on the Rust subreddit. I've done that several times and found it quite beneficial, even though it can be a little humbling when others point out that what you've done is unsound.

Thank you for your answers