References are not pointers

Forking off from Rust References - Few points - Request for comments - #3 by kornel

6 Likes

I don't see how indirection is any different here, and why would the terminology behind it change:

let a = &42;
let b = Box::new(42);

But what you call it, and whether you refer to the binding holding a value or the value directly, doesn't change the fact that both references and boxes have exactly the same level of indirection. They happen to compile to the exact same kind of pointer.
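A minimal sketch of that claim about representation, assuming a thin (sized) pointee such as i32. The helper name pointer_sizes is made up for this illustration; it only compares sizes and says nothing about ownership, which is where the two types differ.

```rust
use std::mem::size_of;

/// Hypothetical helper: sizes in bytes of a shared reference, a Box,
/// and a raw pointer to i32.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&i32>(),
        size_of::<Box<i32>>(),
        size_of::<*const i32>(),
    )
}

fn main() {
    let a = &42; // borrowed: points at a value owned elsewhere
    let b = Box::new(42); // owned: points at a heap allocation it will free

    // Both are read through a single dereference.
    assert_eq!(*a, *b);

    let (reference, boxed, raw) = pointer_sizes();
    println!("&i32: {reference}, Box<i32>: {boxed}, *const i32: {raw}");
}
```

The sizes match, but dropping b frees its allocation while dropping a frees nothing.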

In saying "references are not pointers in the same way pointers are not integers" I mean exactly this distinction, and I think it's essential to what references are/aren't.

Imagine explaining C pointers to C users like this:

Pointers are integers. You can subtract pointers, and you can subtract integers. You can increment pointers and you can increment integers. They only differ in usability: C will only let you divide and multiply integers, but you can use integers as the mental model to understand pointers.

And it's technically true, but then the "pointers are integers" mental model may cause misunderstanding like this:

void *number_of_apples = 0; // pointers are integers

void add_one_apple() {
    number_of_apples++; // arithmetic on void* is at best a compiler extension
}

void double_number_of_apples() {
    number_of_apples = number_of_apples * 2; // rejected: pointers cannot be multiplied
}

And someone will say that the "pointer checker" is hard, because they're just trying to multiply an integer, and it doesn't work! Such a simple operation, and these integers can't do it!

That's because pointers aren't integers.

And the same thing happens when new Rustaceans coming from C try to write a doubly-linked list, or just struct Person { name: &str } and it blows up like number_of_apples * 2.
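To make the Person case concrete, here is a sketch of the two forms that do compile. The names OwnedPerson and make_owned are invented for this example. The borrowing form needs an explicit lifetime, and the owning form uses String instead of a reference.

```rust
// Borrowing version: the lifetime 'a ties Person to the string it borrows.
struct Person<'a> {
    name: &'a str,
}

// Owning version (hypothetical name): keeps its own copy of the text,
// so it can outlive the original string.
struct OwnedPerson {
    name: String,
}

/// Copies the borrowed text into an owned value.
fn make_owned(name: &str) -> OwnedPerson {
    OwnedPerson { name: name.to_string() }
}

fn main() {
    let owned = {
        let temp = String::from("Alice");
        let borrowed = Person { name: &temp };
        // `borrowed` cannot leave this block: `temp` is dropped at its end.
        let result = make_owned(borrowed.name);
        result
    };
    println!("{}", owned.name);
}
```

A C pointer field would cover both cases with one type; in Rust the choice between them has to be made up front.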

So to phrase it more precisely:

  • Pointers are integers that have been given additional semantics and restrictions that make them not usable as integers any more.

  • References are pointers that have been given additional semantics and restrictions that make them not usable as pointers any more.

8 Likes

I still don't get what you mean by the part

The things I use pointers for in C are:

  • Avoiding copying a value
  • Traversing linked data structures
  • Indirectly accessing objects when I don't control where they are coming from
  • Referring to a flat array of values through a single handle

The things I use references for in Rust are:

  • Avoiding copying a value
  • Traversing linked data structures
  • Indirectly accessing objects when I don't control where they are coming from
  • Referring to a flat array of values through a single handle (&[T] and &str)

These look identical to me. I am, and have always been, using references as pointers, with success. And I'm very happy for their additional semantics and restrictions so that I can be sure that I've got the surrounding gory details right.

So I'm not denying that they are there, I am not denying that they are useful, and I'm sorry but I'm not interested in the kind of pedantry asserting that the general term "pointer" is only ever interpretable in the context of a very specific programming language, exclusively with the semantics of that one language. That is not a productive discussion, because almost every technical term is amenable to slightly different meanings in different contexts, as long as we are not talking about formal verification with mathematical precision. Which we are not, because we are talking about building a mental model and educating beginners. Yes, C and Rust are different languages, and they have different idioms and different details. Let's leave it at that – I'm not going to reply to this thread anymore.

You focus on what they have in common, and I focus on what they don't. Because overlooking the "don't" causes the struggle with the borrow checker:

  • Pointers can make doubly-linked lists. References can't.
  • Pointers are usable independently of their scope. References aren't.
  • Pointers can be ownership agnostic, even dynamically. References can't.
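For the doubly-linked case, one common safe alternative (not the only one) is shared ownership with Rc for the forward link and Weak for the back link. The sketch below assumes a two-node list; Node, two_node_list, next_value and prev_value are names made up for this example.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,   // owning link forward
    prev: Option<Weak<RefCell<Node>>>, // non-owning link backward
}

/// Builds a two-node list and returns (head, tail).
fn two_node_list(a: i32, b: i32) -> (Rc<RefCell<Node>>, Rc<RefCell<Node>>) {
    let head = Rc::new(RefCell::new(Node { value: a, next: None, prev: None }));
    let tail = Rc::new(RefCell::new(Node {
        value: b,
        next: None,
        prev: Some(Rc::downgrade(&head)),
    }));
    head.borrow_mut().next = Some(Rc::clone(&tail));
    (head, tail)
}

/// Follows the forward link, if there is one.
fn next_value(node: &Rc<RefCell<Node>>) -> Option<i32> {
    let next = node.borrow().next.clone()?;
    let value = next.borrow().value;
    Some(value)
}

/// Follows the back link, if the previous node is still alive.
fn prev_value(node: &Rc<RefCell<Node>>) -> Option<i32> {
    let prev = node.borrow().prev.as_ref()?.upgrade()?;
    let value = prev.borrow().value;
    Some(value)
}

fn main() {
    let (head, tail) = two_node_list(1, 2);
    println!("{:?} {:?}", next_value(&head), prev_value(&tail));
}
```

The back link has to be checked with upgrade() on every use, which is the ownership decision a plain C pointer would leave implicit.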

And the things you use references for are not exclusive use-cases for references, and require decisions:

  • Avoiding copying a value may require Box or Rc
  • Traversing data structures can be done with iterators.
  • You may be given Rc or a Ref<'a>
  • Vec or Box<[T]>

In C these use-cases are all "of course pointer, only pointer, pointer pointer pointer", but in Rust there's owned/borrowed dimension that you can't ignore. Equating C's one-tool-for-everything with Rust's specific tool for some aspect of these use-cases means people miss the other half of the problem.
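A small sketch of that owned/borrowed split for the "flat array through a single handle" case. The function name sum is invented for the example. The borrowing side is always &[i32], while the owning side can be any of several types.

```rust
use std::rc::Rc;

/// Borrows a flat run of values and does not care who owns them.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    // Owned, growable.
    let growable: Vec<i32> = vec![1, 2, 3];
    // Owned, fixed length.
    let fixed: Box<[i32]> = vec![4, 5, 6].into_boxed_slice();
    // Shared ownership, reference counted.
    let shared: Rc<[i32]> = Rc::from(vec![7, 8, 9]);
    // Cloning the Rc copies the handle, not the data.
    let also_shared = Rc::clone(&shared);

    println!(
        "{} {} {} {}",
        sum(&growable),
        sum(&fixed),
        sum(&shared),
        sum(&also_shared)
    );
}
```

In C all four owners and the borrowed view would typically be the same int* type.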

3 Likes

Wearing my Rust Historian hat, I'll point out that the original names for &T and Box<T> were borrowed pointer and owning pointer.

(This could be interpreted as evidence for either side. If the creators of the language called these types “pointers,” then there must be some strong sense in which they are in fact pointers. On the other hand, there must also be some reason that we decided to stop using those names as the types evolved into their current forms.)

14 Likes

C also has const pointers and restrict pointers, which impose limitations on the kinds of things you can do with them. But they are still considered pointers. Surely you wouldn't say "const pointers aren't pointers because you can write to pointers and you can't write to const pointers." They're still a kind of pointer.

I think you're saying that "pointer" means something in C that allows certain things, and "reference" means something in Rust that allows different things, and that because those things aren't in a strict subset relationship, it is incorrect (or at least problematic) to say "references are pointers".

So far as it goes, I don't venture to argue with you. But this discussion started with "References are not pointers" (emphasis mine). Which seems to me exceptionally pedantic, and likely to cause more confusion than it eases. References are pointers with limitations, like const pointers in C are pointers with limitations. Saying they aren't pointers suggests that even that mental model is incorrect. It's an overzealous correction, in my opinion, that risks causing confusion worse than what it seeks to correct.

I would go farther and say that the "pointer property" is indirection. This other stuff, like ownership/borrowing, mutability/immutability, sharing/exclusivity -- all that stuff may also apply, but it doesn't make something not a pointer because what makes something a pointer is indirection, not any of that other stuff.

11 Likes

In that case it follows that Rust has no unrestricted pointers, since it is my understanding that e.g. using a *mut _ to create two *mut _ (i.e. duplicating it) is UB. In other words, even the most C-like pointer Rust has is not as restriction-free as an actual C pointer.

As a side note, I've noticed something: if borrows are more specific pointers, i.e. they exhibit an is-a relationship with pointers, and the difference is essentially taking properties away from pointers in order to achieve specific effects, then how backwards and broken is it conceptually that an OO language essentially only allows adding things when creating an is-a relationship (i.e. creating a subclass)?

That's actually not true -- *muts may alias, only &mut references are guaranteed to be exclusive.

(although I think you could still make an argument that none of Rust's pointers are as "completely unrestricted" as C's, since you can freely cast pointers from one type to another, it probably turns out to be purely academic.)

1 Like

But it's UB to dereference them both, so what good does the extra copied pointer do you?

Fair point about "const pointers are not pointers".

In my mind "References are not pointers" is short for "you can't use references in Rust exactly as you would use pointers in C" (because owning uses won't compile).

And that is true for "const pointers are not pointers" as well: "you can't use const pointers exactly as you would use regular pointers" (because your mutations won't compile).

So I see how "it's not the same" is ambiguous and can be seen as "it's not equivalent" or as "it's a totally different thing".

2 Likes

It's not UB to dereference aliasing raw pointers. From the Stacked Borrows paper:

Raw pointers in Rust basically act like pointers in C: they are not checked by the compiler, and there are no aliasing restrictions. For example, the following code demonstrates a legal way to cause aliasing with raw pointers:

let mut x = 4;
let ptr1 = &mut x as *mut i32; // create first raw pointer
let ptr2 = ptr1; // create aliasing raw pointer
// Write through one pointer, observe with the other pointer.
unsafe { *ptr1 = 13; }
unsafe { println!("{}", *ptr2); } // Prints "13"

If aliasing using raw pointers were disallowed, then it would be impossible to efficiently implement many data structures in Rust, including the standard library’s LinkedList.

6 Likes

I feel like this thread is missing crucial context. Why are we debating the accuracy of the extremely vague and underspecified claim that "references are not pointers"? Is someone proposing an amendment to The Book or std docs or something? Both "pointer" and "reference" are terms used with a wide variety of both narrow and broad meanings across various languages, so whether the claim is "correct" depends entirely on how narrowly each word is being read.

Within the context of Rust itself, by far the most natural usage (and what I believe is already standard) is to say that &T and &mut T are reference types while *const T and *mut T are pointer types. That would make "pointers are not references" a perfectly boring and obvious truth no different from "Box<T> is not Arc<T>".
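A short sketch of that type-level distinction, with made-up helper names read_ref and read_raw. A reference converts to a raw pointer implicitly and safely, while going the other way, or dereferencing the raw pointer at all, needs unsafe.

```rust
/// Reads through a shared reference: always safe, since the compiler
/// has already checked that the referent is alive.
fn read_ref(r: &i32) -> i32 {
    *r
}

/// Reads through a raw pointer.
///
/// # Safety
/// `p` must be non-null, aligned, and point to a live, initialized `i32`.
unsafe fn read_raw(p: *const i32) -> i32 {
    unsafe { *p }
}

fn main() {
    let x = 7;
    let r: &i32 = &x;
    // A reference coerces to a raw pointer without any `unsafe`.
    let p: *const i32 = r;
    // Turning the raw pointer back into a reference requires `unsafe`.
    let back: &i32 = unsafe { &*p };
    println!("{} {} {}", read_ref(r), unsafe { read_raw(p) }, back);
}
```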

Comparing to pointers and references of other languages is where things get complicated, and the only solution to that is to specify your language. Statements like "Rust references are not like pointers in other languages" are at best extremely confusing, and at worst near-objectively wrong to significant portions of your audience that don't care about e.g. C-like pointer arithmetic. You have to say what languages and what properties/operations/guarantees of those languages you're talking about.

3 Likes

What this does highlight is just how big of a hornet's nest unsafe really is, and how vaguely defined its rules are. Even if they are crisply defined somewhere, they are complex enough to feel vague, which makes it quite difficult to get an intuition for them without walking into a UB trap when actually using them. I went through the unsafe book once (the nomicon) and still got it wrong.

1 Like

They're often implemented using pointers. Sometimes the compiler might optimize a reference away entirely, rather than making a location in memory and a pointer to that location.

This was an important lesson for me to learn, coming from C.

5 Likes

Personally, I usually say "raw pointers" for those, precisely because Rust also has

  • references, which provide indirection (point at things), even when not called "pointers"
  • smart pointers, which provide indirection and are called "pointers"
  • function pointers, which provide indirection and are called "pointers"
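As a rough illustration of that list, the sketch below puts a reference, a smart pointer, a function pointer and a raw pointer side by side. The functions double and apply are invented for the example. Only the raw pointer needs unsafe to be followed.

```rust
use std::rc::Rc;

fn double(n: i32) -> i32 {
    n * 2
}

/// Calls through a function pointer on the value behind a reference.
fn apply(f: fn(i32) -> i32, value: &i32) -> i32 {
    f(*value)
}

fn main() {
    let n = 21;
    let reference: &i32 = &n; // reference
    let smart: Rc<i32> = Rc::new(21); // smart pointer
    let function: fn(i32) -> i32 = double; // function pointer
    let raw: *const i32 = &n; // raw pointer

    println!("{}", apply(function, reference));
    // &Rc<i32> deref-coerces to &i32.
    println!("{}", apply(function, &smart));
    // Following a raw pointer is the only step that needs `unsafe`.
    println!("{}", unsafe { *raw });
}
```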

But that could just be me, so I did a brief and very informal literature survey.

TRPL contains phrases like "A regular reference is a type of pointer" and "A pointer is a general concept for a variable that contains an address in memory".

The reference describes & and &mut under the section heading "Pointer types". So does Programming Rust.

Both the official book and the official reference also sometimes use "pointer" to refer to raw pointers, unqualified, but only (as far as I can tell) in a context where the rawness is implicit, in the same way you might use "coronavirus" to refer to COVID-19. I could not find any direct comparison between "pointers" and "references", in any official documentation, where "pointer" was not qualified by "raw".

I noticed that the reference also contains some examples of "pointers and references" and "pointers or references" (without "raw"), which does more clearly evoke contrast. However, it also sometimes uses the word "pointers" in a broad sense, where references are clearly included given the context.

2 Likes

Actually I'm the same way. I completely forgot about the "raw" while writing that post, probably because it's been weirdly absent from most of this thread.
