Why are function pointers special? (no null)

I learned today I can't set function pointers in Rust to std::ptr::null() because function pointers are not allowed to be null. It appears the suggested workaround is to use Option, similar to how you can use Option<&T> to get "nullable" references. That's all fine, but what I don't get is why. Rust already has nullability as a distinguishing feature between pointers and references, so why wasn't that kept as is for function pointers?

I have to imagine this is an annoying inconsistency for any kind of metaprogramming over pointer types, because you will have to specially treat data pointers (can be null) from code pointers (can't be null).

Also what then is the difference between a reference to a function and a pointer to a function?

6 Likes

Do you find dereferencing a null pointer ok? No? Then why would calling null be ok?

1 Like

I'd expect it to be consistent with how data pointers work in Rust -- dereferencing a null pointer is UB.

A function pointer is more like a &'static Code for some abstract Code array than a *const Code pointer. Does that abstraction fit?

8 Likes

Thus if fn() pointer types could be NULL, you could only call them in unsafe.

Maybe *const dyn Fn() is closer to what you want.
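For what it's worth, here is a rough sketch of what going through *const dyn Fn() looks like in practice. The names double and call_via_raw_dyn are made up for illustration. The only point is that the raw pointer has to be turned back into a reference inside unsafe before it can be called, which a plain fn() never needs.

```rust
fn double(x: i32) -> i32 {
    x * 2
}

// Hypothetical helper: routes a call through a raw trait-object pointer.
fn call_via_raw_dyn(input: i32) -> i32 {
    // Creating the raw pointer from a reference is safe.
    let by_ref: &dyn Fn(i32) -> i32 = &double;
    let raw: *const dyn Fn(i32) -> i32 = by_ref;

    // Using it is not: the compiler cannot prove `raw` is still valid,
    // so converting it back to a reference requires `unsafe`.
    let callable: &dyn Fn(i32) -> i32 = unsafe { &*raw };
    callable(input)
}

fn main() {
    assert_eq!(call_via_raw_dyn(21), 42);
}
```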

"Pointer" in the name is unfortunate, and I have seen people express that &fn() would have been a better choice than just fn().

13 Likes

It sounds like you're saying "in Rust a function pointer is more like a function reference", which I think is what I'm pointing out already. But my question is... why? I'd have expected functions to work just like data: if you need to be sure they're non-null and valid, you take a reference. If you need the flexibility that they may be null or invalid, you use a pointer.

Note that if you wanted a reference that may be null, the right thing to do isn't to switch to raw pointer types, but to use Option<&T>. Thus, the need for Option<fn()> is not a special case.
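A small sketch of that pattern, assuming a made-up callback slot (run_callback and double are illustrative names, not anything from a real API). None stands in for the null function pointer, and nothing here needs unsafe.

```rust
// Hypothetical callback slot: `None` plays the role of a null function pointer.
fn run_callback(callback: Option<fn(i32) -> i32>, input: i32) -> Option<i32> {
    // The function is only called when one is actually present.
    callback.map(|f| f(input))
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    assert_eq!(run_callback(Some(double), 21), Some(42));
    assert_eq!(run_callback(None, 21), None);
}
```

The same shape works for data with Option<&T>, which is why Option<fn()> is not really a special case.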

9 Likes

Because you can call it. You don’t need to use unsafe to call a function since a function is always a valid 'static value. If function pointers were nullable, calling them would be unsound in some cases, so this would require unsafe. Would you want to use unsafe every time you want to call a function?

1 Like

Yeah, but that would be consistent with how data pointers work. Why would that be a problem?

Oof, maybe I'm more confused than I realize. So I know there's a function pointer type, and that for every function there's a singleton type that only includes that one function. And separately, for consistency, I would expect you can have a reference to a function. When you write Option<fn()> what is that? It seems like none of them.
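One way to see the difference between the singleton type and the function pointer type is to compare their sizes. This is only a sketch: greet, item_size and pointer_size are illustrative names, and the pointer-size check assumes a target where function pointers are as wide as usize, which holds on common platforms.

```rust
use std::mem::{size_of, size_of_val};

fn greet() -> &'static str {
    "hello"
}

// Size of the unique "function item" type that only contains `greet`.
fn item_size() -> usize {
    let item = greet;
    size_of_val(&item)
}

// Size of a function pointer that can hold any `fn() -> &'static str`.
fn pointer_size() -> usize {
    let pointer: fn() -> &'static str = greet;
    size_of_val(&pointer)
}

fn main() {
    // The singleton type carries no data at all.
    assert_eq!(item_size(), 0);
    // The function pointer stores an address (assumed usize-wide here).
    assert_eq!(pointer_size(), size_of::<usize>());

    // Option<fn()> is that same pointer type wrapped in an ordinary Option.
    let maybe: Option<fn() -> &'static str> = Some(greet);
    assert_eq!(maybe.map(|f| f()), Some("hello"));
}
```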

Every time I call a function by pointer yes, that would be sane and consistent.

1 Like

It might help to adjust your taxonomy a bit. *const T and *mut T are not Rust's “pointer” types, they are Rust's “raw pointer” types. Raw pointers, references, function pointers, and Box are all examples of “pointer” types in Rust.

Only raw pointers are unsafe to use and have inherent null values (rather than ones added via Option).
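To make that split concrete, here is a minimal sketch with made-up helper names (read_raw, read_reference, call_pointer, seven). Only the raw pointer needs a null check and an unsafe block; the reference and the function pointer are used from safe code.

```rust
// Safety: `ptr` must be null or point to a valid, initialized `i32`.
unsafe fn read_raw(ptr: *const i32) -> Option<i32> {
    if ptr.is_null() {
        None
    } else {
        // Dereferencing a raw pointer is the unsafe operation.
        Some(unsafe { *ptr })
    }
}

// References are always valid, so reading through one is safe.
fn read_reference(r: &i32) -> i32 {
    *r
}

// Function pointers are always valid too, so calling one is safe.
fn call_pointer(f: fn() -> i32) -> i32 {
    f()
}

fn seven() -> i32 {
    7
}

fn main() {
    let value = 5;
    assert_eq!(unsafe { read_raw(&value) }, Some(5));
    assert_eq!(unsafe { read_raw(std::ptr::null()) }, None);
    assert_eq!(read_reference(&value), 5);
    assert_eq!(call_pointer(seven), 7);
}
```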

23 Likes

The argument there is that we should have &fn() and *const fn(), but then what is the fn() type? You don't want it to be the address as that would mean extra indirection. So you want it to be some zero-sized type you can't actually call I guess. Although a bit strange, that would have worked and I think some people think we should have done it that way.

Why didn't it? I dunno, it's been that way long before stabilization -- probably just boils down to safe Rust being the focus and raw pointers the exception.


Would it be a big deal to have to use unsafe to call fn()? Yes! A huge deal! It might make your FFI (?) case "easier" in some sense, but that's a very niche case. [1] The non-niche case is that if you have a fn(), you want to call it. Requiring unsafe would make that much uglier and harder for everyone else, and outright forbidden in the many codebases that deny unsafe.

One of the main points of Rust is to seal away unsafeness without a loss of expressiveness and performance. If you start needing unsafe to do common things, people would just start putting it everywhere and you lose all the (massive) benefit of that isolation. We want common things to be safe and ergonomic.


As for your use case -- do you really want to be checking for NULLs and/or using unsafe and putting yourself at risk of UB all the time? Closures can be coerced to function pointers if they don't capture, so I don't think it's really that big of a deal anyway. Just define your own "NULLs".

pub const NULL_FN_NOP: fn() = || {};
pub const NULL_FN_PANIC: fn() = || panic!("Null fn() called");

pub const NULL_FN_UB: unsafe fn() = 
    || unsafe { std::hint::unreachable_unchecked() };

  1. Also I don't think it really makes it easier; see below.

9 Likes

Hmm, let me rephrase because it still seems like there's an inconsistency here.

  • If I want a non-null handle to a valid instance of a type T I add an ampersand and get &T.

  • If I want a nullable handle to a possibly invalid instance of a type T I add an asterisk and get *T.

Unless T is a function, then we go into weird special case land:

  • If I want a non-null handle to a valid instance of a function that takes T and returns U, I add an asterisk: * fn(T) -> U.

Does that make it clearer why it's inconsistent? It's like the reference and raw pointer cases are swapped, but only for functions. That we can put functions aside and say "function pointers are not raw pointers" is correct in that clearly the design behaves that way, but it still seems like an odd design decision. You can no longer succinctly describe the difference between pointers and references; now you need to distinguish pointer, function pointer, and reference.

Thanks for the clarification. I think the behavior consistent with data would be:

  • &T is a non-null handle to a valid function.
  • *T is a nullable handle to a possibly invalid function.
  • T is the actual data -- in this case literally the function's instructions. An Option<fn()> would directly store them, so not a ZST.

C/C++ don't let you directly use T for functions; they only let you use pointers/references. If you did, you would also have to deal with thorny issues like whether memory is mapped as executable (and carrying that across moves), so you would probably only expose Pin<fn()> or something.

do you really want to be checking for NULLs and/or using unsafe and putting yourself at risk of UB all the time? Closures can be coerced to function pointers if they don't capture

My use case here is just that learning the language is more difficult because it's inconsistent. It's just another special case to remember. I think if functions were consistent with data then closures would coerce to &'static fn(), which would not require unsafe to invoke.

TBH, I think the way Rust does fn is a mistake.

If extern type had existed in the pre-1.0 days, I suspect that instead of fn(A) -> B we'd have had the equivalent of that be &'static fn(A) -> B. That would be nice for dlopen kinds of things too, since it would allow &'a fn too.

But as it is right now, just call fn a "function reference" instead of a "function pointer" and you'll be good to go. After all, they behave like references -- safe to call, can't be null, unsafe to create from bits -- not like pointers -- which are unsafe to call and safe to create from bits.
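A sketch of that asymmetry, with answer and round_trip as illustrative names. It uses the fn pointer to raw pointer cast and std::mem::transmute to go back; treat it as a demonstration of where unsafe shows up, not as a recommended pattern.

```rust
fn answer() -> i32 {
    42
}

// Hypothetical helper: sends a function pointer through a raw pointer and back.
fn round_trip(f: fn() -> i32) -> fn() -> i32 {
    // Function pointer to raw pointer is a plain, safe cast.
    let raw = f as *const ();
    // Raw pointer (or any bits) back to a function pointer needs `unsafe`:
    // the compiler cannot check that the address really is a function
    // with this signature.
    unsafe { std::mem::transmute::<*const (), fn() -> i32>(raw) }
}

fn main() {
    // Creating a raw data pointer from arbitrary bits is safe.
    // Only dereferencing it would need `unsafe`.
    let bogus = 0xdead_usize as *const u8;
    assert!(!bogus.is_null());

    // The recovered function pointer is called from safe code.
    assert_eq!(round_trip(answer)(), 42);
}
```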

10 Likes

Interesting, why would extern have helped? I haven't had to write any FFI code yet so I might be missing context... I've noticed that it looks like extern and unsafe are extra properties that functions can have that function pointers preserve, e.g. if you have unsafe fn foo() {} then a pointer to foo will be of type *unsafe fn(). But I'd expect in a world with function references they're still preserved, &unsafe fn().
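For reference, here is how those qualifiers are spelled on function pointer types in current Rust, as far as I know: the type is written unsafe fn() or extern "C" fn(), with no asterisk. The function names below are made up for the sketch.

```rust
// An unsafe function: callers must uphold some contract (none here, it's a demo).
unsafe fn risky() -> i32 {
    1
}

// A function using the C calling convention.
extern "C" fn c_abi() -> i32 {
    2
}

fn plain() -> i32 {
    3
}

fn call_all() -> (i32, i32, i32) {
    // The qualifiers are part of the pointer type itself.
    let a: unsafe fn() -> i32 = risky;
    let b: extern "C" fn() -> i32 = c_abi;
    let c: fn() -> i32 = plain;

    // Only the `unsafe fn` pointer needs an unsafe block to call.
    (unsafe { a() }, b(), c())
}

fn main() {
    assert_eq!(call_all(), (1, 2, 3));
}
```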

extern type isn't stable yet. The idea is a type that exists, but its size isn't known, even at runtime.

Because if fn was just a ZST, then you could do weird things like dereferencing it to get it "by value", but that wouldn't make sense for a function.

But by making the size unknown, all the things like that would necessarily be prevented.

5 Likes

That's true, but no matter what, there's something special to learn about function pointers... and rust pointer/reference types more generally. I think a bigger bummer (having already learned about fn()) is that you can't go fn() to &dyn Fn(), though maybe that could be added.
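What does work today, as far as I can tell, is borrowing the function pointer itself and letting that coerce, at the cost of a second level of indirection. A minimal sketch with illustrative names (hello, call_dyn, via_pointer):

```rust
fn hello() -> &'static str {
    "hello"
}

// Takes any callable behind a trait-object reference.
fn call_dyn(f: &dyn Fn() -> &'static str) -> &'static str {
    f()
}

fn via_pointer() -> &'static str {
    let pointer: fn() -> &'static str = hello;
    // `&pointer` is a `&fn()`, which coerces to `&dyn Fn()` because function
    // pointers implement `Fn`. The trait object now points at the pointer,
    // not at the function, so calls go through two hops.
    call_dyn(&pointer)
}

fn main() {
    assert_eq!(via_pointer(), "hello");
}
```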


Why I say there's always something special to learn:

  • What we got

    • fn() is like a &'static _ in some ways but isn't actually a &_ in other ways
  • fn() is the instructions

    • fn() is a DST so you need indirection, &fn(), as with &str and &[_]
      • And so &fn() is a wide pointer
      • And so &fn() can't be coerced to &dyn Fn() because fn() is already a DST (though maybe this could be special-cased)
    • If we get unsized locals, fn() is special cased to still not be movable
  • fn() is a ZST

    • You can pass it around and it looks like fn() but you can't call it
    • I.e. only useful with indirection (&fn())

Rust reference/pointer types are more complicated than "these three varieties"

These are all one usize:

  • *const u8
  • &u8
  • fn()

Well, so are these, but with extra indirection you don't want:

  • *const fn()
  • &fn()

But that's still not enough to cover Rust pointers and references, because there's also wide pointers and references. These are all two usize, and what the second one is varies.

  • &str
  • *const str
  • &dyn Fn()
  • *const dyn Fn()
  • and uncountably more if/when custom DSTs land, perhaps of arbitrary size
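The sizes in these lists can be checked with std::mem::size_of. This sketch uses a made-up helper, words, and assumes a target where function pointers are as wide as usize, which is the case on common platforms.

```rust
use std::mem::size_of;

// Hypothetical helper: size of `T` measured in `usize`-sized words.
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // Thin pointers: one word each.
    assert_eq!(words::<*const u8>(), 1);
    assert_eq!(words::<&u8>(), 1);
    assert_eq!(words::<fn()>(), 1);

    // Pointers to a function pointer: still one word, but an extra hop.
    assert_eq!(words::<*const fn()>(), 1);
    assert_eq!(words::<&fn()>(), 1);

    // Wide pointers: address plus a length or a vtable pointer.
    assert_eq!(words::<&str>(), 2);
    assert_eq!(words::<*const str>(), 2);
    assert_eq!(words::<&dyn Fn()>(), 2);
    assert_eq!(words::<*const dyn Fn()>(), 2);
}
```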

And there are special guarantees around niches and enums like Option.

  • Option<&u8> is one usize, NULL corresponds to None
  • Option<NonNull<u8>>, similarly
  • Option<fn()>, similarly

Which is why Option<fn()> is the recommendation for a nullable function pointer.
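A quick sketch of those niche guarantees, using made-up helpers same_size and none_bits. The transmute is only there to peek at the representation of None and assumes function pointers are usize-wide on the target.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical helper: do `T` and `U` occupy the same number of bytes?
fn same_size<T, U>() -> bool {
    size_of::<T>() == size_of::<U>()
}

// Hypothetical helper: the raw bits of `None::<fn()>`.
fn none_bits() -> usize {
    let none: Option<fn()> = None;
    // `None` of an `Option<fn()>` is represented as the null address.
    unsafe { std::mem::transmute::<Option<fn()>, usize>(none) }
}

fn main() {
    // Non-nullable pointer types leave null free for `None`.
    assert!(same_size::<Option<&u8>, &u8>());
    assert!(same_size::<Option<NonNull<u8>>, NonNull<u8>>());
    assert!(same_size::<Option<fn()>, fn()>());

    // Raw pointers can already be null, so `Option` needs extra space.
    assert!(!same_size::<Option<*const u8>, *const u8>());

    assert_eq!(none_bits(), 0);
}
```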

2 Likes

Explain how that would work on AVR, where code and data physically reside in different parts of the chip and are not interchangeable even in theory.

3 Likes

Interesting! I'll have to read up on the plan for how those would work.

1 Like