IMO, that should definitely be typed as Option<Option<*ptr>>; "I don't even know" is None (unreachable payload), "I know what it is and it's NULL" is Some(None) (reachable, non-dereferenceable payload), "I know what it is and it's non-NULL" is Some(Some(addr)) (reachable, dereferenceable payload).
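To make the three states concrete, here is a minimal sketch of that nesting. The alias Known and the helpers known_from_raw and describe are names I made up for illustration; nothing here is an existing API.

```rust
/// What we know about a foreign pointer, spelled out as nested Options.
/// (Hypothetical alias, for illustration only.)
pub type Known<T> = Option<Option<*const T>>;

/// We have looked at the pointer, so the outer layer is Some;
/// a null pointer folds into the inner None.
pub fn known_from_raw<T>(ptr: *const T) -> Known<T> {
    Some(if ptr.is_null() { None } else { Some(ptr) })
}

/// Names each of the three states.
pub fn describe<T>(state: Known<T>) -> &'static str {
    match state {
        None => "unknown",
        Some(None) => "known null",
        Some(Some(_)) => "known non-null",
    }
}
```

Note that "known non-null" only means the address is non-zero; whether it is actually valid to dereference is still the caller's promise.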
While match unsafe { raw_ptr.as_ref() } { Some(r) => { valid(); }, None => { invalid(); }, } is an excellent use of the type system to enforce null-checking, I'm still of the opinion that being able to declare potential nullability in the function signature via Option<*ptr> would be a useful pattern.
With as_ref():
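    pub extern fn recv_maybe<T>(ptr: *const T) {
        // as_ref() on a raw pointer is an unsafe fn: it checks for null,
        // but cannot prove that a non-null pointer is valid.
        match unsafe { ptr.as_ref() } {
            Some(r) => { // r is &T
                foo(r);
            },
            None => { handle_null(); },
        }
    }
    pub extern fn recv_firm<T>(ptr: *const T) {
        foo(unsafe { &*ptr });
    }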
Since the function signature doesn't explicitly declare nullability, this reduces the ability to distinguish between functions that expect they may receive null, and functions that expect to only ever be given valid pointers. Semantically, a function that refuses to trust its input and a function that would like to but can't, should look different; as is, this is boilerplate that must go on every FFI function receiving a foreign pointer, even if the incoming pointer is one that the Rust code had previously given out and is known in the abstract design to be valid.
This also means that browsing the signatures alone is not enough to tell which functions expect they may receive null and which do not.
With Option<*ptr>:
    pub extern fn recv_maybe<T>(ptr: Option<*const T>) {
        match ptr {
            Some(r) => { // r is coerced to &T
                foo(r);
            },
            None => { handle_null(); },
        }
    }
    pub extern fn recv_firm<T>(ptr: *const T) {
        foo(unsafe { &*ptr });
    }
The recv_maybe function is used when the caller might give it anything; the recv_firm function establishes as part of its contract that incoming pointers must be non-null and valid, and that the caller's breaking of this contract is a severe logic error. recv_maybe expects that null might happen and has a plan to deal with it; recv_firm has no reason to expect null and has no game plan for handling such a case other than panic.
This makes explicit as part of the signature (like the ?: decoration in TypeScript and, I think, Swift) that some functions are capable of receiving and handling NULL and that some are not, and reduces the amount of boilerplate code that is necessary to handle null checking on functions that should never need it. Furthermore, I'm of the opinion, based solely on one point of anecdata, that seeing the Option in the signature serves to remind readers of the need to check for validity, whereas the lack of Option implies that, since Rust doesn't have null as a valid member of type sets, the pointer can be reasonably assumed to be not null. Deref is still, obviously, unsafe, but doesn't require the full checking machinery of Option.
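For comparison, something close to this split can be sketched with std::ptr::NonNull, assuming a toolchain that has it. The standard library documents Option<NonNull<T>> as having the same size and alignment as a plain pointer, so C can keep passing an ordinary pointer. The body of foo, the i32 payload, and the -1 null result are placeholders of mine, not part of the proposal.

```rust
use std::ptr::NonNull;

// Hypothetical stand-in for the foo() used in the examples above.
fn foo(value: &i32) -> i32 {
    *value + 1
}

/// The signature says null may arrive; on the C side this is a plain pointer.
pub extern "C" fn recv_maybe(ptr: Option<NonNull<i32>>) -> i32 {
    match ptr {
        // Safety: the caller promises that any non-null pointer is valid to read.
        Some(p) => foo(unsafe { p.as_ref() }),
        // Hypothetical null handling: report -1.
        None => -1,
    }
}

/// The signature says null is a contract violation; no check is performed.
pub extern "C" fn recv_firm(ptr: NonNull<i32>) -> i32 {
    // Safety: the caller promises a non-null, valid pointer.
    foo(unsafe { ptr.as_ref() })
}
```

This is not quite the Option<*ptr> spelling, since the "firm" side has to change type as well, but it does put the nullable/non-nullable distinction into the signatures.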
In my personal example, I have a Rust library with functions that can receive maybe-null pointers from FFI callers and react accordingly. One such function then hands back pointers into Rust memory which are known good; sibling functions in the library receive that pointer back from FFI later, and must still go through the motions of checking validity even though, as long as the caller is not broken, the pointer must still be good. We can't prove it to the compiler, but we still know.
In my experience working with libraries that pass control back and forth over boundaries both FFI and not, this occurs reasonably often (libgit2 does it with repository pointers, IIRC) and duplicating the null-checks on incoming pointers is needlessly repetitive.
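A minimal sketch of that hand-out-and-take-back shape, with made-up names (Session, session_new, session_hit, session_free) standing in for my actual library:

```rust
// Hypothetical library state; the type and its field are illustrative only.
pub struct Session {
    hits: u32,
}

/// Hands a pointer into Rust-owned memory out to the foreign caller.
pub extern "C" fn session_new() -> *mut Session {
    Box::into_raw(Box::new(Session { hits: 0 }))
}

/// Sibling function that receives the pointer back. By design it can only be
/// a pointer from session_new, yet the signature cannot say so.
pub extern "C" fn session_hit(ptr: *mut Session) -> u32 {
    // Safety: relies on the caller returning an unfreed pointer from session_new.
    match unsafe { ptr.as_mut() } {
        Some(session) => {
            session.hits += 1;
            session.hits
        }
        // The check that "should never" fire for a well-behaved caller.
        None => 0,
    }
}

/// Takes ownership back and frees the allocation.
pub extern "C" fn session_free(ptr: *mut Session) {
    if !ptr.is_null() {
        // Safety: ptr came from Box::into_raw in session_new and is freed once.
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

Every sibling like session_hit repeats the same null branch, and nothing in the signatures separates it from a function that genuinely expects null.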
From a safety perspective, I fully recognize that blindly doing unsafe { &*ptr } or ptr.as_ref().unwrap() is absolutely a bad idea, since it bypasses null checks and might as well be C's bare *ptr.
Code which wants to be perpetually paranoid (a state I endorse and strive to maintain) should definitely null-check everything, no matter how it elects to do this. My main point here is that, at present, FFI signatures have no means of displaying the difference between taking *T and *T | null, and so we're still stuck without a typesafe way of denoting which foreign pointers are to be treated with utmost caution and which are, while still dangerous, acceptable to treat somewhat more casually (with the understanding that failure will, of course, bring catastrophe).
I'm certainly not proposing we freeze our ABI or memory model, or leak implementation details of Rust types across FFI. Pointers (and I guess Booleans) are the only types that have their null case as an internal variant rather than an external discriminant, and have this case enforced by hardware. Since FFI boundaries require awareness of the memory model and common representation, I'm certainly amenable to the argument that special-casing Option like this can lead to problems with people writing FFI boundaries that attempt to take other types Optionally and running into problems because C can't represent that, or muddying the waters about what is/isn't #[repr(C)].
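The internal-variant versus external-discriminant distinction can be checked with size_of. This is only a sketch, and none_fits_inside is a name I invented for it:

```rust
use std::mem::size_of;

/// Returns true when Option<T> is no larger than T, meaning the None case
/// is stored inside one of T's own unused bit patterns.
pub fn none_fits_inside<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}
```

As far as I know, none_fits_inside::<&u8>() and none_fits_inside::<std::ptr::NonNull<u8>>() hold, because null is free to act as None, while none_fits_inside::<u32>() does not, since every bit pattern is already a valid u32. It also does not hold for *const u8 today: a raw pointer may legitimately be null, so Option<*const T> needs a separate discriminant, which is exactly the special-casing in question.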
I still feel that this is a worthwhile use of Option and an acceptable instance of fixed representation that shouldn't restrict how Option behaves anywhere else, but eh.
Hopefully we'll hear from other folks, especially those who've done far more FFI work than I have.