Recently on HackerNews I read a thread where the user Animats listed the most common situations where you want to use unsafe code in Rust. One of them is partially initialized arrays. Indeed, in my own Rust code I would sometimes like a safe way to handle this (via a type-system improvement).
I read an article that is partially about this topic, "Declaring and Checking Non-null Types in an Object-Oriented Language" by Manuel Fahndrich and K. Rustan M. Leino (2003), especially section 4, "ARRAY TYPES":
The handling of arrays in that paper is not great (I remember reading a later article with better array handling, but I can't find it now), but it offers ideas about tracking the cooked/uncooked parts of an array. Simplifying the problem: there needs to be a way to state that the first N items of an array are already initialized, and some way to bump that N forward. This could be used both in library code (to reduce the amount of unsafe code in Vec<T>) and in user-level code.
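As a sketch of what such support could look like in today's Rust, the unsafe parts can be wrapped once behind a runtime-checked "initialized prefix" counter. `PartialArray`, `push`, and `finish` are hypothetical names invented for this sketch; the point of the proposal is that a type-system version would track the counter statically instead of checking it at runtime:

```rust
use std::mem::MaybeUninit;

/// Hypothetical wrapper: an array whose first `len` elements are initialized
/// ("cooked") and whose remaining elements are not ("uncooked").
/// Invariant: `buf[..len]` is initialized.
struct PartialArray<T, const N: usize> {
    buf: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> PartialArray<T, N> {
    fn new() -> Self {
        PartialArray {
            // An array of uninitialized slots; no value of type T exists yet.
            buf: [const { MaybeUninit::uninit() }; N],
            len: 0,
        }
    }

    /// Bump the initialized prefix forward by one element.
    fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.buf[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    /// Once the whole array is cooked, extract it.
    fn finish(self) -> Result<[T; N], Self> {
        if self.len == N {
            // SAFETY: all N slots were written via push().
            Ok(unsafe { (&self.buf as *const _ as *const [T; N]).read() })
        } else {
            Err(self)
        }
    }
}
```

Note that this sketch leaks the already-written elements if a partially filled `PartialArray` is dropped (it has no `Drop` impl); a real library version would need one.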
The design of the D language takes something from those articles for immutable class fields (the idea of cooked/uncooked fields):
class Foo {
    immutable int x;
    this() {
        // Leaving x unassigned:  Error: constructor temp.Foo.this missing initializer for immutable field x
        // Assigning x twice:     Error: immutable field 'x' initialized multiple times
        this.x = 10; // OK: exactly one assignment inside the constructor
    }
    void bar() {
        x = 1; // Error: cannot modify immutable expression this.x
    }
}
void main() {}
So: is the need for unsafe Rust code to handle partially initialized arrays common enough that it's worth adding features to the Rust type system so that most of it can be written in safe code?
From my personal experience, the need to iteratively initialize arrays comes up pretty often in scientific computing and graphics. It is pretty clear to me that we should have a better story here than "just use mem::uninitialized() and hope your array elements only contain floats" or "just Default-initialize and hope the compiler will optimize the redundant writes out". But I'm not sure what exactly that story should be.
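For concreteness, here is what the unsafe status quo looks like with MaybeUninit (the modern replacement for the deprecated mem::uninitialized()); `squares` is a made-up example function:

```rust
use std::mem::MaybeUninit;

// The unsafe status quo: build a float array element by element,
// left to right, starting from uninitialized storage.
fn squares() -> [f64; 8] {
    let mut buf: [MaybeUninit<f64>; 8] = [const { MaybeUninit::uninit() }; 8];
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write((i * i) as f64);
    }
    // SAFETY: every element was written by the loop above. Nothing in the
    // type system checks this claim, which is exactly the gap under discussion.
    unsafe { (&buf as *const _ as *const [f64; 8]).read() }
}
```

For simple element-wise cases like this one, the safe `std::array::from_fn` now covers the pattern, but it does not help when initialization is stateful or fallible.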
One possibility is, as you say, to encode in the type system the information that the first N items of an array are initialized. This will naturally handle the "left-to-right" initialization pattern that is commonplace in sequential code, and make e.g. porting numerical code from Fortran or C to Rust easier.
On the other hand, we could then effectively be encouraging people to use explicit array indexing (or a similarly low-level imperative operation) for such simple left-to-right initialization, which is a bit of an anti-pattern in Rust: array accesses are bounds-checked, and compilers are not 100% reliable at eliding those expensive checks.
Taking this into account, an alternative approach (which would require const generics to implement) would be to extend the existing Iterator::collect() interface to work for arrays. I can see three possible tracks for this:
- Introduce a ConstSizeIterator whose size N is known at compile-time. Adjust the FromIterator machinery so that this iterator can be collect()-ed into an array of size N with guaranteed success.
- Extend ExactSizeIterator so that one can try to collect() it into an array of size N, and that will fail at the start if the length of the iterator is not exactly N.
- Extend Iterator so that one can also try to collect() it into an array of size N. This is more difficult to do efficiently and correctly than with ExactSizeIterator, because there are more None checks involved (one per iteration) and one must be able to drop previously inserted elements if iteration ultimately fails.
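The third track can be sketched as a free function today. `try_collect_array` is a hypothetical name, not an existing API; it shows the per-iteration None check and the drop-on-failure bookkeeping that an extended collect() would have to perform:

```rust
use std::mem::MaybeUninit;

/// Try to collect the first N items of an arbitrary iterator into [T; N].
/// Returns None if the iterator yields fewer than N items; in that case
/// the elements written so far are dropped.
fn try_collect_array<T, const N: usize>(mut it: impl Iterator<Item = T>) -> Option<[T; N]> {
    let mut buf: [MaybeUninit<T>; N] = [const { MaybeUninit::uninit() }; N];
    for i in 0..N {
        match it.next() {
            Some(v) => {
                buf[i].write(v);
            }
            None => {
                // Iterator too short: drop the i elements already written.
                for slot in &mut buf[..i] {
                    // SAFETY: buf[..i] was initialized by previous iterations.
                    unsafe { slot.assume_init_drop() };
                }
                return None;
            }
        }
    }
    // SAFETY: all N slots were written by the loop above.
    Some(unsafe { (&buf as *const _ as *const [T; N]).read() })
}
```

The point of the sketch is that the library can hide all of the unsafety behind one audited function, but the caller still pays one branch per element, unlike the ConstSizeIterator track.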
A separate interesting issue altogether is initialization that does not go monotonically from left to right. This can happen, for example, when array elements are computed in parallel in multiple threads using a domain decomposition method.
In this case, neither the "first N array elements are initialized" model nor the "extended collect()" model is sufficient. I'm not sure which abstraction could perform well at this task, but I think it might also be an interesting one to handle.
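To illustrate why this case is different: the safe workaround today is to Default-initialize and then overwrite disjoint chunks in parallel, paying for a zeroing pass that the type system cannot prove unnecessary. This is a minimal sketch using scoped threads; `parallel_fill` is a made-up helper name:

```rust
use std::thread;

// Domain decomposition: each scoped thread owns one disjoint chunk and
// overwrites it. The vec![0.0; n] line is the redundant initialization
// that a "partially initialized" abstraction would let us skip.
fn parallel_fill(n: usize, chunks: usize) -> Vec<f64> {
    let mut data = vec![0.0f64; n];
    let chunk_len = n.div_ceil(chunks);
    thread::scope(|s| {
        for (c, chunk) in data.chunks_mut(chunk_len).enumerate() {
            s.spawn(move || {
                for (j, x) in chunk.iter_mut().enumerate() {
                    // Each element is written exactly once, by exactly one thread.
                    *x = (c * chunk_len + j) as f64;
                }
            });
        }
    });
    data
}
```

Note that the "first N initialized" model cannot describe the intermediate states here: after the first thread finishes, the initialized elements form a hole-free chunk in the middle, not a prefix.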