Why is Vec implemented unsafely?

It's not null pointers that are the problem. I only raised it because Java has essentially uniformly boxed values, allowing there to be an automatic default value (i.e. not having to make any assumptions about the behaviour/semantics of types). Rust has arbitrary data types with arbitrary representations (even Java has hints of Rust's problem, since the primitive types are restricted), e.g. what should be the default value of Animal?

enum Animal {
    Dog,
    Cat,
    Elephant,
}

There's no pointers to nullify for a value of type Animal, and why should any particular variant get preference (at the language level)?

Yes, inserting an Option everywhere would be very unfortunate, e.g. Vec<Animal> would double in size, and one would have to do more work when manipulating vectors.
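To put a rough number on that cost, here is a minimal sketch. The helper name `storage_cost` is made up for illustration. It compares the bytes needed for `n` bare elements with the bytes needed for `n` elements wrapped in `Option`. The overhead depends on the type: a plain `usize` has no spare bit pattern, so the wrapper needs an extra word, whereas `Option<Box<T>>` can reuse the null pointer and costs nothing extra.

```rust
use std::mem::size_of;

/// Hypothetical helper: bytes needed to store `n` elements of `T`,
/// first bare and then wrapped in `Option`. Returns (bare, wrapped).
pub fn storage_cost<T>(n: usize) -> (usize, usize) {
    (n * size_of::<T>(), n * size_of::<Option<T>>())
}
```

With `storage_cost::<usize>(100)` the wrapped figure is twice the bare one, while `storage_cost::<Box<u8>>(100)` reports the same size for both.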

Yes, you can, but it's done via unsafe code in the standard library (well, in liballoc), so it doesn't seem inherently different to Vec using unsafe. Building abstractions is nice, but in this case using it also adds more assumptions (the initialisation stuff), so unsafe code in Vec has to be even more careful.

Worth noting: in the long-long ago, Vec (well, ~[T]) was implemented with safe code! As I understand it, it was basically Box<[Option<T>]>, as discussed in this thread. Until even more recently, VecDeque (then RingBuf) was also implemented in this fashion!
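For illustration only, a growable array in that style might look roughly like the sketch below. This is not the historical implementation. `SafeVec` and `empty_slots` are invented names, and the growth policy (double, minimum 4) is an assumption. The point is that every slot always holds a valid `Option<T>`, so no `unsafe` appears, at the price of a discriminant per element and a `None` fill on every growth.

```rust
/// Hypothetical growable array written without any `unsafe`.
/// Every slot is always initialised, possibly to `None`.
pub struct SafeVec<T> {
    slots: Box<[Option<T>]>,
    len: usize,
}

/// Allocate `n` slots, all set to `None`.
fn empty_slots<T>(n: usize) -> Box<[Option<T>]> {
    (0..n).map(|_| None).collect()
}

impl<T> SafeVec<T> {
    pub fn new() -> Self {
        SafeVec { slots: empty_slots(0), len: 0 }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn push(&mut self, value: T) {
        if self.len == self.slots.len() {
            // Full: allocate a bigger block and move each element across.
            let new_cap = (self.slots.len() * 2).max(4);
            let mut bigger: Box<[Option<T>]> = empty_slots(new_cap);
            for (dst, src) in bigger.iter_mut().zip(self.slots.iter_mut()) {
                *dst = src.take();
            }
            self.slots = bigger;
        }
        self.slots[self.len] = Some(value);
        self.len += 1;
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // `take` leaves `None` behind, so the slot stays valid.
        self.slots[self.len].take()
    }

    pub fn get(&self, index: usize) -> Option<&T> {
        // Unused slots hold `None`, so they read as "absent" too.
        self.slots.get(index)?.as_ref()
    }
}
```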

Rust, like many languages, does enable basically everything to be written totally safely, but that's generally going to be really slow and/or memory-intensive!

Yes, I now understand the actual problem. This is why placement-new and explicit calling of destructors in C++ are used to implement variable sized container classes.

I was not proposing storing Option<> instances in the array so much as converting nulls of a pointer type to None on access. It doesn't matter though, that doesn't solve the problems with arrays of structures that contain pointers, or even simple "non-defaultable" enums as in your example. I'll concede the point: Vec needs to implement low-level cruft to reserve space for unconstructed objects. I was wrong.
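For reference, the closest Rust analogue I know of to placement-new plus explicit destructor calls is `std::mem::MaybeUninit`. Below is a minimal sketch of a fixed-capacity stack built on it. `RawStack` is a made-up name, and this is a simplification of the idea rather than how Vec itself is written. Slots at index `len` and above are reserved but unconstructed, and the `unsafe` blocks rely on that invariant.

```rust
use std::mem::MaybeUninit;

/// Hypothetical fixed-capacity stack. Invariant: exactly the first
/// `len` slots are initialised; the rest are reserved but unconstructed.
pub struct RawStack<T, const N: usize> {
    slots: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> RawStack<T, N> {
    pub fn new() -> Self {
        RawStack {
            slots: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    /// Hands the value back if the stack is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        // Construct in place, much like placement-new.
        self.slots[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: this slot was initialised by `push`, and `len` now
        // excludes it, so the value is never read or dropped twice.
        Some(unsafe { self.slots[self.len].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for RawStack<T, N> {
    fn drop(&mut self) {
        let len = self.len;
        for slot in &mut self.slots[..len] {
            // SAFETY: exactly the first `len` slots are initialised.
            // This is the explicit destructor call.
            unsafe { slot.assume_init_drop() };
        }
    }
}
```

No default value of `T` is ever needed here, which is exactly what the Option-based approach cannot offer without extra space.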

I still think all of this is unfortunate. While I understand @jpowell's pragmatic position of "just use Vec" even when he doesn't need a resizable array, the detail-oriented (obsessive-compulsive) side of me doesn't want those extra fields in there (pseudo code):

struct Matrix<T> {
    rows: usize,
    cols: usize,
    vec: struct Vec<T> {
        len: usize,
        cap: usize,
        ptr: struct Unique<T> {
            pointer: struct NonZero<*T>(
                // pointer from imp::allocate()
            ),
        },
    },
}

So basically, there are at least four separate usize variables keeping track of how many elements are in memory (two would've been sufficient). I know, I know... it's only 16 wasted bytes on a 64-bit target, and I should suck it up, but what I want is (again, pseudo code):

struct Matrix<T> {
    pimpl: *struct MatrixImpl {
        rows: usize,
        cols: usize,
        data: <contiguous memory>,
    }
}

Anyways, I think what I'm learning from all of this is that the current Rust implementation really does require me to use unsafe code to implement fundamental data structures in a way I like. I guess that's relieving - I was getting tired of fighting with Rust to wrestle it into giving me almost what I want. Now I'll just ignore all the rhetoric about treating it like a safe language and whip out unsafe whenever it gets in my way...

"If you can't do what's right, you can always do what's left."

If you want safe code, you also need to store "rows * cols" for bounds-checking. That would be:

struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Box<[T]>, /* mem; size */
}
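As a rough sketch of how that layout can be used safely (the method names `filled`, `get` and `set` are my own choices, not from any library): a boxed slice is a two-word fat pointer holding the address and the element count, so it drops the separate capacity word that Vec carries. The Vec only appears briefly during construction.

```rust
/// Hypothetical bounds-checked matrix backed by a boxed slice.
pub struct Matrix<T> {
    rows: usize,
    cols: usize,
    data: Box<[T]>, // fat pointer: address plus element count (rows * cols)
}

impl<T: Clone> Matrix<T> {
    /// Build a rows x cols matrix with every cell set to `value`.
    pub fn filled(rows: usize, cols: usize, value: T) -> Self {
        // The Vec is only a construction aid; `into_boxed_slice`
        // discards the capacity field.
        let data = vec![value; rows * cols].into_boxed_slice();
        Matrix { rows, cols, data }
    }
}

impl<T> Matrix<T> {
    /// Row-major lookup; `None` when either index is out of range.
    pub fn get(&self, row: usize, col: usize) -> Option<&T> {
        if row < self.rows && col < self.cols {
            self.data.get(row * self.cols + col)
        } else {
            None
        }
    }

    /// Returns `false` (and stores nothing) when out of range.
    pub fn set(&mut self, row: usize, col: usize, value: T) -> bool {
        if row < self.rows && col < self.cols {
            self.data[row * self.cols + col] = value;
            true
        } else {
            false
        }
    }
}
```

The indexing in `set` is still bounds-checked by the slice, so the explicit row and column checks are there to reject positions like (0, cols) that would otherwise alias a cell in the next row.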

unsafe is a tool to be used, for sure. And it's a tool for abstraction. It gives the language's users almost the same power as its authors, because the users can create their own safe abstractions, safe threading libraries etc.

Comparing Rust's arrays to Java's arrays is unfair: the length of a Rust array is fixed and known at compile time, whereas the length of a Java array is fixed once created but not known at compile time.

The difference between Box<[T]> and Vec<T> is that the latter can be resized to a size only known at runtime, so you cannot implement Vec<T> based on Box<[T]> without unsafe.

Hi there, I'm interested in your learning path. Could you share the structures that you did implement? (Also, could you add references in your OP to the questions you mention?)

I'm also more interested in the unsafe side of things after reading critical pieces like "Criticizing the Rust Language, and Why C/C++ Will Never Die" on unsafe, but at least I did read a response here saying that unsafe is not another language (as that piece claims).

also

Third, maybe it would have been more appropriate to split Vec into two structures. Something like an Array structure that just does allocation of run-time sized arrays, and then Vec which uses Array to safely implement growable and shrinkable arrays (and iterators, stacks, etc.).

It would be nice to have a safe wrapper around the unsafe malloc, so that these basic structures could be implemented "by hand"... How would we implement this if we didn't use unsafe? With Vec only?

I have an extra question... what would happen if you simply deleted the unsafe keyword from the source code of, say, Vec? I guess it would not compile because the compiler thinks it could violate some rule?
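A small experiment along those lines, using a made-up function `first_byte` rather than the real Vec source: dereferencing a raw pointer is one of the operations the compiler only accepts inside an `unsafe` block or function. With the keyword removed, the dereference below is rejected with error E0133 instead of compiling.

```rust
/// Hypothetical example: read the first byte of a slice through a raw pointer.
pub fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    let ptr: *const u8 = bytes.as_ptr();
    // SAFETY: the slice is non-empty, so `ptr` points at a valid byte.
    // Writing `Some(*ptr)` without `unsafe` fails to compile (E0133).
    Some(unsafe { *ptr })
}
```

Creating the raw pointer is allowed in safe code; only the dereference needs the `unsafe` block.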