Why do enums and structs behave like growable datatypes such as vectors and strings, which are stored on the heap?

Let me try to explain my question since I am trying to get my foundations on ownership strong.

Datatypes like i32, u8 and &str are stored on the stack since their sizes are known during compile time.
So we can do deep copy operations like the following without any errors.

    let num1: i32 = 2;
    let _num2 = num1;
    println!("Num1 : {}", num1);
    let slice1 = "Possible";
    let _slice2 = slice1;
    println!("Slice 1  : {}", slice1);

But for datatypes that are stored on heap, I understand that we need to give references.
And I also understand that we can't have 2 or more references, if one is a mutable reference.

So copy operations for something like string should be like:

    let string1 = String::from("Hello");
    let _string2 = &string1;
    println!("string : {}",string1);

Okay, but what I don't understand is why the same rule applies to enums and structs. Aren't they stored on the stack?

Here is an example of struct

    struct Point {
        x: i32,
        y: i32,
    }

    let point1 = Point {
        x: 12,
        y: 23,
    };
    let _point2 = &point1;  // Why do I have to create a & reference here?
    println!("Point 1 : ({},{})", point1.x, point1.y);

My question being: why can't I copy the value the same way I can with other datatypes like i32, &str or u8?
Similarly for an enum:

    #[derive(Debug)] // needed so that {:?} below compiles
    enum Mark {
        Cross,
        Peace,
        Love,
    }

    let marker1 = Mark::Cross;
    let _marker2 = &marker1;  // Why do I have to create a & reference here?
    println!("Marker 1 : {:?}", marker1);

Same question: why can't I copy the value the same way I can with other datatypes like i32, &str or u8?

Because those other types don't implement Copy (I recommend reading the docs behind the link, they describe what it means to have copy semantics in Rust quite well).
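To make that concrete, here is a minimal sketch of the asker's Point and Mark with Copy opted in via derive. The helper copy_point is a hypothetical name added only for illustration; Debug and PartialEq are derived just so the values can be printed and compared.

```rust
// Deriving Clone and Copy opts these types into copy semantics.
// Debug and PartialEq are derived only for printing and comparing.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

#[allow(dead_code)] // Peace and Love are not used in this sketch
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mark {
    Cross,
    Peace,
    Love,
}

// Hypothetical helper: copies a Point by plain assignment and
// returns both the original and the duplicate.
fn copy_point(original: Point) -> (Point, Point) {
    let duplicate = original; // a copy, not a move, because Point is Copy
    (original, duplicate) // `original` is still usable here
}

fn main() {
    let point1 = Point { x: 12, y: 23 };
    let _point2 = point1; // no & needed any more
    println!("Point 1 : ({},{})", point1.x, point1.y);

    let marker1 = Mark::Cross;
    let _marker2 = marker1; // no & needed any more
    println!("Marker 1 : {:?}", marker1);

    let (a, b) = copy_point(point1);
    println!("{:?} {:?}", a, b);
}
```

Without the Copy derive, the two plain assignments would move the values and the later println! calls would fail to compile.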

Thanks for pointing it out. Missed that part.

You are fundamentally confused about a large number of independent concepts.

No, this is false.

Whether something is stored on the stack or heap is completely orthogonal to all of the following:

  • whether it has a compile-time known size
  • whether it's behind indirection, e.g. a reference
  • whether it's Copy (or Clone)
    • (and whether that copy is "deep" or "shallow", which pretty much is a question of definition and a whole other can of worms)

For counter-examples, see:

  1.  let x: [i32; 3] = [1, 2, 3];
      let y: &[i32] = x.as_slice();
      // the referent of y, i.e., `*y`, does not have a size known
      // at compile-time. It is also definitely stored on the stack.

  2.  let x: Box<i32> = Box::new(1337);
      // the referent of x, i.e., `*x`, does have a known size.
      // It is also definitely stored on the heap, as it's in a Box.

  3.  struct X { field: i32 }
      let x = X { field: 0 };
      let y = x;
      x; // error: x was moved

      let heap_x = Box::new(X { field: 0 });
      let heap_y = Box::new(*heap_x);
      heap_x; // error: heap_x was moved

      As you can see, moving stuff to the heap does not make it copiable or non-copiable.

The only grain of truth here is that you have to use indirection to access heap-allocated memory in the first place. This is largely due to a leaky abstraction. All memory must be manipulated by address (because that's how memory is physically built), but when stack-allocating, the compiler manages this fact for you seamlessly, and generates address-based loads and stores when you mention variables by their name.

In contrast, when you heap-allocate, you are using a software construct to have the OS or the runtime (etc.) mark some memory region as "yours", and since that's dynamic (i.e., it's not associated with any particular name at compile time), you must deal with the fact that what you get back is an explicit address/pointer, which you have to dereference.

So heap allocation means that you'll have to dereference at least once, but the converse is not true:

  • It is NOT the case that all pointers point to the heap.
  • It is NOT the case that all stack-allocated values are copiable.
  • It is NOT the case that all stack-allocated values are non-copiable.
  • It is NOT the case that all heap-allocated values are copiable.
  • It is NOT the case that all heap-allocated values are non-copiable.
  • It is NOT the case that all pointers own resources.
  • It is NOT the case that all pointers pointing to the heap own resources.
  • It is NOT the case that no pointers pointing to the stack own resources.

etc.
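A few of these points can be checked directly. The following is a small sketch, with the function name pointer_facts invented for illustration; it shows a pointer to a stack value, a heap-allocated value that is copiable, and a pointer to the heap that owns nothing.

```rust
// Hypothetical demo function: returns the values read through each pointer.
fn pointer_facts() -> (i32, i32, i32, i32) {
    // A pointer that does not point to the heap: `r` refers to a local.
    let local = 7_i32;
    let r: &i32 = &local;

    // A heap-allocated value that is copiable: i32 is Copy, so it can be
    // copied out of the Box twice without moving the Box itself.
    let boxed: Box<i32> = Box::new(1337);
    let first = *boxed;
    let second = *boxed;

    // A pointer to the heap that does not own anything: a shared
    // reference borrowed from the Box (via deref coercion).
    let borrowed: &i32 = &boxed;

    (*r, first, second, *borrowed)
}

fn main() {
    let (from_stack, first, second, via_borrow) = pointer_facts();
    println!("{} {} {} {}", from_stack, first, second, via_borrow);
}
```

The Box is the only owner in this sketch; both references merely borrow, one from the stack and one from the heap.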

I do understand your explanation.
Thanks for sharing.
I was however trying to understand why I couldn't copy enums and structs the way we could with other primitive datatypes. One user has pointed out why.
I had this doubt because the Copy "behavior" for enums and structs was strangely similar to other growable datatypes, and I realized those datatype values were being stored on heap, but not enums and structs.

Most properties of a type (i.e. traits) are opt-in rather than opt-out in Rust. That means that you have to explicitly mention that your type is supposed to be possible to copy, print, default construct, etc. This is the opposite of for example C++, where you have to remember to opt out of all the unwanted properties.
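As a rough illustration of opt-in traits, the sketch below uses a made-up Settings struct. Each capability it uses (printing, default construction, cloning, comparison) is available only because it is listed in the derive attribute.

```rust
// Hypothetical type for illustration. Every capability is requested
// explicitly; removing a name from the list removes that capability.
#[derive(Debug, Default, Clone, PartialEq)]
struct Settings {
    retries: u32,
    verbose: bool,
}

fn main() {
    let defaults = Settings::default(); // requires Default
    let duplicate = defaults.clone(); // requires Clone
    assert!(defaults == duplicate); // requires PartialEq
    println!("{:?}", defaults); // requires Debug
    println!("retries = {}, verbose = {}", duplicate.retries, duplicate.verbose);
}
```

Note that Copy is not in the list, so plain assignment of a Settings value would move it; an explicit clone() is needed to duplicate it.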

The few that are opt-out (Send, Sync, Unpin at least) tend to be desirable for the vast majority of types. Still, even those have been questioned.

It's also more about the semantics of the type. Container types tend to be treated as "items" and you generally want to move them, rather than having automatic/accidental copies of them. Numbers and similar types are more just plain "values" and you don't care as much if the number 5 in one variable is the same "item" as the number 5 in another.

The Copy behavior of types that hold resources is "no".

This also happens to be the default behavior of all user-defined types, even those that could technically be copiable. And that's because it turns out that having to opt in to implicit bitwise copying is vastly more robust than having to remember to opt out when it's wrong.

For example, a Transaction type for a database may not manage any memory-related resource; for what it's worth, it may even be implemented as a connection URL &str to a DB and call the DB every time it wants to do something. Yet, making it Copy by default would have catastrophic consequences (transactions that double-commit, double-rollback, or worse yet, commit and rollback in a self-contradictory manner at the end of the same scope).
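Here is a minimal sketch of that idea. The Transaction type, its methods, and the URL are all hypothetical, and no real database is involved; the methods just return a string describing what would happen. The point is that commit and rollback take self by value, so a transaction can be finished only once.

```rust
// Hypothetical type: holds only a &'static str, so it could technically
// be Copy, but it deliberately derives neither Copy nor Clone.
struct Transaction {
    db_url: &'static str,
}

impl Transaction {
    fn begin(db_url: &'static str) -> Self {
        Transaction { db_url }
    }

    // Taking `self` by value consumes the transaction.
    fn commit(self) -> String {
        format!("COMMIT on {}", self.db_url)
    }

    // Same here: after a rollback the transaction is gone.
    fn rollback(self) -> String {
        format!("ROLLBACK on {}", self.db_url)
    }
}

fn main() {
    let tx = Transaction::begin("db://example");
    println!("{}", tx.commit());
    // tx.rollback(); // would not compile: `tx` was moved by commit()

    let other = Transaction::begin("db://example");
    println!("{}", other.rollback());
}
```

If Transaction were Copy, the commented-out line would compile, and the same transaction could be both committed and rolled back.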

This is similar to how memory-managing collections have to be non-Copy in order to be sound, but it's more of a societal than a technical reason.

Also note that collections managing a heap buffer are not (necessarily) on the heap themselves. If you declare a String and bind it to a local variable, the struct typed String (essentially, a pointer-capacity-length tuple) will itself live on the stack. The character buffer it manages (and points to) will be on the heap, but being on the heap and pointing to the heap are two completely different things.
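The distinction can be observed with a small sketch. The helper name handle_and_buffer_addresses is made up for illustration, and the exact size and addresses printed depend on the platform and standard library version; the layout of String is not something the language guarantees.

```rust
use std::mem::size_of;

// Hypothetical helper: returns (address of the String handle,
// address of the character buffer it points to).
fn handle_and_buffer_addresses(s: &String) -> (usize, usize) {
    (s as *const String as usize, s.as_ptr() as usize)
}

fn main() {
    let s = String::from("Hello");

    // The handle has a fixed size, regardless of how long the text is.
    println!("size of the String handle: {} bytes", size_of::<String>());
    println!("len = {}, capacity = {}", s.len(), s.capacity());

    // The handle lives in this stack frame; the buffer lives elsewhere.
    let (handle, buffer) = handle_and_buffer_addresses(&s);
    println!("handle at {:#x}, buffer at {:#x}", handle, buffer);
}
```

The handle address belongs to the local variable s, while the buffer address is whatever the allocator returned for the characters.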
