Weak<T> and a dangling pointer

Hi there,
Thinking about Weak<T>, I'm wondering about this. If Weak creates a pointer but does not take ownership of the value pointed to, and therefore does not prevent its deallocation, what mechanism prevents a dangling pointer if the value is, in fact, deallocated? I thought that the upgrade mechanism needed to access the value was the protection... but is it so?
Thx, cheers

When there are weak references, dropping the last Arc only runs Drop for the value it carries. It does not free the memory backing the Arc and its reference counts until all Weak refs are gone too.

So Weak has a pointer to still-allocated memory with still-valid reference counts, but the rest of the data in it is de-initialized.

This is similar to setting Option to None. The memory exists, but the data is gone.
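A minimal sketch of that behaviour, using only std::sync::Arc and Weak. The Noisy type, the DROPS counter and the read helper are made up for illustration. The destructor runs as soon as the last Arc goes away, and from then on upgrade() returns None, so the de-initialized data cannot be reached through the Weak:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Weak};

// Counts how often the destructor of `Noisy` has run (illustration only).
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy(u32);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Reads the value through the Weak, if the data is still alive.
fn read(weak: &Weak<Noisy>) -> Option<u32> {
    // upgrade() hands out a temporary Arc, or None once the data is gone.
    weak.upgrade().map(|arc| arc.0)
}

fn main() {
    let arc = Arc::new(Noisy(42));
    let weak = Arc::downgrade(&arc);
    println!("{:?}", read(&weak)); // prints Some(42)

    // Dropping the last Arc runs the destructor right away,
    // even though the Weak still exists.
    drop(arc);
    println!("destructor runs so far: {}", DROPS.load(Ordering::SeqCst));
    println!("{:?}", read(&weak)); // prints None
}
```

The only safe way from a Weak to the data goes through upgrade(), which is why the leftover bytes are never observed.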

Thx for your answer...So, I have a valid pointer that points to None? Did I understand correctly (more or less...)?

Hmm, somehow.

Before, you have:

Arc  ----> allocated_memory(arc_counter=1, weak_counter=1, data=something)
            ^
            |
Weak -------' 

When Arc gets dropped, you have:

Weak --> allocated_memory(arc_counter=0, weak_counter=1, data=useless_bytes)

Only when the last pointer is gone, regardless of whether it is a Weak or an Arc, does the memory get deallocated. But the data was already de-initialized when the last Arc disappeared.
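The counters in the diagrams can be observed through the public API. This is a small sketch; the snapshot helper is a made-up name, and the numbers are what Arc::strong_count, Arc::weak_count and Weak::strong_count report, whatever the internal representation looks like:

```rust
use std::sync::Arc;

// Returns (arc_counter, weak_counter) as reported for a live Arc.
fn snapshot<T>(arc: &Arc<T>) -> (usize, usize) {
    (Arc::strong_count(arc), Arc::weak_count(arc))
}

fn main() {
    let arc = Arc::new(String::from("something"));
    let weak = Arc::downgrade(&arc);

    // Before: one Arc and one Weak point at the same allocation.
    println!("{:?}", snapshot(&arc)); // prints (1, 1)

    // After the last Arc is dropped, only the Weak is left.
    drop(arc);
    println!("{}", weak.strong_count()); // prints 0
    println!("{}", weak.upgrade().is_none()); // prints true
}
```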


Interesting... but isn't that very close to a memory leak?

No, because when the Weak is dropped in the example, the allocated memory is freed.

It's not a memory leak. The typical way to create a memory leak with Arc is to hold an Arc inside its own object (directly or through other objects), creating a circular reference. This cannot happen with Weak: the object itself gets destroyed once all Arcs are gone, so if the object contains a Weak to itself, it will not keep itself alive. Weak references alone cannot form a cycle that keeps data alive.
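To make the difference concrete, here is a sketch with a hypothetical Node type that can point back at itself either through an Arc or through a Weak. The DROPPED counter and all names are made up for illustration, and the Arc variant leaks one small node on purpose:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, Weak};

// Counts destructor runs of `Node` (illustration only).
static DROPPED: AtomicUsize = AtomicUsize::new(0);

struct Node {
    strong_link: Mutex<Option<Arc<Node>>>,
    weak_link: Mutex<Weak<Node>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

fn new_node() -> Arc<Node> {
    Arc::new(Node {
        strong_link: Mutex::new(None),
        weak_link: Mutex::new(Weak::new()),
    })
}

// The node holds an Arc to itself; returns how many destructors ran.
fn self_cycle_with_arc() -> usize {
    let before = DROPPED.load(Ordering::SeqCst);
    let node = new_node();
    *node.strong_link.lock().unwrap() = Some(Arc::clone(&node));
    drop(node); // arc_counter goes from 2 to 1, never to 0: leaked
    DROPPED.load(Ordering::SeqCst) - before
}

// The node holds a Weak to itself; returns how many destructors ran.
fn self_cycle_with_weak() -> usize {
    let before = DROPPED.load(Ordering::SeqCst);
    let node = new_node();
    *node.weak_link.lock().unwrap() = Arc::downgrade(&node);
    drop(node); // arc_counter reaches 0, so the node is destroyed
    DROPPED.load(Ordering::SeqCst) - before
}

fn main() {
    println!("Arc to itself:  {} destructor runs", self_cycle_with_arc());
    println!("Weak to itself: {} destructor runs", self_cycle_with_weak());
}
```

In the Weak case the destructor of the node also drops the inner Weak, which releases the last reference to the allocation, so nothing stays behind.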

Ok, it's a bit strange to me that Weak can still point to a zone of useless bytes which, in fact, seems to me to be wasted memory.

It's not really wasted memory in that sense. In most types, anything larger than a couple of bytes lives in its own heap allocation anyway (behind a Box, Vec or String field, for example), and that allocation does get freed once all Arcs are gone, because the destructor runs. Just the useless pointer is left over.

If you want to understand the reasoning why, just try to develop a better system. You will quickly realize that it is better to have a couple of 'wasted' bytes in an allocation instead of having to make a second one for those. Especially if you then have to store the address of this second allocation in the first one, adding another 8 bytes (in a 64-bit system) of useless memory to the first one you could have used for data instead.

How would you detect when the Weak no longer points to a valid object then?

@Rejkland It is a little treacherous, I have to admit that. I think the reason is that if people only want to use Arc as the reference-counted version of Box, then Arc should keep the reference counts and the data together, for performance reasons. Then it is impossible to free the memory whilst Weaks still exist, otherwise the Weaks would have no way to detect whether the object is still valid.

That means, if you want an object to be physically dropped when all Arcs are gone, you have to make it an Arc<Box<T>> instead of an Arc<T>. This comes with the performance penalty of an additional dereference every time, though.

The good thing is that the current Arc API is made in a way that lets the developer decide which advantage they want. If Arc always freed the memory of the data when only Weaks are left over, it would have to do the second allocation internally, and the developer would have no way to disable it and choose performance instead.

As usual I have to think about it a bit... thanks to everyone for the discussion

Both Weak and Arc point to the same memory block, which contains three items:

  • A counter for Arc-references
  • A counter for Weak-references
  • The data itself

If the data itself is a large object and you know that there may be long-lived weak references to this 'dead' object, you can add one more level of indirection via a Box: Arc<Box<SomeData>>. Then the layout becomes:

Weak  ------+
            |
            v
Arc  ---> memory_block(arc_counter=1, weak_counter=1, pointer_to_data) 
                                                             |
                                                             v
                                                      memory_block(data)

A small demo program for the different situations:

The relevant main:

use std::sync::Arc;

// LargeStruct and print_mem_usage are defined elsewhere in the demo program.
fn main() {
    println!("Dummy to trigger some internal allocations:");
    print_mem_usage("Started");

    println!();
    println!("Normal Arc & Weak:");

    let arc_1 = Arc::new(LargeStruct::new());
    let weak_1 = Arc::downgrade(&arc_1);
    print_mem_usage("- Arc<LargeStruct> and Weak<LargeStruct> created");

    drop(arc_1);
    print_mem_usage("- Arc<LargeStruct> dropped");

    drop(weak_1);
    print_mem_usage("- Weak<LargeStruct> dropped");

    println!();
    println!("Indirect Arc & Weak via Box:");

    let arc_2 = Arc::new(Box::new(LargeStruct::new()));
    let weak_2 = Arc::downgrade(&arc_2);
    print_mem_usage("- Arc<Box<LargeStruct>> and Weak<Box<LargeStruct>> created");

    drop(arc_2);
    print_mem_usage("- Arc<Box<LargeStruct>> dropped");

    drop(weak_2);
    print_mem_usage("- Weak<Box<LargeStruct>> dropped");
}

Running on my system, it prints:

Dummy to trigger some internal allocations:
Started / Memory Usage: 1080 bytes

Normal Arc & Weak:
- LargeStruct created
- Arc<LargeStruct> and Weak<LargeStruct> created / Memory Usage: 1001096 bytes
- LargeStruct dropped
- Arc<LargeStruct> dropped / Memory Usage: 1001096 bytes
- Weak<LargeStruct> dropped / Memory Usage: 1080 bytes

Indirect Arc & Weak via Box:
- LargeStruct created
- Arc<Box<LargeStruct>> and Weak<Box<LargeStruct>> created / Memory Usage: 1001104 bytes
- LargeStruct dropped
- Arc<Box<LargeStruct>> dropped / Memory Usage: 1104 bytes
- Weak<Box<LargeStruct>> dropped / Memory Usage: 1080 bytes
  • The LargeStruct has size 1_000_000 bytes
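The main shown earlier relies on LargeStruct and print_mem_usage, which the post does not include. One way they could be written is sketched below. This is an assumption, not the original code: it tracks live heap bytes with a counting wrapper around the system allocator, and the names CountingAlloc, ALLOCATED and allocated_bytes are invented here. The small main at the end only exercises the helpers and would be replaced by the main above.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Bytes currently allocated through the global allocator.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

// Wraps the system allocator and keeps a running total (hypothetical helper).
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn allocated_bytes() -> usize {
    ALLOCATED.load(Ordering::Relaxed)
}

fn print_mem_usage(label: &str) {
    println!("{} / Memory Usage: {} bytes", label, allocated_bytes());
}

// One million bytes stored inline, so the size shows up in the Arc block.
struct LargeStruct {
    _bytes: [u8; 1_000_000],
}

impl LargeStruct {
    fn new() -> Self {
        println!("- LargeStruct created");
        LargeStruct { _bytes: [0; 1_000_000] }
    }
}

impl Drop for LargeStruct {
    fn drop(&mut self) {
        println!("- LargeStruct dropped");
    }
}

fn main() {
    print_mem_usage("Started");
    let boxed = Box::new(LargeStruct::new());
    print_mem_usage("- Box<LargeStruct> created");
    drop(boxed);
    print_mem_usage("- Box<LargeStruct> dropped");
}
```

The absolute numbers depend on the platform and on what the runtime allocates internally, so only the differences between two measurements are meaningful.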

In the first example, Arc<LargeStruct>

  • An Arc<LargeStruct> allocates 1_000_016 bytes: two 8-byte counters (on a 64-bit system) for the Arc and Weak counts, plus 1_000_000 bytes for LargeStruct.
  • Although the destructor runs when the Arc gets dropped, the memory cannot be freed because there is still the Weak reference.
  • Only when all references, Arc and Weak, are gone does the memory get reclaimed.

In the second example, Arc<Box<LargeStruct>>, the behavior is different:

  • In total, 1_000_024 bytes are allocated: one memory block consisting of the two counters and a pointer to a second memory block, which holds the 1_000_000 bytes for LargeStruct.
  • When the Arc gets dropped, 1_000_000 bytes are freed, because the Box<LargeStruct> gets dropped, which frees its memory. But 24 bytes are left: this is the memory block containing the two counters and a now useless and unused pointer.
  • When the Weak reference gets dropped, the remaining 24 bytes are freed, too.
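The byte counts above can be checked with size_of. The Block struct below is only an illustrative stand-in for the memory block shared by Arc and Weak (two counters followed by the data), not the standard library's actual definition, and LargeStruct is redefined here so the snippet stands on its own:

```rust
use std::mem::size_of;
use std::sync::atomic::AtomicUsize;

// Illustrative stand-in for the block that Arc and Weak point to.
#[allow(dead_code)]
#[repr(C)]
struct Block<T> {
    arc_counter: AtomicUsize,
    weak_counter: AtomicUsize,
    data: T,
}

// Same size as the LargeStruct in the demo: 1_000_000 bytes stored inline.
#[allow(dead_code)]
struct LargeStruct {
    bytes: [u8; 1_000_000],
}

fn main() {
    // Data stored inline: counters plus the full struct.
    // On a 64-bit target this prints 1000016.
    println!("{}", size_of::<Block<LargeStruct>>());

    // Data behind a Box: counters plus one pointer.
    // On a 64-bit target this prints 24.
    println!("{}", size_of::<Block<Box<LargeStruct>>>());
}
```

With the Box variant, only the small block has to outlive the last Arc; the large allocation is owned by the Box and goes away with it.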

What do you mean by "physically dropped"? There is nothing physical in programming. Do you mean to also free the memory containing the T? In that case this will work, though it really only helps by reducing the memory usage by a constant factor. If you have memory issues due to Weaks floating around, you will still have memory issues with this method.

Yah, I meant the memory containing T. Sorry for the wording.