Panic when I clone the Arc pointer

I'm sorry I can't provide a simple code sample because the project is so big, but the error reproduces inside a function.
Sometimes it panics at the line let old_size = self.inner().strong.fetch_add(1, Relaxed); but mostly it just calls abort().

I printed some debug messages before cloning the pointer. The code clones the Arc pointer 39 times, but the value of strong confuses me.

Why does it abort when I clone the Arc pointer?
Any help would be greatly appreciated.

old_size > MAX_REFCOUNT means one of the cases below happened.

  1. You're rapidly leaking the newly cloned Arc handles. MAX_REFCOUNT is isize::MAX, which is 2^63 - 1 on 64-bit machines, so even leaking one handle every nanosecond would take almost three centuries. No Rust program has been running that long, but well, it's possible in theory.

  2. There is a bug in the Arc implementation that modifies the refcount incorrectly. Software has bugs, so we should consider it.

  3. Some unrelated unsafe code (such as C code) has a bug that overwrites a memory location it must not touch. Yes, this is one of the things that can happen under UB. You may need to revisit all of the C code and the unsafe {} blocks to find the actual bug.
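To put a number on case 1, here is a rough back-of-the-envelope sketch. It assumes a 64-bit target, where the limit is isize::MAX (the code uses i64::MAX to stay portable), and a constant leak rate. The helper name years_to_overflow is made up for this illustration.

```rust
/// Years needed to push the strong count past the limit at a given
/// clone-and-leak rate. Uses i64::MAX, which equals isize::MAX on a
/// 64-bit target (an assumption of this sketch).
fn years_to_overflow(leaks_per_second: f64) -> f64 {
    const SECONDS_PER_YEAR: f64 = 365.25 * 24.0 * 60.0 * 60.0;
    (i64::MAX as f64) / leaks_per_second / SECONDS_PER_YEAR
}

fn main() {
    // One leaked clone per nanosecond is 1e9 leaks per second.
    println!("{:.0} years", years_to_overflow(1e9));
}
```

At one leak per nanosecond this comes out to roughly 292 years, so an abort within minutes or hours points away from a genuine leak.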

Thanks for your reply. Actually, I don't use C code or unsafe blocks inside the method. How can I check for the leaking problem?

If you had a good machine with a dozen CPU cores and ran all of them at 100% on a program that does nothing but leak clones of a single Arc, it would take decades before it aborted. If it aborts within a day, leaking is not your problem.
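One way to watch for a leak is to log Arc::strong_count around the suspicious code. The sketch below leaks clones on purpose with mem::forget so you can see what a leak looks like in the counter. The helper leak_clones and the count of 39 are only illustrative, not taken from the original project.

```rust
use std::mem;
use std::sync::Arc;

/// Clones `shared` `n` times and deliberately leaks every clone.
fn leak_clones(shared: &Arc<String>, n: usize) {
    for _ in 0..n {
        // mem::forget skips Drop, so the strong count is never decremented.
        mem::forget(Arc::clone(shared));
    }
}

fn main() {
    let shared = Arc::new(String::from("payload"));
    println!("before: {}", Arc::strong_count(&shared));
    leak_clones(&shared, 39);
    println!("after:  {}", Arc::strong_count(&shared));
}
```

A real leak makes the count grow by one per leaked clone. A count that jumps to a huge or nonsensical value after a few dozen clones suggests the counter was overwritten rather than incremented.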

If you don't have any unsafe code yourself, your dependencies may have some. Check your dependencies, find the ones that contain unsafe code but are not extensively tested, and debug them if you can't replace them.

Oh, I missed this part. Memory errors have global scope, which means every piece of unsafe code ever called within the process is a suspect.

  1. Isn't possible on a 64-bit system (very possible on 32-bit, though).
  2. Is extremely unlikely. It could be a compiler bug, but that is still very unlikely.

Are you using a nightly compiler with features (some of those have known bugs)?
Are you compiling for some uncommon platform (like embedded)?

It's most likely 3.

So it's very likely a bug in unsafe code, or in an unsafe block that calls non-Rust code (it doesn't have to be C). It could live in seemingly unrelated parts of your code that run before this clone. The bug could also be in an unsafe block of a dependency you use. (The unsafe code in the standard library gets a lot of testing and validation, so that's the last unsafe code I would dig into.) My guess is that somewhere a pointer to the Arc itself, rather than to what the Arc holds, is being passed to external code, which then corrupts the Arc's counters. Another possibility is a use after free. Both of those should only be possible within an unsafe block.

Thanks for your reply.
I'm using a 2015 MacBook Pro (64-bit system, Intel CPU), and I only use the stable Rust compiler (rustc 1.53.0 (53cb7b09b 2021-06-17)).

Thanks, I will check my dependencies.

Try running your program under Valgrind. It may not catch the invalid access to the Arc directly, since the counter still lies within an allocated, program-accessible area, but if something sprays memory badly and hits other addresses too, it could be caught.

This error might also be caused by unsafe/FFI code decrementing the count manually, or by a double free from unbalanced Arc::from_raw calls.
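For reference, here is a minimal sketch of what balanced Arc::into_raw / Arc::from_raw usage looks like, so unbalanced uses in a dependency are easier to spot. The helpers lend_raw and reclaim_raw are hypothetical names invented for this example, not part of any library.

```rust
use std::sync::Arc;

/// Hands out a raw pointer while keeping the caller's Arc alive.
fn lend_raw(shared: &Arc<u32>) -> *const u32 {
    // Cloning first gives the raw pointer its own +1 on the strong count.
    Arc::into_raw(Arc::clone(shared))
}

/// Takes the raw pointer back exactly once, releasing that +1 on drop.
///
/// # Safety
/// `ptr` must come from `lend_raw` and must not have been reclaimed already.
unsafe fn reclaim_raw(ptr: *const u32) -> Arc<u32> {
    // Each into_raw must be matched by exactly one from_raw.
    unsafe { Arc::from_raw(ptr) }
}

fn main() {
    let shared = Arc::new(7u32);
    let ptr = lend_raw(&shared);
    assert_eq!(Arc::strong_count(&shared), 2);

    let back = unsafe { reclaim_raw(ptr) };
    drop(back);
    assert_eq!(Arc::strong_count(&shared), 1);
    // Calling reclaim_raw(ptr) a second time would be the unbalanced case:
    // undefined behavior that can corrupt the counter or cause a double free.
}
```

When auditing, it may help to search dependencies for from_raw, decrement_strong_count, and similar calls, and check that each one is paired with exactly one matching into_raw or increment.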
