Why would the following code cause a segmentation fault on my local machine while it passes on the playground?

Rust Playground (rust-lang.org)
If I run it locally with cargo run, it fails with either memory allocation of 3017347338153984 bytes failed or a segmentation fault.

Look at the following snippet:

```rust
let f = {
    let p = Arc::as_ptr(&fs);
    Foo::from_raw(p)
};
```

Arc::as_ptr returns the pointer, but doesn't increase its ref-count. So once the ref-count of the Arc reaches 0, which happens as soon as main ends, you have a dangling pointer.
Using Arc::into_raw instead would keep the allocation alive, because it consumes the Arc without decrementing the ref-count. But since you don't implement Drop on Foo, you'd then have a memory leak. So you'd also need to implement Drop on Foo.
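
A minimal sketch of that fix might look like this (the Inner type and the field names are illustrative; the original Foo definition isn't shown in this thread):

```rust
use std::sync::Arc;

struct Inner {
    value: i32, // stand-in for whatever the Arc actually holds
}

// Hypothetical wrapper: it owns one strong count of the Arc through the
// raw pointer returned by Arc::into_raw.
struct Foo {
    ptr: *const Inner,
}

impl Foo {
    fn from_arc(fs: Arc<Inner>) -> Self {
        // into_raw consumes the Arc without decrementing the ref-count,
        // so the allocation stays alive for as long as this Foo exists.
        Foo { ptr: Arc::into_raw(fs) }
    }
}

impl Drop for Foo {
    fn drop(&mut self) {
        // Reconstruct the Arc so the ref-count is decremented again and
        // the allocation is freed once the last owner is gone.
        unsafe { drop(Arc::from_raw(self.ptr)) };
    }
}
```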


Hi, the Arc is not dropped until main exits, so I don't think that is the reason. And why would it pass every time on the playground? My local environment is WSL Ubuntu 18.

Well, maybe, but I don't get in which part I double-freed, since I didn't consume the Arc and never freed anything manually. And I ran it 20+ times on the playground and it worked every time, while on my local machine it failed almost every time (30+ runs), which is kind of weird.

Indeed. It's not a double free. It's another, similarly bad error.

What I missed is that your Foo::from_raw() function results in shared mutability and a race condition. You take an Arc which is an inherently shared data structure, and then you convert an immutable Arc/immutable reference to a mutable reference. The default Tokio runtime is multi-threaded, so if you spawn enough tasks, it will start using multiple threads, which will race for the same value inside the Arcs. This is Undefined Behavior.
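
To make the hazard concrete, the problematic shape is roughly the following (an illustrative reconstruction, not the original code):

```rust
// Illustrative only: the exact Foo::from_raw is not shown in this thread.
// Casting a shared pointer to `&mut` like this is undefined behavior,
// because other Arc clones (possibly on other threads) can observe and
// mutate the same value concurrently.
unsafe fn from_raw<T>(p: *const T) -> &'static mut T {
    &mut *(p as *mut T)
}
```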

If you switch to the single-threaded runtime (using #[tokio::main(flavor = "current_thread")]), then the code "works". Note though, that converting an immutable reference to a mutable reference is still, unconditionally UB. Therefore this is not a solution, it is a temporary workaround to pinpoint the bug.
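
For reference, switching the runtime flavor is just an attribute change:

```rust
// All tasks run interleaved on a single thread, so they can no longer race
// in parallel; the conversion to &mut is still UB, just masked.
#[tokio::main(flavor = "current_thread")]
async fn main() {
    // ...
}
```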

A real solution would involve informing the compiler of the shared mutability and guarding against multi-threaded race conditions, which is done by wrapping the value in a tokio::sync::Mutex, as in this Playground.
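
A minimal sketch of that shape, assuming the shared state is some State struct (the name is illustrative):

```rust
use std::sync::Arc;
use tokio::sync::Mutex;

#[derive(Default)]
struct State {
    count: i32,
}

#[tokio::main]
async fn main() {
    let shared = Arc::new(Mutex::new(State::default()));

    let handles: Vec<_> = (0..10)
        .map(|_| {
            let shared = Arc::clone(&shared);
            tokio::spawn(async move {
                // The async Mutex yields to the runtime instead of blocking
                // the thread while waiting for the lock.
                shared.lock().await.count += 1;
            })
        })
        .collect();

    for h in handles {
        h.await.unwrap();
    }
    println!("{}", shared.lock().await.count); // 10
}
```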

I have also improved the style of the code (note in particular the use of Option::get_or_insert_with, deriving Default for Foo, and self.waker.take().into_iter().flatten()), and replaced all atomic orderings with SeqCst since I am not by any means confident that your current usage of more relaxed orderings is correct (or warranted, even if it happens to be technically correct).
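
The last of those might be worth spelling out. Assuming the future stores its wakers as Option<Vec<Waker>> (an assumption; the field's type isn't shown here), take() empties the Option and flatten() then iterates the Vec inside it:

```rust
use std::task::Waker;

// Wakes every registered waker exactly once, leaving the slot empty.
fn wake_all(slot: &mut Option<Vec<Waker>>) {
    for w in slot.take().into_iter().flatten() {
        w.wake();
    }
}
```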


In general, pointer-based type punning and similar hacks are a huge red flag, and you shouldn't need unsafe merely to implement a Future. Unless you are performing an extremely important optimization, you should basically never need unsafe except for calling FFI. This is especially true if you are not well-acquainted with memory management and the constraints imposed by the language.


Also note that your use of thread::sleep() is wrong. It blocks the thread, completely defeating the purpose of async. Even if you temporarily manage to make your code not crash using the workaround provided above, it will output 0, since blocking the main thread means that the single-threaded runtime won't have any opportunity to actually poll the submitted tasks. You should swap out thread::sleep() for tokio::time::sleep(…).await.
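
That is, something along these lines:

```rust
use std::time::Duration;

async fn pause() {
    // std::thread::sleep(Duration::from_secs(1)); // blocks the runtime thread
    tokio::time::sleep(Duration::from_secs(1)).await; // yields to the runtime
}
```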


Hi,
Thanks for your patient reply! I understand your concerns about my unsafe usage and about data races on the struct across threads. But my thinking was that the CAS already guarantees the safety of concurrent polling, so the reason I used unsafe was to avoid the cost of a mutex. I know that an Arc<Mutex> around the struct would make the code work, but locking the whole struct makes its flag and wakers fields meaningless, because we would already be holding the lock on every poll.

First, some formal things: creating multiple mutable references to the same thing is always wrong. If you want to use atomics to synchronize access to something, then the correct unsafe way to do that is to wrap the unsafely synchronized fields in an UnsafeCell and only take immutable references to the outer struct. The value inside the UnsafeCell can then be mutably accessed via the UnsafeCell::get method; see the sketch after the next paragraph.

Anyway, although it is true that a CAS can be used in place of a mutex if all you need is try_lock, you aren't using it that way. When you fail to take the lock in the atomic, you still go ahead and access the waker field without synchronization, and that is a data race. You are also accessing the waker field after releasing the lock by setting the atomic back to false, which is another race. Finally, the atomic orderings are wrong: when an atomic is used as a lock, the release operation should use the Release ordering.
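
Putting those points together, a minimal sketch of a correct atomic try-lock around an UnsafeCell might look like this (the struct and field names are illustrative):

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::task::Waker;

struct Shared {
    locked: AtomicBool,
    waker: UnsafeCell<Option<Waker>>,
}

// SAFETY: all access to `waker` is synchronized through `locked` below.
unsafe impl Sync for Shared {}

impl Shared {
    /// Tries to store a waker; returns false if the lock is contended.
    fn try_set_waker(&self, w: Waker) -> bool {
        // Only the thread that wins the CAS may touch the UnsafeCell.
        if self
            .locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
        {
            // SAFETY: we hold the lock, so no other thread accesses `waker`.
            unsafe { *self.waker.get() = Some(w) };
            // Release the lock; `Release` ordering makes the write above
            // visible to the next thread that acquires the lock.
            self.locked.store(false, Ordering::Release);
            true
        } else {
            // Contended: do NOT touch `waker` here.
            false
        }
    }
}
```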


The fact is, it doesn't, because the code crashes (I managed to reproduce various kinds of crashes, including segfaults and use-after-frees detected by the system malloc).

There's no debate to be had about the incorrectness of code that segfaults.


Most likely the segfault is due to the unsynchronized access to the vector, causing its pointer, length, and capacity fields to get corrupted. It could also be due to corruption of the values stored in the vector.


I agree with your conclusion, but what's funny is that it works with a small number of spawns and fails with bigger ones. So what's the best practice for non-blocking async code? I wonder how we can write it; clearly this code is not efficient at all, even compared to the version using a mutex.

Maybe you can tell me what you want the future to do. Then I can show you a correct implementation.

If this works with a small number of spawns, then that's just luck.


The situation is exactly like before: make a non-blocking future over a struct that can be called from multiple threads, where each call modifies the state of the struct.

In this particular case, all you do is increment an integer, which you can do by using the type AtomicI32 and calling fetch_add. Then all of the other fields can be removed.
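
A minimal sketch of that suggestion (the task count and output are illustrative):

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicI32, Ordering};

#[tokio::main]
async fn main() {
    // The shared state shrinks to a single atomic counter: no unsafe
    // code, no lock, and no hand-rolled Future needed.
    let counter = Arc::new(AtomicI32::new(0));

    let handles: Vec<_> = (0..100)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // fetch_add is atomic, so concurrent tasks cannot race.
            tokio::spawn(async move {
                counter.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    for h in handles {
        h.await.unwrap();
    }

    println!("{}", counter.load(Ordering::SeqCst)); // prints 100
}
```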


That is true. You remind me of crossbeam's AtomicCell. Thanks!
