I can use on_event to register an event handler, and invoke to trigger all registered event handlers for a given event.
However, this API doesn't allow me to unregister event handlers. What's the idiomatic way to do it?
Return some sort of handle which allows removing the event handler?
Store a Weak<dyn _> instead of a Box<dyn _> (and remove the Weak's with a strong_count == 0 during invoke)?
Any other way?
P.S.: I also considered using channels and async tasks to handle the events. In that case, I could simply drop the receiver of the channel if I want to "unregister" the event handling mechanism. But in my case I require immediate execution, so that's not an option for me.
Related question: Can I identify a Box by its address of its contents, and how to get that address? Rc provides the Rc::as_ptr method, Arc provides the Arc::as_ptr method, but I don't see any method like that in Box. I wonder if the following works:
#[derive(Clone, Debug)]
struct S(i32);
fn main() {
let x = Box::new(S(1));
let mut vec = vec![];
let x_identity = &*x as *const _; // any nicer way to write this?
vec.push(x);
let y = vec.pop().unwrap();
let y_identity = &*y as *const _;
assert_eq!(x_identity, y_identity); // does this always hold?
}
Normally, you can't use a plain Vec<T> as an arena and use its index for access later on because removing an object will mess up the indices for all elements after it. To get around the removal-messes-up-indices problem, you can use Vec<Option<T>> where removing a value just replaces it with a None. However, say I had the index for an item, then that item was removed and a future allocation inserted a new value into the "hole", my index would be referring to a different value and I'd have a bad time.
The generational arena works by storing a "generation" number as well as the Option<T> hole (i.e. Vec<(usize, Option<T>)>). You'll increment the generation every time you insert a value into a slot, and when doing a lookup you must pass in both the index and the generation number. If the caller passes in a valid index but the generation number doesn't match, we know they've got the index for an object that has since been removed.
It should be fine to compare the *const T pointer that comes from deref as long as your Box<T> isn't deallocated between the time it is created and when you do the comparison.
However, if there is a chance you'll be hanging onto a "removed handle" after the handler is unregistered, malloc could choose to reuse the old handler's memory (on the heap) for a different handler. If that happened, the user could try to unregister a handler without knowing it has already been unregistered, and accidentally unregister some other random handler.
It's probably better to use Rc<T> for storing the handler and store a Weak<T> in the thing used for unregistering. That way the Weak<T> will prevent the heap memory holding the T from being freed and you can keep doing pointer comparisons without worrying about the allocator reusing your memory for a different handle.
Yes, I need to be sure the Box is still allocated (otherwise the address can have been reused).
I intend to make the unregister method consume the handler, so it should not be a problem.
I tried to experiment with using the address on the heap as identifier, but as soon as dyn is involved, something strange happens:
#[derive(Clone, Debug)]
struct S(i32);
fn main() {
let x: Box<S> = Box::new(S(1));
let mut vec = vec![];
let x_identity = &*x as *const _ as usize; // any nicer way to write this?
println!("id = {x_identity}");
vec.push(x);
let y = vec.pop().unwrap();
let y_identity = &*y as *const _ as usize;
assert_eq!(x_identity, y_identity); // does this always hold?
let f: Box<dyn FnMut()> = Box::new(|| println!("Hello!"));
let mut vec = vec![];
let f_identity = &*f as *const _ as *const () as usize; // compiler complains if I don't cast through a thin pointer
println!("id = {f_identity}"); // this prints "id = 1" !?!?!?
vec.push(f);
let g = vec.pop().unwrap();
let g_identity = &*g as *const _ as *const () as usize;
assert_eq!(f_identity, g_identity);
}
I think there must be a special case where no closure is created but instead a function is used:
use std::rc::Rc;
fn main() {
let f: Box<dyn std::any::Any> = Box::new(10);
let g: Box<dyn std::any::Any> = Box::new(20);
println!("f's address: {:?}", &*f as *const _);
println!("g's address: {:?}", &*g as *const _);
assert_ne!(&*f as *const _, &*g as *const _);
let f: Box<dyn Fn()> = Box::new(|| println!("Hello!"));
let g: Box<dyn Fn()> = Box::new(|| println!("World!"));
f();
g();
// They have the same address:
println!("f's address: {:?}", &*f as *const _);
println!("g's address: {:?}", &*g as *const _);
// But they are not equal, probably due to the extra info in the wide pointer:
assert_ne!(&*f as *const _, &*g as *const _);
let c = 5;
let f: Box<dyn Fn()> = Box::new(|| println!("Hello {c}!")); // let's capture the environment
let g: Box<dyn Fn()> = Box::new(|| println!("World!"));
f();
g();
// They have the same address:
println!("f's address: {:?}", &*f as *const _);
println!("g's address: {:?}", &*g as *const _);
// But they are not equal, probably due to the extra info in the wide pointer:
assert_ne!(&*f as *const _, &*g as *const _);
let f: Rc<dyn Fn()> = Rc::new(|| println!("Hello!"));
let g: Rc<dyn Fn()> = Rc::new(|| println!("World!"));
f();
g();
// They have different addresses:
println!("f's address: {:?}", &*f as *const _);
println!("g's address: {:?}", &*g as *const _);
}
Is that sound? Maybe something like this is already existing as a crate? Wouldn't it be nice in std too (along with implementing Hash)? I remember I have stumbled upon the problem of pointers being !Send and !Sync in past, even in situation where I only want to compare pointers (and know the addresses haven't been reused).
I recommend taking this type of approach — it's worked for me. The advantage is that you can never have event listener “leaks” from code forgetting to unregister listeners (and code that registers listeners doesn't need to track what listeners it added precisely), and you also do not need to care about comparing the “object identity” of the listeners (which is nontrivial in Rust, as you've found).
However, if you take this approach, you do also need to make sure that you check for dead listeners when adding listeners. Otherwise you can have a leak if some event source is frequently consulted (and a listener added) but it doesn't emit any events (because it happens to not be doing anything).
Here's my own code for this:
Note that listeners have a separate alive() method, rather than themselves being weak refs. This allows the listener to have filters which are not built into the destination it notifies, which is useful for multiple event sources having different effects on the same event destination — whereas if the listener itself is weak-referenced, every listener adapter must be independently Arc owned by the destination to keep it alive.
Box doesn't allocate for ZSTs. Box might be holding a thin or a wide pointer. Some wide pointers not might have the guarantees you assume. You've been here before.
Yeah, I just digged out this postthis post by you, where you warned me of false negatives. I think if I only do the conversion into dyn once, then this should't be a problem?
- let boxed_func = Box::new(func);
+ let boxed_func: Box<dyn FnMut(&Arc<dyn Any + Send + Sync>) + Send> = Box::new(func);
let id: PointerId<dyn FnMut(&Arc<dyn Any + Send + Sync>) + Send> = PointerId::new(&*boxed_func);
# then store `boxed_func` later
I thought on that too, but then noticed it could also be a disadvantage. I would like to have my "event listener guard" to have a forget method, which consumes the guard but keeps the event handler installed. This might be difficult with an Arc.
Yeah, that might be the bigger hassle for me. An alternative could be to use an increasing usize. What's the idiomatic way to create an "identity" manually?
I would like to have my "event listener guard" to have a forget method, which consumes the guard but keeps the event handler installed. This might be difficult with an Arc .
In my design, each Listener implementation can use whatever strategy it likes for deciding when it is dead, so listeners intended to work this way could hold a strong reference instead of a weak one. Making a weak listener “upgradable to strong” sounds harder, but also I would expect that you can use different types when a strong listener is needed.
What's the idiomatic way to create an "identity" manually?
A global AtomicU64 is one sufficient way to create unique identities.
Also, Arc, unlike Box, never has a dangling pointer (it always has to allocate to store the refcount), so if you take an Arc's pointer and cast it to a narrow pointer (i.e. comparing the address of the allocation only) then that's a stable identity. However, I am not completely sure that all of that is formally guaranteed to work.
But by avoiding using identities in the listener system you don't have to worry about any of that.
Well, there is a function Arc::ptr_eq to compare "identity" (though documentation doesn't describe it that way). But I tried several approaches, and I'm not happy with any of them yet. One problem I have with Arcs is that Arc<T>: !Send if T: Send + !Sync (while Box<T>: Send if T: Send).
Edit: Another problem is that I need mutation when dealing with FnMut closures, which works with a Box but not with an Arc, unless I add a Mutex (which would be overkill/overhead).
Maybe that would cause me least trouble and be most flexible in regard to "leaking" handles.
I guess it's as simple as that?
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};
static ID: AtomicUsize = AtomicUsize::new(0);
fn main() {
let a = ID.fetch_add(1, Relaxed);
let b = ID.fetch_add(1, Relaxed);
println!("{a}, {b}");
}
Right, but it compares the vtable too (which I am coming to think is a mistake, at least for Arc<dyn> as opposed to &dyn). The pointer conversion I suggested would check only the allocation being the same or not.
That's true. But you are almost certainly going to have to pick whether your entire event system is Send+Sync or not; if yes, then you need the listeners to be Sync, and if not, then you don't care.
I tried to get rid of all pointer/coercion/heap/wide-pointer magic and use a simple usize for identifying the event handlers. I implement Drop on a guard to remove the handler when dropping that guard, but provide a forget method when I want to never remove the handler.
A lot of code for a seemingly small problem, but seems to work, and no unsafe involved.
I don't need the AtomicU64 in my case now, but out of curiosity, I'm still wondering how overflow is usually handled? It seems to be tricky without locking? Or do most people ignore integer overflows?
It's infeasible to actually overflow a 64-bit integer when you are only incrementing it (not any larger steps) and the program is not spending 100% of its time doing that. It would take years of run time.
I thought on 32 bit platforms. Okay, maybe it's still very unlikely…
Let's see, 232 = 4294967296. Say we have 1000 IDs generated per second, this gives 4294967 seconds, which are less than 50 days. So it is possible.
So not that unlikely. It's unlikely in my case (I will not register handlers that often), but there may be cases where IDs are created more often and where a duplicate ID might have a disastrous consequence.