Methods sometimes need access to the Arc managing self. This is easy: fn f(self: Arc<Self>) (to own the reference count) or fn f(self: &Arc<Self>) (to borrow it, with the option of taking an owned count later via clone()). The former has the issue that every method call consumes self, so many calls need o.clone().f(), which is very unergonomic and also slower because of the forced atomic inc/dec. But the &Arc<Self> version works, so whatever.
However, if I want to pass around an Arc<dyn Trait>, then we have a problem. The Arc<Self> version does actually work fine, however &Arc<Self> does not because that makes the trait dyn incompatible. The former isn't good for the reasons discussed above.
Is there a known workaround for this issue? I'd like to use core/std if possible, but I am open to other crates. (I think I know how to implement a custom ref-counted type for this by allowing the user to write methods against what is effectively &ArcInner<Self>, though I haven't worked through it fully, so maybe it won't work either.)
I appreciate any comments or advice. Thanks!
Here is some code that shows what I'm talking about concretely.
mod owned_ref {
    use std::sync::Arc;
    // self is an *owned* reference count to a Tr. Every call must inc/dec even
    // if the method does not need to own a count.
    trait Tr {
        fn f(self: Arc<Self>);
    }
    struct St;
    impl Tr for St {
        fn f(self: Arc<Self>) {}
    }
    fn main() {
        // ERROR: Use of moved value because self is consumed.
        let r: Arc<dyn Tr> = Arc::new(St);
        r.f();
        r.f();
        // TERRIBLE: Requires clone() on every call *and* performs inc/dec on
        // every call.
        let r: Arc<dyn Tr> = Arc::new(St);
        r.clone().f();
        r.clone().f();
    }
}
mod borrowed_ref {
    use std::sync::Arc;
    // self is a *borrowed* reference count to a Tr. Calls do not require
    // inc/dec, but method can create an owned ref count.
    trait Tr {
        fn f(self: &Arc<Self>);
    }
    struct St;
    impl Tr for St {
        fn f(self: &Arc<Self>) {}
    }
    fn main() {
        // ERROR: Tr is not dyn compatible.
        let r: Arc<dyn Tr> = Arc::new(St);
    }
}
fn main() {}
error[E0382]: use of moved value: `r`
--> src/main.rs:20:9
|
18 | let r: Arc<dyn Tr> = Arc::new(St);
| - move occurs because `r` has type `Arc<dyn owned_ref::Tr>`, which does not implement the `Copy` trait
19 | r.f();
| --- `r` moved due to this method call
20 | r.f();
| ^ value used here after move
|
note: `owned_ref::Tr::f` takes ownership of the receiver `self`, which moves `r`
--> src/main.rs:7:14
|
7 | fn f(self: Arc<Self>);
| ^^^^
help: you can `clone` the value and consume it, but this might not be your desired behavior
|
19 | r.clone().f();
| ++++++++
error[E0038]: the trait `borrowed_ref::Tr` is not dyn compatible
--> src/main.rs:47:16
|
36 | fn f(self: &Arc<Self>);
| ---------- help: consider changing method `f`'s `self` parameter to be `&self`: `&Self`
...
47 | let r: Arc<dyn Tr> = Arc::new(St);
| ^^^^^^^^^^^ `borrowed_ref::Tr` is not dyn compatible
|
note: for a trait to be dyn compatible it needs to allow building a vtable
for more information, visit <https://doc.rust-lang.org/reference/items/traits.html#dyn-compatibility>
--> src/main.rs:36:20
|
35 | trait Tr {
| -- this trait is not dyn compatible...
36 | fn f(self: &Arc<Self>);
| ^^^^^^^^^^ ...because method `f`'s `self` parameter cannot be dispatched on
= help: only type `borrowed_ref::St` implements `borrowed_ref::Tr`; consider using it directly instead.
error[E0038]: the trait `borrowed_ref::Tr` is not dyn compatible
--> src/main.rs:47:30
|
36 | fn f(self: &Arc<Self>);
| ---------- help: consider changing method `f`'s `self` parameter to be `&self`: `&Self`
...
47 | let r: Arc<dyn Tr> = Arc::new(St);
| ^^^^^^^^^^^^ `borrowed_ref::Tr` is not dyn compatible
|
note: for a trait to be dyn compatible it needs to allow building a vtable
for more information, visit <https://doc.rust-lang.org/reference/items/traits.html#dyn-compatibility>
--> src/main.rs:36:20
|
35 | trait Tr {
| -- this trait is not dyn compatible...
36 | fn f(self: &Arc<Self>);
| ^^^^^^^^^^ ...because method `f`'s `self` parameter cannot be dispatched on
= help: only type `borrowed_ref::St` implements `borrowed_ref::Tr`; consider using it directly instead.
= note: required for the cast from `Arc<borrowed_ref::St>` to `Arc<dyn borrowed_ref::Tr>`
This would work, but unfortunately you can't define such a type that works for your purpose on stable Rust, because in order to get dyn compatibility you need to implement DispatchFromDyn, which is unstable. RFC 3519 (arbitrary self types), currently being implemented, is a step forward but explicitly does not cover dyn.
Can you please share your use case more? We may be able to suggest a better design to overcome those limitations. For example, maybe Arc can be replaced with a reference or dyn requirements may be relaxed.
As for the more complex case: sorry, I didn't provide enough info. The specifics are:
We are working on a kernel written in Rust, but the important part is that we have ring buffers RingBuffer<T> which are dynamically allocated (and reference counted) and carried around as Arc<RingBuffer<T>>. The ring buffers can provide handles for the producer and the consumer and those handles act as capabilities: ProduceHandle<T>. The handles clearly need to maintain a strong reference to the underlying buffer as Arc<RingBuffer<T>>.
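To make the handle layout described above concrete, here is a minimal sketch. The names RingBuffer, ProduceHandle and ConsumeHandle come from the post; everything else (the mutex-guarded VecDeque standing in for the real buffer, the capacity field, the method names producer, consumer, push, pop, and the demo function) is an assumption made for illustration, not the actual kernel code.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Stand-in for the real ring buffer: a mutex-guarded, bounded queue.
// (Assumption: the real type is probably lock-free and fixed-size.)
pub struct RingBuffer<T> {
    capacity: usize,
    slots: Mutex<VecDeque<T>>,
}

// Capability to push values; keeps the buffer alive via a strong count.
pub struct ProduceHandle<T> {
    buffer: Arc<RingBuffer<T>>,
}

// Capability to pop values; also holds a strong count.
pub struct ConsumeHandle<T> {
    buffer: Arc<RingBuffer<T>>,
}

impl<T> RingBuffer<T> {
    pub fn new(capacity: usize) -> Arc<Self> {
        Arc::new(RingBuffer {
            capacity,
            slots: Mutex::new(VecDeque::with_capacity(capacity)),
        })
    }

    // `&Arc<Self>` is fine here because the receiver is a concrete type.
    // The count is only incremented when a handle is actually created.
    pub fn producer(self: &Arc<Self>) -> ProduceHandle<T> {
        ProduceHandle { buffer: Arc::clone(self) }
    }

    pub fn consumer(self: &Arc<Self>) -> ConsumeHandle<T> {
        ConsumeHandle { buffer: Arc::clone(self) }
    }
}

impl<T> ProduceHandle<T> {
    // Hands the value back to the caller if the buffer is full.
    pub fn push(&self, value: T) -> Result<(), T> {
        let mut slots = self.buffer.slots.lock().unwrap();
        if slots.len() == self.buffer.capacity {
            return Err(value);
        }
        slots.push_back(value);
        Ok(())
    }
}

impl<T> ConsumeHandle<T> {
    pub fn pop(&self) -> Option<T> {
        self.buffer.slots.lock().unwrap().pop_front()
    }
}

// Hypothetical usage: one buffer, one handle of each kind.
pub fn demo() -> (Option<u32>, usize) {
    let buffer = RingBuffer::new(2);
    let tx = buffer.producer();
    let rx = buffer.consumer();
    tx.push(7).unwrap();
    // One count for `buffer` itself plus one per live handle.
    (rx.pop(), Arc::strong_count(&buffer))
}
```

This only covers the concrete-type case; the difficulty discussed in this thread starts once the same methods have to be callable through a trait object.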
We also want to be able to have ring buffer references and handles over trait objects, i.e. ConsumeHandle<dyn Trait> and table references Arc<RingBuffer<dyn Trait>>, so that accessors, generally consumers, can interact with the buffer and the handles without knowing the concrete type of the values. We also want to leave the door open to having references to a table for different traits, so storing Box<dyn Trait> in the ring buffer, i.e. RingBuffer<Box<dyn Trait>>, will not work (I'm also not sure how the type checker would feel about that). It would also create more allocations, which can be a performance issue in some cases, such as small messages.
We do expect to have control of T and Trait, so we could derive traits for them. We already expect to have a custom derive macro.
I've looked at DispatchFromDyn and I can't figure out how Arc is allowed to implement it. The rules seem to say that DispatchFromDyn can only be implemented for container types C<T> which contain only a single field, which is the reference to its type parameter T that needs to be converted from a fat to a thin pointer. That isn't true of Arc. It has a single field, but it is a reference to ArcInner<T>. I do see how that cast would be safe: "thinning" the reference to ArcInner<T> is presumably the same as thinning a reference to T, since a fat reference to ArcInner<T> must have all the information needed to call methods on T. However, even with experimental features, I would be surprised if the standard library breaks its own stated rules.
Is it possible to somehow have a reference instead of Arc? Maybe by making some scopes and tying the lifetime to them? The scope would act as an "upper bound", so the ring buffer would always live as long as in the worst-case scenario, but you would not have to do reference counting and ping-pong cache lines between core caches.
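A minimal sketch of the scoped-reference idea, assuming std::thread::scope as the scope mechanism (a kernel would need its own equivalent). The handle names follow the thread; the Mutex<VecDeque<T>> body, the method names, and the demo function are illustrative assumptions.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::thread;

// Stand-in buffer; the real one would not be a mutex-guarded queue.
pub struct RingBuffer<T> {
    slots: Mutex<VecDeque<T>>,
}

// Handles borrow the buffer instead of owning a reference count.
pub struct ProduceHandle<'buf, T> {
    buffer: &'buf RingBuffer<T>,
}

pub struct ConsumeHandle<'buf, T> {
    buffer: &'buf RingBuffer<T>,
}

impl<T> RingBuffer<T> {
    pub fn new() -> Self {
        RingBuffer { slots: Mutex::new(VecDeque::new()) }
    }

    pub fn producer(&self) -> ProduceHandle<'_, T> {
        ProduceHandle { buffer: self }
    }

    pub fn consumer(&self) -> ConsumeHandle<'_, T> {
        ConsumeHandle { buffer: self }
    }
}

impl<T> ProduceHandle<'_, T> {
    pub fn push(&self, value: T) {
        self.buffer.slots.lock().unwrap().push_back(value);
    }
}

impl<T> ConsumeHandle<'_, T> {
    pub fn pop(&self) -> Option<T> {
        self.buffer.slots.lock().unwrap().pop_front()
    }
}

// The buffer outlives the scope, so every thread spawned inside it may
// borrow the buffer without any atomic inc/dec.
pub fn demo() -> u32 {
    let buffer = RingBuffer::new();
    thread::scope(|scope| {
        let tx = buffer.producer();
        let rx = buffer.consumer();
        // Producer runs to completion first to keep the example deterministic.
        scope
            .spawn(move || {
                for value in 0..4u32 {
                    tx.push(value);
                }
            })
            .join()
            .unwrap();
        scope
            .spawn(move || {
                let mut total = 0;
                while let Some(value) = rx.pop() {
                    total += value;
                }
                total
            })
            .join()
            .unwrap()
    })
}
```

The cost is the lifetime parameter on every handle type, and the buffer stays allocated until the whole scope ends, even if all handles are dropped earlier.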
I would also consider having some kind of context that each user has access to (passed as an argument, stored in the constructor, thread-local, global key-value store, etc.) and using that context to retrieve the ring buffer instead of an Arc.
You can also consider an elaborate self-reference: your methods accept a plain reference, but behind that reference there is also an Arc or Weak. It can be represented as a trait that you derive.
Another hacky approach may be to take &self but accept a second argument as an Arc, with a debug assert that self and the Arc point to the same place. It would be an inline version of the previous method.
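One way the self-reference idea might look, as a sketch only: the value stores a Weak back-pointer to its own allocation, set up with Arc::new_cyclic, and the trait exposes it through a plain &self method so the trait stays dyn compatible. The names Tr and St follow the original post; the strong and name methods, the this field, and demo are hypothetical.

```rust
use std::sync::{Arc, Weak};

// Dyn compatible: every method takes plain `&self`.
pub trait Tr {
    // Upgrades `&self` to an owned count. Returns `None` if the value is
    // no longer managed by the Arc it was created in.
    fn strong(&self) -> Option<Arc<dyn Tr>>;
    fn name(&self) -> &'static str;
}

pub struct St {
    // Back-pointer to the Arc that manages this value.
    this: Weak<St>,
}

impl St {
    pub fn new() -> Arc<St> {
        // `new_cyclic` hands out a Weak to the allocation being built.
        Arc::new_cyclic(|this| St { this: this.clone() })
    }
}

impl Tr for St {
    fn strong(&self) -> Option<Arc<dyn Tr>> {
        // Upgrade the back-pointer, then unsize Arc<St> to Arc<dyn Tr>.
        self.this.upgrade().map(|arc| arc as Arc<dyn Tr>)
    }

    fn name(&self) -> &'static str {
        "St"
    }
}

// Hypothetical usage through a trait object.
pub fn demo() -> (usize, &'static str) {
    let r: Arc<dyn Tr> = St::new();
    // No inc/dec happens for the call itself, only for the upgrade.
    let kept = r.strong().expect("still managed by its Arc");
    (Arc::strong_count(&r), kept.name())
}
```

The upgrade costs an atomic operation only in methods that actually need an owned count. The back-pointer goes stale if the value is ever moved out of its Arc, which is why strong returns an Option.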
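A sketch of that two-argument variant, under the assumption that callers go through a small helper so they cannot pass the wrong Arc by accident. Tr and St follow the original post; the keep field, the return type of f, call_f, and demo are made up for the example.

```rust
use std::sync::Arc;

pub trait Tr {
    // `this` must be the Arc that manages `self`.
    fn f(&self, this: &Arc<dyn Tr>) -> Option<Arc<dyn Tr>>;
}

pub struct St {
    pub keep: bool,
}

impl Tr for St {
    fn f(&self, this: &Arc<dyn Tr>) -> Option<Arc<dyn Tr>> {
        // Compare data addresses only; the vtable part of the fat pointer
        // is discarded by the cast to `*const ()`.
        debug_assert!(std::ptr::eq(
            self as *const Self as *const (),
            Arc::as_ptr(this) as *const (),
        ));
        // Only pay for the atomic increment when a count is really needed.
        if self.keep {
            Some(Arc::clone(this))
        } else {
            None
        }
    }
}

// Helper so call sites do not have to repeat the receiver.
pub fn call_f(r: &Arc<dyn Tr>) -> Option<Arc<dyn Tr>> {
    r.f(r)
}

// Hypothetical usage through a trait object.
pub fn demo() -> (bool, usize) {
    let r: Arc<dyn Tr> = Arc::new(St { keep: true });
    let kept = call_f(&r);
    (kept.is_some(), Arc::strong_count(&r))
}
```

The trait stays dyn compatible because the receiver is a plain &self, and the Arc travels as an ordinary argument. The check is only a debug assertion, so a mismatched pair goes unnoticed in release builds.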
Yea. That makes sense, but it forces the set of types to be a closed set and wastes space if types vary dramatically in size.
Enums are usually quite good. Closedness isn't a concern as long as you control the codebase. Size can be reduced by having an indirection in the enum. It may be an allocation, or an index into some Vec (very efficient). And if you have the whole enum behind some kind of indirection, then even if the enum is huge, you're only passing a handle to it. Ping-ponging cache lines by cloning, having triple indirection, etc. wastes a lot more than an enum.
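As an illustration of the "index into some Vec" indirection, here is a small sketch. All names in it (Entry, LargeReport, Store, add_report, payload_len, demo) are hypothetical and chosen only for the example.

```rust
// A payload that would bloat the enum if stored inline.
pub struct LargeReport {
    pub bytes: [u8; 4096],
}

// Small variants are stored inline; the large one is an index.
pub enum Entry {
    Tick(u64),
    Small { code: u16, value: u32 },
    // Index into `Store::reports`.
    Report(usize),
}

// Owns the out-of-line payloads.
pub struct Store {
    pub reports: Vec<LargeReport>,
}

impl Store {
    // Moves the report into the store and returns a small handle to it.
    pub fn add_report(&mut self, report: LargeReport) -> Entry {
        self.reports.push(report);
        Entry::Report(self.reports.len() - 1)
    }

    // Payload size in bytes, resolving the index where needed.
    pub fn payload_len(&self, entry: &Entry) -> usize {
        match entry {
            Entry::Tick(_) => 8,
            Entry::Small { .. } => 6,
            Entry::Report(index) => self.reports[*index].bytes.len(),
        }
    }
}

// Hypothetical usage: the entry stays small, the payload lives in the store.
pub fn demo() -> usize {
    let mut store = Store { reports: Vec::new() };
    let entry = store.add_report(LargeReport { bytes: [0; 4096] });
    store.payload_len(&entry)
}
```

The enum's size is set by its largest inline variant rather than by LargeReport, at the price of needing the Store wherever a Report entry is read.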
I'll explore the idea of using a global lifetime of some kind or a lifetime coupled to a context. It might work, though I worry that it will make everything WAY more complex and verbose to write (though I know Rust is just expected to be verbose).
I did consider putting a weak reference in the handles (or something similar to that). The problem is that it gives up the semantic safety guarantees that Rust often provides, since Arc::new(Arc::into_inner(old).unwrap()) will actually break the underlying object. It obviously is an option, since that's a pretty weird case, but the whole point of using a smart pointer is that the object continues to work until it is actually discarded.
As for enums: We do control the codebase in the sense that we control the framework, but there are many crates internally and we want to be able to support plugins in the future. So, having a single central enum for every possible type isn't practical. We even expect some to be generated by macros scattered through the code. (Some of the types here are log entries which we want to support using a simple macro like collect!(value = v).)
You know, I personally enjoy having a lot of lifetimes. Arc and friends are very generic and blend into everything, but genericity has its cost. If you figure out a proper design, which may be hard and slow, you may end up with structs that have 3 lifetimes and so on, but in the end your types and lifetimes will be a great description of what is going on and what depends on what, and the result will be both performant and descriptive.
I also recommend reading Data-Oriented Design by Richard Fabian. It is more about games, which have a lot of homogeneous types, while operating systems tend to work with a lot of heterogeneous stuff, but it may help with breaking out of the vtable mentality.
Being lifetime parametric is definitely useful. The challenge is that we want to support developers who don't actually understand Rust very well. And I think we want that ad-hoc, less principled, flexibility.
Notably, we expect most of the actual producers and consumers to use concrete types. However, we actually have "observers" which may want to take a much more abstract view of the data. So the parts of the system that need most performance actually are using concrete types.
Anyway. Thanks. This has given me a lot to think about. Hopefully I can come up with something that will provide what we need while not breaking the brains of our non-Rust programmers.
The inconsistency of different types would probably be a problem, since the point is that everything is observable all the time. As I think about it, one possibility would be to use different ways of maintaining references for the concrete accessors of RingBuffer<T> and for the dyn accessors. I'm not totally sure how to make that work, but it might be a good way to get the correct trade-offs in both cases.
Just to make this explicit: the thing you would see in most kernels (and other OO systems) is an "intrusive" reference count, where the counts are stored in the object instead of in a wrapper. That makes converting self to a strong reference totally reasonable. This can be implemented in Rust by forcing the wrapped type to implement a trait that provides access to the reference counts (and I'm pretty sure that could be derived fairly easily). You can see similar stuff in the intrusive_collections crate. There are definitely good reasons to do this, but also lots of reasons to avoid it if possible.
This does not invalidate @Ddystopia's comments about data design in Rust. There is still a valid question as to that being a good design in this context, but there are reasons that intrusive reference counts are used in some cases and other techniques in others in C-based kernels (e.g., manually implemented ownership, or tables stored in global or non-global contexts).