Intrusive linked list of different generic types which implement the same trait

In c++, it's often needed to avoid heap allocation. Intrusive linked list is a good way, and we can use inheritance to hide generic classes under base class, then use pointer to parent class to store different types in one intrusive list. It's totally heapless and keep the dynamic behaviors. Here is an example (stdexec/execution.hpp at main · NVIDIA/stdexec · GitHub).

But I have no idea how to implement that in rust. Here's some of my experiment and requirements.

  1. Pick some intrusive list implementation. I choose to use tokio's (tokio/linked_list.rs at d459a9345334911be4af1bba58b6c796d92fde66 · tokio-rs/tokio · GitHub)

  2. Define the intrusive list node

struct Task {
    pointers: linked_list::Pointers<Task>,
    operation: NonNull<dyn Executable>,
}

unsafe impl linked_list::Link for Task {
    type Handle = NonNull<Task>;
    type Target = Task;

    fn as_raw(handle: &NonNull<Task>) -> NonNull<Task> {
        *handle
    }

    unsafe fn from_raw(ptr: NonNull<Task>) -> NonNull<Task> {
        ptr
    }

    unsafe fn pointers(target: NonNull<Task>) -> NonNull<linked_list::Pointers<Task>> {
        Task::addr_of_pointers(target)
    }
}

Here, like in C++, I don't want to leak generic types in Task since that will make the linked list support only one type (like Vec). So I hide it behind one trait (since rust has no inheritance and I cannot use Task as base class of more generic types)

  1. Define some real type, maybe with generics and implement the trait
struct Operation<R> {
    receiver: Option<R>,
}

impl<R> Executable for Operation<R>
where
    R: SetValue<Value = ()>,
{
    fn execute(&mut self) {
        if let Some(receiver) = self.receiver.take() {
            receiver.set_value(());
        }
    }
}
  1. Everything seems to work well. But unlike C++ where subclass share the storage with base class, now I need to store the Task and Operation in different space on stack, and use the link with address.
pub struct TaskOperation<R>(pub Task, pub Operation<R>);

fn connect(self, receiver: R) -> TaskOperation {
    let operation = Operation {
        receiver: Some(receiver),
        run_loop: self.run_loop,
    };
    let task = Task {
        pointers: linked_list::Pointers::new(),
        operation: NonNull::from(&operation),
        _p: PhantomPinned,
    };
    TaskOperation(task, operation)
}

Now, there's bug here. Since the address of variable may changed after returning from the function. The only correct way is to eliminate the function call and never move these two objects.

I think the core issue here is rust's not supporting inheritence so I need to split the base part and child part of one class in order to hide generic types.

Any idea or advice on this question? Here is my code (exec-rs/run_loop.rs at master · npuichigo/exec-rs · GitHub)

I have no idea about this topic, but I just see intrusive_collections - Rust today.

:grinning:thx for the quick reply. Just tried this crates before switching to use tokio's inner implementation since this one has more limit on the reference (pointer) type.

I'm not sure what you mean by this:

but struct layout is certainly not fundamentally different between Rust and C++. The layout of a C++ derived class is basically just something like

struct Derived {
    base: Base,
    other_fields: T,
    ...
}

I don't see how doing something like this in Rust would be impossible. But then again, I don't understand why you would need this, either, or how/why you think the equivalent C++ would be correct, because this:

is true in C++, too. If you point a pointer to a local variable, and you return from the enclosing function, that pointer will be invalid. You'd have to do the same thing that you do in Rust: create the list upfront and not move its elements.

As for avoiding the generic type parameter: in Rust, the means by which you achieve type erasure is a trait object. It's perfectly possible to create non-heap-allocated trait objects: simply coerce the address of a local variable to a &dyn Trait.

Thank you for point out and I made some update.

  • What I want in C++?
class Base {
   Base* next;  // enable intrusive linked list
   virtual execute() = 0;
};

template <typename R>
class Derive {
  R receiver;
  void execute() override {
      self.receiver.set_value();
  } 
};

void Run(IntrusiveList<&Base::next> queue) {
  Base* task;
  while (task = queue.pop_front()) {
    task->execute();
  }
}

  • Rust workaround
struct Base {
    pointers: linked_list::Pointers<Base>,  // enable intrusive linked list
    execute: fn(*mut Base),  // manual vtable
}

#[repr(C)]
pub struct Derive<R> {
    base: Base,
    receiver: Option<R>,
}

impl<R> Derive<R>
where
    R: SetValue<Value = ()>,
{
    fn execute(task: *mut Task) {
        // unsafe downcast
        let operation = unsafe { &mut *(task as *mut Operation<R>) };
        if let Some(receiver) = operation.receiver.take() {
            receiver.set_value(());
        }
    }
}

// Only link the Base part into the intrusive linked list
type TaskQueue = LinkedList<Task, <Task as linked_list::Link>::Target>;
fn run(queue: TaskQueue) {
  while let Some(mut task) = self.pop_back() {
    unsafe {
      // mimic dynamic dispatch with manual vtable
      (task.as_mut().execute)(task.as_ptr());
    }
  }
}

It actually works now with manual dynamic dispatch. But I still have no idea in how to use dyn Trait.

  • Use dyn Trait

In order to call the dyn Trait in Base, I need to add reference to dyn Trait in Base like:

struct Base<'a> {
    pointers: linked_list::Pointers<Base>,  // enable intrusive linked list
    execute: &'a dyn Executable,  // or NonNull<dyn Executable>
}

Now it has one self-reference field and the creation of Derive would be:

impl Derive {
  fn new() -> Self {
    Derice {
      base: Task {
        pointers: linked_list::Pointers::new(),
        execute: &self???,   // case of &'a dyn Executable
      },
      receiver: Some(receiver),
    }
  }
}

or

impl Derive {
  fn new() -> Self {
    let derive = Derice {
      base: Task {
        pointers: linked_list::Pointers::new(),
        execute: NonNull::dangling(),   // case of NonNull<dyn Executable>
      },
      receiver: Some(receiver),
    }
  }
  derive.base.execute = NonNull::from(derive);  // buggy code
  derive
}

Why not set NonNull<dyn Executable> after returning from the new function?

  • It breaks the interface which returns a partial correct object.

I’m having some trouble understanding what you’re trying to do, perhaps due to my unfamiliarity with C++. Can you show how you expect to use this after it’s been implemented, even if it’s just a toy example? In particular, I’m interested in the lifecycle of the various items in your intrusive list.

1 Like

Some background: I'm trying to implement an unified async execution abstraction which is proposed and target C++26 (P2300R6: `std::execution`)

In short,

  1. The proposal abstract async work to the concept Sender which can be chained to form a complex async computation graph.
  2. Abstract callbacks to the concept Receiver which can accept the success/error/cancel signals from Sender.
  3. When Sender and Receiver is connected, the computation graph is built and an OperationState which packages the concrete work and dependant resources is returned.
  4. The OperationState may contain a handler to the execution context (like a thread pool or GPU) which can be used to schedule the work to the execution context once OperationState::start is called.

Since the OperationState decribes the whole work, the proposal can be allocation-free. For execution context like local run loop or thread pool, we often need a FIFO queue to store the works to do. So a efficient and allocation-free solution is to use intrusive linked list to chain the already-in-place (stack) OperationStates without further move or copy.

Here's one of the example which schedule an empty work on the RunLoop after using some tricks I memtioned before (Intrusive linked list of different generic types which implement the same trait - #5 by npuichigo)

Your Rust isn't equivalent with your C++. In particular, I don't see any self-reference in the C++ code. I'll try to come up with an equivalent to your C++ snippet, but it would be helpful to show how exactly you want to use it once it's written.

Actually, what I'm pursued is an intrusive linked list version of Vec<dyn Executable>. But unlike Vec<T>, many intrusive structures have no way to define on dyn object and need to bundle the Link type with your data.

In C++, the base class is not just a trait and is a real class, which can be used inserted directly into an intrusive structure (after inserting the bundled Link type). Besides, it's also a dyn object, which type erases the concrete sub classes.

But in Rust, I need to split the two requirements: as dyn object and as concrete base class.

Your previous advise helps me to mimic the inheritance of c++ with combination. And I can use the base part to chain in the intrusive linked list (with bundled Link in Base)

struct Base {
    pointers: linked_list::Pointers<Base>,  // enable intrusive linked list
    execute: fn(*mut Base),  // manual vtable
}

#[repr(C)]
pub struct Derive<R> {
    base: Base,
    receiver: Option<R>,
}

Now is the dyn object part. I use this optimized C++ implementation as reference instead of the pure virtual version (the author before actually use pure virtual function in struct __task, but optimize to get rid of the overhead of dynamic binding, but the two solutions are all fine)

You can see the interface void (*__execute_)(__task*) needs a pointer to __task, and that's the real derived object. The so-called self reference (to be precise, a reference to derive)

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.