Implementing linked list

I'm learning Rust and I enjoy implementing linked lists (and other data structures) to validate my understanding when learning a language. I know Rust is not very friendly with cyclical structures and this demotivated me a lot (I dropped this list implementation several times), but I was able to come up with a working code (sort of).

I would like to know if the following list implementation is a good code for a list implementation in Rust or if I should be using other structures (I know there is a guide to implement linked lists in rust and I read a part of that guide, but I decided to practice before I finished reading it).

use std::cell::RefCell;
use std::rc::Rc;
struct List {
    head: Option<Rc<RefCell<Node>>>,
    tail: Option<Rc<RefCell<Node>>>
}

struct Node {
    element: i32,
    next: Option<Rc<RefCell<Node>>>
}

impl List {
    fn new() -> List {
        List {head: None, tail: None}
    }

    fn append(self: &mut List, value: i32) {
        if Option::is_none(&self.head) {
            let node = Rc::new(RefCell::new(Node {element: value, next: None}));
            self.head = Some(node);
            self.tail = self.head.clone();
        } else {
            let new_node = Rc::new(RefCell::new(Node {element: value, next: None}));
            if let Some(tail) = &self.tail {
                tail.borrow_mut().next = Some(Rc::clone(&new_node));
            }
            self.tail = Some(new_node);
        }
    }

    fn delete_by_index(self: &mut List, index: usize) {
        let mut current = self.head.clone();
        let mut i: usize = 0;
        let mut prev: Option<Rc<RefCell<Node>>> = None;
        while current.is_some() && i < index{
            prev = current.clone();
            current = current.unwrap().borrow().next.clone();
            i += 1;
        }
        if let Some(node) = prev {
            node.borrow_mut().next = current.unwrap().borrow().next.clone();
            if node.borrow().next.is_none() {
                self.tail = Some(node.clone());
            }
        } else {
            self.head = current.unwrap().borrow().next.clone();
        }
    }

    fn display(&self) {
        let mut current = self.head.clone();
        while let Some(node) = current {
            let borrowed = node.borrow();
            println!("{}", borrowed.element);
            current = borrowed.next.clone();
        }
    }
}

You can see I'm using a lot of Rc and this is because I have head and tail which can point to the same node, so this should be a shared address.

I'm also using a lot of clone/unwrap/borrow/borrow_mut. Is there a way to avoid having all these calls ? They seem overly complex. I know I could use the let Some(node) = self.head syntax, but I considered it more complex to reason about the nodes (it could be a lack of practice though).

Overall, I feel Rust really demotivating to implement such structures if compared to C, but I'm trying to enjoy the language.

1 Like

One of the defining characteristics of Rust is having memory management implemented via exclusive ownership and temporary borrowing.

Linked list, despite being such a common data structure in C, is one of the exceptions that completely doesn't fit that model. It's all about having shared ownership and non-temporary references across items. It is absolutely the worst case for Rust, so it is ugly indeed.

You've probably seen: Introduction - Learning Rust With Entirely Too Many Linked Lists

3 Likes

@kornel thanks for the reply. So you do consider the code above a valid rustacean implementation of linked list ?

EDIT: I'm asking about the usage of ownership/borrowing. I know the code above lacks a lot of other features in a typical linked list such as generics.

Sort-of. If you had to have objects that need to have arbitrary references between them, be mutable, and don't have some single well-defined storage (like memory arena/pool), then you would indeed use Rc+RefCell (or Arc+Mutex which is the multi-threaded equivalent).

If you wanted a linked list specifically, then idiomatic Rust would use unsafe in its implementation to avoid overhead, but expose a safe interface around it that can't cause any unsafety. Hiding unsafety behind safe abstractions is a very important pattern. At the lowest level everything is unsafe, but safe interfaces create a de-facto safe language on top of that (sort-of like Python counts as a safe language, even though there's unsafe CPython underneath. In Rust there's a similar layering, but both happen to use the same language).

And idiomatically you probably just wouldn't use a linked list, unless you needed it for wait-free multi-threaded tricks. For most cases of just having a bunch of items to store, you'd use Vec, or other containers from the standard library.

2 Likes

Thanks, your response motivated me more to learn about Rust, specially about considering unsafe blocks (I did not learn that part yet, but I know they exist) valid rustacean.

And I agree I would hardly ever use linked list, like I said, I consider it a good exercise to learn the language, I just got severely frustrated in the process, but I think I understand the language better after that implementation above.

As far as I can tell there are two nice ways to make a linked list in Rust:

  1. Put all the nodes in a Vec and use indices create the links between nodes instead of references/pointers.

  2. Use "unsafe" and work with raw pointers. If you do it this way your linked list implementation will look almost like it's written in C. Be sure to test your code thoroughly with miri.

I feel we should not torture beginners to Rust with linked lists and the "Thou shalt not use unsafe" idea.

1 Like

@ZiCog I don't think I agree with solution 1. It is not really a linked list and it also do not address the main issue which is any cyclical structure. What if I want a binary tree ? or a btree ? a graph also falls in that category if I do not represent as a matrix.

As for 2. I just want to understand how rust works and I consider linked lists help me with that, since it is a fairly complex data structure and stress a lot of rules that rust have regarding borrowing/ownership. I will definitely work in an unsafe implementation of the code above, just to see how it will improve in terms of readability.

The implementation you gave is a single-sided linked list implementation implemented in a traditional way, there's also a functional programming way of implementing it:

pub enum Node<T> {
    Cons(T,Box<Node<T>>),
    Nil
}

I've also written a naive double-sided linked list implementation: rust_simple_double_linked_list/src/main.rs at main ยท lvyitian/rust_simple_double_linked_list ยท GitHub
The most idiomatic implementation would be the one in stdlib that is implemented with unsafe: linked_list.rs - source

Many people argue that a linked list implement in a Vec and using indices as the "next", "prev" links is not a "real linked list". I feel differently. It has the same structure, and supports the same operations. It's just implemented differently. It's all a matter of definition I guess.

There is no problem with cyclical lists in what I describe. For example:

  1. Element 0 of the vec contains a "next" field holding an index of 1. Element zero is linked to element one.
  2. Element 1 contains a "next" field of 0. Element 1 is linked to element 0.
  3. Bingo! That is a cyclical linked list.

Similarly there is no problem with binary trees. Instead of each element having a "next" index it has a "left" and "right" index.

This can be extended to graphs, I have done it. A matrix is not necessary to do that.

I fully support learning how to make linked lists in Rust. Every C programmer wants to do that first when meeting Rust its seems, as it's about the simplest data structure there is and can be done very easily in C. I'm just suggesting it might not be a good task for a beginner to Rust. For the reasons noted in this thread.

3 Likes

LinkedList implemented with Vec as heap memory and indices as pointers surely is a way of implementing LinkedList, but it has several issues, the most significant one is that when you remove a node, you will either have to leak it(keep it in the vec) or iterate over all the remaining nodes in the vec and update the indices manually.

Or segment the Vec into two pieces: the actual linked list, and a free list. (The data of each node could be an Option, or ig MaybeUninit since you could statically know which part is the free list, but the main advantage of the Vec-based linked list is supposed to be avoiding unsafe.)

If you go with Option-based nodes, you could even just chain the entire Vec into one circular linked list, such that the first[1] k-many nodes are Some and the last n - k nodes are None. The (user-visible) linked list would then chain directly into the free list. And if you need a new node and the free list is empty (that is, the tail of the linked list points to something which has Some data rather than None), you can grow the Vec and format the new capacity into the free list structure.


  1. โ€œas determined by a head index and next indices โ†ฉ๏ธŽ

1 Like

Using Rc/Refcell is a perfectly good way to do it, and you will learn about those as a result.

You could also implement it using unsafe and raw pointers, and learn about that.

Generally speaking Rust programs do not use linked lists, as explained here:

NOTE: It is almost always better to use Vec or VecDeque because array-based containers are generally faster, more memory efficient, and make better use of CPU cache.

1 Like

Certainly dealing with deleted nodes from a list in a Vec is an issue. One simple approach to that is to maintain two intertwined lists in the Vec, a list of live nodes and a list of dead nodes. The "free list". Delete nodes from the live list by unlinking them and linking them to the free list.

This has the advantage if order does not matter one can iterate over the list in a very fast, cache friendly, way.

Another advantage is that the linked list in a Vec acts as a memory arena, when you are done with it there is only one memory free operation, on the whole Vec rather than many, one for each node.

As usual, all of this may or may not be useful depending on what is in the list and what you do with it.

2 Likes

I implemented it like you. I hate Rust, because you can't do it simple like in C. But after sometime. I am okay with the implementation. It just works.

I think 90+% of Rust linked list implementations are done by people learning the language.[1] After that adventure, I'd wager your typical Rust programmer never bothers again. Implementing your own linked list is not representative of typical Rust programming, so don't take the experience being miserable as a definite sign you can't enjoy the language.

(Too Many Lists is still a good read IMO, and the author has done the miserable parts for you :slight_smile:. I'm not trying to say the exercise isn't educational, but it would also be a shame if it scared you away from Rust generally.)


But you asked for a review.

  • You should implement Drop for Node to avoid arbitrary recursion depth upon destruction.[2] Relevant TML section.

    • If you evolve this into a doubly-linked list the naive way, you'll have Rc cycles. You could break them with your own Drop implementation(s) or perhaps with Weak. Relevant TML section (at the bottom).
  • You might as well get used to other traits too and implement Display as a more general solution than fn display. You could also derive Default. Deriving Debug would be recursive so that would be another one to implement yourself.

  • In delete_by_index

    • You check while current.is_some() but then later you self.head = current.unwrap() so deleting an invalid index panics. Which can be fine if that's what you want the method to do for that case (it's better than silently ignoring the operation IMO),[3] but I'm not sure how intentional it was.

    • You never set self.tail to None when the list is emptied, and it seems to me you should.

  • (Suggestion) When you're in a mindset amenable to tedium, write tests for these methods.


  1. And that most of the remaining percentile are the "wait-free multi-threaded tricks" implemented with a lot of unsafe, etc, use cases. โ†ฉ๏ธŽ

  2. I fed it 100,000 items and it blew the stack in debug mode. โ†ฉ๏ธŽ

  3. having something that returns Result<_, _> instead is fine โ†ฉ๏ธŽ

2 Likes

The implementation is decent for safe only code, but for an actual linked list implementation, you would need to use raw unsafe pointers then build a safe interface around the unsafe implementation because the traditional linked list in C is built around holding pointers to data which doesn't really fit safe Rust

My only issue with your code is that you're making a display method instead of implementing std::fmt::Display for your List and Node structs, which would make it more flexible instead of just printing and is generally more idiomatic Rust

1 Like

If you like to learn you might want to see how the standard library implements a
linked list.

But you can do it simple like C. Here is a snippet from my attempt:

#[repr(C)]
pub struct LinkedList<T> {
    head: *mut Node<T>,
    tail: *mut Node<T>,
    len: usize,
}

#[repr(C)]
struct Node<T> {
    next: *mut Node<T>,
    prev: *mut Node<T>,
    element: T,
}

pub struct Iter<T> {
    current: *mut Node<T>,
}

impl<T> Default for LinkedList<T> {
    fn default() -> Self {
        Self::new()
    }
}

impl<T> LinkedList<T> {
    /// Creates an empty `LinkedList`.
    ///
    pub const fn new() -> Self {
        LinkedList {
            head: ptr::null_mut(),
            tail: ptr::null_mut(),
            len: 0,
        }
    }

    /// Appends an element to the back of a list.
    ///
    /// This operation should compute in *O*(1) time.
    ///
    /// # Safety
    ///
    /// This function is unsafe because it:
    /// - Dereferences a raw pointer (`self`)
    /// - Performs raw memory allocation
    /// - Dereferences memory that was just allocated
    ///
    /// The caller must ensure that `self` points to a valid `LinkedList<T>`.
    pub unsafe fn push_back(self: *mut Self, elt: T) {
        unsafe {
            let node_size = core::mem::size_of::<Node<T>>() as size_t;
            let new_node = malloc(node_size) as *mut Node<T>;
            if new_node.is_null() {
                panic!("Failed to allocate memory for node");
            }

            // Initialize the new node
            (*new_node).element = elt;
            (*new_node).next = ptr::null_mut();
            (*new_node).prev = (*self).tail;

            // Update list pointers
            if (*self).tail.is_null() {
                // Empty list case
                (*self).head = new_node;
            } else {
                // Connect current tail to new node
                (*(*self).tail).next = new_node;
            }

            (*self).tail = new_node;
            (*self).len += 1;
        }
    }

Looks almost like C to me. Would even more so if it was not generic.

Note: The. above use raw pointers as the Self parameters on methods instead of references. A cool thing I discovered that I did not expect in Rust.

Other note: Do not do this in serious code. This was just me playing around with "Crust" where everything is unsafe all the time and raw pointers are used instead of references.

Here's your Award for Dropping Uninitialized Elements!

(at line (*new_node).element = elt; of course)

error: Undefined Behavior: reading memory at alloc337[0x18..0x20], but memory is uninitialized at [0x18..0x20], and this operation requires initialized memory
   --> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/raw_vec/mod.rs:597:9
    |
597 |         self.ptr.cast().as_non_null_ptr()
    |         ^^^^^^^^ Undefined Behavior occurred here
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: stack backtrace:
            0: alloc::raw_vec::RawVecInner::non_null::<u8>
                at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/raw_vec/mod.rs:597:9: 597:17
            1: alloc::raw_vec::RawVecInner::ptr::<u8>
                at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/raw_vec/mod.rs:592:9: 592:29
            2: alloc::raw_vec::RawVec::<u8>::ptr
                at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/raw_vec/mod.rs:295:9: 295:25
            3: std::vec::Vec::<u8>::as_mut_ptr
                at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:1935:9: 1935:23
            4: <std::vec::Vec<u8> as std::ops::Drop>::drop
                at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:4159:62: 4159:79
            5: std::ptr::drop_in_place::<std::vec::Vec<u8>> - shim(Some(std::vec::Vec<u8>))
                at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:805:1: 807:25
            6: std::ptr::drop_in_place::<std::string::String> - shim(Some(std::string::String))
                at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:805:1: 807:25
            7: LinkedList::<std::string::String>::push_back
                at src/main.rs:62:13: 62:32
            8: main
                at src/main.rs:84:9: 84:53

Uninitialized memory occurred at alloc337[0x18..0x20], in this allocation:
alloc337 (C heap, size: 40, align: 16) {
    0x00 โ”‚ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ โ”‚ โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘
    0x10 โ”‚ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ โ”‚ โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘
    0x20 โ”‚ __ __ __ __ __ __ __ __                         โ”‚ โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘โ–‘
}
1 Like

Sadly I do not deserve that award (yet). That is not my code you are presenting there.