Trouble implementing linked list mutable iterator variant (based on Gankro's tutorial)

First I want to say thanks a lot for the tutorial that you wrote. I have similar a follow up question on IterMut implementation.
In your tutorial you mention that IterMut works due to "Borrow checking magic!", can you please point to the exact place in the above 2 examples where the disjoint of life time 'b and life time 'a happens in the implementation of next(), because as you point out self comes with one life time and return from next should be for other lifetime.
Thanks.

Here are the relevant parts with all the lifetimes explicitly written out and comments indicating what that allows.

impl<T> List<T> {
    // IterMut is constrained via its lifetime parameter 'a to have a
    // lifetime no larger than that of the linked list.
    fn iter_mut(&'a mut self) -> IterMut<'a, T> {
        //...
    }
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;

    // Discussion on this below.
    fn next(&'b mut self) -> Option<&'a mut T> {
        // ...
    }
}

According to the signature of next(), IterMut (self) has a seemingly arbitrary lifetime 'b and produces items with the lifetime 'a with which it is parametrized.

Not constraining 'b in this signature means that we can call next() over and over again to return something
(Option<&'a mut T>) that is constrained by a lifetime of 'a. This returned value can thus be thought to be borrowed from either

  • the original linked list (lifetime 'a) or
  • IterMut (which is parameterized by 'a).

Soundness

  • By construction, due to its type parameter, IterMut has lifetime no larger than 'a, implying that 'b: 'a even though it's not written in the signature of next().

  • An IterMut's next() invocation must never return a reference (Option<&'a mut T>) that it has returned in a previous invocation of next(). Otherwise, the iterator would have managed to expose two mutable (i.e. unique) references to the same memory location, which is unsound.
    Note: This property can only be violated if next()'s implementation uses unsafe code.

1 Like

Than you very much. It makes sense now. I was not understanding the "by construction" part, now I do.

So I running circles with this one:

#![feature(box_patterns)]
#![feature(associated_consts)]
#![feature(conservative_impl_trait)]

use std::ptr;

#[derive(PartialEq)]
struct Link<T>
{
	pub next: *mut Link<T>,
	pub prev: *mut Link<T>,
	pub value: Option<T>,
}

type Handle<T> = Box<Link<T>>;


pub struct Iter<'a, T: 'a>
{
	next: &'a Link<T>
}

pub struct IterMut<'a, T: 'a>
{
	next: &'a mut Link<T>
}

pub struct List<T>
{
	sentinel: Handle<T>,
}

impl<T> List<T>
{

	pub fn iter(&self) -> Iter<T>
	{
		Iter { next: unsafe {&*self.sentinel.next} }
	}

	pub fn iter_mut(&mut self) -> IterMut<T>
	{
		IterMut { next: unsafe {&mut *self.sentinel.next} }
	}
}

impl<'a, T> Iterator for Iter<'a, T>
{
	type Item = &'a T;

	fn next(&mut self) -> Option<Self::Item>
	{
		self.next.value.as_ref().and_then(|v| {
			self.next = unsafe {&*self.next.next};
			Some(v)
		})
	}
}

impl<'a, T> Iterator for IterMut<'a, T>
{
	type Item = &'a mut T;

	fn next(&mut self) -> Option<Self::Item>
	{
		self.next.value.as_mut().and_then(|v| {
			self.next = unsafe {&mut *self.next.next};
			Some(v)
		})
	}
}



fn main()
{
}

This code give me an error:

error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
  --> src/test.rs:66:19
   |
66 | 		self.next.value.as_mut().and_then(|v| {
   | 		                ^^^^^^
   |
help: consider using an explicit lifetime parameter as shown: fn next(&'a mut self) -> Option<Self::Item>
  --> src/test.rs:64:2
   |
64 | 	fn next(&mut self) -> Option<Self::Item>
   | 	^

error: aborting due to previous error

error: Could not compile `test`.

To learn more, run the command again with --verbose.

Without implementation of IterMut no errors. My brain hurts I don't understand how is my code substentially different from the examples above.
I would greatly appreciate any help.
Thank you.

Haha, deja vu! Exactly what I tried one of my first times round, except that I see a few other things that are a little surprising.

Firstly, here's a playground link to your code, and wow, that's a poor error message! :wink:

Part 1

Before we get to the IterMut's next(), you have a bug here:

pub fn iter_mut(&mut self) -> IterMut<T>
{
    // `sentinel.next` can be null, but you're derefencing it!
    IterMut { next: unsafe {&mut *self.sentinel.next} }
}

IterMut needs to account for an empty list and should instead be defined and initialized as follows:

pub struct IterMut<'a, T: 'a>
{
    next: Option<&'a mut Link<T>>
}

// ...

pub fn iter_mut(&mut self) -> IterMut<T>
{
    let mut ptr = self.sentinel.next;
    IterMut {
        next: if ptr.is_null() {
                  None
              } else {
                  Some(unsafe {&mut *self.sentinel.next})
              }
    }
}

Part 2

The definition of Link is surprising to me:

struct Link<T>
{
	pub next: *mut Link<T>,
	pub prev: *mut Link<T>,
	pub value: Option<T>,
}

It holds values of type Option<T>. So I would expect your iterators to enumerate values of type Option<T> and not T (i.e. I expected next() would return Option<&mut Option<T>>). If you really intended your signatures to be as written, what if you have a list of 10 options, and the first item was None? Is the iterator supposed to enumerate 0 values (since a None is returned for the first value instead of Option<&None>)?

I'm going to assume that you meant to store T in Link<T> instead of Option<T>. If that's not the case, then please provide answers to the questions above and we can look at doing what you intended.

Part 3

As for the iterator itself, the relevant code is here with comments indicating the real problem with mutable borrows.

fn next(&mut self) -> Option<Self::Item>
{
    // This is mutably borrowing from `self` for the whole block, ...
    self.next.value.as_mut().and_then(|v| {
        // and while a mutable borrow is active, it is modifying `self`.
        self.next = unsafe {&mut *self.next.next};
        Some(v)
    })
}

The trick to addressing the mutability issue is to swap the content out of self, replacing it with a dummy value. Then borrow the value you took out and assigned self. But you can't do that with your definition of IterMut because you can't swap a reference out with a null value. If you want a nullable reference, then you want to use an Option<&'a mut Link<T>>, which, we already did in Part 1. This swapping is so common that Option has a method for it: take().

fn next(&mut self) -> Option<Self::Item>
{
    self.next.take()
        .map(|link| {
            self.next = Some(unsafe {&mut *link.next});
            &mut link.value
        })
}

Final playground link.

1 Like

Got it. Thanks. So the "take" does the trick and "breaks the entanglement with self". That's what I was missing. I feel stupid, I just checked the original post and yes it's there and I missed it :slight_smile:

As for the Part I: this is part of my exploration of Rust raw pointers, I wrote a "sentinel" based deque/list (here is the original post with a code reference: Sentinel based Deque)
so the way it's structured the pointers can never be NULL. When list is empty the sentinel points to itself, creating a loop, but the value it holds Option is None, because sentinel has no value. That what differentiates sentinel from normal value nodes.

As for Part II: I was not planing on having "None" in the list (I guess my current implementation prohibits that), but now I am going to think about it.

Once again thanks a lot. Rust is fun but sometimes drives me crazy. :wink:

Ah got it. As an alternative, if you require T: Default, then you could use the Default::default() value for the sentinel's value. Otherwise you will have wasteful checks on Options that can never be None.

1 Like

Thanks. The problem with default is that it can be legitimate value, so it might be confusing for the end user if I use it as a special marker. I was considering providing nullable trait and requiring it for T, then decided to use option as it provides a lot of optimization opportunities for compiler. I will play with those ideas.

I just want to clarify — Default::default() can be legitimately used in the list. It's only there for you to assign something to the sentinel's value (only because you have to assign something there in safe rust); you just never expose the sentinel's value to the user. So, there shouldn't be any confusion from the usability standpoint. The only issue is that the user can't store items that don't implement Default.

I've been avoiding suggesting the use of std::mem::uninitialized as you have to be really careful dealing with types that implement Drop. If you can handle the constraint of T: Default, I'd go with that for simplicity.

1 Like

Hi all,

My story is exactly as marcianx outlined in the very first paragraph of this thread, so I won't repeat that part. I would be very glad if someone helped me understand what the problem is below:

struct Node<T> {
    elem: T,
    next: Link<T>,
}

enum Link<T> {
    Empty,
    Full(Box<Node<T>>)
}
use Link::*;

struct List<T> {
    head: Link<T>,
}

struct IterMut<'a,T:'a> {
    next: &'a mut Link<T>,
}

impl<'a,T:'a> std::iter::Iterator for IterMut<'a,T> {
    type Item = &'a mut T;
    fn next(& mut self) -> Option<&'a mut T> {
      
        // Attempt 1: 
        // This one fails with
        // error[E0495]: cannot infer an appropriate lifetime
        // for ref mut p (using rust 1.19.0).
        //
        //match self.next {
            //&mut Empty => {None},
            //&mut Full(ref mut p) => {
                //let q = &mut (**p);
                //self.next = &mut q.next;
                //Some(&mut q.elem)
                                        
            //},
        //}
        
        
        //Attempt 2:
        //This one fails too, for the same reason:
        //
        //let link = std::mem::replace(self.next, Empty);
        //match link {
            //Empty => None,
            //Full(ref mut p) => { let q = &mut (**p); self.next = &mut q.next; Some(&mut q.elem)},
            
        //}
        
        
        // Attempt 3:
        // This does not work either. Using (*self).next
        // makes no difference:
        //
        //let retval = iter_value (self.next);
        //self.next = next_link (self.next);
        //retval

    }
}


fn next_link<'a,T> (link: &'a mut Link<T>) -> &'a mut Link<T> {

    match *link {
        Empty => link,
        Full(ref mut p) => { let q=&mut (**p); &mut q.next },
    }
    
}


fn iter_value<'a,T> (link: &'a mut Link<T>) -> Option<&'a mut T> {

    match *link {
        Empty => None,
        Full(ref mut p) => { let q=&mut (**p); Some(&mut q.elem) },
    }
    
}

Unlike Gankro here I insisted on using a custom type rather than Option to define the links, just to see if I could replicate what he did using take() and map. After the first two attempts (cf the comments) I suspected that the reference lifetime information was getting lost at the pattern matching stage (that is, ref mut p does not inherit the lifetime of self.next).

However, after the failure of the third attempt my new theory is that perhaps the lifetime of self.next where self is an IterMut <'a,T> is taken to be the lifetime of self, which need not outlive 'a. On the other hand the next field of an IterMut <'a,T> is supposed to be a &'a mut Link<T>. I am possibly missing something simple here.

So is there a way to make Rust believe that self.next really outlives 'a? Or am I totally lost?

Let me try to explain your approach #2, which I think is the closest in spirit to what @Gankra did with take.

The crucial difference is that he has next: Option<&'a mut Node<T>>. When you take out of that option, you get back ... an Option<&'a mut Node<T>>. This sounds silly, but it's important because it disassociates this value from self.next - ownership of the mutable reference is transferred to the caller of take. The underlying value behind the reference is not changed.

In your case, however, you have next: &'a mut Link<T>. When you std::mem::replace(self.next, Empty), and assuming self.next was Full(Box<Node<T>>), you get back the actual Full(Box<Node<T>>) - that is, you now own the box, not a mutable reference to it. Since you own it, it'll get dropped once next exits and you cannot return a reference to it.

Hope that makes sense.

Yes it does, thanks a lot. And yes, the second way was supposed to imitate the take() approach.

What about attempts #1 and #3? There I only work with references and still the lifetime is not transferred. In #3, in functions next_link and iter_value, changing match *link in to match link makes no difference. I am especially confused about the failure of this third method, given the signatures of these two functions. I would expect that self.next, being of type &'a mut Link<T>, is passed as such to next_link which in turn spits out a reference with the same lifetime 'a.

self.next is not passed as &'a mut Link<T> to next_link or iter_value. Note that 'a in those two functions is unrelated to the 'a in your IterMut. In fact, you can leave that 'a out and use lifetime elision.

Mutable references are affine - this means they're moved, never copied (like immutable references). This ensures that the unique borrow aspect is maintained. Now, when you call a function that takes a mutable borrow, Rust will re-borrow, and not move. Essentially it's doing let retval = iter_value(&mut *self.next). Once that function returns, that (temporary) borrow expires and you're left with the original borrow. The original borrow here is actually tied to the lifetime of &mut self in the next signature. Note that &mut self is not borrowing IterMut for 'a, but something less (just the function call, basically).

Now, what Rust is trying to prevent is your iterator from yielding a mutable reference to the same value. Given what I said about about &mut self being short, imagine you did the following:

let first = iter.next(); // mutable borrow lasts just for this call, which means we can call next() again below
let second = iter.next(); // what if you yielded the same reference as 'first'?

If Rust cannot prove that the above is impossible, it will not allow you to use &mut self to return &'a mut T. And you cannot attempt to "trick" it by using intermediate functions like iter_value and next_link because those just reborrow, as mentioned - ownership of the mutable reference stays with the IterMut.

The reason Gankro's approach works is precisely because Option::take transfers ownership of the mutable reference from the iterator to the caller of take, which happens to be a temporary inside next. Once that ownership is transferred, Rust "knows" that you cannot yield that item again from the iterator because the iterator no longer is the owner of the mutable reference.

Does this make sense?

1 Like

Yes, for a second time! And many thanks as well. I know that the annotated 'a in the last two functions is irrelevant to the lifetime of self.next; I just opted to make it explicit to emphasize the signature.

This bit clarifies the reason of the failures very well for me.

So, would be correct to say that in this scenario, using a wrapper for the mutable reference is necessary? Accessing the value like in #2 I cause a move which reduces the lifetime to that of the local scope. In the other two I do not move but do short-lived reborrows. I need to move the borrow to somebody else and I can only do that if, so to say, I don't "touch" the value inside (so use a wrapper!).

Using safe code, I think so. You basically need something to move the mutable reference out of the iterator but leave the field in the iterator itself valid (that's what Option::take provides - it moves the value out, but leaves a valid None in its place). So any wrapper you might write will likely look like an Option :slight_smile:.

And there goes down the drain my challenge to tweak the iterator :). This one has been a very instructive example for me. One might feel like (perhaps from an "information" point of view), using a Link or wrapped Node should be equivalent. However, only the latter one seems suitable for safe code. Among the material I read so far this is the most striking* case of a nontirivial design choice being enforced to achieve safety requirements.

Thanks again for the wonderful explanations.

  • I must note that I am familiar with C but not a professional programmer, and this could be a rather dull example to those who are.
1 Like

Not at all - mutable iterators are hard to write in (safe) Rust. And, they pretty much throw the borrow system/variance/lifetimes kitchen sink at you, so it's good learning material in that sense :slight_smile:.

I was extremely experienced with C++ before my attempt at the top of this thread. I assure you, I found this exercise massively enlightening on the disciplined perspective of the rust compiler. I can't recommend it enough to any rust newbie; but my perspective is biased by my particular experience.

1 Like

It's interesting you mention this - I happen to agree that it's a great learning experience. At the same time, this exercise (linked lists) in Rust has been thrown around as the "wtf - I can't even write a linked list in Rust! I'm giving up ..." mantra. I think the key to this exercise is to not consider it victory if you can implement the list, but rather it's a victory if you understand why you cannot do it (using whatever approaches one tries at first). So I'm glad you put this in a positive light, and I hope others look at it like that as well.

2 Likes

I completely agree. One learns from failures at least as much as one learns from the ideal examples. What I liked most about Gankro's presentation was that he was demonstrating some failed attempts before revealing the solution. Most of the people learning Rust are probably coming from a C/C++ background like vitalyd or myself, and there are some common traps in Rust for people like these. I hope there will be more documentation in the future taking Gankro's approach.

1 Like