Returning borrowed values from an iterator

Hello everyone! I'm implementing an iterator trait for a struct that owns a buffer and I want to return from next() slices of that buffer. I believe it should be possible to do this using named lifetime parameters to express the idea that "lifetime of an iteration result depends on lifetime of a struct instance".

I can't figure out specifics though (couldn't find any docs on the subject, frankly). Here's what I've got so far (abbreviated, full code is here):

struct Lexer<'a> {
    buf: & 'a mut [u8; BUFSIZE],
    // ...
}

impl<'a> Iterator for Lexer<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        // ...
        Some(&self.buf[start..self.pos])
    }
}

fn main() {
    let lexer = Lexer {
        buf: &mut[0; BUFSIZE],
        // ...
    };
    for chunk in lexer {
        println!("{:?}", chunk);
    }
}

I had to declare buf as a reference living inside the Lexer struct though it's not exactly what I have in mind: I want the Lexer to own its buffer and bind the lifetime of resulting references to that of the Lexer itself, no matter how it was created. Am I completely off base here?

But the main thing is that the compiler complains about &self.buf[...] inside next() saying:

cannot infer an appropriate lifetime for lifetime parameter 'a in function call due to conflicting requirements

… and then:

consider using an explicit lifetime parameter as shown: fn next(&'a mut self) -> Option<&'a [u8]>

… which I cannot do as it breaks the trait signature.

2 Likes

The Iterator protocol is not compatible with this model (returning elements that borrow the iterator itself). It would restrict the iterator to strictly “one element alive at a time” which I don't think we want.

The common solution(?) would be to separate the data owner from the iterator -- sort of like the Vec and its iterator are separate values.

What you are asking for is a streaming iterator. There has been some discussion on reddit as to why it isn't possible to support both streaming and normal iterators in one trait. It might be possible to allow multiple traits in for loops as a workaround.

2 Likes

I explain a little more here: http://www.reddit.com/r/rust/comments/303a09/looking_for_more_information_on_streaming/cpoysog (The back-and-forth between me and @neutralinostar is good stuff too.)

1 Like

Thanks guys! Took me a little time to wade through those threads with unfamiliar terminology, but I think I've got the general idea :slight_smile:

So Iterator is currently deliberately declared in a way that returned elements are on their own, not borrowed. However I could simply drop compliance to the trait and implement an iterative process with my own trait (thanks to eddyb on reddit for this sample).

A few more questions:

  • What's an "associative type"?

  • Though I deciphered what HKT means (higher-kinded types) I still don't get what it exactly is, and why the example here is "regarded as HKT"?

  • It seems to me that by "streaming" you specifically mean a borrowing iterator, even though an iterator reading a huge file and producing Vec<u8> elements, while making a copy of every single element, still doesn't try hold the entire input in memory, so I'd call that "streaming" as well. Is this a Rustian terminology?

@bluss:

It would restrict the iterator to strictly “one element alive at a time” which I don't think we want.

Barring concurrency, in what cases it would be desirable to hold onto many elements at a time? The for-loop only ever deals with them one by one, so it seems like the natural way to consume iterators anyway. Could you give an example?

The simplest example I think is if you want to look at two elements at a time for some reason, for example to form the differences from a sequence of numbers. 2, 5, 6, 8 → 3, 1, 2

To make that example more locked into references, consider if you want to check if an iterator of references forms an ascending order. You wouldn't clone out the values to check that.

Another example is the iterator method .reverse_in_place() that reverses a sequence by repeatedly taking the first and last element (by reference) and swapping them.

.collect() in fact extracts all iterator elements to all be alive at the same time. Just like you can collect from (1..10) to a vector of numbers, you can collect from [1,2,3].iter() to get a vector of references into the array.

@isagalaev FWIW, I've tried adapting @eddyb's code there to a more familiar trait with adaptors, but I think the bug in his comments still exists. e.g., lib.rs · GitHub

To make that example more locked into references, consider if you want to check if an iterator of references forms an ascending order. You wouldn't clone out the values to check that.

A-ha… How would you implement it then if you're not allowed to borrow more than once from a mutable object?

Another example is the iterator method .reverse_in_place() that reverses a sequence by repeatedly taking the first and last element (by reference) and swapping them.

Since this wouldn't make sense for generated sequences, this .reverse_in_place() should be a method of a different trait or an implementation, it shouldn't have anything to do with the general Iterator, no?

.collect() in fact extracts all iterator elements to all be alive at the same time. Just like you can collect from (1..10) to a vector of numbers, you can collect from [1,2,3].iter() to get a vector of references into the array.

Hypothetically speaking, I'd say .collect() should be implementable on top of a single-value iterator by doing copying itself, not relying on an iterator to hold all the values all and to provide references.

@BurntSushi I like your usage of the word "familiar" :slight_smile: In my book, pretty much nothing can be described this way yet! "Adaptor" is the Rust word for "map" and "filter", right?

Thanks for the sample. I'll ask some more questions after figuring it out, if you don't mind!

Hah, right. I meant "familiar" as in "something that looks like the Iterator trait."

"Adaptor" is kind of a weird word. I think the more widely use vernacular is "extension method." You might also call them "default methods." That is to say, they are methods defined in the trait definition itself, usually in terms of the "required" methods. Default methods can be overridden by implementors, but they'll otherwise fall back to their default implementation. "Required" methods have no default implementation, so implementors must implement them.

Adaptors aren't necessarily created using methods, and not all iterator methods are adaptors (fold is not an adaptor).

An adaptor is an iterator value created from an input iterator value and optionally other parameters. The Map is an iterator adaptor. It's a new iterator created from a source iterator and a closure.

1 Like

This is what Iterators allow. You don't have to adapt your Lexer to this at all, but you would have it automatically if you implemented the Iterator trait as it is today.

There's nothing wrong with using another abstraction, except that you lose out on some of the nice features for iterators in libstd. Hopefully we can do something similar for streaming iterators.. and if we can't, lobby rust to change to allow it too.