Lazy iterable to tuple of given length


#1

Suppose each line holds three numbers “a”, “b” and “c”, and you need the sum() of a + b - c over all the lines. You can do it like this:

const DATA: &'static str =
"36 28 25
23 24 65
1 62 53
96 80 62
80 55 95
12 56 22
63 55 1
40 96 25
43 43 59
58 98 65
13 97 18
76 67 73
54 93 29
18 18 85
29 52 71
13 43 12
9 22 92
6 23 15
60 26 31
19 43 82";

fn solve1() -> i32 {
    DATA
    .lines()
    .map(|r| r
             .split_whitespace()
             .map(|p| p.parse::<i32>().unwrap())
             .collect::<Vec<_>>())
    .map(|v| v[0] + v[1] - v[2])
    .sum::<i32>()
}

fn main() {
    println!("{}", solve1());
}

The result is 850. You can also merge the two map() into one:

fn solve2() -> i32 {
    DATA
    .lines()
    .map(|r| {
        let v = r
                .split_whitespace()
                .map(|p| p.parse::<i32>().unwrap())
                .collect::<Vec<_>>();
        v[0] + v[1] - v[2]
    })
    .sum::<i32>()
}

If you prefer to avoid the heap allocations of the collect(), you can also leave the inner row lazy:

fn solve3() -> i32 {
    DATA
    .lines()
    .map(|r| {
        let mut v = r
                    .split_whitespace()
                    .map(|p| p.parse::<i32>().unwrap());
        v.next().unwrap() + v.next().unwrap() - v.next().unwrap()
    })
    .sum::<i32>()
}

But when the operations on v are a little more complex, none of this reads well. In such cases you may want to bind the intermediate values to named variables a, b and c.
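As a minimal sketch of that idea, the three fields can be pulled off the lazy iterator one at a time and bound to names before they are used. The helper names row_value and solve_named are hypothetical, and the three-line sample is just the first rows of DATA:

```rust
// Hypothetical helper: parse one "a b c" row and combine the three numbers.
fn row_value(row: &str) -> i32 {
    let mut nums = row.split_whitespace().map(|p| p.parse::<i32>().unwrap());
    // Bind each field to a name; unwrap() panics if the row is too short.
    let a = nums.next().unwrap();
    let b = nums.next().unwrap();
    let c = nums.next().unwrap();
    a + b - c
}

// Hypothetical wrapper: sum the per-row values over all lines.
fn solve_named(data: &str) -> i32 {
    data.lines().map(row_value).sum()
}

fn main() {
    let sample = "36 28 25\n23 24 65\n1 62 53";
    println!("{}", solve_named(sample)); // prints 31
}
```

This works without allocating, but it takes one line per field and silently ignores any extra fields in a row.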

In Python 2.x you can solve this problem like this:

data = """\
36 28 25
23 24 65
1 62 53
96 80 62
80 55 95
12 56 22
63 55 1
40 96 25
43 43 59
58 98 65
13 97 18
76 67 73
54 93 29
18 18 85
29 52 71
13 43 12
9 22 92
6 23 15
60 26 31
19 43 82"""

def solve4():
    lines = (map(int, r.split()) for r in data.splitlines())
    return sum(a + b - c for (a, b, c) in lines)

print solve4()

In Haskell you can write (with the lines already split, for my convenience):

data1 = ["36 28 25",
        "23 24 65",
        "1 62 53",
        "96 80 62",
        "80 55 95",
        "12 56 22",
        "63 55 1",
        "40 96 25",
        "43 43 59",
        "58 98 65",
        "13 97 18",
        "76 67 73",
        "54 93 29",
        "18 18 85",
        "29 52 71",
        "13 43 12",
        "9 22 92",
        "6 23 15",
        "60 26 31",
        "19 43 82"]

compute [a, b, c] = a + b - c

solve5 :: [String] -> Int
solve5 lines = sum $ map (compute . map read . words) lines

main = do
    print $ solve5 data1

In Python 2.x (but not in Python 3, which removed tuple parameters) you can also write something similar to the Haskell code:

compute = lambda (a, b, c): a + b - c
print sum(compute(map(int, r.split())) for r in data.splitlines())

Haskell and Python (and F#) all accept functions whose argument pattern matches only partially; an input of a different length then fails at run time.

In Python 3 you can also unpack lazy iterables into tuples, with a bit of pattern matching (extended unpacking such as a, *rest = iterable).

In Rust it’s generally awkward to convert lazy iterators or dynamic arrays into tuples or named variables. To improve the situation I’ve written a simple family of “to_tupleN” methods:

const DATA: &'static str = "...";

trait MyIterExt: Iterator {
    // Type parameters of functions cannot have defaults, so S is a plain
    // (unused) parameter that callers spell out, as in to_tuple3::<i32>().
    fn to_tuple1<S>(mut self) ->
        Option<(Self::Item,)>
        where Self: Sized {
        if let (Some(a), None) = (self.next(), self.next()) {
            Some((a,))
        } else {
            None
        }
    }

    fn to_tuple2<S>(mut self) ->
        Option<(Self::Item, Self::Item)>
        where Self: Sized {
        if let (Some(a), Some(b), None) =
            (self.next(), self.next(), self.next()) {
            Some((a, b))
        } else {
            None
        }
    }

    fn to_tuple3<S>(mut self) ->
        Option<(Self::Item, Self::Item, Self::Item)>
        where Self: Sized {
        if let (Some(a), Some(b), Some(c), None) =
            (self.next(), self.next(), self.next(), self.next()) {
            Some((a, b, c))
        } else {
            None
        }
    }

    // Longer versions here...
}

impl<I> MyIterExt for I where I: Iterator {}

fn solve6() -> i32 {
    DATA
    .lines()
    .map(|r| r
             .split_whitespace()
             .map(|p| p.parse::<i32>().unwrap())
             .to_tuple3::<i32>()
             .unwrap())
    .map(|(a, b, c)| a + b - c)
    .sum::<i32>()
}

fn main() {
    println!("{}", solve6());
}

You can also write it with a single map:

fn solve7() -> i32 {
    DATA
    .lines()
    .map(|r| {
        let (a, b, c) = r
                        .split_whitespace()
                        .map(|p| p.parse::<i32>().unwrap())
                        .to_tuple3::<i32>()
                        .unwrap();
        a + b - c
    })
    .sum::<i32>()
}

to_tuple1, to_tuple2, to_tuple3, … each take an Iterator and convert it to a tuple of length N wrapped in an Option (if Rust gains integral values for generics, you can replace them all with a single to_tuple::<N>()). If the iterator is shorter or longer than the desired tuple, to_tupleN returns None.
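To make the length check concrete, here is a small self-contained sketch of the three cases (exact length, too short, too long). It assumes a pared-down trait, here called ToTuple3, with no extra type parameter; parse_row is likewise a hypothetical helper, not part of the code above:

```rust
// Pared-down stand-in for the trait above: only the three-element version.
trait ToTuple3: Iterator + Sized {
    fn to_tuple3(mut self) -> Option<(Self::Item, Self::Item, Self::Item)> {
        // The fourth next() must return None, otherwise the input is too long.
        match (self.next(), self.next(), self.next(), self.next()) {
            (Some(a), Some(b), Some(c), None) => Some((a, b, c)),
            _ => None,
        }
    }
}

impl<I: Iterator> ToTuple3 for I {}

// Hypothetical helper: parse a whitespace-separated row of exactly 3 integers.
fn parse_row(row: &str) -> Option<(i32, i32, i32)> {
    row.split_whitespace()
        .map(|p| p.parse::<i32>().unwrap())
        .to_tuple3()
}

fn main() {
    println!("{:?}", parse_row("36 28 25"));   // Some((36, 28, 25))
    println!("{:?}", parse_row("36 28"));      // None: too short
    println!("{:?}", parse_row("36 28 25 7")); // None: too long
}
```

Returning None for both mismatches leaves it to the caller to decide whether a malformed row should panic (unwrap) or be skipped.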

To avoid too much code duplication, perhaps you can write a macro that generates all the to_tupleN methods. The D language has variadic templates, plus a lot of compile-time power, which let you write a single function that can be used with a syntax like:

to_tuple!3
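On the Rust side, the macro idea could be sketched roughly as follows. The macro names item_of and tuple_methods and the trait name ToTupleN are all hypothetical, and this version short-circuits with ? rather than calling next() a fixed number of times, so treat it as one possible shape and not a finished design:

```rust
// Helper: ignore the identifier and expand to the given type, so that the
// item type can be repeated once per tuple field.
macro_rules! item_of {
    ($ignored:ident, $t:ty) => { $t };
}

// Generates one method per "name => (vars...);" entry.
macro_rules! tuple_methods {
    ($( $name:ident => ($($var:ident),+); )+) => {
        trait ToTupleN: Iterator + Sized {
            $(
                fn $name(mut self) -> Option<( $( item_of!($var, Self::Item), )+ )> {
                    // Take one item per field; bail out if the input is short.
                    $( let $var = self.next()?; )+
                    // Any leftover item means the input is too long.
                    if self.next().is_some() {
                        return None;
                    }
                    Some(( $( $var, )+ ))
                }
            )+
        }

        impl<I: Iterator> ToTupleN for I {}
    };
}

tuple_methods! {
    to_tuple1 => (a);
    to_tuple2 => (a, b);
    to_tuple3 => (a, b, c);
    to_tuple4 => (a, b, c, d);
}

fn main() {
    let row = "36 28 25".split_whitespace().map(|p| p.parse::<i32>().unwrap());
    if let Some((a, b, c)) = row.to_tuple3() {
        println!("{}", a + b - c); // prints 39
    }
}
```

Adding a longer version is then a one-line change in the tuple_methods! invocation.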

I think these functions are useful in many simple data-munging situations. So would it be a good idea to add something like this to the Rust standard library?


#2

I think this should wait for constant type parameters and array/slice patterns. Personally, I’d like to see:

trait Iterator {
    // ...
    fn nextn<const N: usize>(&mut self) -> Option<[Self::Item; N]> {
        /* stuff */
    }
}

fn main() {
    // ...
    let [a, b, c] = iter.nextn().unwrap();  // N inferred.
}
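For illustration, something close to this can be sketched as an extension trait, assuming a compiler with const generics plus std::array::from_fn and array map. The trait name NextN is hypothetical; unlike to_tupleN, this version does not check whether items are left over, and items already taken are dropped if the iterator runs out early:

```rust
// Hypothetical extension trait providing nextn() for every iterator.
trait NextN: Iterator {
    fn nextn<const N: usize>(&mut self) -> Option<[Self::Item; N]> {
        // Start with N empty slots and fill them one by one.
        let mut slots: [Option<Self::Item>; N] = std::array::from_fn(|_| None);
        for slot in slots.iter_mut() {
            // Return None as soon as the iterator is exhausted.
            *slot = Some(self.next()?);
        }
        // Every slot is filled here, so unwrap() cannot panic.
        Some(slots.map(|item| item.unwrap()))
    }
}

impl<I: Iterator> NextN for I {}

fn main() {
    let mut it = "36 28 25".split_whitespace().map(|p| p.parse::<i32>().unwrap());
    // N is given explicitly here to keep the sketch unambiguous.
    let [a, b, c] = it.nextn::<3>().unwrap();
    println!("{}", a + b - c); // prints 39
}
```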

#3

I would just like to have .collect::<[T; 3]>(), with a panic if there are more than 3 items in the iterator ;)
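A rough approximation of that behaviour is to collect into a Vec and convert it to an array, which panics on any length mismatch (too few items as well as too many). The function name collect3 is hypothetical, and this sketch pays for a heap allocation that a real collect into an array would avoid:

```rust
use std::convert::TryInto;

// Hypothetical helper: parse a row into exactly three integers or panic.
fn collect3(row: &str) -> [i32; 3] {
    let items: Vec<i32> = row
        .split_whitespace()
        .map(|p| p.parse::<i32>().unwrap())
        .collect();
    // The conversion fails unless the Vec holds exactly 3 items.
    items.try_into().expect("expected exactly 3 items")
}

fn main() {
    let [a, b, c] = collect3("10 20 30");
    println!("{} {} {}", a, b, c); // prints 10 20 30
}
```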


#4

The standard library of the D language lets you create a fixed-size array like that when parsing:

void main() {
    import std.conv, std.string;
    immutable data = "10,20,30";
    const parts = data.split(',').to!(int[3]);
}

But the problem with your idea shows up in Rust code like:

let tags = tags_string
           .split(',')
           .map(|kv| kv.split('=').map(|s| s.to_string()))
           .map(|kv| kv.to_tuple2::<String>().unwrap())
           .collect::<HashMap<_, _>>();

If you replace the to_tuple2() with something that gives you a [String; 2], then I think you can’t use .collect::<HashMap<_, _>>() on such a sequence, nor .unzip(). You need tuples.
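To show what the array version would cost, here is a sketch under the assumption that some adaptor yields a [String; 2] per entry (simulated here with a Vec conversion). The function name parse_tags is hypothetical. The arrays have to be turned back into tuples with one extra map before collect() into a HashMap is accepted:

```rust
use std::collections::HashMap;
use std::convert::TryInto;

// Hypothetical helper: parse "k1=v1,k2=v2" into a map.
fn parse_tags(tags_string: &str) -> HashMap<String, String> {
    tags_string
        .split(',')
        .map(|kv| {
            let parts: Vec<String> = kv.split('=').map(|s| s.to_string()).collect();
            // Stand-in for an array-returning adaptor; panics on bad input.
            let pair: [String; 2] = parts.try_into().expect("expected key=value");
            pair
        })
        // Extra step: HashMap collects from (K, V) tuples, not from arrays.
        .map(|[k, v]| (k, v))
        .collect()
}

fn main() {
    let tags = parse_tags("lang=rust,kind=forum");
    println!("{:?}", tags.get("lang")); // prints Some("rust")
}
```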