Help with complex file load iterators


#1

I’ve been working through problems in reddit dailyprogrammer as a way to learn Rust. As part of this, I am forcing myself to load the problem data from files, to get a handle on file I/O.

Data file example: a single number giving the number of points in the set, followed by that many x y entries. This example contains two sets of 4 points each.

4
0.4 0.5
0.6 0.5
0.5 0.3
0.5 0.7
4
0.1 0.1
0.1 0.9
0.9 0.1
0.9 0.9

I’m using a simple Point struct:

struct Point {
    x: f64,
    y: f64,
}

I’m trying to write a function that loads this into a Vec<Vec<Point>>, throwing away the count and keeping just the Points. This is the closest I’ve gotten:

use std::fs::File;
use std::io::{BufRead, BufReader};

fn get_test_data(filename: &str) -> Vec<Vec<f64>> {
    let file = File::open(filename).unwrap();
    let reader = BufReader::new(&file);
    // One Vec<f64> per line; lines that fail to read are skipped
    let line_sets: Vec<Vec<f64>> = reader.lines()
        .filter_map(
            |l| l.ok().map(
                |s| s.split_whitespace()
                    .map(|num| num.parse().unwrap())
                    .collect()))
        .collect();
    line_sets
}

This gives me

[[4], [0.4, 0.5], [0.6, 0.5], [0.5, 0.3], [0.5, 0.7], [4], [0.1, 0.1], [0.1, 0.9], [0.9, 0.1], [0.9, 0.9]]

I’m having trouble figuring out how to make this into Vec<Vec<Point>>:

[ [ Point{0.4, 0.5}, Point{0.6, 0.5}, Point{0.5, 0.3}, Point{0.5, 0.7} ], 
  [ Point{0.1, 0.1}, Point{0.1, 0.9}, Point{0.9, 0.1}, Point{0.9, 0.9} ] ]

I tried a very un-Rust-like way of building the vectors:

let mut point_set: Vec<Point> = Vec::new();
let mut test_sets: Vec<Vec<Point>> = Vec::new();
for set in line_sets {
    if set.len() == 1 {
        if point_set.len() > 0 {
            test_sets.push(point_set);
            let mut point_set: Vec<Point> = Vec::new();
        }
    } else {
        point_set.push(Point{x: set[0], y: set[1]})
    }
}

But of course I get errors on point_set, since it has already been moved into test_sets. I’m on the verge of really understanding iterator chaining and closures, and this is a complex case for me.
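
The error in the loop above comes from test_sets.push(point_set) moving point_set, while the let mut point_set inside the if only declares a new, shadowed variable that disappears at the closing brace. A minimal sketch of one way around it, assuming std::mem::take (which leaves an empty Vec behind and hands back the old one) and a hypothetical helper name group_points that is not part of the original code. The last set also needs to be pushed after the loop, since no count line follows it:

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Hypothetical helper: groups parsed lines into point sets.
// A one-element line is a count header; a two-element line is a point.
fn group_points(line_sets: Vec<Vec<f64>>) -> Vec<Vec<Point>> {
    let mut point_set: Vec<Point> = Vec::new();
    let mut test_sets: Vec<Vec<Point>> = Vec::new();
    for set in line_sets {
        if set.len() == 1 {
            if !point_set.is_empty() {
                // Move the finished set out and leave an empty Vec behind,
                // so point_set stays usable on the next iteration.
                test_sets.push(std::mem::take(&mut point_set));
            }
        } else if set.len() == 2 {
            point_set.push(Point { x: set[0], y: set[1] });
        }
        // Lines with any other number of fields are ignored in this sketch.
    }
    // The last set has no header after it, so flush it here.
    if !point_set.is_empty() {
        test_sets.push(point_set);
    }
    test_sets
}

fn main() {
    let lines = vec![
        vec![2.0],
        vec![0.4, 0.5],
        vec![0.6, 0.5],
        vec![1.0],
        vec![0.1, 0.9],
    ];
    println!("{:?}", group_points(lines));
}
```

The output of get_test_data could be passed straight into this function.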

Files here: https://github.com/sacherjj/dailyprogrammer/tree/master/20170630_challenge_321_hard_circle_splitter/circle_spliter


#2

If you’re not going to write a full-blown parser, the FromStr trait is usually a good place to start.

Quick and dirty:

use std::str::FromStr;
use std::io::Read;

#[derive(Debug)]
struct Point {
    x: f64,
    y: f64,
}

impl FromStr for Point {
    type Err = ();

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut coords = s.split(' ');
        let x = coords.next().unwrap().parse().unwrap();
        let y = coords.next().unwrap().parse().unwrap();
        match coords.next() {
            None    => Ok(Point { x, y }),
            Some(_) => Err(()),
        }
    }
}

fn main() {
    let buf = {
        let mut buf = String::new();
        std::fs::File::open("data").unwrap()
            .read_to_string(&mut buf).unwrap();
        buf
    };
    let mut lines = buf.lines();
    let mut point_sets: Vec<Vec<Point>> = Vec::new();
    while let Some(line) = lines.next() {
        let n = line.parse::<u32>().unwrap();
        let points = (0..n).map(|_| lines.next().expect("Point is missing").parse().unwrap()).collect();
        point_sets.push(points);
    }
    println!("{:?}", point_sets);
}
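
The unwrap calls in from_str above panic on a malformed line instead of reporting it through Err. A minimal sketch of a non-panicking variant, assuming a hypothetical error enum ParsePointError (introduced here for illustration, not part of the post above) and split_whitespace in place of split(' ') so that repeated spaces are tolerated:

```rust
use std::num::ParseFloatError;
use std::str::FromStr;

#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Hypothetical error type, introduced only for this sketch.
#[derive(Debug, PartialEq)]
enum ParsePointError {
    // The line did not contain exactly two fields.
    WrongFieldCount,
    // A field was present but was not a valid float.
    BadNumber(ParseFloatError),
}

impl FromStr for Point {
    type Err = ParsePointError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut coords = s.split_whitespace();
        let x = coords.next().ok_or(ParsePointError::WrongFieldCount)?;
        let y = coords.next().ok_or(ParsePointError::WrongFieldCount)?;
        if coords.next().is_some() {
            return Err(ParsePointError::WrongFieldCount);
        }
        Ok(Point {
            x: x.parse::<f64>().map_err(ParsePointError::BadNumber)?,
            y: y.parse::<f64>().map_err(ParsePointError::BadNumber)?,
        })
    }
}

fn main() {
    println!("{:?}", "0.4 0.5".parse::<Point>());
    println!("{:?}", "0.4".parse::<Point>());
}
```

The caller can then decide whether to unwrap, skip the line, or report the error.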

#3

I didn’t even think of pushing that back on the impl of Point. It makes total sense. I’ll walk a little down that path.

Thanks!


#4

Does the file get closed in this block of code, due to leaving scope?

let buf = {
    let mut buf = String::new();
    std::fs::File::open("data").unwrap()
        .read_to_string(&mut buf).unwrap();
    buf
};

#5

Yup, sure does: https://doc.rust-lang.org/std/fs/struct.File.html
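
A File gives no visible signal when it closes, so here is a minimal sketch of the drop timing using a hypothetical stand-in type Noisy (an assumption for illustration, not a real file handle). The File in the block above is an unnamed temporary, so like the temporary Noisy here it is dropped at the end of the statement that created it, which is slightly before the block itself ends:

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a file handle: records when it is dropped.
struct Noisy<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Noisy<'a> {
    fn open(log: &'a RefCell<Vec<&'static str>>) -> Self {
        Noisy { log }
    }

    fn read_into(&mut self, buf: &mut String) {
        buf.push_str("data");
        self.log.borrow_mut().push("read");
    }
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("dropped");
    }
}

// Mirrors the shape of the block above and returns the order of events.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let buf = {
        let mut buf = String::new();
        // The Noisy value is a temporary: it is dropped at this semicolon.
        Noisy::open(&log).read_into(&mut buf);
        log.borrow_mut().push("after statement");
        buf
    };
    log.borrow_mut().push("after block");
    assert_eq!(buf, "data");
    log.into_inner()
}

fn main() {
    // Prints ["read", "dropped", "after statement", "after block"]
    println!("{:?}", drop_order());
}
```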

As a point of comparison, I’d have looked at that input file and done something like (fake code, may have dumb errors):

let empty: Vec<Vec<Point>> = vec![];
let points: Vec<Vec<Point>> = buf.lines().fold(empty, |mut list, line| {
    let tokens = line.split(' ').collect::<Vec<_>>();
    if tokens.len() == 1 {
        // A single item means we're starting a new list of Points
        list.push(vec![]);
    } else {
        // Parse the first two tokens as coordinates (panics on malformed input)
        let point = Point {
            x: tokens[0].parse().unwrap(),
            y: tokens[1].parse().unwrap(),
        };
        // Panics if we didn't push an empty vec at the start of the list of Points
        list.last_mut().unwrap().push(point);
    }
    list
});

which dances around the move error you encountered on your first attempt, because the temp vec you push each list of points into is already inside its parent vec.


#6

Yeah, the file is closed and buf becomes immutable.