Convert a string which contains either an integer or a float to a float

Hi,
I am reading lines of a text file which contains integers and floats in one column, something like below.

1234
5678.9

When I read this line by line and call .parse::<f32>().unwrap(), I get a panic:
Err value: ParseFloatError { kind: Invalid }

I know I can solve this with an if/else condition, but I am looking for an elegant/idiomatic/optimised way.

Are you sure the problem is the float parser not wanting integers?

println!("{}", "1234".parse::<f32>().unwrap());

runs fine.

I suspect there's something else throwing off the float parser here. A common one is accidentally including newlines in the string passed into .parse(). Are you hitting that?
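To illustrate the whitespace point, here is a minimal sketch. The helper name `parse_value` is made up for this example. As far as I know, `BufRead::lines()` already strips the line ending, so the stray newline usually shows up when the string comes from `read_line` or a manual split instead.

```rust
use std::num::ParseFloatError;

/// Parse one line as `f32`, ignoring surrounding whitespace such as a
/// trailing newline. `parse_value` is a hypothetical helper name.
fn parse_value(line: &str) -> Result<f32, ParseFloatError> {
    // `str::parse::<f32>` rejects leading/trailing whitespace,
    // so trim before parsing.
    line.trim().parse::<f32>()
}
```

With this, both `parse_value("1234")` and `parse_value("5678.9\n")` return `Ok`, while `"5678.9\n".parse::<f32>()` without the trim returns an error.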

You are right. I just figured out my mistake: I was using the iterator's nth, and I have now realised that it discards the previous elements. I was about to delete this but you were quick. Thanks!
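For anyone hitting the same thing, a small sketch of how `nth` behaves: it consumes every element up to and including the one it returns, so the iterator continues from the element after it. The function name `nth_demo` and the values are only for illustration.

```rust
/// Shows that `Iterator::nth` consumes the elements it skips over.
/// `nth_demo` is a hypothetical name used only for this illustration.
fn nth_demo() -> (Option<i32>, Option<i32>) {
    let mut it = vec![10, 20, 30, 40].into_iter();
    // Consumes 10 and 20, and returns the element at index 1.
    let picked = it.nth(1);
    // The iterator now continues after the element `nth` returned.
    let following = it.next();
    (picked, following)
}
```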

However, I am facing a new challenge now. I want to discard the first 8 and last 8 lines of my file. I have found how to do it for the first 8, but I can't figure it out for the last 8:

let br: Vec<(String,f32,usize,usize)> = BufReader::new(f)
       .lines()
       .enumerate()
       .filter(|(index, _)| index > &8 && index < )
       .map(|(index,line)| {
            //some function
       }).collect();

One more question: I am going to iterate over this Vec in another function. Do you think it is a good idea to collect?

I don't think it's in general possible to discard the last 8 of an iterator - unless it's from a collection with a length, like a Vec, you don't know how many elements are going to be found. So there's no way to know when to stop taking new elements.

One way would be to collect first, and then slice off the first and last elements with vec[8..(vec.len() - 8)] (note that this panics if the Vec has fewer than 16 elements). Or maybe you could make the map function fallible (returning an Option or Result) and cut off those last 8 elements after collecting?
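A minimal sketch of the collect-then-slice idea that avoids panicking on short input. The helper name `trim_ends` is hypothetical, and returning an empty slice for short input is just one possible choice.

```rust
/// Return `items` without its first and last `n` elements.
/// Gives an empty slice when there are fewer than `2 * n` elements.
/// `trim_ends` is a hypothetical helper name.
fn trim_ends<T>(items: &[T], n: usize) -> &[T] {
    // `saturating_sub` avoids underflow when `items.len() < n`.
    let end = items.len().saturating_sub(n);
    // `get` returns `None` instead of panicking when `n > end`.
    items.get(n..end).unwrap_or(&[])
}
```

After collecting into a `Vec`, something like `trim_ends(&vec, 8)` would then give the middle part.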

Another alternative would be to read the file twice - once simply to count the number of lines, and then a second time to read only the first (# of lines) - 8 lines.
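The two-pass idea could look roughly like this. It is a sketch, assuming the source can be rewound (a `File` can, via `Seek`) rather than reopened; `middle_lines` is a hypothetical name.

```rust
use std::io::{self, BufRead, BufReader, Read, Seek, SeekFrom};

/// Read the source twice: once to count lines, once to collect the
/// lines that are neither in the first `n` nor in the last `n`.
/// `middle_lines` is a hypothetical helper name.
fn middle_lines<R: Read + Seek>(mut src: R, n: usize) -> io::Result<Vec<String>> {
    // First pass: only count the lines.
    let total = BufReader::new(&mut src).lines().count();
    // Rewind to the start for the second pass.
    src.seek(SeekFrom::Start(0))?;
    // Number of lines left after dropping `n` from each end.
    let keep = total.saturating_sub(2 * n);
    // Second pass: skip the head, take the middle, stop before the tail.
    BufReader::new(src).lines().skip(n).take(keep).collect()
}
```

For a file this would be called as `middle_lines(File::open(path)?, 8)`. The cost is reading the data twice, but nothing except the kept lines is held in memory.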

If you're in control of the way the data is stored on disk, it might be worth adding in a line containing the number of lines to follow, so you can just take that many rather than taking (total number of lines) - 8.

I'd say it depends on the exact situation. If you're iterating over it exactly once, and it's immediately afterwards, I would say keeping it as an iterator is probably better. If you store it for a while or need to iterate over it multiple times, using a Vec is reasonable.

Thanks. Then I will try to handle it inside the map function; I will make it return an Option or Result.

I will use this only once in the calling function. But the problem is that I am not able to return a Map with a simple signature.
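On the signature problem, one common option is to return `impl Iterator<Item = ...>` instead of naming the concrete `Map<...>` type. A rough sketch, where the function name `parsed_values` is hypothetical and unreadable or unparsable lines are silently skipped (which may not be what you want):

```rust
use std::io::BufRead;

/// Lazily yield every line of `reader` that parses as `f32`.
/// `parsed_values` is a hypothetical name; the `impl Iterator` return
/// type hides the concrete `FilterMap<...>` adapter types.
fn parsed_values<R: BufRead>(reader: R) -> impl Iterator<Item = f32> {
    reader
        .lines()
        // Drop lines that could not be read.
        .filter_map(|line| line.ok())
        // Drop lines that do not parse as a float.
        .filter_map(|line| line.trim().parse::<f32>().ok())
}
```

The caller can then use it once without collecting, for example `parsed_values(BufReader::new(f)).for_each(|v| ...)`.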

No problem!

Thinking about the problem more, I think there might be one other possible solution: queuing up to 8 items, so that you don't process an item until you know there are at least 8 more.

It'd mean a custom iterator to do that, but I think that should be possible. How does this look?

use std::collections::VecDeque;

/// Iterator which returns all but the last `n` items of an inner iterator
pub struct TakeUntilLastN<T: Iterator> {
    /// The inner iterator
    iter: T,
    // A queue to store up to `n + 1` items taken from the underlying iterator
    queued: VecDeque<T::Item>,
    // How many items to store before returning one.
    n: usize,
}

impl<T: Iterator> Iterator for TakeUntilLastN<T> {
    type Item = T::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // use '?' to return None if self.iter.next() is None
        // we don't want to return anything once the underlying
        // iterator is done
        let mut next = self.iter.next()?;
        // keep polling until we have `n` items queued
        while self.queued.len() < self.n {
            self.queued.push_front(next);
            next = self.iter.next()?;
        }
        
        // queue the new item before popping, so this also works when
        // `n` is 0 (the queue briefly holds `n + 1` items)
        self.queued.push_front(next);
        // return the oldest queued item; the queue is never empty here
        self.queued.pop_back()
    }
}

impl<T: Iterator> TakeUntilLastN<T> {
    fn new(iter: T, n: usize) -> Self {
        TakeUntilLastN {
            iter,
            n,
            queued: VecDeque::with_capacity(n),
        }
    }
}

// declare ext trait so we can use a method rather than a function
pub trait IteratorExt: Iterator + Sized {
    fn take_until_last_n(self, n: usize) -> TakeUntilLastN<Self> {
        TakeUntilLastN::new(self, n)
    }
}
impl<T: Iterator> IteratorExt for T {}


let br: Vec<(String,f32,usize,usize)> = BufReader::new(f)
   .lines()
   .enumerate()
   .skip(8)
   .take_until_last_n(8)
   .map(|(index,line)| {
        //some function
   }).collect();

That may or may not be better than the other solutions. Hope it helps!

Edit: edited the iterator implementation, the first bit of code I posted was incorrect.

Thanks, I will look into it. For me it is also simple to use Option or Result and filter in the calling function. But when I started, as usual, the Rust compiler complained.

I tried Option, but I don't know how to use filter_map() when the Option is None.
Then I tried Result: I returned Ok() for the lines between 8 and last-8, and Err for the others. However, Rust complains that I need to specify the second type parameter for Result, and I don't want to create a new error type for this.

I think using filter_map instead of map should just work, and then it won't pass Nones through. One thing to be careful of, though, is keeping errors if you really do need them.

One good general error type is Box<dyn std::error::Error> - any error type that implements std::error::Error can be converted into it (the ? operator does this automatically), and it works if you never need to programmatically get at what kind of error it is.
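A small sketch of what that can look like without defining a custom error type. The `name, value` line format and the function name `parse_record` are made up for illustration, not taken from the actual file.

```rust
use std::error::Error;

/// Parse a hypothetical "name, value" line. Both the string error and
/// the `ParseFloatError` are converted into `Box<dyn Error>` by `?`.
fn parse_record(line: &str) -> Result<(String, f32), Box<dyn Error>> {
    // A plain `&str` can serve as an ad-hoc error message.
    let (name, value) = line.split_once(',').ok_or("missing comma")?;
    // `ParseFloatError` implements `Error`, so `?` boxes it.
    let value: f32 = value.trim().parse()?;
    Ok((name.trim().to_string(), value))
}
```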

Thanks for the Result approach. That should work for me. Out of curiosity - I use for_each() in the calling function, so I can't replace that with filter_map(). How can this be done?

The following would skip all the Nones:

iter
    .filter_map(|opt| opt)
    .for_each(|val| ...);
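As a compilable variant of the same idea, `Iterator::flatten` also drops the `None`s, since `Option` can be iterated as zero or one items. The function `sum_present` is a made-up example.

```rust
/// Sum only the values that are present, skipping every `None`.
/// `sum_present` is a hypothetical example function.
fn sum_present(items: Vec<Option<f32>>) -> f32 {
    let mut total = 0.0;
    items
        .into_iter()
        // Same effect as `.filter_map(|opt| opt)`.
        .flatten()
        .for_each(|val| total += val);
    total
}
```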

Thanks!! That also works like a charm.