Reading integers from a file into a vector


#1

I have a file containing 1800+ integers, each on its own line.
I want to know how to find the exact number of lines in the file, set up a loop accordingly,
and load each number into a vector as i64. I tried searching the internet for file I/O with Rust, but everything I found was about strings, and I don't want to convert.


#2

I’m assuming this is a text file since you mentioned that “each integer is in different line”.

You don’t need to know how many lines there are. Instead, just iterate over the lines and parse an i64 out of each one. Here are a couple of versions to get you started, with the first being more “functional”:

use std::fs::File;
use std::io::{BufRead, BufReader, Error, ErrorKind, Read};

fn read<R: Read>(io: R) -> Result<Vec<i64>, Error> {
    let br = BufReader::new(io);
    br.lines()
        .map(|line| line.and_then(|v| v.parse().map_err(|e| Error::new(ErrorKind::InvalidData, e))))
        .collect()
}

fn read2<R: Read>(io: R) -> Result<Vec<i64>, Error> {
    let br = BufReader::new(io);
    let mut v = vec![];
    for line in br.lines() {
        v.push(line?
            .trim()
            .parse()
            .map_err(|e| Error::new(ErrorKind::InvalidData, e))?);
    }
    Ok(v)
}

fn main() -> Result<(), Error> {
    let vec = read(File::open("/some/path/to/file")?)?;
    // use `vec` for whatever
    Ok(())
}

#3

Small addition to @vitalyd’s answer, if for some reason you are looking for additional performance, you can:

  • if you know an approximate number of entries in your file use Vec::with_capacity(n) to prevent needless allocations
  • use memmap crate to optimize file reading (i.e. create Mmap and wrap it into Cursor)
  • iterate over bytes directly and instead of full UTF-8 check use cheaper '0'..='9' inclusion check

So your code can look somewhat like this:

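// placeholder error type (an assumption, it is not defined elsewhere in this post); substitute your own
#[derive(Debug)]
pub struct MyError;

// assumes `data` holds only ASCII digits, which `read_ints` checks before calling
fn parse(data: &[u8]) -> u64 {
    data.iter().fold(0, |a, b| 10*a + (b - b'0') as u64)
}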

pub fn read_ints(mmap_buf: &[u8], size_hint: usize) -> Result<Vec<u64>, MyError> {
    let mut ints = Vec::<u64>::with_capacity(size_hint);
    let mut j = 0;
    // we assume the file ends with a newline and does not use `\r`
    for (i, byte) in mmap_buf.iter().enumerate() {
        match byte {
            b'\n' => {
                ints.push(parse(&mmap_buf[j..i]));
                j = i+1;
            },
            b'0'..=b'9' => (),
            _ => Err(MyError)?,
        }
    }
    Ok(ints)
}

To simplify the code it’s written for u64, not i64, but it can be extended to signed integers as well.
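
As a rough sketch of that extension (not benchmarked, and working on a plain byte slice rather than a real memory map): the names `parse_signed` and `read_signed_ints` are made up for this example, and `MyError` is a placeholder error type. It accepts an optional leading `-` and rejects values that overflow, which also means `i64::MIN` itself is rejected.

```rust
// Placeholder error type, hypothetical; substitute your own.
#[derive(Debug, PartialEq)]
pub struct MyError;

// Parse ASCII digits with an optional leading '-'.
fn parse_signed(data: &[u8]) -> Result<i64, MyError> {
    let (negative, digits) = match data.split_first() {
        Some((&b'-', rest)) => (true, rest),
        _ => (false, data),
    };
    if digits.is_empty() {
        return Err(MyError);
    }
    let mut n: i64 = 0;
    for &b in digits {
        if !b.is_ascii_digit() {
            return Err(MyError);
        }
        let digit = i64::from(b - b'0');
        // checked arithmetic: overflow becomes an error instead of a panic or wraparound
        n = n
            .checked_mul(10)
            .and_then(|m| m.checked_add(digit))
            .ok_or(MyError)?;
    }
    Ok(if negative { -n } else { n })
}

pub fn read_signed_ints(buf: &[u8], size_hint: usize) -> Result<Vec<i64>, MyError> {
    let mut ints = Vec::with_capacity(size_hint);
    // `split` yields an empty slice after a trailing newline, so empty lines are skipped
    for line in buf.split(|&b| b == b'\n') {
        if line.is_empty() {
            continue;
        }
        ints.push(parse_signed(line)?);
    }
    Ok(ints)
}
```

Unlike the unsigned version, this one skips empty lines instead of treating them as 0, and it still assumes the file does not use `\r`.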


#4

Looking at that code, I feel like giving up on Rust.
Anyway, can I use the C library functions to read the file?

Like making a file pointer and using it in a loop to read a line at a time. I cannot use the code you gave, as I don’t understand anything in it. Additionally, I believe that using fscanf in a loop to load the data will drastically reduce read time (because there are far fewer safety checks and fewer steps). If this is not possible, then please link me to useful documents where I can find out what this syntax means, because apart from the variable declarations I don’t know anything in your code. I am sorry for being such a noob.


#5

Have you read the Book? Knowing C is not enough to start writing Rust, even at a beginner level.

Never mind my previous message, I was mostly showing off. Concentrate on @vitalyd’s solution for now; you can revisit my message when you are more proficient with Rust. I’ll add some comments, which I hope will help:

// function takes path in the form of string slice and returns enum
// which contains vector of integers on success or IO error type, see `std::io::Error`
fn read(path: &str) -> Result<Vec<i64>, io::Error> {
    let file = File::open(path)?; // open file by given path
    // wrap the file in a generic buffered reader; it uses an 8 KB buffer internally
    // by default to reduce the number of syscalls, thus improving performance
    let br = BufReader::new(file);
    // create an empty vector, type of the stored elements will be inferred
    let mut v = Vec::new();
    // br.lines() creates an iterator over lines in the reader
    // see: https://doc.rust-lang.org/std/io/trait.BufRead.html#method.lines
    for line in br.lines() {
        // IO operations can generally return an error; we check whether we got
        // one, in which case we return it as the function result
        let line = line?;
        let n = line   
            .trim() // trim "whitespaces"
            .parse() // call `str::parse::<i64>(&self)` method on the trimmed line, which parses integer
            .map_err(|e| Error::new(ErrorKind::InvalidData, e))?; // parse() can return error (e.g. for string "abc"), here if we got it, we convert it to `std::io::Error` type and return it as function result
        v.push(n); // push acquired integer to the vector
    }
    Ok(v) // everything is Ok, return vector
}

If you still don’t understand what happens here, you’ll have to (re)read the Book.


#6

Thanks bro


#7

this code failed fabulously


#8

Please note that we are not telepaths and cannot read your mind. No details about the “failure”, no help.


#9

Here are a bunch of problems I have:

  1. Why return an enum? (Compound data always creates a lot of problems for me.)
  2. Why are there so many safety and failure checks?
  3. Why did we pass a &str instead of the "String" data type?
  4. What is that "let line = line?;"?
  5. Please list the libraries I need in order to use these functions, and tell me the import syntax with the actual items to import via use [some name here]::[some name here]::... (this is the only part giving me errors).

Here are the error messages:

error[E0433]: failed to resolve. Use of undeclared type or module `io`
 --> src\lib.rs:3:49
  |
3 |     pub fn read(path: &str) -> Result<Vec<i64>, io::Error> {
  |                                                 ^^ Use of undeclared type or module `io`

error[E0433]: failed to resolve. Use of undeclared type or module `File`
 --> src\lib.rs:4:16
  |
4 |     let file = File::open(path)?; // open file by given path
  |                ^^^^ Use of undeclared type or module `File`

error[E0433]: failed to resolve. Use of undeclared type or module `BufReader`
 --> src\lib.rs:7:14
  |
7 |     let br = BufReader::new(file);
  |              ^^^^^^^^^ Use of undeclared type or module `BufReader`

error[E0433]: failed to resolve. Use of undeclared type or module `Error`
  --> src\lib.rs:19:26
   |
19 |             .map_err(|e| Error::new(ErrorKind::InvalidData, e))?; // parse() can return error (e.g. for string "abc"), here if we got it, we convert it to `std::io::Error` type and return it as function result
   |                          ^^^^^ Use of undeclared type or module `Error`

error[E0433]: failed to resolve. Use of undeclared type or module `ErrorKind`
  --> src\lib.rs:19:37
   |
19 |             .map_err(|e| Error::new(ErrorKind::InvalidData, e))?; // parse() can return error (e.g. for string "abc"), here if we got it, we convert it to `std::io::Error` type and return it as function result
   |                                     ^^^^^^^^^ Use of undeclared type or module `ErrorKind`

error: aborting due to 5 previous errors

#10

  1. Because you want to be able to detect errors, and Rust does that with a type called Result.
  2. Because everything can go wrong. The file could have been deleted, you might not be allowed to read it, etc.
  3. Because it’s a reference, and that is cheaper than a full copy (Stackoverflow).
  4. Error handling. line is a Result, and with ? you unpack the result or return early if it is an error.
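
To make the last point concrete, here is a small sketch of what ? does and of how you get the value back out of a Result at the place where you finally need it. The function names (`double_short`, `double_long`, `describe`) are made up for illustration, and the `match` version is only roughly what ? expands to (the real operator can also convert the error type).

```rust
use std::num::ParseIntError;

// With `?`: unpack the Ok value, or return the Err to the caller right away.
fn double_short(s: &str) -> Result<i64, ParseIntError> {
    let n = s.trim().parse::<i64>()?;
    Ok(n * 2)
}

// Roughly the same thing spelled out with `match`.
fn double_long(s: &str) -> Result<i64, ParseIntError> {
    let n = match s.trim().parse::<i64>() {
        Ok(n) => n,
        Err(e) => return Err(e),
    };
    Ok(n * 2)
}

// Where you actually want the value, `match` on the Result and handle both cases.
fn describe(s: &str) -> String {
    match double_short(s) {
        Ok(n) => format!("got {}", n),
        Err(e) => format!("could not parse: {}", e),
    }
}
```

The same pattern applies to a `Result<Vec<i64>, io::Error>`: match on it, use the vector in the `Ok` arm, and report the error in the `Err` arm.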

Please read the book! We gladly will help you, but you have to do something by yourself, e.g. search for something. When in doubt, always look at the documentation and search for your type, e.g. BufReader. You always get examples and everything you need.

Also: C isn’t easier, in fact it’s even harder, because it does not force you to handle the errors right away, whereas rust will do so.
C is ugly in case of error handling (what does an int tell me about errors?!). In Rust you have the Result enum which clearly gives me Ok or Err(or) and I know what went wrong.

File handling is always hard. Go and try something easier first, especially if you are not familiar with the language or with programming at all, e.g. implement the Fibonacci sequence with dynamic programming, or find the maximum subarray in a random array (which will teach you something about implementing an algorithm, using random numbers, and iterating over a vector/slice).
Again: File handling is not that easy, just because it works most of the time :wink:


#11

This response makes me very sad. If you want to use C, why not use C? The whole point of using Rust is to do things the Rust way. The first of @vitalyd's proposals is classic functional Rust, and should probably be considered idiomatic Rust, i.e. it is the Rust way of doing things. So isn’t the correct response to say “I do not understand this code, can you tell me how to learn about how it works?” rather than “I don’t understand this, tell me how to write C in Rust”?

Being a noob is nothing to be ashamed of, we all had to start with each programming language. I’ve been doing Rust for over 8 months now and I still consider myself a beginner learning new things, learning the Rust way of doing things I know how to do in C, C++, D, Kotlin, Java, Groovy, etc.

Please can I get you to change your approach to learning Rust. Clearly you know C, but Rust is not C – even if it is going to replace it :slight_smile: . C has its way of doing things such as reading a list of integers into an array. Rust can do the same thing, but not in the same way. That is the point about different languages: they have different ways of doing the same things. Transliterating code from one language to another is generally the wrong thing to do. You go back to the goal and say “OK, this is how I would do it in idiomatic C, what is the way of doing it in idiomatic Rust?”. That you do not fully understand @vitalyd's code should lead to “I must learn about this code, as it is idiomatic Rust code”.


#12

I did read the Book, and I am going through it a second time. The Book can only teach me the words, but not how to implement things (I am reading the second edition because that’s the recommended one). The Book is just a one- or two-day job. It’s not like I don’t search at all: I searched and ended up in the official documentation, which just confused me even more. I am trying to adapt to the Rust style because I like it. It’s true that C allows everything you want without stopping you (sometimes the result is a hot mess). I left C for that same reason: I was not able to figure out what went wrong with a program, and sometimes it just behaved oddly.

The for loop is so nice in Rust that I don’t always need to specify an endpoint, which reduces the chance of errors.
I can directly port algorithms I wrote in C, since loops and if/else work the same way.

That’s why I jumped directly to file I/O (I am even able to make and use my own crates and modules, which is personally my best reason to switch to Rust).

I read the enum page but got no clue how to extract my vector from the enum, so I tried to bypass the safety checks to eliminate the need to extract it.

I am clueless in Rust because it has more syntax (by number of built-in features) compared to C, where
you have to make most things yourself (both good and bad: bad because it has us reinventing the wheel, good because sometimes we don’t need some features).

Is there any good book on the market you can recommend? The docs are simply not helping me.


#13

I am not using C because it’s harder.
Though I get some extra speed from using C, I don’t need that extra 2% if it means wasting hours on debugging.


#14

I think most of your questions have been already answered, but I’ll add a bit regarding this:

In my post I just commented on a slightly modified version of @vitalyd’s example, in which he had included all the necessary use declarations.
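
For reference, here is a sketch of the commented function together with the use lines it needs at the top of the file. Each of the five E0433 errors corresponds to one name that was not imported: `io`, `File`, `BufReader`, `Error` and `ErrorKind`. `BufRead` has to be imported as well, because the `lines()` method comes from that trait.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Error, ErrorKind};

fn read(path: &str) -> Result<Vec<i64>, io::Error> {
    let file = File::open(path)?; // `File` comes from std::fs
    let br = BufReader::new(file); // `BufReader` comes from std::io
    let mut v = Vec::new();
    // `lines()` is a method of the `BufRead` trait, so the trait must be in scope
    for line in br.lines() {
        let line = line?;
        let n = line
            .trim()
            .parse::<i64>()
            // `Error` and `ErrorKind` also live in std::io
            .map_err(|e| Error::new(ErrorKind::InvalidData, e))?;
        v.push(n);
    }
    Ok(v)
}
```

Everything here is in the standard library, so nothing needs to be added to Cargo.toml.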


#15

Programming Rust by Jim Blandy is a very nice book if you can get your hands on it, but I’ve heard that it is a bit more geared towards people experienced in C++ (I’m in this category so I cannot tell).

I also learned many things by browsing through the resources referenced at https://github.com/ctjhoa/rust-learning .


#16

I found Rust By Example a great way to see how the language and crates library is used, but as with the example here, you need to be at least sufficiently familiar with the syntax to recognise roughly what’s being done, and be prepared to go back and look at the book to clarify where needed.