std::fs::File::read() returns Ok(0) for a non-empty text file

I am trying to create a general-purpose file interaction library in Rust.

Inspired by the Documentation at

I wrote a method that should read a chunk of bytes into a Vec<u8>.

The debug output tells me that the application can access the file and that the file actually has contents (it is the configuration for the application being run).
But std::fs::File::read() returns Ok(0) and the Vec<u8> buffer is empty.

    Configuration file '/path/to/application/config/application.conf':
    Report (Code: '0'): 'rd fl: 'File { fd: 3, path: "/path/to/application/config/application.conf", read: true, write: false }'
    cnk sz: '32768'
    chunk (sz: '0'):
    '[]'
    '

I wonder if there might be an issue with the &mut Option<File>.

Could you please give me any hints on this?

Please don't take offense at this (!).

Your formatting style is... interesting. Please run rustfmt (in the upper right corner of the playground) to format the code according to the community guidelines, which makes it easier for us to read, and then share it with us :)

Rust doesn't use the _private_member convention. Instead, a leading underscore signals that a variable is intentionally unused in its context (it silences the unused-variable warning), so you should not name your variables like that. The same goes for methods (e.g. _init should be init).

Your whole file looks very unrusty. The underscore methods irritate me the most ^^
You should probably take a look at chapter 9 of the book and learn about Result<T, E> to handle errors properly.

This is what the documentation says about Read::read() returning Ok(n):

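Ok(0) can mean one of two things: 1. the reader has reached end of file, or 2. the buffer that was passed in was 0 bytes long. Since the file is not empty, it looks like you are in case 2.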
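To make the two cases concrete, here is a minimal sketch. The helper name read_once is made up for illustration, and a byte slice stands in for the file, since &[u8] implements Read just like File does.

```rust
use std::io::Read;

/// Hypothetical helper: performs a single `read` into a zero-initialized
/// buffer of `buf_len` bytes and reports what `read` returned.
fn read_once(mut reader: impl Read, buf_len: usize) -> std::io::Result<usize> {
    let mut buf = vec![0_u8; buf_len];
    reader.read(&mut buf)
}
```

Calling read_once(&b""[..], 8) is case 1 (nothing left to read), and read_once(&b"Hello"[..], 0) is case 2 (a 0-byte buffer). Both return Ok(0), even though the second reader still holds data.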

Given that you are talking about a Vec, you may have made the following mistake:

use ::std::io;

fn bad_read_chunk (
    mut reader: impl io::Read,
    chunk_size: usize,
) -> io::Result< Vec<u8> >
{
    let mut chunk = Vec::with_capacity(chunk_size); // BUG: capacity is reserved, but len() is still 0
    let mut start = 0;
    loop {
        match reader.read(&mut chunk[start ..]) {
            | Ok(0) => break,
            | Ok(n) => start += n,
            | Err(ref e)
                if e.kind() == io::ErrorKind::Interrupted
            => {}, // continue
            | Err(err) => return Err(err),
        }
    }
    chunk.truncate(start);
    Ok(chunk)
}

Instrumented with dbg!, this gives

[src/main.rs:8] start = 0
[src/main.rs:8] reader.read(&mut chunk[dbg!(start)..]) = Ok(
    0,
)
[src/main.rs:20] bad_read_chunk(&b"Hello, World!"[..], 5) = []

This happens because Vec::with_capacity only reserves memory on the heap. That memory is uninitialized and not part of the Vec's contents: its len() is still 0, exactly as with Vec::new() or vec![], so &mut chunk[start ..] is an empty slice and read() has nowhere to write.

This is easily solved by replacing Vec::with_capacity(chunk_size) with vec![0; chunk_size], which zero-initializes the allocated memory so that it can safely be used as a buffer of size chunk_size:
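A small sketch of the difference, in case it helps. The function name writable_lens is hypothetical; it only compares how many bytes a reader could write into each kind of Vec.

```rust
/// Hypothetical helper: returns the length of the mutable slice you get
/// from a `with_capacity` Vec and from a zero-initialized Vec.
fn writable_lens(chunk_size: usize) -> (usize, usize) {
    let mut reserved: Vec<u8> = Vec::with_capacity(chunk_size);
    let mut zeroed: Vec<u8> = vec![0; chunk_size];

    // The allocation is there, but it is not part of the Vec's contents.
    assert!(reserved.capacity() >= chunk_size);

    // These are the slices that would be handed to `read`.
    let reserved_slice: &mut [u8] = &mut reserved[0..];
    let zeroed_slice: &mut [u8] = &mut zeroed[0..];
    (reserved_slice.len(), zeroed_slice.len())
}
```

For a chunk_size of 5 this returns (0, 5): the with_capacity Vec offers read() a 0-byte slice, the zeroed one offers 5 bytes.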

use ::std::io;

fn read_chunk (
    mut reader: impl io::Read,
    chunk_size: usize,
) -> io::Result< Vec<u8> >
{
    let mut chunk = vec![0; chunk_size];
    let mut start = 0;
    loop {
        match reader.read(&mut chunk[start ..]) {
            | Ok(0) => break,
            | Ok(n) => start += n,
            | Err(ref e)
                if e.kind() == io::ErrorKind::Interrupted
            => {}, // continue
            | Err(err) => return Err(err),
        }
    }
    chunk.truncate(start);
    Ok(chunk)
}
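As a side note, if you would rather keep Vec::with_capacity, one alternative (not what the function above does) is to let the standard library grow the Vec for you by combining Read::take with Read::read_to_end. This is only a sketch; the name read_chunk_take is made up here.

```rust
use std::io::{self, Read};

/// Hypothetical variant: reads at most `chunk_size` bytes by limiting the
/// reader with `take` and appending to the Vec with `read_to_end`.
fn read_chunk_take(reader: &mut impl Read, chunk_size: usize) -> io::Result<Vec<u8>> {
    // Here `with_capacity` is fine: `read_to_end` appends and updates len().
    let mut chunk = Vec::with_capacity(chunk_size);
    reader
        .by_ref() // borrow the reader so it can be used again afterwards
        .take(chunk_size as u64) // stop after `chunk_size` bytes
        .read_to_end(&mut chunk)?; // retries on ErrorKind::Interrupted itself
    Ok(chunk)
}
```

The returned Vec already has the right length, so no truncate call is needed, and an empty Vec means the reader is exhausted.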

You can also let the caller choose between a stack-allocated buffer and a heap-allocated one by having them pass in an "out-buffer" and returning, instead of how much of it was filled, a view (borrow) of the bytes read. This is very similar to how .read() itself works, except that you know that either the buffer is full or the file is exhausted at the end of the call:

use ::std::io;

fn read_chunk (
    mut reader: impl io::Read,
    out_buffer: &'_ mut [u8],
) -> io::Result<&'_ mut [u8]>
{
    let mut start = 0;
    loop {
        match reader.read(&mut out_buffer[start ..]) {
            | Ok(0) => break,
            | Ok(n) => start += n,
            | Err(ref e)
                if e.kind() == io::ErrorKind::Interrupted
            => {}, // continue
            | Err(err) => return Err(err),
        }
    }
    Ok(&mut out_buffer[.. start])
}
  • Stack-allocated (local) buffer usage:

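    // a named local: a `&mut [0; N]` temporary would be dropped at the end of the `let`
    let mut buffer = [0; chunk_size]; // chunk_size must be a constant here
    let chunk: &[u8] = read_chunk(reader, &mut buffer)?;
    // use chunk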
    
  • Heap-allocated (Vec) buffer usage:

    let mut chunk = vec![0; chunk_size];
    let bytes_read = read_chunk(reader, &mut chunk)?.len();
    chunk.truncate(bytes_read);
    // use chunk
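To read a whole file chunk by chunk, the out-buffer version can be called in a loop that reuses a single buffer. The following is a rough sketch: read_chunk is repeated from above so the snippet compiles on its own, and collect_chunks is a hypothetical helper that copies each chunk out only for demonstration.

```rust
use std::io::{self, Read};

// Same logic as the out-buffer `read_chunk` above.
fn read_chunk(mut reader: impl Read, out_buffer: &mut [u8]) -> io::Result<&mut [u8]> {
    let mut start = 0;
    loop {
        match reader.read(&mut out_buffer[start..]) {
            Ok(0) => break,
            Ok(n) => start += n,
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => {} // retry
            Err(err) => return Err(err),
        }
    }
    Ok(&mut out_buffer[..start])
}

/// Hypothetical helper: reads until the reader is exhausted, reusing one
/// buffer, and returns a copy of every chunk.
fn collect_chunks(mut reader: impl Read, chunk_size: usize) -> io::Result<Vec<Vec<u8>>> {
    let mut buffer = vec![0_u8; chunk_size];
    let mut chunks = Vec::new();
    loop {
        // `&mut reader` also implements Read, so the reader is not consumed.
        let chunk = read_chunk(&mut reader, &mut buffer)?;
        if chunk.is_empty() {
            break; // nothing left to read (or chunk_size is 0)
        }
        chunks.push(chunk.to_vec());
    }
    Ok(chunks)
}
```

With the 13-byte input b"Hello, World!" and a chunk_size of 5, this yields chunks of 5, 5 and 3 bytes.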
    

Thank you for your thorough response.
I now understand this issue, which is not explained very clearly in the documentation.

With the fix, the execution produces:

file 'my_text_file.txt': created.
file 'my_text_file.txt': written.
file 'my_text_file.txt': read.
Report (Code: '0'): 'rd fl: 'File { fd: 3, path: "/playground/my_text_file.txt", read: true, write: false }'
cnk sz: '32'
chunk (sz: '32'):
'[49, 46, 32, 108, 105, 110, 101, 32, 116, 101, 120, 116, 32, 102, 105, 108, 101, 32, 99, 111, 110, 116, 101, 110, 116, 46, 10, 50, 46, 32, 108, 105]'
cnk sz: '32'
chunk (sz: '32'):
'[110, 101, 32, 116, 101, 120, 116, 32, 102, 105, 108, 101, 32, 99, 111, 110, 116, 101, 110, 116, 46, 10, 51, 46, 32, 108, 105, 110, 101, 32, 116, 101]'
cnk sz: '32'
chunk (sz: '32'):
'[120, 116, 32, 102, 105, 108, 101, 32, 99, 111, 110, 116, 101, 110, 116, 46, 10, 52, 46, 32, 108, 105, 110, 101, 32, 116, 101, 120, 116, 32, 102, 105]'
cnk sz: '32'
chunk (sz: '32'):
'[108, 101, 32, 99, 111, 110, 116, 101, 110, 116, 46, 10, 53, 46, 32, 108, 105, 110, 101, 32, 116, 101, 120, 116, 32, 102, 105, 108, 101, 32, 99, 111]'
cnk sz: '32'
chunk (sz: '7'):
'[110, 116, 101, 110, 116, 46, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]'
'
spl cntnt:
"1. line text file content.\n2. line text file content.\n3. line text file content.\n4. line text file content.\n5. line text file content.\n"

I could even draw some further improvements from your comment, namely handling io::ErrorKind::Interrupted and calling chunk.truncate(bytes_read).

chunk.truncate() becomes quite important when chunk_size is large and the file is small. A configurable chunk_size, on the other hand, should give performance improvements on big files (> 8192 bytes).
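Here is a small sketch of why the truncate matters, using the 7-byte tail of my test file as input. The function name is made up, a byte slice stands in for the file, and a single read is assumed to return all remaining bytes, which holds for a byte slice but is not guaranteed for every reader.

```rust
use std::io::{self, Read};

/// Hypothetical helper: does one `read` into a zeroed buffer and returns the
/// buffer both as-is and truncated to the number of bytes actually read.
fn read_with_and_without_truncate(
    mut reader: impl Read,
    chunk_size: usize,
) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut chunk = vec![0_u8; chunk_size];
    let bytes_read = reader.read(&mut chunk)?;
    let untruncated = chunk.clone(); // still `chunk_size` long, zero-padded
    chunk.truncate(bytes_read); // only the bytes that were read
    Ok((untruncated, chunk))
}
```

With b"ntent.\n" and a chunk_size of 32, the untruncated Vec keeps 25 trailing zero bytes that were never in the file, while the truncated one holds exactly the 7 bytes read.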
