Load to string with known length without reallocating in safe code

I ran into a case where, using only safe code, I could not load a string from a Read into an already-allocated String without allocating an intermediate buffer. I want to make sure I'm not missing something, because I see no reason for this functionality not to exist.

Specifically, I am curious whether something like read_to_string2 below can currently be implemented. It would need something like String#into_vec(), which repurposes the string's buffer as a Vec<u8>. Does such functionality exist somewhere that eludes me?

fn read_to_string1(reader: &mut io::Read, s: &mut String, len: usize) -> Result<(), Error> {
    let mut bytes = vec![0; len];  // 1 guaranteed allocation
    try!(reader.read_exact(&mut bytes[..]));
    *s = try!(String::from_utf8(bytes));
    Ok(())
}

fn read_to_string2(reader: &mut io::Read, s: &mut String, len: usize) -> Result<(), Error> {
    let mut bytes = unsafe{s.as_mut_vec()}.clone(); // just to get it to compile
    //let mut bytes = s.into_vec();  // QUESTION: does this functionality exist anywhere?
    bytes.resize(len, 0);
    try!(reader.read_exact(&mut bytes[..]));
    *s = try!(String::from_utf8(bytes));
    Ok(())
}

(In this context, avoiding a "moving out of borrowed context" error, with s: &mut String, would either require String#into_vec() to not consume the string or the method above to swap the string out with a local temporary.)

This is not performance that I strictly need right now. Just a curiosity.

This works:

#![feature(read_exact)]

use std::string;
use std::io;

#[derive(Debug)]
struct Error;

impl From<string::FromUtf8Error> for Error {
    fn from(_e: string::FromUtf8Error) -> Error { Error }
}

impl From<io::Error> for Error {
    fn from(_e: io::Error) -> Error { Error }
}

fn read_to_string2(reader: &mut io::Read, s: &mut String, len: usize) -> Result<(), Error> {
    let mut bytes = std::mem::replace(s, String::new()).into_bytes();
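    // Always resize to exactly `len`: read_exact fills the whole slice, so a
    // longer leftover buffer would otherwise consume more than `len` bytes.
    bytes.resize(len, 0);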
    try!(reader.read_exact(&mut bytes[..]));
    *s = try!(String::from_utf8(bytes));
    Ok(())
}

fn main() {
    let mut buf = io::Cursor::new(vec![104u8, 101, 108, 108, 111]); // "hello"
    let mut s = String::new();
    println!("2: {:?}", read_to_string2(&mut buf, &mut s, 5));
    println!("String: {}", s);
}

Also, String#into_vec() doesn't mean anything in Rust; methods are written with ::, and I suspect the one you want is String::into_bytes.

Thanks! That was the function. And I just noticed Into<Vec<u8>> as well -- I could have sworn I looked for it, but I must have been looking somewhere else.
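For anyone else landing here, a small sketch of why this avoids a copy: String::into_bytes and String::from_utf8 both consume their argument and hand the same heap buffer across. The helper name round_trip is made up for illustration; it only compares the pointer and capacity before and after.

```rust
/// Round-trips a String through Vec<u8> and back.
/// Returns (same heap pointer, same capacity, resulting string).
/// `round_trip` is a hypothetical helper, not a std API.
fn round_trip(s: String) -> (bool, bool, String) {
    let ptr_before = s.as_ptr();
    let cap_before = s.capacity();

    // Consumes the String and yields its buffer as a Vec<u8>.
    // `let bytes: Vec<u8> = s.into();` goes through Into<Vec<u8>> instead.
    let bytes: Vec<u8> = s.into_bytes();

    // Validates the bytes as UTF-8 and takes over the Vec's buffer.
    let back = String::from_utf8(bytes).expect("bytes came from a valid String");

    (back.as_ptr() == ptr_before, back.capacity() == cap_before, back)
}
```

Calling it with a String that already holds "hello" in a larger allocation should report that both the pointer and the capacity survived the round trip.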

The regular read_to_string should work if you do something like this:

(&mut file).take(n).read_to_string(&mut s)

I had actually looked into that but discarded the option because take consumes its argument and I needed to be able to read more data. But I see you consumed &'a mut R instead which also has an impl for Read. Thanks, this is awesome and much cleaner!

That's one of the key things I really need to get used to in the Rust docs: I often overlook the impls on references.
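To spell out the trick with the impl on references, here is a minimal sketch in current Rust (so ? instead of try!). The names read_prefix and split_at_byte are invented for the example. Because &mut R is itself a Read, take consumes only the borrow and the underlying reader can be used again afterwards.

```rust
use std::io::{self, Cursor, Read};

/// Appends up to `n` bytes of UTF-8 text from `reader` to `out`.
/// Hypothetical helper: the reader is only borrowed, never consumed.
fn read_prefix<R: Read>(reader: &mut R, n: u64, out: &mut String) -> io::Result<usize> {
    // `by_ref()` yields `&mut R`, and `take` then wraps that reference,
    // not the reader itself.
    reader.by_ref().take(n).read_to_string(out)
}

/// Reads the first `n` bytes and then the remainder from the same reader.
fn split_at_byte(data: &[u8], n: u64) -> io::Result<(String, String)> {
    let mut cursor = Cursor::new(data);
    let mut head = String::new();
    let mut tail = String::new();

    read_prefix(&mut cursor, n, &mut head)?;
    // The cursor is still ours, positioned just past the prefix.
    cursor.read_to_string(&mut tail)?;

    Ok((head, tail))
}
```

Writing (&mut cursor).take(n) directly, as in the reply above, does the same thing as by_ref() here.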

A couple of things to note about using take, though:

  • take takes in a u64 instead of a usize which is a little strange.
  • take appends to the string, so that the string has to be cleared out first.

So, within read_to_string2, it needs:

s.clear();
s.reserve_exact(len);
try!(reader.take(len as u64).read_to_string(s));

(reader and s are already mutable references).
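Putting those three lines into a complete function, roughly, in current Rust. The name read_to_string_n is made up, and the short-read check at the end is my own addition rather than something from this thread: take stops quietly at end of input, whereas read_exact reports an error, so the check restores that behaviour.

```rust
use std::io::{self, Read};

/// Replaces the contents of `s` with exactly `len` bytes of UTF-8 read from
/// `reader`, reusing the allocation `s` already owns where possible.
/// Hypothetical helper sketched for this thread, not a std API.
fn read_to_string_n<R: Read + ?Sized>(
    reader: &mut R,
    s: &mut String,
    len: usize,
) -> io::Result<()> {
    s.clear(); // read_to_string appends, so start from an empty string
    s.reserve_exact(len); // make sure the bytes fit without growing mid-read

    // `&mut R` implements Read, so `take` wraps the reference only.
    let n = reader.take(len as u64).read_to_string(s)?;

    // Assumption: a short read should be an error, as with read_exact.
    if n != len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "fewer bytes available than requested",
        ));
    }
    Ok(())
}
```

If the len bytes are not valid UTF-8 (for example because they end in the middle of a multi-byte character), read_to_string returns an InvalidData error instead.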

I/O uses u64 for sizes so that it's not limited to the platform's usize. That way you can handle large files on 32-bit platforms, for example. (Of course not by slurping them into a string in one go, but you can read and seek parts.)
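As a rough illustration of "read and seek parts": both the offset and the length below are u64, so the same code would address positions past 4 GiB even where usize is 32 bits. The function name read_window is invented for this sketch.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Reads up to `len` bytes starting at absolute byte `offset`.
/// Hypothetical helper; only the returned window is held in memory.
fn read_window<R: Read + Seek>(src: &mut R, offset: u64, len: u64) -> io::Result<Vec<u8>> {
    // Seek positions are u64 regardless of the platform's pointer width.
    src.seek(SeekFrom::Start(offset))?;

    let mut out = Vec::new();
    // Limit the read to `len` bytes without consuming `src`.
    src.by_ref().take(len).read_to_end(&mut out)?;
    Ok(out)
}
```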

Don't forget to reserve space in the String, so that it doesn't need to reallocate during growth.

Thanks, updated the code above to reserve space.

The first solution may be better if you must not reallocate or overallocate? read_to_string will only use the string's own allocation, but it could possibly grow it more than what's needed.

Edit: OK, I looked up the implementation again. It will not reallocate or grow the string until the remaining capacity is 0, so it's no worry.
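If you want to check this yourself rather than read the implementation, something like the following should do. It is only a sketch: read_into_reserved is a made-up helper, and it deliberately reserves more room than it reads, since what happens when the capacity is an exact fit is an implementation detail of the standard library that I would not rely on.

```rust
use std::io::{Cursor, Read};

/// Reads `len` bytes from `data` into a String created with `capacity`.
/// Returns (pointer unchanged, capacity before, capacity after, contents).
/// Hypothetical helper for observing whether the buffer was reallocated.
fn read_into_reserved(data: &[u8], len: usize, capacity: usize) -> (bool, usize, usize, String) {
    let mut s = String::with_capacity(capacity);
    let ptr_before = s.as_ptr();
    let cap_before = s.capacity();

    let mut reader = Cursor::new(data);
    reader
        .by_ref()
        .take(len as u64)
        .read_to_string(&mut s)
        .expect("input should be valid UTF-8");

    (s.as_ptr() == ptr_before, cap_before, s.capacity(), s)
}
```

With 64 bytes reserved and 5 bytes read, the pointer and capacity should come back unchanged.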