Difference between String::new() and String::with_capacity(x)


#1

Hello everyone,

I’m new to Rust and would like to know the benefit of using String::with_capacity(x).

For example, let’s assume that I want the user to enter a word that is 4 chars long.

First option:

use std::io;

let mut word = String::new();
io::stdin().read_line(&mut word).expect("Failed while reading user input");
word.pop(); // Remove the trailing newline

Second option:

let mut word = String::with_capacity(4);
io::stdin().read_line(&mut word).expect("Failed while reading user input");
word.pop(); // Remove the trailing newline

In both cases the user can still enter a word longer than 4 chars, because a String grows like a Vec. So what is the benefit?


#2

It’s to avoid extra dynamic reallocations as the String’s internal heap storage grows. If you know the size of the string in advance, or have a good guess, you get one allocation up front and that’s it.


#3

Hi vitalyd!

Thanks for your answer.
So you mean that it avoids extra allocations, but how can I make the String only 4 bytes long? For example, if the user input is "unixy", I want the stored value to always be 4 characters long: "unix".


#4

Check out this playpen: https://play.rust-lang.org/?gist=3dac2754ef2399c8060ea3f8e4b9bad3&version=stable

Each time you see the capacity increase, the string had to allocate more memory and copy the old data over to the new buffer.

The second one doesn’t do this at all, while the first does this twice.
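
In case the playpen link stops working, here is a minimal sketch of the same idea (not the original playpen code). The helper name count_capacity_changes is made up for illustration, and a capacity change is only a proxy for a reallocation. How often a String::new() grows depends on the standard library's growth strategy, so no exact count is claimed for it.

```rust
/// Appends `n` ASCII chars to `s` and counts how often its capacity changed.
/// A capacity change is used here as a proxy for a reallocation.
fn count_capacity_changes(mut s: String, n: usize) -> usize {
    let mut changes = 0;
    let mut last = s.capacity();
    for _ in 0..n {
        s.push('x');
        if s.capacity() != last {
            changes += 1;
            last = s.capacity();
        }
    }
    changes
}

fn main() {
    // Starts with no heap buffer, so it has to grow as chars are pushed.
    let grown = count_capacity_changes(String::new(), 100);
    // Starts with room for at least 100 bytes, so pushing 100 ASCII chars
    // never needs a bigger buffer.
    let preallocated = count_capacity_changes(String::with_capacity(100), 100);
    println!("String::new(): {} capacity changes", grown);
    println!("String::with_capacity(100): {} capacity changes", preallocated);
}
```

The preallocated string reports zero changes; the other one reports at least one.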


#5

Basically, String::with_capacity() is used as an optimization for cases where you will be repeatedly adding elements to a String. It allows a single allocation for the backing store to be made rather than having it be repeatedly reallocated to increasingly larger sizes as more and more elements are appended to the string. It does not act as a restriction or upper bound on how many elements a String is allowed to contain.

In other words, the function does not suit your purposes and you will need a different design.
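
To make the "not an upper bound" point concrete, here is a small sketch. The helper name fill is hypothetical; it just reports the length and capacity after pushing more data than was reserved.

```rust
/// Creates a String with the requested capacity, appends `input`,
/// and returns the resulting (length in bytes, capacity in bytes).
fn fill(capacity: usize, input: &str) -> (usize, usize) {
    let mut word = String::with_capacity(capacity);
    // push_str grows the buffer if `input` does not fit; it never truncates.
    word.push_str(input);
    (word.len(), word.capacity())
}

fn main() {
    let (len, cap) = fill(4, "unixy");
    // The length is 5 even though only 4 bytes were reserved; the capacity
    // has grown to at least 5 (the exact value is up to the implementation).
    println!("len = {}, capacity = {}", len, cap);
}
```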

If you want to read precisely 4 bytes from stdin, maybe you could do something like so:

use std::io::{self, Read}; // read_exact comes from the Read trait

let mut raw_input: Vec<u8> = vec![0; 4];
io::stdin().read_exact(&mut raw_input).expect("Failed to read user input");
let parsed_input = String::from_utf8(raw_input).expect("Failed to parse user input");

Note that the conversion from the raw input to “parsed” input can fail because, among other things, nothing keeps the user from entering multi-byte UTF-8 sequences that would be cut in the middle by the 4-byte buffer limit. A more robust solution might take the user’s input and properly shorten it to 4 Unicode chars rather than assuming ASCII.
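
A possible sketch of that more robust approach, assuming the whole line has already been read into a String (for instance with read_line). The helper name first_chars is made up for illustration. Note that a Rust char is a Unicode scalar value, so this still does not handle grapheme clusters such as combined emoji.

```rust
/// Strips a trailing "\n" or "\r\n" and keeps at most `max_chars` chars.
/// Works on char boundaries, so multi-byte UTF-8 sequences are never split.
fn first_chars(input: &str, max_chars: usize) -> String {
    input
        .trim_end_matches(|c: char| c == '\r' || c == '\n')
        .chars()
        .take(max_chars)
        .collect()
}

fn main() {
    // In a real program `line` would come from io::stdin().read_line(...).
    let line = "unixy\n";
    let word = first_chars(line, 4);
    println!("{}", word);
}
```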


#6

Or use an array at this point - no need for a Vec at all at this size.


#7

I used a Vec because String::from_utf8 will consume the Vec and take ownership of the backing store instead of creating a new heap allocation.


#8

Yeah - I meant using str::from_utf8() afterwards, if it doesn’t need to be an owned String in the first place. But I’m jumping ahead with some assumptions :)
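
For what it's worth, a rough sketch of the array variant, assuming a borrowed &str is enough. The helper name read_four_bytes is hypothetical, and the example feeds it an in-memory byte slice as a stand-in; io::stdin() could be passed instead since it also implements Read.

```rust
use std::io::{self, Read};

/// Reads exactly 4 bytes from `reader` into a stack-allocated array.
/// Fails if the reader runs out of data before 4 bytes were read.
fn read_four_bytes<R: Read>(mut reader: R) -> io::Result<[u8; 4]> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // In-memory stand-in for user input; pass io::stdin() to read from the terminal.
    let buf = read_four_bytes(&b"unixy"[..])?;
    // str::from_utf8 borrows the array and validates it without allocating.
    match std::str::from_utf8(&buf) {
        Ok(word) => println!("read: {}", word),
        Err(e) => eprintln!("input was not valid UTF-8: {}", e),
    }
    Ok(())
}
```

The same caveat applies as with the Vec version: the 4-byte cut can land in the middle of a multi-byte UTF-8 sequence, in which case from_utf8 returns an error.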