Why does iterating through each char of a String take so much more time than in Java or other languages?

I'm trying to create a parser for a large XML file (about 300MB), and I ran the following simple example in my environment (Rust version 1.40.0):

use std::fs;
use std::time::Instant;

fn main() {
    let filename = "Large.xml"; // The file size is about 326MB

    let now = Instant::now();
    let contents = fs::read_to_string(filename).unwrap();
    for _c in contents.chars() {
        // do nothing here
    }
    println!("{}ms", now.elapsed().as_millis());
}

The test prints: 17,544ms, which is much slower than Java or Dart.
With the same example plus some logic inside the for loop, Java takes only 7 seconds while creating 50M objects there, whereas the Rust loop above doesn't even do anything.

Is there any mistake in the example above?

Are you compiling the Rust code with optimizations enabled, i.e. running it with cargo run --release?

Unoptimized Rust code is slow, since it's, well, unoptimized. Without --release, rustc doesn't try to generate fast code.

There are other reasons this might be slow, but I want to get that out of the way first - that's a huge slowdown. And I'd honestly expect rustc to completely optimize out that loop.

Edit: added more detail. Apologies for the serial edits.


Yes, that boosts performance a lot, but with the same logic in the loop it still takes 11 seconds, which is still slower than Java (7 sec.).

Here is the equivalent of the Java code:

let mut s = String::new();
let mut foo: Foo;
for (index, chars) in contents.chars().enumerate() {
    s.push(chars);
    if index % 6 == 5 {
        foo = Foo { value: s };
        s = String::new();
    }
}

The code allocates memory each time you call String::new(), and again as s.push(chars) grows the string.
I think Java strings are actually a view over backing owned memory (I haven't used Java for a while, so my knowledge is rusty).

Something similar in Rust could be:

let mut s = String::with_capacity(1024);
let mut foo: Foo;
for (index, chars) in contents.chars().enumerate() {
    s.push(chars);
    if index % 6 == 5 {
        foo = Foo { value: s.clone() };
        s.clear(); // Empty the buffer, but keep the memory.
    }
}

A further improvement would be to see if Foo can simply hold a &str, and not actually need to clone the s in that loop:

struct Foo<'a> {
    value: &'a str,
}

That would need a bigger change to the algorithm (figure out the range of characters for the str).
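As a sketch of that bigger change (this is just an illustration, not the poster's code; the function name and the 6-char chunking are carried over from the example above): track byte offsets with char_indices and slice the original buffer, so no per-chunk String is ever allocated:

```rust
// Foo borrows a slice of the original buffer instead of owning a String.
struct Foo<'a> {
    value: &'a str,
}

// Hypothetical helper: split `contents` into 6-char chunks, zero copies.
fn split_into_foos(contents: &str) -> Vec<Foo<'_>> {
    let mut foos = Vec::new();
    let mut start = 0; // byte offset where the current chunk begins
    for (count, (byte_idx, c)) in contents.char_indices().enumerate() {
        if count % 6 == 5 {
            // End of a 6-char chunk: slice the original buffer directly.
            let end = byte_idx + c.len_utf8();
            foos.push(Foo { value: &contents[start..end] });
            start = end;
        }
    }
    foos
}

fn main() {
    let contents = "abcdefghijklmnop";
    let foos = split_into_foos(contents);
    assert_eq!(foos[0].value, "abcdef");
    assert_eq!(foos[1].value, "ghijkl");
    println!("{} chunks", foos.len()); // prints "2 chunks"
}
```

Note char_indices yields byte offsets, which is what str slicing needs; a trailing chunk shorter than 6 chars is dropped, matching the original loop's behavior.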


Cool, that solves the issue, and the time drops to 5.7 sec.

With the owned String inside Foo, essentially you're using 2x326MB (we're cloning parts of the 326MB into each foo).

Oh yes, it would be a useful exercise to try to get the Foo<'a> version working, for two reasons:

  • It may drop that to less than 1 second.
  • Solving the compile errors may help you understand what's happening / how to control Rust, even if you don't need that kind of performance.

What are you doing with Foo? Can you perhaps post the Java code?

Foo is simply an object to store each XML element's content.
Here is my post comparing Java with Dart: https://github.com/dart-lang/sdk/issues/29131
That post contains the test code for both.

I'm just curious how fast Rust can handle this case. (I am new to Rust.)

Instead of reading everything into memory (std::fs::read_to_string()) before parsing it as XML, you might want to try a pull parser which uses the std::io::Read trait. Read represents an arbitrary stream that data can be read from incrementally, letting you use a pretty much constant amount of memory no matter the input file size and start parsing before you've reached the end of the file. Reading the entire file into memory up front gives you neither of those benefits.
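The incremental-reading idea can be sketched with just the standard library (a Cursor stands in for File::open("Large.xml") so the snippet is self-contained; a real parser would consume each chunk instead of just counting bytes):

```rust
use std::io::{Cursor, Read};

// Pull bytes through a fixed-size buffer: memory use stays constant
// regardless of how large the underlying stream is.
fn count_bytes<R: Read>(mut reader: R) -> std::io::Result<u64> {
    let mut buf = [0u8; 8192]; // 8 KiB of working memory, total
    let mut total = 0u64;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // EOF
        }
        // A real pull parser would process &buf[..n] here.
        total += n as u64;
    }
    Ok(total)
}

fn main() -> std::io::Result<()> {
    // In-memory stand-in for the 326MB file, so this runs anywhere.
    let data = Cursor::new(vec![b'x'; 100_000]);
    println!("{} bytes", count_bytes(data)?); // prints "100000 bytes"
    Ok(())
}
```

Any Read implementor works the same way: a File, a TcpStream, or the Cursor used here.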

Additionally, if you are looking for the fastest possible performance you'll want to check out memory-mapped files. This is where you ask the kernel to map a file into the current process's address space, letting you access it as an ordinary &[u8] slice. Every time you do something like file.read(), an explicit read syscall is issued and data is copied from kernel space to user space. With memory-mapped files the kernel handles the reading automatically and gives you direct access to the file's contents.

The quick_xml::Reader type looks like a pretty good place to get started with pull-parsing, and the memmap crate seems to be a pretty popular library for memory mapping.

As well as optimising how fast we can read a file, I'd also make Foo borrow from the original buffer. There's no need for the redundant String copies.


TL;DR: Yes, it's possible for Rust to do this faster. As to whether it's necessary... I'll leave that up to you.


Without knowing what you intend to do with those Foos, I implemented some of the versions mentioned earlier. Since I don't have your XML file I can't reproduce your results, but at least with a toy example the other methods are much faster.

https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=22a62d4e83f3d84b386cd0f3f74e6c4a

@jumperchen If you want to know how much time iterating over a string's chars takes, why are you starting the stopwatch before the call to fs::read_to_string?
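A minimal sketch of the fixed measurement (using a generated in-memory string as a stand-in for fs::read_to_string("Large.xml"), so it runs anywhere):

```rust
use std::time::Instant;

fn main() {
    // Stand-in for the file contents; in the real benchmark this would
    // be the result of fs::read_to_string("Large.xml").
    let contents = "x".repeat(10_000_000);

    // Start the stopwatch *after* the read, so only iteration is timed.
    let now = Instant::now();
    let count = contents.chars().count();
    println!("{} chars in {}ms", count, now.elapsed().as_millis());
}
```

This way the I/O cost and the per-char iteration cost are measured separately instead of being lumped into one number.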


To be fair, he also did that in the Dart and Java versions, but yes, it doesn't help when measuring the performance of the rest :slight_smile:
