Why is the Java program faster than Rust in this comparison?

Hi, this is a benchmark between Java and Rust:

Rust results (ns): 10648, 6678, 8274
Java results (ns): 8661, 9608, 6302

Average over 12 benchmark runs:
Rust: 9948
Java: 8693

I used opt-level = 3 for Rust
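
For reference, opt-level = 3 here means something like the following in Cargo.toml (this exact snippet is my assumption; only the opt-level itself was mentioned), with the benchmark built under the release profile, e.g. cargo run --release:

# Cargo.toml
[profile.release]
opt-level = 3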

Rust Code



fn main() {
    for _ in 0..3 {
        let now = std::time::Instant::now();
        // ===============================================

        let mut v = Vec::with_capacity(100);
        for i in 0..100 {
            v.push(Class::new("fname", "lname", i % 30));
        }

        for i in 0..100 {
            let cls = v.get(i).unwrap();
            cls.validation();
        }

        // ===============================================
        let now2 = std::time::Instant::now();
        println!("==> {}", now2.duration_since(now).as_nanos());
    }
}



struct Class {
    fname: String,
    lname: String,
    age: i32
}
impl Class {
    pub fn new(fname: &str, lname: &str, age: i32) -> Self {
        Class {
            fname: fname.to_string(), 
            lname: lname.to_string(), 
            age
        }
    }

    pub fn validation(&self) -> bool {
        self.age > 0
    } 
}

Java code, with 1001 iterations for warm-up:

public static void main(String[] args) {
    for (int q = 0; q < 1001; q++) {
        long tInit = System.nanoTime();

        ArrayList<CClass> arr = new ArrayList<CClass>(100);
        for (int i = 0; i < 100; i++) {
            arr.add(new CClass("fname", "lname", i % 30));
        }

        for (int i = 0; i < 100; i++) {
            CClass cls = arr.get(i);
            cls.validation();
        }

        if (q > 997) {
            System.out.println(System.nanoTime() - tInit);
        }
    }
}

My immediate guess is that it's because the Rust code will spend time allocating space for, and copying string data into, the fields, whereas the Java code just stores a reference to a String object that already exists before the loop starts running.
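
Concretely, a minimal sketch of the difference (the struct names here are purely illustrative, not from the benchmark):

fn main() {
    // The original Rust version: each to_string() call allocates a fresh heap
    // buffer and copies the 5-byte literal into it - in the benchmark's loop
    // that happens 200 times (100 instances x 2 fields).
    struct Owned { fname: String, lname: String }
    let owned = Owned { fname: "fname".to_string(), lname: "lname".to_string() };

    // What the Java version effectively does: every instance just stores a
    // pointer to the same pre-existing string literal, so no bytes are copied.
    struct Borrowed { fname: &'static str, lname: &'static str }
    let borrowed = Borrowed { fname: "fname", lname: "lname" };

    println!("{} {} {} {}", owned.fname, owned.lname, borrowed.fname, borrowed.lname);
}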

1 Like

So the JVM is not slower than LLVM-compiled code, and is sometimes faster, at least for things like allocating and processing data (I mean, not math)? Or maybe it's possible to write faster code in Java without the heavy hand-tuning and advanced optimization you would do in Rust?

Is that correct?

I mean, in this particular case, the code is simply not doing the same thing, and Java happens to be doing less work.

That said, it is true that there are cases where the JVM will be faster. In particular, the JVM tends to be faster at making a large number of small heap allocations.
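
To make "a large number of small heap allocations" concrete on the Rust side, here is a minimal illustrative sketch (not from the thread): storing values inline in one Vec needs a single allocation, whereas boxing each element separately is much closer to the JVM's one-heap-object-per-instance model.

struct Point {
    x: i64,
    y: i64,
}

fn main() {
    // One heap allocation for all 100 values, stored inline in the Vec's buffer.
    let inline: Vec<Point> = (0..100).map(|i| Point { x: i, y: i }).collect();

    // 100 small heap allocations plus the Vec's buffer: each element is a
    // separate heap object behind a pointer, roughly the JVM's object model.
    let boxed: Vec<Box<Point>> = (0..100).map(|i| Box::new(Point { x: i, y: i })).collect();

    println!("{} {}", inline.len(), boxed.len());
}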

5 Likes

Mostly because the Rust code does more work. Here is what the two versions are doing underneath:

Rust

  1. Repeat steps 2-8 three times.
  2. Record the current time.
  3. Allocate heap memory of 100 * 56 bytes for the Vec's buffer.
  4. Repeat step 5 100 times.
  5. Allocate two heap buffers of 5 bytes each and fill them with the fixed value "fname".
  6. The iterate-and-validate part will likely be optimized out, since it doesn't produce any observable behavior (see the sketch after this list).
  7. Print the time elapsed since step 2.
  8. Free all the memory allocated in steps 3 and 5.
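
On the "optimized out" point in step 6: if you want to be sure the validation loop is actually measured, the usual trick is std::hint::black_box (stable since Rust 1.66). A minimal sketch:

use std::hint::black_box;

fn main() {
    let v: Vec<i32> = (0..100).collect();

    let now = std::time::Instant::now();
    for x in &v {
        // black_box tells the optimizer the value is "used", so the loop
        // and the comparison cannot simply be deleted.
        black_box(*x > 0);
    }
    println!("==> {}", now.elapsed().as_nanos());
}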

Java

  1. Repeat steps 2-8 three times. I'll ignore the warmup part here.
  2. Record the current time.
  3. Allocate GC heap memory of 100 * 16 bytes (for the backing array itself; Java references are fat) plus a few bytes for the ArrayList instance itself.
    Allocating on the GC heap is very cheap, especially on a single thread. Normally it's just bumping a global offset.
  4. Repeat step 5 100 times.
  5. Allocate a few bytes of GC heap memory for the CClass instance and fill it with known pointer values pointing to the String instances the JVM pre-allocated for those string literals.
  6. I'm not sure whether the JVM optimizes out the iterate-and-check part, but it's only 100 elements, so it should be cheap either way.
  7. Print the time elapsed since step 2.
  8. The process terminates quickly, which means the GC likely never runs. This saves the time spent freeing.

The critical part is that allocating on the GC heap is really cheap. If most of your task is allocating memory, a GC language can be more performant. Many real-world workloads do much more than just allocate, which reduces this benefit.
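
If you want that bump-pointer allocation behavior in Rust, an arena allocator gives you something similar in spirit. A minimal sketch assuming the third-party bumpalo crate (not part of the original benchmark):

use bumpalo::Bump;

struct Class<'a> {
    fname: &'a str,
    lname: &'a str,
    age: i32,
}

fn main() {
    // The arena grabs memory in large chunks up front.
    let bump = Bump::new();

    let mut v = Vec::with_capacity(100);
    for i in 0..100 {
        // Each alloc() is essentially a pointer bump, much like a GC nursery.
        v.push(bump.alloc(Class { fname: "fname", lname: "lname", age: i % 30 }));
    }

    let valid = v.iter().filter(|c| c.age > 0).count();
    println!("valid: {}", valid);

    // The whole arena is freed in one go when `bump` is dropped.
}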

6 Likes

The Rust fname.to_string() calls are copying the strings.

The Java new CClass("fname", "lname", ...) calls are not copying the strings; they just store references to String objects that already exist.

Remember that Rust's String is more like Java's StringBuilder.


If you try a version that doesn't copy the strings, I suspect it'll be faster again: https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=4090cd2727512103742bcdf62877797c

7 Likes

In your case, the difference has to do with Java having garbage collection and only copying references rather than string contents.

In Rust, you need to be explicit about whether your structures own or borrow their data. In your case, the structure contains an owned version of str (String), which means spending time copying the contents of the string literal into a new allocation every time you create a new instance.

You can avoid this cost by storing &str in the structure rather than String:

fn main() {
    for _ in 0..3 {
        let now = std::time::Instant::now();
        // ===============================================

        let mut v = Vec::with_capacity(100);
        for i in 0..100 {
            v.push(Class::new("fname", "fname", i % 30));
        }

        for i in 0..100 {
            let cls = v.get(i).unwrap();
            cls.validation();
        }

        // ===============================================
        let now2 = std::time::Instant::now();
        println!("==> {}", now2.duration_since(now).as_nanos());
    }
}

struct Class<'a> {
    fname: &'a str,
    lname: &'a str,
    age: i32,
}

impl<'a> Class<'a> {
    pub fn new(fname: &'a str, lname: &'a str, age: i32) -> Self {
        Class { fname, lname, age }
    }

    pub fn validation(&self) -> bool {
        self.age > 0
    }
}
4 Likes

Thanks for your answer.

Then why does some Java software use off-heap memory to avoid the GC heap? OK, the GC heap accumulates garbage, but as you said it makes allocation faster in many cases.

Because they not only allocate memory but also free it. Running the GC over a large number of objects is an expensive task.

3 Likes

ZGC creates very short pauses of less than 10 milliseconds. Do you think it is now a good choice to use ZGC for big software instead of off-heap memory, since it does not wait for the garbage to grow large and become expensive to collect?

It depends; for performance you should only trust benchmarks. But note that pause time is not the only performance statistic a GC has. Nowadays GCs do a lot more of their work concurrently, and CPU is a shared resource: using more of it in GC threads reduces the process's throughput. Also, 10 ms is not instant - your benchmark above prints times of roughly 0.01 ms. The interval between GC pauses also matters: a GC is useless if it pauses for 10 ms every 1 ms.

2 Likes

10 milliseconds is pretty darn long considering that (1) today's computers have clock cycles and instructions that work at the nanosecond scale, and (2) 10 milliseconds is already too slow if you need e.g. a real-time 120 FPS refresh rate in a game (that leaves only around 8 milliseconds for all the computation for a single frame, including GC pauses).

I agree, especially given the new wave of blockchain-based applications and the growing number of real-time applications today.

Rust is amazing; the whole backend of our application is written in Rust + Tokio + Axum, and that is exactly what attracted me to it.

I was comparing the JVM and LLVM just to learn more.

There are real-time algorithms for GC that deamortize the cost of collection. I.e. every time you do an allocation, do a little bit of tracing and collecting, with work proportional to the amount of new memory you're allocating. All the GC languages should switch to doing that.
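
A very rough toy sketch of that deamortization idea (this is illustrative only, not a real collector; the work constant and the trace-queue model are my assumptions):

const WORK_PER_ALLOC: usize = 4;

struct ToyHeap {
    // Stand-in for objects that still need to be traced by the collector.
    trace_queue: Vec<u32>,
}

impl ToyHeap {
    fn new() -> Self {
        ToyHeap { trace_queue: Vec::new() }
    }

    fn alloc(&mut self, obj: u32) {
        // Pay a bounded amount of collection work on every allocation,
        // instead of letting one large pause build up.
        for _ in 0..WORK_PER_ALLOC {
            if self.trace_queue.pop().is_none() {
                break;
            }
        }
        self.trace_queue.push(obj);
    }
}

fn main() {
    let mut heap = ToyHeap::new();
    for i in 0..1_000 {
        heap.alloc(i);
    }
    // The backlog stays bounded because each allocation did its share of work.
    println!("objects still queued for tracing: {}", heap.trace_queue.len());
}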

2 Likes
