Has Rust compiler made any optimization for conditional operations?


#1

Recently, I complete a lib hash_ord which contains OrdMap(avl tree) and HashMap(use avl to resolve Collision Attack). To improve performance, raw pointer is used frequently. However, I found out that, in a balanced binary tree, search operation does not seem to be much faster than insertion.

In C/C++, syntax like ptr = ( int < int ) ? ptr : ptr may not generate any branch, but in Rust, we can only use if ... {} else {}.

I want to know how to make conditional operations like if usize < usize { ptr } else { ptr } run faster except embedding assembly.


#2

Have you fed your code to a PMC-enabled profiler like perf or VTune in order to figure out what is bottlenecking it? Spontaneously, on a binary tree search, I would suspect pipeline stalls caused by mispredictions or cache misses, but it is good to get a clear answer. Moreover, these tools can help you pinpoint your “slow” instructions down to the assembly level (even if they are not 100% accurate at this level, typically 1-4 instructions off, it already tells you where to look).

There is AFAIK no semantic difference between Rust’s if-expression and C’s ternary operator. So if compilers are able to eliminate the branch in C, they should be able to do so in Rust as well.

A very convenient way to look at the assembly generated by rustc and C compilers is https://godbolt.org/ . You may want to use it in order to figure out if rustc is doing what you want.

If you find out that rustc doesn’t perform this optimization yet, you may want to suggest implementing it at https://github.com/rust-lang/rust/ .


#3

Thank you for answering.

Please have a look at these two code.
C++:

struct Node {
    int key;
    Node* left;
    Node* right;
};
Node* test(Node* node, int key) {
    return (key < node->key)? node->left : node->right;
}

ASM generated with compiler options -O3 is

test(Node*, int):
  cmp DWORD PTR [rdi], esi
  mov rax, QWORD PTR [rdi+16]
  cmovg rax, QWORD PTR [rdi+8]
  ret

Rust:

#[allow(dead_code)]
pub struct Node {
    left: *mut Node,
    right: *mut Node,
    key: i32
}

#[allow(dead_code)]
#[no_mangle] 
pub unsafe fn test(node: *mut Node, key: i32) -> *mut Node {
    if key < (*node).key { (*node).left } else { (*node).right }
}

ASM generated under version nightly and release mode is

test:
  push rbp
  mov rbp, rsp
  cmp dword ptr [rdi + 16], esi
  lea rax, [rdi + 8]
  cmovg rax, rdi
  mov rax, qword ptr [rax]
  pop rbp
  ret

These ASM are similar, maybe there are some other reasons.


#4

The compiler is optimizing out the function. After forcing to generate it, I see:

playground::test:
	leaq	8(%rdi), %rax
	cmpl	$7, 16(%rdi)
	cmovgq	%rdi, %rax
	movq	(%rax), %rax
	retq

The conditional move is there, just slightly different than the C++ one.


#5

You are right. I made a mistake.