Binary search implementation vs documentation

veldsla · March 28, 2018, 8:45am

I am using the slice binary search in a small project, but I require that it returns the last index of the matching elements. Checking the stdlib src I found that this is actually the implemented behaviour. But the docs state that it could return any of the matching positions.

I also browsed trough this PR https://github.com/rust-lang/rust/pull/45333 and noticed there's even this test that checks this specific functionality:
https://github.com/rust-lang/rust/blob/master/src/libcore/tests/slice.rs#L72

#[test]
// Test implementation specific behavior when finding equivalent elements.
// It is ok to break this test but when you do a crater run is highly advisable.
fn test_binary_search_implementation_details() {
    let b = [1, 1, 2, 2, 3, 3, 3];
    assert_eq!(b.binary_search(&1), Ok(1));
    assert_eq!(b.binary_search(&2), Ok(3));
    assert_eq!(b.binary_search(&3), Ok(6));
    let b = [1, 1, 1, 1, 1, 3, 3, 3, 3];
    assert_eq!(b.binary_search(&1), Ok(4));
    assert_eq!(b.binary_search(&3), Ok(8));
    let b = [1, 1, 1, 1, 3, 3, 3, 3, 3];
    assert_eq!(b.binary_search(&1), Ok(3));
    assert_eq!(b.binary_search(&3), Ok(8));
}

The comment basically indicates that if someone relies on this functionality (assuming it's tests would catch it) it would be considered required and not change it?

Why not change the documentation instead and commit to it?

Knowing this, should I just add a test that checks this behaviour and perhaps fail on later rust versions, copy the current implementation into my src (unsafe calls), or open an issue and try to force the stdlib to pick one of the behaviours and adjust docs accordingly.

HadrienG · March 28, 2018, 11:41am

Not involved, but I assume that the motivation is to avoid precluding optimizations. If you make no guarantee about which element is returned, you can exit the search earlier and more easily parallelize it (including via vectorization).

veldsla · March 28, 2018, 12:14pm

Returning early requires an additional equality test in the inner loop. It would only benefit some scenarios. Also vectorization would only be possible for some primitive types, but not for the generic Slice<T>?

I copied the test into my project. We'll see how long it will last

Eh2406 · March 28, 2018, 3:33pm

https://crates.io/crates/superslice from the person that made the code in std

vitalyd · March 28, 2018, 4:02pm

The crater run mentioned in the comment would be a proxy for how big the breakage of a proposed change would be - that, in turn, would influence how much scrutiny the change will receive. Even though the behavior may not be guaranteed, there has to be strong motivation to break existing code. But it's not a guarantee either . Basically, it implies that the behavior may change but (presumably) there will be a very good reason for why that happened.

And to be clear: that's my interpretation, I in no way represent std maintainers' opinion.

Topic		Replies	Views
Macro to do binary search help	25	1186	November 13, 2020
Is there a similar construct like" if a ? b : c" in rust?	5	256	April 9, 2024
Can anyone help me review my binary tree insertion? help	1	665	January 12, 2023
C++'s equal_range in Rust? help	3	402	June 8, 2023
How is Rust's "match" implemented? help	4	1174	January 12, 2022

Binary search implementation vs documentation

Related Topics