I have written a fairly simple recursive binary search function to process a list of floats in Python:
def index_left(list_input: list[Any], list_length: int, value: Any, left_count: Optional[int] = 0):
if list_length == 1:
raise ValueError("`index_left` designed for intervals. Cannot index list of length 1.")
if list_length == 2:
return left_count
split = floor((list_length - 1) / 2)
if list_length == 3 and value == list_input[split]:
return left_count
if value <= list_input[split]:
return index_left(list_input[: split + 1], split + 1, value, left_count)
else:
return index_left(list_input[split:], list_length - split, value, left_count + split)
Some basic profiling suggests the following:
- 100 x list of 1e6 floats: 900ms
- 1e6 x list of 100 floats: 1999ms
I have written a similar function rust:
pub fn index_left<T>(list_input: &[T], value: &T, left_count: Option<usize>) -> usize
where for<'a> &'a T: PartialOrd + PartialEq
{
let lc = left_count.unwrap_or(0_usize);
let n = list_input.len();
match n {
1 => panic!("`index_left` designed for intervals. Cannot index sequence of length 1."),
2 => lc,
_ => {
let split = (n - 1_usize) / 2_usize; // this will take the floor of result
if n == 3 && value == &list_input[split] {
lc
} else if value <= &list_input[split] {
index_left(&list_input[..=split], value, Some(lc))
} else {
index_left(&list_input[split..], value, Some(lc + split))
}
}
}
}
And basic profiling suggests the following:
- 100 x vector of 1e6 f64s: 62us
- 1e6 x vector of 100 f64s: 221ms
Now I have used PyO3 to port the rust function to Python as follows (using maturin develop --release
):
#[pyfunction]
pub fn index_left_from_rust (list_input: Vec<f64>, value: f64, left_count: Option<usize>) -> usize {
index_left(&list_input[..], &value, left_count)
}
Profiling this from Python I get:
- 1e6 x list of 100 floats: 1537ms
- 100 x list of 1e6 floats: 1307ms
The PyO3 website warns of type conversion costs. Is this what I have encountered here or am I doing something fundamentally incorrect?