Recently, while writing some filtering logic, I reached a situation that I was unsure how to optimally write.
Problem Statement
We have some struct Record
, which has a few fields, and an enum RecordFilter
. RecordFilter is a list of Record
's fields, as such:
struct Record {
x: usize,
y: String,
z: bool,
}
enum RecordFilter {
X(usize),
Y(String),
Z(bool),
}
We can then define a function
/// Returns the first record in `records` that matches the provided filter
/// Returns `None` if no such record exisits
fn find_record<'a>(records: &'a [Record], filter: &RecordFilter) -> Option<&'a Record>;
What is the best way to write this function?
Without thinking, I, at first, wrote this:
fn find_record<'a>(records: &'a [Record], filter: &RecordFilter) -> Option<&'a Record> {
use RecordFilter::*;
let predicate: impl Fn(&&Record) -> bool = match filter {
X(x) => |r: &&Record| r.x == *x,
Y(y) => |r: &&Record| r.y == *y,
Z(z) => |r: &&Record| r.z == *z,
};
records.iter().filter(predicate).next()
}
This is invalid, as the rust compiler tells me:
error[E0562]: `impl Trait` only allowed in function and inherent method return types, not in variable binding
--> src/main.rs:108:24
|
108 | let predicate: impl Fn(&&Record) -> bool = match filter {
| ^^^^^^^^^^^^^^^^^^^^^^^^^
or, without the type hint
error[E0308]: `match` arms have incompatible types
--> src/main.rs:110:21
|
108 | let predicate = match filter {
| _________________________-
109 | | X(x) => |r: &&Record| r.x == *x,
| | -----------------------
| | |
| | the expected closure
| | this is found to be of type `[closure@src/main.rs:109:21: 109:34]`
110 | | Y(y) => |r: &&Record| r.y == *y,
| | ^^^^^^^^^^^^^^^^^^^^^^^ expected closure, found a different closure
111 | | Z(z) => |r: &&Record| r.z == *z,
112 | | };
| |_________- `match` arms have incompatible types
|
= note: expected closure `[closure@src/main.rs:109:21: 109:34]`
found closure `[closure@src/main.rs:110:21: 110:34]`
= note: no two closures, even if identical, have the same type
= help: consider boxing your closure and/or using it as a trait object
Following this, I could think of three approaches. Here is a playground link of my three approaches. In short, they are:
Approach 1 -- Inner generic function
fn find_record<'a>(filter: &RecordFilter, records: &'a [Record]) -> Option<&'a Record> {
fn inner<'b, F: Fn(&&Record) -> bool>(p: F, r: &'b [Record]) -> Option<&'b Record> {
r.iter().filter(p).next()
}
use RecordFilter::*;
match filter {
X(x) => inner(|r| r.x == *x, records),
Y(y) => inner(|r| r.y == *y, records),
Z(z) => inner(|r| r.z == *z, records),
}
}
Pros:
- Produces a short hot path. (based on my assumption that the compiler will generate three implementations of
inner
for the three different cases) - Closure types are inferred
Cons
- Indirection within
inner
makes this harder to read, not immediately clear where arguments are used - multiple
inner
calls within the match are needlessly verbose -
inner
decleration is very verbose
Approach 2 -- Raw Filter
fn find_record<'a>(filter: &RecordFilter, records: &'a [Record]) -> Option<&'a Record> {
use RecordFilter::*;
records
.iter()
.filter(|r| match filter {
X(x) => r.x == *x,
Y(y) => r.y == *y,
Z(z) => r.z == *z,
})
.next()
}
Pros
- Idiomatic, easy to read
- Terse
Cons
-
match
called more than necessary
Approach 3 -- Boxed Closures
fn find_record<'a>(filter: &RecordFilter, records: &'a [Record]) -> Option<&'a Record> {
use RecordFilter::*;
let predicate: Box<dyn Fn(&&Record) -> bool> = match filter {
X(x) => Box::new(|r: &&Record| r.x == *x),
Y(y) => Box::new(|r: &&Record| r.y == *y),
Z(z) => Box::new(|r: &&Record| r.z == *z),
};
records.iter().filter(predicate).next()
}
Pros:
- Also pretty readable
- Code mirrors my original approach
Cons:
- Allocations
- Wordy type hints required
Conclusion
I'm unsure which of the approaches above are preferred, or if there is another option I have not considered. This example was simplified to improve clarity. In my actual code, there are more fields, and map
operations before and after the filter
. Because of this relative complexity, I want to try hard to have my code be as easily readable as possible. I am, also, just generally curious what the recommended approach is.
I think in the simple case, approach 2 is my go to, but I don't believe it would scale as well as other approaches for more complex use cases.
Check out the playground link for boilerplate and runnable examples of each of these approaches.