Idiomatic Filter Type

Recently, while writing some filtering logic, I reached a situation that I was unsure how to optimally write.

Problem Statement

We have some struct Record, which has a few fields, and an enum RecordFilter. RecordFilter is a list of Record's fields, as such:

struct Record {
    x: usize,
    y: String,
    z: bool,
}

enum RecordFilter {
    X(usize),
    Y(String),
    Z(bool),
}

We can then define a function

/// Returns the first record in `records` that matches the provided filter
/// Returns `None` if no such record exisits
fn find_record<'a>(records: &'a [Record], filter: &RecordFilter) -> Option<&'a Record>;

What is the best way to write this function?

Without thinking, I, at first, wrote this:

fn find_record<'a>(records: &'a [Record], filter: &RecordFilter) -> Option<&'a Record> {
    use RecordFilter::*;
    let predicate: impl Fn(&&Record) -> bool = match filter {
        X(x) => |r: &&Record| r.x == *x,
        Y(y) => |r: &&Record| r.y == *y,
        Z(z) => |r: &&Record| r.z == *z,
    };

    records.iter().filter(predicate).next()
}

This is invalid, as the rust compiler tells me:

error[E0562]: `impl Trait` only allowed in function and inherent method return types, not in variable binding
   --> src/main.rs:108:24
    |
108 |         let predicate: impl Fn(&&Record) -> bool = match filter {
    |                        ^^^^^^^^^^^^^^^^^^^^^^^^^

or, without the type hint

error[E0308]: `match` arms have incompatible types
   --> src/main.rs:110:21
    |
108 |           let predicate = match filter {
    |  _________________________-
109 | |             X(x) => |r: &&Record| r.x == *x,
    | |                     -----------------------
    | |                     |
    | |                     the expected closure
    | |                     this is found to be of type `[closure@src/main.rs:109:21: 109:34]`
110 | |             Y(y) => |r: &&Record| r.y == *y,
    | |                     ^^^^^^^^^^^^^^^^^^^^^^^ expected closure, found a different closure
111 | |             Z(z) => |r: &&Record| r.z == *z,
112 | |         };
    | |_________- `match` arms have incompatible types
    |
    = note: expected closure `[closure@src/main.rs:109:21: 109:34]`
               found closure `[closure@src/main.rs:110:21: 110:34]`
    = note: no two closures, even if identical, have the same type
    = help: consider boxing your closure and/or using it as a trait object

Following this, I could think of three approaches. Here is a playground link of my three approaches. In short, they are:

Approach 1 -- Inner generic function

fn find_record<'a>(filter: &RecordFilter, records: &'a [Record]) -> Option<&'a Record> {
    fn inner<'b, F: Fn(&&Record) -> bool>(p: F, r: &'b [Record]) -> Option<&'b Record> {
        r.iter().filter(p).next()
    }

    use RecordFilter::*;
    match filter {
        X(x) => inner(|r| r.x == *x, records),
        Y(y) => inner(|r| r.y == *y, records),
        Z(z) => inner(|r| r.z == *z, records),
    }
}

Pros:

  • Produces a short hot path. (based on my assumption that the compiler will generate three implementations of inner for the three different cases)
  • Closure types are inferred

Cons

  • Indirection within inner makes this harder to read, not immediately clear where arguments are used
  • multiple inner calls within the match are needlessly verbose
  • inner decleration is very verbose

Approach 2 -- Raw Filter

fn find_record<'a>(filter: &RecordFilter, records: &'a [Record]) -> Option<&'a Record> {
    use RecordFilter::*;
    records
        .iter()
        .filter(|r| match filter {
            X(x) => r.x == *x,
            Y(y) => r.y == *y,
            Z(z) => r.z == *z,
        })
        .next()
}

Pros

  • Idiomatic, easy to read
  • Terse

Cons

  • match called more than necessary

Approach 3 -- Boxed Closures

fn find_record<'a>(filter: &RecordFilter, records: &'a [Record]) -> Option<&'a Record> {
    use RecordFilter::*;
    let predicate: Box<dyn Fn(&&Record) -> bool> = match filter {
        X(x) => Box::new(|r: &&Record| r.x == *x),
        Y(y) => Box::new(|r: &&Record| r.y == *y),
        Z(z) => Box::new(|r: &&Record| r.z == *z),
    };

    records.iter().filter(predicate).next()
}

Pros:

  • Also pretty readable
  • Code mirrors my original approach

Cons:

  • Allocations :frowning:
  • Wordy type hints required

Conclusion

I'm unsure which of the approaches above are preferred, or if there is another option I have not considered. This example was simplified to improve clarity. In my actual code, there are more fields, and map operations before and after the filter. Because of this relative complexity, I want to try hard to have my code be as easily readable as possible. I am, also, just generally curious what the recommended approach is.

I think in the simple case, approach 2 is my go to, but I don't believe it would scale as well as other approaches for more complex use cases.

Check out the playground link for boilerplate and runnable examples of each of these approaches.

  • Approach 2 is straightforward and likely quite good — yes, there's a match inside the loop, but CPU branch predictors make that cheap.

  • I would not use Approach 3 as-is — if you are going to use dyn closures, then accept the filter as a boxed closure instead of an enum, so that it's more flexible at the same cost.

  • Approach 1 involves near-duplicate monomorphized code generation, so it will be the slowest to compile and produce the most machine code in the binary, so you should not use it unless performance is critical and you've tested it with a benchmark to prove it's better than Approach 2 in practice.

By the way, the pattern .filter(...).next() has a name of its own: Iterator::find()

1 Like

That is a good point. I was under the impression that branch predictors might be bad at this because of the local nature of the optimization.

I've also realized that the boxed nature of the closures will probably prevent closure inlining, so this is just a bad idea

This note and the ones made me curious enough to write some benchmarks. For a list of records generated as so:

let records = (0..N_RECORDS)
    .map(|i| Record {
        x: i,
        y: i.to_string(),
        z: i == N_RECORDS - 1,
    })
    .collect()

And an N_RECORDS of 10,000, we get the following results (each filter will match only the last element, using a record filter of the listed field):

test tests::sol0_x ... bench:       3,482 ns/iter (+/- 161)
test tests::sol0_y ... bench:      26,064 ns/iter (+/- 1,176)
test tests::sol0_z ... bench:       5,100 ns/iter (+/- 232)
test tests::sol1_x ... bench:       3,431 ns/iter (+/- 197)
test tests::sol1_y ... bench:      26,322 ns/iter (+/- 787)
test tests::sol1_z ... bench:       5,094 ns/iter (+/- 277)
test tests::sol2_x ... bench:      14,005 ns/iter (+/- 797)
test tests::sol2_y ... bench:      40,585 ns/iter (+/- 1,397)
test tests::sol2_z ... bench:      20,474 ns/iter (+/- 1,120)
sol x y z
0 3,482 26,064 5,100
1 3,431 26,322 5,094
2 14,005 40,585 20,474

My takeaways are that approach 1 and 2 (sol 0 and 1) are effectively identical from a performance perspective. I was right that approach 3 would cause significant slow downs due to allocation and not being inline-able(although perhaps it's just the allocations slowing it down?). I was wrong to be worried about the branch predictor -- clearly the performance was not hurt.

With that given, approach 2 is the better choice. I will admit I'm surprised about these results -- I suppose 'avoid branches in inner loops' is not the universal mantra I once took it for. Regardless, thanks for your input!

It's probably not at all a concern here. I'd wager it won'r even reach runtime (hence no need to trust branch prediction), because loop-invariant code motion can remove the branch at compile time.

1 Like

Remember that find is inherently good for branch prediction: the default answer for loops is that the loop will keep going, and one branch misprediction for actually finding it is inherent anyway.

Unpredictable branches inside every iteration are more troublesome. But it's also possible for branches in the iterations to be mostly predictable -- in text it's a pretty good assumption that the script for the next codepoint will be the same as the previous one, for example, even if it's mixed-language text.

1 Like