Understanding the binary search macro

When steffahn posted the macro I asked about the other day, I had no idea how it worked, it seemed like black magic!, so I decided to see if I could figure it out by putting in more comments and generally playing around with it. I thought others might be interested, this is the result:

/// Binary search macro.
/// Parameters are variable we are searching for ($x) and list of numbers ( or anything comparable with $x ).
// Rust macro features used ( per https://doc.rust-lang.org/1.7.0/book/macros.html )
//   $ is used to define match variables.
//   expr stands for expression
//   tt stands for token tree
//   * means zero or more repetitions.
//   + means one or more repetitions.
//   @ is used for labelling sub-macros.
macro_rules! find_index 
{
  // Initial rule, invoke step A with count set to zero and empty list of neps.
  // nep stands for number-expression-pair { number; expression }
  ( $x:ident, [ $($num:expr),+ ] ) => 
  {
    find_index!( @A, $x, [ $($num)+ ], 0, [] )
  };

  // Step A: pair the numbers with values generated by counting from 0.
  // Parameters are x, input list of numbers, count, output list of neps.
 
  // If the input list of numbers is not empty..
  ( @A, $x:ident, [ $firstnum:tt $($more_nums:expr)* ], $count:expr, [ $($out_nep:tt)* ] ) => 
  {
    // .. append nep {$firstnum; $count} to the output list.
    find_index!( @A, $x, [ $($more_nums)*], ($count+1), [ $($out_nep)* {$firstnum; $count} ] )
  };

  // If the input list of numbers is empty, step A is complete, move to step B.
  ( @A, $x:ident, [], $count:expr, $neps:tt ) => 
  {
    find_index!( @B, $x, $neps, [] )
  };

  // If the initial parameters are (say) x and [5,7,9,13] then the step A evaluates to
  // find_index! ( @ B, x, [], [{(5); 0} [(7); (0 + 1)] {(9); ((0 + 1) + 1)} {(13); (((0 + 1) + 1) + 1)}] )
  // or simplified:
  // find_index! ( @ B, x, [], [ {5; 0} {7; 1} {9; 2} {13; 3} ] )

  // Step B: generate the  final binary search expression from the [number; expr] pairs.
  // Parameters are x, input list, output list.
  // We keep pairing neps in the input list, putting them in output list.
  // When the input is exhausted, the output becomes the input for the next iteration.
  // Step B terminates when the input list is a single nep, and the output list is empty.

  // Input list has two neps ..
  ( @B, $x:ident, [ { $num1:expr; $exp1:expr } { $num2:expr; $exp2:expr } $($more_nep:tt)* ], [$($out_nep:tt)*] ) => 
  {
    // .. append a nep to output, consisting of first number and comparison of x with second number, choosing $exp1 or $exp2.
    find_index!( @B, $x, [ $($more_nep)* ], [ $($out_nep)* { $num1; ( if $x < $num2 {$exp1} else {$exp2} ) } ] )
  };

  // Input list is empty ..
  ( @B, $x:ident, [], $neps:tt ) => 
  {
    // .. swap input and output, start next iteration in step B.
    find_index!( @B, $x, $neps, [] )
  };

  // Input list is single nep, append it to output ( which is not empty since + means 1 or more repetitions ).
  ( @B, $x:ident, [ {$num:expr; $exp:tt} ], [ $($out_nep:tt)+ ] ) => 
  {
    find_index!( @B, $x, [], [ $($out_nep)+ {$num; $exp} ] )
  };

  // Input list is single nep, output is empty, we are done.
  ( @B, $x:ident, [ {$num:expr; $final_exp:expr} ], [] ) => 
  {
    $final_exp
  };
}

Of course there are probably too many comments here, the comments are saying what the macro is doing in too much detail and are mostly redundant, so not proper regular style for anyone who knows Rust macro syntax in detail, this was a learning exercise for me.

Please feel free to ask questions, suggest how the comments could be improved, or other suggestions. I still haven't exactly explained still how the algorithm works, although I do now understand it, just about!

Playground link with macro tracing

4 Likes

If I recall correctly, I also supported optional trailing commas in both the “array”-like expression and the overall macro call.

You must have.. streamlined.. this feature away :stuck_out_tongue_winking_eye:

To clarify, I mean that you could call my macro like this:

find_index!(
    [
        10, 11, 12, 13,
        14, 15, 16, 17,
    ],
    x,
)
1 Like

Yes, I removed the optional commas, and made all kinds of other changes ( as many as possible, to try and better understand how Rust macros work ). For example I put in commas between the macro parameters, and a semi-colon for the number-expression pairs. I also used + to indicate one or more repeats, which allows the final step to come at the end, all kinds of things.

Thank you for this! I'm one of (probably?) many who "want to learn macros someday" and worked examples like this will help a lot.

The first comment paragraph alone demystifies a lot for me :slight_smile:

1 Like

A few extra points I noticed later:

The brackets are not needed in ($count+1).
I forgot to mention that ident stands for identifier.
I say "[number; expr]" in a comment, for consistency that should be "{number; expr}".

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.