Golomb Sequence

A few days ago I began learning Rust just for fun, and joined this forum to seek advice regarding books that might help. See Seeking Recommendations for an Introductory Book on Rust. After learning a little Rust from Chapter 2 of The Book and browsing through some code on various web sites, I decided to practice by writing a little program to work with the Golomb Sequence.

The On-Line Encyclopedia of Integer Sequences: A001462 describes the sequence succinctly as follows:

Golomb's sequence: a(n) is the number of times n occurs, starting with a(1) = 1.

This is the beginning of the sequence:

1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, ...

Wikipedia provides some additional detail. See:

That Wikipedia article offers this recurrence relation:

a(1) = 1; a(n + 1) = 1 + a(n + 1 - a(a(n)))

Based on that relation, I have included a recursion in my program. The program prompts the user for a number, which we can call n, and computes and displays the value of the nth number in the sequence.

The recursive inner function employs a memoization, so that once each term in the sequence is computed, it can be saved for later lookup, which avoids having to compute the same terms repeatedly during the recursions.

I will probably add a loop later on that repeatedly prompts the user and displays results until the user enters an empty input.

Below is my code, so far, which does produce correct results. However, there is likely much additional room for improvement, so please offer any advice that might help this new learner become a better Rust programmer. :slight_smile:

// Rust code to compute nth number in Golomb sequence
use std::io;

fn golomb(n: usize) -> usize { // work with unsigned integers
    // wrapper function to contain recursive golomb_memoized function

    fn golomb_memoized(n: usize, memo: &mut [Option<usize>]) -> usize {
        // inner recursive function that maintains and uses memo
        memo[n].unwrap_or_else(| | { // if there is an error, panic and stop program.
            let res = {
                if n > 1 { // recursive case
                    1 + golomb_memoized(n - golomb_memoized(golomb_memoized(n - 1, memo), memo), memo)
                } else { // base case
                    1
                }
            };
            memo[n] = Some(res); // error message said to wrap res in Some()
            res
        })
    }
    
    // return result of calling the inner golomb_memoized function
    golomb_memoized(n, &mut vec![None; n + 1])
}

fn main() {
     println!("Enter n (1 to 30000) to compute golomb(n):"); // avoid stack overflow
     let mut inp = String::new();
     io::stdin().read_line(&mut inp).expect("Could not read input");
     
     // convert inp to an integer, store in n
     let n: usize = inp.trim().parse().expect("Invalid input (needed an integer)");
     
     let res = golomb(n);
     println!("golomb({n}) is {res}.");
}

Personally, in this case, I would prefer this:

            let res = match n {
                0 => panic!("there is no zero-th element"),
                1 => 1,
                // or the following instead of the above two lines to be equivalent to original example
                // 0 | 1 => 1,
                2.. => 1 + golomb_memoized(n - golomb_memoized(golomb_memoized(n - 1, memo), memo), memo)
            };

because:

  • it is shorter,
  • it is explicit about what values of n are handled in each arm (e.g. your code returns 1 for n == 0, but it is not clear whether it is intentional or you just forgot about 0)
  • compiler can still verify that all possibilities are handled (in my first rewrite attempt I actually did forget about 0)

Also, I would choose shorter name for the golomb_memoized function. It is local, so it does not benefit as much from descriptive name. On the other hand, formulas using it cound benefit from being shorter.

Another idea I would consider is to rewrite the function golomb_memoized (and its calls) so that golomb_memoized(n) returns a(n+1) so you do not have to handle 0 at every step.

1 Like

I prefer to use nominal structs for something like this. Among other things, this means you don't have to throw out your cache between queries. For example, you could avoid smashing the stack for larger values if you extended the cache in chunks.

For example, here's a version that can calculate the 100,000th solution (and beyond), by extending the cache in chunks.


Now, let's say you realize an algorithmic improvement. Since your recursion only reduces the index by one in the inner position, you're going to build a big call tree with a depth equal to the number of unmemoized elements needed. So you're going to end up with everything but the 0th element being Some, for any given call to solve. Therefore I suggest

  • using Vec<usize> instead[1]
  • ditching the recursion and instead growing your cache from current length until n

This will allow you to more directly get rid of self-imposed limit to avoid smashing the stack. The change is minimally disruptive if you have the nominal struct. (You could remove the implicit chunking suggested in the first playground from solve, if you added it.[2])


A bonus round as far as the algorithm goes, even though I think I've made the Rust-specific point already...

The partial sums of the Golumb sequence correspond to the last occurences in the Golomb sequence. That is,

  • let f(n) be the Golomb sequence
  • let g(n) be the partial sums of the Golumb sequence
  • if g(k-1) < n <= g(k), then f(n) = k

If you memoize the partial sums as well, you only need to memoize until g.last().unwrap() >= n in order to solve f(n). After that, finding the correct g(k) can be done with partition_point.

Then you can e.g. find the billionth answer while caching less than a million values.[3]

The only real Rust-specific note is that this again required no changes to the public interface.


Where to go from here? Perhaps


  1. you can return 0 as a white lie for the 0th solution, or panic, or change the API to return a Result or Option ↩︎

  2. I didn't bother to implement it since I foresaw this development... ↩︎

  3. The Vardi paper explores optimizing this further using recursive generalizations of g. ↩︎

2 Likes

Thanks for all your helpful advice, @Tom47 and @quinedot.

Note that in my original posted code, a // for marking a comment was somehow left out. That omission has now been corrected.

Yeah, not accounting for 0 was an oversight. I have replaced this ...

            let res = {
                if n > 1 { // recursive case
                    1 + golomb_memoized(n - golomb_memoized(golomb_memoized(n - 1, memo), memo), memo)
                } else { // base case
                    1
                }
            };

... with this ...

            let res = match n {
                0 => panic!("there is no zero-th element"),
                1 => 1,
                2.. => 1 + golomb_memoized(n - golomb_memoized(golomb_memoized(n - 1, memo), memo), memo)
            };

It works very well, but I'm not sure that it is exactly what you intended.

That works really nicely. As Rust is still quite new to me, I do not yet fully understand that code, but will be working with each of the suggestions that both of you have provided over the next several days or, more realistically, over the next week or two :-).

What do you mean?

I had figured from your comments included in the code, that in the match block, you may have intended for me to have either these two lines ...

                0 => panic!("there is no zero-th element"),
                1 => 1,

... or this single line ...

                2.. => 1 + golomb_memoized(n - golomb_memoized(golomb_memoized(n - 1, memo), memo), memo)

Instead, I have three lines in that block, as posted above, and that seems to work. However, some of that material is still new to me, so its full understanding will require further study on my part.

Thanks for the code.

No he meant that you can replace

                0 => panic!("there is no zero-th element"),
                1 => 1,

with

                0 | 1 => 1,

if you don't want to handle the 0 case with a panic but with a dummy value of 1. Read the comment again, you'll see. In case you don't know, the syntax 0 | 1 simply means match 0 or 1.

2 Likes

Aha! :-) It all makes sense now.