Efficient run length encoding algorithm for runs with 2+ character long sequences


Just for fun, I decided to implement a few simple algorithms, including run length encoding. What I’m doing right now is try to make do RLE encoding, but with multicharacter runs. For example, the string “121212121212” would be seen as 6 "12"s. I already have an efficient algorithm for doing that, however, is there any way to adopt that algorithm to encode strings such as “a1212121212” (which would be encoded as an “a” followed by 5 repetitions of “12”)? The function find_maximum_run finds the maximum run in a string, however, it requires that the ENTIRE STRING consist of repetitions of that run.


Multicharacter runs are part of deflate algorithm. You encode “a12” literally, and then a backreference of 8 characters from position -2.