I'm working on a high-performance rope library (a rope is a data structure for storing and editing large strings, such as in a text editor). I'm about to release 1.0, but I need some help picking good names for my public API: names which match Rust's conventions and won't need to change.
There are several iterator methods I'm implementing, and I'd love some advice on how I should name them.
The core data structure is basically a massive string. It uses skip lists and gap buffers internally for performance.
There are a few iteration methods:
1. Iterate over &str slices. This is the recommended way to read the string, because it's zero-copy.
2. Iterate over chars.
3. Iterate over (&str, unicode_length) pairs. This is an optimization for when you need to know the length in Unicode chars, since we already know it internally anyway.
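To make the three shapes concrete, here is a minimal sketch of what each iterator would yield. ToyRope is a made-up stand-in (a plain Vec of chunks with a cached char count), not how jumprope works internally, and the method names are placeholders only, since the naming is exactly what is being asked about.

```rust
/// Toy stand-in for a rope: a list of string chunks, each stored with its
/// cached length in Unicode chars. This is NOT the jumprope implementation.
struct ToyRope {
    chunks: Vec<(String, usize)>,
}

impl ToyRope {
    /// Build the toy rope from pre-split chunks, caching each char count.
    fn from_chunks(parts: &[&str]) -> Self {
        let chunks = parts
            .iter()
            .map(|p| (p.to_string(), p.chars().count()))
            .collect();
        ToyRope { chunks }
    }

    // First shape: borrowed &str slices. Nothing is copied.
    // (Placeholder name.)
    fn str_slices(&self) -> impl Iterator<Item = &str> + '_ {
        self.chunks.iter().map(|(s, _)| s.as_str())
    }

    // Second shape: individual chars.
    fn chars(&self) -> impl Iterator<Item = char> + '_ {
        self.chunks.iter().flat_map(|(s, _)| s.chars())
    }

    // Third shape: (&str, char count) pairs, reusing the cached count
    // instead of recounting. (Placeholder name.)
    fn slices_with_char_len(&self) -> impl Iterator<Item = (&str, usize)> + '_ {
        self.chunks.iter().map(|(s, n)| (s.as_str(), *n))
    }
}
```

The only difference between the first and third shapes is the extra usize, which is why the two probably want related names.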
rope.chars() and rope.chunks() feel right for 2 and 3, but then what should 1 be?
rope.strings() seems weird because it yields str slices, not string objects
rope.strs()? That sounds weird.
rope.iter()? That feels closer to the design of std, but it doesn't match the other methods.
I could call them rope.iter_str_slices() / rope.iter_chars() / rope.iter_chunks() - but iter_str_slices feels long. iter_slices() maybe? But that kind of implies &[_] rather than &str.
Thoughts?
And if it helps, here's the iterator method in question, in context:
use jumprope::*;
let mut rope = JumpRope::new();
rope.insert(0, "lots of stuff");
let mut string = String::new();
for s in rope.iter_str() { // <-- What should this be called?
    string.push_str(s);
}
assert_eq!(string, "lots of stuff");
Perhaps 1 could be rope.chunks(), and 3 could be rope.chunk_sizes() (or some other rope.chunk_somethings()), by analogy with str::chars() and str::char_indices(). rope.iter_*() doesn't really match any of the methods on str.
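For reference, this is the std pairing the analogy leans on: str::chars() yields the plain items, and str::char_indices() yields the same items with an extra number attached. The two wrapper functions are just helpers for illustration; only chars() and char_indices() are std API.

```rust
/// Collects the items yielded by `str::chars()`.
fn chars_of(s: &str) -> Vec<char> {
    s.chars().collect()
}

/// Collects the items yielded by `str::char_indices()`:
/// each char paired with its byte offset in the string.
fn char_indices_of(s: &str) -> Vec<(usize, char)> {
    s.char_indices().collect()
}
```

For "h\u{e9}y", chars_of gives ['h', 'é', 'y'] and char_indices_of gives [(0, 'h'), (1, 'é'), (3, 'y')], because 'é' takes two bytes in UTF-8. A chunks() / chunk_sizes() pair would follow the same "base name, then base name plus extra info" pattern.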
rope.chunks() and rope.chunk_sizes() are the names I was going to suggest, but it looks like you've beaten me to it.
You can also be cute and call it threads() and thread_sizes() because ropes in the real world are composed of threads, but I'd probably want to avoid introducing non-standard terminology if possible.
Normally I would drop the iter bit from something like iter_chunks() because the return type will almost always say -> impl Iterator<Item = ...> already.
I'd avoid a bare iter() when your data structure provides multiple possible ways of iterating over it.
I'll cast my vote for iter_strs() as the most direct and literal option. But, I'd also say iter_slices() would be perfectly acceptable, because a &str is a "string slice" (or slice of characters), and I'd only really expect to be getting bytes from a string handling crate if the method were named iter_byte_slices().
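To illustrate the "string slice" point, here is a small sketch contrasting a &str with a &[u8] view of the same data. Both function names are made up for the example; split_whitespace() and as_bytes() are std.

```rust
/// Borrows the first whitespace-separated word as a string slice.
/// A `&str` is a borrowed view into UTF-8 text owned elsewhere.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

/// The same word viewed as a byte slice, which is what a method named
/// something like `iter_byte_slices()` would be expected to hand back.
fn first_word_bytes(text: &str) -> &[u8] {
    first_word(text).as_bytes()
}
```

Both are slices in the borrowed-view sense, so "slices" on a string type reads naturally as &str unless bytes are mentioned explicitly.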
I would call them directly what they are, and follow the standard library's convention and @Michael-F-Bryan's suggestion of not adding a redundant iter_ prefix. So:
One nice aspect of iter_* is that it makes it easy to find the iterator methods, but I suppose it's not as idiomatic. Threads is a cute idea, but I don't want to overload the term any further. And chunks feels a bit generic, and it doesn't feel right to introduce new terminology.
So I think I'm going to go with @H2CO3's suggestion to use substrings. That's clear enough, and it strikes a nice balance between keeping the terminology down and being easier to read than strs() or something similar.
Thanks for contributing everyone! This was a great experience.