Making long leaps through the type system shorter?

I've been writing code that needs to manipulate PATH in the environment and have been making use of std::env. I was struck by the types used in its functions: they do not make it obvious how to use those functions together.

(I'm picking on std::env here because it's familiar to me right now, but this observation is not unique to std::env.)

In std::env the functions split_paths and join_paths are complementary:

join_paths(split_paths("a:b:c"))  // Ok("a:b:c")
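That one-liner can be checked with a minimal sketch. It assumes a Unix-like platform, where ':' is the PATH separator; the helper name round_trip is made up for illustration and is not part of std::env.

```rust
use std::env::{join_paths, split_paths, JoinPathsError};
use std::ffi::OsString;

/// Hypothetical helper: split a PATH-style string and join it straight back.
fn round_trip(unparsed: &str) -> Result<OsString, JoinPathsError> {
    // split_paths yields an iterator of PathBuf values, and join_paths
    // accepts that iterator directly, with no conversion in between.
    join_paths(split_paths(unparsed))
}
```

On such a platform, round_trip("a:b:c") should hand back Ok("a:b:c"), matching the experiment above.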

The context makes it obvious that these are meant to work together, and a quick experiment confirms it. But look at their signatures:

pub fn split_paths<T: AsRef<OsStr> + ?Sized>(unparsed: &T)
    -> SplitPaths
{ .. }
pub fn join_paths<I, T>(paths: I)
    -> Result<OsString, JoinPathsError>
where
    I: IntoIterator<Item = T>,
    T: AsRef<OsStr>,
{ .. }

Wow. I know they work together from my experiment, but how do they work together? These are not obviously compatible. Throwing functions together and seeing what works will go a long way but will be limiting in the long run. To level up I need a deeper understanding, so I forced myself to figure this out from the types alone:

  1. Calling split_paths:

    1. What's an AsRef<OsStr> anyway? AsRef has a single sentence in the book. The reference docs have a lot more. However, I don't yet understand how or why AsRef is relevant here. Is it just for efficiency? It seems to be not about what but about how.

    2. Ignoring for now the exact mechanics and reasons for AsRef I find in its Implementors documentation an impl AsRef<OsStr> for str. Amongst many other choices I see that I can pass a string literal.

  2. Calling join_paths:

    1. I have a SplitPaths struct but I need an IntoIterator<Item = AsRef<OsStr>>.

    2. By chance I spot that there's an IntoIterator implementation for all iterators.

    3. There's an Iterator implementation for SplitPaths.

    4. Hence SplitPaths is usable as an IntoIterator<_>. What about the Item?

    5. SplitPaths's iteration Item is PathBuf. Checking the reference docs I see there's an impl AsRef<OsStr> for PathBuf.
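The chain of reasoning above can also be spelled out so that the compiler checks each link. This is only a sketch: requires_as_ref_os_str, requires_join_paths_input and walk_through are hypothetical helpers invented here, and the second one simply copies the bounds from join_paths's signature.

```rust
use std::env::{join_paths, split_paths, JoinPathsError, SplitPaths};
use std::ffi::{OsStr, OsString};

// Hypothetical helper: compiles only if T implements AsRef<OsStr>.
fn requires_as_ref_os_str<T: AsRef<OsStr> + ?Sized>(_value: &T) {}

// Hypothetical helper: same bounds as join_paths, but it just hands the
// value back, so a successful call proves the bounds are satisfied.
fn requires_join_paths_input<I, T>(paths: I) -> I
where
    I: IntoIterator<Item = T>,
    T: AsRef<OsStr>,
{
    paths
}

fn walk_through(unparsed: &str) -> Result<OsString, JoinPathsError> {
    // Step 1.2: str implements AsRef<OsStr>, so a &str is accepted.
    requires_as_ref_os_str(unparsed);

    // Step 2.1: split_paths hands back the named SplitPaths struct.
    let split: SplitPaths<'_> = split_paths(unparsed);

    // Steps 2.2 to 2.5: SplitPaths is an Iterator, every Iterator is an
    // IntoIterator, and its Item (PathBuf) implements AsRef<OsStr>.
    let split = requires_join_paths_input(split);

    join_paths(split)
}
```

If any link in the chain were wrong, the corresponding call would fail to compile rather than fail at run time.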

Lightbulb! Got it, I think. But, involved as that was, it omitted the dead-ends I encountered, was written with the clarity of hindsight, and didn't cover how to use these functions in other contexts. It's only one data point, but the effort needed to compose functions in Rust appears considerable.

On the other hand, a fact like iterators all having an IntoIterator implementation is knowledge that a programmer will internalise over time. That's the crux of it really: learning libraries and types, building mental graphs of how to compose disparate functions, is what separates the new learner from the seasoned hand.

I'm okay with that. The rewards are there; I don't mind studying.

Still, the type system is complex, the many combinations daunting. How am I going to fit this all in my brain, or at least a good portion of it? In time the language and its libraries will also grow. Will mastery forever elude me, the finishing line moving further ahead just as I near it?

There's always more to learn so there's no finish line per se, and mastery is relative in any case. But, that's relative to other Rust programmers. What if we're all at a disadvantage next to those OtherLang programmers? Perhaps the knowledge they require to be maximally effective is at the sweet spot where the working set exactly fits into one brain, whereas Rust's equivalent is way too big? Controlling for every other variable, perhaps this means that OtherLang programmers can produce better software quicker than Rust programmers?

(I've heard that C++ suffers from the too-big-for-one-person problem, that it's such a vast and complex language that no one can ever master it. Does anyone want the same for Rust? I don't think so.)

The typing in split_paths and join_paths makes them flexible and efficient but obscures how to use them. Composing functions requires a mental leap through a type system that may not be mapped out in brain-space. This is a big problem for learners, but an experienced developer moving between libraries would face the same.

Can those leaps be made shorter? Can we make Rust's libraries easier to learn and easier to move and draw connections between?

For example, is there a way to make it more apparent that a SplitPaths struct is usable as an IntoIterator<Item=AsRef<OsStr>> without writing more documentation? For that matter, SplitPaths appears to be an implementation detail: does it need to have a name in the docs? If split_paths said only that it returns an Iterator<Item=PathBuf> that would remove one layer of indirection: I would only have to figure out that PathBuf has an AsRef<OsStr> implementation.

Does that get into issues of static vs. dynamic dispatch and trait objects? Could that be avoided by returning SplitPaths but documenting it only as an iterator of paths, like nominating a primary trait implementation for a struct: I'm returning a struct but what's interesting is its impl Foo.

Maybe I'm simply on the steep part of the learning curve, and it turns out that these concerns are not a problem in practice. That would be useful information if nothing else.

It's to be more generic and allow more types as arguments. You'll generally see AsRef used in generic contexts (functions, types, etc).
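As a rough illustration of that flexibility, the sketch below passes several different types through the same AsRef<OsStr> bound. The helpers count_entries and counts_for_each_type are hypothetical, and the counts assume a Unix-like platform where ':' is the separator.

```rust
use std::env::split_paths;
use std::ffi::{OsStr, OsString};
use std::path::{Path, PathBuf};

/// Hypothetical helper with the same bound as split_paths: count the
/// entries in a PATH-style value, whatever type it arrives as.
fn count_entries<T: AsRef<OsStr> + ?Sized>(unparsed: &T) -> usize {
    split_paths(unparsed).count()
}

fn counts_for_each_type() -> Vec<usize> {
    let owned_string = String::from("a:b:c");
    let os_string = OsString::from("a:b:c");
    let path_buf = PathBuf::from("a:b:c");

    vec![
        count_entries("a:b:c"),             // &str
        count_entries(&owned_string),       // &String
        count_entries(&os_string),          // &OsString
        count_entries(OsStr::new("a:b:c")), // &OsStr
        count_entries(Path::new("a:b:c")),  // &Path
        count_entries(&path_buf),           // &PathBuf
    ]
}
```

Each call goes through the one generic function; without the AsRef bound the caller would have to convert every argument to a single concrete type first.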

This is a known issue and the "fix" needs impl Trait to be stabilized, which I believe is on its way. Without that, you'd have to return a Box<Iterator<Item=PathBuf>> which is a heap allocation and virtual dispatch in the caller.
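To make the trade-off concrete, here is a sketch of the two return styles. It assumes a compiler on which impl Trait in return position is available; split_paths_boxed and split_paths_opaque are hypothetical wrappers written for illustration, not functions in std::env.

```rust
use std::env::split_paths;
use std::path::PathBuf;

// Hypothetical wrapper: hides SplitPaths behind a boxed trait object.
// This costs a heap allocation, and each next() call is dispatched
// dynamically.
fn split_paths_boxed(unparsed: &str) -> Box<dyn Iterator<Item = PathBuf> + '_> {
    Box::new(split_paths(unparsed))
}

// Hypothetical wrapper: the signature promises only "some iterator of
// PathBuf". The concrete type is still SplitPaths, so there is no
// allocation and dispatch stays static.
fn split_paths_opaque(unparsed: &str) -> impl Iterator<Item = PathBuf> + '_ {
    split_paths(unparsed)
}
```

Both wrappers yield the same paths. The difference is in what the signature shows the reader and what the caller pays at run time.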

The type system and number of discrete named types can seem daunting at first, but as you say, it gets internalized over time. There are some "foundational" types which form the basis for the majority of other language features and functionality. After getting over the learning hump, you'll know what to look for. And of course things like impl Trait will reduce the number of concrete types and let the API (and docs) focus on the important stuff.

Thanks Vitaly. The impl Trait change is going to make a big difference.

For anyone else interested, it looks like the following RFCs represent the current work on impl Trait:

/me continues reading them.
