Idiomatic signature for accepting an iterable of strings

My goal is to have a struct that represents a newtyped sequence of strings. Let's say CatFullName and DogFullName, so I don't accidentally pass one when the other is needed. I want these to be usable in both owning and borrowing scenarios. From other posts, I gather the way to do this is to wrap a generic type, and then for each function or impl block, use a trait bound to say what you need the type parameter to be:

struct CatFullName<P>(P);

impl <P : ...> CatFullName<P> {
   ...

The question is, for a function that wants to use reference semantics, and just wants to look at a sequence of &strs without storing them, what should the bound on "P" be? Between Iterator, IntoIterator, [T], &[T], &[&T], str, &str, AsRef<str>..., my head is spinning.

Here's an attempt:

use itertools::Itertools; // 0.9.0

struct CatFullName<P>(P);

impl <P> CatFullName<P> {
    fn to_string<S>(self) -> String 
    where S : AsRef<str>, P : IntoIterator<Item = S> {
        self.0.into_iter().intersperse(" ").collect()
    }
}

Playground link

However, this gives a compiler error because " " is a &str, not an S. Attempts to resolve this have just yielded more confusion.

Halp?

First of all, I'd avoid using generics in this way. Instead, copy the standard library pattern of Path and PathBuf and just define two types. However, regarding your code, I'd start by putting the constraint on the impl, not the fn, and then you need to use the trait itself in your code.

impl<P> CatFullName<P>
where
    P: IntoIterator,
    P::Item: AsRef<str>,
{
    fn to_string(self) -> String {
        self.0.into_iter().map(|s| s.as_ref()).intersperse(" ").collect()
    }
}

Note: I'm on my phone pushing a swing, so I haven't tested the above.

I'd add that having IntoIterator as the bound here largely defeats the purpose of allowing the option of either ownership or reference, since self is always consumed.

You'd need to call .map(|s| s.as_ref()) before intersperse() so that both operate on &str. However, that probably won't work with owning iterators: each s is dropped at the end of the closure, so nothing keeps it alive for the &str returned by s.as_ref() to borrow from (a reference can't outlive the value it points to).

You can write a good ol' loop that pushes to a String.
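Here's a minimal sketch of that loop, assuming the bounds go on the impl as suggested earlier in the thread. The method name `joined` is made up for illustration, and it only uses std, so there's no intersperse type mismatch to fight with:

```rust
struct CatFullName<P>(P);

impl<P> CatFullName<P>
where
    P: IntoIterator,
    P::Item: AsRef<str>,
{
    // `joined` is a hypothetical name; it consumes the wrapper, like the
    // to_string attempts above.
    fn joined(self) -> String {
        let mut out = String::new();
        for (i, part) in self.0.into_iter().enumerate() {
            // Separator goes before every part except the first.
            if i > 0 {
                out.push(' ');
            }
            // `part` is still alive here, so borrowing it as &str is fine
            // even when the iterator yields owned Strings.
            out.push_str(part.as_ref());
        }
        out
    }
}
```

Since `&Vec<T>` and `&[T]` implement IntoIterator too, wrapping a reference (e.g. `CatFullName(&parts).joined()`) should only consume the reference and leave the underlying strings untouched.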


I tried the Path and PathBuf way, but converting from PathBuf to Path in the standard library requires unsafe. Is there another way to do it?


You don't need unsafe for this. The reason std uses unsafe is to make converting from an OsStr to Path a zero-cost operation. The representation for OsStr has a lot of constraints due to the way they are consumed by the OS and that's what the unsafe is actually there for.

What about using CatFullName(Vec<String>) and CatFullNameRef<'a>(Vec<&'a str>)? That's effectively what you were trying to write with the generics.
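A rough sketch of what that two-type version could look like, with no unsafe involved. The helper names (`as_name_ref`, `to_owned_name`, `joined`) are invented for illustration, not taken from any library:

```rust
// Owning flavour: the name parts live inside the struct.
struct CatFullName(Vec<String>);

// Borrowing flavour: the parts are owned by someone else.
struct CatFullNameRef<'a>(Vec<&'a str>);

impl CatFullName {
    // Hypothetical helper: borrow every part. This allocates a new Vec of
    // references, but does not copy any string data.
    fn as_name_ref(&self) -> CatFullNameRef<'_> {
        CatFullNameRef(self.0.iter().map(String::as_str).collect())
    }
}

impl CatFullNameRef<'_> {
    // Hypothetical helper: join the parts with a single space.
    fn joined(&self) -> String {
        self.0.join(" ")
    }

    // Hypothetical helper: copy the borrowed parts into an owning name.
    fn to_owned_name(&self) -> CatFullName {
        CatFullName(self.0.iter().map(|s| s.to_string()).collect())
    }
}
```

The trade-off compared to Path/PathBuf is that going from the owning type to the borrowing one costs an allocation for the Vec of references.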

If the double-indirection due to a Vec<String> is a performance issue for you (it's probably not, benchmark first), you can use something like the smol_str crate to store the string's contents inline (if it's at most 22 bytes long). Otherwise you can take a leaf out of PathBuf and Path's book and use a single string (either String or str) where each component is separated by a well-known delimiter (/ on Unices and \ or / on Windows), so that iterating over segments is just a matter of calling string.split('/').
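To make the single-string idea concrete, here's a small sketch. It assumes a space as the delimiter and that no individual part contains one; `SEP`, `from_parts`, `as_name_ref` and `parts` are all names made up for this example:

```rust
// Assumption: parts never contain the separator themselves.
const SEP: char = ' ';

// Owning flavour: one String holding all parts, separated by SEP.
struct CatFullName(String);

// Borrowing flavour: a cheap, copyable view into such a string.
#[derive(Clone, Copy)]
struct CatFullNameRef<'a>(&'a str);

impl CatFullName {
    // Hypothetical constructor: join the parts into a single String.
    fn from_parts<I>(parts: I) -> Self
    where
        I: IntoIterator,
        I::Item: AsRef<str>,
    {
        let mut joined = String::new();
        for (i, part) in parts.into_iter().enumerate() {
            if i > 0 {
                joined.push(SEP);
            }
            joined.push_str(part.as_ref());
        }
        CatFullName(joined)
    }

    // Borrowing is just taking a &str, no allocation needed.
    fn as_name_ref(&self) -> CatFullNameRef<'_> {
        CatFullNameRef(self.0.as_str())
    }
}

impl<'a> CatFullNameRef<'a> {
    // Iterate over the parts by splitting on the separator.
    fn parts(self) -> impl Iterator<Item = &'a str> {
        self.0.split(SEP)
    }
}
```

With this layout the owned-to-borrowed conversion is free, at the cost of having to pick a delimiter that can never show up inside a part.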


Well, there's also the fact that PathBuf can be turned into &Path, which I don't see how to do without unsafe (because who owns the Path? Technically, I don't think anyone does).


How so? I know that &PathBuf can be turned into &Path, due to Deref, but how can you consume PathBuf and get &Path?

I'm convinced that doing it through Deref is what @hjfreyer meant. They are likely referring to the unsafe block found in the Path::new function.

Indeed, that's what I meant.
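For reference, this is roughly the shape of the Path/PathBuf-style pattern being discussed, sketched for a str-backed name. It mirrors the pointer cast in Path::new but is not the standard library's actual code; `CatFullNameBuf` and `parts` are hypothetical names, and the space delimiter is an assumption:

```rust
use std::ops::Deref;

// Unsized borrowed type, analogous to Path. repr(transparent) guarantees
// the same layout as the str it wraps.
#[repr(transparent)]
struct CatFullName(str);

// Owned type, analogous to PathBuf.
struct CatFullNameBuf(String);

impl CatFullName {
    fn new(s: &str) -> &CatFullName {
        // SAFETY: CatFullName is #[repr(transparent)] over str, so a
        // pointer to str is also a valid pointer to CatFullName, and the
        // returned reference keeps the lifetime of `s`.
        unsafe { &*(s as *const str as *const CatFullName) }
    }

    // Hypothetical accessor: split on a space delimiter.
    fn parts(&self) -> impl Iterator<Item = &str> {
        self.0.split(' ')
    }
}

impl Deref for CatFullNameBuf {
    type Target = CatFullName;

    // &CatFullNameBuf -> &CatFullName; the buffer keeps owning the data.
    fn deref(&self) -> &CatFullName {
        CatFullName::new(&self.0)
    }
}
```

The one unsafe block is what makes the borrow zero-cost: the Buf type owns the bytes, and the reference type is just a differently-typed view of them.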

As far as I can tell, the main problem is that intersperse requires the interspersed element to be the same type as the items being iterated, while creating a String out of different printable things shouldn't be a problem.

For an alternative, what about using Itertools::format or Itertools::format_with?

Itertools::format has the requirement that the thing you iterate over be Display, which could be reasonable to add to your impl bounds:

use std::fmt::Display;
use itertools::Itertools;

impl<P> CatFullName<P> {
    fn to_string<S>(self) -> String
    where
        S: Display,
        P: Iterator<Item = S>,
    {
        self.0.format(" ").to_string()
    }
}

(playground)

Or, using format_with, you could do the same thing but use as_ref() to display the items:

impl<P> CatFullName<P> {
    fn to_string<S>(self) -> String
    where
        S: AsRef<str>,
        P: Iterator<Item = S>,
    {
        self.0
            .format_with(" ", |item, f| f(&item.as_ref()))
            .to_string()
    }
}

(playground)
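If you'd rather not depend on itertools for this, a std-only variation on the same idea is to implement Display yourself and get to_string() for free. This is only a sketch: the `Clone` bound is an assumption made so that fmt can iterate through `&self`, which is cheap when P is a reference such as `&Vec<String>` or `&[&str]` but copies the data when P is an owned collection:

```rust
use std::fmt;

struct CatFullName<P>(P);

impl<P> fmt::Display for CatFullName<P>
where
    // Clone is an assumption of this sketch: fmt only gets &self, so the
    // inner value is cloned to obtain something iterable by value.
    P: IntoIterator + Clone,
    P::Item: AsRef<str>,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, part) in self.0.clone().into_iter().enumerate() {
            // Write the separator between parts, not before the first.
            if i > 0 {
                f.write_str(" ")?;
            }
            f.write_str(part.as_ref())?;
        }
        Ok(())
    }
}
```

Because every Display type gets ToString automatically, `CatFullName(&parts).to_string()` works without defining an inherent to_string method, and the wrapper can also be used directly in format! and println!.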
