Always use &T for AsRef<str> instead of just T

Hi everybody,

I read a very interesting article about AsRef<str> and Into<String>, and I started using AsRef<str> in my functions.

I quickly realized that there are several problems with using self:

trait MyTrait1 {
    fn a(self) -> bool;
    fn b(&self) -> bool;
}

impl<T> MyTrait1 for T
where
    T: AsRef<str>,
{
    fn a(self) -> bool {
        self.as_ref() == " "
    }
    fn b(&self) -> bool {
        self.as_ref() == " "
    }
}

trait MyTrait2 {
    // fn c(self) -> &str; // missing lifetime
    fn d(&self) -> &str;
}

impl<T> MyTrait2 for T
where
    T: AsRef<str>,
{
    // fn c(self) -> &str {
    //     self.as_ref().strip_prefix(".").unwrap_or_default() // missing lifetime
    // }
    fn d(&self) -> &str {
        self.as_ref().strip_prefix(".").unwrap_or_default()
    }
}


fn main() {
    let x = " ";
    let y1 = x.a();
    let y2 = x.a();
    let x = " ";
    let z1 = x.b();
    let z2 = x.b();

    let x = String::from(" ");
    let y1 = x.a();
 // let y2 = x.a(); // use of moved
    let x = String::from(" ");
    let z1 = x.b();
    let z2 = x.b();
}

I wonder if there is a best practice of always using &self instead of self when self is AsRef<str>?

The reason I'm asking is that, strangely, the author didn't mention this in such a good article.

struct Person {
    name: String,
}
impl Person {
    pub fn new<N>(name: N) -> Person
    where
        N: AsRef<str>,
    {
        Person {
            name: name.as_ref().to_owned(),
        }
    }
}

fn main() {
    let x = "john";
    let y1 = Person::new(x);
    let y2 = Person::new(x);

    let x = String::from("john");
    let y1 = Person::new(x);
 // let y2 = Person::new(x); // use of moved
}
  • fixed version:
struct Person {
    name: String,
}
impl Person {
    pub fn new<N>(name: &N) -> Person
    where
        N: ?Sized + AsRef<str>,
    {
        Person {
            name: name.as_ref().to_owned(),
        }
    }
}

fn main() {
    let x = "john";
    let y1 = Person::new(x);
    let y2 = Person::new(x);

    let x = String::from("john");
    let y1 = Person::new(&x);
    let y2 = Person::new(&x);
}

The detail you're missing is: You've changed two things:

struct Person {
    name: String,
}
impl Person {
-   pub fn new<N>(name: N) -> Person
-   where
-       N: AsRef<str>,
+   pub fn new<N>(name: &N) -> Person
+   where
+       N: ?Sized + AsRef<str>,
    {
        Person {
            name: name.as_ref().to_owned(),
        }
    }
}

fn main() {
    let x = "john";
    let y1 = Person::new(x);
    let y2 = Person::new(x);

    let x = String::from("john");
-   let y1 = Person::new(x);
-   let y2 = Person::new(x);
+   let y1 = Person::new(&x);
+   let y2 = Person::new(&x);
}

You changed the signature, and you changed the call site. What fixed your use-of-moved-value error was the change at the call site. You didn't have to change the type signature :wink:

struct Person {
    name: String,
}
impl Person {
    pub fn new<N>(name: N) -> Person
    where
        N: AsRef<str>,
    {
        Person {
            name: name.as_ref().to_owned(),
        }
    }
}

fn main() {
    let x = "john";
    let y1 = Person::new(x);
    let y2 = Person::new(x);

    let x = String::from("john");
    let y1 = Person::new(&x);
    let y2 = Person::new(&x);
}



Now with your other example, it's a bit different; as a self-receiver type, the & in the type signature helps without the need for any explicit & at the call-site.

That being said, I personally am a fan of the “require & in the signature” approach, even when it technically makes the type signature less general. It clearly marks a function that does not need ownership of its argument, and it forces the user to make the borrow apparent at the call site.

In other words, I personally am not a fan of the design choice in APIs such as File::open, because you can pass an owned string and it will happily swallow the ownership and discard the value. If you later refactor your code so that, after a call to File::open(p), you need to make further use of your (owned) p: String or PathBuf or Box<…>, then all compiler suggestions and muscle memory point you toward .clone() to solve the problem, instead of rewriting the call to File::open(&p).
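A minimal sketch of that refactoring trap. It uses a hypothetical path_len function with the same by-value AsRef<Path> shape as File::open, so that it runs without touching the file system; the name and body are assumptions made for illustration only.

```rust
use std::path::Path;

// Hypothetical stand-in for an API shaped like `File::open`: it takes its
// argument by value even though it only ever borrows it.
fn path_len<P: AsRef<Path>>(path: P) -> usize {
    path.as_ref().as_os_str().len()
}

fn main() {
    let p = String::from("/tmp/example.txt");

    // Passing `p` by value compiles, but the String is moved and dropped:
    // let n = path_len(p);
    // println!("{p}"); // error[E0382]: borrow of moved value: `p`

    // Borrowing at the call site keeps `p` usable, and no clone is needed.
    let n = path_len(&p);
    assert_eq!(n, p.len());
    println!("{p} is {n} bytes long");
}
```

The signature accepts both path_len(p) and path_len(&p), which is exactly why nothing nudges the caller toward the borrowing form.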


I just read the beginning:

If you want to create a new type that, for instance, must own a String, most APIs tend to agree that the constructor of your type should take a &str or some sort of borrowing ...

I don't know how the author gets to this conclusion, but nope, I disagree.

if I want to own a String, I just take String directly, by value (or maybe sometimes impl Into<String>, as also suggested in the article) as the argument. that's it. the call site can decide how to create the String.

btw, in C++ there are a lot of different opinions about how to pass so-called "sink parameters", because of the quirks of C++'s move semantics, but this is not a problem in rust.

AsRef::as_ref() takes &self as receiver, why do you want to use self in the first place? is there a reason at all?

to maximize the flexibility of an API, you should have minimal requirements.

to call the AsRef::as_ref() method, you only need a shared reference, you don't need even a mut reference, let alone the value itself. it's similar to when you only call a callback once, use FnOnce() instead of FnMut() or Fn().
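A small sketch of the "minimal requirements" idea for both cases mentioned above. The helper names is_blank and run_once are hypothetical, and the ?Sized relaxation is an addition so that a plain &str can be passed too.

```rust
// Hypothetical helpers illustrating "ask for the least you need".

// `as_ref` takes `&self`, so a shared reference to the argument is enough.
// `?Sized` is an extra relaxation that lets `S` be `str` itself.
fn is_blank<S: AsRef<str> + ?Sized>(s: &S) -> bool {
    s.as_ref().trim().is_empty()
}

// The callback is invoked exactly once, so `FnOnce` is the weakest bound.
// It also accepts closures that move captured values out.
fn run_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let owned = String::from("   ");
    assert!(is_blank(&owned));
    assert!(is_blank(&owned)); // still usable: nothing was moved
    assert!(is_blank("  "));

    let greeting = String::from("hello");
    // This closure moves `greeting` out when called, so it is only `FnOnce`.
    let out = run_once(move || greeting);
    assert_eq!(out, "hello");
}
```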

maybe the author assumed the audience already knows this. the article says:

Q: AsRef<T> is a &T on steroids

this doesn't mean your parameter should have type Q; you can just use &Q when you only need a reference, e.g.

fn foo<Q: AsRef<str>>(s: &Q);
fn bar(s: &impl AsRef<str>);

AsRef<str> is not useful in the self or &self position.

Types that implement AsRef<str> usually also implement Deref<Target=str>, which automagically does the same thing as AsRef<str> whenever a method is called on the type.

This means you can just implement traits for str with &self receiver, and string-like types will deref to &str for you in most cases, without an extra trait indirection.
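As a rough illustration, here is a hypothetical extension trait (Dotted is a made-up name) implemented once for str and then called on several string-like types through auto-deref:

```rust
use std::borrow::Cow;

// Hypothetical extension trait, implemented once, for `str` only.
trait Dotted {
    fn without_leading_dot(&self) -> &str;
}

impl Dotted for str {
    fn without_leading_dot(&self) -> &str {
        self.strip_prefix('.').unwrap_or(self)
    }
}

fn main() {
    let literal: &str = ".bashrc";
    let owned: String = String::from(".bashrc");
    let boxed: Box<str> = Box::from(".bashrc");
    let cow: Cow<'static, str> = Cow::Borrowed(".bashrc");

    // Each call auto-derefs to `str`; no `AsRef` bound is involved.
    assert_eq!(literal.without_leading_dot(), "bashrc");
    assert_eq!(owned.without_leading_dot(), "bashrc");
    assert_eq!(boxed.without_leading_dot(), "bashrc");
    assert_eq!(cow.without_leading_dot(), "bashrc");

    // The receivers were only borrowed, so they are still usable.
    assert_eq!(owned, ".bashrc");
}
```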

Additionally, if you find you need &AsRef<str>, it's a red flag. Such code will compile to using &String, &Box<str>, &Cow<str>, etc. These are a double indirection, and unnecessarily more complex than &str. The generic code will be monomorphised (duplicated by the compiler) for each string-like type it is used with, which bloats the executable and increases compilation time, typically producing worse versions of all the methods than if you had just used &str for everything.


D'oh! Thank you!

That was my thinking, that signature with & clearly shows that the function doesn't need to own the value.

In other words, a good practice is to communicate to the call site what you need (a signature with or without &) and let the caller figure out how to provide it.

No need, &self is enough.

I didn't realize that. Thank you :grinning:

I agree that if you need a &str (or &T), the best route is often to just ask for one. Going generic can improve call-site ergonomics, but you shouldn't compromise the API in order to do so -- for example, you shouldn't take an S: AsRef<str> when you need an owned String, because you may unnecessarily clone. If you need a String, ask for a String. If you want to go generic, start from there and use a bound that gives you a String: S: Into<String>.[1]
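A sketch of the difference, assuming two hypothetical constructors (from_ref and from_owned) on the thread's Person type. Comparing buffer pointers shows that the Into<String> version reuses an existing String's allocation, while the AsRef<str> version has to make a new one.

```rust
struct Person {
    name: String,
}

impl Person {
    // Borrowing bound: must always allocate a fresh String.
    fn from_ref<N: AsRef<str>>(name: N) -> Person {
        Person { name: name.as_ref().to_owned() }
    }

    // Owning bound: an already-owned String is moved in as-is.
    fn from_owned<N: Into<String>>(name: N) -> Person {
        Person { name: name.into() }
    }
}

fn main() {
    let s = String::from("john");
    let buf = s.as_ptr();
    let p = Person::from_owned(s);
    // Same heap buffer: the String was moved, not copied.
    assert_eq!(p.name.as_ptr(), buf);

    let s = String::from("john");
    let buf = s.as_ptr();
    let p = Person::from_ref(s);
    // Different buffer: the text was copied, then the original was dropped.
    assert_ne!(p.name.as_ptr(), buf);

    // A `&str` still works with the owning bound; it allocates exactly once.
    assert_eq!(Person::from_owned("john").name, "john");
}
```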

@steffahn pointed out the benefits of taking a reference specifically already.[2] I want to emphasize a few things, though.

The first is that S: AsRef<T> changes "I need a &T" to "I need a &S to call <S as AsRef<T>>::as_ref (to obtain a &T)". That's a technical take on why you should still ask for a reference (&S) and not an owned S. (In contrast S: Into<T> changes "I need a T" into "I need an S".)

The next is that you then need the relaxed ?Sized bound to accept the same types.[3] For example someone mentioned taking a &impl AsRef<str>. That function cannot accept a &str! So going from &str to &impl AsRef<str> would definitely compromise your API. You had the correct approach in your fixed version:

    pub fn new<N>(name: &N) -> Person
    where
        N: ?Sized + AsRef<str>,

(Side note, needing ?Sized is also an example of how quickly the aesthetics of impl Trait in argument position fall apart, IMO.[4])
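To make the ?Sized point concrete, here is a sketch with two hypothetical functions (len_sized and len_unsized) that differ only in their bounds:

```rust
// Hypothetical functions; only the bounds differ.

// `impl Trait` carries an implicit `Sized` bound, so `str` itself is excluded.
fn len_sized(s: &impl AsRef<str>) -> usize {
    s.as_ref().len()
}

// Relaxing the bound allows `N = str`, so a plain `&str` is accepted.
fn len_unsized<N: ?Sized + AsRef<str>>(s: &N) -> usize {
    s.as_ref().len()
}

fn main() {
    let owned = String::from("john");

    assert_eq!(len_sized(&owned), 4);
    // len_sized("john"); // rejected: `str` does not have a known size
    assert_eq!(len_sized(&"john"), 4); // compiles, but needs a `&&str`

    assert_eq!(len_unsized(&owned), 4);
    assert_eq!(len_unsized("john"), 4);
}
```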

Finally, as @kornel mentioned, you've arguably still made things worse due to accepting unnecessary indirection and more monomorphised code. You can mitigate this somewhat by doing something like:

    #[inline]
    pub fn new<N>(name: &N) -> Person
    where
        N: ?Sized + AsRef<str>,
    {
        let name = name.as_ref();
        Self::the_real_new(name)
    }

    // Not public to keep the API simple
    fn the_real_new(name: &str) -> Person {
        todo!()
    }

And I feel this mitigation emphasizes how this is all just meant to be a substitute for consumers calling .as_ref() at the call site -- an ergonomic consideration.
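For completeness, a runnable version of that pattern might look like the sketch below. The body of the inner function is an assumption (the snippet above leaves it as todo!()), and new_inner is just an illustrative name.

```rust
struct Person {
    name: String,
}

impl Person {
    // Thin generic shim: one tiny copy per caller type, easy to inline.
    #[inline]
    pub fn new<N>(name: &N) -> Person
    where
        N: ?Sized + AsRef<str>,
    {
        Self::new_inner(name.as_ref())
    }

    // Non-generic body, compiled once. This body is an assumption made
    // for the sake of a runnable example.
    fn new_inner(name: &str) -> Person {
        Person { name: name.to_owned() }
    }
}

fn main() {
    let owned = String::from("john");
    let a = Person::new(&owned);
    let b = Person::new("john");
    assert_eq!(a.name, b.name);
    // `owned` was only borrowed, so it can be reused.
    assert_eq!(owned, "john");
}
```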


  1. The article's conclusion seems reasonable, but somehow the introduction got it exactly wrong. ↩︎

  2. You've arguably compromised your API by making it too easy to accidentally have unneeded cloning. ↩︎

  3. If T: ?Sized that is, like when T = str ↩︎

  4. fn new(name: &(impl ?Sized + AsRef<str>)) -> ... ↩︎


Thank you @quinedot

The most important lesson I took away is that I should keep it simple, ask for exactly what I need, and let the call site figure out the rest:

struct Person {
    name: String,
}
impl Person {
    // need to own
    pub fn new(name: String) -> Person {
        Person { name }
    }
}

or

trait StringUtils {
    fn non_empty_split(&self, s: &str) -> Option<(&str, &str)>;
}
impl StringUtils for str {
    // don't need to own
    fn non_empty_split(&self, s: &str) -> Option<(&str, &str)> {
        match self.split_once(s) {
            Some(("", _)) | Some((_, "")) => None,
            out => out,
        }
    }
}

And use generic trait bounds only if I really need them ;-).

struct Person {
    name: String,
}
impl Person {
    // need to own + extra flexibility
    pub fn new<T: Into<String>>(name: T) -> Person {
        Person { name: name.into() }
    }
}

or

trait StringUtils {
    fn non_empty_split(&self, s: &str) -> Option<(&str, &str)>;
}
impl<T> StringUtils for T
// don't need to own + extra flexibility
where T: AsRef<str>,
{
    fn non_empty_split(&self, s: &str) -> Option<(&str, &str)> {
        match self.as_ref().split_once(s) {
            Some(("", _)) | Some((_, "")) => None,
            out => out,
        }
    }
}

In most cases the extra flexibility won't be needed.