Is this mem::transmute::<&str, &'static str>(k) safe?

I'm looking at

which is a very simple lib with just one file and mostly tests, so it's short.

There's a struct that I'm trying to understand:

pub struct ChallengeFields(HashMap<UniCase<CowStr>, (String, Quote)>);

with

struct CowStr(Cow<'static, str>);

There's a line where it does

    pub fn get(&self, k: &str) -> Option<&String> {
        self.0
            .get(&UniCase(CowStr(Cow::Borrowed(unsafe {
                mem::transmute::<&str, &'static str>(k)
            }))))
            .map(|&(ref s, _)| s)
    }

I'm annoyed by this unsafe operation. I think CowStr is a Cow with 'static lifetime otherwise it'd be hard or impossible to store strs inside the map. Because of that, when I try to get something inside this map, the str in question has to have static lifetime. This is the reason for the transmute. Am I rigth? If so, why simply not use String, so we can get rid of lifetimes and thus transmute? I don't like unsafe, and reading about transmute it looks very unsafe.

Also, I don't see why Cow is needed at all.

You're pretty much right, yeah. It's unsafe but sound so long as get doesn't hold on to the reference. But I agree it's not great. If they implemented

impl Borrow<str> for CowStr {
    fn borrow(&self) -> &str {
        &self.0
    }
}

Then it wouldn't be needed, due to how HashMap::get works. (This would sidestep the requirement to construct a Cow<'static, str>.)

As for why a Cow and not a String, it allows the HashMap to hold keys of &'static strs in addition to Strings, so that they can have hard-coded strs as keys without allocation. I'm not sure why they wrapped Cow<'static, str> in a newtype.


Edit: I ignored the role of UniCase in this, see below.

3 Likes

How can

    let borrowed_str: &str = &owned;
    let _foo = hm.get(borrowed_str);

work if HashMap::get expects &CowStr? borrowed_str is str even though it came from CowStr

Because the definition of get is

pub fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V> where
    K: Borrow<Q>,
    Q: Hash + Eq, 

and in this case, K is CowStr; thus it expects any &Q such that the bounds K: Borrow<Q> and Q: Hash + Eq are satisfied. Therefore, by supplying the implementation of Borrow<str> for Cowstr, get can accept a &str.

2 Likes

I think I get it. The hashmap can compare k with all the keys of type K because it can convert all of the keys to the type &Q.

I think the lib author was trying to convert, unsafely, a &str to &static str because this wouldn't work:

        pub fn get_mut(&mut self, k: &str) -> Option<&mut String> {
            self.0
                .get_mut(&UniCase(CowStr(Cow::Borrowed(k))))
                .map(|&mut (ref mut s, _)| s)
        }

but why not? Cow::Borrowed(k) would not need to live more than k: &str. Also, UniCase(CowStr(Cow::Borrowed(k))) is the type of the key of the hashmap. It only borrows so again the lifetime of UniCase(CowStr(Cow::Borrowed(k))) it not a problem.

What was he trying to fix?

It can be rewritten with totally safe code with some tricks.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=75bd882a138e612ae4b0f05bc3e265d6

3 Likes

Close; the real key is that if you implement Borrow, the Hash and Eq results should be the same as the non-borrowed structure. So there's no conversion of keys, it's just the thing that you borrow (here, str) has the same hash as the key (CowStr). Read more in the Borrow documentation.

Playground without Borrow. CowStr is defined to take a Cow<'static, str> and it is not defined with a generic lifetime, so there's no way to shorten that lifetime bound.


Important note on all this: I ignored UniCase in my replies up to this point. Looking it up, it's a make-case-insensitive-comparisons wrapper. So it doesn't actually make sense to implement Borrow<str> around it, due to the Hash and Eq requirements. @Hyeonu's solution is the proper one, achieved in part by adding a lifetime parameter (which CowStr lacks).

Adding that Borrow implementation to UniStr might make a decent pull request, in fact.

so, this code:

use std::borrow::Cow;
use std::collections::HashMap;

#[derive(Clone, Hash, Eq, PartialEq, PartialOrd, Ord)]
struct CowStr(Cow<'static, str>);

fn main() {
    let mut hm: HashMap<CowStr, String> = HashMap::new();
    let owned = CowStr(Cow::Owned("owned".to_string()));
    let non_allocating = CowStr(Cow::Borrowed("non_allocating"));
    hm.insert(owned, "foo".to_string());
    hm.insert(non_allocating, "bar".to_string());

    let owned = "owned".to_string();
    let borrowed_str: &str = &owned;
    let _foo = hm.get(&CowStr(Cow::Borrowed(borrowed_str)));
}

doesn't work because we're passing to .get a &CowStr(Cow::Borrowed()). While the key K = CowStr can be borrowed as CowStr (obviously), the type CowStr does not accept generic lifetimes. SO even the line let x = &CowStr(Cow::Borrowed(borrowed_str)); by itself won't compile.

The solution is to wrap Cow inside the Wrapper, so now we have generic lifetimes. Even then, we do

    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct ChallengeFields(HashMap<Wrapper<'static>, (String, Quote)>);

with 'static lifetime. If we didn't, we could not store arbitrary Wrapped elements. Possibly only static ones.

However, we can still use temporary Wrappeds (borrowed ones) to get inside the HashMap. We only need to use owned ones to insert on the HashMap.

Am I right?

By the way, I did the modifications do the library. Gonna do a lot more before sending a PR though.

THanks!!!

I think you understand it. Basically:

struct CowStr(Cow<'static, str>);

Can never contain a non-'static Cow. Using Wrapper<'a> instead allows constructing a temporary one. Then we can pass that get. The HashMap can't store it, as you said, but it can do the hashing and comparisons against the keys to find the entry as needed.