How to make sure types that implement a trait do not share implemented functions? [Implement a static HashMap to store &'static str]

This one is very concise. Here is my test program.

trait F {
    fn foo();
}

struct G {}
struct H {}

impl<T> F for T {
    fn foo() {
        static mut value : i32 = 0;
        unsafe {
            value += 1;
            println!("counter is {}", value);
        }
    }
}

fn main() {
    let _ : () = <G as F>::foo();
    let _ : () = <H as F>::foo();
}

The output I expected from this program is

counter is 1
counter is 1

But the actual output is

counter is 1
counter is 2

In other words, I want to make sure every trait implementation gets its own separate foo() implementation. Is that possible?

If you look at the disassembly, you can see that actually there are two separate functions being generated. They just refer to the same static.

I don't see this written down anywhere, but I think whenever you write a static item in your code, there will always be exactly one static item in the generated code.

If you want to keep track of things by type, look into using TypeId.


This is indeed the case - it's written down in the Rust Reference (emphasis mine):

A static item is similar to a constant, except that it represents a precise memory location in the program. All references to the static refer to the same memory location.

@gevorgyana: It's also worth noting that static mut is extremely hard to use correctly, and makes it very easy to cause undefined behaviour in your program. In fact, there were some calls to deprecate it entirely in Rust 2018, although this didn't end up happening.

Is your question about these traits purely hypothetical, or does it relate to a problem you're having in your actual code? If the latter, it would be a good idea to give more info so that people can give more helpful advice.


I don't think this statement covers this situation. The question is how many static item definitions there are here, not how many references.


That's fair! Looked at from that angle, though, I think the Items chapter of the Rust Reference kind of answers this. For items declared within functions or other scopes, "The meaning of these scoped items is the same as if the item was declared outside the scope."

That statement could still be interpreted multiple ways, but I think the way I'd read it confirms what you've discovered: per static declaration in your code, there's exactly one static item, and thus exactly one memory location representing that static.

Or in other words, the static declaration is not at all modified (including by duplication) by being inside a generic block.


I think we can still solve this, though it needs more complexity. Rust won't create N static memory locations, but we can disambiguate a single one.

I think one common-ish pattern is to use Any::type_id for this. If you store a HashMap<TypeId, u32>, you can use a single static to support all counters.

For a full solution, I'd also recommend replacing static mut with a static OnceCell<Mutex<...>>. As @17cupsofcoffee said, static mut itself is really hard to get right in Rust. Using a safe abstraction like OnceCell for initialization and Mutex for synchronization will avoid that.
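A minimal sketch of the HashMap<TypeId, u32> idea, assuming std's OnceLock (stable since Rust 1.70) in place of the once_cell crate. The trait name Counted and the method bump are made up for illustration and do not come from the original post:

```rust
use std::any::TypeId;
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// Hypothetical trait: `Counted` and `bump` are illustrative names only.
trait Counted: 'static {
    /// Increments and returns the call counter that belongs to `Self`.
    fn bump() -> u32 {
        // Still a single static shared by every implementor...
        static COUNTERS: OnceLock<Mutex<HashMap<TypeId, u32>>> = OnceLock::new();

        let mut map = COUNTERS
            .get_or_init(|| Mutex::new(HashMap::new()))
            .lock()
            .unwrap();

        // ...but each type gets its own entry, keyed by its TypeId.
        let counter = map.entry(TypeId::of::<Self>()).or_insert(0);
        *counter += 1;
        *counter
    }
}

// Blanket impl, mirroring the shape of the original example.
impl<T: 'static> Counted for T {}

struct G;
struct H;
```

With this, <G as Counted>::bump() and <H as Counted>::bump() each count independently, even though only one static exists.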


In this specific instance, an atomic integer would be sufficient.
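For the single shared counter in the original post, that might look roughly like the sketch below. The function name next_count is illustrative, and the code deliberately keeps the one-counter-for-all-callers behaviour; it only removes the need for static mut and unsafe:

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// Illustrative name; one counter shared by every caller,
// just like the original `static mut value`.
fn next_count() -> i32 {
    static VALUE: AtomicI32 = AtomicI32::new(0);
    // fetch_add returns the previous value, so add 1 to report the new one.
    VALUE.fetch_add(1, Ordering::Relaxed) + 1
}
```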


Sorry for the slight delay, I had to do my homework to figure things out. Thanks everyone for replying, you were helpful, but let's get straight to the problem. Here is a refreshed example that applies the suggestions from @daboross and @jethrogb: I am trying to use a HashMap to disambiguate calls to a single static. Here is what I have got so far.

trait Core {
    fn foo() -> &'static str;
}

use once_cell;
use std::collections::HashMap;
use std::any::{TypeId, self};

trait Wrap : Core {
    fn foo() -> &'static str {
        // TODO use mutex

        // FIXME #1: need `mut` here because the HashMap is initialized to be empty,
        // then I lazily add values to it.
        static mut value : once_cell::sync::OnceCell<HashMap<TypeId, String>>
            = once_cell::sync::OnceCell::new();

        // FIXME #2: unsafe because of FIXME #1
        unsafe {
            let map : &HashMap<TypeId, String> = value.get_or_init(|| {
                HashMap::new()
            });
        }

        // FIXME #1
        unsafe {
            let map : &mut HashMap<TypeId, String> = value.get_mut().unwrap();

            // FIXME #3: This is the last puzzle piece to this thing working.
            // I cannot use Self as trait object, but I need to somehow get the
            // underlying type of the implementor... Any ideas?

            // let type_id = Self as &dyn any::Any;

            let type_dyn : &dyn any::Any; // ???
            let type_id : any::TypeId = type_dyn.type_id();

            if !map.contains_key(&type_id) {
                map.insert(type_id,
                           format!("wrapped {}", <Self as Core>::foo())
                );
            }

            &map.get(&type_id).unwrap()
        }
    }
}

impl<T : Core> Wrap for T {}
struct A {}
impl Core for A {
    fn foo() -> &'static str {
        "a"
    }
}

struct B {}
impl Core for B {
    fn foo() -> &'static str {
        "b"
    }
}

fn main() {
    // I want these to pass
    assert_eq!(<A as Wrap>::foo(), "wrapped a");
    assert_eq!(<B as Wrap>::foo(), "wrapped b");
}

It does not compile at the moment, and I marked the problematic pieces of code with FIXME. Basically my problem is I do not know how to obtain type identifiers of generic types that implement a trait. I know it sounds complex, and maybe it is not possible to do what I am trying to do, but please have a look. What I am trying to achieve is make the two tests at the bottom pass.


Ah - I think there might be an alternative way to get a TypeId which could work better. TypeId::of::<Self>() should give you a typeid without access to a value.

The only limitation is that Self: 'static - I'm not sure if there's a way to work around this, but it shouldn't be too limiting - it would just prohibit references to stack variables in things implementing Wrap.
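A small sketch of calling TypeId::of::<Self>() from a default trait method, with the 'static bound expressed as a supertrait bound. The trait name Identified and the method id are hypothetical, chosen only for this example:

```rust
use std::any::TypeId;

// Hypothetical helper trait; `Identified` and `id` are illustrative names.
trait Identified: 'static {
    fn id() -> TypeId {
        // No value of type Self is needed here, only the type itself.
        TypeId::of::<Self>()
    }
}

// Every 'static type gets the default method.
impl<T: 'static> Identified for T {}

struct A;
struct B;
```

Each implementing type then reports a distinct TypeId, which can serve as the HashMap key.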

Re the other FIXMEs, I'd definitely recommend using a Mutex in there. If you store a OnceCell<Mutex<HashMap<TypeId, String>>>, the Mutex will ensure you aren't modifying the data from multiple threads simultaneously (and the code will then be safe and won't need unsafe blocks).

If it were a single counter, as in the code in the original post, I'd definitely agree. But the only way I know how to implement one counter per type is using a HashMap, and I think initializing that will require more heavy-duty synchronization primitives?

I have fixed the type-id-of-Self part and I am using Mutex now (I did not understand at first that doing so eliminates the need for unsafe). So all the FIXMEs from my previous post are fixed now, but... Eventually I need to return a reference to values stored in the HashMap. The two options are returning a &String or a &str. Neither works in this example because the references do not live long enough. Is there a way to rewrite this to return static references to the values stored in the HashMap?

I thought I could lazily put values into it and then get views on them as &'static str whenever I need.

use once_cell;
use std::collections::HashMap;
use std::any::{TypeId, self};

trait Wrap
    where Self : 'static
{
    fn bar() -> &'static str {
        static value : once_cell::sync::OnceCell<
                std::sync::Mutex<HashMap<TypeId, String>>>
            = once_cell::sync::OnceCell::new();

        let map : &std::sync::Mutex<HashMap<TypeId, String>> =
            value.get_or_init(|| {
                std::sync::Mutex::new(HashMap::new())
            });

        value
            .get()
            .unwrap()
            .lock()
            .unwrap()
            .get(&any::TypeId::of::<Self>()).unwrap()
    }

    fn foo() -> &'static String {
        static value : once_cell::sync::OnceCell<
                std::sync::Mutex<HashMap<TypeId, String>>>
            = once_cell::sync::OnceCell::new();

        let map : &std::sync::Mutex<HashMap<TypeId, String>> =
            value.get_or_init(|| {
                std::sync::Mutex::new(HashMap::new())
            });

        value
            .get()
            .unwrap()
            .lock()
            .unwrap()
            .get(&any::TypeId::of::<Self>()).unwrap()
    }
}

It seems I have found a related problem. I guess the MutexGuard limits the lifetime of anything borrowed through it, which is why the borrow checker rejects this? https://users.rust-lang.org/t/ownership-issue-with-a-static-hashmap/27239/2

Update: I think this post answers my question: https://users.rust-lang.org/t/ownership-issue-with-a-static-hashmap/27239/4?u=gevorgyana It is generally unsafe to do what I intended, so it is not possible in safe Rust.

Since you are never going to drop the global map anyway, you can use Box::leak(string.into_boxed_str()), giving you a &'static str.
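Spelled out as a helper, the leak could look like the sketch below. The function name leak_string is hypothetical. Note that Box::leak is an associated function, so it has to be called as Box::leak(b) rather than b.leak():

```rust
// Hypothetical helper: turns an owned String into a string slice that lives
// for the rest of the program. The memory is intentionally never freed.
fn leak_string(s: String) -> &'static str {
    // into_boxed_str drops any spare capacity; Box::leak gives up ownership
    // of the allocation and hands back a reference with a 'static lifetime.
    Box::leak(s.into_boxed_str())
}
```

Because every call leaks a fresh allocation, this is best done once per value rather than on every lookup.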


Do you mean something like this? It still does not compile, this time because String does not implement the Copy trait.

trait Wrap
    where Self : 'static
{

    fn foo() -> &'static str {
        static value : once_cell::sync::OnceCell<
                std::sync::Mutex<HashMap<TypeId, String>>>
            = once_cell::sync::OnceCell::new();

        let _: &'static std::sync::Mutex<HashMap<TypeId, String>> =
            value.get_or_init(|| {
                std::sync::Mutex::new(HashMap::new())
            });

        let v:Box</*&'static*/ str> =
            value
            .get()
            .unwrap()
            .lock()
            .unwrap()
            .get(&any::TypeId::of::<Self>())
            .unwrap()
            .into_boxed_str();
        std::boxed::Box::leak(v)
    }
}

I mean storing the &'static str in the HashMap:
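The code originally attached to this reply is not part of the text, so what follows is only a sketch of one way to do it, not the poster's exact code. It assumes std's OnceLock in place of once_cell, and the static name CACHE is made up. The map holds already-leaked &'static str values, which are Copy and can therefore be returned without borrowing from the MutexGuard:

```rust
use std::any::TypeId;
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

trait Core {
    fn foo() -> &'static str;
}

trait Wrap: Core + 'static {
    fn foo() -> &'static str {
        // One shared map; the values are leaked slices, not owned Strings.
        static CACHE: OnceLock<Mutex<HashMap<TypeId, &'static str>>> = OnceLock::new();

        let mut map = CACHE
            .get_or_init(|| Mutex::new(HashMap::new()))
            .lock()
            .unwrap();

        // Build and leak the wrapped string only the first time a type is seen.
        let wrapped: &'static str = *map.entry(TypeId::of::<Self>()).or_insert_with(|| {
            let leaked: &'static str =
                Box::leak(format!("wrapped {}", <Self as Core>::foo()).into_boxed_str());
            leaked
        });
        wrapped
    }
}

impl<T: Core + 'static> Wrap for T {}

struct A;
impl Core for A {
    fn foo() -> &'static str {
        "a"
    }
}

struct B;
impl Core for B {
    fn foo() -> &'static str {
        "b"
    }
}
```

Each type leaks at most one small string over the life of the program, which is the trade-off for getting a true &'static str out of a lazily filled map.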


I have no idea what I would do without this forum. It works as expected now.

