String Arena and HashMap

#1

I want to store strings in an Arena and provide fast lookup from string to id via a HashMap. Unfortunately, I seem to be running into lifetime issues, and avoiding those was the point of using the arena in the first place.

I am using the id-arena crate.

I create an Arena and a HashMap<&'a String, Id>.
I first add a new string to the arena, which returns an Id.

I then try to look up the data in the arena using get, which returns Option<&String>.
When I unwrap that and try to insert it into the hashmap, I get a familiar complaint:

cannot infer an appropriate lifetime for autoref due to conflicting requirements

Thoughts?

#2

Could you please post a minimal example that shows the problem?

This is because you are storing references in the arena. Why not store String directly instead of &String?

#3

The arena is of type Arena<String>, not Arena<&String>.

The map is HashMap<&String, Id>.
Given an arena, arena.get(id) returns an Option<&String>.
I am having difficulty reconciling the lifetime of the Option<&String> with that of the hashmap.

Here is a gist: https://gist.github.com/rust-play/c8b4f05973c2535baf4db0cb8ddd0c20

The issue is in the add function.

#4
fn add<'a>(arena: &mut StringArena, map: &'a mut Map, package: String) {
    let id = arena.alloc(package);
    let refval = arena.get(id).unwrap();
    map.insert(refval, id);
}

You’d need to borrow arena for 'a as well - the arena’s get() returns a reference that’s tied to a borrow of the arena, and there’s no connection between that borrow and 'a otherwise. The map, in turn, would need to be Map<'a>, and its borrow can be elided (i.e. map: &mut Map<'a>).

However, if you do that, you’ll find that you won’t be able to call add more than once until map goes away, because the mutable borrow of arena will be extended.
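As a rough sketch of that signature change: the snippet below uses a hypothetical Vec-backed `StringArena` as a stand-in for id-arena (plain `usize` ids instead of `Id`), so the names and types here are assumptions, not the crate's API. It only shows how borrowing the arena for 'a ties the stored reference to the map's lifetime parameter.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an id-based arena (NOT the id-arena API).
/// Like id-arena, allocation needs `&mut self` and lookup needs `&self`.
#[derive(Default)]
pub struct StringArena {
    items: Vec<String>,
}

impl StringArena {
    /// Stores the string and returns its index as the id.
    pub fn alloc(&mut self, s: String) -> usize {
        self.items.push(s);
        self.items.len() - 1
    }

    /// Returns a reference tied to the borrow of the arena.
    pub fn get(&self, id: usize) -> Option<&String> {
        self.items.get(id)
    }
}

/// The map carries the lifetime of the references it stores as keys.
pub type Map<'a> = HashMap<&'a String, usize>;

/// The arena is borrowed for the same 'a that the map's keys use,
/// so the reference returned by `get` is allowed to live in the map.
pub fn add<'a>(arena: &'a mut StringArena, map: &mut Map<'a>, package: String) {
    let id = arena.alloc(package);
    let refval = arena.get(id).unwrap();
    map.insert(refval, id);
}
```

With this version a single call such as `add(&mut arena, &mut map, "serde".to_string())` compiles, but a second call is rejected while `map` is still in use, because the first call's mutable borrow of `arena` has to last as long as the keys stored in `map`.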

You probably want to instead use an arena that doesn’t require a &mut self borrow, such as typed_arena - it also doesn’t use any ids, just hands you a reference back.

#5

Thank you!

#6

Yes, I know that, but there is still a stored lifetime parameter, which you could get rid of.
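One way to read this suggestion, as a minimal sketch using only the standard library: keep owned String keys in the map so that no lifetime parameter is stored anywhere. The `Interner` type and its method names are hypothetical, and the trade-off is that each string is stored twice (once in the Vec, once as a map key).

```rust
use std::collections::HashMap;

/// Hypothetical interner that owns all of its data, so it needs no lifetime parameter.
#[derive(Default)]
pub struct Interner {
    strings: Vec<String>,        // id -> string
    ids: HashMap<String, usize>, // string -> id (owned keys, hence a second copy)
}

impl Interner {
    /// Returns the existing id for `s`, or stores it and returns a new id.
    pub fn intern(&mut self, s: &str) -> usize {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.strings.len();
        self.strings.push(s.to_owned());
        self.ids.insert(s.to_owned(), id);
        id
    }

    /// Looks a string up by id.
    pub fn resolve(&self, id: usize) -> Option<&str> {
        self.strings.get(id).map(String::as_str)
    }
}
```

Because nothing in the struct borrows from anything else, `intern` can be called as often as needed and the struct can be moved around freely.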

#7

I tried to switch arena implementations and got this to work:

However, when trying to store both the arena and a HashMap referencing the arena data as keys, I ran into the same sort of trouble:

#8

I did something similar using the internment crate, and it “Just Worked”. A bunch of HashMap<K, _>, where the _'s are various structs with a bunch of Intern<String>. No messing with explicit Id’s, they just deref like strings.

Worth looking at, if your use case regarding storage and lifetime of the arena matches one of the variants provided by the crate, because (if you can accept one of the various conditions) it makes all the lifetime issues go away. Very convenient.

#9

Can you point me to a particular struct that I should be looking at?

#10

I’m not sure what you mean. For the internment crate, there are three structs:

  1. Intern
  2. LocalIntern
  3. ArcIntern

They’re all described in the linked doc page, with different trade-offs and considerations, so you probably don’t mean those.

If you mean the structs I was talking about here:

those are just in my own code, using internment. Stuff like:

#[derive(Deserialize, Debug)]
struct ELBLogLine {
    timestamp: DateTime<Utc>,    // The time when the load balancer received the request from the client, in ISO 8601 format.
    elb:       Intern<String>,   // The name of the load balancer
    client:    SocketAddrV4,     // The IP address and port of the requesting client.
    #[serde(deserialize_with = "csv::invalid_option")]
    backend:   Option<SocketAddrV4>, 
    ...

The rest of the code just largely works as if those fields were String, via Deref, with some nice speedups for equality comparison because it just compares the pointers.

For serde, it all Just Works (another advantage). Intern deserialises and adds the string to the intern arena automatically via Intern::new(), and outputs as a string as well.
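To illustrate the general idea of pointer-compared interned strings, here is a minimal standard-library sketch. It is not how the internment crate is implemented; the `Interned` type and the `pool` helper are hypothetical, and the sketch deliberately leaks every distinct string so that the references can be 'static.

```rust
use std::collections::HashSet;
use std::ops::Deref;
use std::sync::{Mutex, OnceLock};

/// Hypothetical interned-string handle: a copyable 'static reference.
#[derive(Clone, Copy, Debug)]
pub struct Interned(&'static str);

/// Global pool of every string interned so far.
fn pool() -> &'static Mutex<HashSet<&'static str>> {
    static POOL: OnceLock<Mutex<HashSet<&'static str>>> = OnceLock::new();
    POOL.get_or_init(|| Mutex::new(HashSet::new()))
}

impl Interned {
    /// Returns the pooled copy of `s`, leaking a new allocation on first sight.
    pub fn new(s: &str) -> Interned {
        let mut pool = pool().lock().unwrap();
        if let Some(&existing) = pool.get(s) {
            return Interned(existing);
        }
        let leaked: &'static str = Box::leak(s.to_owned().into_boxed_str());
        pool.insert(leaked);
        Interned(leaked)
    }
}

/// Equal contents always share one allocation, so comparing pointers is enough.
impl PartialEq for Interned {
    fn eq(&self, other: &Self) -> bool {
        std::ptr::eq(self.0, other.0)
    }
}
impl Eq for Interned {}

/// Lets the handle be used like a `&str`.
impl Deref for Interned {
    type Target = str;
    fn deref(&self) -> &str {
        self.0
    }
}
```

Since the handles are 'static and Copy, structs holding them need no lifetime parameters, at the cost of never freeing the interned strings.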

#11

Guilty of reading your response too fast. I thought you said that you wrote the internment crate and that it had some relevant code. But going back to your post, I see that you were using the internment crate. My bad.

#12

Can’t do this in Rust - this would be a self-referential struct.

If you’re working with that arena, you typically create it at an earlier point in your program, and then pass/hold references to it for places that want to allocate from it and hold the returned references.
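A minimal sketch of that layout, assuming the storage is simply a slice of Strings created by the caller (a stand-in for a real arena) and a hypothetical `Lookup` struct that only borrows from it:

```rust
use std::collections::HashMap;

/// Hypothetical lookup table that borrows from storage created earlier by the caller.
/// It holds references into the storage but does not own it, so it is not self-referential.
pub struct Lookup<'a> {
    arena: &'a [String],
    ids: HashMap<&'a str, usize>,
}

impl<'a> Lookup<'a> {
    /// Builds the string -> id map from storage that outlives the lookup.
    pub fn new(arena: &'a [String]) -> Self {
        let ids = arena
            .iter()
            .enumerate()
            .map(|(id, s)| (s.as_str(), id))
            .collect();
        Lookup { arena, ids }
    }

    /// Finds the id for a name, if it was in the storage.
    pub fn id_of(&self, name: &str) -> Option<usize> {
        self.ids.get(name).copied()
    }

    /// Finds the name for an id, if it is in range.
    pub fn name_of(&self, id: usize) -> Option<&'a str> {
        self.arena.get(id).map(|s| s.as_str())
    }
}
```

The caller would create the storage first, for example `let arena = vec!["serde".to_string()];`, and then build `Lookup::new(&arena)`, which keeps the owner and the borrower in separate values.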

#13

Yup, that's what I did. The constructor function takes a mutable reference to the arena, and it works.