[SOLVED] Coming from OOP, struggling to find a pattern in Rust

I have a simple function that checks whether a HashMap in self contains an object with the given key. It returns the object if it is found; otherwise it creates the object and returns it.

I am running into problems with Rust's restriction of one mutable reference at a time.
I have googled and read a lot trying to find a solution. Wrapping the borrow in an inner block is not an option, as &mut self is a function parameter and the only way to release it is to return from the function.

I am sure other people have similar requirements and that there are patterns in Rust for this simple operation; I just can't find one at the moment. I know I could use an unsafe block to get around it, but I would like to learn how to deal with these situations in an idiomatic Rust way! Any help is highly appreciated.

    fn get_environment(&mut self, name: &String, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
        match self.environments.get_mut(name) {
            None => {
                self.add_new_environment(name, type_info.clone());
                return self.environments.get(name).unwrap()
            },
            Some(e) => {return e}
        }
    }

This looks like a job for the entry API. Instead of operating on the whole map, you just operate on the vacant-or-occupied slot:

    fn get_environment(&mut self, name: &String, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
        // entry() takes the key by value, so the name has to be cloned.
        match self.environments.entry(name.clone()) {
            Entry::Vacant(slot) => {
                // note: there's really no way to take a reference to self while holding an Entry handle,
                // so we have to be able to create an environment separately.
                let environment = Environment::new(name, type_info.clone());
                return slot.insert(environment);
            },
            Entry::Occupied(mut slot) => {
                return slot.get_mut();
            }
        }
    }

Wow :slight_smile: Thank you so much for pointing me in the right direction! So this lets me do everything I need with a single mutable reference. Beautiful!

I tried out the code but ran into a new problem, with slot being a local variable:
"cannot return value referencing local variable slot". Any ideas how to work around that?

    fn get_environment(&mut self, name: &String, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
        match self.environments.entry(name.clone()) {
            Entry::Vacant(mut slot) => {
                let environment = Environment::new(name, type_info.clone());
                return slot.insert(environment);
            },
            Entry::Occupied(mut slot) => {
                return slot.get_mut();
            }
        }
    }

Oh, right. Duh.

In that case, you'll need to make sure it doesn't reference the local variable slot.

    fn get_environment(&mut self, name: &String, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
        match self.environments.entry(name.clone()) {
            Entry::Vacant(slot) => {
                let environment = Environment::new(name, type_info.clone());
                slot.insert(environment);
            },
            Entry::Occupied(_) => {}
        }
        // The entry is gone by now, so a fresh lookup borrows from the map itself.
        // It cannot fail: the key was either already present or just inserted.
        return self.environments.get_mut(name).unwrap();
    }
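
As an aside, the explicit match can also be kept without the second lookup: `OccupiedEntry::into_mut` consumes the entry and hands back a reference tied to the map rather than to the local `slot`. The following is only a minimal sketch; `TypeStmt`, `Environment`, and `Interpreter` are hypothetical stand-ins, since the real types are not shown in this thread.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical stand-ins for the types used in the question.
#[derive(Clone, Debug, PartialEq)]
struct TypeStmt(String);

#[derive(Debug)]
struct Environment {
    name: String,
    type_info: Option<TypeStmt>,
}

impl Environment {
    fn new(name: &str, type_info: Option<TypeStmt>) -> Self {
        Environment { name: name.to_owned(), type_info }
    }
}

struct Interpreter {
    environments: HashMap<String, Environment>,
}

impl Interpreter {
    fn get_environment(&mut self, name: &str, type_info: Option<TypeStmt>) -> &mut Environment {
        match self.environments.entry(name.to_owned()) {
            // `insert` consumes the vacant entry and returns a reference into the map.
            Entry::Vacant(slot) => slot.insert(Environment::new(name, type_info)),
            // `into_mut` consumes the occupied entry, so the returned reference
            // borrows from the map and can outlive the local `slot`.
            Entry::Occupied(slot) => slot.into_mut(),
        }
    }
}
```

The difference from `get_mut` is that `get_mut` only borrows the entry, which is why the compiler complained about returning a reference to a local variable.
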
A few extra notes on writing idiomatic Rust code

I want to avoid distracting from the point where we teach you how to use the Entry API, but some of the code here is unnecessarily noisy or inefficient.

For one thing, we almost never want to use &String. The String type is Rust's equivalent to a Java StringBuffer. It's supposed to be mutable, but you have it behind an immutable reference. It also adds a layer of indirection: &String is a pointer to a structure that contains a pointer to the underlying string buffer. You want &str instead, which points directly at the string data: one level of indirection instead of two.
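
A small sketch of the practical difference, using a hypothetical `name_len` helper that is not part of the original code: a `&str` parameter accepts string literals, borrowed `String`s (through deref coercion), and sub-slices alike, whereas with a `&String` parameter a caller holding only a literal would have to build a `String` first.

```rust
// Hypothetical helper: takes the most general borrowed string type.
fn name_len(name: &str) -> usize {
    name.len()
}

fn example() -> (usize, usize, usize) {
    let owned: String = String::from("global");
    (
        name_len("global"),    // a string literal is already a &str
        name_len(&owned),      // &String coerces to &str automatically
        name_len(&owned[..3]), // a sub-slice works too, with no allocation
    )
}
```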

It's also not considered normal to use return so much. And the Entry API has convenience methods for what we're trying to do.

    fn get_environment(&mut self, name: &str, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
        self.environments.entry(name.to_owned())
            .or_insert_with(|| Environment::new(name.to_owned(), type_info.clone()));
        self.get(name)
    }

Wow! I'm honestly blown away! I know I said that already, but I literally had a tear in my eye when I opened up the extra tab and saw the gorgeous Rust code. I wish I were already at the level where I can enjoy writing the code instead of spending 90% of my time googling and trying to find ways to do simple things :slight_smile:

I think this is a kind of functional programming, which I have very little experience in. Perhaps I should study these patterns more so I am not stuck in OOP. I would love it if you could mentor me, just at a tactical level: how should I get to where you are in terms of Rust skills?

A small change was required in the last line, so the final code looks like this, and it works! :sunny:

    fn get_environment(&mut self, name: &str, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
        self.environments.entry(name.to_owned())
            .or_insert_with(|| Environment::new(name.to_owned(), type_info.clone()));
        self.environments.get_mut(name).unwrap()
    }

PS. How do you make your code look so good in here?

```rust
fn get_environment(&mut self, name: &str, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
    self.environments.entry(name.to_owned())
        .or_insert_with(|| Environment::new(name.to_owned(), type_info.clone()));
    self.environments.get_mut(name).unwrap()
}
```

You can also omit the rust after the opening backticks to get Rust highlighting implicitly.

One more question. Is there a way to change the entry, if found, before returning it? I simplified my code to explain the point here, but in practice I need to inject the provided type_info into the object, unless it already has one. I know how to do this in the style of your first example, but I would love to do it in the shorthand style of your second example.

Is or_insert_with() really necessary? Won't plain or_insert() work?

I could've used or_insert, but then I would've been unconditionally cloning the TypeInfo, when we really only want to clone it when creating a new environment.

As described in the Entry API docs, the function and_modify looks like it could do it.

I am not quite sure I follow. Either way, you clone TypeInfo whenever the key doesn't have a value. The only difference is that you currently pass a closure that returns the result of the Environment constructor, instead of passing that result directly to or_insert().

Thanks, here is the code:

    fn get_environment(&mut self, name: &str, type_info: Option<TypeStmt>) -> &mut Environment<'a> {
        self.environments.entry(name.to_owned())
        // Clone here so type_info is still available to the or_insert_with closure below.
        .and_modify(|e| { if e.type_info.is_none() && type_info.is_some() { e.type_info = type_info.clone() } })
        .or_insert_with(|| Environment::new(name.to_owned(), type_info.clone()));
        self.environments.get_mut(name).unwrap()
    }

Yeah, but if the key does have a value, then the closure isn't called, and the clone does not occur. It's a minor perf improvement, but it's so easy I didn't know of any reason not to do it.

I don't think you need to call get_mut at the end, or_insert_with already returns a mutable reference.
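
A minimal sketch of that suggestion, under the assumption that the types look roughly like the hypothetical `TypeStmt`, `Environment`, and `Interpreter` stand-ins below (the real definitions are not shown in this thread). The `&mut` reference returned by `or_insert_with` becomes the function's return value, so the second lookup and the `unwrap` disappear.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the types used in the question.
#[derive(Clone, Debug, PartialEq)]
struct TypeStmt(String);

#[derive(Debug)]
struct Environment {
    name: String,
    type_info: Option<TypeStmt>,
}

struct Interpreter {
    environments: HashMap<String, Environment>,
}

impl Interpreter {
    fn get_environment(&mut self, name: &str, type_info: Option<TypeStmt>) -> &mut Environment {
        self.environments
            .entry(name.to_owned())
            // Fill in type information only when the existing entry has none.
            .and_modify(|e| {
                if e.type_info.is_none() {
                    e.type_info = type_info.clone();
                }
            })
            // Runs only for a vacant entry. The `&mut Environment` returned by
            // `or_insert_with` is the last expression, so it is the result.
            .or_insert_with(|| Environment {
                name: name.to_owned(),
                type_info, // moved in, no clone needed on this path
            })
    }
}
```

Besides being shorter, this hashes the key once per call instead of twice.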

I'm still confused, and I think maybe you misunderstood. I think you understood this:

    self.environments.insert(name.to_owned(), Environment::new(name.to_owned(), type_info.clone()));

I meant this:

    self.environments.entry(name.to_owned())
        .or_insert(Environment::new(name.to_owned(), type_info.clone()));

Because I can't for the life of me find what is different between or_insert() and or_insert_with() in this case. Maybe you were thinking about the difference between insert() and or_insert_with()?

PS sorry for having this tangential discussion on this thread.

The difference is as follows:

.or_insert() is defined as follows:

fn or_insert(self, default: V) -> &'a mut V;

And .or_insert_with() is as follows:

fn or_insert_with<F: FnOnce() -> V>(self, default: F) -> &'a mut V;

Therefore, if we call .or_insert with a default that takes a long time to create, we slow down the program on every call, whereas with .or_insert_with we can ensure that the "long time" is only spent when there is an actual need to create the value, i.e., when no value is present.

Say I have the following:

    fn create_value() -> usize {
        // Stand-in for a slow computation (call this a loop or something).
        std::thread::sleep(std::time::Duration::from_millis(1000));
        20
    }

    // This will always take a second to run:
    self.my_usizes.entry(name.to_owned())
        .or_insert(create_value());

    // This will only take a second when there is no value yet:
    self.my_usizes.entry(name.to_owned())
        .or_insert_with(create_value); // Note we don't call the function.
                                       // `.or_insert_with` calls it.

Thanks! For some reason I thought that the or_insert() method is only called when there is no value for the key, and in other cases the chained calls exit early. Don't know why...

or_insert will be a no-op on an occupied entry, but the argument passed to it must still be evaluated.

To summarize, the …_with variants defer the evaluation of their argument until such evaluation proves to be necessary, whereas the variants that do not end with _with evaluate that argument eagerly, 100% of the time.
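
One way to observe this without timing anything is to count calls. The sketch below uses a hypothetical `create_value` helper that bumps a counter each time it runs, then tries both methods on a key that is already present.

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Hypothetical helper: builds a value and records that it was called.
fn create_value(calls: &Cell<u32>) -> usize {
    calls.set(calls.get() + 1);
    20
}

// Returns how many times `create_value` ran for (or_insert, or_insert_with)
// when the key is already present in the map.
fn count_calls_on_occupied_entry() -> (u32, u32) {
    let mut map: HashMap<String, usize> = HashMap::new();
    map.insert("x".to_owned(), 1);

    let eager = Cell::new(0);
    // The argument is evaluated before `or_insert` is even called.
    map.entry("x".to_owned()).or_insert(create_value(&eager));

    let lazy = Cell::new(0);
    // The closure is only invoked for a vacant entry, so it never runs here.
    map.entry("x".to_owned()).or_insert_with(|| create_value(&lazy));

    (eager.get(), lazy.get())
}
```

Calling `count_calls_on_occupied_entry()` yields `(1, 0)`: the eager argument was computed and then discarded, while the closure was never called.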

OK, noob question here about performance. If this were performance-critical code, would you be better off with the code that explicitly matches on the Entry than calling and_modify() and or_insert_with(), or are those functions zero-cost abstractions that somehow get compiled down to the same code?

Also, I had the same question as jethrogb, which I didn't see answered. Why call get_mut() at the end -- didn't or_insert_with() just give you the mutable reference you want?