[Beginner] Fighting with lifetimes

Hi,

I'm struggling with lifetimes, moves and borrowing... all the usual beginner things, I guess.
What I want to do is basically store a collection of items in 3 different index maps. I don't need to keep the original collection; having the items in the maps is sufficient.

I had a first version where I cloned the items for each map, and it compiled fine. But I would like to use references instead. Here's my code so far:

use super::files;
use std::collections::HashMap;

#[derive(Serialize, Deserialize, Debug, Clone)]
#[serde(rename_all = "camelCase")]
pub struct UserInfo {
    int_idx: i32,
    name: String,
    prov: String,
    prov_u_id: String,
}

pub struct Users<'a> {
    by_id: HashMap<i32, UserInfo>,
    by_name: HashMap<String, &'a UserInfo>,
    by_provid: HashMap<String, &'a UserInfo>,
}

fn provider_key(pname: String, id: String) -> String { format!("{}-{}", pname, id) }

pub fn load_users<'a>() -> Users<'a> {
    let content = files::read_file("users/_index.json").expect("Could not open users index file");
    let raw_users: Vec<UserInfo> = serde_json::from_str(&content).expect("Could not deserialize index file");
    println!("{:?}", raw_users);
    let mut by_id: HashMap<i32, UserInfo> = HashMap::new();
    let mut by_name: HashMap<String, &'a UserInfo> = HashMap::new();
    let mut by_provid: HashMap<String, &'a UserInfo> = HashMap::new();
    for user in raw_users {
        let prov_key = provider_key(user.prov.to_owned(), user.prov_u_id.to_owned());
        if !by_id.contains_key(&user.int_idx) && !by_name.contains_key(&user.name) && !by_provid.contains_key(&prov_key) {
            let u1 = user.clone();
            let ref_user = &u1;
            by_id.insert(u1.int_idx, u1);
            by_name.insert(u1.name.to_owned(), ref_user);
            by_provid.insert(prov_key, ref_user);
        }
    }
    Users {
        by_id: by_id,
        by_name: by_name,
        by_provid: by_provid,
    }
}

My idea is to have the first map contain the owned objects, and the two others contain references.
But it doesn't compile because "u1 does not live long enough". I would like to make explicit that the lifetime of the reference is the same as that of the owned object itself, but I don't know how. How can I do that?

Thanks
Joel

This won't work, because you're trying to make a struct which references itself (the UserInfos are owned by the struct). Try making it look like this instead:

pub struct Users<'a> {
    by_id: &'a HashMap<i32, UserInfo>,
    by_name: HashMap<String, &'a UserInfo>,
    by_provid: HashMap<String, &'a UserInfo>,
}

The implication is that something else will have to own the data. In short, whenever you make a struct with a lifetime parameter, you should treat that struct as a 'view' into some other object. The lifetime parameter means the struct borrows data that some other value owns; it can't also be the owner of that data.


Along with what @skysch posted, your immediate problem stems from the fact that u1 is only alive until you move it into the hashmap. After that, the variable is dead, and all references to it are invalidated. Hence, you can't put a reference to it in a hashmap.

However, you can first try writing some code like this to get around that, again with the caveat that you can't have the users struct referencing itself:

pub fn load_users<'a>(raw_users: Vec<UserInfo>) -> Users<'a> {
    // ...
    for user in raw_users {
        // This loop is split from the other because you have to mutably borrow by_id
        // here, and can't store references to its data while it's mutably borrowed
        let u1 = user.clone();
        by_id.insert(u1.int_idx, u1);
    }

    for val_ref in by_id.values() {
        let prov_key = provider_key(val_ref.prov.to_owned(), val_ref.prov_u_id.to_owned());
        by_name.insert(val_ref.name.to_owned(), val_ref);
        by_provid.insert(prov_key, val_ref);
    }
    // ...
}

This direct change alone won't actually work, because by_id only lives as long as the function, so you can't keep references to it in the other hashmaps.

However, code along the lines of:

pub fn load_ids(raw_users: &Vec<UserInfo>) -> HashMap<i32, UserInfo> {
    let mut by_id = HashMap::new();
    for user in raw_users {
        // This loop is split from the other because you have to mutably borrow by_id
        // here, and can't store references to its data while it's mutably borrowed
        let u1 = user.clone();
        by_id.insert(u1.int_idx, u1);
    }

    by_id
}

pub fn load_users<'a>(by_id: &'a HashMap<i32, UserInfo>) -> Users<'a> {
    let mut by_name = HashMap::new();
    let mut by_provid = HashMap::new();
    // by_id is only borrowed immutably here, so references to its values can be stored
    for val_ref in by_id.values() {
        let prov_key = provider_key(val_ref.prov.to_owned(), val_ref.prov_u_id.to_owned());
        by_name.insert(val_ref.name.to_owned(), val_ref);
        by_provid.insert(prov_key, val_ref);
    }
    Users {
        by_id: by_id,
        by_name: by_name,
        by_provid: by_provid,
    }
}

is probably closer to what you want.
(I haven't compiled and run this, so there may be typos.)


Thanks for the answers! I'm making some progress, but I still have to glue everything together. I'll continue later, but for the time being there's one thing I don't understand:

@skysch, why do you say the struct references itself? In my understanding, the Users struct holds both the owned UserInfos and the references to them, but the Users struct itself is at a higher scope, so I don't see any kind of cycle here. Am I missing something?

The struct Users<'a> owns three HashMaps, and one of those maps owns its UserInfo values, while the other two own references. Ownership is transitive except through references, so the UserInfos in the first hashmap are both owned by Users and referenced by it.

It's not a cycle in the sense that "the hand points at itself" is a hand pointing at the same hand, but in the sense that "the hand points at the foot" is still you pointing at yourself. Users contains references to data owned by Users. If that struct (and the hashmap it contains) is moved, those references will be invalidated, so the struct is not self-consistent across moves.
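
To make the contrast concrete, here is a minimal sketch of one way to sidestep the self-reference altogether. It is not what was proposed earlier in the thread, just an illustration: the secondary map stores the id key instead of a reference. OwnedUsers, name_to_id, insert, get_by_id and get_by_name are made-up names, and UserInfo is cut down to two fields without the serde derives. Since no field borrows from another field, the struct needs no lifetime parameter and stays valid when it is moved.

```rust
use std::collections::HashMap;

// Cut-down stand-in for the UserInfo struct from the question (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct UserInfo {
    pub int_idx: i32,
    pub name: String,
}

// Hypothetical alternative: the secondary map holds ids, not references.
#[derive(Debug, Default)]
pub struct OwnedUsers {
    by_id: HashMap<i32, UserInfo>,
    name_to_id: HashMap<String, i32>,
}

impl OwnedUsers {
    // Adds a user unless its id or name is already taken.
    // Returns true if the user was added.
    pub fn insert(&mut self, user: UserInfo) -> bool {
        if self.by_id.contains_key(&user.int_idx) || self.name_to_id.contains_key(&user.name) {
            return false;
        }
        self.name_to_id.insert(user.name.clone(), user.int_idx);
        self.by_id.insert(user.int_idx, user);
        true
    }

    pub fn get_by_id(&self, id: i32) -> Option<&UserInfo> {
        self.by_id.get(&id)
    }

    // Two lookups: name -> id, then id -> user.
    pub fn get_by_name(&self, name: &str) -> Option<&UserInfo> {
        let id = self.name_to_id.get(name)?;
        self.by_id.get(id)
    }
}
```

The cost is a second hash lookup per query by name, and the name string is still cloned once as a key. In exchange, the returned references borrow from the struct only for as long as the caller holds them, which is the ordinary borrowing pattern.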


Thanks again for the help, much appreciated. It compiles now.
For the record, I've separated the owned values from the references, as you suggested. I ended up with this:

    pub fn load_users() -> Vec<UserInfo> {
        let content = files::read_file("users/_index.json").expect("Could not open users index file");
        serde_json::from_str(&content).expect("Could not deserialize index file")
    }

    struct Indexes<'a> {
        by_id: HashMap<i32, &'a UserInfo>,
        by_name: HashMap<String, &'a UserInfo>,
        by_provid: HashMap<String, &'a UserInfo>,
    }

    fn provider_key(pname: &String, id: &String) -> String { format!("{}-{}", pname, id) }

    fn build_indexes<'a>(users: &'a Vec<UserInfo>) -> Indexes<'a> {
        let mut by_id: HashMap<i32, &'a UserInfo> = HashMap::new();
        let mut by_name: HashMap<String, &'a UserInfo> = HashMap::new();
        let mut by_provid: HashMap<String, &'a UserInfo> = HashMap::new();
        for user in users {
            let prov_key = provider_key(&user.prov, &user.prov_u_id);
            // `user` is already a `&'a UserInfo`, so it can be stored directly
            by_id.insert(user.int_idx, user);
            by_name.insert(user.name.to_owned(), user);
            by_provid.insert(prov_key, user);
        }
        Indexes {
            by_id: by_id,
            by_name: by_name,
            by_provid: by_provid,
        }
    }

    pub struct UsersService<'a> {
        users: &'a Vec<UserInfo>,
        indexes: Indexes<'a>,
        // ...
    }

    impl<'a> UsersService<'a> {
        pub fn new(users: &'a Vec<UserInfo>) -> Self {
            let indexes = build_indexes(users);
            UsersService {
                users: users,
                indexes: indexes,
            }
        }
        // ...
    }
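
For anyone reading later, here is a rough sketch of what the calling side of this design might look like. It is a simplified version of the code above, under a few assumptions: UserInfo is cut down to two fields with no serde derives, the Indexes struct is folded directly into UsersService, sample_users is a hypothetical stand-in for load_users (so no file is read), and find_by_id, find_by_name and name_of are made-up names. The point it shows is that the Vec has to be owned by something that outlives the service.

```rust
use std::collections::HashMap;

// Cut-down UserInfo without the serde derives (assumption, std only).
#[derive(Debug, Clone, PartialEq)]
pub struct UserInfo {
    pub int_idx: i32,
    pub name: String,
}

// The service only borrows the users; it never owns them.
pub struct UsersService<'a> {
    by_id: HashMap<i32, &'a UserInfo>,
    by_name: HashMap<String, &'a UserInfo>,
}

impl<'a> UsersService<'a> {
    pub fn new(users: &'a [UserInfo]) -> Self {
        let mut by_id = HashMap::new();
        let mut by_name = HashMap::new();
        for user in users {
            by_id.insert(user.int_idx, user);
            by_name.insert(user.name.clone(), user);
        }
        UsersService { by_id, by_name }
    }

    // Hypothetical accessors, not part of the original code.
    pub fn find_by_id(&self, id: i32) -> Option<&'a UserInfo> {
        self.by_id.get(&id).copied()
    }

    pub fn find_by_name(&self, name: &str) -> Option<&'a UserInfo> {
        self.by_name.get(name).copied()
    }
}

// Hypothetical stand-in for load_users(): returns owned data.
pub fn sample_users() -> Vec<UserInfo> {
    vec![
        UserInfo { int_idx: 1, name: "alice".to_string() },
        UserInfo { int_idx: 2, name: "bob".to_string() },
    ]
}

// The caller owns the Vec; the service is a view that lives inside its scope.
pub fn name_of(id: i32) -> Option<String> {
    let users = sample_users();
    let service = UsersService::new(&users);
    let name = service.find_by_id(id).map(|u| u.name.clone());
    name
}
```

Note that name_of returns an owned String rather than a reference, because both the Vec and the service are dropped when the function returns. To hand out references instead, the Vec would need to live in the caller of name_of.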