Simple question about using usize

I have a graph data structure that uses u32 for node representation. I have methods like num_nodes() which I know can't exceed the value range of u32 because of this. But I frequently need u32 in contexts that expect a usize (like when I make a Vec<u32> with the with_capacity method that expects a usize).

Should I just change the return type of num_nodes() to usize or keep the constant casting?

As I see it that would be a loss of information because I know num_nodes() can't exactly return a bigger value, but on the other hand it's kind of inconvenient and num_nodes() does represent a size, so usize might be a better fit.

If you're on a 32-bit architecture, usize and u32 are the same. If you're on a 64-bit architecture, usize might be more efficient than u32. In any case it sounds like you're indulging in premature optimization. Get it working, test it to make sure it's correct, and only then measure it if the performance is less than you need.

I previously used usize to represent the nodes themselves, but quickly got an OOM because the algorithm allocated too much memory. So I changed it to u32. Now I wasn't sure if I should also change the returned sizes to usize. By what you say, it seems like using usize for the sizes would be a better fit, but keeping the nodes themselves represented as u32.

You could also use a macro to convert u32 to usize in index expressions. See https://users.rust-lang.org/t/using-isize-vs-usize/22816/2

1 Like

I would separate out the idea of the node ids from the counts of nodes.

For the things that are counts, and you use as capacities and such, usize is fine. It'll be easier, and there's probably relatively few of them anyway. (They'd largely be transient things that you're not storing all over the place.)

But then for the ids, you might add a newtype like this:

#[derive(Copy, Clone, Debug, Hash, PartialEq, Eq)]
struct NodeIndex(u32);
impl<T> std::ops::Index<NodeIndex> for [T] {
    type Output = T;
    fn index(&self, i: NodeIndex) -> &T {
        use std::convert::TryInto;
        // Won't actually ever panic on 32- or 64-bit platforms
        let i: usize = i.0.try_into().unwrap();
        &self[i]
    }
}

so that you can conveniently index slices by those ids (and, as a bonus, can't accidentally do things like add them).

4 Likes

That seems like a good solution when you have to do more complex checks like in the post you linked, but I think I prefer x as usize instead.

I thought about it a bit more though and decided to use usize since it seems more natural. I do require it in contexts that need a u32 though, as well. Casting here isn't bad, but I think it would be nice if I could make the method generic over u32 and usize. I know there's the num crate, but is it the only way? If I use the num crate, I can't restrict it to just u32 and usize either, but any number. I just need usize and u32 though.

You can always make your own trait that you implement only for those two types.

Oh, I didn't think about it like that! That took me a bit to understand. So that basically converts the NodeIndex to a proper usize, but I can still use usize to normally index my nodes. That also makes it easy to change the implementation later. Nice!

Could I also use .into() here? I don't really get the difference.

Then you implemented this for a generic T. I don't need the output to be generic, but that would basically allow me to have the nodes themselves be of arbitrary type?

Is this use std::convert::TryInto import right within the method idiomatic?

That is suspicious at best. There's probably a 4-byte difference between u32 and usize on a 64-bit system. If you managed to get an OOM with that, then you were probably using so much memory in the first place that you were already near the memory limit. What type of other data do your nodes contain? Does its size dominate the size of the node IDs? And how many nodes do you have?

2 Likes

Yes, you were right. The pseudocode I based this on creates a ton of copies, but that isn't actually necessary. :sweat: ... Now it went from completely blowing everything out of proportion (and I have a lot of memory) to consuming almost nothing. :slight_smile:

1 Like

If you need it all over the place, you might want to move it outside the function body, but usually yes.

1 Like

Honestly the reason it's inside the method here is because for little demos having it inside makes it self-contained and thus easier to copy-paste correctly. I'm hopeful that it'll eventually just be in the prelude and thus won't need to be imported (the same way Into isn't). For well-known traits with specifically-named methods, I'd usually just bring them in scope for the whole file.

Importing inside a method I usually only do for things I don't want at the wider scope. A common example would be using SomeEnum::*; to save retyping the enum name a bunch of times inside a method that uses it heavily.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.