Possible to swap collection impl. without impacting callers?

Hello! Is it possible to change which concrete collection my function returns (Vec<String> vs. HashSet<String>) without having to change all the call sites and their callers?

In Java, my function would declare that it returns the most abstract super-interface, Collection<String>, while actually returning the concrete implementation ArrayList. Swapping that for a set would have no impact on the callers.

Is this possible with Rust generics? And is it good practice or bad? I assume the compiler needs to know the concrete type so that it can allocate the required space on the stack, etc. So perhaps in Rust my only option is to declare and use Vec<String>, and if I change that, I must also change all the places that depend on it?

Thank you! #newbie

The compiler, yes, but not the consumer of the code! You can invent a trait that you then implement for both Vec<T> and HashSet<T>, and then return impl MyCollection from the function. Even if you don't use impl Trait, you can create your own Collection newtype, which privately contains the concrete collection type, but that is not observable by anyone else, since it's private.
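
To make the trait idea concrete, here is a minimal sketch. The trait name MyCollection, its two methods, and the function tags are all made up for illustration; pick whatever operations your callers actually need.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical trait: the only operations callers may rely on.
pub trait MyCollection<T> {
    fn len(&self) -> usize;
    fn contains(&self, item: &T) -> bool;
}

impl<T: PartialEq> MyCollection<T> for Vec<T> {
    fn len(&self) -> usize {
        Vec::len(self)
    }
    fn contains(&self, item: &T) -> bool {
        // Linear scan over the underlying slice.
        self.as_slice().contains(item)
    }
}

impl<T: Eq + Hash> MyCollection<T> for HashSet<T> {
    fn len(&self) -> usize {
        HashSet::len(self)
    }
    fn contains(&self, item: &T) -> bool {
        // Hash lookup.
        HashSet::contains(self, item)
    }
}

/// Callers only see "some MyCollection<String>", not the concrete type.
pub fn tags() -> impl MyCollection<String> {
    vec!["a".to_string(), "b".to_string()]
    // Swapping the body for a set should compile without touching callers:
    // HashSet::from(["a".to_string(), "b".to_string()])
}
```

A caller can write tags().len() or tags().contains(&"a".to_string()), but cannot index into the result or name its type, so the concrete collection stays an implementation detail.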

In theory, it can be good practice, although in the specific case of collections, I don't really see what one can do with an arbitrary collection. (This is pretty much why there is no Collection trait in the Rust stdlib.) Perhaps you are looking for impl Iterator or impl IntoIterator?
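
For the impl IntoIterator suggestion, a rough sketch could look like this (the function name names and its contents are placeholders, not anything from your code):

```rust
/// The caller only learns that the result can be iterated to yield Strings.
pub fn names() -> impl IntoIterator<Item = String> {
    vec!["alice".to_string(), "bob".to_string()]
    // A set would satisfy the same signature:
    // std::collections::HashSet::from(["alice".to_string(), "bob".to_string()])
}
```

Callers can then write `for name in names() { ... }` or collect the items into whatever collection they prefer, without depending on Vec or HashSet.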

Rust isn't as dynamic as Java (roughly speaking, Rust defaults to having everything final), and Rust programs tend to avoid adding abstractions unless necessary.

For type abstractions you need to work with traits and generics. dyn Trait (similar to Java interfaces) has a runtime overhead that concrete Rust types don't have, and statically dispatched generics can get verbose to use.
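
As a sketch of where dyn Trait comes in: impl Trait needs a single concrete type behind it, so a function that picks the collection at runtime has to box the result. The function items and its dedup flag are invented for this example.

```rust
use std::collections::HashSet;

/// Returns either a Vec-backed or a HashSet-backed iterator, chosen at
/// runtime. This costs a heap allocation and dynamic dispatch on each
/// call to next().
pub fn items(dedup: bool) -> Box<dyn Iterator<Item = String>> {
    let raw = vec!["a".to_string(), "b".to_string(), "a".to_string()];
    if dedup {
        // Duplicates are dropped and the order becomes unspecified.
        let set: HashSet<String> = raw.into_iter().collect();
        Box::new(set.into_iter())
    } else {
        Box::new(raw.into_iter())
    }
}
```

With `-> impl Iterator<Item = String>` this body would not compile, because the two branches return different concrete types.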

Collection types in the standard library are generally good. Vec is as simple and memory-efficient as it can get, so if your data is already in a Vec, just return what you have.

BTW: Rust programs also avoid getters and setters. If a field can be public, just make it public. Private fields are kept private only when that is needed for safety/correctness, not just in case they need to be changed in the future.

I'll push back on this a bit, because it does impact the callers: things would almost certainly start being in a different order than they used to be, and they can't store duplicates any more. Thus I think that for the storage type this is often questionably useful.

I'd second H2CO3 here -- the place where this is most useful is not for the storage type, but for various accessors. If you're returning a subset or a modified part or whatever, then -> impl Iterator<Item = Whatever> can be great. But for ownership, might as well just tell them what it is.
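
Roughly what that split looks like, with a made-up Inventory type: the storage is stated openly as a Vec, while the accessor that returns a subset hides its iterator type.

```rust
pub struct Inventory {
    /// Owned storage: just say what it is.
    pub items: Vec<String>,
}

impl Inventory {
    /// Accessor returning a subset. The filter/map adapter types stay
    /// hidden behind `impl Iterator`, so the body can change freely.
    pub fn starting_with<'a>(
        &'a self,
        prefix: &'a str,
    ) -> impl Iterator<Item = &'a str> + 'a {
        self.items
            .iter()
            .filter(move |s| s.starts_with(prefix))
            .map(|s| s.as_str())
    }
}
```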

(That also helps avoid the "oh, it has a Contains method; I'll use that" problem, where the collection interface declares the method so it gets implemented, but it's actually O(n) because the collection happens to be an ArrayList, not a Set.)

If you are writing helper methods and the caller isn't meant to be storing the result in a struct, often people will use impl Iterator so you can hide the return type. This is particularly useful when your computation could be done lazily using iterator chains instead of eagerly collecting the values into a Vec that will be immediately consumed by a for-loop.
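
A small sketch of the lazy style, using an invented helper even_squares: nothing is computed until the caller pulls items, and no intermediate Vec is allocated.

```rust
/// Squares of the even numbers below `limit`, produced on demand.
pub fn even_squares(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0).map(|n| n * n)
}
```

A caller that only needs the first few values, e.g. `even_squares(1_000).take(3)`, stops the computation after three items instead of paying for the whole range.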

Rust doesn't tend to have the same sorts of collection interface hierarchies that you see in C# or Java, though. Writing those traits tends to get awkward without either advanced type system features ("Generic Associated Types") or using boxed values and dynamic dispatch (i.e. Java interfaces).

I would also argue that switching from a Vec<T> to a HashSet<T> happens infrequently enough that it doesn't make sense to insert an abstraction.

If I felt that I might need to change collection types at some point down the track, my typical solution would be to

  1. Make sure my code is sufficiently decoupled/encapsulated that the change doesn't require much code churn, or
  2. Create a domain-specific type which contains the collection of Ts and only exposes common functionality, letting me switch out the underlying collection type without breaking callers. This adds a larger burden on the developer to justify the extra maintenance cost, though.
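
Option 2 might be sketched like this. The type Tags and its methods are hypothetical; the point is only that the field is private, so the Vec could later become a HashSet<String> without changing the public API.

```rust
/// Domain-specific wrapper around the collection.
pub struct Tags {
    // Private: callers cannot observe or name the concrete collection.
    inner: Vec<String>,
}

impl Tags {
    pub fn new(items: impl IntoIterator<Item = String>) -> Self {
        Tags { inner: items.into_iter().collect() }
    }

    pub fn len(&self) -> usize {
        self.inner.len()
    }

    pub fn is_empty(&self) -> bool {
        self.inner.is_empty()
    }

    pub fn contains(&self, tag: &str) -> bool {
        // Linear scan for a Vec; would be a hash lookup for a HashSet.
        self.inner.iter().any(|t| t == tag)
    }

    /// Iteration order is deliberately not part of the contract.
    pub fn iter(&self) -> impl Iterator<Item = &str> + '_ {
        self.inner.iter().map(String::as_str)
    }
}
```

Note that behavior around duplicates (for example, what len reports) would still differ between the two backing collections, so the documentation of such a type has to say what it does and does not promise.
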

Thank you very much, everybody! I will stick to concrete collections but remember that I can use impl Iterator if I really need it.

You are right, I did not phrase it well. The point is that my function should not make any promises about order or duplicates, so that I remain free to change whether it returns items in a particular order or with duplicates. I.e., the callers should not know and thus should not be able to care.
