Style Question: Are Rc<> pointers acceptable in public interfaces?

Hello,

TLDR: Do you think Rc pointers in public interfaces are acceptable? Or is there another solution to the below situation I haven't thought of?

This isn't a "how do I...?" question, but I'm wondering if some of the more experienced Rustaceans have an opinion about the best way to structure the interface for the following situation.

In the beginning, I had a struct that kept clusters of points (as indices), and a reference to the points themselves. It looked like this:

pub struct Cluster<PointT> {
    member_indices: Vec<usize>,
    cluster_mean: PointT,
    cluster_meta_whatever: PointT,
}

pub struct ClusterSet<'a, PointT> {
    points: &'a [PointT],
    clusters: Vec<Cluster<PointT>>,
}

Everything seemed to be good, and I built quite a lot using this interface. Typically clusters were short-lived things. I'd have functions that built clusters, used them, and dropped them.

But one day, I wanted to have clusters that hung around, so I tried to encapsulate everything in a self-contained struct like this:

pub struct BigStruct<PointT> {
    shared_points: Vec<PointT>,
    clusters_a: ClusterSet<'?, PointT>,
    clusters_b: ClusterSet<'?, PointT>,
    clusters_c: ClusterSet<'?, PointT>,
}

Obviously this isn't going to work. There is no lifetime I could write that would satisfy the borrow checker, because a struct member cannot borrow from another member of the same struct. I understand why it has to be this way, but it still makes Rust particularly tricky in a number of situations that programmers coming from other languages don't think about.

Option 1, Dormant meta-data struct
One approach I could use (the one I did use) was to have two flavors of ClusterSet. The second flavor doesn't have a reference to its points (borrowed data).

pub struct DormantClusterSet<PointT> {
    // points: &'a [PointT],  // this member is gone in the dormant form
    clusters: Vec<Cluster<PointT>>,
}

Then, to use the ClusterSet, you need to "wake it up" by re-uniting it with its points.
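A minimal sketch of what that wake-up step might look like. The method names `wake`, `sleep`, and `cluster_points` are invented for illustration and are not from my actual code, and the cluster fields are trimmed down to keep it short:

```rust
// Hypothetical sketch: `wake`, `sleep`, and `cluster_points` are invented names.
pub struct Cluster<PointT> {
    pub member_indices: Vec<usize>,
    pub cluster_mean: PointT,
}

/// Owns only the cluster metadata; holds no borrow of the points.
pub struct DormantClusterSet<PointT> {
    clusters: Vec<Cluster<PointT>>,
}

/// The "awake" form, tied to the lifetime of the borrowed points.
pub struct ClusterSet<'a, PointT> {
    points: &'a [PointT],
    clusters: Vec<Cluster<PointT>>,
}

impl<PointT> DormantClusterSet<PointT> {
    pub fn new(clusters: Vec<Cluster<PointT>>) -> Self {
        Self { clusters }
    }

    /// Re-unite the metadata with a points buffer.
    /// Nothing here verifies that `points` is the buffer the clusters were built from.
    pub fn wake<'a>(self, points: &'a [PointT]) -> ClusterSet<'a, PointT> {
        ClusterSet { points, clusters: self.clusters }
    }
}

impl<'a, PointT> ClusterSet<'a, PointT> {
    /// Release the borrow and keep only the owned metadata.
    pub fn sleep(self) -> DormantClusterSet<PointT> {
        DormantClusterSet { clusters: self.clusters }
    }

    /// Resolve one cluster's indices against the borrowed points.
    /// Panics if an index is out of bounds for the current buffer.
    pub fn cluster_points(&self, cluster: usize) -> Vec<&'a PointT> {
        let points = self.points;
        self.clusters[cluster]
            .member_indices
            .iter()
            .map(|&i| &points[i])
            .collect()
    }
}
```

Because `wake` and `sleep` take `self` by value, a set is either dormant or awake, never both, and the borrow of the points ends as soon as the set goes back to sleep.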

The foot-gun with this approach is that there is no longer any guarantee that the points weren't modified behind the ClusterSet's back while it was dormant, or that the cluster set is being re-united with the same points buffer at all.

I could perform a checksum at wake-up time, but one of the design goals in storing the cluster set instead of re-creating it was to minimize the number of times I had to scan through the points buffer in memory.

Option 2, Use an Rc pointer
An alternative is to have ClusterSet store an Rc pointer to its points. This actually works across the board, and I think it solves all the issues I mentioned. But I have never seen an Rc pointer as part of a public-facing API. So I guess my misgiving here is really aesthetic more than practical.
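A minimal sketch of this option, assuming the points are held as an `Rc<[PointT]>`. The constructors and accessors are hypothetical, and the cluster fields are trimmed down:

```rust
use std::rc::Rc;

pub struct Cluster<PointT> {
    pub member_indices: Vec<usize>,
    pub cluster_mean: PointT,
}

/// No lifetime parameter: the set shares ownership of the points.
pub struct ClusterSet<PointT> {
    points: Rc<[PointT]>,
    clusters: Vec<Cluster<PointT>>,
}

impl<PointT> ClusterSet<PointT> {
    // Hypothetical constructor; note the Rc in the public signature.
    pub fn new(points: Rc<[PointT]>, clusters: Vec<Cluster<PointT>>) -> Self {
        Self { points, clusters }
    }

    pub fn points(&self) -> &[PointT] {
        &self.points
    }

    pub fn clusters(&self) -> &[Cluster<PointT>] {
        &self.clusters
    }
}

/// The self-contained struct now compiles, because nothing borrows from a sibling field.
pub struct BigStruct<PointT> {
    shared_points: Rc<[PointT]>,
    clusters_a: ClusterSet<PointT>,
    clusters_b: ClusterSet<PointT>,
}

impl<PointT> BigStruct<PointT> {
    pub fn new(points: Vec<PointT>) -> Self {
        // Converting a Vec into Rc<[T]> moves the elements into one shared allocation.
        let shared_points: Rc<[PointT]> = Rc::from(points);
        Self {
            clusters_a: ClusterSet::new(Rc::clone(&shared_points), Vec::new()),
            clusters_b: ClusterSet::new(Rc::clone(&shared_points), Vec::new()),
            shared_points,
        }
    }
}
```

Each `Rc::clone` only bumps a reference count, so the points are not copied. While more than one handle exists, `Rc::get_mut` returns `None`, so the points cannot be mutated through the `Rc` behind a cluster set's back (unless PointT itself uses interior mutability).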

I've always thought of Rc pointers as belonging in the guts of an implementation, and as a feature of last resort for when you can't make the ownership model work any other way.

So, that's really why I'm asking. Do you think Rc pointers in interfaces are acceptable? Or is there another solution I haven't thought of?

Thanks for reading all that!

The official MongoDB Rust driver used to contain typedefs over Arc in its public API. I don't think there's anything wrong with that. However, if you want to hide the fact that you are using Rc, you can always wrap it in a newtype.

Are you implying serialization and deserialization here? Note that serializing a general directed graph naïvely is a nontrivial problem in itself – if you have multiple Rcs to the same Node, how/when should it be serialized during graph traversal? This is something that you will need to think about eventually, regardless of what representation you are using.

Thanks! I think wrapping the Rc<> in my own type is the way to overcome the irrational "that's ugly" feeling I get when I think about requiring an Rc<> as an argument. Thanks for helping me overcome my neurosis.

BTW, my reason for not wanting Rc in the public API is that you don't want to stop someone from using your code across threads later, at which point they will wish you had used Arc instead. If you wrap it up like you're thinking, you can later make the type Send by switching to Arc internally without breaking anyone's code.
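A rough sketch of that wrapping idea, assuming a hypothetical newtype called `SharedPoints` (the name and methods are made up for illustration). The pointer type appears only in the private field, so it could be changed to `std::sync::Arc<[PointT]>` without touching any public signature:

```rust
use std::ops::Deref;
use std::rc::Rc;

/// Hypothetical newtype; callers never name the pointer type inside.
/// Changing the field to `std::sync::Arc<[PointT]>` is the only edit needed to go thread-safe.
pub struct SharedPoints<PointT>(Rc<[PointT]>);

impl<PointT> SharedPoints<PointT> {
    pub fn new(points: Vec<PointT>) -> Self {
        // `into()` works for both Rc<[T]> and Arc<[T]>.
        SharedPoints(points.into())
    }
}

// Manual impl so that cloning the handle does not require `PointT: Clone`.
impl<PointT> Clone for SharedPoints<PointT> {
    fn clone(&self) -> Self {
        SharedPoints(self.0.clone())
    }
}

// Read-only access to the points as a plain slice.
impl<PointT> Deref for SharedPoints<PointT> {
    type Target = [PointT];
    fn deref(&self) -> &[PointT] {
        &self.0
    }
}

pub struct ClusterSet<PointT> {
    points: SharedPoints<PointT>,
}

impl<PointT> ClusterSet<PointT> {
    // The public signature mentions only the newtype, never Rc or Arc.
    pub fn new(points: SharedPoints<PointT>) -> Self {
        Self { points }
    }

    pub fn points(&self) -> &[PointT] {
        &self.points
    }
}
```

With `Arc` inside, `SharedPoints<PointT>` would be `Send` and `Sync` whenever `PointT` is both, and existing single-threaded callers would keep compiling unchanged.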
