What is the better code design?


#1

I’m learning Rust, coming from a Java background. I mention this because I’m struggling to understand some of Rust’s memory basics around ownership and lifetimes. Where I really get confused is with classes/structures.

So I started a project, which I thought would be a better way for me to learn. I created two structures with a has-a relationship. Then I started wondering what problems I would have with my first approach, so after some reading I created them slightly differently.

Now I am wondering whether even that is right. I would like some suggestions, if possible. Which is better? I realize it might be difficult without knowing the use cases: I’m leaning toward a RESTful-style API with standard CRUD functions (MySQL) and business logic. Hopefully that’s enough to generate some ideas.

Here’s my first design:

pub struct Company<'a>
{
    pub symbol : &'a str,
    pub name : &'a str
}

pub struct Trade<'a>
{
    pub company : &'a Company<'a>,
    pub when : DateTime<Local>,
    pub qty : f32,
    pub unit_price : f32
}

or my second design:

pub struct Company
{
    pub symbol : String,
    pub name : String
}

pub struct Trade
{
    pub company : Company,
    pub when : DateTime<Local>,
    pub qty : f32,
    pub unit_price : f32
}

Or is some other design going to be better long term?

Much appreciated,
Matt


#2

It very likely won’t be your first design. Second design would be the easiest but has the potential to leave performance on the table if some use cases could allow borrowing instead of owning. There’s a way to abstract over owned vs borrowed by using generics, but that will proliferate generic type parameters everywhere Company and Trade are used; you’d get some flexibility in terms of using owned vs borrowed on a case by case basis however.
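A minimal sketch of the generic approach mentioned above, assuming a type parameter `S` for the string fields. The `label` method and the sample values are hypothetical, only there to show that one implementation serves both the owned and the borrowed flavor. Note how `S` has to appear wherever `Company` is named.

```rust
// Hypothetical generic version: `S` is any string-like type.
#[derive(Debug, Clone, PartialEq)]
pub struct Company<S> {
    pub symbol: S,
    pub name: S,
}

impl<S: AsRef<str>> Company<S> {
    /// Formats the company the same way whether `S` is owned or borrowed.
    pub fn label(&self) -> String {
        format!("{} ({})", self.name.as_ref(), self.symbol.as_ref())
    }
}

fn main() {
    // Borrowed: no allocation, but tied to the lifetime of the string data.
    let borrowed: Company<&str> = Company { symbol: "ACME", name: "Acme Corp" };

    // Owned: free to outlive whatever produced the strings.
    let owned: Company<String> = Company {
        symbol: String::from("ACME"),
        name: String::from("Acme Corp"),
    };

    assert_eq!(borrowed.label(), owned.label());
    println!("{}", owned.label());
}
```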

Let me throw out some random thoughts.

Do you intend to maintain an in-memory cache/database of Company and Trade values? If so, you probably wouldn’t want each Trade to have its own Company value as you’ll likely have lots of duplicated strings in memory. So using Rc might come in handy. Or Arc if you intend to share across threads.
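As a rough illustration of the `Rc` idea, here is a sketch in which every `Trade` shares one `Company` allocation. The `make_trades` helper and the sample values are made up for the example, and the `when` field is left out so the code needs nothing beyond the standard library.

```rust
use std::rc::Rc;

pub struct Company {
    pub symbol: String,
    pub name: String,
}

pub struct Trade {
    pub company: Rc<Company>, // shared handle, the strings are not duplicated
    pub qty: f32,
    pub unit_price: f32,
}

/// Hypothetical helper: builds `n` trades that all point at the same Company.
pub fn make_trades(company: &Rc<Company>, n: usize) -> Vec<Trade> {
    (0..n)
        .map(|i| Trade {
            company: Rc::clone(company), // bumps a reference count only
            qty: (i + 1) as f32,
            unit_price: 10.0,
        })
        .collect()
}

fn main() {
    let acme = Rc::new(Company {
        symbol: "ACME".into(),
        name: "Acme Corp".into(),
    });
    let trades = make_trades(&acme, 3);

    // One count for `acme` itself plus one per trade.
    assert_eq!(Rc::strong_count(&acme), 4);
    // Every trade points at the very same allocation.
    assert!(Rc::ptr_eq(&trades[0].company, &trades[2].company));
}
```

Swapping `Rc` for `Arc` gives the same shape when the values have to cross threads.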

Will Company and/or Trade modify their internal strings after they’ve been created? If not, Box<str> or Rc<str> may be better.
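A small sketch of what that could look like, assuming the strings never change after construction. Which field uses `Rc<str>` and which uses `Box<str>` is an arbitrary choice for illustration, and `make_company` is a hypothetical helper.

```rust
use std::mem::size_of;
use std::rc::Rc;

pub struct Company {
    pub symbol: Rc<str>, // immutable and cheap to clone (reference counted)
    pub name: Box<str>,  // immutable and uniquely owned, no capacity field
}

/// Hypothetical constructor: shares the symbol, takes ownership of the name.
pub fn make_company(symbol: &Rc<str>, name: String) -> Company {
    Company {
        symbol: Rc::clone(symbol),
        name: name.into_boxed_str(),
    }
}

fn main() {
    let symbol: Rc<str> = Rc::from("ACME");
    let company = make_company(&symbol, String::from("Acme Corp"));

    assert_eq!(&*company.symbol, "ACME");
    assert_eq!(&*company.name, "Acme Corp");
    // The symbol text is shared between `symbol` and `company.symbol`.
    assert_eq!(Rc::strong_count(&symbol), 2);

    println!(
        "Box<str>: {} bytes, String: {} bytes",
        size_of::<Box<str>>(),
        size_of::<String>()
    );
}
```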

If you maintain a cache/database of Company values, you may want to have Trade just store some cheap to copy handle to the Company, perhaps a numeric id that’s associated with a company. Whenever you need more details about the company, you query the cache.
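One possible shape for the handle idea, as a sketch only: `CompanyId` and `CompanyCache` are hypothetical names, the cache is a plain `HashMap`, and ids are handed out sequentially. The `when` field is omitted to keep the example within the standard library.

```rust
use std::collections::HashMap;

/// Cheap-to-copy handle that stands in for a Company.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CompanyId(pub u32);

pub struct Company {
    pub symbol: String,
    pub name: String,
}

pub struct Trade {
    pub company: CompanyId, // no borrow, no duplicated strings
    pub qty: f32,
    pub unit_price: f32,
}

/// Hypothetical in-memory cache that owns every Company.
#[derive(Default)]
pub struct CompanyCache {
    by_id: HashMap<CompanyId, Company>,
    next_id: u32,
}

impl CompanyCache {
    /// Takes ownership of the company and returns its handle.
    pub fn insert(&mut self, company: Company) -> CompanyId {
        let id = CompanyId(self.next_id);
        self.next_id += 1;
        self.by_id.insert(id, company);
        id
    }

    /// Lends the company back out for as long as the cache is borrowed.
    pub fn get(&self, id: CompanyId) -> Option<&Company> {
        self.by_id.get(&id)
    }
}

fn main() {
    let mut cache = CompanyCache::default();
    let acme = cache.insert(Company {
        symbol: "ACME".into(),
        name: "Acme Corp".into(),
    });
    let trade = Trade { company: acme, qty: 5.0, unit_price: 12.5 };

    // Query the cache only when the details are actually needed.
    let name = cache.get(trade.company).map(|c| c.name.as_str());
    assert_eq!(name, Some("Acme Corp"));
}
```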

You can also create different flavors of the types, each tailored to some subset of functionality and then add conversions/facades to unify them for APIs that don’t care. For instance, you can have Trade and Company representations that are purely reference based (eg simple stack usage, maybe quick decoding off a buffer) and then other representations that own the data (eg long term storage in the in-memory cache/db).
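To make the “different flavors” idea concrete, here is a sketch with a borrowed view and an owned type, connected by conversions. `CompanyRef`, `as_view`, `decode`, and the `SYMBOL,Name` line format are all assumptions invented for the example.

```rust
/// Borrowed flavor: cheap to build while decoding a buffer.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CompanyRef<'a> {
    pub symbol: &'a str,
    pub name: &'a str,
}

/// Owned flavor: suitable for long-term storage.
#[derive(Debug, Clone, PartialEq)]
pub struct Company {
    pub symbol: String,
    pub name: String,
}

impl<'a> From<CompanyRef<'a>> for Company {
    /// Copies the borrowed strings so the result no longer depends on the buffer.
    fn from(view: CompanyRef<'a>) -> Company {
        Company {
            symbol: view.symbol.to_owned(),
            name: view.name.to_owned(),
        }
    }
}

impl Company {
    /// Lends the owned data out as the borrowed flavor.
    pub fn as_view(&self) -> CompanyRef<'_> {
        CompanyRef { symbol: &self.symbol, name: &self.name }
    }
}

/// Hypothetical decoder: splits "SYMBOL,Name" without allocating.
pub fn decode(line: &str) -> Option<CompanyRef<'_>> {
    let (symbol, name) = line.split_once(',')?;
    Some(CompanyRef { symbol: symbol.trim(), name: name.trim() })
}

fn main() {
    let buffer = String::from("ACME, Acme Corp");
    let view = decode(&buffer).expect("well-formed line");
    let owned = Company::from(view);
    drop(buffer); // the owned copy survives the buffer

    assert_eq!(owned.name, "Acme Corp");
    assert_eq!(owned.as_view().symbol, "ACME");
}
```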

I think a lot of this will depend on how data flows and interacts in your program. You’ll be better off modeling your types based on those interactions.


#3

You almost never want & in structs. References are not like pointers. They are for temporary use, and structs that hold them are a pain to use across more than one function.


#4

Thnx @vitalyd

I’m not sure caching/performance is a concern, only because this is just a personal project and not something that will have high volume. That said, it doesn’t hurt to think about it from a learning perspective.

Let me extend my thinking on this a little: the things I will want to do are

  1. get a list of all open positions (aka the portfolio)
  2. get a list of transactions to make a trade
  3. calculate profit/loss

A snippet of code to help explain:

// a position is one or more open trades
// for a given company.
pub struct Position
{
    // list_of_trades:
}

// a portfolio is a collection of positions
pub struct Portfolio
{
    // list_of_positions:
}

I think my biggest concern right now is that I don’t want to get into trouble with the borrowing/ownership rules. These seem to be the Rust concepts I have the hardest time grasping. So I want to make sure my structures don’t get me into a bind (like my first test did).

Thnx
Matt


#5

Do you mean the references or the structs?
Thnx
Matt


#6

Structs with references are as limited as the references themselves.

In Rust you use references mainly:

  • For function arguments. Make functions take &[T], and never &Vec<T>, since it’s more flexible for the function callers.
  • Return types only if the function is returning data from its argument (including lending from &self). Functions that create new objects must return owned values.
  • Closures in iterators used within a function. This is also where you might sometimes use a struct with a reference to hold temporary data, but it’s rare.

For everything else that is more permanent and shared across functions, owned values make the most sense.
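The guidelines above could be sketched like this. The struct fields and the function names (`total_cost`, `largest`, `open_position`) are hypothetical and only loosely based on the types earlier in the thread.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Trade {
    pub symbol: String,
    pub qty: f32,
    pub unit_price: f32,
}

/// Longer-lived data: the struct owns its trades.
pub struct Position {
    pub trades: Vec<Trade>,
}

/// Argument: takes a slice, so callers can pass a Vec, an array, or part of either.
pub fn total_cost(trades: &[Trade]) -> f32 {
    trades.iter().map(|t| t.qty * t.unit_price).sum()
}

/// Return type: a reference, because the result is lent from the argument.
pub fn largest(trades: &[Trade]) -> Option<&Trade> {
    trades.iter().max_by(|a, b| a.qty.total_cmp(&b.qty))
}

/// Creates new data, so it returns an owned value.
pub fn open_position(symbol: &str, qty: f32, unit_price: f32) -> Position {
    Position {
        trades: vec![Trade { symbol: symbol.to_owned(), qty, unit_price }],
    }
}

fn main() {
    let mut position = open_position("ACME", 2.0, 10.0);
    position.trades.push(Trade {
        symbol: "ACME".to_owned(),
        qty: 3.0,
        unit_price: 20.0,
    });

    assert_eq!(total_cost(&position.trades), 80.0);
    assert_eq!(total_cost(&position.trades[..1]), 20.0); // a sub-slice works too
    assert_eq!(largest(&position.trades).map(|t| t.qty), Some(3.0));
}
```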


#7

As a learning exercise, getting into a bind is very useful - knowing what not to do is just as important as what to do :slight_smile: And hitting these corners first hand, rather than reading about them in the abstract, cements the concepts involved. So I personally find this valuable as well.

Here’s one way to decide about using references or not in structs. Ask yourself: if I’m taking a reference, who will actually own the data? Taking Trade and Company as an example, who would own those string slices? Assuming you have an answer and there’s an actual owner, now you have to answer: how will the owner and the borrower be connected? In other words, how will the code constructing the Trade/Company get a handle to those string slices? Finally, keep in mind that something has to be the owner ('statics aside) so you’ll need to define those at one point or another. It can also be that you need shared ownership (eg Rc) where the Rc “mediates” between the different owners. It may also be that ownership is transferred (ie move semantics) over the life of the value.

This requires some thought/design upfront, although you can of course iterate and change things up as the use cases become clearer.
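Applied to the first design from the original post, the “who owns the data?” question might play out as in this sketch. `company_view` is a hypothetical helper, and the commented-out function shows the case where there is no owner left after the function returns.

```rust
// The reference-based Company from the first design.
pub struct Company<'a> {
    pub symbol: &'a str,
    pub name: &'a str,
}

/// Borrows from the arguments: the caller remains the owner of the strings,
/// and the returned Company cannot outlive them.
pub fn company_view<'a>(symbol: &'a str, name: &'a str) -> Company<'a> {
    Company { symbol, name }
}

// This would NOT compile: the Strings are owned by the function body and
// are dropped when it returns, so nothing would be left to borrow from.
//
// fn broken<'a>() -> Company<'a> {
//     let symbol = String::from("ACME");
//     let name = String::from("Acme Corp");
//     Company { symbol: &symbol, name: &name }
// }

fn main() {
    // The owner lives here, in the caller's scope.
    let symbol = String::from("ACME");
    let name = String::from("Acme Corp");

    let company = company_view(&symbol, &name);
    assert_eq!(company.symbol, "ACME");
    assert_eq!(company.name, "Acme Corp");
}
```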


#8

I really like this as a clean way of thinking about it.


#9

When people say “Rust makes you write cleaner code”, I believe they’re really referring to the fact that you have to (a) think about ownership and how data flows and (b) then write code that’s amenable to that - borrowck is sort of like the guard rails that keep you moving in that direction, preventing you from going off into the weeds.

Some (particularly GC’d) languages make it easy to spray a component across many different places, and then everything can reach everything else, and it can get extremely complex to figure out how exactly data flows through the system. GC’d languages at least make this memory safe; C++ (just to pick on someone) can also burden you with worrying about lifetimes in your head. But one way or another, one will likely want to organize things better - it’s nice to have the compiler help out.


#10

@kornel Do you find that? I go out of my way to code the opposite - maybe it’s the wrong design pattern I’m following?

For example, I was parsing a file of data into a String. The alternatives were

struct Foo {
    data : Vec<String>
}

with a BufRead::lines()

or

struct Foo<'a> {
    lines : Vec<&'a str>,
    data : String, // Owned
}

with a File::read_to_string()

The second approach read the data in one block and just passed around references to the underlying data. I try to minimise allocation/heap fragmentation, so I try to be as immutable and functional as possible.

I do agree it’s messier coding especially with Traits.

@mattraffel : The borrow-checker does push you to think about data flow; it forces you to structure code a certain way. As a rule of thumb, data flows one way, from the top of the stack downwards. I’m not sure if I’m doing it the right way, but my general rule when coding Rust (and now C++ too) is to read data once and refer to it everywhere else. If you see a clone() or an allocation, then something’s probably wrong in your code structure.


#11

Ownership and immutability are largely separate concerns (with owned values you still need explicit mut).

I appreciate worrying about needless heap allocations, but avoiding unnecessary references in structs doesn’t mean the only alternative is to have wasteful allocations. It may even be more efficient, since you can inline data in larger structs rather than have a graph of references with layers of indirection (e.g. struct {vec} is better than struct {&mut vec}). References have to point to data that is owned somewhere. So you can’t use references instead of all owned data. And in practice it usually turns out that you use structs to organize data ownership, and then references into those structs to use the data.

Rust doesn’t support self-referential structs, so your alternative wouldn’t even compile (I’m assuming the vec contains references to the data string, not something else). However, you could use the alternative struct with integer indices. It might even be faster, since you could reduce its size by using u32 indices instead of 2×usize &strs. It does make sense when you want to avoid fragmentation and overhead of small allocations.
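A minimal sketch of the index-based alternative, assuming the input is under 4 GiB so that byte offsets fit in `u32`. The field layout and the `line`/`line_count` methods are illustrative, not taken from the post above.

```rust
pub struct Foo {
    data: String,
    // (start, end) byte offsets of each line inside `data`.
    lines: Vec<(u32, u32)>,
}

impl Foo {
    /// One allocation for the text plus one for the offsets; no per-line Strings.
    pub fn new(data: String) -> Foo {
        // Assumption: offsets must fit in u32.
        assert!(data.len() <= u32::MAX as usize, "input too large for u32 offsets");

        let mut lines = Vec::new();
        let mut start = 0usize;
        for line in data.split_terminator('\n') {
            let end = start + line.len();
            lines.push((start as u32, end as u32));
            start = end + 1; // skip the '\n' separator
        }
        Foo { data, lines }
    }

    /// Rebuilds the `&str` on demand; the borrow is tied to `&self`.
    pub fn line(&self, i: usize) -> Option<&str> {
        let &(start, end) = self.lines.get(i)?;
        Some(&self.data[start as usize..end as usize])
    }

    pub fn line_count(&self) -> usize {
        self.lines.len()
    }
}

fn main() {
    let foo = Foo::new(String::from("alpha\nbeta\ngamma\n"));
    assert_eq!(foo.line_count(), 3);
    assert_eq!(foo.line(1), Some("beta"));
    assert_eq!(foo.line(3), None);
}
```

Because the struct stores offsets rather than references, it has no lifetime parameter and can be moved or stored freely.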

References are useful in lots of places, sometimes even in structs, but use in structs is relatively rare:

  • You might want a struct to hold some 'static references, but it’s a special case since it’s limited to compile-time data or you have to leak memory.
  • You might want to create a complex view of some temporary data, e.g. you want to implement a hash join of items from a couple of slices. But in such cases tuples are often used instead.
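For the 'static case in the first bullet, a short sketch of both options: a compile-time literal and a deliberately leaked runtime string. `Exchange` and `leak_code` are hypothetical names.

```rust
/// A struct that only ever holds program-lifetime strings.
pub struct Exchange {
    pub code: &'static str,
}

/// Turns a runtime String into a `&'static str` by leaking its allocation.
/// The memory is never freed, which is the trade-off described above.
pub fn leak_code(code: String) -> &'static str {
    Box::leak(code.into_boxed_str())
}

fn main() {
    // Compile-time data: string literals are already 'static.
    let nyse = Exchange { code: "NYSE" };

    // Runtime data: only possible here by leaking.
    let runtime = Exchange { code: leak_code(format!("X{}", 42)) };

    assert_eq!(nyse.code, "NYSE");
    assert_eq!(runtime.code, "X42");
}
```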

To be clear, I’m not saying not to use references in structs. Just as a rule of thumb, if you’re a beginner and not sure which one to use, it’s more likely that the owned version is the one you want.


#12

lazy_static can generate runtime 'static values.

References in structs are very common. In fact, most people use them every day: any non-consuming iterator will have references.

So it’s really about using the right tool for the job - they can’t be compared in the abstract.
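To illustrate the iterator case, here is a sketch of a hand-written non-consuming iterator. `BySymbol` and `by_symbol` are hypothetical; the point is only that the struct holds references and a lifetime parameter, yet is short-lived and owns nothing.

```rust
pub struct Trade {
    pub symbol: String,
    pub qty: f32,
}

/// Iterator over the trades for one symbol. It borrows and never owns.
pub struct BySymbol<'a> {
    trades: std::slice::Iter<'a, Trade>,
    symbol: &'a str,
}

impl<'a> Iterator for BySymbol<'a> {
    type Item = &'a Trade;

    fn next(&mut self) -> Option<&'a Trade> {
        let symbol = self.symbol;
        // Advance the inner iterator to the next matching trade.
        self.trades.find(|t| t.symbol == symbol)
    }
}

pub fn by_symbol<'a>(trades: &'a [Trade], symbol: &'a str) -> BySymbol<'a> {
    BySymbol { trades: trades.iter(), symbol }
}

fn main() {
    let trades = vec![
        Trade { symbol: "ACME".into(), qty: 1.0 },
        Trade { symbol: "INIT".into(), qty: 2.0 },
        Trade { symbol: "ACME".into(), qty: 3.0 },
    ];

    let total: f32 = by_symbol(&trades, "ACME").map(|t| t.qty).sum();
    assert_eq!(total, 4.0);
    // `trades` is still fully usable: the iterator only borrowed it.
    assert_eq!(trades.len(), 3);
}
```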


#13

You’re right, Iterator structs are a great exception to this.

Lazy static indeed isn’t compile-time, but you’re still left with a constant for the lifetime of the program, so it’s not very different in my view (and if you use lazy static with interior mutability, you could as well use Rc/Arc).
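For reference, a sketch of a runtime-initialised “forever” value. It uses `std::sync::OnceLock` from the standard library (Rust 1.70 or later) rather than the lazy_static crate, so the example has no external dependency. The `known_companies` function and its contents are invented for the example.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Built on first use at runtime, then handed out as a `&'static` reference
/// for the rest of the program.
pub fn known_companies() -> &'static HashMap<&'static str, &'static str> {
    static TABLE: OnceLock<HashMap<&'static str, &'static str>> = OnceLock::new();
    TABLE.get_or_init(|| {
        let mut table = HashMap::new();
        table.insert("ACME", "Acme Corp");
        table.insert("INIT", "Initech");
        table
    })
}

fn main() {
    let first = known_companies();
    let second = known_companies();

    assert_eq!(first.get("ACME"), Some(&"Acme Corp"));
    // Both calls return the same table; it is initialised only once.
    assert!(std::ptr::eq(first, second));
}
```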


#14

Throw closures in there, although the compiler does the desugaring :slight_smile:. This will also include generators and async/await based futures. Parsers would be another example, although not as widely used as iterators. They (and other types like them) all have a couple of key things in common:

  1. typically very short lived (i.e. mostly stack based)
  2. provide some transformation/utility over data, but aren’t owning the data.

Yeah, I just wanted to point out that it’s not just compile time constants or leaked values. Once const eval gets richer, I suspect there will be more capability (although similar to lazy_static in principle - “forever” values).


#15

When I say struct I mean only cases where user types the word struct, so closures, and all other cases of some structured data in memory, don’t count :slight_smile:


#16

“The borrow-checker does push you to think about data flow, it forces you to structure code a certain way.”

Thank you. That is why I thought I would ask about structure design up front.

Matt


#17

What about http://immutable.rs? Will it be a suitable use case to store the data immutably here?


#18

If you’re concerned about unnecessary copies and performance it probably doesn’t make sense to use immutable data structures.

Most immutable data structures work by essentially creating reference-counted linked lists. This sort of thing isn’t great because there’s so much pointer chasing (cache misses, etc) and you have the extra overhead of an atomic integer and a pointer for every item in the collection.

That said, immutable data structures are pretty awesome in other languages. They’re not as necessary in Rust because everything is immutable by default and it’s not possible to have multiple mutable references to the same data at the same time.