Best practices for option object?


#1

Hello everyone, I’m writing a library for generating Office docx files. But there is one thing that keeps annoying me: how do I make an ergonomic option object?

Imagine an API called add_paragraph that allows the user to append a paragraph to the body. It accepts two arguments, the content string and its style:

fn add_paragraph(content: &str, style: StyleOpt)

The StyleOpt struct looks something like this:

#[derive(Default)]
struct StyleOpt {
  font_size: i32,
  font_color: String,
  font_family: String,
  // many many fields
}

So any time a user calls add_paragraph, they have to pass a new StyleOpt. This looks awkward, even though they can omit the other fields using Default::default:

add_paragraph("hello world", StyleOpt {
  font_size: 20,
  font_color: "#fff".to_string(),
  ..Default::default()
});

#2

I would probably pass the style by reference, so that a user can reuse a given style across multiple paragraphs (which I expect to be the common case).
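
To make that concrete, here is a minimal sketch of the by-reference version. The Document type and the clone inside add_paragraph are assumptions made for the example, not something from the original library, and StyleOpt is trimmed to the three fields shown in post #1.

```rust
// Trimmed-down version of the StyleOpt struct from post #1.
#[derive(Default, Debug, Clone, PartialEq)]
struct StyleOpt {
    font_size: i32,
    font_color: String,
    font_family: String,
}

// Hypothetical document type; it only records what was added.
#[derive(Default)]
struct Document {
    paragraphs: Vec<(String, StyleOpt)>,
}

impl Document {
    // Borrowing the style lets the caller keep using it afterwards.
    fn add_paragraph(&mut self, content: &str, style: &StyleOpt) {
        // Cloning is an assumption of this sketch; a real library might
        // serialize the style immediately instead of storing a copy.
        self.paragraphs.push((content.to_string(), style.clone()));
    }
}

// Builds one style and shares it between two paragraphs.
fn reuse_style() -> Document {
    let heading = StyleOpt {
        font_size: 20,
        font_color: "#fff".to_string(),
        ..Default::default()
    };
    let mut doc = Document::default();
    doc.add_paragraph("hello", &heading);
    doc.add_paragraph("world", &heading);
    doc
}
```

The caller builds the style once and only pays for the struct-literal syntax once, however many paragraphs use it.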


#3

Thank you for the suggestion! Passing a reference is a very good idea that I will consider.

But I just came up with what I think is a better way to pass options, which you may want to take a look at:

pub enum Style {
  FontSize(i32),
  FontColor(String),
  FontFamily(String),
}

add_paragraph("hello world", vec![
  Style::FontSize(20),
  Style::FontColor("#fff".to_string()),
]);

Instead of a big option object that contains tons of unused default fields, a vector of only the necessary options looks much cleaner and more efficient. :slight_smile:


#4

Note that each enum object has an integer tag attached to it, which indicates which enum variant one is dealing with. So the enum approach is only more space-efficient when few options are set. When many options are set, the struct-based approach becomes more efficient.

The vector-of-enums is heap-allocated, which means it’s efficient to pass by value but there are some slight costs involved if you create or access one. That being said, there are ways to play around with this trade-off using something like the smallvec crate.

Another difference is that somewhere in the implementation of add_paragraph, the vector of enums will need to be decoded back into something like your initial style struct in order to be able to quickly figure out what the actual style is (otherwise, you need to do an O(N) lookup through the vector each time you want to know the style). This is not the case when a style struct (or reference to such a struct) is directly passed to the method.
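
As a rough illustration of that decoding step, here is a sketch that folds the vector back into a struct in a single pass. The resolve function is a name made up for this example, and letting later entries override earlier ones is an assumption about how duplicates should behave.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Style {
    FontSize(i32),
    FontColor(String),
    FontFamily(String),
}

#[derive(Default, Debug, PartialEq)]
struct StyleOpt {
    font_size: i32,
    font_color: String,
    font_family: String,
}

// Hypothetical helper: one pass over the options, after which every
// style lookup is a plain field access instead of a search.
fn resolve(options: Vec<Style>) -> StyleOpt {
    let mut resolved = StyleOpt::default();
    for option in options {
        // Assumption: if an option appears twice, the last one wins.
        match option {
            Style::FontSize(size) => resolved.font_size = size,
            Style::FontColor(color) => resolved.font_color = color,
            Style::FontFamily(family) => resolved.font_family = family,
        }
    }
    resolved
}
```

The loop costs one match per option that was actually passed, but the result is the same struct the first design started with.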

TL;DR: As usual with cleanness and efficiency, it’s a trade-off :slight_smile:

If your final target is a docx file, you may want to have a look at how these files handle Word styles, as it may have an impact on your final API. For example, you may end up stating that each paragraph maps to an underlying document-wide style with some optional paragraph-specific modifications.


#5

Thank you for your detailed response.

I’m sorry, I left out a detail: the style will be transformed into a vector of Attribute structs when it is used. So I don’t think there’s a big difference when implementing the add_paragraph function, because you always need to walk through the style struct or the vector to get the attributes.

What’s more, suppose we have n different kinds of style option and we only want m of them. With a struct, we have to check all n fields to determine the style. With the vector, we can do the same thing by looping and matching m times. This is why I thought the vector solution was more efficient at first. But I’m not sure about the exact time complexity of matching on an enum.

As for your last point: yes, Word does have a default style that applies to all paragraphs. You can even define a group of styles and use it everywhere, but the style name is, not unexpectedly, one of the possible style options.


#6

If you do go with the enum approach (which sounds reasonable to me, given what you’ve said), a couple of fly-by suggestions:

  1. Consider using Cow<'static, str> instead of String - it seems highly probable there will be use cases where the value is a compile time constant.
  2. Consider taking a generic I: IntoIterator<Item = Style> (or Item = &'a Style if you go with references) rather than a Vec<Style>.
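
Both suggestions can be sketched together as follows. Returning the number of options from add_paragraph is only a stand-in for the real work, and the example function is a hypothetical caller.

```rust
use std::borrow::Cow;

#[derive(Debug, Clone, PartialEq)]
enum Style {
    FontSize(i32),
    FontColor(Cow<'static, str>),
    FontFamily(Cow<'static, str>),
}

// Accepts anything that can be iterated to yield Style values:
// a Vec, a fixed-size array, an iterator chain, and so on.
// Returning the option count is a placeholder for real processing.
fn add_paragraph<I>(content: &str, styles: I) -> usize
where
    I: IntoIterator<Item = Style>,
{
    let _ = content; // unused in this sketch
    styles.into_iter().count()
}

// Hypothetical caller showing both kinds of Cow and two container types.
fn example() -> (usize, usize) {
    // A string literal is stored as a borrow, so no String is allocated.
    let from_array = add_paragraph(
        "hello world",
        [Style::FontSize(20), Style::FontColor(Cow::Borrowed("#fff"))],
    );
    // A string built at runtime is stored as an owned value.
    let family = String::from("Arial");
    let from_vec = add_paragraph(
        "hello again",
        vec![Style::FontFamily(Cow::Owned(family))],
    );
    (from_array, from_vec)
}
```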

#7

Thank you for sharing your ideas! I totally agree with your suggestions. Accepting a generic IntoIterator means callers can pass other collection types such as LinkedList, but I bet people will still use Vec, because it has the handy vec! macro.


#8

Mostly this would allow passing in types that don’t have contiguous storage and thus cannot be coerced to a slice. Otherwise you could also take &[Style], which could avoid allocations on the caller side (i.e. use a stack-allocated array). But the iterator adds an extra level of flexibility.
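
For comparison, here is a small sketch of the slice-taking variant. The font_size_of function is invented for the example; the point is only that the same signature accepts a borrowed array and a borrowed Vec.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Style {
    FontSize(i32),
    FontColor(String),
}

// Hypothetical helper: borrows a slice and reports the last font size
// that was set, if any.
fn font_size_of(styles: &[Style]) -> Option<i32> {
    styles.iter().rev().find_map(|style| match style {
        Style::FontSize(size) => Some(*size),
        _ => None,
    })
}

// Hypothetical caller: one array and one Vec go through the same function.
fn example() -> (Option<i32>, Option<i32>) {
    // The array itself needs no heap allocation (the String inside
    // FontColor still allocates its own buffer).
    let on_stack = [Style::FontSize(20), Style::FontColor("#fff".to_string())];
    // A Vec works too, because &Vec<Style> coerces to &[Style].
    let on_heap = vec![Style::FontColor("#000".to_string())];
    (font_size_of(&on_stack), font_size_of(&on_heap))
}
```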


#9

Oh, you’re right, a vector is heap-allocated. People may sometimes want to use a more efficient type.


#10

OK, a friend just recommended that I use some kind of builder syntax, which looks like this:

struct Paragraph {
  text: String,
  attrs: Vec<Attribute>
}

impl Paragraph {
  fn new(text: String) -> Paragraph {
    Paragraph { text, attrs: Vec::new() }
  }

  fn with_font_color(&mut self, font_color: String) -> &mut Self {
    self.attrs.push(Attribute::new("font_color", font_color));
    self
  }

  // ...
}

add_paragraph(Paragraph::new("hello world".to_string())
  .with_font_color("#fff".to_string())
);

#11

Yeah, a builder is sort of the canonical answer here; it’s a tad boilerplate-y, but it would work as well.
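
For reference, a self-contained sketch of the pattern might look like the following. It is a variation rather than the code from post #10: the methods take self by value so a finished chain can be stored in a variable, the Attribute struct is a hypothetical stand-in, and with_font_size is an extra method added to show chaining.

```rust
// Hypothetical stand-in for the library's Attribute type.
#[derive(Debug, Clone, PartialEq)]
struct Attribute {
    name: &'static str,
    value: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Paragraph {
    text: String,
    attrs: Vec<Attribute>,
}

impl Paragraph {
    // impl Into<String> lets callers pass either &str or String.
    fn new(text: impl Into<String>) -> Self {
        Paragraph { text: text.into(), attrs: Vec::new() }
    }

    // Each setter consumes the paragraph and hands it back, so calls chain.
    fn with_font_color(mut self, color: impl Into<String>) -> Self {
        self.attrs.push(Attribute { name: "font_color", value: color.into() });
        self
    }

    fn with_font_size(mut self, size: i32) -> Self {
        self.attrs.push(Attribute { name: "font_size", value: size.to_string() });
        self
    }
}

// Hypothetical caller: only the options that matter are spelled out.
fn example() -> Paragraph {
    Paragraph::new("hello world")
        .with_font_color("#fff")
        .with_font_size(20)
}
```

The boilerplate is one small method per option, which is the cost being referred to.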


#12

Yeah, normally when your struct has many optional properties, a builder is a better choice.