Feedback on macro syntax

I've been working on a crate that defines macros for initialising collections. The "big" feature is that you can use the .. operator to include multiple values at the same time, from an IntoIterator. The usage is supposed to mirror to how you can already initialise structs in Rust, e.g.

let my_struct = MyStruct {
    foo: 1,
    bar: 2,
    ..Default::default()
}

There's no problem for things like sets and vectors because values are always added, and never replace existing ones:

let set = hash_set!(1, 3, ..others, 6, 7, 8);

But for maps, I realised that the semantics of .. are subtly different from the struct initialisation semantics. In particular, the order they are added.

In velcro, items are added to maps in order, from left to right, regardless of whether the key already exists, so this:

let map = hash_map! {
     'a': 1,
     ..('a'..='c'): 0,
};

is equivalent to:

let map = hash_map! {
     'a': 0,
     'b': 0,
     'c': 0,
};

While this:

let map = hash_map! {
     ..('a'..='c'): 0,
     'a': 1,
};

is equivalent to:

let map = hash_map! {
     'a': 1,
     'b': 0,
     'c': 0,
};

The values added later overwrite the values added before.

Now contrast this with a similar example initialising a Rust struct:

#[derive(Default, Debug)]
struct Foo { a: i32, b: i32, c: i32 }

let foo = Foo {
    a: 1,
    ..Foo::default()
};

println!("{:?}", foo); // Foo { a: 1, b: 0, c: 0 }

For structs, the spread values must be the last thing and they never override fields that have been explicitly set.

So I'm a little bit torn on the issue. I think my options are:

  1. Use different syntax, to avoid confusion
  2. Insert map values from right to left instead.
  3. Keep it as it is ¯\(ツ)

Would you find it surprising if the order was reversed? For example, would you expect the following?

let map = hash_map! {
     'a': 0,
     'a': 1,
     ..('a'..='c'): 2,
};
assert_eq!(map[&'a'], 0);

The .. has a special meaning and is only allowed once for struct initialization.

Yours looks more like a compact way of adding a range of keys, so I would expect it to work in top to bottom order.
I would also not use ...

let map = hash_map! {
     'a': 0,
     ('a'..='c'): 2,
     ('c'..='e'): 3,
};

would result in a -> 2, b -> 2, c -> 3, d -> 3, e -> 3

One way to "fix" it would be to only allow each key once…

For sets, this may or may not be true – it's possible to come up with an example (intentionally, but it could as well occur accidentally) where two values comparing equal and hashing to the same bucket can be distinguished. The trivial example is when comparison and hashing is based on a strict subset of the fields of a struct, or only on the discriminant of an enum.

I didn't look at the way your macros for sets are actually implemented, but if you assume that HashSet::insert(foo) will absoutely not change anything if foo is already in the set, then you might need to re-evaluate this assumption.

This is only natural – structs prohibit duplicate "keys" (field names).

How would this help? Rust's struct initializer syntax can only be unambiguous because fields are never allowed to be duplicated, and the "rest" .. operator is defined not to override anything, and it can't contain arbitrary field names that weren't part of the declaration of the struct type in the first place.

If it were allowed to put the .. at the first (or any) position, it would still be possible to define its semantics so that it always has the lower priority.

However, if you simply insert from right to left into a HashMap or HashSet, you still have the problem that you can put such a range expression at the end of the list (last position if you go LTR, first position if you evaluate RTL), and we're back to square 1, because it then overrides the previous keys/values.

You could ensure that there's only one range expression, however that renders it way less useful. And with multiple range expressions, it's generally not possible to verify at compile-time that their ranges do not overlap.


All in all, I'd think that most programmers would expect the current semantics, i.e. overwriting earlier elements with later ones. Your .. syntax is not really comparable to Rust's rest operator – it means a dynamic range, not a closed set of compile-time constants, so the existence of some semantic difference should be fine.

I used ranges for simplicity of the examples, but we support any impl of IntoIterator, e.g.:

let keys = vec!['a', 'b', 'c', 'd'];
hash_map! {
    ..keys: 1,
}

There needs to be a way to distinguish a single value from multiple values, and RangeInclusive is a perfectly valid key type for a map.

Yes, that was inaccurate, due to a rushed edit! It's true for vectors, but obviously not sets.

My thinking was that if you mechanically converted a valid struct initialization into a map initialization then it will behave the same. That would give consistency at least in the places where the two things (admittedly superficially) overlap.

That could only work if I restricted keys to being literals, or added non-negligible runtime overhead, which I'm not prepared to do.

If you want to make the “rest operator” in your macro act like the real one, then the easiest option would be to always give key: value pairs priority. That means breaking your macro into two loops, so the first one handles the “rest operator”, while the second one handles the pairs.

This makes sense for realistic use cases, since the rest operator is less specific, so it should have lower priority regardless of how code presentation dictates its placement.

3 Likes

That actually makes sense.

  • For sequential collections, where order matters, keep things the same.
  • For set- and map-like collections, where order doesn't matter and where equal keys/values can replace each other, I can require that single values must be specified first and iterators be provided at the end. Still add items in the order they are given from left to right, but process the iterators first.

I'll have more of a think about it, but right now it seems like the best compromise. Thanks!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.