Why does Cargo use TOML?

I watched a couple of @steveklabnik talks on Rust recently, and he consistently says he doesn't have time to go over why Cargo uses TOML :smile:

I searched briefly and wasn't able to dig up any details on why TOML was chosen. I'm not really being critical either, but I'm quite curious.

Can anyone point me to some links/discussions/wtf on this decision? Would love to learn more!

Thanks!

3 Likes

Because it's a simple format that works well for Cargo's use case.

We don't need a hierarchical format like JSON or XML, and the INI format is not well specified, so mojombo made TOML.

3 Likes

That's what I figured, just wanted to see if there was any more to it :slight_smile: works for me, thanks :thumbsup:

Yes, as @sinistersnare says, it's basically the least terrible option.

2 Likes

Protip: TOML is actually a superset of JSON.

Not at all! Inline table syntax can't extend over multiple lines and it uses = instead of :. Keys aren't quoted either. And an inline table is only valid in a value position. i.e., { key = "value" } on its own isn't a valid TOML document.

Also, arrays are mostly homogeneous. ["a", 1] is illegal.

1 Like

No, that's YAML

/@_@\

I could've sworn it was the other way around. I had even looked it up in the past. Hooray for memory!

Bright side: look at all the extra markup language insights in this thread now :smile:

1 Like

I always wondered why Cargo uses TOML instead of YAML, which is somewhat similar, more popular (and more widely adopted), and has a settled spec, whereas the TOML readme starts with a disclaimer about the changing spec.

This, plus after what @BurntSushi said, it makes more sense now. :grinning:

I think YAML is a terrible format, with a horrendous spec, and lots of weirdness.

TOML works quite well, actually, and it's super easy to implement.

1 Like

The YAML spec makes XML look like JSON[1].

Plus, the spec doesn't address the security issues of loading binary parts into memory via YAML.

[1] Let's put that feeling in numbers. Not counting special validation rules, the YAML grammar has approx. 211 rules, XML has 81, and JSON has about 15.

2 Likes

The TOML readme has a small section comparing itself to JSON, YAML and INI:

In some ways TOML is very similar to JSON: simple, well-specified, and maps easily to ubiquitous data types. JSON is great for serializing data that will mostly be read and written by computer programs. Where TOML differs from JSON is its emphasis on being easy for humans to read and write. Comments are a good example: they serve no purpose when data is being sent from one program to another, but are very helpful in a configuration file that may be edited by hand.

The YAML format is oriented towards configuration files just like TOML. For many purposes, however, YAML is an overly complex solution. TOML aims for simplicity, a goal which is not apparent in the YAML specification: http://www.yaml.org/spec/1.2/spec.html

The INI format is also frequently used for configuration files. The format is not standardized, however, and usually does not handle more than one or two levels of nesting.

From: GitHub - toml-lang/toml: Tom's Obvious, Minimal Language

While I cannot speak for the Rust authors, having used the different formats in my own projects (as well as JSON variants such as HCL), I believe they made the right choice.

1 Like

While not criticising the choice of TOML, I think the issues with YAML are exaggerated. Only a very small and obvious subset of YAML is needed for use cases like Cargo's. I think YAML would have worked out equally conveniently, but this is bikeshedding.

1 Like

I don't think they are. The security issues especially aren't emphasized enough. The biggest issue is that YAML attempts to be many things to many people. It's JSON, it's a serialization format, it's human readable, etc. I've been reading the mailing list for YAML for quite some time now.

To me YAML has become the F-35 of the ML formats. That is to say, it's overly complex, does many things, and isn't really good at any particular thing. I mean, can a YAML parser beat a JSON parser at parsing JSON? (I don't think it is possible, simply because YAML has more states.)

Does anyone really use a YAML parser to parse JSON in the wild? I mean sure, it's technically possible. It's also possible to sear meat in a toaster, but you generally avoid it.

It's really a shame, since I think the basic YAML idea is really good. Using .travis.yml is pretty neat. The indented syntax is great.

But using YAML for Cargo would have been horrible, especially if the parser ever supported converting YAML into native data. I don't think there is any parser that supports YAML 1.2. Hell, I'm not sure there is even a fully compliant YAML parser for Rust.

In this regard, TOML is perfect. It's simple, it seems to be mostly text, and it doesn't attempt to be everything to all people. It's also dead easy to implement.

1 Like

I meant the issues with YAML for this use case are exaggerated. Surely, Cargo's subset of YAML would have been simple and safe.

But if you're using a subset of YAML, you're not using YAML. You're using a custom, specific format, that's poorly specified.

At that point, you cause all of the problems that you see with, say, INI-style formats. Lots of stuff has config files in some variant of the INI format, but everyone has slightly different formats and semantics. Some accept spaces in section names, some treat section names separated by . hierarchically, some allow spaces in keys without quoting, some don't, some allow keys without values, others don't, etc, etc. So you can't just take an off-the-shelf library and parse an INI file; all of the INI parsers have to have various knobs that can be turned and overridden to parse or write out INI files that comply with everyone's different interpretation.
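Python's standard-library `configparser` is a handy illustration of those knobs. In this sketch the file contents and the `try_parse` helper are invented; `allow_no_value` is a real `ConfigParser` option that decides whether a key without a value is legal at all.

```python
import configparser

# One INI file that some dialects accept and others reject:
# the last line is a key with no value.
INI_TEXT = """\
[server]
host = example.org
verbose
"""


def try_parse(**knobs):
    """Parse INI_TEXT with the given parser options; return None on rejection."""
    parser = configparser.ConfigParser(**knobs)
    try:
        parser.read_string(INI_TEXT)
    except configparser.ParsingError:
        return None
    return dict(parser["server"])


# Default dialect: a bare key is a parse error.
strict_result = try_parse()

# Same library, one knob turned: the bare key is accepted with value None.
lenient_result = try_parse(allow_no_value=True)

print(strict_result)
print(lenient_result)
```

Same bytes, same library, two different answers depending on which interpretation of "INI" the caller picks.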

As soon as you start subsetting YAML, you run into the same problem. If you use a tool that assumes one YAML feature is present when writing, but your subset doesn't accept it, then you can't use that tool to produce YAML that Cargo can read.

TOML covers many of the same use cases, but is much simpler. JSON is also simpler, but not a good format for hand-written config files. TOML seems like a decent happy medium: it's well specified, not too complex, and reasonably familiar to those who know INI files; it has a data model that maps well to the kinds of things people generally want to express and parse; and it doesn't have arbitrary restrictions on nesting depth like INI files have.

6 Likes

I don't think that there are cases where a YAML configuration file caused compatibility or security issues. But I've never investigated this ... can you point me to an example?

YAML Sucks. Gems Sucks. Syck Sucks. | Blag (I remember this one from a few years ago; this is seen in practice)

JSON::XS - JSON serialising/deserialising, done correctly and fast - metacpan.org (another old good one)

GitHub - cblp/yaml-sucks: YAML sucks. (found this just now while trying to re-find the first one, looks good though)

2 Likes

These links contain general rants about YAML. But I was asking for a case where a YAML configuration file actually caused trouble.

Granted, I know that YAML is convoluted when you look at its full spec, and I realize that it is not simple to write a parser with 100% support. I see your point. But I think this is theoretical when it comes to Cargo's use case. I've used YAML configuration files extensively, and it really is simple and safe. And easy to document precisely. And people know it, and it has a plugin for your $EDITOR. And most importantly -- you don't need yet another markup language (pun intended).