Invitation to bikeshed: command-line interface evolution

Hi all — I'm hoping to gather some feedback about how to evolve the CLI of my program.

I lead the development of tectonic, a modernized, self-contained TeX compiler. Right now, we distribute a tectonic binary that has an interface kind of like rustc: you give it a source file as an input, pass it a bunch of options, and get an output file. But I want to evolve the CLI to be more like cargo: have the build options declared in a Tectonic.toml metafile, so that the user just has to run tectonic build or some such thing.

This raises the question: how should I evolve the CLI? I've focused on two main options:

  1. Take the Rust approach: add a new CLI program with a different name, and have it invoke tectonic under the hood. Pros: least breakage. Cons: permanent maintenance overhead from having the cargo/rustc aspects in different executables; training overhead to teach people to use the new program.
  2. Keep the two aspects in a single executable and migrate the CLI. One potential plan:
    1. Add an optional flag to indicate the current "rustc"-type CLI: tectonic -Y [rustc-style]
    2. Add a required flag to indicate the new "cargo"-type CLI: tectonic -X [cargo-style]
    3. Start telling people to migrate to the -X style CLI
    4. At some point, swap the default CLI interpretation: -X is optional, -Y is required
    5. Eventually get rid of the -Y CLI altogether, probably. (With a good enough "cargo"-style CLI, I don't think it adds any value.)
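
As a rough sketch of how the -X/-Y plan could look in code (not how tectonic actually does it): the selector below peeks at the first argument and falls back to a configurable default. The names CliStyle and select_cli are made up for illustration. Step 4 of the plan then amounts to changing the default that is passed in.

```rust
/// Which command-line grammar to apply to the remaining arguments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CliStyle {
    /// The current "rustc"-type interface (selected by -Y).
    Legacy,
    /// The new "cargo"-type interface (selected by -X).
    Project,
}

/// Pick the CLI style from the arguments that follow the program name.
/// `default` is the style used when neither -X nor -Y is given.
/// Returns the chosen style and the arguments left for that parser.
fn select_cli(args: &[String], default: CliStyle) -> (CliStyle, &[String]) {
    match args.first().map(String::as_str) {
        Some("-X") => (CliStyle::Project, &args[1..]),
        Some("-Y") => (CliStyle::Legacy, &args[1..]),
        _ => (default, args),
    }
}
```

During steps 1 to 3 the caller would pass CliStyle::Legacy as the default; at step 4 it would pass CliStyle::Project instead, and nothing else in the dispatch has to change.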

You could imagine a few variations on these basic ideas. I think the basic tradeoff is that keeping things in a single executable seems like a big win for future maintenance, but requires some kind of migration process.

But maybe I'm not thinking creatively enough. Anyone have any ideas about how to get the best of both worlds? Or maybe some technical reasons that the "rustc" and "cargo" functionalities should be split into separate executables despite the added complexity?

A lot of Unix utilities do this by dispatching on argv[0] to pick the interface. Then the new name gets symlinked to the old one (or vice versa) and it looks like option 1 to users, without maintaining two binaries.
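
A minimal sketch of the argv[0] trick, assuming the new entry point is called tectonic-ng (a made-up name, nothing has been decided) and that any other name means the existing interface:

```rust
use std::path::Path;

/// Which interface the binary should present.
#[derive(Debug, PartialEq, Eq)]
enum Interface {
    /// Existing behaviour: everything comes from the command line.
    Legacy,
    /// New behaviour: look for a Tectonic.toml and build from it.
    Project,
}

/// Decide the interface from the name the program was invoked as.
/// "tectonic-ng" is a hypothetical name for the new entry point.
fn interface_for(argv0: &str) -> Interface {
    // Strip any directory part and a trailing extension such as ".exe".
    let stem = Path::new(argv0)
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("");
    if stem == "tectonic-ng" {
        Interface::Project
    } else {
        Interface::Legacy
    }
}

/// Same decision, reading argv[0] of the running process.
fn interface_from_env() -> Interface {
    let argv0 = std::env::args().next().unwrap_or_default();
    interface_for(&argv0)
}
```

With something like ln -s tectonic tectonic-ng in place, both names run the same binary and main only has to branch on interface_from_env().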

Depending on how distinct the two usage styles are, you may be able to have a single executable automatically detect which one it was invoked with. Basically, your option 2 without the -X and -Y flags.
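
One way the auto-detection might work, as a sketch only: treat the invocation as "cargo"-style when the first argument is a known subcommand. The subcommand list here is an assumption (build is the example from the original post; anything else would need adding), and the function name is invented.

```rust
/// Subcommands that would signal the new "cargo"-style interface.
/// This list is an assumption for illustration.
const SUBCOMMANDS: &[&str] = &["build"];

/// Guess whether the arguments (without the program name) use the
/// new subcommand-based interface rather than the old one.
fn looks_like_project_cli(args: &[&str]) -> bool {
    matches!(args.first(), Some(first) if SUBCOMMANDS.contains(first))
}
```

The obvious weak spot is an input file that is literally named build; the old interface would need some escape hatch for that case (for example, writing it as ./build).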

Yeah, that's a good point. There's still some overhead to the split, though, because if I implement some new option foo-mode in the Tectonic.toml, I also need code to parse it, translate it into some --foo-mode argument for the rustc-type program, and then deserialize it again inside that program. The existence of that extra interface layer between the two programs is the main thing that concerns me about the split executables.
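
To make that overhead concrete, here is roughly what the extra layer looks like for a single option. BuildConfig and both function names are invented for the sketch, and the --foo-mode=VALUE spelling is just one possible encoding.

```rust
/// Settings as read from Tectonic.toml (only one field shown).
#[derive(Debug, Clone, PartialEq)]
struct BuildConfig {
    foo_mode: Option<String>,
}

/// Front-end side: turn the config into arguments for the engine program.
fn to_engine_args(cfg: &BuildConfig) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(mode) = &cfg.foo_mode {
        args.push(format!("--foo-mode={}", mode));
    }
    args
}

/// Engine side: recover the same settings from its command line.
fn from_engine_args(args: &[String]) -> BuildConfig {
    let foo_mode = args
        .iter()
        .find_map(|a| a.strip_prefix("--foo-mode="))
        .map(str::to_owned);
    BuildConfig { foo_mode }
}
```

Every new option needs a matching line in both functions, and the two have to stay in sync across releases of the two executables.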

Edit: Or you were probably getting at that you could just use argv[0] to choose the CLI parsing method and not have an actual sub-executable invocation. This seems promising! I still wouldn't love to have to name and promote a new CLI entrypoint but that's not the worst thing in the world.

Hmmm, that would probably be possible for most of our use cases. I've been thinking that it would be better to be more explicit about the transition, but I guess a more automatic mode would make things easier for folks?

I hadn’t meant for there to be two programs at all. When invoked with one command name, it reads all its options from the command line (compatibility mode). When invoked with the other, it attempts to find and read a config file; most new features will be available only in this mode.

Yeah, I think I replied too hastily — I've now edited above.

So far I'm not seeing any arguments that there are good technical reasons to have two different executables. Does anyone have any? I've been assuming that in Rust, rustc and cargo are mainly separate because the compiler is written in its own language ... but maybe there are deeper reasons that would also apply here. (FWIW, because the engine contains an entire typesetting system, the "compiler" executable is pretty large — about 20 MiB.)

There's a pretty strong separation of concerns between cargo and rustc which alters the equation a bit. Cargo is Rust's replacement for make and package managers: it fetches dependencies from the internet and figures out the order of operations necessary to produce the requested build.

The compiler proper is then invoked several times by Cargo to actually do each of those operations. It expects / needs all of the relevant dependencies to be taken care of before it's called, and therefore doesn't need to contain any complicated resource-fetching logic.

Does your use-case really motivate having two distinctly separated ways of calling the program? I often see a kind of middle-ground approach, where all the CLI arguments still exist, but they take their default (or pseudo-default, for the given directory) values from a configuration file.
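
A sketch of that middle-ground layering, assuming both the CLI parser and the config-file parser produce the same "partial" struct in which unset options are None. The option names (outfmt, keep_logs) and the built-in defaults are placeholders, not a statement about tectonic's real options.

```rust
/// Options as given by one source; None means "not specified there".
#[derive(Debug, Default, Clone, PartialEq)]
struct PartialOptions {
    outfmt: Option<String>,
    keep_logs: Option<bool>,
}

/// Fully resolved options that the engine would actually use.
#[derive(Debug, PartialEq)]
struct Options {
    outfmt: String,
    keep_logs: bool,
}

/// Precedence: command line, then config file, then built-in default.
fn resolve(cli: PartialOptions, file: PartialOptions) -> Options {
    Options {
        outfmt: cli
            .outfmt
            .or(file.outfmt)
            .unwrap_or_else(|| "pdf".to_string()),
        keep_logs: cli.keep_logs.or(file.keep_logs).unwrap_or(false),
    }
}
```

A third layer, such as environment variables, would slot in as one more .or(...) in each chain.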

A variant of that theme is when the options take their defaults from environment variables, which are read from a .env file if one exists, but in your case I guess a tectonic.toml would be preferable, and allow more structured configuration.

By the way, it's probably possible to have the parsing of CLI arguments and of the configuration file share a lot of code. An extreme example would be if you're able to use structopt for the CLI and use the same format for the configuration.

It's actually kind of interesting: tectonic isn't immune from this separation. In fact, it currently behaves as though rustc did the fetching and cargo didn't exist, which makes it difficult to parallelize downloads.
I think the same problem that motivates the separation of concerns exists here, but it's a bit more difficult to bolt cargo-style package management onto a language like TeX retroactively.

Well, part of the reason I think I'm leaning towards the -X/-Y migration route is that I think the cargo-style CLI would be a strictly superior UX. In the future, I don't think it would be a big loss to not have the rustc-style CLI. There will be cases where people might want to have the UX of "just compile this one input file", but I hope that we can make that pretty rare for our community.

(What I'd actually plan to do is add a utility command named something like tectonic compile that would do the one-shot compilation ... but with no pressure to add lots of bells and whistles on the command-line interface there. If you want to activate fancy engine options, it's time to start using the Tectonic.toml + tectonic build approach.)

Indeed. I have ideas :smiley: ... but there are only so many hours in the day.