Signature based API comparison

Continuing the discussion from Cargo Crusader 0.1 - Test the downstream impact of Rust crate changes before publishing:

I would like to extend the question. Do you think it would be possible to have cargo tell you, e.g., whether you should change the major version of your crate instead, by comparing e.g. public function signatures?


There was this trick that VB5/VB6 could do in the old days. If you kept a compiled version of a library around, you could point the compiler at it and say "make sure I'm binary-compatible with that". I believe that was to keep things like vtable layouts and the like consistent, not to reject incompatible changes outright.

Theoretically, a tool could do something like this by dumping the public interface of a crate in some stable format, and having a way to diff two of them. Cargo could build on that to warn when there's an (apparently) incompatible change without a corresponding change to the version number.
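To make the "dump and diff" idea concrete, here is a minimal sketch. It assumes a made-up dump format, a map from item path to signature text, since no such format exists yet. All names (`ApiDump`, `Change`, `diff_api`, `looks_breaking`, `dump`) are hypothetical, and the "removed or altered means breaking" rule is deliberately naive.

```rust
use std::collections::BTreeMap;

/// Hypothetical "stable format": public item path -> signature text.
pub type ApiDump = BTreeMap<String, String>;

#[derive(Debug, PartialEq, Eq)]
pub enum Change {
    Added(String),
    Removed(String),
    Altered(String),
}

/// Build a dump from (path, signature) pairs; only for illustration.
pub fn dump(items: &[(&str, &str)]) -> ApiDump {
    items
        .iter()
        .map(|(path, sig)| (path.to_string(), sig.to_string()))
        .collect()
}

/// Compare two dumps entry by entry.
pub fn diff_api(old: &ApiDump, new: &ApiDump) -> Vec<Change> {
    let mut changes = Vec::new();
    // Items that existed before: either gone, altered, or unchanged.
    for (path, old_sig) in old {
        match new.get(path) {
            None => changes.push(Change::Removed(path.clone())),
            Some(new_sig) if new_sig != old_sig => {
                changes.push(Change::Altered(path.clone()))
            }
            Some(_) => {}
        }
    }
    // Items that only exist in the new dump.
    for path in new.keys() {
        if !old.contains_key(path) {
            changes.push(Change::Added(path.clone()));
        }
    }
    changes
}

/// Naive assumption: anything removed or altered is (apparently) incompatible.
pub fn looks_breaking(changes: &[Change]) -> bool {
    changes.iter().any(|c| !matches!(c, Change::Added(_)))
}
```

A real tool would compare resolved types rather than signature text, but even this much would be enough for a warning of the "apparently incompatible" kind.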

That said, I think some other things need to happen first. For one, I'd really like to be able to use stability markers. I'd definitely want to be able to deprecate things. Otherwise, it would be impossible to make anything public without an implicit "you can't ever change this now" contract (currently, the best you can do is to also flag something in the docs, but this tool might create a stronger expectation of general stability).

But in theory a crate only needs to stay API compatible, not ABI compatible, right?

The comment about VB was just provided as a "there's some precedent for doing this" thing. Rust doesn't have a stable ABI, so it's definitionally impossible to maintain ABI compatibility.

Right, I read that already. But that is what I meant. API compatibility is all that is needed. The API is basically defined by public signatures, and it should be possible to automatically reason about those, right?

@hoodie, Thanks for bringing this up. I want such a tool badly and consider it something of a holy grail.

I think it should be relatively straightforward to create semantic diffs between two revisions of a Rust crate, and compare them against the API evolution rules to determine if a change is semver-valid.

We can then run this analysis over all of to keep everybody honest about API compatibility. I really think there's a great deal of opportunity to create a library ecosystem that is uncommonly reliable.

That is what I mean. Cargo could perhaps even tell you when you are trying to publish a crate under a certain version number while violating semver. Not that it should actually forbid publishing; there can always be a $ cargo publish -f. But it would be nice to know whether what you have done causes a major API change or a minor one.
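As a rough sketch of the version side of such a check: given the previously published version, the version about to be published, and a flag saying whether an API diff looked breaking, decide whether to warn. Everything here (`Version`, `parse_version`, `breaking_bump_ok`, `publish_warning`) is hypothetical and not part of cargo. It assumes plain `major.minor.patch` versions without pre-release or build metadata, and follows the Cargo convention that the leftmost non-zero component is the one that signals a breaking change.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Parse "major.minor.patch"; anything else yields None.
pub fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let version = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Is the step from `old` to `new` large enough to carry a breaking change?
pub fn breaking_bump_ok(old: Version, new: Version) -> bool {
    if new.major != old.major {
        return new.major > old.major;
    }
    if old.major > 0 {
        return false; // same major >= 1: only compatible changes allowed
    }
    // 0.x.y: the minor component plays the role of the major one.
    if new.minor != old.minor {
        return new.minor > old.minor;
    }
    if old.minor > 0 {
        return false;
    }
    // 0.0.z: every release may break.
    new.patch > old.patch
}

/// Return a warning instead of refusing, so a forced publish stays possible.
pub fn publish_warning(old: Version, new: Version, breaking: bool) -> Option<String> {
    if breaking && !breaking_bump_ok(old, new) {
        Some(format!(
            "API looks incompatible with {}.{}.{} but the new version is {}.{}.{}",
            old.major, old.minor, old.patch, new.major, new.minor, new.patch
        ))
    } else {
        None
    }
}
```

Returning an optional warning rather than an error matches the idea that publishing should never be blocked outright.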

@hoodie are you interested in working on this tool?

Sure, that sounds fun. I haven't worked with the compiler plugin API yet, but perhaps we could bounce a few ideas around.

I am currently investigating how having the public API of a package available in a comparable format could enable tools and lint passes that help maintain a higher quality of Rust software (generating drop-in replacements, aided translation between versions, linguistic consistency checking, ...). Maybe this turns into a diploma thesis. If not, @hoodie and I nevertheless want to build API comparison :smile:


@payload That sounds very applicable! What sort of comparable format do you have in mind?

Please keep me updated on your progress.

Is there a way to make a signature more general while keeping it API compatible? I was thinking of something along the lines of changing fn foo(path: &Path) to fn foo<P: AsRef<Path>>(path: P), but that particular case does change the API in a small way.

But I guess you can convert a concrete type into a trait, so long as that type already implemented the trait.

Not if a caller was relying on type information provided by the argument type.

For example, you can't do Into::into(From::from(v)), because the compiler cannot possibly know what intermediate type it's supposed to use. So if you'd been passing an argument as From::from(v), and the type changes from some concrete T to U: Into<T>, calling code can stop working.
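A small illustration of that failure mode, using `String` and `Into<String>` as stand-ins for T and U: Into<T>. The function names are made up for the example. The concrete version lets a caller write `From::from(..)` because the parameter type pins down the target type; the generalized version still accepts ordinary callers, but the same `From::from(..)` call is rejected because the intermediate type can no longer be inferred.

```rust
/// "Before": the parameter has a concrete type.
pub fn name_len_concrete(name: String) -> usize {
    name.len()
}

/// "After": the same function generalized over anything convertible to String.
pub fn name_len_generic<S: Into<String>>(name: S) -> usize {
    name.into().len()
}

pub fn callers() -> (usize, usize, usize) {
    // Compiles: the parameter type `String` tells the compiler which
    // `From` impl to pick for the argument.
    let a = name_len_concrete(From::from("crate"));

    // Ordinary callers keep working with the generalized signature.
    let b = name_len_generic(String::from("crate"));
    let c = name_len_generic("crate");

    // This line does not compile against the generalized signature,
    // because nothing determines the type produced by `From::from`:
    // let d = name_len_generic(From::from("crate"));

    (a, b, c)
}
```

So a signature diff tool would probably have to treat "concrete type replaced by a trait bound" as at least potentially breaking, even when the old type implements the trait.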

I just don't know the format yet. Inspirations come from hackage-diff and elm-package. Both use a mix of parser and documentation generator types and structures. Neither exports the format. I hope it will be a format that is almost Rust code :slight_smile: and human readable. But diffing is usually structural, not text-line based.

One idea would be to diff the public branches of the AST itself. Going top down could even save some time, because you can skip the children of parents that already don't match.
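A toy version of that top-down walk, assuming a hypothetical `Item` tree that holds only the public items of a crate (this is not a real libsyntax type). When two nodes already disagree, the node is reported and its children are skipped. For brevity the sketch matches children by name and ignores newly added items.

```rust
/// Hypothetical node in a tree of public items (module, type, method, ...).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Item {
    pub name: String,
    pub signature: String,
    pub children: Vec<Item>,
}

/// Convenience constructor for the example.
pub fn item(name: &str, signature: &str, children: Vec<Item>) -> Item {
    Item {
        name: name.to_string(),
        signature: signature.to_string(),
        children,
    }
}

/// Walk both trees top down and collect the paths of mismatching items.
pub fn diff_items(old: &Item, new: &Item, parent: &str, out: &mut Vec<String>) {
    let here = if parent.is_empty() {
        old.name.clone()
    } else {
        format!("{}::{}", parent, old.name)
    };

    if old.signature != new.signature {
        // The parents already differ: report once and prune the subtree.
        out.push(here);
        return;
    }

    for old_child in &old.children {
        match new.children.iter().find(|c| c.name == old_child.name) {
            Some(new_child) => diff_items(old_child, new_child, &here, out),
            None => out.push(format!("{}::{} (removed)", here, old_child.name)),
        }
    }
}
```

Whether pruning is actually desirable is a design question: it saves work, but it also hides how many child items changed underneath a changed parent.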

I wonder if the recent refactor work on HIR/MIR could be of use. However, I don't know what these formats will contain, or if it's possible to access them for this at all.

@hoodie, yes, but I would ask the hoogle (hackage-diff) and elm-package maintainers why they chose to go the documentation path. Maybe it is just for printing parts of the documentation along with the diff. Somewhere the Haskell tools also use a parser, but they noted that this parsing may fail because of conditional compilation. Interesting interesting.
But you are of course right. libsyntax changes are not that scary and one can just traverse the AST in a LintPass compiler plugin or something.
I hope exams are over soon. I want to build stuff!

Thanks for pointing us to HIR/MIR :slight_smile:

And of course I need to look into racer.

And cargo doc :stuck_out_tongue_winking_eye: