PSA: Check if your Cargo crates are clean and tagged

I'm the maintainer of a crate that can’t include tests in crates.io uploads (because the test files are too big). My recommendation for packagers who want to include tests is to use the source tarball for the corresponding release tag:

Fair point. I would like to believe that is the exception rather than the norm, but I have no statistics to back that up (nor do I see any reasonable automated way to collect such statistics).

One obvious area where this might be common is in crates for parsing (or identifying, as in your case) file formats. There might be other "types of crates" that have the same issue, but I can't think of any off the top of my head (and I bet it will be obvious once someone mentions an example).

Fedora does package every dependency separately, and for library crates, yes the resulting package is a source-only devel package. So rpmbuild basically just makes sure it can build and run tests before packaging it up. Fedora EPEL packages for RHEL do the same.

RHEL vendors dependencies rather than packaging separately, in large part to avoid that churn, especially with the greater release process involved in RHEL. And yes, that means tests are only executed for the "top" binary crate, but that's the part we most care about in that context anyway.

A problem I have come to realise with the approach that Arch and RHEL take is that of licenses: what is the SPDX expression for the binary package with vendored dependencies? Arguably it should be some ungodly amalgamation of all the dependencies.

Completely unrelated to this security discussion, but someone should probably think of that one too eventually.

Yes, Fedora is using a computed license too, e.g. ripgrep:

https://koji.fedoraproject.org/koji/rpminfo?rpmID=37659707

License     : BSD-3-Clause AND MIT AND Unicode-DFS-2016 AND (Apache-2.0 OR BSL-1.0) AND (MIT OR Apache-2.0) AND (Unlicense OR MIT)
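
To illustrate how such a computed license could be put together, here is a minimal sketch that joins the SPDX expressions of all vendored crates with `AND`. It is not Fedora's actual tooling: the function name `combined_license` is hypothetical, and the deduplication is purely textual, so `MIT OR Apache-2.0` and `Apache-2.0 OR MIT` would count as two separate terms.

```rust
use std::collections::BTreeSet;

/// Combine per-crate SPDX license expressions into a single `AND` expression.
///
/// Hypothetical helper for illustration only. Compound expressions (anything
/// containing a space, such as `A OR B`) are wrapped in parentheses so that
/// the surrounding `AND` does not change their meaning. Duplicates are
/// removed by exact string comparison, and terms are sorted for stable output.
fn combined_license(crate_licenses: &[&str]) -> String {
    let terms: BTreeSet<String> = crate_licenses
        .iter()
        .map(|expr| expr.trim())
        .filter(|expr| !expr.is_empty())
        .map(|expr| {
            if expr.contains(' ') {
                format!("({expr})")
            } else {
                expr.to_string()
            }
        })
        .collect();

    terms.into_iter().collect::<Vec<_>>().join(" AND ")
}
```

Calling it with the `license` fields of every crate in the dependency tree (for example as reported by `cargo metadata`) would give one expression for the whole binary. A real tool would want a proper SPDX parser rather than string comparison.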

Not yet, unfortunately. I plan to add the ability to log in and manage things like that, but I don't have that yet. For now the next best thing is to use an RSS reader, and delete the irrelevant notifications there.

I might have a "type". In rsass, the src is 800k and the tests are 13M. This is not due to large blobs, but just a lot of Rust files containing text inputs and expected outputs, generated from the sass test suite (the generation is done by a separate crate that is in the git workspace but not published on crates.io). In my case, the tests compress quite well, so the package is "only" about 500k without excluding anything, but it could be a lot smaller and I have considered excluding the tests. After reading this thread I'm still not sure whether I should.

It's only a bit broken in rare cases; it works for over 90% of crates that we package in Fedora. The cases where it doesn't work are usually easy to deal with.

It's still easier to use canonical sources from crates.io than dealing with arbitrary Git repositories (where it's often not clear which state of the git repo corresponds to which version published on crates.io).
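
For crates that were published from a git checkout, cargo records the commit in a `.cargo_vcs_info.json` file inside the `.crate` archive, which is one way to check which git state a published version corresponds to. Below is a rough sketch of pulling the hash out of that file's contents. The function name `vcs_sha1` is hypothetical, and the string matching is deliberately naive; a real tool should use a JSON parser.

```rust
/// Naively extract the commit hash from the contents of `.cargo_vcs_info.json`.
///
/// Hypothetical helper for illustration only: it looks for the first
/// `"sha1"` key and returns the quoted value that follows it, provided the
/// value is a non-empty hexadecimal string.
fn vcs_sha1(vcs_info_json: &str) -> Option<&str> {
    let after_key = vcs_info_json.split_once("\"sha1\"")?.1;
    let after_colon = after_key.split_once(':')?.1;
    let after_quote = after_colon.split_once('"')?.1;
    let (sha1, _) = after_quote.split_once('"')?;

    let is_hex = !sha1.is_empty() && sha1.chars().all(|c| c.is_ascii_hexdigit());
    is_hex.then_some(sha1)
}
```

The returned hash can then be compared against the commit that the upstream release tag points to. Crates that were not published from a git repository do not carry this file, so `None` is an expected outcome rather than an error.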

The whole point of using source archives from crates.io is to make the divergence between the crates we use for building packages and the crates that are used for developing projects as small as possible. Using git sources instead would make this worse, not better.
