What to include when publishing a crate?

In Cargo.toml, the include and exclude fields let me control which files are included in or excluded from the package when I run cargo publish.
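
For concreteness, here is roughly what that looks like (the globs below are only examples):

    [package]
    # ...

    # Either list exactly what should be packaged...
    include = ["src/**/*.rs", "Cargo.toml", "LICENSE*", "README.md"]

    # ...or, alternatively, list what should be left out:
    # exclude = ["tests/", "examples/", "benches/", ".github/"]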

Now, I've been wondering when I should use these fields and what I should include/exclude when publishing crates. Is there a reason to include anything more than src/**/*.rs?

More specifically, should I exclude the following directories:

  • tests/
  • examples/
  • benches/
  • .github/

I'm leaning towards excluding them. I think I once navigated to a directory under ~/.cargo/registry/src and ran some tests there, but I'm not really sure whether that's actually supported.

In Advice for publishing librsvg to crates.io, @bjorn3 and @cuviper mentioned that tests published in crates have a benefit since they're executed when packaging.

Is there more information about this and what is the general consensus here?

Personally, I never use these fields.

1 Like

Yeah, until now, I've also figured it was best to let the crate tarball be a faithful representation of the repository working copy... However, I've recently started to include some demo code in the examples/ folder and for that I would like to add some data files.

If nobody will ever see them deep inside ~/.cargo/registry/src/ and nobody ever ventures there to type cargo run --example interactive, then I might as well exclude them :slight_smile:

I think it would be nice to write a paragraph about this in the Cargo documentation. Right now the documentation explains what the fields do — but not why or when one should use them. I'll be happy to submit a patch later if there is consensus on specific advice to add.

1 Like

My take is that published crates are for Cargo and crates.io only, so I exclude everything that they won't use. No tests, no benchmarks, no examples.

crates.io will be storing all this data forever, and every user is going to have to download it all, so I'd rather keep crate files as small as possible.

4 Likes

I guess it's worth considering whether you want your crate to be useful in Crater runs, which build crates but also run their test suites: https://github.com/rust-lang/crater

7 Likes

Is there a Rust analogue to http://www.cpantesters.org/, which runs the test suites for published Perl modules on a wide variety of systems? If there is, what can it do with crates that don't include tests?

It's been a few years, but I recall checking out the Ruby programming language a while back and being surprised at how poorly the Ruby Gem ecosystem fared in terms of portability compared with CPAN. I attributed this to a Ruby culture of "tests are for the developer", which likely arose from certain Gem infrastructure design decisions:

  1. There is no expectation that the user will run tests when installing a gem: the recommended invocation is just gem install MODULE. Contrast that with Perl, where manual installation traditionally involves a make test step, and cpan install MODULE runs tests and aborts installation if they fail.
  2. Under the Gem spec design, dependencies necessary to run test suites for Gems were lumped in with all "developer dependencies", which could be a large and weighty list. Contrast that with Perl/CPAN, where test suite dependencies are traditionally included in the main dependency list or broken out into a "test requires" list (which is typically small, as it's a subset of all developer dependencies).

The consequences of these Gem ecosystem design decisions were to foster a culture where almost nobody ran tests except maybe the developer — and so if your environment wasn't exactly the same as the developer's, there was a significantly increased chance of module misbehavior. Furthermore, attempts to build the equivalent of CPAN testers were hampered greatly by the difficulty of getting test suites running on anything but the author's system. (Perhaps things are better now; as I said it's been a few years.)

In conclusion, I hope that you and other crate authors bundle tests and expect downstream packagers (and sometimes users) to run them. A culture of actually running test suites makes a big difference in terms of the reliability of the wider ecosystem.

1 Like

Dev-dependencies in Cargo are only used for tests, benchmarks, and examples, and in most cases benchmarks and examples barely require any new dependencies. If there is a tool used only during development, you as a developer install it globally; Cargo doesn't have any way to depend on an executable. Things like transpiling or generating code are normally done while compiling rather than while packaging, which means the necessary dependencies must go in build-dependencies.
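
As an illustration (the crate choices below are just examples):

    # Pulled in only for cargo test, cargo bench, and examples;
    # never downloaded by crates that depend on this one.
    [dev-dependencies]
    criterion = "0.3"

    # Needed at compile time by build.rs, e.g. to compile bundled C sources.
    [build-dependencies]
    cc = "1"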

That would likely incentivize developers to make the test suite as small as possible to save the user time, rather than as thorough as possible.

It would be nice for users to be able to run the tests of their dependencies, but it shouldn't be mandated IMO.

Developers are sensitive to test running times because they run the test suites constantly themselves as part of the edit-compile-test loop! That's at least as important an incentive as saving the user time during the install phase.

Rust's test harness specifically provides a mechanism for skipping long-running tests by default: #[ignore] (they can still be run explicitly with cargo test -- --ignored). There's no need to amputate tests entirely in order to save running time.
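
A minimal sketch of what that looks like (the test names and the dummy slow check are made up):

    #[cfg(test)]
    mod tests {
        // Runs on every plain cargo test invocation.
        #[test]
        fn fast_sanity_check() {
            assert_eq!(2 + 2, 4);
        }

        // Skipped by default; opt in with: cargo test -- --ignored
        #[test]
        #[ignore]
        fn slow_exhaustive_check() {
            assert!((0..10_000_000u64).all(|n| n.checked_mul(2).is_some()));
        }
    }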

At the least, downstream packagers should be able to run tests. Excluding tests from crates just to save on download bandwidth (as advocated upthread), when they are typically just a few text files occupying a few kilobytes, is penny-wise and pound-foolish.

Furthermore as a user, I would like to know which modules can be depended on reliably because they successfully compile and pass their tests consistently across a wide variety of environments — versus those which either provide no tests or which often fail to pass their test suites. Must we all be cursed to rely solely on individual anecdotal experience to discover which modules are robust and which are flaky?

2 Likes

In that case there is only a single set of tests that is run, rather than a set for each dependency. Also, developers don't necessarily run the tests themselves. Especially for rust-lang/rust, developers often only do a check run and then leave the tests to CI.

I personally only run small sets of the tests while developing cg_clif. Which set depends on the kind of change and on how far along it is. As a first level I have cargo check errors shown by rust-analyzer; the next level is a small set of tests that takes less than 5 seconds to complete but quickly finds many problems. When implementing a new feature, I try to add a test for it to this small set. The next level is to compile the standard library and run a few small tests that depend on it, which takes about a minute. Then the tests of a few crates run to detect more miscompilations. Finally, even more tests run on CI. While developing I often run the tests up to a specific point to see if I am on the right track and then abort the testing after that point. Running tests for dependencies would make it impossible to run only a part of the tests. Even if it were possible, it is unreasonable to expect a user who is unfamiliar with the code base to decide which tests are important and which are fine to skip.

In my experience almost all crates either work fine on all platforms, or use platform dependent functions and thus completely fail to compile even before running any tests.

I completely agree. With "mandated" I was referring to mandating that the user must run the tests, rather than mandating that the user can run the tests.

I'm a fan of cargo-diet. It cuts out anything unnecessary and requires all of one command to do it.

With regard to cargo verifying the package before finalizing a publish, I typically use --no-verify because CI has already validated things.

3 Likes

I see that among the things cargo-diet cuts out are tests, which the associated Lean Crate Initiative asserts "There is no need for" (in the crate). Good grief, people really seem determined to make it hard for downstream consumers to run tests. :frowning:

Has anybody ever run tests on all of crates.io, to see how close this anecdotal experience is to reality?

This experiment (found at the top of a Google search) reported 136596 passing tests and 11596 failing ones.

Please, just clone the repo to run the tests. Nowadays Cargo even includes a .cargo_vcs_info.json file in the package, which gives you the commit hash that should correspond to the published version.

Tests may require fixtures, which may be huge data files. For example, I need very large images to test the extremes of my image-processing crates. Unicode crates for Rust used to have 20 MB data files used only during tests.

4 Likes

Why would a downstream user want to run the tests? The only time that's desirable is during development, and for that you already have the repository cloned.

When I ran cargo diet on the time crate, it was a ~40% reduction in size. That's far from trivial.

2 Likes

I think it's perfectly reasonable for a downstream user to run the tests of their dependencies. If a library seems to misbehave in your environment, why not just run cargo test -p dep_name to see if its tests catch anything unusual? Cloning the repository and checking out the correct version manually is far less convenient (it'd be nice to have a tool that does that automatically based on the vcs_info file). So if the crate's tests don't need large files, I'd prefer that they were included in the published package.
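
A rough sketch of what such a tool could do, assuming a serde_json dependency and git on the PATH (the function name is made up):

    use std::process::Command;

    /// Check out, in an existing clone of the upstream repository, the commit
    /// recorded in a published crate's .cargo_vcs_info.json file.
    fn checkout_published_commit(vcs_info_path: &str, repo_dir: &str) -> std::io::Result<()> {
        // The file contains something like: {"git": {"sha1": "<commit hash>"}}
        let contents = std::fs::read_to_string(vcs_info_path)?;
        let info: serde_json::Value = serde_json::from_str(&contents)
            .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))?;
        let sha1 = info["git"]["sha1"].as_str().ok_or_else(|| {
            std::io::Error::new(std::io::ErrorKind::InvalidData, "no git.sha1 field")
        })?;

        // Check out that exact commit so the working copy matches the published crate.
        let status = Command::new("git")
            .args(["checkout", sha1])
            .current_dir(repo_dir)
            .status()?;
        if !status.success() {
            return Err(std::io::Error::new(std::io::ErrorKind::Other, "git checkout failed"));
        }
        Ok(())
    }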

3 Likes

Thank you! If I'm interpreting the full report right, it looks like these are the results:

  • 119569 crates were tested in total.
  • 67834 passed their test suites.
  • 29397 failed to build.
  • 11786 were "broken" and the build could not be initiated. It seems that these are mostly due to a missing Cargo.toml, a Cargo.toml syntax error, or a failed attempt to clone a private GitHub repository.
  • 3328 are classified as "error". These also seem to be repository cloning failures; perhaps because the repo was removed (as opposed to made private).
  • 6186 failed their test suites.
  • 494 fell into a variety of minor categories such as skipping tests, etc.

6186 out of 119569 crates building successfully but then failing their tests is 5.17%. @bjorn3, I think this illustrates that while many portability issues may be caught at compile time, actually running tests in the destination environment remains of critical importance.

This was on Linux, BTW. I would expect the results to degrade further on other operating systems.

Because of portability concerns, developer mistakes, spooky random failures, and more. What works on the developer's system may not work on the user's system.

  • different version of Rust
  • different operating system
  • different non-Rust/Cargo dependencies
  • different networking configuration
  • different file system layouts
  • an accidentally committed dependency on some local file which is not bundled in the release
  • glitches when generating the release package
  • a heisenbug which manifests randomly once out of every 256 test runs
  • ...

The current time crate, 0.2.23, is 64.9 kB. When determining whether to trade off software reliability for bandwidth in the wider ecosystem, absolute size matters more than relative size because absolute size is what ultimately determines how many users experience bandwidth-related issues.

As the Crater report illustrates, around 10% of crates specify repository locations that are unreachable at time of testing.

It's not reasonable to expect that authors will host external resources such as repositories in perpetuity. It is reasonable to encourage authors to bundle their test files in their canonical release packages, and then to archive those packages in perpetuity.

3 Likes

Someone would have to manually verify why at least some of the tests failed. Perhaps the crate is documented to fail on specific platforms/configurations?

Yes, that's also my take away from the discussion so far.

Crates can be cached aggressively, so as long as the size stays below, say, 1 MB, then I think it's fine to bundle a few test files.

I'll exclude the demo programs in the examples/ folder since it seems nobody will ever run those.

I took a quick look at several of the failed tests. Almost all tests looked like they failed because they were broken. There was also a spurious network error. rayon_logs failed because of a read-only file system. Nothing seemed to have been caused by portability issues.

1 Like

Isn't that the issue?

Whether the tests were passing in the developer's environment at the time the crate was published and the breakage occurred because of some delta between that developer environment and the destination environment, or whether the tests were failing in the developer's environment at the time the crate was published, you still needed to run tests in the destination environment to find out that the crate was broken.

In other words, although it may be true for you that "In my experience almost all crates either work fine on all platforms, or use platform dependent functions and thus completely fail to compile even before running any tests", that experience does not generalize to all users. Excepting glitches causing occasional spurious test failures, around 5% of the crates in this Crater run compiled successfully, yet do not "work fine".

(Kudos to the crate authors for writing the tests in the first place so that their brokenness is discoverable!)

It makes sense that many portability issues cause build-time failures in a statically typed language like Rust. Perhaps it would have been more helpful if I had emphasized the limitations of the type system upthread instead of portability. You still need tests to catch logical errors.

1 Like