I'm curious about the level of "processing" cargo tree does in producing its output, specifically when using --format "{l}" to dump license expressions.
It's no surprise to most that there are challenges in packaging compiled Rust binaries for OSS Linux distributions. One of the biggest is working out the combined set of licenses that cover the package, which is effectively the combination of the licenses of every crate that goes into the final build.
cargo tree --format "{l}" certainly has the potential to assist in compiling that data, with the correct options. So far, it's been working pretty well to extract a flat list using a command along these lines:
cargo tree --workspace --edges no-build,no-dev,no-proc-macro \
--no-dedupe --target all --prefix none --format "{l}" \
| sed -e "s: / :/:g" -e "s:/: OR :g" | sort -u
(The sed is because some crates list their license options like "MIT/Apache 2.0".)
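To make the effect of those two sed rules concrete, here is a minimal sketch that runs them over three spellings of the same dual license. The sample strings are made up for illustration rather than taken from a real crate graph, and the function name normalize_licenses is hypothetical.

```shell
# Same two sed rules as in the pipeline above, wrapped in a function.
normalize_licenses() {
  # Rule 1 collapses " / " to "/"; rule 2 turns every "/" into " OR ".
  sed -e "s: / :/:g" -e "s:/: OR :g" | sort -u
}

# Illustrative input only: three ways of writing one dual license.
printf '%s\n' 'MIT/Apache-2.0' 'MIT / Apache-2.0' 'MIT OR Apache-2.0' \
  | normalize_licenses
```

All three spellings collapse to the single line MIT OR Apache-2.0. The rules only rewrite the separator, so differences in how a license name itself is spelled are left alone.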
My question is: Do we really need the --no-dedupe in there? It's there, to the best of my knowledge, on the assumption that the deduplication may lose information: if a certain crate has different features enabled each time it's consumed, that would affect either the license displayed or the list of its own dependencies (and therefore their set of licenses).
But can that actually happen? Is the {l} value worked out separately each time a crate is printed?
If it's just whatever is listed after license = in each crate's Cargo.toml, then it'll be the same for that crate each time the crate is listed.
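One way to check that assumption on a given workspace, sketched here rather than offered as a definitive test, is to print the package alongside its license and look for any package that shows up with more than one license string. The function names and the "|" delimiter are my own choices. {p} is the cargo tree format placeholder for the package.

```shell
# Reads "package|license" lines on stdin and prints every package that
# appears with more than one distinct license string.
packages_with_varying_license() {
  sort -u | cut -d'|' -f1 | sort | uniq -d
}

# Hypothetical wrapper: feeds the full, non-deduplicated tree into the
# check. Assumes it is run from the workspace root.
license_variants() {
  cargo tree --workspace --edges no-build,no-dev,no-proc-macro \
    --no-dedupe --target all --prefix none --format "{p}|{l}" \
    | packages_with_varying_license
}
```

Calling license_variants and getting no output would mean every package in that tree was printed with a single license string. That is evidence for one workspace only, not a guarantee about how cargo tree behaves in general.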
And if features can alter the list of dependencies for a crate (I'm not sure whether that actually is possible), then would feature unification also ensure that the dependencies listed are the union of all of the dependencies needed for all features?
Unless there is a possibility of information being concealed, it doesn't seem like the --no-dedupe is really accomplishing anything.
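In the meantime, a rough empirical check is to build the license set both ways and compare them. The sketch below assumes it is run from the workspace root, and the function names are hypothetical. One detail to allow for: without --no-dedupe, cargo tree marks repeated entries with a trailing " (*)", so that marker is stripped before comparing.

```shell
# Drop the " (*)" repeat marker, then apply the same normalization
# as the pipeline in the post.
normalize_licenses() {
  sed -e 's/ (\*)$//' -e "s: / :/:g" -e "s:/: OR :g" | sort -u
}

# Unique license expressions; extra cargo tree flags are passed via "$@".
license_set() {
  cargo tree --workspace --edges no-build,no-dev,no-proc-macro \
    --target all --prefix none --format "{l}" "$@" | normalize_licenses
}

# Prints "identical" if --no-dedupe does not change the set,
# otherwise prints the diff between the two lists.
compare_dedupe() {
  a=$(mktemp) && b=$(mktemp) || return 1
  license_set --no-dedupe > "$a"
  license_set > "$b"
  diff "$a" "$b" && echo "identical"
  rm -f "$a" "$b"
}
```

If compare_dedupe prints "identical", that only shows the flag made no difference for that particular workspace. It doesn't settle whether deduplication can hide a license in general.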