What is the current policy on crate size and structure? As I understand it, thanks to Cargo's capabilities, small fine-grained crates are preferable to monolithic ones. But many crates tend to take a more monolithic approach. Is there a reason for this, or has simply no one bothered to break them into smaller ones?
What is a reasonable limit for breaking crates into separate ones, in your opinion?
And if, say, a monolithic crate were decoupled into 50 smaller crates, would there be costs (compile time, runtime, binary size, etc.) to pay for such a fine-grained approach?
My rule of thumb is to make anything that could be standalone and reusable be its own crate. Emphasis on reusable. If it's totally specialized to your own needs, don't bother, but generally yes. Break monolithic crates apart. 50 does seem high, though, and I'd be very curious to see a practical example.
50 was not a real number, just an example to illustrate the question about the possible costs of dividing a crate. But I guess something like rust-crypto could come close to this number (all the algorithms plus the service crates used across them).
Sometimes you just can't split them. I'd love to have a generic n-dimensional array in one crate, build the numerical stuff on top of it in another crate, and put serialization in yet another. But implementations of traits defined elsewhere (like Add or the serialization traits) need to be in the same crate as the definition of the type itself.
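To make the restriction concrete, here is a minimal sketch. NdArray and Elementwise are hypothetical names invented for this illustration, not types from any real crate. The crate that defines NdArray may implement the standard Add trait for it. A crate that defines neither the trait nor the type cannot, and the usual workaround there is a local wrapper (newtype). Vec<f64> plays the role of the foreign type below.

```rust
use std::ops::Add;

/// Hypothetical n-dimensional array, defined in *this* crate.
#[derive(Debug, Clone, PartialEq)]
pub struct NdArray {
    pub shape: Vec<usize>,
    pub data: Vec<f64>,
}

// Allowed: `Add` is foreign (it lives in std), but `NdArray` is local.
impl Add for NdArray {
    type Output = NdArray;

    fn add(self, rhs: NdArray) -> NdArray {
        assert_eq!(self.shape, rhs.shape, "shapes must match");
        let data: Vec<f64> = self
            .data
            .iter()
            .zip(rhs.data.iter())
            .map(|(a, b)| a + b)
            .collect();
        NdArray { shape: self.shape, data }
    }
}

// Not allowed: `impl Add for Vec<f64> { ... }` is rejected by the compiler,
// because both the trait and the type are defined outside this crate.
// A separate "numerics" crate would be in the same position with respect
// to a foreign `NdArray`. The workaround is a local newtype:
#[derive(Debug, Clone, PartialEq)]
pub struct Elementwise(pub Vec<f64>);

impl Add for Elementwise {
    type Output = Elementwise;

    fn add(self, rhs: Elementwise) -> Elementwise {
        // Element-wise sum; `zip` stops at the shorter of the two inputs.
        Elementwise(self.0.iter().zip(rhs.0.iter()).map(|(a, b)| a + b).collect())
    }
}
```

The wrapper works, but every downstream user then has to wrap and unwrap values, which is a large part of why such impls tend to end up in the crate that defines the type.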
It doesn't really matter how you structure a binary project; do what makes sense for your use case. For libraries, it has more to do with what the abstraction being implemented requires than with any particular preference.
I think abstraction is the key. I've been reaping rewards from making a couple of crates that all support the one use case orthogonal to each other, and Rust makes it easy to work on projects split across many crates.
I think having very fine-grained crates poses significant risks. Abstractions are not free; each additional dependency you take on for your project brings some additional project risk, and is something you have to (to some extent) keep track of. I'd rather not depend on a crate for a single function, unless said function has significant complexity that is very isolated from the functionality of my main project.
I'm trying to understand what you are saying, but I don't get it. What risks? Actually, I think that when a crate is very small, its complexity is also small and breaking API changes happen less often, and when they do happen, they can be handled very easily.
I think I know what he means: statistically, the more crates you use, the higher the chances are that something breaks. It also involves more authors, which means varying code quality, the risk of orphaned crates, etc.
But I'm also in favour of several small and simple crates. As you said, the complexity is small, and often there are multiple crates that do more or less the same thing, so you have the option to switch.
Taking a dependency on another crate, either directly or transitively, should not be taken lightly. Having few dependencies is a feature. Every additional crate adds bloat and a maintenance and licensing burden, and the value the crate brings to the table must outweigh that burden. If it’s a single function, do you really need an external crate for that?
On the one hand, this means keeping crates small: if all I need is something that url-encodes strings, I don’t want to depend on an entire web framework for that. On the other hand, if I am looking for a simple http server to embed in my application, I would strongly prefer one with only a few dependencies over one that pulls in half of crates.io for every minuscule sub-task.
I still think more, finer-grained dependencies are usually a better situation than fewer, larger ones. Your dependency footprint is clearer and upstream changes are more fine-grained.
I'm pretty sure we're talking more about the case where you have a single 'library' but don't know whether to expose it as a single crate or as a number of them. I'd say that if you've got abstractions with multiple concrete implementations, or orthogonal components that combine in arbitrary ways, then splitting each of those logical units into a separate dependency makes sense.
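As a rough sketch of that layout: the modules below stand in for separate crates, since a single snippet cannot contain several. All names (checksum_core, checksum_sum, checksum_xor, Checksum, ByteSum, ByteXor) are made up for illustration. One small crate holds only the abstraction, and each concrete implementation depends on that crate alone.

```rust
/// Stand-in for a tiny "core" crate: only the shared abstraction.
pub mod checksum_core {
    pub trait Checksum {
        fn update(&mut self, data: &[u8]);
        fn finish(&self) -> u32;
    }

    /// Generic helper that works with any implementation.
    pub fn checksum_of<C: Checksum>(mut state: C, data: &[u8]) -> u32 {
        state.update(data);
        state.finish()
    }
}

/// Stand-in for one implementation crate (depends only on the core crate).
pub mod checksum_sum {
    use super::checksum_core::Checksum;

    #[derive(Default)]
    pub struct ByteSum(u32);

    impl Checksum for ByteSum {
        fn update(&mut self, data: &[u8]) {
            for &b in data {
                self.0 = self.0.wrapping_add(b as u32);
            }
        }
        fn finish(&self) -> u32 {
            self.0
        }
    }
}

/// Stand-in for a second, independent implementation crate.
pub mod checksum_xor {
    use super::checksum_core::Checksum;

    #[derive(Default)]
    pub struct ByteXor(u8);

    impl Checksum for ByteXor {
        fn update(&mut self, data: &[u8]) {
            for &b in data {
                self.0 ^= b;
            }
        }
        fn finish(&self) -> u32 {
            self.0 as u32
        }
    }
}
```

A user who only needs ByteXor would then depend on the core crate and one implementation crate, and never compile the others.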
BTW: A few months ago, I proposed to the developers of the rand crate that they integrate my lib, but they responded that they were going to split their crate anyway and that integrating more functionality wasn't a good idea.
Maybe this calls for a new feature: "collection" crates or some such. They contain no actual code, but can depend on other crates. So the OP's question would be answered by "make a bunch of small crates and put them in a single collection crate".
Some kind of feature to make finding/testing dependent crates easier would help with dependency issues. Even if it only dealt with collection crates.
To some extent you can do this right now by using pub extern crate <crate_name>; and thus creating something like a meta-crate. In the documentation you will get a list of re-exports pointing to the generated documentation for the sub-crates. Testing and benchmarking are a bit harder for now, but you can pass multiple -p flags, so you can write a small script that runs the tests for all the necessary crates without recompilation.
But it would be nice to have a Cargo.toml option, something like subcrates, so that by adding dependencies to it you could tell cargo to run their tests and benchmarks in addition to those in the current crate (recursively, of course). The same goes for running examples from sub-crates.
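A minimal sketch of the re-export idea follows. To keep the snippet compilable without any dependencies, the standard alloc and core crates stand in for real sub-crates, and the module numeric_suite stands in for the meta-crate's root. All of the names here (numeric_suite, containers, primitives, sum_through_facade) are invented for the example.

```rust
/// Stand-in for the root of a hypothetical meta-crate. It contains no
/// logic of its own, only public re-exports of other crates.
pub mod numeric_suite {
    // In a real meta-crate these would be the actual sub-crates listed
    // under [dependencies]; `alloc` and `core` are used here only because
    // they are always available.
    pub extern crate alloc as containers;
    pub extern crate core as primitives;
}

/// Downstream code reaches everything through the meta-crate's paths.
pub fn sum_through_facade(values: &[i32]) -> i32 {
    let mut buf: numeric_suite::containers::vec::Vec<i32> =
        numeric_suite::containers::vec::Vec::new();
    buf.extend_from_slice(values);
    buf.iter()
        .fold(0, |acc, v| numeric_suite::primitives::ops::Add::add(acc, *v))
}
```

With real sub-crates the downstream user would list only the meta-crate as a dependency and write paths of the form meta_crate::sub_crate::item.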