Crates.rs — a new, faster crate index website


#43

Yes, I plan to dedupe them like that. Existing crates tend to use a Cargo workspace in a git monorepo, so deduplicating crates based on that should work.

I’m thinking about:

  1. if multiple crates have the same git repo URL,
  2. scan the repository to find the cargo workspace.
    • if it’s not a real workspace, just search for all Cargo.toml files
  3. If there’s a named crate at the root of the workspace/repo, use that as the project name
    • otherwise use github/gitlab/etc. project name
    • otherwise pick shortest crate name in the workspace
  4. Treat the project as a micro category
    • display only one of the crates in normal category listings
    • display child crate names as “Parent Crate Name > Child Crate Name”
    • link to siblings from crates’ pages
    • etc.
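
A minimal sketch of steps 1 and 3 above, assuming the workspace scan has already produced each crate's path inside its repo. The `Crate` record, `group_by_repo`, `project_name`, and the `example.com` URLs are hypothetical names for illustration, not code from crates.rs:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Crate:
    name: str
    repo_url: str
    # Directory of the crate's Cargo.toml inside the repo; "" means the repo root.
    path_in_repo: str


def group_by_repo(crates):
    """Step 1: only repos shared by more than one crate form a project."""
    groups = defaultdict(list)
    for crate in crates:
        # Assumption: URLs differing only by a trailing slash are the same repo.
        groups[crate.repo_url.rstrip("/")].append(crate)
    return {url: members for url, members in groups.items() if len(members) > 1}


def project_name(members, hosted_name: Optional[str]) -> str:
    """Step 3: root crate name, else hosting project name, else shortest crate name."""
    for crate in members:
        if crate.path_in_repo == "":
            return crate.name
    if hosted_name:
        return hosted_name
    # Ties on length are broken alphabetically so the result is deterministic.
    return min(members, key=lambda c: (len(c.name), c.name)).name


crates = [
    Crate("rusoto_core", "https://example.com/r/rusoto", "core"),
    Crate("rusoto_s3", "https://example.com/r/rusoto/", "services/s3"),
    Crate("solo", "https://example.com/x/solo", ""),
]
groups = group_by_repo(crates)
members = groups["https://example.com/r/rusoto"]
print(project_name(members, "rusoto"))
```

The fallback order is the one from the list; how the hosting project name is obtained (here just passed in as `hosted_name`) is left open.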

#45

The link to the next page in the category view leads to crates.io (e.g. https://crates.io/categories/no-std?page=6 from https://crates.rs/no-std)


#46

I know :slight_smile: There’s a bunch of links for unfinished things that suddenly go to crates.io as a fallback. I haven’t implemented paging.

But I’m also wondering if something better can be done than paging. If you’ve scrolled past 50 or 100 crates and still haven’t found what you were looking for, then paging might not be the most effective way to find it. Perhaps smaller subcategories are needed? Filtering/faceted search?


#47

Yeah, but also: Sometimes I’m just browsing, especially as a newbie just exploring the ecosystem


#48

This site is super cool. You say you have 329 bugs and improvements – is it an open source site? Can others contribute? I don’t see a “Fork me” or a link.


#49

Short answer: https://gitlab.com/crates.rs

Longer answer:


#50

Ah ok! Thanks for the info.


#51

pretty cool. is there any way the output could be alphabetized or sorted by popularity?


#52

Depends which output?

  • The categories on the homepage are sorted by (number of crates * popularity of their top crates).
  • The category listings are sorted by popularity (but once I make progress on search, I’ll sort it by rank/relevance for that category)
  • Dependencies are sorted by… a bit of a mess. I’m thinking about sorting them by “weight”, grouped by platform.
  • Authors are sorted by owners first, then mix of original order in Cargo.toml and amount of contributed code.

In general I’m trying very hard to never ever sort anything alphabetically anywhere.
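
As a rough illustration of the homepage ordering from the first bullet, here is a sketch of "number of crates * popularity of their top crates". How "popularity of the top crates" is aggregated is not stated, so summing the top three is purely an assumption, and `category_score`, `sort_categories`, and the sample numbers are made up:

```python
def category_score(popularities, top_n=3):
    """Score = number of crates * summed popularity of the top_n crates."""
    # Assumption: "popularity of their top crates" means the sum of the top_n values.
    top = sorted(popularities, reverse=True)[:top_n]
    return len(popularities) * sum(top)


def sort_categories(categories):
    """Return category names ordered from highest to lowest score."""
    return sorted(
        categories,
        key=lambda name: category_score(categories[name]),
        reverse=True,
    )


# Made-up popularity numbers per crate, keyed by category.
categories = {
    "small-but-popular": [100],
    "large-and-even": [5, 5, 5, 5],
    "tiny": [10, 1],
}
print(sort_categories(categories))
```

Multiplying by the crate count favours large categories, while the top-crate term keeps a category full of unused crates from ranking high.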


#53

:exclamation: :exclamation: :exclamation: :exclamation: :exclamation:

It’s fully open-source now:


#54

I need help figuring out how to present groups/families of crates from a monorepo.

It’s common in Rust to have a single git repo with several crates, e.g.

My initial thought was to use the directory hierarchy within the repo to establish a crate hierarchy, and show them as “child crate, belongs to parent crate” on the page, but it’s more common for crates to have siblings rather than parents.

How to find the name of the whole group? e.g. given Rusoto’s repo, which crate is the “main” one (and what algorithm will find it)?

How should a crate’s page present the other crates in the same repo? How should it look for repos with 2, 5, 10, 300 crates?


#55

If a crate has the same name as the repo, then it’s the main crate.

Otherwise, the crate with the shortest name is the main crate.


Looking through them, I don’t think there is a ‘main crate’ for things like SGX. It might just be better to have a separate namespace for crates.rs repos, and organize crates underneath it.


#56

Hi,

I’m the author of “imag”.

In the imag project, the main crate is “imag” (but it does not live
at the root of the repository, if that matters).
The name of the repo is the name of the “root”, so to speak.

What I think would be best: use the name of the repository as the
name of the “group” if the owner is a “normal” GitHub account. If it
is an organization, the org name could be (but is not necessarily)
the name of the group. I don’t know what would be best then…


#57

I’ve tried deducing crate hierarchy from a) directory layout and b) matching against the repo name or repo owner, but that still gave minimal coverage. It seems like many repos are just “a bag of crates”.

Also displaying the parent crate as a category doesn’t work — it’s too small and hard to notice. OTOH putting it as a prefix before the <h1> crate name makes it too large. :man_shrugging:


#58

In other news, I’m working on scanning through the git history of crates and searching for crates they’ve replaced (removed one, added another). This seems like solid data for recommendations such as gcc->cc, rustc-serialize->serde.

edit: it’s live! The data in the long tail gets sparse so some suggestions are hilariously bad :slight_smile:
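
One way the "removed one, added another" signal could be counted, as a sketch only. It assumes each crate's history has already been reduced to a list of dependency-name sets, oldest first; `count_replacements`, `suggestions`, and the `min_support` threshold are hypothetical, not the site's actual code:

```python
from collections import Counter


def count_replacements(histories):
    """Count (removed, added) pairs across successive dependency snapshots."""
    counts = Counter()
    for snapshots in histories:
        for before, after in zip(snapshots, snapshots[1:]):
            removed = before - after
            added = after - before
            # Only count unambiguous swaps: exactly one dependency out, one in.
            if len(removed) == 1 and len(added) == 1:
                counts[(next(iter(removed)), next(iter(added)))] += 1
    return counts


def suggestions(counts, min_support=2):
    """Keep pairs seen at least min_support times, most frequent first."""
    return [pair for pair, n in counts.most_common() if n >= min_support]


# Illustrative histories, one list of snapshots per crate.
histories = [
    [{"gcc", "libc"}, {"cc", "libc"}],
    [{"gcc"}, {"cc"}],
    [{"rustc-serialize"}, {"serde"}],
    [{"a", "b"}, {"c", "d"}],  # two out, two in: ambiguous, so ignored
]
counts = count_replacements(histories)
print(suggestions(counts))
```

A support threshold like `min_support` is one possible way to trim the sparse long tail, at the cost of dropping rare but valid replacements.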


#59

For RustCrypto crates it’s not possible to determine the “main” crate, as the repositories contain collections of algorithms. In other words, the crates in each repository are “equal” to each other.


#60

Rusoto maintainer here. Rusoto used to be a single crate called rusoto and now there is a main crate of rusoto_core. One idea on how to figure out which crate is the parent crate: use the dependency graph to determine it.

For example, rusoto_core relies on rusoto_credential and rusoto_dynamodb, with the dynamodb cargo feature flag enabled. So one could determine rusoto_core is the main crate and others are dependent/child crates.
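
A small sketch of that idea, assuming the in-repo dependency edges are already known. The edges below simply restate the example in this post rather than being read from real Cargo.toml files, and `main_crate_candidates` is a hypothetical helper:

```python
def main_crate_candidates(deps):
    """deps maps each crate in the repo to the set of sibling crates it depends on.

    A candidate depends on at least one sibling and is depended on by none.
    Candidates are ordered by how many siblings they reach transitively.
    """
    depended_on = set().union(*deps.values())

    def reachable(start):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for dep in deps.get(node, ()):
                if dep != start and dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen

    roots = [c for c in deps if deps[c] and c not in depended_on]
    return sorted(roots, key=lambda c: (-len(reachable(c)), c))


# Edges as described in this post (illustrative, not verified).
rusoto = {
    "rusoto_core": {"rusoto_credential", "rusoto_dynamodb"},
    "rusoto_credential": set(),
    "rusoto_dynamodb": set(),
}
print(main_crate_candidates(rusoto))

# A "bag of crates" with no edges between siblings yields no candidate.
print(main_crate_candidates({"sha2": set(), "sha3": set()}))
```

An empty result would correspond to repos where the crates are equal to each other, so some other naming fallback would still be needed there.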

Hope that’s a useful idea!