Cargo-crev 0.2; notes from dogfooding; looking for automatic scanning tools and ideas


#1

cargo-crev v0.2 was released, containing many fixes and improvements since 0.1.

I’ve started slowly dogfooding and reviewing some dependencies of crev itself. It quickly made me realize that when you have more than 200 dependencies, it’s very important to know where to start: which packages are the most suspicious and generally worth checking.

Some effort to help with that has already been made. E.g. cargo crev verify deps now shows crates.io download counts, along with review counts, to help tell mainstream crates apart from ones that are unproven and potentially riskier.

[I] 12-21 22:56 dpc@futex ~/l/crev (master)> cargo build --release; and ./target/release/cargo-crev crev verify deps 
    Finished release [optimized] target(s) in 0.20s
    Updating crates.io index
unknown   0  0  103716  3055881 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro2-0.3.5
unknown   0  0    2179     2455 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/pmac-0.1.0
verified  6  6  452540  8704066 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/log-0.4.6
unknown   0  0    5651    60842 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/derive_builder-0.7.0
verified  3  3  967452  1951348 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/either-1.5.0
unknown   0  0 1335499  1904576 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/fnv-1.0.6
(...)

In the next few releases I’d like to improve on this, and here lies an opportunity for ideas and feedback. **What good automatically collected metrics and signals would suggest that a crate is worth reviewing?**

Here are the ones I have so far (signal strength in parentheses):

  • small crates.io download count (medium)
  • unsafe line count (high)
  • network, filesystem and other potentially destructive system-level operations (low); this one is tricky, because it would have to be done as cross-crate analysis; (medium)
  • lack of tests (low)
  • line count (low)
  • custom build script (low)
  • compilation and clippy warnings (low)
  • lack of documentation (low)

What else? :slight_smile:
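
One way to turn signals like the ones listed above into a review order is a simple weighted score. The following is only a minimal sketch under stated assumptions, not what cargo-crev does: the `CrateSignals` struct, the numeric weights and the thresholds are all hypothetical, and the system-level-operations signal is left out because it would need cross-crate analysis.

```rust
/// Hypothetical per-crate measurements; none of these names come from cargo-crev.
#[derive(Debug, Clone)]
struct CrateSignals {
    name: String,
    downloads: u64,
    unsafe_lines: u32,
    total_lines: u32,
    has_tests: bool,
    has_build_script: bool,
    warnings: u32,
    has_docs: bool,
}

// Assumed numeric weights for the "low / medium / high" signal strengths.
const LOW: u32 = 1;
const MEDIUM: u32 = 2;
const HIGH: u32 = 3;

/// Higher score means "look at this crate sooner".
/// The thresholds (10_000 downloads, 5_000 lines) are arbitrary placeholders.
fn suspicion_score(s: &CrateSignals) -> u32 {
    let mut score = 0;
    if s.downloads < 10_000 {
        score += MEDIUM; // small crates.io download count
    }
    if s.unsafe_lines > 0 {
        score += HIGH; // any unsafe code
    }
    if !s.has_tests {
        score += LOW; // lack of tests
    }
    if s.total_lines > 5_000 {
        score += LOW; // large line count
    }
    if s.has_build_script {
        score += LOW; // custom build script
    }
    if s.warnings > 0 {
        score += LOW; // compilation or clippy warnings
    }
    if !s.has_docs {
        score += LOW; // lack of documentation
    }
    score
}

/// Sort crates so that the highest-scoring ones come first.
fn rank(mut crates: Vec<CrateSignals>) -> Vec<CrateSignals> {
    crates.sort_by(|a, b| suspicion_score(b).cmp(&suspicion_score(a)));
    crates
}
```

A flat sum like this treats every signal as independent; a real tool would probably want to scale by magnitude (e.g. the number of unsafe lines rather than their mere presence) and let users tune the weights.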

Now, another question: what existing tools that accomplish the above would you recommend? I would like to avoid having to develop and maintain such tools myself. Some nice-to-haves / requirements:

  • usable as a library
  • maintained

#2

To me an important signal is who has permission to publish the crate. If it’s only people that I trust, then it’s fine.

Ownership info can be read with https://gitlab.com/crates.rs/crates_io_client + group membership from github-rs or https://gitlab.com/crates.rs/github_info (these crates currently aren’t on crates.io, but I can publish them if you’d like to use them)


#3

I think it would be a good idea to print column headers before the output - it’s very hard to tell which column contains what kind of data


#4

We do have an issue to make the output better, and it’s one of the items on the list. Other than that: colors, sorting order, summary, etc.

The ownership signal is one I’m uneasy about. crev is essentially a cryptographic WoT. Everything is set up so that you can trust other people’s reviews and verify everything locally. Setting up a whole system for managing trust in ownership is quite complex, and is less of a guarantee, since it relies on data returned by crates.io and the fact of ownership. E.g. what if ownership is shared, someone’s account is compromised, or crates.io gets hacked?

Hmm…

Taking data from crates.io is the easy part.

I guess I could add a command or two to maintain a list of trusted and distrusted crates.io authors, and then mark crates in the summary view (cargo crev verify deps) somehow. E.g. a new “author” column with known, unknown, flagged.
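
The known / unknown / flagged idea could be sketched roughly as below. This is an illustration only: `OwnerStatus` and `classify_owners` are hypothetical names, and the policy (any distrusted owner flags the crate, and a crate counts as known only if every owner is trusted) is an assumption, not something cargo-crev is confirmed to do.

```rust
use std::collections::HashSet;

/// Hypothetical value for an "author" column in the summary view.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum OwnerStatus {
    Known,
    Unknown,
    Flagged,
}

/// Classify a crate by its list of crates.io owners.
/// Assumed policy: one distrusted owner is enough to flag the crate;
/// otherwise it is "known" only when all owners are on the trusted list.
fn classify_owners(
    owners: &[&str],
    trusted: &HashSet<&str>,
    distrusted: &HashSet<&str>,
) -> OwnerStatus {
    if owners.iter().any(|o| distrusted.contains(o)) {
        OwnerStatus::Flagged
    } else if !owners.is_empty() && owners.iter().all(|o| trusted.contains(o)) {
        OwnerStatus::Known
    } else {
        OwnerStatus::Unknown
    }
}
```

Requiring all owners to be trusted is the strict reading of "who’s got permission to publish"; a looser policy (at least one trusted owner) would be a one-line change from `all` to `any`.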

Now, should I circulate such information as a signed proof? Hmm… It is somewhat appealing, because the data is already there etc. But on the other hand, the potential negative consequence is that it distracts from actually reviewing crates and gives a false sense of security. Reputable authors can have their accounts compromised, or go to the dark side too.

Summing up: ideally, I would like burntsushi to sign reviews of his own (and other) crates, and not have people trust everything just because it was allegedly authored by burntsushi. :slight_smile:


#5

I think you should publish them one way or another. I have started using crates_io_api and slapped caching on top of it, but had I known about your crates, maybe I would have gone with them. :slight_smile:


#6

I have pushed a version of crev that fetches and displays the list of owners. We will have to revamp the look and structure, but it might be usable enough for now (e.g. to grep out trusted owners).


#7

cargo crev edit known now allows editing a file with a list of known crate owners, which is then nicely displayed in cargo crev verify deps. Thanks for the suggestion, and I hope it will help.

If someone could compose a good list of reputable crates.io users, I would be happy for the default template to contain it. It’s a bit of a sticky thing, because why do some people get included there by default and some not, etc., but I think as a temporary measure it would be helpful, especially now that we’re bootstrapping the whole thing. Once everyone and their grandma are reviewing crates with crev, there will be no need for this feature whatsoever.
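
For illustration, reading such a list could look like the sketch below. The file format here is purely an assumption (one owner name per line, blank lines and `#` comments ignored), and `parse_known_owners` is a hypothetical helper, not cargo-crev’s actual parser.

```rust
use std::collections::BTreeSet;

/// Parse owner names from the text of a known-owners file.
/// Assumed format: one name per line; anything after `#` is a comment;
/// surrounding whitespace and blank lines are ignored.
fn parse_known_owners(text: &str) -> BTreeSet<String> {
    text.lines()
        // Keep only the part of the line before any `#` comment.
        .map(|line| line.split('#').next().unwrap_or("").trim())
        .filter(|name| !name.is_empty())
        .map(str::to_owned)
        .collect()
}
```

Using a set means duplicate entries collapse silently, which seems reasonable for a hand-edited file.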


#8

I’ve implemented it.

The issue for discussing who should be on the list and why: https://github.com/dpc/crev/issues/100

The list itself: https://github.com/dpc/crev/blob/master/cargo-crev/src/known_cargo_owners_defaults.txt