Idea for scalable code review/trust system

BTW. I plan to name the project crev (feel free to comment on it) and I already started some hacking/more careful design. I’m aiming at a very minimal PoC.

That’s a very good point, and I agree wholeheartedly.

I plan to address that by “project-wise” trusted signatures. Each project would have a list of ids that they consider trusted. It would mean “root of trust” for this project’s WoT and typically consist of a list of ids of all authors and maintainers, stored along with the source code.

So as a user, you should be able to outsource the “confirm who’s real and who’s a sockpuppet” to maintainers and just be able to use it for judging “code review coverage” on your dependencies etc. this way.

I’d like this to be a configuration parameter for every operation. Ideally, you’d like to have code review coverage without transitivity, but (especially when the whole idea is not popular yet) it might not be practical. You might instead try to go with 1 level of transitivity, or 2, or unlimited transitivity, just to see where does it get you.

1 Like

Hmm, I’m not sure this would help much? If the WoT I use for verification is entirely in the hands of the project, then I don’t really need to check it, I can just trust the maintainers of said project and assume each commit is correct :slight_smile:

I was more thinking about 5 random guys not necessarily affiliated with the project, hence my hope for reliance on the OpenPGP WoT as a means to ensure “this key belongs to someone, and is likely not just a sockpuppet” :slight_smile:

Why would that be the case? Anyone can create ids, sign them with other ids, and so on. Unless you would somehow found a match where you can find a trust link between yourself and one of the reviewers. Which is, in my estimation, very unlikely.

Well, if you’re ready to accept any signature link (not trust link) as indicating that someone is someone, then it can be relatively easy: trust some keys in the Debian keyring, and you’ve got an introduction to the “core WoT” to which most OpenPGP users (try to be or) are attached.

If you want only trust links, there are OpenPGP CAs, like a German authority that signs citizens’ OpenPGP keys, CACert that can sign PGP keys, etc. (not arguing in favor of the CA model here, but it’s a possibility if one wants to trust these)

Then, obviously the longer the path the least confident you will be that these are not actually sockpuppets, but I think blindly trusting to give you interesting reviewers the project whose level of trustworthiness you’re actually trying to assess is not necessarily a better option :slight_smile:

Sounds like a terrible name. It makes me think it is a tool related to code reviews for “C”. I know it is supposed to make me think “Code Review”, but, that is the first thing that pops into my head. How about “trev” or “trustrev”? WoT could be pretty good for finding actual identities. Of course it’s still possible to fake identities there, but the bar is higher.

Thanks for giving a feedback! I’ll consider all opinions, and right now the name is not set in stone or anything. :slight_smile:

You’re welcome. Even better, how about “Trevor”, (T)rust (REV)iew (O)f ®ust Code?

With that, if we could also get a related tool called, “Corey” we’d be golden! :smile:

I like names that give very little matches when putting into google search box.

I’ve been thinking about something similar (though different) for a good long while, which may be of interest here.

A recent issue was a popular npm library being hi-jacked: a hacker got a hold of one of the publisher’s key and push a patch version of the library with spyware inside, intent on gathering other keys it seems.

Another semi-recent issue was the discovery that actix-web was invoking undefined behavior in unsafe code.

Those are seemingly unrelated (one is intentional, the other a bug), however all boil down to the same issue from the consumer point of view: how much can I trust this dependency?

This variety of trust issues has led me to think that verification should be extended beyond the “tests passed” badge from CI, and incorporate a broader set of class of verifications.

My draft was:

  • review packages, not code repositories.
  • store reviews in a decentralized database (git repository?), independent of package manager.
  • sign reviews, to authenticate their authors.
  • define multiple categories of review: safety, correctness, usage, test coverage, …
  • allow custom scoring: assigning custom scores to reviewers (reflecting trust) per category.
  • allow custom thresholds: preventing using a version with score less than X in specific category.
  • gate custom scoring/thresholds on specifics:
    • only if package executes code at installation/compilation time.
    • only if package contains unsafe code.
    • only if package is related to specific domain.

The custom scoring is to prevent “sock puppet” accounts to be created by package authors to mass-review their packages.

There are multiple open questions however:

  • key management: a reviewer may want to rotate their keys, it should not invalidate prior reviews.
  • reviewer management: how to score reviewers? (see again: mass-review)

Some comments

I am a big fan of content-addressability and because of that, I think that reviewing files, and potentially whole directory structure (compacted to a single hash) solves it for both code repo and packages. I see no need for making it one or the other.

I’m afraid that making it too rigorous, granular and bureaucratic makes it less usable. Humans have to do an overall judgment, and that’s all about it. You review their judgment or you trust them or not (I plan to have both trust and distrust incorporated). Anyone is free to trust and participate in many WoT circules of any culture and standards), and eventually shared bigger common groups will emerge out of it out of neccessity.

Generally, I come from a principle that it’s about people, not about scores and checkboxes. The system will work if people’s reputations are on the line, and many judgments are combined to correct single mistakes.

Any customizations and complexity about how to make a final judgment what to trust should happen at the very end, at the very end user. The code reviews should be as compact and simple as possible.

Right now I’m planing to have the following categories:

  • througness (how long I’ve been reviewing)
  • understasnding (how I judge my competence around judging this code)
  • trust / distrust (how much I trust this code)

and I plan to have 3 or 4 “levels” for each score.

The “trust level” will come from the end user policy that might state something like: "I trust only software that I reviewed myself, or has 3 or more reviews reviewes of at least “some” througness, of people that I trust directly, or transitive trust of a level at least “good”, etc.

Such policies can be composed ad-hoc (to identif projects, or even their parts) that have to be reviewed directly urgently. And could also be added in the CI, the automatically enforce trust level at any point. And then some for a release.

I’d like this to be: anywhere, with a strong preference of “along with the code”. In my view: All Trust/Review Proofs will be stored in simple serialized form, and then read all when needed.

I plan to have a date for every review proof, and trust proof, and then clients can have any form of a policy. If someone’s id gets compromised, they will have to create a new one, and make other people to create and include “(dis)trust proofs” in their WoTs.

Similarily to code itself, other’s people ids can be reviewed and signed, with trust/distrust score, date and comment.

1 Like

I have today found really bad issues in a crate I was unfortunate to try: and I have no way to share with the community, that this crate probably shouldn’t be used by anyone. And on the list of dependent crates, there’s even sqlite, which seems rather major.

I think we desperately need a system like the one I described. I’m slowly moving forward because I have very little free time. If someone is interested in helping, please let me know.

I have created github repository: . Not much code is there yet, but it’s slowly moving forward.

I really like this idea, but i think it should be a separate service and not bound to the sources of the code.

Why do you think that anyone would merge a bad review?

1 Like

@dpc Hey, I am willing to help on this.

Good luck bootstrapping that and paying for its maintenance, etc.

I mean, I actually tried, failed and gave up, as it’s not the right approach.

Nice thing about submiting reviews as PR is:

  • it’s a social activity and people will know someone reviewed their code
  • it works as word of mouth marketing

They don’t have to. All reviews can be circulated through any means - after all they are just short text.

Rust community can just create a crev-rust-reviews that would merge any code reviews. crev could include some review sources by default too, especially during bootstrapping process.

Having said all that, the place of code reviewes proof is ultimately right next to source code, and responsible and credible people will include even negative ones. If they don’t it’s a good sign that you should not be using their source code in the first place.

Awesome! Feel free to ping me on gitter and I’m happy to chat.

Update to anyone still here. I made quite a bit of progress. There’s even cargo-crev command that kind of works, and can be used for testing and to give a hand-on feeling of how the things are going to work. See:

I have scaled down substantially, focusing only on reviewing crates via cargo-crev. Per-file code-review, does not seem that useful, especially at the beginning, before the idea gains any traction. And even then, it will probably be useful only for bigger, heavier projects.

In the meantime, there was another reddit thread about all that: and it seems people liked the idea, and there are some volunteers willing to help make this all happen.

1 Like