With the recent discussions about crate security, I thought it would be interesting to build a tool to validate crate code.
Typically, when reviewing a crate for functionality and security, I read the code published on GitHub (or another VCS host). But as far as I know, there is currently no guarantee that the code in the VCS matches the published crate. Malicious code could be injected at several points (a malicious author, a hijacked cargo binary, a crates.io hack, an S3 hack).
My proof of concept is pretty simple. It looks like this:
➜ ./run.sh serde = "1.0.130"
Validating serde at version 1.0.130
Downloading https://crates.io/api/v1/crates/serde/1.0.130/download
Unpacking to ./crate
Version 1.0.130 commit hash is 65e1a50749938612cfbdb69b57fc4cf249f87149
Repo URL is "https://github.com/serde-rs/serde"
Github source is git@github.com:serde-rs/serde.git, checking out..
Crate located in subdir ./serde
############################
# VALIDATING REPO CONTENTS #
############################
Ignoring Cargo.toml (TODO)
matching ./serde/LICENSE-APACHE
matching ./serde/build.rs
matching ./serde/README.md
Ignoring .cargo_vcs_info.json
matching ./serde/crates-io.md
matching ./serde/Cargo.toml.orig
matching ./serde/LICENSE-MIT
matching ./serde/src/std_error.rs
matching ./serde/src/lib.rs
matching ./serde/src/private/ser.rs
matching ./serde/src/private/de.rs
matching ./serde/src/private/mod.rs
matching ./serde/src/private/doc.rs
matching ./serde/src/private/size_hint.rs
matching ./serde/src/integer128.rs
matching ./serde/src/de/utf8.rs
matching ./serde/src/de/mod.rs
matching ./serde/src/de/impls.rs
matching ./serde/src/de/value.rs
matching ./serde/src/de/ignored_any.rs
matching ./serde/src/de/seed.rs
matching ./serde/src/ser/fmt.rs
matching ./serde/src/ser/mod.rs
matching ./serde/src/ser/impls.rs
matching ./serde/src/ser/impossible.rs
matching ./serde/src/macros.rs
No code injections found
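
For anyone who wants to picture the comparison step, here is a rough sketch in Python. It is not the actual `run.sh`. The names `read_commit_hash`, `compare_crate_to_repo`, and `GENERATED` are made up for illustration. It assumes the crate has already been unpacked and the repo already checked out at the recorded commit, that `.cargo_vcs_info.json` stores the commit under `git.sha1`, and that `Cargo.toml.orig` in the crate should equal `Cargo.toml` in the repo.

```python
import filecmp
import json
from pathlib import Path

# Files that cargo generates or rewrites when packaging, so they are not
# expected to match the repository byte for byte (assumption of this sketch).
GENERATED = {".cargo_vcs_info.json", "Cargo.toml"}


def read_commit_hash(crate_dir: Path) -> str:
    """Return the git commit recorded in the crate's .cargo_vcs_info.json."""
    info = json.loads((crate_dir / ".cargo_vcs_info.json").read_text())
    return info["git"]["sha1"]


def compare_crate_to_repo(crate_dir: Path, repo_dir: Path) -> list[str]:
    """Compare every file in the unpacked crate with the repo checkout.

    Returns the relative paths that are missing from the repo or whose
    contents differ. An empty list means nothing suspicious was found.
    """
    mismatches = []
    for path in sorted(crate_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(crate_dir)
        if rel.as_posix() in GENERATED:
            continue
        # The author's original manifest is shipped as Cargo.toml.orig,
        # so compare it against the repo's Cargo.toml.
        repo_rel = Path("Cargo.toml") if rel.as_posix() == "Cargo.toml.orig" else rel
        counterpart = repo_dir / repo_rel
        # shallow=False forces a byte-by-byte content comparison.
        if not counterpart.is_file() or not filecmp.cmp(path, counterpart, shallow=False):
            mismatches.append(rel.as_posix())
    return mismatches
```

Calling `compare_crate_to_repo(Path("./crate"), Path("./serde"))` would then return the list of files that do not match. Skipping the generated `Cargo.toml` entirely is a shortcut in this sketch, since a tampered manifest could still pull in a malicious dependency.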
Before I continue and start running this at scale, I would like to know whether my assumptions are correct and whether this would be useful to anyone. Any suggestions welcome!