I'm finding that code from a crate executes significantly slower when built with simple-16 = { path = "../../simple-16" } than with simple-16 = "0.1.0", even though the code is exactly the same. I've switched back and forth several times, and the difference is statistically significant. Multiple runs of my benchmark from the same build are within 2µs of each other, but the difference between builds is consistently 37µs (out of a total of 581µs).
$RUSTFLAGS is set to -C target-cpu=native
For the workspace I have set...
The dependency chain is Benchmarks -> Tree-Buf -> Simple16. The only thing I change is the simple-16 entry in Tree-Buf's Cargo.toml, which I switch between the published crates.io version and the identical local source.
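For reference, the two variants of that entry look roughly like this. This is a sketch built from the lines quoted in the question; it assumes the entry sits under a plain [dependencies] table in Tree-Buf's Cargo.toml, and only one variant is active at a time.

```toml
[dependencies]
# Variant A: published crate from crates.io
simple-16 = "0.1.0"

# Variant B: identical source, local checkout (swap with the line above)
# simple-16 = { path = "../../simple-16" }
```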
Is it possible that this is just a small change in the exact layout of the generated executable that causes a big change in runtime, for example because of code alignment or code locality?
Could the difference possibly be incremental compilation? It's enabled by default for local and path dependencies, but not for crates.io dependencies. It shouldn't have any effect on the generated code, but maybe it does, either through some bug in rustc or through some code generation behavior that affects code locality or alignment, like @bjorn3 suggested.
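One way to test this hypothesis is to force incremental compilation off for the profiles the benchmark is built with, then compare the two builds again. A minimal sketch, assuming the benchmark uses Cargo's standard release and bench profiles and that these settings go in the workspace root Cargo.toml. The codegen-units = 1 line is an extra assumption of mine to reduce variation in code layout, not something from the original post.

```toml
# Workspace root Cargo.toml
[profile.release]
incremental = false   # rule out incremental compilation as the cause
codegen-units = 1     # assumption: fewer units, less layout variation

[profile.bench]
incremental = false
codegen-units = 1
```

Setting CARGO_INCREMENTAL=0 in the environment should also disable incremental compilation for a single run without editing the manifest. If the 37µs gap persists with these settings, incremental compilation is probably not the explanation.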