Announcing cargo-sym


#1

Do you dislike typing nm -s target/debug/<your_crate_name.rlib>?

Do you suffer frequent frowns while typing objdump -T target/debug/<your_dylib.so>?

Do you wake up in the middle of the night suffering from confusing, mangled vision?

Well, then you’re 1 of 3 people that this crate might be for! Now all you have to type is cargo install cargo-sym and then:

  1. cargo sym
  2. cargo sym -e (to print exported symbols if you’re compiling a .so)
  3. cargo sym -d (bonus fun pack, demangles rust symbols)

I’ll be adding new binary backends whenever I add new backends into goblin, which should be this year - so only ELF32/ELF64 is supported right now, sorry! Don’t worry, you’ll be able to be lazy on OSX and Windows somewhat shortly.

Also there will likely be many bugs. If you’re bored, send bug reports and pull requests to cargo-sym.

And always remember:

binary safely

~m4b


#2

Cool! In related news, here’s Bloaty McBloatface, a C++ tool written by a Google hacker that helps you analyse the size of your binaries.


#3

Interesting.

Do you …

Not really, but I find myself typing all these very often:

# size
$ arm-none-eabi-size target/thumbv7em-none-eabihf/release/examples/foo

# disassemble (debug)
$ arm-none-eabi-objdump -Cd target/thumbv7em-none-eabihf/debug/examples/bar | less

# disassemble (release)
$ arm-none-eabi-objdump -Cd target/thumbv7em-none-eabihf/release/examples/baz | less

and these not so often:

# symbols sorted by size
$ arm-none-eabi-nm -C --size-sort target/thumbv7em-none-eabihf/release/examples/qux | less

# when I'm getting an "undefined reference" linker error and I want to find what
# symbol the compiler is not exposing or optimizing away
$ arm-none-eabi-nm -Cg [-u] target/thumbv7em-none-eabihf/debug/examples/quux | less

Will cargo-sym help me be lazy as well? IOW, is replacing any of these commands
in scope for cargo-sym?

FWIW, I’ve contemplated writing a Cargo subcommand that “shells out” to these
"binutils" before but when I think that I need to support which target, which
profile, which example or the main library (.rlib) or which dependency (.rlib)
somehow through command line arguments, it seems to me that the Cargo subcommand
call will end up being longer than directly calling the binutil. But, the worst
part is that I would still be using the binutils under the hood and I’d rather
not have to install them in the first place. So I always gave up before
starting.

Can cargo-sym save me from this predicament?


#4

Oh, we’re going to make you so lazy, don’t you worry.

So everything in your list should be doable (and actually very good use cases), minus the disassembly, since rust doesn’t have a native disassembler (yet) (that I know of).

So just to be clear, your binaries are ELF arm binaries, yea? goblin doesn’t care what platform you’re on; it will read the 32-bit big-endian binaries just fine, so the parsing end isn’t a big deal.

Size should be doable, just tedious, and some questions how that will work cross-platform - although if we restrict it to static libs shouldn’t be an issue (you can take a look at the archive parser, it has size data (and if not i’ll add it)).

Disassemble will be hard. Getting the right binary target will be tricky; i want people to be as lazy as possible here; i.e., just typing cargo sym -r (perhaps) shows the user the arm target for release.

Symbols sorted by size is easy, and doable now with some minor changes (the only problem is how to easily get that target; your example use case is a good one), and the last one is extremely easy (actually in the exports flag I purposefully don’t print “imports”, as you’ll see I do !sym.is_import()).
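To make the last two items concrete, here is a minimal sketch of the export filter and the size sort. The `Sym` struct and both function names are hypothetical stand-ins for illustration; goblin's real symbol type is different, and only the `!sym.is_import()` idea comes from the actual code.

```rust
/// Hypothetical, simplified symbol record (not goblin's real type).
#[derive(Debug, Clone, PartialEq)]
struct Sym {
    name: String,
    size: u64,
    is_import: bool,
}

/// Exported symbols only: keep everything that is not an import.
fn exports(syms: &[Sym]) -> Vec<&Sym> {
    syms.iter().filter(|s| !s.is_import).collect()
}

/// Symbols ordered from smallest to largest, in the spirit of `nm --size-sort`.
fn sorted_by_size(syms: &[Sym]) -> Vec<&Sym> {
    let mut out: Vec<&Sym> = syms.iter().collect();
    // Stable sort: symbols with equal sizes keep their original order.
    out.sort_by_key(|s| s.size);
    out
}
```

Both helpers borrow the symbol list instead of copying it, so they can be chained with whatever printing code already exists.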

tl;dr

yes everything is in scope for cargo-sym modulo disassembly (unless someone writes a disassembly framework or i get bored and link against capstone (but I’d prefer no native system linking, just want it to “work”)), and really great ergonomic examples, thanks!

PS: there is a somewhat undocumented cargo sym -f /path/to/target flag you can try out for now, bug reports appreciated!


#5

FYI, it’s not native, but Capstone does have Rust bindings.


#6

@japaric

Merry christmas! I’ve implemented a simple disassembler for cargo sym on master. You’ll need to git clone, haven’t published to crates.io since it’s not really finished. Should be a simple cargo build to get started (you may or may not need capstone installed as a system library, I’ve somewhat tested the capstone-sys bindings and it should compile from source if it’s missing but who knows), and then something like:

target/debug/cargo-sym -d -C -f /path/to/arm/binary

(I changed the command line api to objdump style, sorries!)

This will likely have many bugs. The unusual targets (i.e., different --targets and example binaries) won’t work yet in the laziest possible manner, you have to pass -f still (this is easy, I just wanted to tackle the hard/interesting problem first). I’m thinking the api for this will be:

  • --ex=<binary name> for getting the example binary
  • -t <target triple> (--target) for accessing the target triple (and defaults to first target/debug, then the first triple it finds if nothing passed)
  • -R or -r (--release) for accessing the release version of the binary (I’m thinking to reserve -r for relocations, but maybe who cares?)

It disassembles in the objdump “style”, by sorting the symbols first according to section, and then by address.

Afaik it then disassembles using some heuristics, since some symbols in ELF don’t have sizes. Don’t look at the code, it’s horrible, and I wrote it as fast as possible :wink:
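For anyone curious what such a heuristic can look like, here is a rough sketch, not the actual cargo-sym code. It assumes a hypothetical `PlacedSym` record, treats a size of 0 as "unknown", and guesses that an unsized symbol runs up to the next symbol in the same section (or to the end of the section if it is the last one).

```rust
/// Hypothetical symbol record for illustration; not cargo-sym's actual type.
#[derive(Debug, Clone, PartialEq)]
struct PlacedSym {
    name: String,
    section: usize, // index of the section the symbol lives in
    addr: u64,
    size: u64, // 0 is treated as "unknown" in this sketch
}

/// Sort by (section, address), then give each zero-sized symbol a guessed size.
/// `section_end` maps a section index to the address one past its last byte.
fn order_and_guess_sizes(
    mut syms: Vec<PlacedSym>,
    section_end: impl Fn(usize) -> u64,
) -> Vec<PlacedSym> {
    syms.sort_by_key(|s| (s.section, s.addr));
    for i in 0..syms.len() {
        if syms[i].size != 0 {
            continue;
        }
        // Assume the symbol extends to the next symbol in the same section,
        // or to the end of the section when there is no such neighbour.
        let end = match syms.get(i + 1) {
            Some(next) if next.section == syms[i].section => next.addr,
            _ => section_end(syms[i].section),
        };
        syms[i].size = end.saturating_sub(syms[i].addr);
    }
    syms
}
```

The guess can overshoot when padding or data sits between two symbols, so the disassembler may decode a few junk bytes at the end of a function.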

It very likely will explode on arm binaries, and probably most other binaries too, but would be great if you tried it out and sent some bug reports. Also, if you do, please attach the binary, since I’m short on random armv7thumb whatever binaries :stuck_out_tongue:

Also, unfortunately I may have to add a --arch flag or something for the kind of disassembling you’re expecting to do. E.g., there might not be enough information in the ELF binary to determine which ARM instruction disassembler backend to use, not sure yet, need to test.

Let me know how it goes!


#7

Nice! I’ll take a look as soon as I can.


#8

Latest version (0.0.4) is out on crates.io, which according to the author’s dubious readme has the following features:

  1. cargo sym will print every debugging symbol it finds in the first valid binary target in target/<target>/debug. This can be, for example:
    a. target/debug (this is used if it doesn’t find a special target, like the following)
    b. target/x86_64-unknown-linux-musl/debug
  2. cargo sym -C will print every debugging symbol demangled
  3. cargo sym -e will print every exported symbol importable by other binaries
  4. cargo sym -Ce will do -C and -e together :]
  5. cargo sym -d will disassemble your binary, objdump style. experimental
  6. cargo sym -d -C /bin/ls will disassemble the binary ls at /bin/ (actually most distros strip /bin/ls so it actually won’t)
  7. cargo sym -Cd --target=aarch64-linux-android will disassemble your crate’s binary at target/aarch64-linux-android/debug/<crate_name>
  8. cargo sym -C --release -x example will print the symbols from the example binary you compiled in release mode (at target/release/examples/example)
  9. cargo sym -Cd --target=debug -x main will disassemble the example binary main in the regular debug location target/debug/examples/main

Note the arguments have changed to match objdump’s. This should work on ARM binaries now, any bug reports are appreciated. There will be many more :stuck_out_tongue:

Most importantly, the target logic has significantly been changed to make it maximally lazy, i.e., it disassembles debug by default, allows disassembling example binaries, and uses the first architecture it finds in the target directory by default, as well as allowing custom targets to be passed. Be lazy, and try it out in a cargo project near you!
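Roughly, the lookup described above can be sketched like this. It is one reading of the rules in this post, not the real implementation: `Opts`, `binary_path`, and `found_triples` are hypothetical names, `found_triples` stands in for the triple directories discovered under target/, and the special --target=debug case is left out.

```rust
use std::path::PathBuf;

/// Hypothetical options mirroring the flags described in this post.
struct Opts<'a> {
    target: Option<&'a str>,  // --target=<triple>
    release: bool,            // --release
    example: Option<&'a str>, // -x <name>
}

/// Build the path of the binary to inspect.
/// `found_triples` stands in for the triple directories found under `target/`.
fn binary_path(crate_name: &str, opts: &Opts, found_triples: &[&str]) -> PathBuf {
    let mut path = PathBuf::from("target");
    // An explicit --target wins; otherwise take the first triple found, if any.
    let triple = opts.target.or_else(|| found_triples.first().copied());
    if let Some(t) = triple {
        path.push(t);
    }
    // Debug is the default profile.
    path.push(if opts.release { "release" } else { "debug" });
    match opts.example {
        Some(name) => {
            path.push("examples");
            path.push(name);
        }
        None => path.push(crate_name),
    }
    path
}
```

With no flags and no triple directories this falls back to target/debug/<crate_name>, which matches case (a) in the list above.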

If someone wants to fix something easy and fun, the section disassembly logic needs some love; right now it’s a hack, and really we should group the symbols into 6 different target sections, and then walk each of those and disassemble. The six sections are given by this filter:

fn valid_disassembly_target(name: &str) -> bool {
    match name {
        ".init" | ".plt" | ".got" | ".plt.got" | ".text" | ".fini" => true,
        _ => false,
    }
}

This also has the opportunity to provide some semantic information to the disassembler strategy, e.g., for the .got and .plt we can print the first and second got entries in a special manner (since we know we’re disassembling the got) and similarly for the first plt entry (the call to the dynamic linker resolver function).
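As a starting point for whoever picks this up, the grouping step could look something like the sketch below. `SectionSym` and `group_by_section` are hypothetical names, and the symbol record is simplified to carry its section name directly, which the real code may not do; the filter is copied from above.

```rust
use std::collections::BTreeMap;

/// Filter copied from the post above.
fn valid_disassembly_target(name: &str) -> bool {
    match name {
        ".init" | ".plt" | ".got" | ".plt.got" | ".text" | ".fini" => true,
        _ => false,
    }
}

/// Hypothetical symbol record that already knows its section name.
#[derive(Debug, Clone, PartialEq)]
struct SectionSym {
    name: String,
    section: String,
    addr: u64,
}

/// Bucket symbols by section name, dropping sections we never disassemble,
/// and order each bucket by address so it can be walked front to back.
fn group_by_section(syms: Vec<SectionSym>) -> BTreeMap<String, Vec<SectionSym>> {
    let mut groups: BTreeMap<String, Vec<SectionSym>> = BTreeMap::new();
    for sym in syms {
        if valid_disassembly_target(&sym.section) {
            groups.entry(sym.section.clone()).or_default().push(sym);
        }
    }
    for group in groups.values_mut() {
        group.sort_by_key(|s| s.addr);
    }
    groups
}
```

Keying the map by section name leaves room for a per-section strategy later, for example special handling when the key is ".got" or ".plt".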


#9

Looks great. Trying it out now :slight_smile: