Why are debug symbols so huge?

Out of curiosity: why are debug symbols so huge? People often describe how Rust binary size can be reduced by stripping debug symbols, and I've just had to make compilation work on a space-constrained system by basically disabling debug symbol generation. It makes me wonder: why do debug symbols add hundreds of megabytes (gigabytes?) extra in ./target/ for a non-trivial binary?

The main question debug symbols try to answer is "if I'm at instruction X, which line of source code am I on?"

Naively, you could imagine debug symbols looking like this:

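use std::collections::HashMap;

// Address of a machine instruction (assumed 64-bit here).
type InstructionPointer = u64;

// One entry per instruction address.
struct DebugSymbols {
    instructions: HashMap<InstructionPointer, DebugInfo>,
}

// Source location recorded for a single instruction.
struct DebugInfo {
    file: String,
    line: usize,
    fully_qualified_function_name: String,
}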

Obviously this can be optimised quite a lot by using smarter data structures, but you can see why it would be quite reasonable for a single instruction to have debug information that takes up many times more space than the instruction itself (1-15 bytes for x86).
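To put a rough number on that, here is a minimal sketch that estimates how many bytes one naive entry would occupy. It reuses the DebugInfo shape from above; the helper names `approx_entry_bytes` and `example_entry`, and the path and function name in the sample entry, are made up for illustration. Real debug formats are far more compact than this, so treat it only as a way to see the order of magnitude.

```rust
use std::mem::size_of;

// Same shape as the naive DebugInfo struct in the post.
#[allow(dead_code)] // `line` is stored but never read in this sketch
struct DebugInfo {
    file: String,
    line: usize,
    fully_qualified_function_name: String,
}

/// Approximate bytes used by one entry: the struct itself plus the heap
/// bytes behind its two strings. Allocator and hash-map overhead are ignored.
fn approx_entry_bytes(info: &DebugInfo) -> usize {
    size_of::<DebugInfo>() + info.file.len() + info.fully_qualified_function_name.len()
}

/// A hypothetical entry; the path and function name are invented.
fn example_entry() -> DebugInfo {
    DebugInfo {
        file: String::from("src/path/to/module.rs"),
        line: 42,
        fully_qualified_function_name: String::from("my_crate::path::to::function"),
    }
}
```

Calling `approx_entry_bytes(&example_entry())` gives a figure well above the 15-byte maximum length of an x86 instruction, and that is for a single instruction address.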

Sometimes you can save space by trading off accuracy and only encoding "instructions 0x05001-0x05100 are within function my_crate::path::to::function", which is still enough to make backtraces useful. This is where the various levels of the -C debuginfo flag come from.
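The range-based idea can be sketched as a sorted table searched by address. This is only an illustration of the trade-off, not how any real debug format is laid out; `FunctionRange`, `lookup`, `sample_ranges`, and the second function name are hypothetical.

```rust
/// A half-open range of instruction addresses belonging to one function.
struct FunctionRange {
    start: u64,
    end: u64, // exclusive
    name: &'static str,
}

/// Finds the function containing `addr`.
/// `ranges` must be sorted by `start` and must not overlap.
fn lookup(ranges: &[FunctionRange], addr: u64) -> Option<&'static str> {
    // Index of the first range that starts after `addr`.
    let idx = ranges.partition_point(|r| r.start <= addr);
    // The only candidate is the range just before that index, if any.
    let candidate = ranges.get(idx.checked_sub(1)?)?;
    (addr < candidate.end).then_some(candidate.name)
}

/// Sample table using the address range from the post (0x05001-0x05100 inclusive).
fn sample_ranges() -> Vec<FunctionRange> {
    vec![
        FunctionRange { start: 0x05001, end: 0x05101, name: "my_crate::path::to::function" },
        FunctionRange { start: 0x05101, end: 0x05200, name: "my_crate::other_function" },
    ]
}
```

One table entry now covers a whole function instead of a single instruction, so the table shrinks dramatically, at the cost of no longer knowing the file and line for a given address.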
