The binary built by cargo in debug mode is over 1 GB

We have a project with tens of thousands of lines of code that depends mainly on tokio, hyper, and rustls. The compiled binary is only 30 MB in release mode, but over 1 GB in debug mode.

When I analyzed the size of the binary, I found that the binary was full of debugging information.

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         00000000000002e0  000002e0
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.gnu.build-i NOTE             00000000000002fc  000002fc
       0000000000000024  0000000000000000   A       0     0     4
  [ 3] .note.ABI-tag     NOTE             0000000000000320  00000320
       0000000000000020  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000000340  00000340
       0000000000000034  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           0000000000000378  00000378
       0000000000001b90  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           0000000000001f08  00001f08
       0000000000000e17  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           0000000000002d20  00002d20
       000000000000024c  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          0000000000002f70  00002f70
       0000000000000210  0000000000000000   A       6     7     8
  [ 9] .rela.dyn         RELA             0000000000003180  00003180
       000000000027c468  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             000000000027f5e8  0027f5e8
       00000000000013e0  0000000000000018  AI       5    26     8
  [11] .init             PROGBITS         0000000000281000  00281000
       000000000000001c  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         0000000000281020  00281020
       0000000000000d50  0000000000000010  AX       0     0     16
  [13] .text             PROGBITS         0000000000282000  00282000
       00000000035494fc  0000000000000000  AX       0     0     4096
  [14] .fini             PROGBITS         00000000037cb4fc  037cb4fc
       0000000000000009  0000000000000000  AX       0     0     4
  [15] .rodata           PROGBITS         00000000037cc000  037cc000
       0000000000231b94  0000000000000000   A       0     0     4096
  [16] .debug_gdb_script PROGBITS         00000000039fdb94  039fdb94
       0000000000000022  0000000000000001 AMS       0     0     1
  [17] .eh_frame_hdr     PROGBITS         00000000039fdbb8  039fdbb8
       00000000001a793c  0000000000000000   A       0     0     4
  [18] .eh_frame         PROGBITS         0000000003ba54f8  03ba54f8
  [19] .gcc_except_table PROGBITS         0000000004197fe8  04197fe8
       000000000020f448  0000000000000000   A       0     0     4
  [20] .tdata            PROGBITS         00000000043a86b0  043a76b0
       0000000000000a70  0000000000000000 WAT       0     0     8
  [21] .tbss             NOBITS           00000000043a9120  043a8120
       0000000000000208  0000000000000000 WAT       0     0     8
  [22] .init_array       INIT_ARRAY       00000000043a9120  043a8120
       0000000000000018  0000000000000008  WA       0     0     8
  [23] .fini_array       FINI_ARRAY       00000000043a9138  043a8138
       0000000000000008  0000000000000008  WA       0     0     8
  [24] .data.rel.ro      PROGBITS         00000000043a9140  043a8140
       0000000000140c40  0000000000000000  WA       0     0     32
  [25] .dynamic          DYNAMIC          00000000044e9d80  044e8d80
       0000000000000250  0000000000000010  WA       6     0     8
  [26] .got              PROGBITS         00000000044e9fd0  044e8fd0
       000000000004a030  0000000000000008  WA       0     0     8
  [27] .data             PROGBITS         0000000004534000  04533000
       000000000000ad00  0000000000000000  WA       0     0     32
  [28] .bss              NOBITS           000000000453ed00  0453dd00
       000000000020f290  0000000000000000  WA       0     0     64
  [29] .comment          PROGBITS         0000000000000000  0453dd00
       0000000000000026  0000000000000001  MS       0     0     1
  [30] .gnu.build.attrib NOTE             000000000474ff90  0453dd28
       0000000000000120  0000000000000000           0     0     4
  [31] .debug_aranges    PROGBITS         0000000000000000  0453de50
       00000000003c6fd0  0000000000000000           0     0     16
  [32] .debug_pubnames   PROGBITS         0000000000000000  04904e20
       000000000d32bd27  0000000000000000           0     0     1
  [33] .debug_info       PROGBITS         0000000000000000  11c30b47
       0000000009534f87  0000000000000000           0     0     1
  [34] .debug_abbrev     PROGBITS         0000000000000000  1b165ace
       0000000000201216  0000000000000000           0     0     1
  [35] .debug_line       PROGBITS         0000000000000000  1b366ce4
       00000000015564ef  0000000000000000           0     0     1
  [36] .debug_frame      PROGBITS         0000000000000000  1c8bd1d8
       0000000000000030  0000000000000000           0     0     8
  [37] .debug_str        PROGBITS         0000000000000000  1c8bd208
       000000001305ad4e  0000000000000001  MS       0     0     1
  [38] .debug_loc        PROGBITS         0000000000000000  2f917f56
  [39] .debug_pubtypes   PROGBITS         0000000000000000  2fb707ff
       000000001c2b5aff  0000000000000000           0     0     1
  [40] .debug_ranges     PROGBITS         0000000000000000  4be262fe
       0000000000aa2e20  0000000000000000           0     0     1
  [41] .debug_macro      PROGBITS         0000000000000000  4c8c911e
       000000000003e001  0000000000000000           0     0     1
  [42] .symtab           SYMTAB           0000000000000000  4c907120
       0000000000667c38  0000000000000018          43   197674     8
  [43] .strtab           STRTAB           0000000000000000  4cf6ed58
       0000000001e1b6be  0000000000000000           0     0     1
  [44] .shstrtab         STRTAB           0000000000000000  4ed8a416
       00000000000001e2  0000000000000000           0     0     1
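
As a rough check of how much of the file is debug info, the sizes in the table can be summed. This is a minimal sketch: the hex sizes are copied from the output above, the .debug_loc entry is skipped because its size line is not shown, and the names DEBUG_SECTIONS, TEXT_SIZE, and debug_total are made up for illustration.

```rust
// Sizes of the .debug_* sections, copied from the readelf output above.
// .debug_loc is left out because its size is not shown in the listing.
const DEBUG_SECTIONS: &[(&str, u64)] = &[
    (".debug_aranges", 0x3c6fd0),
    (".debug_pubnames", 0xd32bd27),
    (".debug_info", 0x9534f87),
    (".debug_abbrev", 0x201216),
    (".debug_line", 0x15564ef),
    (".debug_frame", 0x30),
    (".debug_str", 0x1305ad4e),
    (".debug_pubtypes", 0x1c2b5aff),
    (".debug_ranges", 0xaa2e20),
    (".debug_macro", 0x3e001),
];

// Size of the .text section (the machine code) from the same output.
const TEXT_SIZE: u64 = 0x35494fc;

// Sum of all listed debug section sizes, in bytes.
fn debug_total() -> u64 {
    DEBUG_SECTIONS.iter().map(|&(_, size)| size).sum()
}

fn main() {
    let mib = |bytes: u64| bytes as f64 / (1024.0 * 1024.0);
    for &(name, size) in DEBUG_SECTIONS {
        println!("{name:<18} {:>10.1} MiB", mib(size));
    }
    println!("debug total        {:>10.1} MiB", mib(debug_total()));
    println!(".text              {:>10.1} MiB", mib(TEXT_SIZE));
}
```

The listed debug sections come to more than twenty times the size of .text, with .debug_pubtypes, .debug_str, .debug_pubnames, and .debug_info as the largest contributors.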


Using cargo bloat, I found a large number of duplicate symbols in the debugging information, from both std and other crates.

image

That's “works as expected”, more or less.

What's the question?


Regarding the “duplicate” symbols: judging by the screenshots, this should be understood not as redundancy but as a result of how generics and monomorphization work. The function

fn drop<T>(_x: T) {}

is generic, and the compiler does (naively, in debug mode) implement this by generating separate machine code for every type T that this function is ever called with. Similarly for a function like

impl<F, B, E> H2Stream<F, B>
where
    F: Future<Output = Result<Response<B>, E>>,
    B: Body,
    B::Data: 'static,
    B::Error: Into<Box<dyn StdError + Send + Sync>>,
    E: Into<Box<dyn StdError + Send + Sync>>,
{
    fn poll2(self: Pin<&mut Self>, cx: &mut task::Context<'_>) -> Poll<crate::Result<()>> { … }
}

you will get new, separate, machine code for every type of future F that this method is used with.
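
To make the separate instantiations visible, here is a minimal sketch. The function `consume` is a hypothetical stand-in for `drop`, and `std::any::type_name` is used only to show that each call site resolves to its own instantiation; it does not measure code or debug info size.

```rust
use std::any::type_name;

// Hypothetical stand-in for `drop`: generic over T, takes ownership of `_x`.
// The compiler instantiates it once per concrete T it is called with.
fn consume<T>(_x: T) -> &'static str {
    // Reports the T of the instantiation that is currently running.
    type_name::<T>()
}

fn main() {
    // Three call sites with three different types give three instantiations,
    // and in a debug build each one gets its own symbol and debug info.
    let a = consume(1u8);
    let b = consume(String::from("x"));
    let c = consume(vec![1.0f64]);
    println!("{a}\n{b}\n{c}");
}
```

A crate like hyper instantiates its generics with many large future types, so the same effect is multiplied across every generic function those types flow through.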

In optimized builds, drop, for example, will always become redundant because it’s trivially inlined. For the poll2 example it’s hard to tell what optimization will do, but for many of those functions (and many functions in general), either the function itself, or things it calls, or some of its call sites will get inlined, so the number of symbols will certainly drop dramatically.

I’m not sure, but I may also have read that there can be true duplication of machine code if the same instantiation of a generic is needed in multiple compilation units (another trade-off to speed up compilation, presumably). Maybe I’m misremembering, though.


We are not talking about machine code here, though, but about debug info. ICF (identical code folding) is supposed to remove code duplication, but debug info, of course, can't be merged: even if a single function is produced from 100 different sources, the debugger still needs to know about all of them.

And I'm not even sure ICF is supposed to be enabled for debug builds at all! It's a nice idea to remove code duplication in a release build, but for a debug build I'm not sure it's even desirable.


Is that causing a problem? Slow to load under a debugger? Setting breakpoints takes more than the blink of an eye?

Mainly because transferring the binary to the development environment costs me a lot of time. I vaguely remember that, a long time ago, the build wasn't this big.

I can remove the debugging information through configuration in Cargo.toml, but I feel that removing it does not fit the semantics of a debug target.
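
For reference, a sketch of the Cargo.toml profile settings that reduce debug info without dropping it entirely. These are standard Cargo profile keys, but which combination suits this project is an assumption, and `line-tables-only` needs a reasonably recent toolchain.

```toml
[profile.dev]
# Keep only file/line information (enough for backtraces), not full
# variable and type descriptions. `debug = true` is the dev default.
debug = "line-tables-only"

# Alternative: keep full debug info, but store it in separate files
# next to the build artifacts instead of inside the executable.
# split-debuginfo = "unpacked"

[profile.dev.package."*"]
# Apply to all dependencies (tokio, hyper, rustls, ...), but not to the
# workspace's own crates: no debug info for them at all.
debug = false
```

With the dependency override, the project's own code can still be stepped through in a debugger, while third-party generics no longer contribute debug sections.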