Question on `librustc_symbol_mangling`

lonelyjoey1905 · May 15, 2020, 2:34am

Hello,

I've started reading the Rust compiler's source files, trying to better understand the internals of Rust.

I have a quick question regarding a doc comment block from src/librustc_symbol_mangling/lib.rs.

Here is the doc comment block:

//! The main tool for avoiding naming conflicts is the incorporation of a 64-bit
//! hash value into every exported symbol name. Anything that makes a difference
//! to the symbol being named, but does not show up in the regular path needs to
//! be fed into this hash:
//!
//! - Different monomorphizations of the same item have the same path but differ
//!   in their concrete type parameters, so these parameters are part of the
//!   data being digested for the symbol hash.
//!
//! - Rust allows items to be defined in anonymous scopes, such as in
//!   `fn foo() { { fn bar() {} } { fn bar() {} } }`. Both `bar` functions have
//!   the path `foo::bar`, since the anonymous scopes do not contribute to the
//!   path of an item. The compiler already handles this case via so-called
//!   disambiguating `DefPaths` which use indices to distinguish items with the
//!   same name. The DefPaths of the functions above are thus `foo[0]::bar[0]`
//!   and `foo[0]::bar[1]`. In order to incorporate this disambiguation
//!   information into the symbol name too, these indices are fed into the
//!   symbol hash, so that the above two symbols would end up with different
//!   hash values.

My question:
It is mentioned here that rustc disambiguates DefPaths in anonymous scopes using indices.
Since these indices already disambiguate the two fn bar() items from each other, is it necessary to also feed them into the symbol hash just to generate unique names? Or is there another benefit gained by feeding the already-unique indices into the symbol hash?

Thank you very much for reading!

Cerber-Ursi · May 15, 2020, 3:25am

During linking, all symbol names must be unique. The linker doesn't know anything about scopes; it just gets a flat list of symbols.
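To illustrate the flat-namespace point, here is a minimal sketch that assumes a linker can be modeled as a set of symbol strings. The helper `find_duplicate` is hypothetical and only for illustration; a real linker does far more than this.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a linker's flat symbol table:
/// returns the first symbol name that appears more than once.
fn find_duplicate<'a>(symbols: &[&'a str]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    for &sym in symbols {
        // `insert` returns false when the name was already present.
        if !seen.insert(sym) {
            return Some(sym);
        }
    }
    None
}

fn main() {
    // Using only the plain path, both `bar` functions collide.
    let by_path = ["foo::bar", "foo::bar"];
    println!("plain paths: {:?}", find_duplicate(&by_path));

    // With some disambiguating information in the name, they do not.
    let disambiguated = ["foo[0]::bar[0]", "foo[0]::bar[1]"];
    println!("disambiguated: {:?}", find_duplicate(&disambiguated));
}
```

Whatever scheme is used, the names handed to the linker have to pass a check of this kind, because the scope information is gone by then.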

lonelyjoey1905 · May 15, 2020, 3:37am

I assume that is why the compiler performs name mangling before linking, right?

Referring to the example explained in the doc comment block,

fn foo() { { fn bar() {} } { fn bar() {} } }

Since the compiler disambiguates DefPaths using indices,
the two fn bar()s are differentiated by the compiler as
foo[0]::bar[0] & foo[0]::bar[1].

My question is: instead of feeding these indices into the symbol hash function, couldn't these representations be used directly as the symbols passed to the linker (without the extra call to the hash function)?

daboross · May 15, 2020, 5:58am

As I understand it, the key reason is that if the indices were included directly, you'd have to design your mangling scheme with some way to specify "index 0", "index 1", etc., and that encoding would have to not conflict with any other symbols. Having a catch-all hash means that the scheme can be that much less complicated, and tooling consuming the hashed symbols doesn't have to handle every edge case.

Note that they mention the indices, but that's not the only thing that ends up in the hash. There are other random bits of information, like the crate version and source, which also get globbed in. I think the indices are more just an example of what kind of thing ends up in the hash.
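As a rough sketch of the "catch-all hash" idea (not rustc's actual implementation), the code below feeds a crate name, a crate version, and a disambiguated path into one hash and appends it to the readable path. The `Segment` type, the `symbol_hash` and `mangle` functions, and the output format are all assumptions made for illustration, and `DefaultHasher` is only a stand-in for whatever hasher the compiler really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical path segment with its disambiguating index, e.g. `bar[1]`.
#[derive(Hash)]
struct Segment<'a> {
    name: &'a str,
    disambiguator: u32,
}

/// Digest everything that distinguishes the item but is not visible
/// in the readable path (illustrative inputs only).
fn symbol_hash(crate_name: &str, crate_version: &str, path: &[Segment<'_>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    crate_name.hash(&mut hasher);
    crate_version.hash(&mut hasher);
    path.hash(&mut hasher); // includes the disambiguator of every segment
    hasher.finish()
}

/// Readable path plus a hex hash suffix (an assumed format for this sketch).
fn mangle(crate_name: &str, crate_version: &str, path: &[Segment<'_>]) -> String {
    let readable: Vec<&str> = path.iter().map(|s| s.name).collect();
    let hash = symbol_hash(crate_name, crate_version, path);
    format!("{}::h{:016x}", readable.join("::"), hash)
}

fn main() {
    let first = [
        Segment { name: "foo", disambiguator: 0 },
        Segment { name: "bar", disambiguator: 0 },
    ];
    let second = [
        Segment { name: "foo", disambiguator: 0 },
        Segment { name: "bar", disambiguator: 1 },
    ];
    // Same readable prefix `foo::bar`, but the suffixes differ.
    println!("{}", mangle("mycrate", "0.1.0", &first));
    println!("{}", mangle("mycrate", "0.1.0", &second));
}
```

The name format never needs its own syntax for indices, versions, or anything else added later: each new distinguishing input is just one more value fed into the hasher.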

For background on designing mangling schemes, I recommend reading the current symbol mangling RFC, and its corresponding RFC PR discussion, in which there's a lot of discussion of the pros & cons of different symbol mangling schemes.

lonelyjoey1905 · May 18, 2020, 2:27pm

Thank you for the pointers! I'll take a look at them right away.

system · August 16, 2020, 2:27pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
