I thought that the address of a function was guaranteed to be unique, including when dynamically loading a library. I thought that the runtime dynamic linker would relocate all the addresses of the functions of the dynamically loaded library and all calls to those functions.
If linker tricks do not work across dylibs, how can a function pointer work if it points to a function that is dynamically loaded from another module?
EDIT: @bjorn3 I’m not sure you saw my question since I’ve changed to the user forum to not pollute the thread on internals
if you identify a function by the (mangled) symbol name, uniqueness only holds within the same dylib. since each dylib is built individually, it is not possible to check uniqueness globally; conflicts can only be detected at load time.
it mainly depends on the loading order (e.g. LD_LIBRARY_PATH, LD_PRELOAD), and many other factors, such as global vs local symbols, weak symbols, versioned symbols, lazy bindings, etc, but generally, when multiple modules have conflicting symbols, the dynamic linker would bind the global symbol table entry to the first resolved one (thus the main binary typically has the highest precedence), at least this is the case for the commonly used dynamic linkers like glibc and musl. you can also read the various flags on the manpage for dlopen(), ld.so, etc.
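To make the "first resolved one wins" rule concrete, here is a toy model in Rust. It is only a sketch of the lookup order described above, not how glibc or musl are implemented; the `Module` type, the `resolve` function, the module names and the addresses are all made up for illustration.

```rust
use std::collections::HashMap;

/// A loaded module (main binary or shared object) with its exported symbols.
/// The "addresses" are invented numbers, purely for illustration.
struct Module {
    name: &'static str,
    exports: HashMap<&'static str, usize>,
}

/// Scan the modules in load order and return the first definition found,
/// together with the name of the module that provided it.
fn resolve(search_order: &[Module], symbol: &str) -> Option<(&'static str, usize)> {
    search_order
        .iter()
        .find_map(|m| m.exports.get(symbol).map(|&addr| (m.name, addr)))
}

/// Hypothetical load order: the main binary first, then libA.so, then libB.so.
fn demo_modules() -> Vec<Module> {
    vec![
        Module { name: "main", exports: HashMap::from([("bar", 0x1000)]) },
        Module { name: "libA.so", exports: HashMap::from([("foo", 0x2000)]) },
        Module { name: "libB.so", exports: HashMap::from([("foo", 0x3000), ("bar", 0x3100)]) },
    ]
}

fn main() {
    let modules = demo_modules();
    // `foo` is defined by both libraries; the earlier one in load order wins.
    println!("{:?}", resolve(&modules, "foo"));
    // `bar` is defined by the main binary, which shadows libB.so's copy.
    println!("{:?}", resolve(&modules, "bar"));
    // `baz` is defined nowhere, so it stays unresolved.
    println!("{:?}", resolve(&modules, "baz"));
}
```

Swapping the order of `libA.so` and `libB.so` in `demo_modules` changes which `foo` is picked, which is the loading-order dependence in miniature. Real dynamic linkers add weak symbols, versioning, local scopes and so on, none of which is modeled here.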
Oh! So if I have a globally visible function foo() in crate A and another one in crate B with a different implementation, and they have the exact same mangled name (most probably because they are #[no_mangle]), then if both are linked statically I will get a linker error, while if I load at least one of the two crates dynamically, the dynamic linker will happily bind all calls to either the one from crate A or the one from crate B.
Did I get that right?
And if I want a symbol to be guaranteed unique: if this symbol is private and uses the default Rust mangling (no #[no_mangle]), I assume that its mangled name will be unique, and thus it's guaranteed that its address is unique.
There is no such guarantee. We tell LLVM that it is free to duplicate and (if they have the same behavior) merge functions. Generic functions are often instantiated multiple times (either marked as private to the object file or with their name mangled depending on where they were instantiated, to prevent symbol conflicts), and identical code folding deduplicates functions.
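A small sketch of what this means in practice. The two functions below are hypothetical and have identical bodies, so the optimizer or linker may or may not fold them into one; the helper `addr_of_fn` is likewise just for illustration. The point is that the result of the comparison is not something a program can rely on.

```rust
// Two distinct functions with identical bodies. The compiler or linker is
// allowed to merge them into a single piece of machine code, or to keep them apart.
fn add_one_a(x: u32) -> u32 {
    x + 1
}

fn add_one_b(x: u32) -> u32 {
    x + 1
}

/// Returns the address that a function pointer holds, as an integer.
fn addr_of_fn(f: fn(u32) -> u32) -> usize {
    f as usize
}

fn main() {
    let a = addr_of_fn(add_one_a);
    let b = addr_of_fn(add_one_b);
    // Either outcome is permitted and it may change with optimization level,
    // codegen units, or linker flags. Do not use it as an identity check.
    println!("same address: {}", a == b);
    // What is reliable is the behavior of the functions themselves.
    println!("same behavior: {}", add_one_a(41) == add_one_b(41));
}
```

The reverse can also happen: the same function may end up with more than one address, for example when a generic function is instantiated in several places.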
That's a common misconception among people who only know C from books but have no idea what actually happens inside a program.
But if you read the appropriate article, you would know that the “normal” way of calling a function that belongs to your own dynamic library and the way of calling a function that belongs to some other dynamic library use different code (and thus different addresses).
But, of course, C declares that there needs to be only one address for any function. That means that the dynamic linker creates yet another code sequence (for functions that are exposed as function pointers in your program) that can be used from any place. It also ensures that if you have two identically named functions in different modules, only one of them is ever used.
But if you want to make your program faster and smaller, then you may relax these restrictions. And Rust, being a new language, has decided to do that for all functions.
Thanks for the clarification. I’ve indeed completely forgotten about the ODR (one definition rule), which by its mere existence implies that a function can be instantiated multiple times, and that it’s fine as long as all instantiations have the same observable behavior.
Which therefore means that we can’t use the address of a function as a deterministic pseudo-random number.
it depends on how you linked your program, but mostly yes, this is the common behavior, at least for the GNU linker and glibc dynamic linker.
in the ELF format, the unresolved symbols and the dynamically linked libraries are in different tables, typically in sections named .dynsym and .dynamic.
at link time, when the linker finds a symbol in a static library, the static library's relevant sections are copied into the output file, and the symbol is resolved to the address.
when a symbol is found in a dynamic library, the linker simply puts the symbol name into the .dynsym table as an SHN_UNDEF entry, adds the shared library to the .dynamic table as a DT_NEEDED entry, and records other metadata required for dynamic linking, such as the GOT, PLT, etc.
there are flags to control the linker behavior, e.g. what to do if an external symbol is not found in any of the libraries? what to do if multiple libraries contain conflicting symbols? see GNU ld's manual, e.g. on --unresolved-symbols=method, --allow-shlib-undefined, -z muldefs, etc.
at load time, the dynamic linker walks the DT_NEEDED entries, recursively resolves the exported symbols, and binds the SHN_UNDEF entries to the resolved addresses.
however, unless you use extensions like glibc's symbol versioning, the ELF format doesn't provide a mechanism to guarantee, say, that symbol foo must be resolved to libA.so instead of libB.so, so it completely depends on the loading order used by the dynamic linker, and that's why conflicting symbols from different libraries can cause very nasty bugs.
also note, your statement "at least one of" is not the accurate way to put it. if one of the libraries is statically linked, the symbol will be resolved to the static library at link time, and the dynamic linker simply will not bind it at load time, because the symbol is not put into the .dynsym table by the linker in the first place.
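The two phases can be modeled with a few lines of Rust. This is a deliberately simplified sketch of the description above, not a picture of what ld or ld.so really do internally; the function names `static_link` and `dynamic_bind`, the symbol names and the addresses are invented for the example.

```rust
use std::collections::{HashMap, HashSet};

/// Link-time step: symbols defined by static libraries are resolved right away;
/// everything else is recorded as undefined, to be bound by the dynamic linker.
fn static_link(
    referenced: &[&'static str],
    static_defs: &HashMap<&'static str, usize>,
) -> (HashMap<&'static str, usize>, HashSet<&'static str>) {
    let mut resolved = HashMap::new();
    let mut undefined = HashSet::new();
    for &sym in referenced {
        match static_defs.get(sym) {
            Some(&addr) => {
                resolved.insert(sym, addr);
            }
            None => {
                undefined.insert(sym);
            }
        }
    }
    (resolved, undefined)
}

/// Load-time step: only the undefined entries are looked up in shared libraries.
fn dynamic_bind(
    undefined: &HashSet<&'static str>,
    shared_defs: &HashMap<&'static str, usize>,
) -> HashMap<&'static str, usize> {
    undefined
        .iter()
        .filter_map(|&sym| shared_defs.get(sym).map(|&addr| (sym, addr)))
        .collect()
}

/// Hypothetical scenario: `foo` exists in both a static and a shared library,
/// `bar` exists only in the shared library.
fn demo() -> (HashMap<&'static str, usize>, HashMap<&'static str, usize>) {
    let static_defs = HashMap::from([("foo", 0x1000)]);
    let shared_defs = HashMap::from([("foo", 0x2000), ("bar", 0x2100)]);
    let (resolved, undefined) = static_link(&["foo", "bar"], &static_defs);
    let bound = dynamic_bind(&undefined, &shared_defs);
    (resolved, bound)
}

fn main() {
    let (resolved, bound) = demo();
    println!("resolved at link time: {:?}", resolved);
    println!("bound at load time:    {:?}", bound);
}
```

In this model `foo` never reaches the load-time step, so the shared library's `foo` is simply ignored, while `bar` is bound from the shared library.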
yes, rust symbols are mangled uniquely;
no, the address is NOT guaranteed to be unique.
you can't assume address as identity in rust, as already mentioned by others.
Use a static item to create unique addresses. That's the primary defining difference from const (aside from how that unique address can be used for interior mutability).
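A minimal sketch of that suggestion, assuming non-zero-sized statics and a single copy of the defining crate in the process (if the crate is linked into several dylibs, each copy may get its own static). The names `TOKEN_A`, `TOKEN_B` and `id_of` are made up for the example.

```rust
// Each `static` item is one allocation in the program, so two different
// (non-zero-sized) statics have different addresses even with equal values.
static TOKEN_A: u8 = 0;
static TOKEN_B: u8 = 0;

/// Address of a static, as an integer, usable as an identity token.
fn id_of(token: &'static u8) -> usize {
    token as *const u8 as usize
}

fn main() {
    // Every reference to the same static yields the same address.
    println!("A == A: {}", id_of(&TOKEN_A) == id_of(&TOKEN_A));
    // Different statics yield different addresses.
    println!("A == B: {}", id_of(&TOKEN_A) == id_of(&TOKEN_B));
}
```

Note that the addresses themselves can still change from run to run (for example because of ASLR), so they identify an item within one process rather than acting as stable numbers.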