Why do function pointers wind up in `.data` instead of `.rodata`?

I have been doing some quick testing to see how Rust sorts data into D or R sections. A subset of these tests is below:

/// Struct with only primitives
pub struct StBasic { pub a: u32, pub b: u32 }

/// Struct with an optional value
pub struct StOpt { pub a: u32, pub b: Option<u32> }

/// Struct with a function pointer
pub struct StWithOptFn { pub a: u32, pub b: Option<fn() -> u32> }

#[no_mangle]
pub static testsym_stbasic: StBasic = StBasic { a: 1234, b: 12345 };

#[no_mangle]
pub static testsym_stopt: StOpt = StOpt { a: 6323, b: Some(1929) };

#[no_mangle]
pub static testsym_stoptfn: StWithOptFn = StWithOptFn { a: 9123, b: Some(make_const_int) };

#[no_mangle]
#[link_section = ".rodata"]
pub static testsym_stoptfn_force_ro: StWithOptFn = StWithOptFn { a: 9123, b: Some(make_const_int) };

And this is the output from nm:

0000000000000000 R testsym_stbasic
0000000000000000 R testsym_stopt
0000000000000000 D testsym_stoptfn
0000000000000008 D testsym_stoptfn_force_ro

Most is as expected, but why is it that the struct with the function pointer gets placed in .data instead of .rodata? Is it possible to force it to be in .rodata somehow, since it seems like link_section does not work in this case?

Everything else seems to follow the rule of being in .rodata unless there is an UnsafeCell or static mut is used.

Full test
use std::cell::UnsafeCell;

#[no_mangle]
pub static testsym_u32: u32 = 100;

#[no_mangle]
pub static mut testsym_u32_mut: u32 = 500;

#[no_mangle]
#[link_section = ".rodata"]
pub static mut testsym_u32_force_ro: u32 = 1234;

#[no_mangle]
pub static testsym_u32_from_fn: u32 = make_const_int();

#[no_mangle]
pub static testsym_u32_unsafecell: UnsafeWrapper<u32> = UnsafeWrapper(UnsafeCell::new(1000));


#[no_mangle]
pub static testsym_str: &str = "Hello, world!";


#[no_mangle]
pub static testsym_stbasic: StBasic = StBasic { a: 1234, b: 12345 };

#[no_mangle]
pub static testsym_stbasic_from_fn: StBasic = make_stbasic();

#[no_mangle]
pub static testsym_stopt: StOpt = StOpt { a: 6323, b: Some(1929) };

#[no_mangle]
pub static testsym_stopt_from_fn: StOpt = make_stopt();

#[no_mangle]
pub static testsym_stoptfn: StWithOptFn = StWithOptFn { a: 9123, b: Some(make_const_int) };

#[no_mangle]
#[link_section = ".rodata"]
pub static testsym_stoptfn_force_ro: StWithOptFn = StWithOptFn { a: 9123, b: Some(make_const_int) };

#[no_mangle]
pub static testsym_stoptfn_from_fn: StWithOptFn = make_stoptfn();


pub struct StBasic { pub a: u32, pub b: u32 }
pub struct StOpt { pub a: u32, pub b: Option<u32> }
pub struct StWithOptFn { pub a: u32, pub b: Option<fn() -> u32> }
pub struct UnsafeWrapper<T>(UnsafeCell<T>);
unsafe impl<T> Sync for UnsafeWrapper<T> {}


const fn make_const_int() -> u32 {
    20493
}

const fn make_stbasic() -> StBasic {
    StBasic { a: 12342, b: 27565 }
}

const fn make_stopt() -> StOpt {
    StOpt { a: 3334, b: Some(12111) }
}

const fn make_stoptfn() -> StWithOptFn {
    StWithOptFn { a: 9384, b: Some(make_const_int) }
}

Commands

rustc test.rs --crate-type=staticlib -O
nm -gC libtest.a | grep testsym

Output

0000000000000000 R testsym_stbasic
0000000000000000 R testsym_stbasic_from_fn
0000000000000000 R testsym_stopt
0000000000000000 R testsym_stopt_from_fn
0000000000000000 D testsym_stoptfn
0000000000000008 D testsym_stoptfn_force_ro
0000000000000000 D testsym_stoptfn_from_fn
0000000000000000 D testsym_str
0000000000000000 R testsym_u32
0000000000000000 D testsym_u32_force_ro
0000000000000000 R testsym_u32_from_fn
0000000000000000 D testsym_u32_mut
0000000000000000 D testsym_u32_unsafecell

I want to say that, to be an actual function pointer, you need to make them references...

Does that make any difference?

fn is an actual function pointer. Maybe you were thinking of the trait Fn?
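
As a quick check of that distinction, here is a minimal sketch (the names `answer` and `call_via_trait` are made up for illustration). It shows that `fn() -> u32` is already a pointer-sized function pointer type, that `Option<fn() -> u32>` stays pointer-sized because `None` uses the null niche, and that `Fn` is a trait which plain function pointers happen to implement.

```rust
use std::mem::size_of;

// Hypothetical function, used only for illustration.
fn answer() -> u32 {
    20493
}

// `Fn` is a trait; a plain `fn` pointer implements it.
fn call_via_trait(f: impl Fn() -> u32) -> u32 {
    f()
}

fn main() {
    // `fn() -> u32` is a function pointer type; no reference is needed.
    let p: fn() -> u32 = answer;
    println!("direct call:  {}", p());
    println!("via Fn trait: {}", call_via_trait(p));

    // Both types are one pointer wide.
    println!("size_of fn pointer:         {}", size_of::<fn() -> u32>());
    println!("size_of Option<fn pointer>: {}", size_of::<Option<fn() -> u32>>());
}
```

This also matches the 16-byte size that readelf reports for the StWithOptFn statics on x86-64: a u32, padding, and one 8-byte pointer.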


I'm not sure, but my guess is that this is related to relocations, specifically position independent code. In code this is normally solved via program counter relative addressing. But for actual function pointers that won't work (at least on x86-64). So the runtime linker needs to be able to rewrite the addresses at load time.


.rodata can contain relocations.


An interesting observation: it seems like pointers go in R but references go in D, and I haven't found any way to break this rule.

#[no_mangle]
pub static testsym_ptr: MakeSync<*const u32> = MakeSync(std::ptr::null());

#[no_mangle]
pub static testsym_ref: &'static u32 = &testsym_u32;

pub struct MakeSync<T>(T);
unsafe impl<T> Sync for MakeSync<T> {}

0000000000000000 R testsym_ptr
0000000000000000 D testsym_ref

Assuming that fn() pointers are always references, I wonder if there is a way to control that. I would be happy to find a reference explaining this behavior but also haven't come across that.


testsym_ptr has a fixed value without any relocations and as such ends up in .rodata. If it were a pointer to testsym_u32, just like in the reference case, it would also end up in .data.
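
To make that concrete, here is a sketch along the lines of the earlier test. The static `PTR_TO_U32` is hypothetical and I have not run nm on it; the names are upper-cased and `#[no_mangle]` is left out to keep the snippet warning-free, so add the attribute back to inspect the symbols. The null pointer is a plain constant, while a raw pointer to another static needs its address filled in by the linker, exactly like the reference does.

```rust
pub static TESTSYM_U32: u32 = 100;

pub struct MakeSync<T>(pub T);
unsafe impl<T> Sync for MakeSync<T> {}

// Fixed bit pattern (all zeroes): nothing for the linker to patch.
pub static PTR_NULL: MakeSync<*const u32> = MakeSync(std::ptr::null());

// Hypothetical variant: holds the address of another static, so the
// object file has to carry a relocation for this field.
pub static PTR_TO_U32: MakeSync<*const u32> = MakeSync(&TESTSYM_U32 as *const u32);

fn main() {
    println!("PTR_NULL is null: {}", PTR_NULL.0.is_null());
    // Safe to dereference here because it points at a live static.
    println!("PTR_TO_U32 points at: {}", unsafe { *PTR_TO_U32.0 });
}
```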


Thanks for the clarification. Is this documented anywhere, or is there any way to force .rodata?

This comes from a discussion at Rust for Linux (RFL) about wanting to put as many function pointer vtables as possible in .rodata, so the program crashes if malicious software attempts to overwrite them.

Windows allows a running process to alter some page attributes. I suspect Linux has something similar. Have you considered placing those vtables in their own section, then altering that section to be read-only early in the startup? If it works, that could be applied to other things like configuration data. Obviously there would be a vulnerable window between process start and changing the attributes.

Bear in mind I don't know how difficult that would be with Rust. Or if that's even possible.

What you're looking for is probably #[link_section], which you can apply to statics to specify the section of the object file it's placed in.

It's reasonably well known that #[link_name] is unsafe since it can cause name collisions which the linker isn't guaranteed to resolve safely, but this points out (new information to me[1]) that #[link_section] is similarly unsafe, since it can be used to place an item needing relocation into a section that doesn't do relocation. (And perhaps easier to cause problems with, even, since generally link name collisions cause a linker error rather than runtime UB.)


  1. I'm not familiar with linker directives by any means. It might be more obvious to people more familiar with how linkers work. But despite being relatively familiar with the set of known Rust soundness holes, I've not seen this mentioned before (and I have seen #[link_name] mentioned, multiple times).

Rustc enables RELRO by default on ELF which places data which has relocations but is otherwise read-only in the .data.rel.ro section. This section is initially writable, but made read-only by the dynamic linker after applying relocations. COFF/PE and Mach-O allow relocations on read-only segments already afaik, so they don't need anything like RELRO in ELF to make the data read-only after dynamic linking.
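
One way to observe this at run time on Linux is to look up the static's address in /proc/self/maps. The following is only a sketch: the helpers `parse_maps_line` and `perms_for` are made up for illustration, and the printed permissions depend on the target and linker settings. If RELRO is in effect, I would expect the mapping holding the static to show up without the `w` bit.

```rust
use std::fs;

/// Parse one line of /proc/self/maps into (start, end, perms).
fn parse_maps_line(line: &str) -> Option<(usize, usize, String)> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?;
    let (start, end) = range.split_once('-')?;
    let start = usize::from_str_radix(start, 16).ok()?;
    let end = usize::from_str_radix(end, 16).ok()?;
    Some((start, end, perms.to_string()))
}

/// Permissions of the mapping that contains `addr`, if there is one.
fn perms_for(maps: &str, addr: usize) -> Option<String> {
    maps.lines()
        .filter_map(parse_maps_line)
        .find(|(start, end, _)| (*start..*end).contains(&addr))
        .map(|(_, _, perms)| perms)
}

fn make_const_int() -> u32 {
    20493
}

pub struct StWithOptFn {
    pub a: u32,
    pub b: Option<fn() -> u32>,
}

// Same shape as testsym_stoptfn: read-only data that needs a relocation.
static WITH_FN: StWithOptFn = StWithOptFn { a: 9123, b: Some(make_const_int) };

fn main() {
    // Linux only: on other systems the file is missing and we print None.
    let maps = fs::read_to_string("/proc/self/maps").unwrap_or_default();
    let addr = &WITH_FN as *const StWithOptFn as usize;
    println!("static at {:#x}, mapping perms: {:?}", addr, perms_for(&maps, addr));
}
```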


Thanks for the suggestion. I know there are runtime tricks and ELF editing tricks that could accomplish the same thing. I was just wondering if Rust can force .rodata specifically, or at least what determines the section.

I did try #[link_section = ".rodata"]; those are testsym_u32_force_ro and testsym_stoptfn_force_ro in the expanded section at the bottom of my original post. However, it seems to silently fail in this case: both of these symbols are still in D. That probably makes the UB you described unlikely, but it's not very nice that nothing tells you the attribute didn't take effect.

Interesting - does this mean that in the final executable, any objects that are not used in any input file will be placed in .rodata if statically linking, or some form of .write-only-once-otherwise-rodata once loaded if dynamically linking?

I guess in that case maybe nm doesn't tell the full story. The output of readelf -aC libtest.a | grep testsym -C10 is:

Relocation section '.rela.data.rel.ro.testsym_ref' at offset 0x330 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000700000001 R_X86_64_64       0000000000000000 testsym_u32 + 0

Relocation section '.rela.rodata' at offset 0x348 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000008  000200000001 R_X86_64_64       0000000000000000 .text._ZN4test14m[...] + 0

Relocation section '.rela.data.rel.ro.testsym_str' at offset 0x360 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000400000001 R_X86_64_64       0000000000000000 .rodata..Lanon.6a[...] + 0

Relocation section '.rela.data.rel.ro.testsym_stoptfn' at offset 0x378 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000200000001 R_X86_64_64       0000000000000000 .text._ZN4test14m[...] + 0

Relocation section '.rela.data.rel.ro.testsym_stoptfn_from_fn' at offset 0x390 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000200000001 R_X86_64_64       0000000000000000 .text._ZN4test14m[...] + 0

Relocation section '.rela.eh_frame' at offset 0x3a8 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text._ZN4test14m[...] + 0
No processor specific unwind information to decode

Symbol table '.symtab' contains 20 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.cb419c8421d[...]
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 .text._ZN4test14[...]
     3: 0000000000000000     6 FUNC    LOCAL  DEFAULT    3 test::make_const[...]
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT   13 .rodata..Lanon.6[...]
     5: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    4 testsym_ptr
     6: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    5 testsym_ref
     7: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    7 testsym_u32
     8: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    8 testsym_u32_mut
     9: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    9 testsym_u32_force_ro
    10: 0000000000000000     4 OBJECT  GLOBAL DEFAULT   11 testsym_u32_from_fn
    11: 0000000000000000     4 OBJECT  GLOBAL DEFAULT   12 testsym_u32_unsa[...]
    12: 0000000000000000    16 OBJECT  GLOBAL DEFAULT   14 testsym_str
    13: 0000000000000000     8 OBJECT  GLOBAL DEFAULT   16 testsym_stbasic
    14: 0000000000000000     8 OBJECT  GLOBAL DEFAULT   17 testsym_stbasic_[...]
    15: 0000000000000000    12 OBJECT  GLOBAL DEFAULT   18 testsym_stopt
    16: 0000000000000000    12 OBJECT  GLOBAL DEFAULT   19 testsym_stopt_from_fn
    17: 0000000000000000    16 OBJECT  GLOBAL DEFAULT   20 testsym_stoptfn
    18: 0000000000000008    16 OBJECT  GLOBAL DEFAULT    9 testsym_stoptfn_[...]
    19: 0000000000000000    16 OBJECT  GLOBAL DEFAULT   22 testsym_stoptfn_[...]

No version information found in this file.
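
A likely reason nm shows D for the .data.rel.ro symbols: as far as I understand, the letter is derived from the flags of the section a symbol lives in, and .data.rel.ro sections are emitted as allocated and writable so the dynamic linker can patch them. The sketch below is a simplified approximation, not the real binutils code; the function `nm_letter` is made up for illustration, and only the constants come from the ELF specification.

```rust
// ELF section header constants (values from the ELF specification).
const SHT_PROGBITS: u32 = 1;
const SHT_NOBITS: u32 = 8;
const SHF_WRITE: u64 = 0x1;
const SHF_ALLOC: u64 = 0x2;
const SHF_EXECINSTR: u64 = 0x4;

/// Simplified guess at the letter nm prints for a global symbol,
/// based only on the type and flags of its section.
fn nm_letter(sh_type: u32, sh_flags: u64) -> char {
    if sh_flags & SHF_ALLOC == 0 {
        return '?'; // non-loaded sections are out of scope for this sketch
    }
    if sh_flags & SHF_EXECINSTR != 0 {
        'T' // code
    } else if sh_flags & SHF_WRITE != 0 {
        if sh_type == SHT_NOBITS { 'B' } else { 'D' } // writable data
    } else {
        'R' // read-only data
    }
}

fn main() {
    // Flags "A": read-only data.
    println!("alloc only      -> {}", nm_letter(SHT_PROGBITS, SHF_ALLOC));
    // Flags "WA": writable at link time, which is how .data.rel.ro is emitted.
    println!("alloc + write   -> {}", nm_letter(SHT_PROGBITS, SHF_ALLOC | SHF_WRITE));
    println!("alloc + execute -> {}", nm_letter(SHT_PROGBITS, SHF_ALLOC | SHF_EXECINSTR));
}
```

Under that reading, D only says the section is writable in the object file; it does not say whether the page stays writable after loading.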