Reading structures in memory via pointers

Sorry if I'm asking a question that's already been asked, but searching the web didn't yield anything.
My question is this: Let's say I have a structure called Buffer. It looks like this:

pub struct Buffer {
pub id: u64,
pub data: Vec<u8>,
}

In C, if I defined it as this:

struct Buffer {
unsigned long long id;
short* buffer;
};

I could read it from a memory location if I knew where it was at like so:

struct Buffer *buf = (struct Buffer *)0x400000;

I am wondering exactly how I might accomplish something similar in Rust using as little unsafe code as possible (or, if it's not possible to avoid unsafe, just how to do it). My goal is to be able to do something like this without having to create tons of pointers and call .read_volatile() on them 20 or more times (depending on the size of my structure).

My C isn't great so I don't know the exact equivalent of what you wrote. Some unsafe is required, of course, because what you want to do is unsafe. Here's my guess:

fn main() {
    let raw_ptr  =  0x400000 as *mut Buffer;
    let rust_reference: &mut Buffer = unsafe{ raw_ptr.as_mut().unwrap() };
    println!("{:?}", rust_reference)
}

#[derive(Debug)]
pub struct Buffer {
    pub id: u64,
    pub data: Vec<u8>,
}

Don't forget to add #[repr(C)] to the struct! Otherwise Rust may change the order of the fields.

And you can't do it with Vec. It doesn't have a C-compatible representation. The short* field has to stay a raw pointer, *mut i16 (C's short is signed, so it maps to i16 rather than u16).

It's possible to use slice::from_raw_parts to represent C buffers in a more convenient way, but slices are fat pointers (a pointer plus a length), which C doesn't have, so again you can't just put a slice in the struct and cast.

Basically if you're casting a C struct from memory, you can't use any Rust-specific types, only C-compatible types in the same way C would.
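
To make the advice above concrete, here is a minimal sketch of a C-compatible mirror of the struct from the question. The helper `read_id` is a hypothetical name, and the "fixed address" is simulated with a local value so the snippet can run on a normal host; on real hardware you would pass the known address instead.

```rust
/// Rust mirror of the C `struct Buffer` from the question.
/// `#[repr(C)]` pins field order and padding to the C layout rules.
#[repr(C)]
#[derive(Debug)]
pub struct Buffer {
    pub id: u64,          // unsigned long long
    pub buffer: *mut i16, // short*
}

/// Reinterprets `addr` as a `Buffer` and returns its `id` (hypothetical helper).
///
/// # Safety
/// `addr` must be the address of a live, properly aligned `Buffer`.
pub unsafe fn read_id(addr: usize) -> u64 {
    let buf = addr as *const Buffer;
    unsafe { (*buf).id }
}

fn main() {
    // Stand-in for a fixed address such as 0x400000: take the address of a local.
    let mut samples = [1i16, 2, 3];
    let local = Buffer { id: 42, buffer: samples.as_mut_ptr() };
    let addr = &local as *const Buffer as usize;
    let id = unsafe { read_id(addr) };
    println!("id = {id}");
}
```

Only the cast and the dereference need `unsafe`; everything that uses the returned value afterwards is ordinary safe code.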

I'm working in OSDev. I'm looking at a working implementation of AHCI and trying to make something like it in my own OS. The example looks like this:

typedef struct tagFIS_REG_H2D
{
	// DWORD 0
	unsigned char  fis_type;	// FIS_TYPE_REG_H2D
 
	unsigned char  pmport:4;	// Port multiplier
	unsigned char  rsv0:3;		// Reserved
	unsigned char  c:1;		// 1: Command, 0: Control
 
	unsigned char  command;	// Command register
	unsigned char  featurel;	// Feature register, 7:0
 
	// DWORD 1
	unsigned char  lba0;		// LBA low register, 7:0
	unsigned char  lba1;		// LBA mid register, 15:8
	unsigned char  lba2;		// LBA high register, 23:16
	unsigned char  device;		// Device register
 
	// DWORD 2
	unsigned char  lba3;		// LBA register, 31:24
	unsigned char  lba4;		// LBA register, 39:32
	unsigned char  lba5;		// LBA register, 47:40
	unsigned char  featureh;	// Feature register, 15:8
 
	// DWORD 3
	unsigned char  countl;		// Count register, 7:0
	unsigned char  counth;		// Count register, 15:8
	unsigned char  icc;		// Isochronous command completion
	unsigned char  control;	// Control register
 
	// DWORD 4
	unsigned char  rsv1[4];	// Reserved
} FIS_REG_H2D;

This, in Rust, if directly translated, would be something like:

#[derive(Debug)]
pub struct FIS_REG_H2D {
// DWORD 0
pub fis_type: u8,
pub pmport: u8,
pub rsv0: u8,
pub c: u8,
pub command: u8,
pub featurel: u8,
// DWORD 1
pub lba0: u8,
pub lba1: u8,
pub lba2: u8,
pub device: u8,
// DWORD 2
pub lba3: u8,
pub lba4: u8,
pub lba5: u8,
pub featureh: u8,
// DWORD 3
pub countl: u8,
pub counth: u8,
pub icc: u8,
pub control: u8,
// DWORD 4
pub rsv1: [u8; 4],
}

The example creates this structure like this:

FIS_REG_H2D *cmdfis = (FIS_REG_H2D*)(&cmdtbl->cfis);

Assume, for this example, that &cmdtbl->cfis is 0x400000. I'm just curious how I'd make accesses to the structure's fields go to the respective memory locations.

That struct uses bitfields (:4 suffix, etc.), so your mapping is incorrect. These are not unsigned char/u8 fields! You'll have to map multiple bitfields into one integer of appropriate size and then use masks to read them.

See if bindgen can map this structure for you.
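
If you'd rather write the mapping by hand, a rough sketch of the mask approach for the second byte (`pmport:4`, `rsv0:3`, `c:1`) could look like this. The type name `PmportC` and its accessors are made up for illustration, and the bit positions assume the C compiler allocates bitfields starting at the least significant bit, as GCC and Clang do on x86.

```rust
/// Second byte of the H2D register FIS: `pmport:4`, `rsv0:3`, `c:1` packed into one u8.
/// Assumption: bitfields are allocated from the least significant bit upwards.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct PmportC(pub u8);

impl PmportC {
    const PMPORT_MASK: u8 = 0b0000_1111; // bits 0..=3
    const C_BIT: u8 = 0b1000_0000; // bit 7

    /// Port multiplier number (low four bits).
    pub fn pmport(self) -> u8 {
        self.0 & Self::PMPORT_MASK
    }

    /// Sets the port multiplier number; bits above the low four are discarded.
    pub fn set_pmport(&mut self, port: u8) {
        self.0 = (self.0 & !Self::PMPORT_MASK) | (port & Self::PMPORT_MASK);
    }

    /// `true` means Command, `false` means Control.
    pub fn c(self) -> bool {
        (self.0 & Self::C_BIT) != 0
    }

    pub fn set_c(&mut self, command: bool) {
        if command {
            self.0 |= Self::C_BIT;
        } else {
            self.0 &= !Self::C_BIT;
        }
    }
}

fn main() {
    let mut b = PmportC::default();
    b.set_pmport(0x3);
    b.set_c(true);
    println!("{:#010b}", b.0); // prints 0b10000011
}
```

Because the wrapper is `#[repr(transparent)]` over a `u8`, it can replace the three bitfield members in a `#[repr(C)]` struct without changing the struct's size or field offsets.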

It is able to generate code, but that code has several issues:

  1. It refers to std types (even though I've explicitly indicated that it should use core types only).
  2. It creates extremely messy code for custom bitfield manipulation.

I would use this code if it used core types, but it refers to types like c_uchar and such, which do not exist in a no_std environment.

  1. Yup, bitfields are a mess.
  2. I think bindgen has an option to change the prefix for the C types, so you could point it to your own module or https://lib.rs/crates/cty. I vaguely recall that std::os::raw is omitted from core on purpose, because sizes of C types are OS-specific.

That solved part of the problem (using cty), thanks! What should I do about reading data into the struct? If I can get that out of the way that'll make my life so much easier.

What do you mean by reading into the struct?

If the data is in memory under a known address, then casting it to the right pointer type is all you need.

And I would do that like this?

let raw_ptr  = bar_of_ahci_device as *mut HBAPort;
let hba_port: &mut HBAPort = unsafe{ raw_ptr.as_mut().unwrap() };

If this is the way to do it, would I then be able to update the structure and it would automatically update the value at those memory addresses? Or would I need to convert it back into a pointer and then write it with something like write_volatile?

Yes, this is enough. Modifications via Rust reference write directly to memory at that address, same as in C.

as_mut().unwrap() is a check for NULL. If you know it's never NULL, you can shorten it to:

let hba_port = unsafe { &mut *(bar_of_ahci_device as *mut HBAPort) };
hba_port.some_field = value;

Note that &* and &mut * don't actually read from the pointer. C gives the same guarantee. In Rust that's used to change the pointer type from an unsafe raw pointer to a safe reference, which is why the conversion itself has to sit in an unsafe block.

or even:

unsafe { (*(bar_of_ahci_device as *mut HBAPort)).some_field = value; }
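
For comparison, here is a small sketch of the checked variant that keeps the NULL test but returns an `Option` instead of panicking. `HbaPort` is a trimmed-down, hypothetical stand-in for the real register block and `port_at` is a made-up helper name.

```rust
/// Trimmed-down, hypothetical stand-in for the real port register block.
#[repr(C)]
#[derive(Debug, Default)]
pub struct HbaPort {
    pub clb: u32,
    pub clbu: u32,
}

/// Returns `None` for a zero address instead of panicking.
///
/// # Safety
/// A non-zero `addr` must point to a valid, aligned `HbaPort` that nothing
/// else references for the lifetime `'a`.
pub unsafe fn port_at<'a>(addr: usize) -> Option<&'a mut HbaPort> {
    unsafe { (addr as *mut HbaPort).as_mut() }
}

fn main() {
    // Simulate the device memory with a local value.
    let mut fake_device = HbaPort::default();
    let addr = &mut fake_device as *mut HbaPort as usize;

    if let Some(port) = unsafe { port_at(addr) } {
        port.clb = 0x1000; // writes straight into `fake_device`
    }
    println!("{:?}", fake_device);
}
```

Note that `as_mut()` only rules out the null pointer; any other bogus address still has to be excluded by the caller.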

Note that &mut *(global_ptr) is instant UB if another thread, or code further up this call stack, already holds such a reference. If other threads may access the memory, use a lock such as a Mutex to prevent data races. If data races are already taken care of but aliasing within a single thread is not, and assuming HBAPort is POD like the struct you posted, just read the whole struct using ptr::read, modify the field, and write it back with ptr::write. Currently there's no built-in way to obtain a field pointer from a struct pointer without going through a temporary reference, but AFAIK there's an RFC to add &raw to the language.
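
Newer Rust releases do provide a way to get a field pointer without a temporary reference: the `core::ptr::addr_of!` and `core::ptr::addr_of_mut!` macros. The sketch below combines them with volatile accesses, which matter for device registers because the compiler is otherwise free to merge or drop plain loads and stores. `HbaPort`, `set_clb` and `clb` are hypothetical names, and a local value stands in for the device memory.

```rust
/// Trimmed-down, hypothetical stand-in for the real port register block.
#[repr(C)]
pub struct HbaPort {
    pub clb: u32,
    pub clbu: u32,
}

/// Volatile write to the `clb` field without ever creating a `&mut HbaPort`.
///
/// # Safety
/// `port` must point to a valid, aligned `HbaPort`.
pub unsafe fn set_clb(port: *mut HbaPort, value: u32) {
    unsafe { core::ptr::addr_of_mut!((*port).clb).write_volatile(value) }
}

/// Volatile read of the `clb` field.
///
/// # Safety
/// `port` must point to a valid, aligned `HbaPort`.
pub unsafe fn clb(port: *const HbaPort) -> u32 {
    unsafe { core::ptr::addr_of!((*port).clb).read_volatile() }
}

fn main() {
    // Simulate the device memory with a local value.
    let mut fake_device = HbaPort { clb: 0, clbu: 0 };
    let port = &mut fake_device as *mut HbaPort;
    unsafe {
        set_clb(port, 0x0040_0000);
        println!("clb = {:#x}", clb(port));
    }
}
```

Since no reference to the whole struct is ever formed, this sidesteps the aliasing problem described above, although concurrent access from several cores still needs its own synchronization.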

Thank you, this is very helpful to know. I'll probably keep the .unwrap() call in case the address I'm writing to is NULL and the computer happens to give me wrong info. Thanks!

OK, ran into another problem. The example has this code:

	HBA_CMD_HEADER *cmdheader = (HBA_CMD_HEADER*)port->clb;
	cmdheader += slot;

I know that in Rust the first line translates to something like:

let header = unsafe {
let raw_ptr = port.clb as *mut HbaCmdHeader;
raw_ptr.as_mut().unwrap() as &mut HbaCmdHeader
};

But I don't know how I would translate the second line. My idea was to do this:

let header = unsafe {
let raw_ptr = (port.clb + slot) as *mut HbaCmdHeader;
raw_ptr.as_mut().unwrap() as &mut HbaCmdHeader
};

But I don't know if that will work or not. Thoughts?
Edit: one last problem. The example also has this:

	HBA_CMD_HEADER *cmdheader = (HBA_CMD_HEADER*)(port->clb);
	for (int i=0; i<32; i++) {
		cmdheader[i].prdtl = 8;
		cmdheader[i].ctba = AHCI_BASE + (40<<10) + (portno<<13) + (i<<8);
		cmdheader[i].ctbau = 0;
		memset((void*)cmdheader[i].ctba, 0, 256);
	}

I translated this to:

let header = unsafe {
let raw_ptr = port.clb as *mut HbaCmdHeader;
raw_ptr.as_mut().unwrap() as &mut HbaCmdHeader
};
for i in 0 .. 32 {
header[i].prdtl = 8;
header[i].ctba = AHCI_BASE + (40 << 10) + (portno << 13) + (i << 8);
header[i].ctbau = 0;
unsafe {
write_bytes(header[i].ctba, 0, 256);
}
}

The issue is that header, in Rust, is a reference to a single struct and not an array, and Rust doesn't let you index arbitrary memory as an array the way C does. So I'm a bit confused about how to solve this problem too.

Would a byte array do what you need?

unsafe {
    let raw_ptr = port.clb as *mut HbaCmdHeader;
    let raw_ptr_bytes = raw_ptr as *mut u8;

    // The second argument is the number of bytes the structure occupies in RAM.
    // core:: paths are used here so the snippet also works in a no_std build.
    let array_ptr_mut: *mut [u8] = core::ptr::slice_from_raw_parts_mut(raw_ptr_bytes, core::mem::size_of::<HbaCmdHeader>());

    let array_of_bytes: &mut [u8] = &mut *array_ptr_mut;
}
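
Another option, sketched here under assumptions, is to build a typed slice over all the command headers so that ordinary indexing works. The three-field `HbaCmdHeader` is a simplified, hypothetical layout (the real one has more fields), `cmd_headers` is a made-up helper, and a local array stands in for the memory at `port.clb`.

```rust
/// Simplified, hypothetical command header; the real layout has more fields.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default)]
pub struct HbaCmdHeader {
    pub prdtl: u16,
    pub ctba: u32,
    pub ctbau: u32,
}

/// Number of command slots, as in the C loop from the question.
pub const SLOTS: usize = 32;

/// Views the memory at `clb` as a slice of `SLOTS` command headers.
///
/// # Safety
/// `clb` must be the address of `SLOTS` valid, aligned headers that nothing
/// else accesses for the lifetime `'a`.
pub unsafe fn cmd_headers<'a>(clb: usize) -> &'a mut [HbaCmdHeader] {
    unsafe { core::slice::from_raw_parts_mut(clb as *mut HbaCmdHeader, SLOTS) }
}

fn main() {
    // Simulate the command list with a local array.
    let mut backing = [HbaCmdHeader::default(); SLOTS];
    let clb = backing.as_mut_ptr() as usize;

    let headers = unsafe { cmd_headers(clb) };
    for (i, header) in headers.iter_mut().enumerate() {
        header.prdtl = 8;
        header.ctba = (i as u32) << 8; // placeholder for the real table address
        header.ctbau = 0;
    }
    println!("{:?}", headers[1]);
}
```

The length lives in the Rust slice only; nothing in memory changes, so the device still sees a plain C array.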

Hi,

The "array" issue can be addressed with the offset function on the raw pointer. It moves the pointer by offset * size_of::<T>() bytes, which is exactly what indexed access does in C.
I guess you could try it like so:

unsafe {
    let raw_header = port.clb as *mut HbaCmdHeader;
    for i in 0..32 {
        let header = raw_header.offset(i).as_mut().unwrap() as &mut HbaCmdHeader;
        header.prdtl = 8;
        .....
        core::ptr::write_bytes(header.ctba as usize as *mut u8, 0, 256);
    }
}
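
The same idea covers the earlier `cmdheader += slot` question. Adding `slot` to the integer address, as in `(port.clb + slot) as *mut HbaCmdHeader`, moves by `slot` bytes, whereas pointer arithmetic moves by whole elements. A small sketch, using a simplified, hypothetical `HbaCmdHeader` and a made-up helper `header_at`:

```rust
/// Simplified, hypothetical command header; the real layout has more fields.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default)]
pub struct HbaCmdHeader {
    pub prdtl: u16,
    pub ctba: u32,
    pub ctbau: u32,
}

/// Pointer to the header for `slot`, like `cmdheader += slot` in C.
///
/// # Safety
/// `clb` must be the address of an array with at least `slot + 1` headers.
pub unsafe fn header_at(clb: usize, slot: usize) -> *mut HbaCmdHeader {
    // `add` advances by `slot * size_of::<HbaCmdHeader>()` bytes, not by `slot` bytes.
    unsafe { (clb as *mut HbaCmdHeader).add(slot) }
}

fn main() {
    // Simulate the command list with a local array.
    let mut backing = [HbaCmdHeader::default(); 4];
    let clb = backing.as_mut_ptr() as usize;

    let header = unsafe { header_at(clb, 2) };
    unsafe { (*header).prdtl = 8 };

    let byte_distance = header as usize - clb;
    println!("slot 2 starts {byte_distance} bytes after clb");
}
```

`add` takes a `usize`, which avoids the `isize` conversion that `offset` needs when the index is unsigned.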

I think that'll work. I'll post back here if it doesn't. Thanks for all the assistance!

Well, that did work nicely. One last problem though: this line:

let cmdfis = unsafe {
let raw_ptr = cmdtbl.cfis as *mut FisRegH2D;
raw_ptr.as_mut().unwrap() as &mut FisRegH2D
};

Fails with error E0605. cfis is:

pub cfis: [cty::c_uchar; 64usize],

I want it to be converted into this:

#[repr(C)]
#[derive(Debug, Default, Copy, Clone, Hash, PartialOrd, Ord, PartialEq, Eq)]
pub struct FisRegH2D {
    pub fis_type: cty::c_uchar,
    _bitfield_1: internal::bitfield<[u8; 1usize], u8>,
    pub command: cty::c_uchar,
    pub feature_lo: cty::c_uchar,
    pub lba0: cty::c_uchar,
    pub lba1: cty::c_uchar,
    pub lba2: cty::c_uchar,
    pub device: cty::c_uchar,
    pub lba3: cty::c_uchar,
    pub lba4: cty::c_uchar,
    pub lba5: cty::c_uchar,
    pub feature_hi: cty::c_uchar,
    pub count_lo: cty::c_uchar,
    pub count_hi: cty::c_uchar,
    pub icc: cty::c_uchar,
    pub control: cty::c_uchar,
    rsv1: [cty::c_uchar; 4usize],
}

I would use slice_from_raw_parts_mut and/or slice_from_raw_parts, but I don't think that would work (after the pointer retrieval is done, I need to use the returned reference as a struct). Rustc recommends I implement a type conversion, but I have absolutely no idea how I would even do that (i.e. which elements of the array would map to which struct members).
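
One possible way around E0605, sketched under assumptions: an array value cannot be cast to a pointer, but a pointer to the array can be cast to a pointer to the struct, which mirrors the `&cmdtbl->cfis` in the C code. The structs below are shortened, hypothetical versions (the real `FisRegH2D` has more fields, and `flags` stands in for the bitfield byte), and `cfis_as_h2d` is a made-up helper.

```rust
/// Shortened, hypothetical version of the H2D register FIS.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
pub struct FisRegH2D {
    pub fis_type: u8,
    pub flags: u8, // stands in for the pmport/rsv0/c bitfield byte
    pub command: u8,
    pub feature_lo: u8,
}

/// Shortened, hypothetical command table holding only the raw FIS bytes.
#[repr(C)]
pub struct HbaCmdTbl {
    pub cfis: [u8; 64],
}

/// Reinterprets the start of `cfis` as a `FisRegH2D`.
pub fn cfis_as_h2d(tbl: &mut HbaCmdTbl) -> &mut FisRegH2D {
    // Take the array's address first, then cast the pointer type.
    let raw = tbl.cfis.as_mut_ptr() as *mut FisRegH2D;
    // SAFETY: `FisRegH2D` consists only of `u8` fields, so its alignment is 1,
    // every bit pattern is valid, and its 4 bytes fit inside the 64-byte array.
    // The `&mut` borrow of `tbl` guarantees exclusive access.
    unsafe { &mut *raw }
}

fn main() {
    let mut tbl = HbaCmdTbl { cfis: [0u8; 64] };
    let fis = cfis_as_h2d(&mut tbl);
    fis.fis_type = 0x27; // FIS_TYPE_REG_H2D
    fis.command = 0xEC; // example command byte
    println!("{:?}", &tbl.cfis[..4]);
}
```

Field `n` of the struct simply overlays byte `n` of the array here, so no manual conversion is needed. This is only sound as long as the struct's alignment is 1 and its size does not exceed the array's length.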