Customizing the `Debug` implementation for certain fields of a struct

A while ago I needed to quickly read some binary data described by a C structure and pretty-print it. The struct is rather large and contains a lot of nested structures. Writing a proper formatter for that in C seemed like a daunting, repetitive, error-prone task. I wanted a quick and dirty way of doing this, one that would also keep working without code changes if the structure changed.

I ended up with an automated way of doing this using Rust: I use bindgen to generate the Rust definitions for the C structures, I instruct bindgen to also derive Debug for these definitions, I read the raw bytes from a file, and I just println! the result. Easy enough.
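To make the workflow above concrete, here is a minimal sketch of the "read raw bytes, then print with Debug" step. The `Header` struct and the `read_header` function are hypothetical stand-ins: in the real tool the struct would come from bindgen, and the bytes would come from a file rather than a literal array.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a bindgen-generated struct. Generated bindings
// for C structs are `#[repr(C)]`; the Debug derive is what makes `println!`
// work without a hand-written formatter.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct Header {
    pub version: u32,
    pub flags: u16,
    pub kind: u16,
}

/// Copies the first `size_of::<Header>()` bytes of `bytes` into a `Header`.
/// Returns `None` if the buffer is too short.
pub fn read_header(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < size_of::<Header>() {
        return None;
    }
    // SAFETY: the length was checked above, `Header` contains only integers
    // (every bit pattern is valid), and `read_unaligned` does not require
    // the source pointer to be aligned.
    Some(unsafe { std::ptr::read_unaligned(bytes.as_ptr().cast::<Header>()) })
}

fn main() {
    // In the real tool these bytes would come from `std::fs::read(path)`.
    let mut bytes = Vec::new();
    bytes.extend_from_slice(&1u32.to_ne_bytes());
    bytes.extend_from_slice(&2u16.to_ne_bytes());
    bytes.extend_from_slice(&3u16.to_ne_bytes());

    if let Some(header) = read_header(&bytes) {
        println!("{:#?}", header);
    }
}
```

This only works safely for plain-old-data structs; anything containing pointers, bools, or enums would need more care.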

This story is relevant because it illustrates a few goals I have:

  • the process needs to be automated
  • I can't change the C definitions just for the sake of the Rust tool

These goals may mean that I can't actually do what I want.

Some of the inner structs contain char arrays, like char some_stuff[32];. These are guaranteed to be zero-terminated ASCII strings. bindgen generates these as some_stuff: [::std::os::raw::c_char; 32usize] inside the Rust structs, which are printed as a series of numbers. I'd like to see these as strings. I can manually do that with a bit of unsafe code, but it's easy to forget to write code for one of the fields, or to forget to do it once a new field gets added, etc.
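For reference, the manual conversion mentioned above could look roughly like the sketch below. The function name `field_to_string` is made up for illustration, and the explicit NUL check is an extra precaution that is not part of the original description.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Converts a fixed-size `c_char` array holding a NUL-terminated string
/// into an owned `String`. Returns `None` if no terminator is present.
pub fn field_to_string(field: &[c_char; 32]) -> Option<String> {
    if !field.contains(&0) {
        return None;
    }
    // SAFETY: the check above guarantees a NUL inside the array, so
    // `CStr::from_ptr` stops reading before the end of `field`.
    let text = unsafe { CStr::from_ptr(field.as_ptr()) };
    Some(text.to_string_lossy().into_owned())
}

fn main() {
    // Build a field the way it would look after reading the C data.
    let mut some_stuff = [0 as c_char; 32];
    for (dst, src) in some_stuff.iter_mut().zip(b"hello") {
        *dst = *src as c_char;
    }

    // The array's own Debug output is a list of numbers...
    println!("{:?}", some_stuff);
    // ...while the manual conversion yields the text.
    println!("{:?}", field_to_string(&some_stuff));
}
```

The drawback is exactly the one described: a call like this has to be written for every such field by hand.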

Searching for solutions to this, I stumbled upon Derivative, which lets me specify a function for formatting a given field, but the only way I could use it is if I add another step that modifies the bindings generated with bindgen. Is there an easier way?

Unfortunately, I don't think this is possible with bindgen alone; it is simply not flexible enough, so a postprocessing step is likely necessary. What you could try is to create two files for bindings, one with the structs and the other with everything else (this can be done with the Builder::allowlist_*() and Builder::blocklist_*() methods). From there, you can textually postprocess the former file in at least two ways. One is to use Derivative as you mentioned:

sed -i 's/struct/#[derive(Derivative)]\n#[derivative(Debug)]\nstruct/' type_bindings.rs
sed -i 's/\([A-Za-z0-9_]*: \[::std::os::raw::c_char; [0-9]*usize\]\)/#[derivative(Debug(format_with="my_fmt"))] \1/' type_bindings.rs
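The sed commands above refer to a function named my_fmt that is not shown. A possible sketch follows, assuming Derivative's format_with accepts a function of the shape `fn(&T, &mut fmt::Formatter) -> fmt::Result` and that a const-generic function is acceptable there (worth verifying against the Derivative documentation). So that the snippet runs without the derivative crate, the `Foo` struct below has a hand-written Debug impl standing in for what the attribute would generate; `FmtWith` is a hypothetical helper for that purpose.

```rust
use std::fmt;
use std::os::raw::c_char;

/// Formats a `c_char` array as a quoted string, stopping at the first NUL
/// (or at the end of the array if there is none).
pub fn my_fmt<const N: usize>(
    field: &[c_char; N],
    f: &mut fmt::Formatter<'_>,
) -> fmt::Result {
    let bytes: Vec<u8> = field
        .iter()
        .take_while(|&&c| c != 0)
        .map(|&c| c as u8)
        .collect();
    let text = String::from_utf8_lossy(&bytes);
    fmt::Debug::fmt(&*text, f)
}

// Adapter that lets `my_fmt` be used where a `Debug` value is expected.
struct FmtWith<'a, const N: usize>(&'a [c_char; N]);

impl<const N: usize> fmt::Debug for FmtWith<'_, N> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        my_fmt(self.0, f)
    }
}

pub struct Foo {
    pub magic_string: [c_char; 32],
    pub something: u32,
}

// Hand-written stand-in for the derived impl: `magic_string` is routed
// through `my_fmt`, every other field uses its normal Debug output.
impl fmt::Debug for Foo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Foo")
            .field("magic_string", &FmtWith(&self.magic_string))
            .field("something", &self.something)
            .finish()
    }
}

fn main() {
    let mut magic_string = [0 as c_char; 32];
    for (dst, src) in magic_string.iter_mut().zip(b"abc") {
        *dst = *src as c_char;
    }
    println!("{:?}", Foo { magic_string, something: 7 });
}
```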

Another is to use a repr(transparent) newtype wrapper:

#[repr(transparent)]
struct CStringArray<const N: usize>([c_char; N]);

impl<const N: usize> Debug for CStringArray<N> { ... }

sed -i 's/\[::std::os::raw::c_char; \([0-9]*usize\)\]/path::to::CStringArray<\1>/' test.rs
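One way the elided Debug impl for the wrapper could be filled in is sketched below. The `from_bytes` constructor and the `Foo` and `Baz` structs exist only for the demonstration; the latter approximate what the bindings might look like after the sed rewrite. Because the wrapper is `#[repr(transparent)]`, it has the same layout as the array it replaces, and nested structs pick up the string formatting through their ordinary derived Debug.

```rust
use std::fmt;
use std::os::raw::c_char;

#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct CStringArray<const N: usize>(pub [c_char; N]);

impl<const N: usize> fmt::Debug for CStringArray<N> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Take everything before the first NUL, or the whole array if the
        // terminator is missing, and print it as a quoted string.
        let bytes: Vec<u8> = self
            .0
            .iter()
            .take_while(|&&c| c != 0)
            .map(|&c| c as u8)
            .collect();
        fmt::Debug::fmt(&*String::from_utf8_lossy(&bytes), f)
    }
}

impl<const N: usize> CStringArray<N> {
    /// Demo-only helper: copies at most `N - 1` bytes so that the array
    /// always ends with at least one NUL.
    pub fn from_bytes(text: &[u8]) -> Self {
        let mut raw = [0 as c_char; N];
        for (dst, src) in raw.iter_mut().zip(text.iter().take(N.saturating_sub(1))) {
            *dst = *src as c_char;
        }
        CStringArray(raw)
    }
}

// Approximation of post-processed bindings: only the field type changed.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct Foo {
    pub magic_string: CStringArray<32usize>,
    pub something: u32,
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct Baz {
    pub magic_string: CStringArray<32usize>,
    pub foo: Foo,
}

fn main() {
    let foo = Foo { magic_string: CStringArray::from_bytes(b"inner"), something: 1 };
    let baz = Baz { magic_string: CStringArray::from_bytes(b"outer"), foo };
    println!("{:#?}", baz);
}
```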

The transparent wrapper might solve my issues.

After posting this I did something similar to your first suggestion: I'm generating the files with bindgen, and after that I have a post-processing step that goes through them and adds derivative-specific lines. This isn't a big deal, as I already had a post-processing step that generated some Rust code. But it looks cleaner with the transparent type. I'll switch to that. Thanks!

I tried writing a proc macro for this, and while it worked for simpler cases it quickly got out of hand.

To provide a few more details, I can have a struct layout that looks like this:

struct Foo {
    magic_string: [::std::os::raw::c_char; 32usize],
    something: u32,
}

struct Bar {
    something_else: u64,
    magic_string: [::std::os::raw::c_char; 32usize],
}

struct Baz {
    magic_string: [::std::os::raw::c_char; 32usize],
    foo: Foo,
    bar: Bar,
}

While this is not impossible to handle with a proc macro, it got out of hand quickly.
