How to expose a static reference and an unmangled static pointer to the same memory location?

I have a struct called Encoding, and I want to statically allocate a bunch of instances with different field values, such that for a given set of field values there exists exactly one instance, with an address that's stable across crate boundaries and the FFI boundary.

Previously, I had in lib.rs stuff like:

/// The Big5 encoding.
pub const BIG5: &'static Encoding = &Encoding {
    name: "Big5",
    variant: VariantEncoding::Big5,
};

/// The EUC-JP encoding.
pub const EUC_JP: &'static Encoding = &Encoding {
    name: "EUC-JP",
    variant: VariantEncoding::EucJp,
};

And in re-exported ffi.rs stuff like:

/// Newtype for `*const Encoding` in order to be able to implement `Sync` for
/// it.
pub struct ConstEncoding(*const Encoding);

/// Required for `static` fields.
unsafe impl Sync for ConstEncoding {}

/// The Big5 encoding.
#[no_mangle]
pub static BIG5_ENCODING: ConstEncoding = ConstEncoding(BIG5);

/// The EUC-JP encoding.
#[no_mangle]
pub static EUC_JP_ENCODING: ConstEncoding = ConstEncoding(EUC_JP);

And in the C header file stuff like:

/// The Big5 encoding.
extern const Encoding* const BIG5_ENCODING;

/// The EUC-JP encoding.
extern const Encoding* const EUC_JP_ENCODING;

I expected

Encoding {
    name: "Big5",
    variant: VariantEncoding::Big5,
}

to correspond to a unique range of statically allocated memory, and both BIG5 and BIG5_ENCODING to refer/point to it.

As long as only the declaring Rust crate itself was involved, this seemed to work within the crate and across FFI. However, when I started using the crate from another crate, use of BIG5 in the other crate resulted in a different instance of

Encoding {
    name: "Big5",
    variant: VariantEncoding::Big5,
}

with references to it pointing to a different address.

On IRC, I was told that this was to be expected and that I should use static if I wanted a stable address across crates.
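The difference between the two kinds of item can be sketched as follows. This is only an illustration under assumptions: the Encoding struct is cut down to one field, and the names BIG5_STATIC, BIG5_CONST, static_addr_a, static_addr_b and const_addr are hypothetical, not part of the crate discussed here. The sketch makes no claim about which address a const reference ends up with; it only shows that every use of a static names the same memory location.

```rust
/// Cut-down, hypothetical version of the struct from the question.
pub struct Encoding {
    pub name: &'static str,
}

/// A `static` is a single item with one fixed address.
pub static BIG5_STATIC: Encoding = Encoding { name: "Big5" };

/// A `const` is evaluated wherever it is used, so the address behind
/// this reference is not guaranteed to be the same at every use site.
pub const BIG5_CONST: &Encoding = &Encoding { name: "Big5" };

/// Two separate uses of the static; both yield the same pointer.
pub fn static_addr_a() -> *const Encoding {
    &BIG5_STATIC
}

pub fn static_addr_b() -> *const Encoding {
    &BIG5_STATIC
}

/// One use of the const; the value is always the same, but no
/// assumption is made here about the address.
pub fn const_addr() -> *const Encoding {
    BIG5_CONST
}
```

Comparing static_addr_a() with static_addr_b() always yields equal pointers, whereas code that compares const_addr() against a pointer obtained in another crate should not rely on equality.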

However, if I do s/const/static/ when declaring BIG5, the compiler no longer allows me to use BIG5 to initialize BIG5_ENCODING for FFI, because statics aren't allowed to refer to other statics.

What's the right way to have a single statically allocated instance of a struct and to export both a reference (for Rust consumers) and a pointer (for C consumers) to it?

Should I have

const BIG5_CONST: &'static Encoding = &Encoding {
    name: "Big5",
    variant: VariantEncoding::Big5,
};

and

/// The Big5 encoding.
pub static BIG5: &'static Encoding = BIG5_CONST;

and

#[no_mangle]
pub static BIG5_ENCODING: ConstEncoding = ConstEncoding(BIG5_CONST);

and rely on the undocumented(?) compiler behavior of these two instantiations of BIG5_CONST getting coalesced into one as long as they are in the same crate?

Well, statics are allowed to refer to other statics. It is consts that are not allowed to refer to statics. Sometimes, if you add too many &s, you may get the error anyway.

First of all, I would stop declaring both the struct and the reference in one step, like you're doing (I'm not even sure what this syntax is sugar for):

pub const BIG5: &'static Encoding = &Encoding { ... }; // don't do that

Instead, first declare the struct:

#[no_mangle]
pub static BIG5: Encoding = Encoding { ... };

This way, you're sure that &BIG5 is a stable address – a static reference of type &'static Encoding which can be used directly by Rust consumers. Now you can declare other statics if you want:

#[no_mangle]
pub static BIG5_ENCODING: ConstEncoding = ConstEncoding(&BIG5);
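Put together, the approach might look like the sketch below. It is a minimal illustration rather than the crate's actual code: the VariantEncoding enum is a two-variant stand-in, the #[repr(transparent)] attribute and the helper big5_addresses_match are additions assumed here for the example, and #[no_mangle] is kept only on the pointer static.

```rust
use std::ptr;

/// Hypothetical stand-in for the real enum; only two variants shown.
pub enum VariantEncoding {
    Big5,
    EucJp,
}

pub struct Encoding {
    pub name: &'static str,
    pub variant: VariantEncoding,
}

/// Newtype for `*const Encoding` so that `Sync` can be implemented for it.
/// `repr(transparent)` is an addition in this sketch: it guarantees the
/// wrapper has the same layout as the bare pointer that C expects.
#[repr(transparent)]
pub struct ConstEncoding(pub *const Encoding);

/// Required for `static` items.
unsafe impl Sync for ConstEncoding {}

/// The struct itself lives in a `static`, so `&BIG5` is one stable address.
pub static BIG5: Encoding = Encoding {
    name: "Big5",
    variant: VariantEncoding::Big5,
};

/// A second static that holds a pointer to the first one, for C consumers.
/// (In the 2024 edition this attribute is spelled `#[unsafe(no_mangle)]`.)
#[no_mangle]
pub static BIG5_ENCODING: ConstEncoding = ConstEncoding(&BIG5);

/// Hypothetical helper: checks that the Rust reference and the FFI
/// pointer name the same memory location.
pub fn big5_addresses_match() -> bool {
    ptr::eq(&BIG5, BIG5_ENCODING.0)
}
```

Rust consumers would use &BIG5 directly, while C consumers read the pointer stored in BIG5_ENCODING; big5_addresses_match() returns true because both name the same static.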

Now, in C you can declare either:

extern Encoding const BIG5;

or

extern Encoding const * const BIG5_ENCODING;

(sorry for writing const on the right, but I find it more consistent)

Thank you. Indeed, it seems that the key to the solution is not initializing the struct and taking a reference to it in one go.

Curiously, while taking a reference to a static and making another static out of the reference works, constants are prohibited from referring to statics, to the point that even a constant holding a reference to a static struct is rejected. Since constants seem to be mostly inlined expressions, I have a hard time seeing why this restriction needs to be in the language.