Modular ABI for Rust

I think a lot of people believe in the benefits, and I agree, but I definitely understand the fears that the compiler/language teams have. Having to standardize the memory layout of structs, enums, and the like really does put shackles on what they are allowed to do and optimize from that point on.

There are good arguments on both sides, which is why I wish the ABI could be modularized as a separate, swappable component of the compiler, so that we could write our own ABI plugins and version them separately.

6 Likes

Novel idea... That could actually be quite interesting.

1 Like

I think a modular ABI would be a great idea.

Does the Rust compiler support compiler extensions? Some compilers support extensions that change the behaviour of the compiler at compile time, for example, here are the extensions that gcc supports. In the case of gcc these extensions are built into the compiler itself; how hard would it be for third-party extensions to be supported for the Rust compiler?

I don't think Rust compiler extension support exists yet, unfortunately. I think it had been tossed around a long time ago, but was dropped because compiler extensions would essentially be permanently unstable due to the unstable internals of the compiler.

That's what I remember anyway from my searches a while ago.

2 Likes

Ah, I see. Regardless, a modular ABI is still a good idea. I see two courses of action: a new ABI-compliant compiler backend could be developed (think the wasm backend), or the current compiler could be extended to support an ABI feature flag that toggles ABI-compliant builds. For example, one could do something like $ cargo build --release --ABI to build an ABI-compliant release binary.

2 Likes

Those are both good ideas, actually.

I think it is a big problem that needs a lot of thought and expertise, so I'm not sure how much more value we could add to the discussion, but we could bring it up on the internals forum again. I don't think a topic was ever opened specifically about a modular ABI.

Especially as this is getting a little off topic. :slight_smile:

1 Like

Oops, haha. Python → ABI's, whomst'd've thought.

I totally agree. Something like this is not something you can whip up in an afternoon. However I think the concepts there, and the impact of a stable ABI would be pretty sizable.

I think a post on the internals forum would be a great idea, especially with a focus on modularity of the concept. Before we preemptively write a post, I think we should distill the key pros and cons to present a well thought out argument.

2 Likes

We could start a new topic here to discuss before moving to internals?

1 Like

This is a draft of a post proposing a modular ABI for the Rust runtime. Feedback is appreciated.

Rust is a powerful systems programming language with strong memory guarantees. Rust allows for concise expression at a high level, while still producing fast low-level code. However, Rust does not guarantee the calling conventions and layout of structures in memory, which makes it difficult to write external applications that interface with Rust; Rust lacks a standardized ABI.

There are many benefits a standardized ABI would bring to Rust. First and foremost, it would allow non-Rust crates to be integrated with Rust toolchains. Providing an ABI would also allow outside code to rely on Rust to complete specific tasks that Rust is good at.

However, there are also some tricky things that have to be dealt with. Having to standardize the memory layout of structs and enums and such can limit the kinds of optimizations that Rust can perform. Additionally, standardizing the ABI would take a lot of work. A poorly designed ABI is worse than not having an ABI at all.

While discussing the matter, a point was brought up that the ABI could be modularized. A modularized ABI would be optional while compiling. A new ABI-compliant compiler backend could be developed, or the current compiler backend could be extended to support an ABI feature flag that would toggle ABI compliant builds. Something like $ cargo build --release --ABI.

The end goal of a standardized ABI is to expand the number of applications that Rust can be used for. When writing close to the metal, a stable ABI cements a language in kernelspace; if you need an example, look no further than C. Would it be possible to standardize Rust's ABI? What would the ABI look like, and how feasible would it be to implement? What are your thoughts?

1 Like

Sure. What do you think of the draft I wrote?

See also the abi_stable crate, which attempts to address (a part of) the stable ABI question at the library level.

3 Likes

Is this the most compelling reason? It could be, but I suspect that it largely depends on who you ask. In the conversations I've been in (at least one of which was started by me, though, so not a good indicator), it has actually been Rust<->Rust interaction that was the motivation behind a stable ABI.

Right now there isn't any way to build Rust crates that can dynamically link to other Rust crates, a practice that is extremely common with C and C++, and very useful for plugins.

I guess it doesn't really matter which is the most compelling reason, but we should try to come up with a good list of all of the benefits and issues. Starting on the Pros:

Pros

  • Allows dynamically linking Rust crates to other Rust crates:
    • This allows for dynamically loaded plugins to Rust programs
    • This allows you to save compile-time/disk space for projects, for example, that have multiple CLI's that all link to the same core library crate
    • Note: This use-case is potentially rather well covered by abi-stable-crates

  • Allows the potential for making libraries loadable by other languages such as swift:
    • Quote: Imho one of the biggest mistakes C++ ever made was not stabilizing its abi; swift just stabilized theirs and is already reaping the benefits, swift system libraries, the swift runtime, swift UI libraries, all dynamically linked and backwards abi compatible.

    • Quote: I’d love to use Swift ABI from Rust . extern "C" as the lowest common denominator is too low for Rust, but maybe there’s a useful subset of Rust that can be mapped to Swift’s ABI to provide useful interfaces for Rust and Swift?

  • Insert that note @isaac made earlier about OS writing by someone

Cons

  • Limits optimizations the compiler can perform
    • This would hopefully be helped if we could create a modularizable ABI. We would hope that you could like publish an ABI implementation as a crate ( I mean we're dreaming already, right, :wink: ) and then version the crate. If you ever need to break ABI compatibility for some optimization, you could do it within Semver.
  • Depending on the implementation, if we want to make ABI plugins, to avoid stabilizing the compiler's built-in ABI, we might run into another problem because we have to stabilize the plugin interface, which could be another can of worms.
  • A bunch of people could write ABI crates and it would make it really not standardized because everyone uses different ones?
  • Who knows how hard this could be...

BTW, there are some potentially good ideas for implementation in this comment.


Yep, that's the closest thing I've seen to solving this problem yet. Very cool, but requires creating interface crates to act as the middleman.

1 Like

Technically you can, just fine; it's just immediately UB if you don't guarantee that both sides use the exact same compiler version and flags.

Technically, so do C and C++: the middleman is just always required, and called headers. A nice small initial step would be to get the compiler to allow skipping an explicit interface crate via saying "hey this dependency, don't statically link it". Maybe even just spec it so that the compiler spits out the dylib for the dynamic dependency alongside the binary.

Then the easiest case -- I'm writing a tool family of many binaries and want to share the core as a dylib -- can be accomplished by compiling all of the tools, verifying that the core dylib is indeed bit-for-bit identical, and then having them all use the same dylib.

(Personally, I think the "plugin" stable ABI case is best served by Wasm interface types, and the "system" stable ABI case would be best served by the Swift ABI or a similar setup.)

3 Likes

So I've found the original source, but it actually doesn't mention Rust's lack of an ABI explicitly. Rather, it cites that C's stable ABI is one of the major reasons for choosing C.

The part about Rust not being suitable because of its lack of a stable ABI arose from the discussion of the document on Hacker News. It's a relative aside, so I have no idea why I remembered that single comment so explicitly :sweat_smile:. I think the argument I presented still stands though.

The idea behind a stable ABI would be that for two different binaries to interact in a defined manner, they wouldn't necessarily have to use the same compiler and flags. A stable ABI would allow for previously compiled binaries to interact with new ones (within reason).

In the original context of this discussion, a stable ABI was being discussed as a way to allow for a Rust-derived scripting language to interact with precompiled Rust crates. (See these two posts.)

I agree that dynamically loading dependencies would be a good first step. How many man-hours do you think implementing dynamic loading would take, and would an RFC be required?


@zicklag btw, thanks for that awesome list of pros and cons. I'll post a revision of the draft soon (or maybe tomorrow).

1 Like

I started out writing a post about how this is probably impossible, but I now think it might be possible. Here's a proposal for an interface.

Today, we have the abi_stable crate. It communicates over a #[repr(C)] API, and I think the main disadvantage of that is that every single type you pass through has to go through their mechanisms. You can't directly send a Result, for instance; you have to use their declared type instead.

And that has to repeat for every type. Even for custom user types, you have to use their annotations to make it FFI safe, and have those contain their repr(C) types, rather than Rust ones.
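To make that wrapping cost concrete, here is a minimal hand-written sketch of what a #[repr(C)] stand-in for Result<u32, u32> could look like. The names FfiResult and checked_halve are hypothetical and made up for illustration; this is not the abi_stable API, just the general shape of the conversion you end up writing for each type.

```rust
/// Hypothetical FFI-safe stand-in for `Result<u32, u32>`.
/// `#[repr(C)]` fixes the field order and layout, unlike the default repr.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FfiResult {
    /// 0 means Ok, anything else means Err.
    tag: u8,
    /// The Ok value or the error code, depending on `tag`.
    payload: u32,
}

impl From<Result<u32, u32>> for FfiResult {
    fn from(r: Result<u32, u32>) -> Self {
        match r {
            Ok(v) => FfiResult { tag: 0, payload: v },
            Err(e) => FfiResult { tag: 1, payload: e },
        }
    }
}

impl From<FfiResult> for Result<u32, u32> {
    fn from(r: FfiResult) -> Self {
        if r.tag == 0 { Ok(r.payload) } else { Err(r.payload) }
    }
}

/// A function that could be exported across a C ABI boundary:
/// it returns the wrapper type, never a native `Result`.
pub extern "C" fn checked_halve(x: u32) -> FfiResult {
    let r: Result<u32, u32> = if x % 2 == 0 { Ok(x / 2) } else { Err(x) };
    r.into()
}

fn main() {
    // The caller converts back to a native Result on its side.
    let ok: Result<u32, u32> = checked_halve(10).into();
    let err: Result<u32, u32> = checked_halve(7).into();
    println!("{:?} {:?}", ok, err);
}
```

Every type that crosses the boundary needs a pair of conversions like this, which is the repetition described above.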

I think with a real modular ABI proposal though, we could solve that. What if an "ABI" crate was a proc-macro-like crate with one function for each rust structure - struct, enum, union, tuple? Each of these would take in a description of the type, and give back a description of exactly where each byte of information ends up. So the compiler could know exactly where the enum discriminant is stored, and use that to interact with the type rather than its native definitions.
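As a rough sketch of the "description in, byte positions out" idea, assuming a much simpler interface than a real compiler would need: everything here (FieldDesc, StructLayout, the Abi trait, DeclOrderAbi, SortedAbi) is hypothetical, it only covers structs, and it runs as ordinary code rather than as a compiler plugin.

```rust
use std::cmp::Reverse;

/// Hypothetical description of one field, as the compiler might hand it
/// to an ABI crate: just a size and an alignment, both in bytes.
#[derive(Debug, Clone, Copy)]
pub struct FieldDesc {
    pub size: usize,
    pub align: usize,
}

/// Hypothetical answer from the ABI crate: the byte offset of every field
/// (indexed in declaration order) plus the total size and alignment.
#[derive(Debug, PartialEq)]
pub struct StructLayout {
    pub offsets: Vec<usize>,
    pub size: usize,
    pub align: usize,
}

/// Hypothetical interface an ABI crate would implement. Only structs are
/// covered; enums, unions, and tuples would need their own functions.
pub trait Abi {
    fn layout_struct(&self, fields: &[FieldDesc]) -> StructLayout;
}

/// Round `n` up to the next multiple of `align` (assumes `align > 0`).
fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Place fields in the given order, padding each one to its alignment.
fn place(fields: &[FieldDesc], order: &[usize]) -> StructLayout {
    let mut offsets = vec![0; fields.len()];
    let (mut end, mut align) = (0, 1);
    for &i in order {
        let start = round_up(end, fields[i].align);
        offsets[i] = start;
        end = start + fields[i].size;
        align = align.max(fields[i].align);
    }
    StructLayout { offsets, size: round_up(end, align), align }
}

/// An ABI that keeps declaration order, as C does.
pub struct DeclOrderAbi;

impl Abi for DeclOrderAbi {
    fn layout_struct(&self, fields: &[FieldDesc]) -> StructLayout {
        let order: Vec<usize> = (0..fields.len()).collect();
        place(fields, &order)
    }
}

/// An ABI that places the most-aligned fields first to reduce padding.
pub struct SortedAbi;

impl Abi for SortedAbi {
    fn layout_struct(&self, fields: &[FieldDesc]) -> StructLayout {
        let mut order: Vec<usize> = (0..fields.len()).collect();
        // Stable sort: fields with equal alignment keep declaration order.
        order.sort_by_key(|&i| Reverse(fields[i].align));
        place(fields, &order)
    }
}

fn main() {
    // Describes `struct { a: u8, b: u32, c: u8 }`.
    let fields = [
        FieldDesc { size: 1, align: 1 },
        FieldDesc { size: 4, align: 4 },
        FieldDesc { size: 1, align: 1 },
    ];
    println!("{:?}", DeclOrderAbi.layout_struct(&fields));
    println!("{:?}", SortedAbi.layout_struct(&fields));
}
```

The two implementations give different layouts for the same description, which is the point: the compiler would not need to know which rule is in use, only where each field ended up.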

Then, when wanting to use the ABI, you'd have some functionality to tell the compiler to "turn all of this data into this ABI". At compile time, the compiler would call into the ABI crate and generate code to move each byte of data from the native Rust ABI into the defined ABI. The operation would return an opaque bunch of bytes which could be sent over a FFI function, and put back into the translation function, but nothing else.

It would mean we'd have to copy all the data we want to move from native rust to an ABI representation, but if we want the ABI to be separate from Rust's native representation, I think that would be required anyways.
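Written out by hand, the copy step could look something like the sketch below. It assumes a made-up type Sample and a fixed layout (the offset constants and the little-endian choice are assumptions standing in for whatever the ABI crate would report), and AbiBytes is a hypothetical opaque wrapper. In the actual proposal the compiler would generate this code; nobody would write it manually.

```rust
/// A native Rust type whose in-memory layout the compiler is free to choose.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Sample {
    flag: u8,
    value: u32,
}

/// Opaque bag of bytes in the ABI-defined layout. The field is private, so
/// outside this module it can only be passed around or translated back.
pub struct AbiBytes(Vec<u8>);

// Assumed layout, standing in for what an ABI crate would report.
const FLAG_OFFSET: usize = 0;
const VALUE_OFFSET: usize = 4;
const TOTAL_SIZE: usize = 8;

/// Copy each field from the native value to its ABI-defined position.
fn to_abi(s: &Sample) -> AbiBytes {
    let mut buf = vec![0u8; TOTAL_SIZE];
    buf[FLAG_OFFSET] = s.flag;
    // Little-endian byte order is an assumption of this sketch.
    buf[VALUE_OFFSET..VALUE_OFFSET + 4].copy_from_slice(&s.value.to_le_bytes());
    AbiBytes(buf)
}

/// Rebuild a native value by reading each field from its ABI position.
fn from_abi(b: &AbiBytes) -> Sample {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&b.0[VALUE_OFFSET..VALUE_OFFSET + 4]);
    Sample {
        flag: b.0[FLAG_OFFSET],
        value: u32::from_le_bytes(raw),
    }
}

fn main() {
    let original = Sample { flag: 1, value: 0x1234_5678 };
    let bytes = to_abi(&original);
    let restored = from_abi(&bytes);
    println!("{:?} -> {:?}", original, restored);
}
```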

I don't think this would really be easy, but I'm starting to see how it might be possible. We'd still need a solution for dealing with arbitrary pointers / allocated memory, and an interface which actually encapsulates the intricacies of layout. But both of those could be possible! Dealing with allocation is probably very hard, but I thought doing this at all would be impossible, and there might be a reasonable way.

Dealing with intricacies of layout might just mean finding a minimal API which is reasonable to commit to. But it wouldn't be easy either; how would it deal with things like niche optimizations? Or the ABI defining a way to reorder fields based on size? (would it be easier to define two functions per type of data, one which constructs it given the individual parts, and one which destructs it into those parts?)
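For anyone who hasn't run into these before, here is a small sketch of the two layout behaviours mentioned above, using only std::mem::size_of. The struct names are made up. The null-pointer niche for Option<&T> and Option<Box<T>> is documented and guaranteed; the size of the default-repr struct is deliberately only printed, since Rust does not promise any particular field order for it.

```rust
use std::mem::size_of;

/// Declaration order is kept, so padding appears after `a` and after `c`.
#[allow(dead_code)]
#[repr(C)]
struct DeclOrder {
    a: u8,
    b: u32,
    c: u8,
}

/// Default repr: the compiler may reorder fields to reduce padding.
#[allow(dead_code)]
struct DefaultRepr {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    // Niche optimization: `None` reuses the null pointer value, so the
    // Option adds no extra bytes. An ABI crate would have to know this.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert_eq!(size_of::<Option<Box<u32>>>(), size_of::<Box<u32>>());

    // Field ordering: repr(C) is fixed by declaration order, while the
    // default repr is unspecified and may differ between compiler versions.
    println!("repr(C):    {} bytes", size_of::<DeclOrder>());
    println!("repr(Rust): {} bytes", size_of::<DefaultRepr>());
}
```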

Anyways, I think this might actually be possible. How similar/different is this from your idea for a modular ABI? I don't think I've read a technical description of modular ABIs beyond the idea that crates would define them - so this is mostly me running with that idea. Am I thinking of something entirely different? I'm excited to see where you go with this, regardless.

Anyone else have thoughts on the feasibility of a modular ABI (regardless of whether we want to or not)? I've kind of convinced myself that it's feasible, I'm interested if you think it is too.

3 Likes

If I understand correctly,

     +-------------------------------------------------------------------+
     | In the rustc compiler pipeline                                    |
     | +--------------+    +---------------+    +----------------------+ |
     | | Rust structs |    | Crate w/ ABI  |    | ABI compliant        | |
... -> | enums, etc.  | -> | Translation   | -> | structs, enums, etc. | -> ...
     | | bit layout   |    | Macros        |    | bit layout           | |
     | +--------------+    +---------------+    +----------------------+ |
     +-------------------------------------------------------------------+

You mention moving data from a native Rust representation to an ABI representation. Do you mean moving as in transmuting bits during runtime, or moving everything to a stable universal ABI representation during compile time? I'm assuming the latter as that's more the function of an ABI.

Another tricky thing to deal with is unsafe code. All unsafe code that mangles bits in a way compliant with the current native representation but not compliant with the stable ABI would have to be rewritten.

As for arbitrary pointers / allocated memory, once that memory has been passed out of Rust, the receiving program can do whatever it wants with it. I suggest that transferring data across ABI boundaries should act as a full move, as in, that program is now completely in charge of the data. To get pointers / memory back, they need to be passed back through the ABI. That's one potential solution, but I bet there's another one.
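A minimal sketch of that "full move" convention, using what already exists today (extern "C" functions plus Box::into_raw and Box::from_raw). The Counter type and the counter_* function names are hypothetical; the point is only that ownership leaves Rust in one call and has to be handed back through the boundary to be freed.

```rust
/// Hypothetical type handed across the boundary. `#[repr(C)]` keeps its
/// layout defined for whoever receives the pointer.
#[repr(C)]
pub struct Counter {
    hits: u64,
}

/// Allocate a Counter and move ownership out of Rust as a raw pointer.
/// From here on, the caller is in charge of the allocation.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { hits: 0 }))
}

/// Increment the counter and return the new value.
///
/// # Safety
/// `ptr` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_hit(ptr: *mut Counter) -> u64 {
    let counter = unsafe { &mut *ptr };
    counter.hits += 1;
    counter.hits
}

/// Move ownership back into Rust so the allocation can be dropped.
///
/// # Safety
/// `ptr` must come from `counter_new` and must not be used afterwards.
pub unsafe extern "C" fn counter_free(ptr: *mut Counter) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}

fn main() {
    let handle = counter_new();
    unsafe {
        counter_hit(handle);
        println!("hits: {}", counter_hit(handle));
        counter_free(handle);
    }
}
```

Freeing through counter_free rather than on the receiving side also means both ends never need to agree on an allocator.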

There has been a lot of work on optimizing laying out fields in structs in reliable ways. I think one of the points of a modular ABI was that it would be modular, meaning that binaries could choose to be compiled in an ABI-compliant way. Plus, there is a large class of optimizations that can be done in compliance with an ABI. As a matter of fact, since an ABI solidifies the layout of data, more reliable and defined bit-twiddling and the like can occur.

I can't speak for @zicklag, but this seems pretty similar in semantics to what I had in mind. I originally thought that the Crate with ABI Translation Macros would be embedded directly in the compiler itself, but making a crate gives it versioning and other nice features. The only issue I can see is integrating the crate into the compiler pipeline, but I guess I kinda grasp the concept of how this could work.

Given similar languages like C and Swift have a stable ABI, I see no reason other than potential downsides for a stable ABI not to be implemented. One thing which hasn't really been discussed is the calling convention that the ABI would use. Does Rust already have a 'stable' calling convention?
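On the calling convention question, a short sketch of what can be written today: a function can opt into the platform's C calling convention with extern "C", while a plain fn uses the default "Rust" ABI, which is unspecified and carries no stability promise. The function names below are made up for illustration.

```rust
/// Uses the platform's C calling convention, which is stable.
pub extern "C" fn add_c(a: i32, b: i32) -> i32 {
    a + b
}

/// Uses the default "Rust" ABI, which is unspecified.
pub fn add_rust(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // The calling convention is part of the function pointer type, so
    // these two pointer types are distinct and not interchangeable.
    let c_ptr: extern "C" fn(i32, i32) -> i32 = add_c;
    let rust_ptr: fn(i32, i32) -> i32 = add_rust;

    println!("{} {}", c_ptr(2, 3), rust_ptr(2, 3));
}
```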

Another thing: would there be only One Stable ABI Crate, or would the compiler provide a general interface allowing data layout to be defined (with macros or otherwise)?

Yet another thing: should the ABI be opt-in on a struct-by-struct basis, (i.e. #[repr(ABI)]) or should it directly apply to the whole crate when used? I'm thinking that the latter would be better, but #[repr(...)] has its benefits if user-defined ABIs are allowed (think repr(SwiftABI), repr(MyCustomABILayout)), etc.

I've actually already done this though only within the same compiler:

https://users.rust-lang.org/t/creating-rust-apps-with-dynamically-loaded-rust-plugins/

Actually that is right along the lines of what I was thinking! That sounds like it's going in the right direction.

1 Like

I think you should be able to define multiple. That way people can support other languages such as Swift.

I'm thinking both. I think you should be able to apply the ABI to the whole crate, and therefore not have to write any code to make two crates dynamically link to each other, but the per-struct method has its own advantages in different use-cases.

2 Likes

Here's the draft revision.

Proposing a stable modularizable ABI interface for Rust

Based on the points from the discussion here.

Introduction

Rust is a powerful systems programming language with strong memory guarantees. Rust allows for concise expression at a high level, while still producing fast low-level code. However, Rust does not guarantee the calling conventions and layout of structures in memory, which makes it difficult to write external applications that interface with Rust; Rust lacks a standardized ABI.

Benefits

There are many benefits a standardized ABI would bring to Rust. A stable ABI would allow for dynamic linking between Rust crates, which would allow Rust programs to support dynamically loaded plugins (a feature common in C/C++). Dynamic linking would result in shorter compile times and lower disk-space use for projects, as multiple projects could link to the same dylib. For example, imagine having multiple CLIs all link to the same core library crate.

Although this use case is already rather well covered by abi-stable-crates, there are still many more benefits beyond linking crates dynamically. A stable ABI would allow Rust libraries to be loaded by other languages (such as Swift), and would allow Rust to interop with libraries defined in other programming languages. Non-Rust crates could be integrated with Rust toolchains; providing an ABI would also allow outside code to rely on Rust for performance-intensive tasks. Cross-language compatibility would increase the diversity of Rust's package ecosystem.

Quote: Imho one of the biggest mistakes C++ ever made was not stabilizing its abi; swift just stabilized theirs and is already reaping the benefits, swift system libraries, the swift runtime, swift UI libraries, all dynamically linked and backwards abi compatible.

Stabilizing Rust's ABI would allow for cross-language interop and dynamic linking. "extern "C" as the lowest common denominator is too low for Rust" (Quote).

Recently, the Fuchsia OS team at Google decided to ban Rust from use in the Fuchsia microkernel, citing C as an alternative because of its stable ABI. Not providing a stable ABI ultimately hurts Rust when getting down to the bare metal. Given that similar languages like C and Swift have a stable ABI, I see no reason why a stable ABI would not be implementable. As discussed here, some ABIs/FFIs have already been written using proc macros and the like.

Potential Issues

However, a stable ABI is not all peaches and roses. Having to standardize the memory layout of data can limit the number of optimizations the compiler can perform. That said, there has been a lot of work on optimizing laying out fields in structs in reliable and ABI-compliant ways. There is a large class of optimizations that can be done in compliance with an ABI; since an ABI solidifies the layout of data, more reliable bit-twiddling and the like can occur.

While discussing the matter, a point was brought up that the ABI could be modularized. A modularized ABI would be optional while compiling. This modular ABI could be published as a versioned crate. If the ABI ever needs a backward-compatibility breaking change, the change could be made within Semver. Alternatively, a new ABI-compliant compiler backend could be developed, or the current compiler backend could be extended to support an ABI feature flag that would toggle ABI compliant builds.

However:

Standardizing the ABI would take a lot of work. A poorly designed ABI is worse than not having an ABI at all. And as we all know, the right solution is often the hardest one.

Another downside is that allowing ABI crates would not stabilize Rust's ABI; there'd just be ABI fragmentation. Although this is a genuine concern, a 'master' ABI crate with Rust's 'official' ABI could be developed. This would standardize Rust's ABI, while still allowing other crates to be written for interop with other ABIs, like Swift's.

Summary

The end goal of a standardized, modularized ABI is to expand the number of applications that Rust can be used for. A stable ABI would standardize dynamic linking between Rust crates, reduce the compile time and disk space that projects consume, allow for cross-compatibility between Rust and other programming languages, and increase the plausibility of Rust as a kernel-level language.

Implementation Proposal

So, what might this modularized ABI look like? Roughly speaking, an ABI would be defined by a series of macros in a crate which specify the layout of structs, enums, tuples, etc. according to that ABI. During compilation, while determining the layout of the data, the layout information provided by the ABI macros would be used. The end-goal would be for something like #[repr(RustABI)] or $ cargo build --release --abi rust-abi to be plausible.


What are your thoughts on this draft? Anything else to add / take away? Should we go into how the modularized ABI might work in more detail?

1 Like

I like everything you have there. I think it would be good to go into the ideas for how it would work that @daboross proposed.

If we bring this up in internals, we are going to want as much evidence and as many ideas as we can gather, both for a way forward and about the potential problems.

Maybe just add a horizontal rule or a heading before going into the implementation idea.

1 Like