Better Inline ASM

I would love to see someone take inline asm over the line and into stabilization.

I have only used it a little bit, so I don't have a ton of specific recommendations, but there are a number of issues filed: Issues · rust-lang/rust · GitHub

Well, clang doesn't really check the string; it just pseudo-checks it as a side effect of modifying it. I'm not picky about using $ vs. % for parameters, though I'd prefer to stick with $ simply because that's what we already have. That doesn't prevent us from checking the string: we can still make sure all the parameters used in the template actually exist.
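
A minimal sketch of such a check, assuming LLVM-style `$N` parameters and `$$` as the escape for a literal dollar sign. The function name `undefined_operands` is hypothetical, and modifiers such as `${0:w}` are deliberately ignored here:

```rust
/// Returns every `$N` index used in `template` that does not refer to one of
/// the `operand_count` operands. `$$` is treated as a literal dollar sign.
/// (Hypothetical helper; `${N:mod}` forms are not handled in this sketch.)
fn undefined_operands(template: &str, operand_count: usize) -> Vec<usize> {
    let bytes = template.as_bytes();
    let mut missing = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] != b'$' {
            i += 1;
            continue;
        }
        // `$$` is an escaped dollar sign, not a parameter reference.
        if bytes.get(i + 1) == Some(&b'$') {
            i += 2;
            continue;
        }
        // Read the decimal digits that follow the `$`, if any.
        let start = i + 1;
        let mut end = start;
        while end < bytes.len() && bytes[end].is_ascii_digit() {
            end += 1;
        }
        if end > start {
            if let Ok(index) = template[start..end].parse::<usize>() {
                if index >= operand_count && !missing.contains(&index) {
                    missing.push(index);
                }
            }
        }
        // Continue scanning after the `$` and any digits that were consumed.
        i = end;
    }
    missing
}
```

With two operands, `undefined_operands("mov $0, $2", 2)` reports index 2 as missing, which is the kind of error that could be surfaced at compile time instead of as an LLVM assert.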

Also, most LLVM asserts have nothing to do with the template string itself and instead are because of mismatches between constraints and what we're passing to LLVM.

I wholeheartedly support this effort. Here are some initial thoughts.

Number one priority fix: don't silently drop invalid inline asm.

We should think about what kind of types we allow for register constraints. I think it makes sense to only allow integers, floats, char, and raw pointers, as opposed to the situation today where we allow anything that's the right size.

Can we do a clobber detection lint?

Deployment: will the asm! macro be able to always tell the difference between the old and new dialects? Should we start moving the current asm! macro to something else now so that asm! is free for use when we're done down the road?

Please, no. I heavily use register-specific constraints in my inline assembly:

  1. x86 has many instructions that only take arguments/put outputs in specific registers, and these are exactly the instructions you might want to use with inline assembly since they are not available through the normal language (here are but a few: DAA, AAA, XSAVE, XGETBV, CPUID, ENCLU, ENCLS, GETSEC).
  2. Calling conventions require placement of things in specific registers.

Yes, I know you can use general constraints and move things around as necessary, but this results in a performance overhead as well as extra assembly to parse through for humans when reading inline assembly.

But I think register-specific, any register, and any memory should be sufficient for most purposes. Regarding constraint language, don't use whatever GCC is doing with their one-letter names that are mostly non-obvious. The LLVM constraint language seems fine to me so far.

Yes please! Did you know that you're not allowed to modify an input register if it's not also an output register? Even if you specify it as a clobber? (Warning: Do not ...) This leads to stupid stuff like unused variables and having to load registers manually. I feel like all this should be explicit. Here are the cases we should at least support:

  • read-only input (backed by immutable variable)
  • read-write input (backed by immutable variable)
  • read-write input+output (backed by mutable variable) This option should not require specifying two separate but linked constraints.
  • write-only output (backed by mutable variable)
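
The four cases above can be modelled as a small enum, which makes the differences explicit. This is only a sketch: the type and method names are hypothetical and not part of any existing or proposed API.

```rust
/// The four operand kinds from the list above (names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OperandKind {
    /// Read-only input: the asm must leave the register untouched.
    ReadOnlyInput,
    /// Read-write input: the asm may overwrite the register, but the
    /// backing variable is never updated.
    ClobberedInput,
    /// Read-write input+output: one constraint, one mutable variable.
    InOut,
    /// Write-only output: the register's initial value is unspecified.
    WriteOnlyOutput,
}

impl OperandKind {
    /// Is the register loaded from the variable before the asm runs?
    fn reads_variable(self) -> bool {
        !matches!(self, OperandKind::WriteOnlyOutput)
    }

    /// May the asm body change the register's contents?
    fn may_modify_register(self) -> bool {
        !matches!(self, OperandKind::ReadOnlyInput)
    }

    /// Is the register stored back afterwards, so that the backing
    /// variable has to be mutable?
    fn requires_mut_binding(self) -> bool {
        matches!(self, OperandKind::InOut | OperandKind::WriteOnlyOutput)
    }
}
```

The second case is the one that is awkward to express today: `ClobberedInput` may modify the register without requiring a mutable (and otherwise unused) variable.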

If we're going to define our own template language, let's make sure that there are no conflicts with the underlying assembly language (example of a conflict: having to write all explicit registers with %% in GCC).

AT&T x86 assembly syntax also uses $, for immediates. One obvious alternative could be to mimic println! syntax and use {}, but ARM assembly uses those for lists of registers (as well as # for immediates).

Perhaps Rust should use %0/%[foo] like GCC, but pass through % when followed by a letter, which at least solves the problem for x86.
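
A rough sketch of that rule, assuming operands are substituted as plain strings: `%N` and `%[name]` are replaced, while `%` followed by anything else (such as a register name) is copied through unchanged. The function `expand_percent` and its signature are hypothetical.

```rust
use std::collections::HashMap;

/// Expands `%N` and `%[name]`; any other `%`, e.g. `%rax`, passes through.
/// (Hypothetical helper for illustration only.)
fn expand_percent(
    template: &str,
    positional: &[&str],
    named: &HashMap<&str, &str>,
) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = template;
    while let Some(pos) = rest.find('%') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        if let Some(body) = after.strip_prefix('[') {
            // Named operand: `%[name]`.
            let close = body
                .find(']')
                .ok_or_else(|| "unterminated `%[`".to_string())?;
            let name = &body[..close];
            let value = named
                .get(name)
                .ok_or_else(|| format!("unknown operand %[{}]", name))?;
            out.push_str(value);
            rest = &body[close + 1..];
        } else {
            let digits = after.bytes().take_while(|b| b.is_ascii_digit()).count();
            if digits > 0 {
                // Positional operand: `%N`.
                let index: usize = after[..digits]
                    .parse()
                    .map_err(|_| "operand index too large".to_string())?;
                let value = positional
                    .get(index)
                    .ok_or_else(|| format!("unknown operand %{}", index))?;
                out.push_str(value);
                rest = &after[digits..];
            } else {
                // Not a template parameter: keep the `%` as written.
                out.push('%');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    Ok(out)
}
```

With this rule, `addq %0, %rcx` needs no `%%` escaping for the explicit register, which addresses the x86 case but not necessarily other architectures.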

I think inline assembly is generally considered more esoteric and dangerous than it deserves to be, partly due to GCC's syntax overcomplicating things. I'd like to see a more radical change to a more regular syntax - get rid of the :, "r"(expr), and perhaps the quoting of assembly strings, and switch to something that looks more like the rest of the language. Not sure exactly what that would look like, though.

But what if you use inline assembly to do something really fast without checking a lot on a regular struct?
You would need a lot of mem::transmute to do that.

Regarding the rest I am pretty much in favour of what @jethrogb said.

I personally think we should go with {} as it's already the template syntax used elsewhere in the language. It readily gives us familiar syntax for positional and named parameters, and is also able to support LLVM's Asm template argument modifiers should we decide to support them.
In my experience {a, reg, list} is far less common in ARM assembly than %reg and $imm are in AT&T x86, and I expect it to be even rarer in inline assembly.

I tend to agree. Specifying output/input/clobber is something I have regularly seen trip up even people that are reasonably familiar with inline assembly. Remembering the order they have to go in is one annoyance; the sigils (=, &, +, %) used as constraint modifiers are another big one.
It might be worth reevaluating the syntax proposed in RFC PR #129.
An idea I have come up with is something along the lines of:

CLOB = "clobber(" CLOBBERSPEC ")"
DIRSPEC = "in" / "out" / "inout"
PARAM = CLOB / DIRSPEC "(" CONSTRAINEDSPEC ")" [ IDENT "=" ] EXPR
ASM = "asm!(" TEMPLATE *("," PARAM) ")"

Which ends up looking like:

asm!("
    movq {}, %rcx
    1:
    movb -1({lhs}, %rcx, 1), %al
    xorb -1({rhs}, %rcx, 1), %al
    orb %al, {res}
    decq %rcx
    jnz 1b
    ",
    in(r) count,
    in(r) lhs = &left,
    in(r) rhs = &right,
    inout(r) res = result,
    clobber(al, rcx, cc));

I'm not sure this is expressive enough in general though.
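
To make the grammar above concrete, here is a sketch of how its PARAM production could be represented as Rust data and printed back in the proposed syntax. All type and function names are hypothetical; this is not how any compiler implements it.

```rust
/// DIRSPEC from the grammar above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    In,
    Out,
    InOut,
}

/// PARAM from the grammar above: either a clobber list or a
/// direction, a constraint, an optional name, and an expression.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Param {
    Clobber(Vec<String>),
    Operand {
        dir: Direction,
        constraint: String,
        name: Option<String>,
        expr: String,
    },
}

/// Prints a parameter in the proposed surface syntax.
fn render(param: &Param) -> String {
    match param {
        Param::Clobber(regs) => format!("clobber({})", regs.join(", ")),
        Param::Operand { dir, constraint, name, expr } => {
            let dir = match dir {
                Direction::In => "in",
                Direction::Out => "out",
                Direction::InOut => "inout",
            };
            match name {
                Some(name) => format!("{}({}) {} = {}", dir, constraint, name, expr),
                None => format!("{}({}) {}", dir, constraint, expr),
            }
        }
    }
}
```

Rendering an `InOut` operand with constraint `r`, name `res`, and expression `result` gives `inout(r) res = result`, matching the example above.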

Are there any symbols that no assembly language uses? Here are all printable ASCII characters and what I know they're used for (based on GAS) (incomplete list). Some characters might appear multiple times.

& | ~ ^    bitwise operators?
* + - /    arithmetic operators?
< > =      comparison operators?
" '        strings
\          string escaping
# ; /      comments
$ % * ,    operands
@          ARM comment, MSP430 operands, IA-64 relocations
( ) ,      x86 SIB addressing (AT&T)
[ ] * +    x86 SIB addressing (Intel)
{ }        ARM register lists
A-Z a-z 0-9 _  label names
.          special directives
:          labels
?          TIC54X local labels https://sourceware.org/binutils/docs-2.26/as/TIC54X_002dLocals.html
!          ARM pre-indexing, Alpha relocations, PowerPC/Solaris comment
`          Epiphany statement separator, VAX displacement sizing

It seems like ` and ? might be good candidates.
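
As a quick cross-check of the table, the sketch below lists the printable ASCII punctuation characters that do not occur in a given set of "used" characters. The constant `TABLE_CHARS` is my transcription of the punctuation in the table above, and both names are hypothetical.

```rust
/// Punctuation that appears somewhere in the table above
/// (transcribed by hand; letters and digits are left out).
const TABLE_CHARS: &str = "&|~^*+-/<>=\"'\\#;$%,@()[]{}_.:?!`";

/// Returns the printable ASCII punctuation characters not found in `used`,
/// in ASCII order.
fn unused_punctuation(used: &str) -> Vec<char> {
    (0x21u8..=0x7e)
        .map(char::from)
        .filter(|c| !c.is_ascii_alphanumeric() && !used.contains(*c))
        .collect()
}
```

Going by the table, `unused_punctuation(TABLE_CHARS)` is empty: every punctuation character has at least one use somewhere, so the realistic goal is to pick characters whose uses are confined to less common targets.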

It's worth noting that it may well be possible to steal symbols only used by AT&T xor Intel syntax, if the Rust asm syntax converges on the other one. As nothing else seems to use ( ) or [ ] (respectively), that may be a compelling option.

Of course, the "third option" is to define a syntax more akin to Rust itself, perhaps taking advantage of the acceptance of "naked functions":

asm_fn!{
    foo(x: Reg<usize>, y: Mem<usize> ) {
        regs.rax = x ^ *y
    }
}

Perhaps, but I should note that if I found a usage in ARM/x86 I didn't look further for usage in other machines.

ARM uses [], PPC uses (). There are probably others.

I think inline asm should be kept as an opaque string that is passed directly to the assembler. In some of my inline asm I use assembler directives to perform operations like alignment, writing to other sections, etc. However I would be ok with changing the register substitution format from $0 to {0}, which matches the behavior of format!, including the ability to have named parameters.

And even though ARM does use {}, those can just be escaped using {{ and }} like in format!.
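
A small sketch of how that could work, assuming only positional `{N}` parameters and format!-style `{{` / `}}` escapes. The function `expand_braces` is hypothetical, and operands are substituted as plain strings for illustration:

```rust
/// Expands `{N}` with the N-th operand and turns `{{` / `}}` into literal
/// braces, as format! does. (Hypothetical helper; named parameters omitted.)
fn expand_braces(template: &str, operands: &[&str]) -> Result<String, String> {
    let mut out = String::new();
    let mut chars = template.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // Escaped braces are emitted as single literal braces.
            '{' if chars.peek() == Some(&'{') => {
                chars.next();
                out.push('{');
            }
            '}' if chars.peek() == Some(&'}') => {
                chars.next();
                out.push('}');
            }
            '{' => {
                // Collect the operand index up to the closing brace.
                let mut index = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(d) => index.push(d),
                        None => return Err("unterminated `{`".to_string()),
                    }
                }
                let n: usize = index
                    .parse()
                    .map_err(|_| format!("invalid operand index `{}`", index))?;
                let value = operands
                    .get(n)
                    .ok_or_else(|| format!("no operand {}", n))?;
                out.push_str(value);
            }
            '}' => return Err("unmatched `}`".to_string()),
            other => out.push(other),
        }
    }
    Ok(out)
}
```

An ARM register list is then written as `push {{r4, lr}}`, while `ldr {0}, [{1}]` substitutes operands; everything else, including assembler directives, is passed through untouched.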

We could do the opposite, where {{ and }} indicate some form of templating. I guess I didn't look into it, but I expect double brackets to not be used in ASM languages.

Actually, the rust lexer is UTF-8 aware. ASM languages are generally written in 7-bit ASCII. This means we could use characters such as « and » for templating. The drawback is that some keyboard layouts make it hard to enter those.

Link drop: Stabilization path for asm!()? - #9 by bascule - language design - Rust Internals

I like the idea of using {{ and }} as it looks like escaping to a different meta-level to me (some kind of nesting). Other combinations of <[{()}]> might work as well.

I disagree with « and » being a good idea. I think there's no non-US-ASCII-character around which is present on all common keyboard layouts and which doesn't look out of place (all non-ASCII, non-alphanumeric characters on a German layout: §, °, €, ´).

Maybe the (already obsolete) shell syntax of `` could be used. I don't remember that being used in assembly although I'm only used to minor x86 assembly.

I watched @Florob's talk last night and I've come to the conclusion that I like the DSL style best, because it's so easy to work with. However, as both D and MSVC show, it's a huge maintenance burden. Microsoft, one of the biggest companies in the world, couldn't even be bothered to include 64-bit support once they added that to their compiler. Therefore it seems unreasonable to think the Rust community would be able to get there (until such a time, of course, when all system assemblers are written in Rust).

What the talk didn't really touch on too much is intrinsics. Most people associate intrinsics with SIMD, but that's too limiting. MSVC supports a large set of x86 intrinsics that have nothing to do with SIMD, such as RDFSBASE, RDTSC, etc.: https://msdn.microsoft.com/en-us/library/azcs88h2(v=vs.80). GCC also has intrinsics, although they are called built-in functions: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/i386/ia32intrin.h.

Maybe we can get the best of these three worlds. I'm thinking of some strongly-typed intrinsics or domain-specific language, where missing instructions could be implemented using template-style inline assembly. This would limit the scope of review for the times where you do need to use templates, and is flexible and usable enough in other cases.

To stick with the cpuid example:

// presumably this one would come pre-defined in some x86 module
unsafe fn cpuid(mut eax: u32, mut ecx: u32) -> (u32, u32, u32, u32) {
    let ebx: u32;
    let edx: u32;
    asm!("cpuid", inout "eax" eax, out "ebx" ebx, inout "ecx" ecx, out "edx" edx);
    (eax, ebx, ecx, edx)
}

let (_, ebx, ecx, _) = cpuid(4, 0);
println!("L1 Cache: {}", ((ebx >> 22) + 1) * (((ebx >> 12) & 0x3ff) + 1) * ((ebx & 0xfff) + 1) * (ecx + 1));

Edit: this example doesn't really show completely what I had in mind, which would include traits for register classes, etc.
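
The cache-size arithmetic in the `println!` above can be separated from the asm itself, which makes it testable on any target. A sketch, where `cache_size_bytes` is a hypothetical helper and the EBX/ECX field layout is the one for CPUID leaf 4 that the expression above already relies on:

```rust
/// Decodes a cache size in bytes from the EBX and ECX values of CPUID leaf 4.
/// Every field is stored as "value minus one". The fields are widened to u64
/// so that `ecx + 1` cannot overflow; real cache geometries are far below the
/// range of u64.
fn cache_size_bytes(ebx: u32, ecx: u32) -> u64 {
    let ways = u64::from(ebx >> 22) + 1; // EBX[31:22]
    let partitions = u64::from((ebx >> 12) & 0x3ff) + 1; // EBX[21:12]
    let line_size = u64::from(ebx & 0xfff) + 1; // EBX[11:0]
    let sets = u64::from(ecx) + 1; // ECX[31:0]
    ways * partitions * line_size * sets
}
```

For example, an 8-way cache with 64-byte lines and 64 sets is encoded as `ebx = (7 << 22) | 63` and `ecx = 63`, which decodes to 32768 bytes. On x86_64 the register values could come from an intrinsic such as `std::arch::x86_64::__cpuid_count` rather than inline asm, which is the kind of pre-defined wrapper described above.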

I would agree that creating a Rust-specific asm DSL is way too labor intensive. Given that both GCC and LLVM have settled on mostly the same asm dialect, I think it makes a lot of practical sense to piggy-back on that.
Every C++ compiler out there includes some implementation-defined extensions. IMO, we should accept that there is going to be some of that in Rust too.

I am not so sure.

I think that the best source for ideas here is, ironically, the Lua community. LuaJIT is a blisteringly fast JIT compiler for Lua. It has been used for things like packet processing systems, where one would not normally expect to find a dynamic language. It also comes with its own assembler DSL (DynASM), which has inspired a Rust version. There is also a WIP patch that provides LuaJIT with user-definable intrinsics, which could provide a source of ideas for what user-definable intrinsics could look like at an extreme bare-bones level. Rust being AOT-compiled rather than JIT-compiled, there are a few more issues (mostly related to cross-compilation), but I think this could be a useful source of ideas.
