I am trying to cross-compile a program for aarch64. As you may know, newer ARMv8 versions also include some optional hardware features, such as MTE (Arm Memory Tagging Extension | Android Open Source Project), which tags pointers and memory to improve memory safety for traditionally unsafe languages, provided compilers implement support for it. I am working on extending one such compiler and would like to add the following code:
asm!("st2g {tag}, [{addr}]", tag = in(reg) tagged_ptr, addr = in(reg) addr);
You don't need to know what st2g does, just that it is an MTE-specific instruction unique to newer ARM hardware, but it should be replaced with/executed as nops if not available. However, when I try to compile the project for the aarch64 Linux target with:
cargo build --release --target aarch64-unknown-linux-gnu
and this gives me the error:
error: instruction requires: mte
--> crates/runtime/src/mmap.rs:630:19
|
630 | asm!("st2g {tag}, [{addr}]", tag = in(reg) tagged_ptr, addr = in(reg) addr);
| ^
|
note: instantiated into assembly here
--> <inline asm>:1:2
|
1 | st2g x10, [x11]
| ^
The build fails with the error message instruction requires: mte. I find this quite confusing. I am compiling from an x86 Linux machine, so I obviously don't have mte on my system. But I shouldn't need it to be able to cross-compile for aarch64, right? Furthermore, how does the Rust compiler even know about mte? To me, this sounds like I have to enable some cargo flag like --use-mte. Is this true?
Any help would be appreciated. Thanks!
My experience here is limited but I believe you will need to use target features. (Or perhaps a more specific target CPU.)
The documentation for std::arch has some additional details on how you can dynamically detect features, which it sounds like you're trying to do.
Enabling target features sounds like the right way to go. I am just unsure how I would enable +mte in a cargo build command. cargo build --release --target aarch64-unknown-linux-gnu -C target-feature=+mte doesn't work.
Thanks a lot for the help! Worked like a charm, the following compiles now with just
cargo build --release --target aarch64-unknown-linux-gnu
after adding
#[cfg(all(target_arch = "aarch64", target_os = "linux"))]
#[target_feature(enable = "mte")]
Maybe this is a bit of a dumb question, but is there a difference between:
#[cfg(all(target_arch = "aarch64", target_os = "linux", target_feature = "mte"))]
and
#[cfg(all(target_arch = "aarch64", target_os = "linux"))]
#[target_feature(enable = "mte")]
?
Intuitively, I would think that it works like this: The first version conditionally compiles the arm code block only if all three (arch, os, and feature) are supported/present. The second version only compiles the arm code block/function if the arch and os are present/supported, and additionally enables the mte feature?
I am mainly asking this question for the following reason: I want to be able to cross-compile from a different architecture (x86 Linux) to aarch64 Linux, and that works with cargo build --release --target aarch64-unknown-linux-gnu, but I'm not sure how I would also compile for the target feature "mte". Or does the target feature mean: if the machine you are compiling this on has mte enabled, we choose this block? That would never be the case for me on x86, so I'm looking for some flag in cargo to do this. Or do I just have to use the second version, since that seemed to work for me?
The first version means that the item it's applied to will only be compiled when the target feature has been enabled globally.
The second version forcibly enables the target feature for only the function it's applied to. You are responsible for only calling the function when the specified feature(s) are enabled on the current hardware.
If you wanted to use the first version, you would need to pass -C target-feature=+mte to rustc when building your program to enable the feature for the whole program. The documentation on that flag is here:
https://doc.rust-lang.org/rustc/codegen-options/index.html#target-feature
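As a minimal sketch of what "enabled globally" means for the first version: the cfg below is resolved from the flags passed to rustc, not from the machine doing the build or the machine that later runs the binary. The function name selected_variant is hypothetical and only exists for illustration.

```rust
/// Hypothetical example: this variant is compiled only when `mte` was
/// enabled for the whole crate (e.g. via `-C target-feature=+mte`).
#[cfg(all(target_arch = "aarch64", target_os = "linux", target_feature = "mte"))]
pub fn selected_variant() -> &'static str {
    "mte"
}

/// Everything else gets the fallback, including aarch64 Linux builds that
/// were compiled without the global `mte` target feature.
#[cfg(not(all(target_arch = "aarch64", target_os = "linux", target_feature = "mte")))]
pub fn selected_variant() -> &'static str {
    "fallback"
}
```

With a plain cargo build --release --target aarch64-unknown-linux-gnu, the fallback is the one that gets compiled, because the default target does not enable mte. The host being x86 plays no role in that choice.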
Since it sounds like you are trying to dynamically choose between the mte version and a fallback implementation at runtime, I think you want the second version. Forcibly enabling the feature for the duration of a function allows you to emit the instructions you need for the optimized version without requiring every machine that runs the program to have that feature enabled. Then you just need to do a runtime check for the feature to choose between those implementations, as described in the std::arch docs I linked above.
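A rough sketch of that runtime-dispatch pattern, reusing the st2g line from the original question. The names Granules, store_tags, and try_tag are hypothetical, and the MTE path is an untested assumption on my part: it presumes st2g is harmless on memory that was not mapped with MTE enabled, and it uses the untagged address as the tag operand purely for illustration.

```rust
#[cfg(all(target_arch = "aarch64", target_os = "linux"))]
use std::arch::asm;

/// Hypothetical buffer type: two 16-byte MTE granules, aligned so that
/// `st2g` accepts the address.
#[repr(C, align(32))]
pub struct Granules(pub [u8; 32]);

/// MTE is force-enabled for this one function, so the assembler accepts
/// `st2g` even though the crate as a whole is built without `+mte`.
///
/// # Safety
/// The caller must have checked at runtime that the CPU supports MTE, and
/// `addr` must be 16-byte aligned and point to 32 writable bytes.
#[cfg(all(target_arch = "aarch64", target_os = "linux"))]
#[target_feature(enable = "mte")]
unsafe fn store_tags(tagged_ptr: *mut u8, addr: *mut u8) {
    unsafe {
        asm!("st2g {tag}, [{addr}]", tag = in(reg) tagged_ptr, addr = in(reg) addr);
    }
}

/// Returns true if the MTE path ran, false if the fallback was taken.
#[cfg(all(target_arch = "aarch64", target_os = "linux"))]
pub fn try_tag(buf: &mut Granules) -> bool {
    // Runtime check: only call the MTE function when the CPU reports MTE.
    if !std::arch::is_aarch64_feature_detected!("mte") {
        return false;
    }
    let addr = buf.0.as_mut_ptr();
    // SAFETY: MTE was detected above; `Granules` guarantees size and alignment.
    unsafe { store_tags(addr, addr) };
    true
}

/// Fallback for every other target: MTE is never used.
#[cfg(not(all(target_arch = "aarch64", target_os = "linux")))]
pub fn try_tag(_buf: &mut Granules) -> bool {
    false
}
```

A caller would write something like let used_mte = try_tag(&mut buf); and carry on either way. The same binary can then run on aarch64 machines with and without MTE, which is the point of the second version.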
Is it possible to pass -C target-feature=+mte to cargo as well?
Ok, thanks a lot for the help!
One more question:
If we write a function ourselves that has two variants, one MTE variant and one for everything else, but we are only marking one as MTE because we know it contains MTE-specific code, not because it otherwise wouldn't compile (e.g. it doesn't contain st2g asm! calls, which I cannot compile without specifying mte), then I would argue that
#[cfg(all(target_arch = "aarch64", target_os = "linux", target_feature = "mte"))]
fn some_function(&mut self, start: usize, len: usize) -> Result<bool> {
/// We don't support MTE on non arm64 linux
#[cfg(not(all(target_arch = "aarch64", target_os = "linux", target_feature = "mte")))]
fn some_function(&mut self, start: usize, len: usize) -> Result<bool> {
makes more sense than
#[cfg(all(target_arch = "aarch64", target_os = "linux"))]
fn some_function(&mut self, start: usize, len: usize) -> Result<bool> {
if !std::arch::is_aarch64_feature_detected!("mte") {
return Ok(false);
}
/// We don't support MTE on non arm64 linux
#[cfg(not(all(target_arch = "aarch64", target_os = "linux")))]
fn some_function(&mut self, _start: usize, _len: usize) -> Result<bool> {
Ok(false)
}
because all the separation logic is done at compile time, given that we are compiling with
[target.aarch64-unknown-linux-gnu]
rustflags = ["-C", "target-feature=+mte"]
Do you agree?
If you compile with the target feature enabled via rustflags, it is unsafe to run the built program on hardware without that feature (at least in theory; some features may be specifically designed to be backwards compatible in some way).
Unless you are
- targeting hardware that always has the feature, or
- distributing one binary with the feature and one without it, and requiring the user to run the correct one,
you almost certainly don't want to enable the feature globally like that.