Intel CPU bug : what about Rust?


#1

Hello everyone, first post on this forum !

Some background to explain where my question comes from: I’m mostly a front-end developer; I do some backend work, but with “high-level” languages, mostly JavaScript and Clojure. I know very little about low-level stuff, memory management, how to write an OS, and that kind of thing. Also, I’ve never written a single line of Rust (yet).

My question now.

I’m wondering: since Rust seems to promise safe memory usage patterns through its concept of memory owners (I don’t remember the official name), and since the Intel bug seems to come from shared memory, if I understand it correctly, would the bug have happened in Linux if Linux were written entirely in Rust (with no use of unsafe Rust, at least in the parts relevant to this bug)?

For those not in the know, here’s the link to the original bug report/disclosure: https://www.theregister.co.uk/2018/01/02/intel_cpu_design_flaw/

Maybe a naive question, but I want to understand Rust more. I think this is a good real world case.

PS: Please, remember I’m no expert in low-level things, so if you need to go very technical, think about me and provide some explanations and/or links.

Thanks a lot in advance for your answers !

PS2: As I said, I know nothing about low-level stuff, but the language AND the community look very nice from the outside. Keep up the good work !


#2

The new attacks are side-channel timing attacks that target the speculative-execution components of modern CPUs. Because they are side-channel attacks (i.e., from outside the program flow), they are independent of the language in which the host operating system is written. One would have to control the attacker’s coding tools (which is essentially never possible) to prevent these specific attacks against existing hardware flaws.

Depending on the processor family, there may be software and/or firmware mitigations that the operating system can employ. For example, ARM has developed such mitigations (https://developer.arm.com/support/security-update) for those of its processors that it considers at risk.


#3

Thanks for the answer ! I found this Wikipedia page on timing attacks https://en.wikipedia.org/wiki/Timing_attack, in case someone is interested in the principle.
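
For a feel of the principle, here is a minimal Rust sketch. It is only an illustration under simplifying assumptions: the function names and the secret are made up, and a step counter stands in for elapsed time so the result is deterministic. It shows a classic software timing leak (an early-exit comparison), not the cache-timing technique behind Meltdown and Spectre, but the idea is the same: the observer learns something from how long the work took, not from the returned value.

```rust
/// Early-exit comparison: returns (equal?, bytes examined).
/// The number of bytes examined is a stand-in for elapsed time.
fn leaky_eq(secret: &[u8], guess: &[u8]) -> (bool, usize) {
    let mut steps = 0;
    if secret.len() != guess.len() {
        return (false, steps);
    }
    for (s, g) in secret.iter().zip(guess) {
        steps += 1;
        if s != g {
            // Stops at the first mismatch, so "time" reveals the matching prefix length.
            return (false, steps);
        }
    }
    (true, steps)
}

/// Comparison that always examines every byte, whatever the guess is.
fn steady_eq(secret: &[u8], guess: &[u8]) -> (bool, usize) {
    if secret.len() != guess.len() {
        return (false, 0);
    }
    let mut diff = 0u8;
    let mut steps = 0;
    for (s, g) in secret.iter().zip(guess) {
        steps += 1;
        // Accumulate differences instead of branching on them.
        diff |= s ^ g;
    }
    (diff == 0, steps)
}

fn main() {
    let secret = b"hunter2"; // hypothetical secret
    println!("{:?}", leaky_eq(secret, b"xxxxxxx"));
    println!("{:?}", leaky_eq(secret, b"huntxxx"));
    println!("{:?}", steady_eq(secret, b"xxxxxxx"));
    println!("{:?}", steady_eq(secret, b"huntxxx"));
}
```

With the leaky version, a better guess takes measurably more steps, so an attacker can recover the secret one byte at a time; the steady version does the same amount of work for every guess of the right length.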


#4

LLVM has patches to limit Spectre, so I assume Rust will get that for free once we upgrade to the relevant LLVM release.

I’m curious, though, how much Rust will be impacted by the performance cost of those patches:

When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.

However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
strongly recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.
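
In Rust terms, the indirect calls that retpoline targets show up mainly with trait objects (`dyn Trait`) and function pointers, while generic code is monomorphized into direct calls. The sketch below is only an illustration: the `Shape` trait and its implementors are invented for the example, and the optimizer is free to devirtualize or inline either version, so it shows the two dispatch styles rather than a guaranteed code-generation outcome.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Dynamic dispatch: each `area` call is looked up through a vtable,
/// i.e. an indirect call, the kind of call a retpoline replaces.
fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Static dispatch: one copy is compiled per concrete `S`,
/// so the call target is known at compile time.
fn total_area_static<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let mixed: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(3.0, 1.0))];
    println!("dyn:    {}", total_area_dyn(&mixed));
    println!("static: {}", total_area_static(&[Square(2.0), Square(3.0)]));
}
```

Code that leans on generics rather than trait objects in hot paths would presumably see less of the overhead described above, though that is a guess until someone measures it.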


#5

I’m thinking this would be relevant to redox-os. I’m no expert in kernel and low-level code, but I’m positive that an operating system written in Rust could have a better outcome in this situation.


#6

I’m a big fan of both Rust and low-level code, but trust me, the mess is really all on the hardware side for this one… :confused:


#7

This page has more complete information on the bug, including papers on the exploits (dubbed “Meltdown” and “Spectre”):

https://meltdownattack.com/

Edit: (ah, I figured Discourse would stop me if it was a duplicate link, but it looks like they bought multiple domains, and @epage already linked one of the others :stuck_out_tongue: )

I’m thinking this would be relevant to redox-os. I’m no expert in kernel and low-level code, but I’m positive that an operating system written in Rust could have a better outcome in this situation.

I’m no expert either, so this is the blind arguing with the blind, but I think you may be underestimating just how crucial a role the CPU plays with regard to kernel security.


#8

Let’s be careful about spreading “Rust could have prevented this” myths for every major vulnerability. It doesn’t improve our credibility, and the point is made sharply enough when we raise it only where it obviously applies.

In this case, Redox is just as vulnerable if it uses the intended (and up to now preferable) way to protect kernel memory from unprivileged processes, and it would need to implement just the same mitigation.


#9

An issue has already been opened for the Redox kernel.


#10

Sorry if my question induced this, it was really not the point. My conclusion is that the culprit is, as with all software, complexity. Intel CPUs became so complex that the complexity is beating them. As the title of this paper says, Hardware is the new software.

Thanks again for all your answers !


#11

That doesn’t account for the other CPU vendors with similar vulnerabilities. The underlying cause here is that none of these vendors employed design reviewers whose work history was one of developing ways to bypass security measures, such as select former employees of NSA’s TAO group. There are many ways to use information side-channels to attack hardware.


#12

Sorry, your sentence is a bit long, and I have trouble understanding it (I’m not a native English speaker). I understand that the design reviewers’ work is to bypass security measures, which doesn’t make sense. Could you rephrase your sentence please ?


#13

When reviewing a proposed design, it’s helpful to have people with experience in breaking security protections look over the design to see if there are ways to exploit it. People focussed on speed or features are often not so great at noticing security weaknesses.

@TomP is suggesting that CPU manufacturers have not brought security researchers into their design processes effectively.


#14

(My personal reaction to hearing about timing attacks is to put my fingers in my ears and sing as loudly as I can, so I can hardly criticize…)


#15

Thanks a lot @jimb, now it makes total sense. Thanks too @TomP, sorry for not saying thanks before. I’m learning a lot here.


#16

Well stated. Sorry that my earlier reply was so difficult to parse.


#17

As stated, the problem sits below the language level, in the hardware. So using Rust wouldn’t have helped.

But speaking with a friend (he’s a kernel developer), there were some wild ideas (mostly beer-style talk, not fully formed ideas) that it might be possible to flag certain areas of code as “OK to speculate” and “not OK to speculate” with special compiler intrinsics (which would control both compiler and CPU optimisations, just like there are special atomic types that disable some optimisations around memory synchronisation, reordering, etc). Mostly because there’s not much the CPU manufacturers can do about it (short of disabling the optimisations, which would make the CPU slow), so they’re likely going to document it as a feature and say it’s the compiler’s/programmer’s job to do it correctly.

My point there was that Rust’s type system might provide much more information to the compiler than C does, so maybe such annotations wouldn’t have to be applied manually. Or maybe not in Rust as it is today, but the general ideas might be extended to something similar to how thread safety guarantees work.
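
To make the idea a bit more tangible, here is a rough Rust sketch of what a type-level “not OK to speculate” marker could look like. Everything in it is an assumption for illustration: `NoSpeculate` and `speculation_barrier` are invented names, nothing like this exists in the language or standard library, and an `LFENCE` before the access is only one commonly suggested barrier on x86_64 (on other targets the sketch falls back to a compiler fence, which does not constrain the CPU at all).

```rust
/// Hypothetical barrier. On x86_64 this issues LFENCE, which is commonly
/// recommended as a way to stop later loads from running speculatively.
#[cfg(target_arch = "x86_64")]
#[inline(always)]
fn speculation_barrier() {
    // The `unsafe` block is needed on toolchains where the intrinsic is an `unsafe fn`.
    unsafe { core::arch::x86_64::_mm_lfence() }
}

/// Placeholder for other targets: this only restricts compiler reordering,
/// it is NOT a real speculation barrier.
#[cfg(not(target_arch = "x86_64"))]
#[inline(always)]
fn speculation_barrier() {
    std::sync::atomic::compiler_fence(std::sync::atomic::Ordering::SeqCst);
}

/// Hypothetical marker type: a value that should only be read behind a barrier.
struct NoSpeculate<T>(T);

impl<T> NoSpeculate<T> {
    fn new(value: T) -> Self {
        NoSpeculate(value)
    }

    /// The only way to reach the value: the barrier is issued first,
    /// so callers cannot forget the annotation.
    fn with<R>(&self, f: impl FnOnce(&T) -> R) -> R {
        speculation_barrier();
        f(&self.0)
    }
}

fn main() {
    let key = NoSpeculate::new(41);
    println!("{}", key.with(|k| k + 1));
}
```

The point is not that this particular wrapper is sufficient, but that the annotation lives in the type, much like `Send` and `Sync` do for thread safety, instead of being sprinkled by hand at every access.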

Sure, nothing concrete, but I believe if such needs arise, it could be solvable.


#18

The link in the original post in this thread now leads to a readable summary of the Spectre and Meltdown hardware bugs and of which devices are affected. These bugs can be exploited by rogue code, which would surely mark itself “okay to speculate”. Thus such ISA extensions would not provide more protection from rogue code.

It is possible to design hardware for existing ISAs that supports more thread parallelism, thereby reducing or eliminating the need for speculative execution and deep cache hierarchies. I gave one example of such an approach in a related thread in the internals forum. Rust’s ease of creating and managing lightweight threads makes it an appropriate language for such architectures.

I queried a colleague about my above-cited post. He has decades of experience in devising side-channel attacks as well as in designing systems that are more difficult to attack via side-channels. Among other things, he’s designed a number of FAA-certified flight-critical subsystems that are found on Boeing and Airbus commercial aircraft.

He’s the one who raised the suggestion of barrel processors. His reply, which he permitted me to post, was

Edit: Added some explanatory links.


#19

These bugs can be exploited by rogue code, which would surely mark itself “okay to speculate”.

This is what I argued too. But supposedly, the situation is like this:

  • Meltdown is fixable.
  • For Spectre, you need to run code in the same context as the memory you want to steal. That means either you persuade the program to run your code as part of itself (BPF, JavaScript in a browser), or you find code that suits your needs that is already there and just make the program call it. The first case can be validated (e.g. the browser sees the JavaScript and compiles it). But you still need to ensure that your own code cannot be misused. And that’s the code you’d annotate as “OK to speculate” or “Don’t speculate”.
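
For the “ensure your own code cannot be misused” part, one manual technique that exists today is index masking after a bounds check, similar in spirit to the Linux kernel’s `array_index_nospec` helper. Below is a minimal Rust sketch of it. The function names are made up, the mask arithmetic assumes indices and lengths fit in the positive range of `isize`, and `std::hint::black_box` is only a best-effort hint, so nothing here guarantees what machine code the compiler actually emits.

```rust
use std::hint::black_box;

/// All ones if `index < len`, zero otherwise, computed without a branch.
/// Assumes both values fit in the positive range of `isize`.
fn index_mask(index: usize, len: usize) -> usize {
    // Under that assumption, the top bit of `x` is set exactly when
    // `len - 1 - index` wrapped below zero, i.e. when `index >= len`.
    let x = index | len.wrapping_sub(1).wrapping_sub(index);
    // Invert, then arithmetic-shift the sign bit across the whole word.
    ((!(x as isize)) >> (usize::BITS - 1)) as usize
}

/// Bounds-checked read whose index is also clamped by data flow,
/// not only by the (possibly mispredicted) branch.
fn read_clamped(table: &[u8], index: usize) -> Option<u8> {
    if index < table.len() {
        // If the CPU speculates past the check with a bad index, the mask is
        // meant to turn that index into 0 before it is used for the load.
        // `black_box` discourages the optimizer from folding the mask away.
        let mask = black_box(index_mask(index, table.len()));
        Some(table[index & mask])
    } else {
        None
    }
}

fn main() {
    let table = [10u8, 20, 30];
    println!("{:?}", read_clamped(&table, 1));
    println!("{:?}", read_clamped(&table, 3));
}
```

This is exactly the kind of hand-applied fix that an “OK to speculate” / “Don’t speculate” annotation would ideally let the compiler insert on its own.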

But let’s see what the CPU manufacturers and kernel developers come up with first; designing languages specifically to cope with this is a bit premature :-).


#20

I personally do not have faith that every developer of a library routine that I might use will voluntarily give up performance in favor of security. I imagine that most developers would mark everything “OK to speculate” during development, and many would not revisit every point in their code where a security analysis was needed before they release the software. At least in my experience, schedule and performance trump safety and security in most non-life-critical code.

Re barrel processors, in case it’s not obvious, a typical hyperthreaded CPU is a 2-slot barrel processor.