Because distributed programming is hard, I am trying to figure out the max performance we can get out of a single machine.
Of all the architectures that Rust targets (and that also run Linux), what is the most powerful machine we can get? Is it:
(1) dual-socket AMD Epyc
(2) Ampere Altra
(3) something else ... ?
For simplicity, let's rule out InfiniBand + RDMA. I want something where we can easily use AtomicU64 and have it visible to all nodes/threads of the single machine.
Also, let us rule out GPUs. We're not doing floating-point crunching. We do not have that level of data parallelism.
Primarily, I am interested in the # of cores we can have, subject to the constraint that all cores can see all memory and access shared AtomicU64's.
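To make the constraint concrete, here is a minimal sketch of what I mean by "all cores can see a shared AtomicU64". The function name `count_with_threads` and the iteration counts are made up for illustration, and using one thread per available core is just an assumption about how the workload would be laid out:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: spawns `workers` threads that each increment
/// one shared counter `per_worker` times, then returns the final value.
fn count_with_threads(workers: usize, per_worker: u64) -> u64 {
    let counter = AtomicU64::new(0);

    // Scoped threads may borrow `counter` directly; all of them are
    // joined before `thread::scope` returns.
    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| {
                for _ in 0..per_worker {
                    // Relaxed is enough for a plain counter; the join at
                    // the end of the scope makes the final value visible.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });

    counter.load(Ordering::Relaxed)
}

fn main() {
    // One thread per hardware thread the OS reports, falling back to 1.
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    let total = count_with_threads(workers, 100_000);
    println!("{workers} threads counted to {total}");
}
```

Any machine where this works across every core, without a network hop in between, fits what I am asking about. How well a single hot counter like this scales on a many-socket box is a separate question.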
The s390x (IBM Z, the successor to System/390), Power, and RISC-V architectures all have Linux targets on the Tier 2 list, so you should be able to find a mainframe-class computer that can run both rustc and programs implemented in Rust (though there may be some quirks/bugs).
This support is for Linux running in a VM (what they used to call an LPAR, I think) on the mainframe, so it will look a lot like any other 64-bit Linux server. Rust targeting z/OS would be quite different.
What is a "single machine"? The most performant computer in the world at the moment seems to be Supercomputer Fugaku (A64FX 48C 2.2GHz, Tofu interconnect D), listed on TOP500 at 16 PFlop/s. You would probably describe it as a rather large number of smaller computers, but it counts as one machine. And it runs Red Hat Enterprise Linux as its operating system, so it's probably running Rust. At least some rg (ripgrep) somewhere in some administrative script ...