I found the following image. Is it real? If yes, how is it calculated?
I don't know specifically how it was calculated, but this data is from this paper https://doi.org/10.1145/3136014.3136031 so it should hopefully explain that.
Ditto for Pascal vs C.
I refrained from it, but I initially wanted to make a tongue-in-cheek comment along the lines of "everything you read on the internet is true". This sort of blanket comparison of "language performance" is usually not helpful because it carries so little context it quickly becomes meaningless.
The interesting part is basically everything else: the language design, the tooling, the culture, which all affect performance characteristics of course, but they affect so much more. My experience is that great language design and a great culture (i.e., the general desire of the community to write high-quality code) are much more essential to getting good performance on average in real-world tasks than any particular optimizer excelling at a specific microbenchmark.
In my field, bioinformatics, a lot of shitty C and C++ code is being written. People who write bioinformatics tools are usually biologists who happen to know some programming, and not programmers who happen to know some biology. But woe betide you if you try to write C or C++ without actually knowing the language! The result will inevitably be slow, incorrect, and generally subpar code.
I once discovered a bug (still not fixed after several years) in a widely used phylogenetic profiling tool. It segfaults if you give it a large-ish (but entirely realistically sized) input, because it tries to allocate the whole thing on the stack, using a feature that isn't even officially in the language (VLAs in C++). Naturally, it does a lot of unnecessary allocations and copying too, and who knows what else. Instead of being blazing fast and using the resources of the computer efficiently, it exhausts its stack and dumps core. Had it been written in a "slower" language like Go or Java by a more competent programmer, it could well have been both more efficient and correct.
Of course, it may be easier to write fast code in C or Rust than achieving the same performance in Java or Go when you need to fight the GC. But that doesn't mean C and Rust "are faster" than Java or Go. Programming languages are tools. Some languages give better tools for achieving high performance. Some languages sport better features for catching more bugs. Some languages have better tooling to facilitate code reuse, metaprogramming and profiling. And I do believe that Rust excels in all these fields and is a great choice for many kinds of projects. But it's by no means an automatic guarantee that anyone coding in Rust will do better than other programmers using other languages.
Supposedly they used Intel's RAPL interface to get energy readings from the CPU.
Well, what do you know, the energy consumed by a program is proportional to the amount of time it runs. Who would have guessed?
With any luck the United Nations IPCC will make using Python illegal in their quest for net-zero
Not sure if all CPU instructions have a constant ratio of runtime and energy consumption.
I'm pretty sure they don't. It depends on how many transistors have to light up to get the work done, how many bit flips happen. But I think it's a good ballpark assumption.
That is an interesting question though. Given some operation, what is the most energy-efficient way to get it done? That may not be the same as the fastest way, and it may not be the same as the way that uses the least memory. Typically compilers optimise for the fastest execution, not for minimal energy consumption.
For example if one wants to clear a register in x86 one could do the obvious thing and move an immediate zero into it:
MOV AX, 0
or one could do the less obvious and exclusive OR the register with itself:
XOR AX, AX
Which I believe is what compilers often do, because it saves on instruction bytes or is faster. But what if the MOV version used less energy?
What if using regular, scalar, CPU arithmetic operations was more energy efficient than using vector operations?
Clearly a "green", socially responsible, net-zero targeting compiler should optimise for energy consumption not performance.
There is a whole new avenue of research for compiler optimiser authors here!
The data in the article is from the Computer Languages Benchmarks Game, which is notoriously unreliable.
If you really care about performance and energy usage for computationally intensive tasks, why aren't you using an accelerator? GPUs have much better performance per watt than CPUs. This is like arguing about the efficiency of an SUV vs a minivan when you could use a truck.
I guess because a lot of computationally intensive tasks are not helped by the kind of accelerator we find in GPUs.
If I could get my server farm to consume 500 megawatts instead of a gigawatt in return for average response times increasing by 20%, that might be a very economically attractive proposition. And quiet the greens down a bit as well.
The benchmarks game is targeting stuff that a GPU or FPGA would crush (e.g. making a picture of the Mandelbrot set, computing digits of pi). It's not a good proxy for teasing out the comparative advantage of general-purpose CPUs.
Oddly enough, computing the digits of Pi does not seem to be helped by GPUs or even multiple compute nodes. See the description of Google's hardware for computing 100 trillion digits of Pi: https://cloud.google.com/blog/products/compute/calculating-100-trillion-digits-of-pi-on-google-cloud
These are the problems in the test - since most seem to read the comments but not the paper
The benchmark isn't to compute a world record. It is for a much smaller size.
I would add that benchmarking energy consumption (and more generally resources, carbon footprint, etc.) is useful if you are able to compare different solutions or implementations at the appropriate scale. For example, total energy consumption of a startup with two different setups that provide exactly the same service.
Yes, yes, computing Pi to a lesser number of digits can be as simple as 22/7.
What I was getting at is that some problems are not helped by enormous parallelism or GPUs and such. They are essentially serial.
It can also be the case that sending a problem to the GPU would cost more time and energy in setup and communication (all of the work done in the GPU driver code) than computing the answer on the CPU would — and it's usually higher latency. Energy-efficient programming can often be more about “handle this event/request quickly and go back to sleep” than “maximum throughput per watt”.
That's certainly the case in many domains, even in some HPC applications, but there is a reason Frontier (and most of the TOP500) is a GPU machine.
The point I was trying to make was that the Benchmarks Game is not a great way to measure energy efficiency. The most competitive solutions to some of its problems can be solved with fewer watts on different hardware.
If you want to know how to save power when you're serving web pages, unless you are running the web page for FRACTINT, is the speed at which a program in your favorite language can draw a fractal a reasonable measure?