The point is more that benchmarks are only good at measuring what they measure — tautological, but often forgotten — and that they routinely produce very different results from one another despite claiming to measure the same thing.
Here, sorting by 99th-percentile (64 byte) latency, for example, 10 of the top 11 frameworks are Rust, and the top one is an order of magnitude faster. But does that matter at all when we're talking about 1ms versus 0.1ms? Who cares, when the ping is 100ms? Throughput is also not hugely relevant for most services, since application code is going to dominate. Sure, you can write Java code with incredible throughput, but only by writing it as if you were writing C; the slightest mistake will silently drop you back to multi-second GC pauses every Nth request, and of course any public packages out there are completely out of the question.
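To make the "writing Java like you're writing C" point concrete, here's a hypothetical sketch (class and method names are mine, not from any benchmark): the same sum computed in idiomatic boxed style, which allocates on every iteration and feeds the GC, versus a preallocated primitive array that is allocation-free on the hot path.

```java
import java.util.ArrayList;
import java.util.List;

public class HotPath {
    // Idiomatic style: every int is boxed into an Integer object on the heap,
    // so a call produces roughly n allocations for the GC to clean up.
    static long sumBoxed(int n) {
        List<Integer> xs = new ArrayList<>();
        for (int i = 0; i < n; i++) xs.add(i);   // boxing allocation per element
        long total = 0;
        for (Integer x : xs) total += x;          // unboxing on every read
        return total;
    }

    // "Like C" style: one flat primitive array, preallocated and reused,
    // zero allocation per call — but it no longer composes with most libraries.
    static final int[] BUF = new int[1_000_000];
    static long sumPrimitive(int n) {
        for (int i = 0; i < n; i++) BUF[i] = i;
        long total = 0;
        for (int i = 0; i < n; i++) total += BUF[i];
        return total;
    }

    public static void main(String[] args) {
        // Same answer either way; only the allocation behavior differs.
        System.out.println(sumBoxed(1_000_000) == sumPrimitive(1_000_000));
    }
}
```

The second style is what high-throughput Java (e.g. low-latency trading code) tends to look like, and it's exactly as brittle as described: add one innocuous-looking boxed collection to the hot path and the GC pauses come back with no compiler error to warn you.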
Full applications, even small ones, written in an idiomatic, maintainable style are far more informative about the "performance of the language" — but that's both a lot of work, and now you're buying all the trouble of "no true Scotsman"-ing the specific implementations: "well of course it's slow if you do that; you would do this if you really cared about performance", all the way up to JNI calls into hand-crafted assembly.
Carefully crafted code running on an advanced JIT is often competitive with native code. There are a few reasons for this, but essentially there's a fairly hard floor on how good generated machine code can get, and if you avoid triggering the GC and stay on the JIT's fast paths, you're pretty much guaranteed to be running 100% JITed machine code. There are even edge cases where a JIT can beat a native compiler, because it has more information at runtime (vague example: this array is mostly cats, not dogs, so optimize for that).
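The "mostly cats" case can be sketched like this (a hypothetical illustration; the class names are mine). HotSpot profiles each call site, and an interface call that turns out to be almost always the same concrete type can be speculatively devirtualized: the JIT inlines the dominant implementation behind a cheap type-check guard, something an ahead-of-time compiler can't do without knowing the runtime type distribution.

```java
public class Speculate {
    interface Animal { int legs(); }
    static final class Cat implements Animal { public int legs() { return 4; } }
    static final class Dog implements Animal { public int legs() { return 4; } }

    static long countLegs(Animal[] zoo) {
        long total = 0;
        // After warmup, HotSpot's profile shows this call site is ~100% Cat,
        // so it can compile roughly: if (a instanceof Cat) { inlined Cat.legs() }
        // else { deoptimize / slow virtual dispatch }.
        for (Animal a : zoo) total += a.legs();
        return total;
    }

    public static void main(String[] args) {
        Animal[] zoo = new Animal[10_000];
        for (int i = 0; i < zoo.length; i++)
            zoo[i] = (i % 1000 == 0) ? new Dog() : new Cat();  // 99.9% cats
        System.out.println(countLegs(zoo));
    }
}
```

A static compiler sees only "some Animal" here and must emit a virtual dispatch; the JIT, having watched the actual traffic, can bet on Cat and fall back only when a Dog shows up.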