Is this a fundamental limitation of LLVM, or of rustc, or is it just that it has not been considered?
It would be a handy feature, IMO. It is interesting that I have not seen this discussed in relation to Boost context/coroutine/fibre, since I would expect it to apply to any toolchain that cannot or does not treat TLS base addresses as volatile.
I don't know the situation on Windows, but on GNU, it is not supported to resume a coroutine with a native stack on a thread other than the one on which it was created or suspended. It is more or less encoded in the ABI that the address of a TLS variable does not change during the execution of a function.
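To illustrate why a cached TLS address goes stale after a thread switch, here is a minimal Rust sketch. The names `COUNTER`, `counter_addr` and `addrs_on_two_threads` are made up for illustration. It only shows that each live thread has its own instance of a thread-local at its own address; it does not reproduce the compiler caching itself.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical per-thread counter, used only for illustration.
    static COUNTER: Cell<u32> = Cell::new(0);
}

/// Returns the address of the calling thread's instance of COUNTER.
fn counter_addr() -> usize {
    COUNTER.with(|c| c as *const Cell<u32> as usize)
}

/// Returns (address on the calling thread, address on a freshly spawned thread).
fn addrs_on_two_threads() -> (usize, usize) {
    let here = counter_addr();
    let there = thread::spawn(counter_addr).join().unwrap();
    (here, there)
}

fn main() {
    let (here, there) = addrs_on_two_threads();
    println!("this thread:  {:#x}", here);
    println!("other thread: {:#x}", there);
    // Within one thread the address is stable for the thread's lifetime...
    assert_eq!(here, counter_addr());
    // ...but a second live thread has its own instance at another address.
    assert_ne!(here, there);
}
```

If a function computed `here`, was suspended, and was then resumed on the other thread, any code still using `here` would touch the first thread's data.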
A design which does not use native stacks for coroutines would not necessarily suffer from this; it could simply avoid caching TLS addresses across suspension points. Because such a coroutine never leaves a native stack frame suspended, it will also interoperate with the rest of the program even if that code caches TLS addresses.
On Windows it is explicit in the Fibre subsystem - but my question is
really about the compiler.
When you say 'A design', what are you referring to? The problem is
straightforward to avoid in user-level code (if care is taken), but the
issue, as far as I can tell, is the compiler retaining addresses that it
uses in its own runtime and caching them in the stack frame and/or in
registers.
I could not find any reference to disabling such optimizations for
Clang/LLVM or GCC, and yet there is quite a long history of GCC being
used with stackful coroutine libraries, including Boost.Context. Some
of these systems do allow for context/fibre migration between OS
threads. In C code that explicitly uses OS TLS access, perhaps it is
OK. In a language that manages exceptions, perhaps not.
So my concern is really whether this issue is also infecting C++ code
(and might infect native Kotlin, Scala etc if they mature).
It would be 'very handy' if Rust had an optimization switch that does
effectively what the 'fibre safe optimizations' switch (/GT) in MSVC
does, if indeed that is possible with LLVM.
It also seems unfortunate that the authors of the original link
suggested that it is a dead-end for coroutine support in Rust, since it
seems to me that the limitation is that coroutines must be scoped to OS
native threads and cannot switch between them. That does not completely
destroy their utility, even if it does limit their usefulness with
work-stealing thread pools.
By 'a design' I mean stackless coroutines, which the compiler recognizes due to a syntactic peculiarity (such as Python's yield keyword) and then transforms in a special way.
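To make the stackless case concrete, here is a hand-written Rust sketch of the kind of state machine such a transformation might produce. It is an illustration only, not what any particular compiler emits, and `Stepper`, `STEPS_HERE` and `run_across_threads` are hypothetical names. All coroutine state lives in the struct, and TLS is looked up afresh on every resume, so nothing carries a TLS address across a thread switch.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical per-thread tally of how many steps ran on this thread.
    static STEPS_HERE: Cell<u32> = Cell::new(0);
}

/// Hand-written stand-in for a compiler-generated stackless coroutine:
/// its state is a plain struct, not a suspended native stack.
struct Stepper {
    remaining: u32,
}

impl Stepper {
    /// Runs one step. The thread-local is resolved on every call, so no
    /// address from an earlier resume (possibly on another thread) is reused.
    /// Returns the current thread's tally after the step, or None when done.
    fn resume(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        STEPS_HERE.with(|s| {
            s.set(s.get() + 1);
            Some(s.get())
        })
    }
}

/// Resumes once on the calling thread, then moves the coroutine to a new
/// thread and resumes it twice more there.
fn run_across_threads() -> (Option<u32>, Option<u32>, Option<u32>) {
    let mut co = Stepper { remaining: 2 };
    let first = co.resume();
    let (second, third) = thread::spawn(move || (co.resume(), co.resume()))
        .join()
        .unwrap();
    (first, second, third)
}

fn main() {
    let (first, second, third) = run_across_threads();
    // The step on the new thread updates that thread's own tally.
    assert_eq!(first, Some(1));
    assert_eq!(second, Some(1));
    assert_eq!(third, None);
    println!("{:?} {:?} {:?}", first, second, third);
}
```

Each call to `resume` returns to its caller before the coroutine can migrate, so no native frame is ever suspended across the move.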
There is no way to disable this caching in GCC (or Clang, I presume). Libraries which pretend that cross-thread resumption is working are wrong and apparently have not received sufficient testing. Some special cases work, and it may be possible to stay within the supported area if you control the contents of call stacks at all points, but the GNU toolchain is simply not aware of TLS switching, and may introduce optimizations that are incompatible with this in ways that are not apparent from the application source code.