[FFI, Internals] "Upgrading" a C thread to a Rust thread


#1

I’m experimenting with writing a game engine which uses both Elixir (for long-term, asynchronous logic) and Rust (for per-tick, performance-sensitive things like rendering, input and physics). I want the fastest interface possible between the two, so ideally I want the two running in the same process, with direct access to one another.

Fortunately, Elixir (and the BEAM) has the concept of NIFs, which are essentially arbitrary linked native code exposed as normal functions to Elixir code. The excellent Rustler library makes writing these very easy. I simply have a NIF call that spawns a Rust thread, which serves as the main “Rust” side of the application.

The problem I have run into is that on macOS, where I’m working, all GUI and input work must be done on the main thread, that is the exact thread on which the program entry point was initially called (this is in contrast to other systems, where usually the GUI/event loop is single-threaded, but can occur on any designated thread). This is an issue, as the main thread is owned by the BEAM (Erlang / Elixir virtual machine).

Now, Erlang includes built-in bindings for wxWidgets and they have run into the exact same problem. Since the BEAM does pretty much all of its work outside the main thread, spawning scheduler threads to actually execute code, they have implemented a way for linked code on macOS to “steal” the main thread, emulating a thread spawn but actually executing the code on the main thread. On macOS, the main thread, once setup is complete, sits around and waits to read a function pointer and an argument off a pipe which a NIF could write to, and will execute that function pointer once it receives this. Once that function returns, it will place the result into another pipe, allowing another thread to “join” that main thread again.
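
To make the hand-off concrete, here is a minimal sketch of the same idea in plain Rust. It is a model only, not the BEAM's actual code: channels stand in for the two pipes, a boxed closure stands in for the function pointer plus argument, and the names `Job`, `main_thread_loop` and `run_demo` are made up for illustration.

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for the (function pointer, argument) pair written to the pipe.
type Job = Box<dyn FnOnce() -> i32 + Send>;

/// Runs on the "main" thread: wait for a job, run it, hand the result back.
/// Returns once every job sender has been dropped.
fn main_thread_loop(jobs: mpsc::Receiver<Job>, results: mpsc::Sender<i32>) {
    for job in jobs {
        let value = job();
        if results.send(value).is_err() {
            break; // nobody is waiting to "join" any more
        }
    }
}

/// Wires the two sides together; the caller plays the main thread.
fn run_demo() -> i32 {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let (result_tx, result_rx) = mpsc::channel::<i32>();

    // Stand-in for a scheduler thread executing a NIF.
    let nif = thread::spawn(move || {
        // "Steal" the main thread by handing it some work.
        job_tx.send(Box::new(|| 6 * 7)).unwrap();
        // "Join" it again by waiting for the result.
        result_rx.recv().unwrap()
    });

    main_thread_loop(job_rx, result_tx);
    nif.join().unwrap()
}
```

Calling `run_demo()` from the thread that should act as the main thread executes the closure there, while the spawned thread only submits it and waits for the result.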

I haven’t written the code to actually implement this yet, but I see a large problem to overcome with this approach: Rust libstd code is never run on that thread to set up things like the stack guard and thread info. I can’t do that manually since the relevant code is private, and even copying it won’t work since the thread info thread-local struct is defined in a private module. I’ve read that running Rust libstd code on a non-Rust thread is UB, and it seems to me that this missing setup is precisely why, and that there shouldn’t be any other reasons for it.

I’d love to get some opinions on this situation. I suppose my questions are:

  1. Is there any way to get around this and do all the necessary bookkeeping on a thread not started by Rust for it to function as a normal Rust thread?
  2. If not, is there any good reason for the Rust standard library to not expose some set of unsafe functions to manually upgrade a thread to a Rust thread?
  3. Am I being totally crazy for wanting to do this? Is there something I’m overlooking which means I definitely shouldn’t try to turn a thread into a normal Rust thread after creation?
  4. If this just isn’t going to happen, does anyone know of perhaps a sane way to load another binary (say the BEAM) into your address space and run it on a separate thread from a Rust process?

If I can’t get this to work, I’ll have to rearchitect my code and use some sort of IPC between the BEAM and the main Rust binary, which I am loath to do as it requires me to serialise everything and increases the latency between the two systems.

Any help or thoughts would be appreciated.


#2

Where have you read this? This works fine. For example, you can compile Rust to a dynamic library and link to it dynamically from any language. In such a situation, there is zero control over the threads Rust code runs on. Rust doesn’t have a runtime and there is nothing that needs to be set up when a thread spawns. Indeed, you won’t be able to name arbitrary threads, but that’s about it. There is nothing special you have to do to run Rust code on any thread. If the system Elixir/wxWidgets uses works for C, it will work for Rust.
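
As a rough illustration, a Rust function with a C-style `void *(*)(void *)` shape can be handed to foreign code as a function pointer and run on whatever thread that code chooses. The name `engine_entry` and the convention that the argument is a boxed `u32` are assumptions made up for this sketch; the one real rule it shows is that a panic should be caught before it can unwind into the C caller.

```rust
use std::ffi::c_void;
use std::panic;

/// Hypothetical entry point; `arg` is assumed to come from
/// `Box::into_raw(Box::new(some_u32))`. Returns a boxed `u32`, or null on panic.
pub extern "C" fn engine_entry(arg: *mut c_void) -> *mut c_void {
    // Catch panics here so they never unwind into the foreign caller.
    let outcome = panic::catch_unwind(|| {
        // SAFETY: relies on the caller's promise described above.
        let ticks = unsafe { Box::from_raw(arg as *mut u32) };
        // libstd is usable here even if std did not spawn this thread;
        // such a thread simply has no name.
        let _who = std::thread::current();
        *ticks * 2
    });
    match outcome {
        Ok(value) => Box::into_raw(Box::new(value)) as *mut c_void,
        Err(_) => std::ptr::null_mut(),
    }
}
```

The caller owns the returned pointer and has to turn it back into a `Box<u32>` (or pass it to a matching Rust free function) to release it.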


#3

Looking back through my history, I think this is where I got that idea, but looking at it now, it was written in 2014. As I understand it, threads work differently now.

So looking through the Rust libstd code, it also seems that if I don’t use the Rust methods to create a thread, the stack guard and stack overflow handler won’t get set on that thread. Is this a big deal, or is there a way to cause those to be set? Is the only downside of this that if I do get a stack overflow I’ll get a segfault instead of a nice error message?


#4

Yes, unfortunately, it’s best to disregard any online information about Rust from before 2015.

I don’t think there’s a way to set that manually. The threading implementation is responsible for setting up the stack guard. For example, when using pthreads, you can use pthread_attr_setguardsize. Since Rust is not handling the threads, it shouldn’t be touching the stack.

Pretty much.


#5

The handler is a nice-to-have, but the guard page itself is kind of important. With a guard page, you’re guaranteed to segfault if you run out of stack. Without a guard page, you might corrupt the heap.


#6

Regarding the GUI, I am confused: are you implementing a game engine that runs as a back end, or front-end software?

As for NIFs (if I remember correctly), these are blocking routines and should not be used for long-running operations: the BEAM cannot count the operations performed inside a NIF, so the scheduler will fail to hand CPU time to the next micro-task in time.

As for linking logic to the BEAM, one should use port drivers (http://erlang.org/doc/reference_manual/ports.html), which interface with the BEAM's event system. Here you would create a Rust background thread that processes the state.

As for serialization, the message tuples sent between Erlang micro-tasks are always serialized, so each micro-task can perform GC independently. All but binary data, that is! Binary data is ref-counted and shared between Erlang micro-tasks. If you want performance/high bandwidth between micro-tasks, exchange binary data (raw arrays) between them, for example data being forwarded from one network socket to another.

For large data items exchanged with Rust code via ports, use binary data.


#7

Ah, I didn’t actually realise this. I thought that this was something special that Rust did, not necessarily something which any threading implementation does. In that case, I would assume that the BEAM would either have already done this, or has some C function I can call to do so myself. It seems that when creating new threads from Rust, Rust directly uses pthread functions to obtain the correct parameters to set this up for new threads, so the fact that the thread it’s running on is not a “Rust” thread should not prevent it from being able to correctly create new Rust threads.
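
For what it's worth, this is roughly what spawning such a thread looks like with `std::thread::Builder`. It is only a sketch: the helper name `spawn_worker`, the thread name and the stack size are arbitrary choices for illustration. Nothing in it depends on how the calling thread itself was created.

```rust
use std::thread;

/// Spawn a named worker with an explicit stack size, wait for it, and
/// return the name the worker observed for itself.
fn spawn_worker(stack_bytes: usize) -> std::io::Result<Option<String>> {
    let handle = thread::Builder::new()
        .name("engine-worker".to_string())
        .stack_size(stack_bytes)
        // The closure runs on the new thread, which std has set up itself.
        .spawn(|| thread::current().name().map(str::to_owned))?;
    Ok(handle.join().expect("worker panicked"))
}
```

A thread created this way is an ordinary Rust thread with a name and a `JoinHandle`, regardless of what kind of thread called `spawn_worker`.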

I’m writing a local game engine, not something networked. Rust handles performance-sensitive tasks like graphics, physics, input, etc., and I plan to use Elixir to write logic like AI, NPC routines, weather, and general things which have more to do with simulating the world than getting it onto a screen 60 times a second.

There’s actually been less and less difference between port drivers and NIFs as time goes on. You’re right in that port drivers have some extra conveniences around long-running logic; however, this mostly applies to their original application, which is handling I/O. To clarify, what’s happening is that I’m executing a NIF from Elixir. All this does is start up the main Rust game system on another thread (possibly saving off some handle or channel for later communication) and then return. The long-running Rust logic is on a different thread. Then, when Elixir wants to communicate something to the Rust engine, it can call another NIF, while Rust can use a NIF API function to asynchronously send messages to Elixir processes at will (even from other threads which are not performing a NIF call).
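
A stripped-down sketch of that shape, in plain Rust with no Rustler involved: `start_engine` and `send_event` stand in for the two NIFs, the `Event` type is invented for the example, and the `replies` channel is only a placeholder for sending a message to an Elixir process.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Mutex, OnceLock};
use std::thread;

/// Hypothetical events that Elixir might push to the engine.
enum Event {
    Input(u32),
    Shutdown,
}

// Handle saved by the "start" call so that later calls can reach the engine.
static ENGINE: OnceLock<Mutex<Sender<Event>>> = OnceLock::new();

/// What the first NIF would do: spawn the engine thread and return at once.
/// Returns false if the engine was already started.
fn start_engine(replies: Sender<u32>) -> bool {
    let (tx, rx) = mpsc::channel();
    if ENGINE.set(Mutex::new(tx)).is_err() {
        return false;
    }
    thread::spawn(move || {
        // Long-running engine loop, entirely off the calling thread.
        for event in rx {
            match event {
                Event::Input(n) => {
                    let _ = replies.send(n + 1);
                }
                Event::Shutdown => break,
            }
        }
    });
    true
}

/// What a later NIF would do: enqueue an event without waiting on the engine.
/// Returns false if the engine is not running.
fn send_event(event: Event) -> bool {
    match ENGINE.get() {
        Some(tx) => tx.lock().unwrap().send(event).is_ok(),
        None => false,
    }
}
```

Both calls return immediately, so neither would hold up a BEAM scheduler for longer than it takes to push onto a queue.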

You’re right about binary data. In this case, I suspect I will not be sharing a large volume of data between the two systems; most heavyweight stuff like graphics, audio, etc. will stay on Rust’s side. What I do expect is a lot of calls though, so latency is something I’m more worried about than e.g. memory footprint. Messages between Erlang processes are copied, that is true, though when a NIF is called I don’t think its arguments are copied (whereas they would be if the call were happening as a remote message of some kind). My concern about serialisation when using IPC between the two systems is two-fold:

  1. The messages would be serialised into a wire format like ETF to actually be sent back and forth. Currently, Elixir has no serialisation overhead, and Rust can performantly convert the BEAM types into something Rust can use.
  2. Using a single socket or pipe would mean that all communication is serialised on both ends—only one process in Elixir would be able to write and read this, and so all communication would have to go through that process, and likewise I’d have a dedicated thread on the Rust side through which all communication would be serialised. If the two systems are in the same address space, I can use a NIF call in any Elixir process to communicate with the correct Rust thread directly (e.g. placing some event on a queue somewhere) and likewise, any Rust thread can place a message in an Elixir process’ mailbox without having to go through some central thread.