What's the problem of using different CRTs on Windows over FFI?

yutannihilation · January 23, 2023, 2:48pm

Hi, sorry for asking a rather vague question. I wrote several R packages using Rust via FFI.

Recently, I received some concerns about using different CRTs between R and Rust on Windows; R (as of version 4.2) uses UCRT while, if I understand correctly, the Rust toolchain uses MSVCRT. It seems the concern comes from experience with the problems of statically built C/C++ libraries when R switched from MSVCRT to UCRT.

https://blog.r-project.org/2022/11/07/issues-while-switching-r-to-utf-8-and-ucrt-on-windows/

In my understanding, memory allocation / deallocation can be a problem in Rust's case as well (the Rust libraries should not free memory allocated by R, and vice versa; I think this is a basic rule of FFI anyway). However, apart from this, I have no idea what problems there can be. In C/C++'s case, the encoding and locale support of the CRT matters a lot, but I don't think Rust's string system depends on it.

Rust is not C / C++. I guess Rust depends much less on the CRT than C/C++ does, but at the same time, I know it does depend on it at least to some extent. What are some examples of problems that a difference between the CRTs could cause?

Also, I'd like to know the current status of UCRT support on Windows. I found this comment, but couldn't find any sign of progress on UCRT support for windows-gnu.
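To make the two points above concrete, here is a minimal sketch, assuming the other side hands over a NUL-terminated string. The function name `copy_foreign_str` is made up for this example. It borrows the foreign buffer, copies it into Rust-owned memory, and never frees the original, and the UTF-8 check is done by Rust's own code rather than by any CRT locale routine.

```rust
use std::ffi::{c_char, CStr};

/// Copies a NUL-terminated string owned by the caller into a Rust `String`.
/// Returns `None` for a null pointer or for bytes that are not valid UTF-8.
///
/// # Safety
/// If non-null, `ptr` must point to a valid NUL-terminated buffer.
unsafe fn copy_foreign_str(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // Borrow the foreign buffer; nothing is freed or reallocated on the foreign side.
    let borrowed = unsafe { CStr::from_ptr(ptr) };
    // UTF-8 validation is performed by Rust itself, independent of the CRT locale.
    borrowed.to_str().ok().map(str::to_owned)
}
```

The returned `String` lives on Rust's heap, so Rust can drop it freely, while the original buffer stays the responsibility of whoever allocated it.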

github.com/rust-lang/rust

Comment by mati865 to Add MVP LLVM based mingw-w64 targets

rust-lang:master ← mati865:mingw-llvm-target

Thank you for the explanation, updated the description.

> I have one more que…stion about ucrt. This target uses it as a C library, but it's not visible from the target's name.
> Any plans to introduce windows-gnullvm + msvcrt and/or windows-gnu + ucrt targets later? What naming will they use if added?

[dot-asm](https://github.com/dot-asm) created proof of concept repo showing how to not depend on any *CRT, maybe that could be used to make stdlib safe with any *CRT. As for mingw-w64 `libmsvcrt.a` is just a copy of CRT that was configured as the default, so it's either copy of `libmsvcrt-os.a` or `libucrt.a`.

> I see the `mingw64` (classic), `ucrt64`, and `clang64` "distributions" in MSYS2.
> If `ucrt64` is `windows-gnu + ucrt`, and `clang64` is `windows-gnullvm + ucrt`, then I guess that nobody cares about `windows-gnullvm + msvcrt`?

I don't think I've had any requests for it.

> And that we may add a `windows-gnu + ucrt` target in the future?

Possibly but ideally `windows-gnu` would support either CRT.

Coding-Badly · January 23, 2023, 3:30pm

In general, that's true for any pair of binaries.

It is.

Correct. Rust strings have absolutely nothing to do with Microsoft's string types.

Yeah. From your confusion, I'm confused about such things as well. Why do you believe your R package built with Rust is statically linking to any C runtime? While it has been a few years, the last time I dealt with this issue it was clear that our Rust application used whatever CRTL was available on the target machine (which can make the situation much worse than using a statically linked CRTL).

yutannihilation · January 23, 2023, 4:39pm

Thanks.

Sorry, I didn't describe the details around this. No, I don't use crt-static on Rust, and R is not statically linked to any C runtime. So, the concern is about what happens when a staticlib-type Rust library that was compiled for MSVCRT is called from an R session, which should use UCRT.

(Honestly, I don't understand these things well. Sorry if my explanation doesn't make sense...)

Coding-Badly · January 24, 2023, 1:50am

That is certainly understandable given this...

I'm struggling to understand how an external library is statically linked to the dynamic library of R. I guess they're trying to say they use import libraries.

That question can be boiled down to the interface between the Rust library and R. If the folks who built R have done a good job then responsibilities are clearly defined and memory ownership is always retained by the thing that allocated that memory.

In other words, you nailed it in your original post...

yutannihilation · January 24, 2023, 2:59pm

I think I don't understand this part, but what happens is that the Rust crate is compiled to a static library, and then the static library gets linked when compiling some helper C code that makes it possible to call the Rust functions from R's side. As R uses UCRT, when the session loads the resulting DLL, the DLL should use UCRT accordingly.

So, are you saying the only possible problem is about memory allocation? Doesn't Rust rely on the C runtime for other things than memory allocation...?

Coding-Badly · January 24, 2023, 5:39pm

Memory management is not the only potential problem. Some of the C library is stateful. On rare occasions I've seen programs built incorrectly call the "wrong" function resulting in a crash. But...

As far as I know, Rust, on Windows, only uses the C library for heap management. On the several occasions that I've dug into the details of various bits of the Rust standard library the end result has always been a call directly to the operating system.

When I'm concerned about such things a MAP file is an invaluable tool. Armed with a MAP file it's possible to definitively answer both of those questions.

carey · January 25, 2023, 12:21am

Rust heap management on Windows does not use the C library. As you can see at rust/alloc.rs at c8e6a9e8b6251bbc8276cb78cabe1998deecbed7 · rust-lang/rust · GitHub, it uses HeapAlloc directly, which is part of kernel32.dll.
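As a small illustration, the allocator described above is exposed in the standard library as `std::alloc::System`. The sketch below only shows the API-level pairing of `alloc` and `dealloc` on that allocator; it does not by itself prove which OS call sits underneath, and the helper name is invented for the example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Allocates `len` bytes from Rust's system allocator, fills them with
/// `value`, sums them, and releases the block through the same allocator.
fn roundtrip_through_system_allocator(len: usize, value: u8) -> Option<usize> {
    if len == 0 {
        // Zero-sized requests must not be passed to `GlobalAlloc::alloc`.
        return Some(0);
    }
    let layout = Layout::array::<u8>(len).ok()?;
    unsafe {
        let ptr = System.alloc(layout);
        if ptr.is_null() {
            return None;
        }
        std::ptr::write_bytes(ptr, value, len);
        let sum: usize = std::slice::from_raw_parts(ptr, len)
            .iter()
            .map(|&b| b as usize)
            .sum();
        // The block goes back to the allocator it came from, with the same layout.
        System.dealloc(ptr, layout);
        Some(sum)
    }
}
```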

Looking at a program of mine using DUMPBIN, the C library is used for the executable entry point before main(), maths functions, memcpy() etc., and stack unwinding for panics.

yutannihilation · January 26, 2023, 1:25pm

Thanks. I hadn't thought of this. So, it might be that there is no single thing that can be called "the problem of using different CRTs," but there actually are differences, and they might matter, right?

blonk · January 26, 2023, 1:45pm

Indeed. Many years ago I worked on a project that pulled in three different CRTs for one process. We discovered that this was a problem because each CRT would, at startup, pull in the environment variables from the system and store them in its own CRT-specific buffer. After initialization, the setenv(), unsetenv() and getenv() calls would operate on this CRT-specific buffer. We first noticed this when one DLL would set an environment variable, but it would not become available to another DLL.
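The effect described here can be modelled with a toy sketch. This is not real CRT code: `CrtEnvCopy` is a hypothetical type that stands in for one CRT's private environment table, and the "process environment" is just a `HashMap`.

```rust
use std::collections::HashMap;

/// Toy stand-in for one CRT's private copy of the environment.
struct CrtEnvCopy {
    vars: HashMap<String, String>,
}

impl CrtEnvCopy {
    /// Snapshot the process environment once, as a CRT would at startup.
    fn init(process_env: &HashMap<String, String>) -> Self {
        Self { vars: process_env.clone() }
    }

    /// Writes only to this copy; other copies never see the change.
    fn setenv(&mut self, key: &str, value: &str) {
        self.vars.insert(key.to_owned(), value.to_owned());
    }

    /// Reads only from this copy.
    fn getenv(&self, key: &str) -> Option<&str> {
        self.vars.get(key).map(String::as_str)
    }
}
```

Two `CrtEnvCopy` values created from the same starting environment agree at first, but a `setenv` on one is invisible to `getenv` on the other, which is the symptom of one DLL setting a variable that another DLL cannot read.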

carey · January 26, 2023, 6:25pm

Yes. The most obvious one out of what I noted is that I wouldn't expect stack unwinding on panic to work across code compiled for different CRTs. And as blonk noted, even if Rust calls HeapAlloc, GetEnvironmentVariable and so on directly, the different CRTs used by C and C++ code might not.
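One common way to keep a panic from ever reaching foreign frames is to catch it at the boundary and turn it into an error code. The following is only a sketch of that pattern: `checked_divide` and its return convention are hypothetical, it assumes the default `panic = "unwind"` setting, and a real export would also carry a no-mangle attribute.

```rust
use std::panic::catch_unwind;

/// Hypothetical exported function: writes `a / b` to `out` and returns 0,
/// or returns -1 if `out` is null or the Rust code panicked.
pub extern "C" fn checked_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    // Integer division by zero panics in Rust; catch it before it can
    // unwind into the caller's frames.
    let result = catch_unwind(|| a / b);
    match result {
        Ok(value) if !out.is_null() => {
            unsafe { *out = value };
            0
        }
        _ => -1,
    }
}
```

With this shape the caller only ever sees an integer status, so no unwinding has to cross between code built against different runtimes.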

chrisd · January 26, 2023, 11:20pm

Note that the article is talking about MinGW, which (in Rust) uses the very, very ancient msvcrt.dll distributed with the OS (it was intended as a private DLL and should not have been used by third parties, but it's kept around because people did use it).

The msvc and the (tier 3) gnullvm toolchains use the UCRT.
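If it is unclear which toolchain a library was actually built with, the compiler's built-in `cfg` values can be reported from inside the crate. This is a rough sketch (the function name is invented here); it only reports the target environment and whether `crt-static` is enabled, and does not detect the CRT itself.

```rust
/// Describes the target this crate was compiled for, using built-in cfg values.
fn describe_target() -> String {
    let os = if cfg!(target_os = "windows") { "windows" } else { "non-windows" };
    let env = if cfg!(target_env = "msvc") {
        "msvc"
    } else if cfg!(target_env = "gnu") {
        "gnu"
    } else {
        "other"
    };
    // `crt-static` is exposed to code as a target feature.
    let crt_static = cfg!(target_feature = "crt-static");
    format!("os={os} env={env} crt-static={crt_static}")
}
```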

yutannihilation · January 27, 2023, 3:33am

Thank you all for a lot of useful information!

This is exactly the kind of problem I wanted to know about. Interesting.

Yes, unwinding is one of the biggest problems related to FFI. The good news is that the "C-unwind" ABI is getting stabilized! (But I honestly don't understand to what extent it will solve the problem.)

github.com/rust-lang/rust

Tracking Issue for "C-unwind ABI", RFC 2945

opened 09:11PM - 31 Jul 20 UTC

nikomatsakis

A-ffi B-RFC-approved T-lang relnotes C-tracking-issue disposition-merge finished-final-comment-period F-c_unwind S-tracking-impl-incomplete A-abi

This is a tracking issue for the RFC "C-unwind ABI" (rust-lang/rfcs#2945). The feature gate for the issue is `#![feature(c_unwind)]`.

This RFC was created as part of the ffi-unwind project group tracked at https://github.com/rust-lang/lang-team/issues/19.

### About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however *not* meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

### Steps

- [ ] Implement the RFC
- [ ] Adjust documentation ([see instructions on rustc-dev-guide][doc-guide])
- [ ] Stabilization PR ([see instructions on rustc-dev-guide][stabilization-guide])

[stabilization-guide]: https://rustc-dev-guide.rust-lang.org/stabilization_guide.html#stabilization-pr
[doc-guide]: https://rustc-dev-guide.rust-lang.org/stabilization_guide.html#documentation-prs

### Implementation notes

Major provisions in the RFC:

* [x] Add a `C-unwind` ABI and `system-unwind` (I think), we may need more variants.
* [ ] For external functions with the C ABI, we already (I believe) add the "nounwind" LLVM attributes when we build them. We want to continue doing this.
* [ ] For external functions with the C-unwind ABI, we do *not* want any such attributes.
* [ ] For Rust functions defined with the C ABI, e.g., `extern "C" fn foo() { ... }`, we want to force them to abort if there is a panic. The MIR helper function [`should_abort_on_panic`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/build/fn.should_abort_on_panic.html) already exists and I think will modify the generated code to insert the required shims in such a case. They should also be marked as "nounwind" in LLVM, if they're not already.
* [ ] Rust functions with the "C-unwind" ABI should **not** abort on panic.
* [ ] Use ABI to guide the "nounwind" attribute on callsites as well.
* [ ] Write suitable codegen tests to check generated LLVM IR.

### Unresolved Questions

None. The unresolved questions in the RFC were meant to be solved by future RFCs building on this one.

### Implementation history

* Initial implementation: https://github.com/rust-lang/rust/pull/76570
* Additional work (removing `#[unwind]`, adjusting panic handling): https://github.com/rust-lang/rust/pull/86155

Oh, I didn't know the MSVC toolchain also uses UCRT. Unfortunately, if I remember correctly, R requires the library to be built with the GNU toolchain, so that probably doesn't help in my case. Good to know anyway.

simonbuchan · January 27, 2023, 7:57am

The general rule of thumb here is "it's the interface's problem": the interface authors have to define not just what functions exist, but also everything that is legal to do with them. If they, for example, allocate and return a string that you have to release with the CRT's free(), they have to define exactly which CRT that is, which can get very complex very quickly.

The two main approaches for the interface to avoid having to deal with this mess are to demand you build both sides of the interface, then it's your problem to make sure the compiler and library settings and versions match up; or to provide a completely hermetic interface, where no assumption about the runtime is made: mostly this is things like having a free_foo() for every new_foo(), doing their own last_error() instead of using errno, and so on.
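The "hermetic" option can be sketched from the Rust side roughly as follows. All names here (`Foo`, `new_foo`, `free_foo`, `foo_value`, `last_error`) are hypothetical and follow the naming used above; a real export would also need a no-mangle attribute and a matching C header.

```rust
use std::cell::RefCell;
use std::ffi::{c_char, CString};
use std::ptr;

/// Hypothetical opaque type; callers only ever hold a pointer to it.
pub struct Foo {
    value: i32,
}

thread_local! {
    // Per-thread error slot owned by this library, instead of the CRT's errno.
    static LAST_ERROR: RefCell<Option<CString>> = RefCell::new(None);
}

fn set_last_error(msg: &str) {
    let c = CString::new(msg).unwrap_or_default();
    LAST_ERROR.with(|slot| *slot.borrow_mut() = Some(c));
}

/// Allocates a `Foo` on Rust's heap; returns null and records an error on failure.
pub extern "C" fn new_foo(value: i32) -> *mut Foo {
    if value < 0 {
        set_last_error("value must be non-negative");
        return ptr::null_mut();
    }
    Box::into_raw(Box::new(Foo { value }))
}

/// Reads the stored value. Safety: `foo` must come from `new_foo` and not be freed yet.
pub unsafe extern "C" fn foo_value(foo: *const Foo) -> i32 {
    unsafe { (*foo).value }
}

/// Releases a `Foo` with the same allocator that created it. Null is ignored.
/// Safety: `foo` must come from `new_foo` and must not be used afterwards.
pub unsafe extern "C" fn free_foo(foo: *mut Foo) {
    if !foo.is_null() {
        drop(unsafe { Box::from_raw(foo) });
    }
}

/// Returns the last error message for this thread, or null if there is none.
/// The pointer stays valid until the next failing call on the same thread.
pub extern "C" fn last_error() -> *const c_char {
    LAST_ERROR.with(|slot| slot.borrow().as_ref().map_or(ptr::null(), |m| m.as_ptr()))
}
```

Because every allocation is created and destroyed inside the library and errors never go through `errno`, the caller does not need to know which runtime the library was built against.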

So uh, tldr read the R library docs, I guess!

