Preventing rust from calling setenv/getenv, from a library crate

Soni · October 9, 2022, 11:22pm

We maintain hexchat-unsafe-plugin. We'd like libstd to bind to our own environ instead of using libc's. This is for soundness purposes. That's because global state is inherently unsound, so much so that there's work towards deprecating static mut, and yet libstd doesn't seem to care for that when it comes to environ.

How do we do this?

alice · October 31, 2022, 9:26pm

I don't believe that there's any way to do so.

quinedot · October 31, 2022, 10:19pm

The lib team is aware of the set_var soundness issue and is actively figuring out how to deprecate it, FYI.

Soni · October 31, 2022, 10:32pm

var is also unsound. we just want to isolate the libstd concept of "vars" from the libc concept of "environ", so they don't interfere with each other. (tho ideally we'd still like to preload it from libc's environ, and Command should follow the rust/libstd env.)
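For illustration, the isolation described here (a Rust-side set of vars, preloaded from the process environment, that Command follows) can be approximated in user code without touching libstd. This is a minimal sketch, not something std or hexchat-unsafe-plugin provides: `LocalEnv` and its methods are hypothetical names. It does not redirect `std::env::var`, and the one-time preload still reads the process environment, so it has to happen while nothing else is modifying it.

```rust
use std::collections::HashMap;
use std::ffi::{OsStr, OsString};
use std::process::Command;
use std::sync::RwLock;

/// Hypothetical plugin-private environment, kept separate from libc's `environ`.
pub struct LocalEnv {
    vars: RwLock<HashMap<OsString, OsString>>,
}

impl LocalEnv {
    /// Copy the process environment once.
    /// Assumption: no other thread is modifying the environment during this call.
    pub fn preload() -> Self {
        LocalEnv {
            vars: RwLock::new(std::env::vars_os().collect()),
        }
    }

    /// Look up a variable in the private copy only.
    pub fn get(&self, key: impl AsRef<OsStr>) -> Option<OsString> {
        self.vars.read().unwrap().get(key.as_ref()).cloned()
    }

    /// Set a variable in the private copy; libc's `environ` is not touched.
    pub fn set(&self, key: impl Into<OsString>, value: impl Into<OsString>) {
        self.vars.write().unwrap().insert(key.into(), value.into());
    }

    /// Remove a variable from the private copy.
    pub fn remove(&self, key: impl AsRef<OsStr>) {
        self.vars.write().unwrap().remove(key.as_ref());
    }

    /// Build a `Command` whose child inherits only the private copy.
    pub fn command(&self, program: impl AsRef<OsStr>) -> Command {
        let mut cmd = Command::new(program);
        // Drop the inherited environment, then pass ours explicitly.
        cmd.env_clear().envs(self.vars.read().unwrap().iter());
        cmd
    }
}
```

Code that goes through `LocalEnv` never calls setenv, but anything else in the process (including std and C libraries) still sees the original environment.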

alice · October 31, 2022, 10:39pm

Ultimately, the unsoundness is the fault of the unsafe code that calls into the libc code. It is usually not possible to prevent incorrect unsafe code from doing bad things.

jhpratt · November 1, 2022, 2:00am

var is not unsound. The underlying libc call is explicitly thread-safe.

Soni · November 1, 2022, 2:09am

environ is not. how can getenv be thread-safe if environ is not? (also doesn't setenv invalidate getenv? how are we defining "thread-safe" here?)

quinedot · November 1, 2022, 3:08am

Reading operations are considered MT-safe but writing operations are not (by POSIX), in brief.

There's a lot more conversation in #27970 and this IRLO thread.

LegionMammal978 · November 1, 2022, 5:47am

This is not exactly accurate; POSIX explicitly allows getenv() not to be thread-safe:

The getenv() function need not be thread-safe.

The only guarantees we get are from glibc and other implementations on our supported targets.

VorfeedCanal · November 1, 2022, 1:03pm

POSIX gives you direct access to environ. How can anything be thread-safe in such a configuration?

Soni · November 1, 2022, 1:07pm

obviously none of this would be a problem if setenv leaked the environ, but sadly that's not what existing implementations do.

(we guess we should just carry on with making hexchat-plugin wasm-based, built upon hexchat-unsafe-plugin...)

LegionMammal978 · November 1, 2022, 1:07pm

That is answered on the same page:

Conforming multi-threaded applications shall not use the environ variable to access or modify any environment variable while any other thread is concurrently modifying any environment variable. A call to any function dependent on any environment variable shall be considered a use of the environ variable to access that environment variable.

In other words, an implementation is required to support multiple threads concurrently reading from environ, but it is not required to support multiple threads concurrently calling getenv(), nor reading from environ or calling getenv() while environ is being written to.
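One way to stay within those rules from Rust is to read the environment once, before any other threads exist, and let threads consult only that copy afterwards. The sketch below assumes this pattern fits the program; `snapshot_env` and `read_concurrently` are hypothetical helper names, and the snapshot will not reflect later changes to the real environment.

```rust
use std::collections::HashMap;
use std::ffi::OsString;
use std::sync::Arc;
use std::thread;

/// Take one snapshot of the environment.
/// Assumption: this runs before any thread that might modify the environment.
pub fn snapshot_env() -> Arc<HashMap<OsString, OsString>> {
    Arc::new(std::env::vars_os().collect())
}

/// Hypothetical helper: look up `key` from `n` threads at once,
/// using only the immutable snapshot (no getenv calls in the threads).
pub fn read_concurrently(
    snapshot: &Arc<HashMap<OsString, OsString>>,
    key: &str,
    n: usize,
) -> Vec<Option<OsString>> {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let snap = Arc::clone(snapshot);
            let key = OsString::from(key);
            thread::spawn(move || snap.get(&key).cloned())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Because the map is never mutated after creation, sharing it through `Arc` needs no lock at all.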

VorfeedCanal · November 1, 2022, 1:16pm

The problem is that once we have gotten ahold of environ, we have entered the twisted maze of defined and undefined behaviours. Some things are permitted, some things are not permitted, but there is absolutely no way to make the whole thing sound.

The nature of a simple pointer variable, which is very much part of the API, makes it impossible.

Soni · November 1, 2022, 4:37pm

(honestly tho so uh why can't we just do linker overrides on it? .-.)

Michael-F-Bryan · November 1, 2022, 7:59pm

If you mean we should use linker magic to replace the environ static variable with our own, then we still haven't solved the problem that there's an unsynchronized global variable anyone can use.

If you overrode setenv() and getenv() to use a function that accesses environ via a lock, then what's stopping code that's already been compiled from using environ directly?

If your overridden setenv() and getenv() used a synchronised copy of the original environ variable, you now have a situation where one piece of code could be writing to the synchronised copy and a different piece of code (maybe an installed C library) could be reading from the original environ variable. If your program relies on these two seeing a consistent set of environment variables, you're going to run into weird bugs.

Either way, there's no simple trick we can use to fix things - otherwise it would have been fixed years ago.

I don't see there being much point in worrying, though. The lack of thread safety as guaranteed by POSIX seems to be inconsequential in practice, and mostly a concern for people working in adversarial environments or academics.

Soni · November 1, 2022, 8:04pm

what prevents a crate like hexchat-unsafe-plugin from preventing a crate like std from linking to the libc environ/setenv/etc and instead providing our own, with rust-level linker magic? (link_name attribute maybe?)

Michael-F-Bryan · November 1, 2022, 8:07pm

There is nothing preventing you from doing it.

My comment was pointing out that providing your own versions wouldn't solve your underlying problem. environ is still going to be an unsynchronized global variable that can be mutated directly by anyone wanting to do so.

Soni · November 1, 2022, 8:09pm

at least it wouldn't interact with the hexchat environ, or that of the other plugins.

MoAlyousef · November 2, 2022, 10:40am

I think C stdlib symbols are mostly weak, so your program’s getenv calls glibc getenv on gnu systems if it didn’t find a redefined symbol. You could define your own getenv and it will override the libc symbol (similar to how you can redefine malloc):

use std::os::raw::c_char;

#[no_mangle]
extern "C" fn getenv(_: *const c_char) -> *mut c_char {
    // or override it however you wish
    std::ptr::null_mut()
}

fn main() {
    // this will fail even though we didn't call getenv directly
    println!("{}", std::env::var("HOME").unwrap());
}

Gives:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: NotPresent', main.rs:9:42
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Even though HOME is set on my system.

