Subprocess and dynamic library linking problem/interaction


#1

Hi,

I’m having a strange library linking problem. TL;DR: linking my Rust program with the GSL library causes a gnuplot subprocess to load the wrong libraries. Any hints on what is happening are greatly appreciated!

I’m running gnuplot as a subprocess (what follows is a minimal example):

src/main.rs

use std::process::Command;

fn main() {
    let output = Command::new("gnuplot")
        .output().unwrap();

    println!("status: {}", output.status);
    println!("stdout: {}", String::from_utf8_lossy(&output.stdout));
    println!("stderr: {}", String::from_utf8_lossy(&output.stderr));
}

cargo run prints:

status: exit code: 0
stdout:
stderr:

as expected. So far so good.

Now I want to use the GSL library from my rust code (ultimately through the GSL crate). To link it to my program, I simply use a build script with pkg-config:

Cargo.toml

[package]
name = "gnuplot-test"
version = "0.1.0"

build = "build.rs"

[build-dependencies]
pkg-config="0.3"

build.rs

extern crate pkg_config;

fn main() {
    pkg_config::probe_library("gsl").unwrap();
}

But now gnuplot fails because it’s loading wrong libraries! Output of cargo run:

status: signal: 6
stdout:
stderr: dyld: Symbol not found: __cg_jpeg_resync_to_restart
  Referenced from: /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO
  Expected in: /opt/local/lib/libjpeg.9.dylib
 in /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO 

I must be missing something important, but I don’t understand how what my program is linked against influences the search paths for the dynamic libraries of any other programs that I might call. For instance, ImageIO above depends on a completely different libJPEG.dylib:

$ otool -L /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO
...
	/System/Library/Frameworks/ImageIO.framework/Versions/A/Resources/libJPEG.dylib (compatibility version 1.0.0, current version 1.0.0)
	...

But for some reason it picks the wrong one from /opt/local/lib where libgsl.dylib is located.


I am on macOS sierra 10.12.1, having installed GSL (v2.1) and gnuplot (v. 5.0.5) both via MacPorts (so they live in /opt/local).

$ rustc --version
rustc 1.12.1 (d4f39402a 2016-10-19)

I haven’t set any DYLD_* env variables.


#2

I haven’t set any DYLD_* env variables.

Maybe something else has done that? Can you add

    for (key, value) in ::std::env::vars() {
        println!("{}: {}", key, value);
    }

and check if there perhaps is any difference?
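If the full dump is too noisy, a minimal sketch that keeps only the dyld-related variables could look like the following. The helper name dyld_vars is made up for illustration, and the `DYLD_` prefix filter is an assumption about which variables matter here.

```rust
use std::env;

/// Keep only variables whose name starts with `DYLD_`.
/// `dyld_vars` is a hypothetical helper, not part of std or Cargo.
fn dyld_vars<I>(vars: I) -> Vec<(String, String)>
where
    I: IntoIterator<Item = (String, String)>,
{
    vars.into_iter()
        .filter(|pair| pair.0.starts_with("DYLD_"))
        .collect()
}

fn main() {
    // Prints nothing if no DYLD_* variable is set in this process.
    for (key, value) in dyld_vars(env::vars()) {
        println!("{}: {}", key, value);
    }
}
```

Comparing the output under cargo run with the output of running the binary directly should show whether the environment differs.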


#3

Thank you so much for your suggestion! Mystery solved, Cargo sets the DYLD_LIBRARY_PATH, among others!

DYLD_LIBRARY_PATH="/opt/local/lib:${CRATE}/target/debug:${CRATE}/target/debug/deps:~/.multirust/toolchains/stable-x86_64-apple-darwin/lib"

This messes up the dylib loading when running the binary through cargo run.
Running target/debug/gnuplot-test directly works fine.

But this now raises a question: is this the expected behavior? It is quite surprising to me. I thought these env variables were needed only during the build. Maybe cargo should reset the variables before it runs the binary? Especially since this means that cargo run and target/debug/... run the binary in different environments…


#4

Cargo issue: https://github.com/rust-lang/cargo/issues/2888


#5

Oh, good find, thank you for your help.

I’ve been looking through cargo code and it seems that cargo actually specifically sets all these variables when running the binary here: the call to target_process creates a pristine process struct and then fills all the env variables by calling fill_env. Dylib path is set here. So it’s not that cargo needs to clean these variables up, it just shouldn’t set them.

But at least I know how to work around this issue: just run the binary directly :)
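Another possible workaround, as an untested sketch: remove the variable from the child's environment before spawning gnuplot, so the subprocess does not inherit what cargo run set. Command::env_remove is a real std API; the function name gnuplot_command is made up for illustration, and whether removing only DYLD_LIBRARY_PATH is enough on a given setup is an assumption.

```rust
use std::process::Command;

/// Build the gnuplot command with DYLD_LIBRARY_PATH removed from the
/// child's environment. `gnuplot_command` is a hypothetical helper name.
fn gnuplot_command() -> Command {
    let mut cmd = Command::new("gnuplot");
    // Only affects the child process; the parent's environment is untouched.
    cmd.env_remove("DYLD_LIBRARY_PATH");
    cmd
}

fn main() {
    match gnuplot_command().output() {
        Ok(output) => {
            println!("status: {}", output.status);
            println!("stdout: {}", String::from_utf8_lossy(&output.stdout));
            println!("stderr: {}", String::from_utf8_lossy(&output.stderr));
        }
        // Reached when gnuplot is not installed or not on PATH.
        Err(err) => eprintln!("could not start gnuplot: {}", err),
    }
}
```

This keeps cargo run usable during development while leaving the program's own linking untouched.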


#6

The changes Cargo makes to the DYLD_LIBRARY_PATH are required for it to operate correctly in many circumstances (e.g. a dylib C library was compiled).

I unfortunately don’t really understand what the problem is at hand here, could you explain in more detail about why the path set by Cargo is causing problems?


#7

In my particular case, the problem is the extra dylib search path that Cargo adds to the env when running the binary of my program in cargo run. macOS doesn’t have a system package manager, so I use MacPorts, which stores all of its installed artifacts to /opt/local/, in my case /opt/local/bin/gnuplot and /opt/local/lib/libgsl.dylib.

Now for my binary, I want to link against GSL to use some differential equation solvers, and then run gnuplot as a subprocess to pipe plotting commands to it.

  • Building: Linking GSL is relatively straightforward, I just use build.rs build script to emit the libgsl path (via the pkg-config crate). Indeed, running otool shows the correct dependency in the produced binary:
$ otool -L target/debug/gnuplot-test                                           
target/debug/gnuplot-test:
	/opt/local/lib/libgsl.19.dylib (compatibility version 20.0.0, current version 20.0.0)
	/opt/local/lib/libgslcblas.0.dylib (compatibility version 1.0.0, current version 1.0.0)
...

In particular, and if I understand dylib loading correctly, since the correct location is in the binary, it is unnecessary to set up the DYLD_LIBRARY_PATH to find the correct libgsl when running the binary.

  • Running: cargo run executes the binary, which then runs gnuplot as a subprocess via std::process::Command. gnuplot is a fairly large program with many dependencies, since it can plot to many terminals. One of them is Aquaterm, which seems to depend on Apple’s ImageIO framework, which in turn depends on libJPEG.dylib. As I mentioned in post #1:
    $ otool -L /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO

    /System/Library/Frameworks/ImageIO.framework/Versions/A/Resources/libJPEG.dylib (compatibility version 1.0.0, current version 1.0.0)

    However, if DYLD_LIBRARY_PATH=/opt/local/lib is set (as it is by cargo run), the dylib loader seems to search for libJPEG.dylib in that directory first. It finds libjpeg.dylib (the default macOS file system is case insensitive), which is a completely different library, and gnuplot fails to load.
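To make the suspected mechanism concrete, here is a toy model of the lookup described in the two points above. It is not dyld's actual implementation: the function resolve and the in-memory `existing` set are made up for illustration, and the case-insensitive comparison stands in for a case-insensitive volume.

```rust
use std::collections::HashSet;

/// Toy model: pick the path a loader would use for `install_path` when the
/// directories in `search_dirs` (DYLD_LIBRARY_PATH entries) are tried first,
/// matching on the file name only. `existing` stands in for the file system
/// and holds lowercased paths to mimic case-insensitive lookup.
fn resolve(install_path: &str, search_dirs: &[&str], existing: &HashSet<String>) -> String {
    let leaf = install_path.rsplit('/').next().unwrap_or(install_path);
    for dir in search_dirs {
        let candidate = format!("{}/{}", dir, leaf);
        if existing.contains(&candidate.to_lowercase()) {
            return candidate;
        }
    }
    // No override found: fall back to the path recorded in the binary.
    install_path.to_string()
}

fn main() {
    let existing: HashSet<String> = ["/opt/local/lib/libjpeg.dylib"]
        .iter()
        .map(|p| p.to_lowercase())
        .collect();
    let image_io_dep =
        "/System/Library/Frameworks/ImageIO.framework/Versions/A/Resources/libJPEG.dylib";

    // With the MacPorts directory on the search path, the wrong file wins.
    println!("{}", resolve(image_io_dep, &["/opt/local/lib"], &existing));
    // Without it, the recorded path is used.
    println!("{}", resolve(image_io_dep, &[], &existing));
}
```

In this model the first call returns a path under /opt/local/lib and the second returns the framework's own path, which matches the difference between cargo run and running the binary directly.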

I guess I am just surprised by Cargo’s behavior. I have always thought that cargo run is just a convenience shorthand for cargo build && target/debug/${BINARY_NAME}. However, since Cargo goes out of its way to set many env variables when running the binary, it is not exactly the case. I understand that the env vars must be set for the build stage, but I don’t think they should be set when the binary is run. After all, we want to run the binary eventually without cargo, not only through cargo run.

I have only a very rudimentary knowledge of dylib loading. I haven’t been able to find out how DYLD_LIBRARY_PATH actually works on the Mac, since most Google results are just forum posts where people are asking for help with linking problems :/ It seems to be very fragile: just to include one library from a specific location, that location must be added to the dylib search path, and then all the other libraries there might cause conflicts. I have to say that I love Cargo’s dependency resolution and that everything, including C libraries, is usually linked statically.


#8

Hm ok, yeah this sounds pretty unfortunate. This is somewhat of a footgun with the dylib loading system with DYLD_LIBRARY_PATH in that having two libs of the same name ends up in trouble at some point.

Cargo needs to add these paths in cargo run because any paths in the build directory aren’t part of DYLD_LIBRARY_PATH (most likely) for dylibs that were built as part of the build process. Perhaps a solution here is to not modify this path except for those paths that are in the build directory? Here /opt/local/lib is clearly owned by the system and not Cargo, so presumably it should rely on you to get that right instead of trying anything itself.
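As a rough sketch of that idea (not Cargo's actual code): keep only those candidate search paths that live under the build directory and drop the rest. The function name paths_inside_build_dir and the example paths are made up for illustration.

```rust
use std::path::{Path, PathBuf};

/// Keep only the candidate library search paths located inside `build_dir`.
/// Hypothetical helper; `Path::starts_with` compares whole path components.
fn paths_inside_build_dir(candidates: &[PathBuf], build_dir: &Path) -> Vec<PathBuf> {
    candidates
        .iter()
        .filter(|p| p.starts_with(build_dir))
        .cloned()
        .collect()
}

fn main() {
    let build_dir = Path::new("/path/to/crate/target/debug");
    let candidates = vec![
        PathBuf::from("/opt/local/lib"),
        PathBuf::from("/path/to/crate/target/debug"),
        PathBuf::from("/path/to/crate/target/debug/deps"),
    ];
    // /opt/local/lib is dropped; the two paths under target/debug are kept.
    for path in paths_inside_build_dir(&candidates, build_dir) {
        println!("{}", path.display());
    }
}
```

With a filter like this, a system directory such as /opt/local/lib would no longer end up in the dylib search path of the program being run.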


#9

On macOS, DYLD_LIBRARY_PATH is essentially a hack: libraries are normally located by absolute path, but DYLD_LIBRARY_PATH asks the dynamic linker to ignore that path and search the given directories for libraries matching only the basename. It really should not be used; it would be better for Rust dylibs to just use the right path.

But there may be a similar issue on Linux.

By the way, the behavior of dyld environment variables is documented in the man page: https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man1/dyld.1.html


#10

I was going to say I think this is issue https://github.com/rust-lang/rust/issues/28640 biting you, but it looks like you already found that link :P