Strange crash in std::process::Command on Windows 7


#1

Problem Description

I have encountered a strange crash in std::process::Command in my program, which is basically an HTTP server (Rocket) whose /execute endpoint runs cmd/sh inside a futures::Future with an optional timeout. I would like help understanding why the cmd launches crash on Windows 7 only.

On Ubuntu 16.04 (x64) and Windows 10 (x64), the cmd/sh triggered by /execute works normally and never crashes. On Windows 7 SP1 (x64), however, the first 36 consecutive /execute triggers succeed, and every trigger from the 37th onwards crashes. This behaviour is consistent.

For example, when the HTTP client repeatedly tells the program (server) to run echo haha, the client logs the following JSON output for the first 36 triggers:

{"server":{"exitCode":0,"stdout":"haha\r\n","stderr":"","hostname":"0.0.0.0","port":17385}}

After that, it logs the following (the exitCode value is always the same, even after restarting the program):

{"server":{"exitCode":-1073741502,"stdout":"","stderr":"","hostname":"0.0.0.0","port":17385}}

alongside cmd crashing with the message: The application was unable to start correctly (0xc0000142). Click OK to close the application.

For compilation, I am using x86_64-pc-windows-msvc on nightly (since rocket does require nightly) with Build Tools VS2015 (x86_64-unknown-linux-gnu for Ubuntu). The program code does not contain any unsafe blocks.

As mentioned, the program works fine on Windows 10 (and Ubuntu), but fails on Windows 7. I have tried various machines and VMs running Windows 7, with both static and dynamic linking to the runtime libraries, and the crash still occurs. I am puzzled as to why Windows 7 behaves differently when the same program runs perfectly fine on Windows 10.

In comparison, I have also tried a simpler program that just runs cmd/sh via std::process::Command repeatedly, without larger dependencies such as rocket and futures, and that works without any crash on Windows 7.
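A minimal sketch of the kind of stand-alone test described above. This is not the exact program that was run: the helper names run_shell and first_failure and the iteration count of 100 are assumptions chosen for illustration (100 is simply well past the 36 launches after which the server starts failing).

```rust
use std::io;
use std::process::{Command, Output};

/// Runs `cmd` through the platform shell (cmd /C on Windows, sh -c elsewhere).
fn run_shell(cmd: &str) -> io::Result<Output> {
    if cfg!(target_os = "windows") {
        Command::new("cmd").args(&["/C", cmd]).output()
    } else {
        Command::new("sh").args(&["-c", cmd]).output()
    }
}

/// Runs `cmd` up to `count` times and returns the zero-based index of the
/// first launch that either could not be spawned or exited unsuccessfully.
fn first_failure(cmd: &str, count: usize) -> Option<usize> {
    (0..count).find(|_| match run_shell(cmd) {
        Ok(output) => !output.status.success(),
        Err(_) => true,
    })
}

fn main() {
    // Hypothetical iteration count, chosen only to exceed 36 launches.
    let count = 100;

    match first_failure("echo haha", count) {
        Some(index) => println!("launch {} failed", index + 1),
        None => println!("all {} launches succeeded", count),
    }
}
```

Because this sketch uses nothing beyond the standard library, any failure it reports would point at std::process::Command or the operating system rather than at rocket or futures.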

Although the program code has no unsafe blocks (and generally no .unwrap() / .expect() either), I am not completely sure that I am using these libraries correctly, and a mistake there could be the cause of the crash on Windows 7. Please point out any mistakes I have made.

Below is an extract of my program code from src/main.rs (comm_service) that highlights the important parts (sorry for the length). The links to my program source code repositories are listed after it. Thanks in advance for any help! :grin:

Code Extract

macro_rules! create_fut {
    ($pool:expr, $timeout:expr, $action:expr) => {{
        let timer = Timer::default();
        let action_fut = $pool.spawn_fn($action);
         
        let bail_fn: Box<FnBox(_) -> _ + Send> = Box::new(|_| bail!(ErrorKind::Timeout));

        let timeout_fut = timer.sleep($timeout)
            .then(bail_fn);

        let win_fn: Box<FnBox(_) -> _ + Send> = Box::new(|(win, _)| win);

        timeout_fut
            .select(action_fut)
            .map(win_fn)
    }};
}

fn execute_impl(is_blocking: bool, config: State<MainConfig>, client_map: State<ClientMap>, exec_req: JSON<ExecReq>) -> Result<Option<CommOverallStatus>> {
    let timeout = Duration::from_millis(config.timeout as u64);
    let pool = CpuPool::new(config.thread_count as usize);

    if let Ok(client_map) = client_map.read() {
        // broadcast and execute for each client
        info!("Broadcasting commands to each client...");

        let client_comm_overall_status_futs: Vec<_> = client_map.iter()
            .map(|(client_name, client_info)| {
                let client_execute_url = {
                    let mut url = client_info.url.clone();
                    let path = if is_blocking { "execute" } else { "executenb" };
                    url.set_path(path);
                    url
                };

                let exec_req = exec_req.clone();
                let client_name = client_name.to_owned();

                info!("Broadcasting '{:?}' for client '{}' to '{}'...", exec_req, client_name, client_execute_url);

                create_fut!(pool, timeout, move || {
                    let client = Client::new();

                    let mut res = client.post(client_execute_url)
                        .body(&serde_json::to_string(&exec_req).chain_err(|| "Unable to convert execution request JSON into string")?)
                        .header(ContentType::json())
                        .send()
                        .chain_err(|| "Unable to perform client post")?;

                    let mut rsp_body = String::new();
                    let _ = res.read_to_string(&mut rsp_body);

                    info!("Client '{}' response body: {}", client_name, rsp_body);

                    let partial_comm_overall_status: CommOverallStatus = serde_json::from_str(&rsp_body)
                        .chain_err(|| "Unable to parse client response body into comm overall status")?;

                    Ok(Some(partial_comm_overall_status))
                })
            })
            .collect();
        
        // execute for self if matching
        let client_key = Regex::new(&exec_req.cmd_id_re)
            .chain_err(|| format!("Unable to parse '{}' as regex", exec_req.cmd_id_re))?;

        info!("Checking for self name '{}' match against regex '{}'...", config.name, client_key);

        let self_comm_overall_status_fut = if client_key.is_match(&config.name) {
            let cmd = exec_req.cmd.to_owned();
            let address = config.address.to_owned();
            let port = config.port;
            let name = config.name.to_owned();

            info!("Name '{}' matches regex, executing '{}'...", name, cmd);

            create_fut!(pool, timeout, move || -> Result<Option<CommOverallStatus>> {
                let child = if cfg!(target_os = "windows") {
                    Command::new("cmd")
                       .args(&["/C", &cmd])
                       .output()
                } else {
                    Command::new("sh")
                        .args(&["-c", &cmd])
                        .output()
                };
                    
                const ERROR_EXIT_CODE: i32 = 127;

                let self_comm_status = match child {
                    Ok(child) => {
                        let exit_code = match child.status.code() {
                            Some(code) => code,
                            None => ERROR_EXIT_CODE,
                        };

                        CommStatus {
                            exit_code: exit_code,
                            stdout: Some(String::from_utf8_lossy(&child.stdout).to_string()),
                            stderr: Some(String::from_utf8_lossy(&child.stderr).to_string()),
                            hostname: address,
                            port: port,
                        }
                    },

                    Err(e) => {
                        CommStatus {
                            exit_code: ERROR_EXIT_CODE,
                            stdout: None,
                            stderr: Some(e.description().to_owned()),
                            hostname: address,
                            port: port,
                        }
                    }
                };

                let mut partial_comm_overall_status = CommOverallStatus::new();
                partial_comm_overall_status.insert(name, self_comm_status);
                Ok(Some(partial_comm_overall_status))
            })
        } else {
            info!("Name '{}' does not match regex, returning None...", config.name);

            create_fut!(pool, timeout, || -> Result<Option<CommOverallStatus>> {
                Ok(None)
            })
        };

        // merge all the results
        info!("Merging results from broadcast and self-execution...");

        let comm_overall_status_futs = client_comm_overall_status_futs.into_iter()
            .chain(iter::once(self_comm_overall_status_fut));
        
        // handle future results based on the type of execution
        if is_blocking {
            info!("Blocking mode, waiting for all created futures with possible timeout...");

            // converts from vec of status into hash map
            let comm_overall_status: CommOverallStatus = comm_overall_status_futs
                .map(|comm_overall_status_fut| comm_overall_status_fut.wait())
                .filter_map(|partial_comm_overall_status| {
                    match partial_comm_overall_status {
                        // expected case for valid use
                        Ok(Some(partial_comm_overall_status)) => Some(partial_comm_overall_status),

                        // invalid command given or e
                        Err((e, _)) => {
                            error!("Comm execution error: {}", e);
                            
                            const OTHER_ERROR_EXIT_CODE: i32 = 126;
                            let mut partial_comm_overall_status = CommOverallStatus::new();

                            partial_comm_overall_status.insert(
                                config.name.to_owned(),
                                CommStatus {
                                    exit_code: OTHER_ERROR_EXIT_CODE,
                                    stdout: None,
                                    stderr: Some(format!("{}", e)),
                                    hostname: config.address.to_owned(),
                                    port: config.port,
                                });
                            
                            Some(partial_comm_overall_status)
                        },

                        // empty comm overall status result
                        _ => None,
                    }
                })
                .flat_map(|partial_comm_overall_status| partial_comm_overall_status.into_iter())
                .collect();

            info!("Done waiting, overall status: {:?}", comm_overall_status);

            Ok(Some(comm_overall_status))
        } else {
            info!("Non-blocking mode, forgetting all created futures...");

            for comm_overall_status_fut in comm_overall_status_futs {
                pool.spawn(comm_overall_status_fut).forget();
            }

            Ok(None)
        }
    } else {
        error!("Unable to read from client map!");
        bail!(ErrorKind::ClientMapRead);
    }
}

#[post("/execute", data = "<exec_req>")]
fn execute(config: State<MainConfig>, client_map: State<ClientMap>, exec_req: JSON<ExecReq>) -> Result<Option<JSON<CommOverallStatus>>> {
    info!("Received /execute: {:?}", exec_req);
    execute_impl(true, config, client_map, exec_req)
        .map(|comm_overall_status| {
            match comm_overall_status {
                Some(comm_overall_status) => Some(JSON(comm_overall_status)),
                None => None,
            }
        })
}

GitHub Repositories

Server (the above): https://github.com/guangie88/comm_service
Client (for repeat triggering): https://github.com/guangie88/comm_service_rep

Commands Used

  • Server (comm_service), runs on port 17385 with the name ‘server’:

target\release\comm_service.exe -l config\comm_service_log.yml -n server -p 17385

  • Client (comm_service_rep), repeatedly triggers the server to run the echo command with 250 ms interval:

target\release\comm_service_rep.exe -c "echo haha" -d http://localhost:17385/execute -i 250


#2

Are you running under PowerShell? There is a Windows 7 bug related to starting many console programs, and it has a hotfix:

https://support.microsoft.com/en-us/help/2701373


#3

Thanks for your response! :grin:

I have tried running the program after installing the hotfix, but unfortunately the error still occurs. The program normally runs in cmd.exe, but I have also tried PowerShell and the issue is generally the same there: the message “The application was unable to start correctly (0xc0000142)” does not appear (the crash is silent), but the -1073741502 exit code is still returned.

However, I feel that the link gives a good hint towards the cause of the error. Previously I did not realise that the error code actually represents STATUS_DLL_INIT_FAILED. Now that I know, searching for that name suggests a possible desktop heap allocation failure after some fixed number of cmd.exe launches.
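For reference, here is a minimal sketch of how the signed exit code in the JSON output maps to the hexadecimal value shown in the error dialog. The function names are illustrative only and are not part of comm_service, and the lookup covers just the one status value discussed here.

```rust
/// Reinterprets a signed Windows exit code as the unsigned NTSTATUS value
/// (same 32 bits, different interpretation).
fn exit_code_to_ntstatus(code: i32) -> u32 {
    code as u32
}

/// Returns the symbolic name for the single NTSTATUS value this sketch
/// knows about; everything else is reported as unknown.
fn ntstatus_name(status: u32) -> Option<&'static str> {
    match status {
        0xC000_0142 => Some("STATUS_DLL_INIT_FAILED"),
        _ => None,
    }
}

fn main() {
    // The exitCode value reported by the server from the 37th trigger onwards.
    let code: i32 = -1073741502;
    let status = exit_code_to_ntstatus(code);

    println!(
        "{} -> {:#010X} ({})",
        code,
        status,
        ntstatus_name(status).unwrap_or("unknown")
    );
}
```

So the -1073741502 in the JSON and the 0xc0000142 in the dialog box are the same value, just printed as a signed decimal in one place and as unsigned hex in the other.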

This leads me to wonder whether Windows 7 and Windows 10 (or even Windows 8.1) differ in how they handle desktop heap allocation or DLL loading for cmd.exe, especially when a program spawns cmd.exe many times, since running the same thing on Windows 10 causes no issue. For reference, I am spawning the cmd.exe processes using std::process::Command.

Does anyone have any idea? Thanks again in advance for any advice! :slight_smile: