Do I really need a global variable here?

I'm writing an init system in Rust, and I have logical entities known as 'units' which basically describe commands to start. Currently, I've defined it like this:

use std::cell::RefCell;
use std::process::{Child, Command};

#[derive(Debug, PartialEq)]
pub enum UnitKind {
    Task,
    Daemon,
}

#[derive(Debug)]
pub struct Unit<'a> {
    pub name: &'a str,
    pub cmds: &'a [&'a [&'a str]],
    pub kind: UnitKind,
    pub active: RefCell<bool>,
}

impl<'a> Unit<'a> {
    pub fn load(&self) {
        let mut running: Vec<Child> = Vec::new();
        for x in self.cmds.iter() {
            match Command::new(x[0]).args(&x[1..]).spawn() {
                Ok(x) => running.push(x),
                Err(e) => eprintln!(
                    "error starting some command in unit {}, error is:\n{}",
                    self.name, e
                ),
            }
        }
        self.active.replace(true);
        if self.kind == UnitKind::Task {
            // we want to block in this case
            for x in &mut running {
                x.wait().unwrap();
            }
            self.active.replace(false);
        }
    }
}

This works fine, but I'd like to implement an unload method, which will attempt to gracefully kill every process in the unit. To do that, I need to move running into the struct, and then I can implement it. However, if I do that, I'm responsible for cleaning it up in the case I use UnitKind::Task when it exits. On Unix I can tell when a process exits by handling SIGCHLD, so I'm doing that with the nix crate:

    unsafe {
        signal::sigaction(
            signal::SIGCHLD,
            &signal::SigAction::new(
                signal::SigHandler::Handler(sigchld),
                signal::SaFlags::empty(),
                signal::SigSet::empty(),
            ),
        )
        .unwrap();
    }

where sigchld is defined as:

extern "C" fn sigchld(_: c_int) {
    thread::spawn(|| {
        // we don't want to waste time in the main thread reaping, instead let's just ask the process scheduler to do this for us via thread::spawn
        println!("sigchld recieved");
        loop {
            match wait::waitpid(
                nix::unistd::Pid::from_raw(-1),
                Some(wait::WaitPidFlag::WNOHANG),
            ) {
                Ok(_) => {}
                Err(e) => match e.as_errno() {
                    Some(nix::errno::Errno::ECHILD) => break, // nothing more to reap
                    None => panic!("todo figure out why as_errno returns none here"),
                    _ => eprintln!("waitpid: unknown error {}", e), // some other error, report a bug, you can get the error code with `errno -l` 
                },
            }
        }
    });
}

Now here, I want to remove the Child from whatever Vec it's in in whatever Unit it's in. My first thought is to use global variables but that requires more unsafe, is there an alternative?

Maybe something like a channel suits your use case better? If I understand you correctly you effectively have several threads sending data to a single receiver.

My first thought is to use global variables but that requires more unsafe, is there an alternative?

If you're thinking of a static mut here, that is probably a bad idea. But there are 100% safe alternatives if you do want a global mutable variable. You're probably looking at a Mutex inside a lazy_static! or OnceCell, if you want to access it from multiple threads.

Thanks for the reply!

I don't think channels are what I want. I don't want to send data, but rather mutate existing data inside a struct instance. sigchld spawns a thread only so that it doesn't take too much time which disrupts the event loop (listening for unix socket connections). An ideal Rust-y way would be to pass state into the function, i.e. able to write this but I can't:

        signal::sigaction(
            signal::SIGCHLD,
            &signal::SigAction::new(
                signal::SigHandler::Handler(|_: c_int| { sigchld(&local_data) }),
                signal::SaFlags::empty(),
                signal::SigSet::empty(),
            ),
        )
        .unwrap();

What the solution with static mut I was thinking of might actually be nicely implemented with lazy_static. Is this an ideal way? Still need to test it out...

use lazy_static::lazy_static; 
use std::sync::Mutex;
pub mod unit;

// global
lazy_static! {
    static ref arr: Mutex<Vec<Unit>> = Mutex::new(Vec::new());
}

// inside sigchld handler check ` arr.lock()? `

I doubt the thread::spawn() is a signal safe function since its implementation is non-trivial. Why don't you use safe wrapper like signal-hook?

This is pretty awesome, I'll migrate to it if I can figure it out.

fn main() -> Result<(), Error> {
    let term = Arc::new(AtomicBool::new(false));
    signal_hook::flag::register(signal_hook::consts::SIGTERM, Arc::clone(&term))?;
    while !term.load(Ordering::Relaxed) {
        // Do some time-limited stuff here
        // (if this could block forever, then there's no guarantee the signal will have any
        // effect).
    }
    Ok(())
}

I don't get this example, so register sets term to true when it's recieved, but the thing in the while loop will certainly block forever for me, it's listening for new socket connections and handling them.

If you do this you'll run into lifetime issues with your Unit struct because it refers to other references that have a limited lifetime, so you can't store it in a global like that. You'll want an Unit struct that owns all its members.

An ideal Rust-y way would be to pass state into the function, i.e. able to write this but I can't:

You can also pass around something like Arc<Mutex<MyStruct>>, which anyone who has one can safely carry around and mutate (and is threadsafe).

For ergonomics it's usually nicer if you make a struct to encapsulate that, e.g.

struct MyState{
    inner: Arc<Mutex<...>>
}

Busy-loop with atomic flags are hard to use correctly and efficiently. Try their other examples instead.

So something like this to avoid referencing, correct?

pub struct Unit {
   pub name: String,
   pub cmds: Vec<Vec<String>>,
   pub kind: UnitKind,
   pub active: RefCell<bool>
}
1 Like

That example looks too big and unwieldy for me to understand :sweat_smile:

I think I can use std::thread::spawn in a signal handler safely if I delegate it to another function, making the kernel think the call finished.

extern "C" fn sigchld(_: c_int) {
   do_other_thing();
}

then I can put the call in do_other_thing

It doesn't change anything. Any functions which calls some other not signal safe functions are also not signal safe. The most common deal-breaker is the heap allocations, which surely is happened within the thread::spawn().

2 Likes

Heap allocations aren't allowed in a signal handler??

It's not allowed. It's pretty restrictive environment.

Is there not any way to have the handler call out to something that's more free? Self-pipe trick maybe?

See this man page for details on what you're allowed to do in a signal handler:

https://man7.org/linux/man-pages/man7/signal-safety.7.html

That explanation is C-oriented, but applies to Rust in many respects too. For example, Rust by default uses the system allocator provided by libc malloc (on Unix systems), which is not required to be async-signal-safe, so you can't do any heap allocations or deallocations in a signal handler.

1 Like

The fork and exec- family of functions are listed as safe so I could spawn another process to reap chlidren but that sounds terribly overcomplicated... I'll try looking at that signal_hook library again if it mitigates this problem (I am curious how it would do that).

Edit: I managed to get something running with signal_hook. I'm not sure if this does let me get rid of the async-safe-only calls though.

use signal_hook::{consts::SIGCHLD, iterator::Signals};
use std::{error::Error, thread, time::Duration};

fn main() -> Result<(), Box<dyn Error>> {
    let mut signals = Signals::new(&[SIGCHLD])?;

    thread::spawn(move || {
        for sig in signals.forever() {
            std::process::exit(1);
        }
    });
    std::process::Command::new("ls").spawn();
    loop {} // doesn't loop forever because the sigchld breaks

    Ok(())
}

Check out the documentation for signal_hook::low_level::pipe. Apparently the signal_hook::iterator::Signals interface uses this (the same self-pipe trick you mentioned) under the hood, so you're free to do heap allocation or whatever in the for loop. (Which is already apparent from the fact that Signals doesn't require unsafe to use.)

I got this large error:

error[E0277]: `RefCell<bool>` cannot be shared between threads safely
   --> src/main.rs:40:5
    |
40  |     thread::spawn(|| {
    |     ^^^^^^^^^^^^^ `RefCell<bool>` cannot be shared between threads safely
    | 
   ::: /home/user/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:617:8
    |
617 |     F: Send + 'static,
    |        ---- required by this bound in `spawn`
    |
    = help: within `Unit<'_>`, the trait `Sync` is not implemented for `RefCell<bool>`
    = note: required because it appears within the type `Unit<'_>`
    = note: required because of the requirements on the impl of `Sync` for `Unique<Unit<'_>>`
    = note: required because it appears within the type `alloc::raw_vec::RawVec<Unit<'_>>`
    = note: required because it appears within the type `Vec<Unit<'_>>`
    = note: required because of the requirements on the impl of `Send` for `&Vec<Unit<'_>>`
    = note: required because it appears within the type `[closure@src/main.rs:40:19: 62:6]`

error[E0277]: `RefCell<Vec<std::process::Child>>` cannot be shared between threads safely
   --> src/main.rs:40:5
    |
40  |     thread::spawn(|| {
    |     ^^^^^^^^^^^^^ `RefCell<Vec<std::process::Child>>` cannot be shared between threads safely
    | 
   ::: /home/user/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:617:8
    |
617 |     F: Send + 'static,
    |        ---- required by this bound in `spawn`
    |
    = help: within `Unit<'_>`, the trait `Sync` is not implemented for `RefCell<Vec<std::process::Child>>`
    = note: required because it appears within the type `Unit<'_>`
    = note: required because of the requirements on the impl of `Sync` for `Unique<Unit<'_>>`
    = note: required because it appears within the type `alloc::raw_vec::RawVec<Unit<'_>>`
    = note: required because it appears within the type `Vec<Unit<'_>>`
    = note: required because of the requirements on the impl of `Send` for `&Vec<Unit<'_>>`
    = note: required because it appears within the type `[closure@src/main.rs:40:19: 62:6]`

So I came to a solution that almost compiles, which is to protect the Vec with a Mutex. Here's how it looks:

let mut sigset = Signals::new(&[SIGCHLD]).unwrap();
    let mut units: Mutex<Vec<unit::Unit>> = Mutex::new(Vec::new());
    thread::spawn(|| {
        for _sig in sigset.forever() {
            println!("sigchld");
            loop {
                match wait::waitpid(nix::unistd::Pid::from_raw(-1), Some(wait::WaitPidFlag::WNOHANG)) {
                    Ok(_) => {}
                    Err(e) => match e.as_errno() {
                        Some(nix::errno::Errno::ECHILD) => break, // nothing more to reap
                        None => panic!("error reported but #as_errno() returns None, probably a bug in the `nix` crate"), // we should never hit this, ever
                        _ => eprintln!("waitpid: unknown error {}", e), // some other error, report a bug, you can get the error code with `errno -l` 
                    },
                }
                let acq = units.lock();
                for x in acq.iter() {
                    for y in x.running.borrow_mut().iter() {
                        if y.id() == 1 {
                            println!("debugging, change 1 to the real pid");
                        }
                    }
                }
            }
        }
    });

    { // scope so lock freed when doen
        let u = unit::Unit {
            name: "tree",
            cmds: &[&["tree", "/"]],
            kind: unit::UnitKind::Daemon,
            active: std::cell::RefCell::new(false),
            running: std::cell::RefCell::new(Vec::new()),
        };
        let acq = units.lock().unwrap();
        acq.push(u);
        acq.last().unwrap().load()
    }

The trouble is the x.running.borrow_mut() fails to compile. If I remove that block of code, I get errors that it wants to move in units and sigset, but of course I use units later. Rust lifetime errors are quite nice, but quite painful

error[E0373]: closure may outlive the current function, but it borrows `units`, which is owned by the current function
  --> src/main.rs:42:19
   |
42 |     thread::spawn(|| {
   |                   ^^ may outlive borrowed value `units`
...
54 |                 let acq = units.lock();
   |                           ----- `units` is borrowed here
   |
note: function requires argument type to outlive `'static`
  --> src/main.rs:42:5
   |
42 | /     thread::spawn(|| {
43 | |         for _sig in sigset.forever() {
44 | |             println!("sigchld");
45 | |             loop {
...  |
56 | |         }
57 | |     });
   | |______^
help: to force the closure to take ownership of `units` (and any other referenced variables), use the `move` keyword
   |
42 |     thread::spawn(move || {
   |                   ^^^^^^^

error[E0373]: closure may outlive the current function, but it borrows `sigset`, which is owned by the current function
  --> src/main.rs:42:19
   |
42 |     thread::spawn(|| {
   |                   ^^ may outlive borrowed value `sigset`
43 |         for _sig in sigset.forever() {
   |                     ------ `sigset` is borrowed here
   |
note: function requires argument type to outlive `'static`
  --> src/main.rs:42:5
   |
42 | /     thread::spawn(|| {
43 | |         for _sig in sigset.forever() {
44 | |             println!("sigchld");
45 | |             loop {
...  |
56 | |         }
57 | |     });
   | |______^
help: to force the closure to take ownership of `sigset` (and any other referenced variables), use the `move` keyword
   |
42 |     thread::spawn(move || {
   |                   ^^^^^^^

Am I just overcomplicating everything?

Alternatives I'm thinking of: set an atomic variable and poll that (very cpu heavy but should be 100% safe), use kqueue on BSD and signalfd on Linux

The approach that looks to be very commonly used is set a sig_atomic_t, and then poll it, so I think I'm missing something. Isn't this very slow?

I don't need a global variable!

I managed to write a perfecly nice, safe, state encapsulating handler with the self-pipe trick :+1: