Using Linux seccomp sandboxing properly?

I have to handle some user-supplied data, and I'm planning to put the processing into a separate thread that is sandboxed with seccomp, using something similar to the code below. Maybe others have better, battle-tested approaches they can share?

extern crate prctl;

use std::thread;

fn sandboxed_stuff() {

    println!("Inside sandboxed function");

    match prctl::set_seccomp_strict() {
        Ok(()) => println!("seccomp activated"),
        Err(ret) => {
            // Bail out rather than process untrusted data without the sandbox.
            println!("prctl to seccomp failed with error {}", ret);
            return;
        }
    }

    println!("Sandboxed and reporting!");

    // Do more stuff here.

}


fn main() {

    println!("Starting...");

    let handle = thread::spawn(|| sandboxed_stuff() );

    handle.join().unwrap();

    println!("Back in main and shutting down.");

}

The seccomp-bpf filters look powerful, but I have some pre-3.5 kernel Linux servers which don't support it (and aren't likely to be upgraded anytime soon). Other Rust sandboxing options I had looked at were: rusty-sandbox (no seccomp support at the moment), servo/gaol, and insanitybit/sandbox.

Many thanks.

I wouldn't use my (insanitybit) sandbox. It's going to change radically, soon, and I've done very little to ensure safety. It's a proof of concept. I can't speak to the other two libraries.

@staticassert thank you for the update - I'm looking forward to seeing how it develops further.

Yesterday I did some more testing with the prctl crate approach above and it works well. I placed a File::open before the prctl::set_seccomp_strict() call, then did a read_to_string, and this worked correctly: reads are permitted inside the sandbox on a file descriptor that was opened before entering it. When I placed the File::open after the call to enter seccomp, the thread died as expected, because the open call is blocked by seccomp, which terminates the thread with SIGKILL.

I've had to create some boilerplate checks to ensure that seccomp support exists on the platform I'm running the code on, which I achieved with:

  1. Checking that the running Linux kernel version is >= 2.6.12 (the first kernel to include seccomp). This was done using the nix crate to get a uname result, then using the semver crate to Version::parse the result and do a >= comparison against a static version string.
  2. Calling prctl::get_seccomp to confirm whether or not seccomp support is compiled into the kernel.
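
For anyone who wants to try step 1 without extra dependencies, here is a rough std-only sketch. It is not the nix + semver code described above: it reads /proc/sys/kernel/osrelease (assuming /proc is mounted) and compares plain integer tuples. The names parse_kernel_version, kernel_may_support_seccomp and MIN_SECCOMP are made up for illustration, and passing this check still says nothing about whether seccomp was compiled in, so step 2 is still needed.

```rust
use std::fs;

/// seccomp strict mode first shipped in Linux 2.6.12.
const MIN_SECCOMP: (u32, u32, u32) = (2, 6, 12);

/// Parse the leading "major.minor.patch" of a kernel release string such as
/// "3.2.0-4-amd64". Missing minor/patch components default to 0.
fn parse_kernel_version(release: &str) -> Option<(u32, u32, u32)> {
    // Keep only the leading run of digits and dots, e.g. "3.2.0".
    let numeric: String = release
        .trim()
        .chars()
        .take_while(|c| c.is_ascii_digit() || *c == '.')
        .collect();
    let mut parts = numeric.split('.').filter(|p| !p.is_empty());
    let major = parts.next()?.parse::<u32>().ok()?;
    let minor = match parts.next() {
        Some(p) => p.parse::<u32>().ok()?,
        None => 0,
    };
    let patch = match parts.next() {
        Some(p) => p.parse::<u32>().ok()?,
        None => 0,
    };
    Some((major, minor, patch))
}

/// True if the release string parses and is at least 2.6.12.
/// Tuples compare lexicographically, which matches version ordering here.
fn kernel_may_support_seccomp(release: &str) -> bool {
    match parse_kernel_version(release) {
        Some(version) => version >= MIN_SECCOMP,
        None => false,
    }
}

fn main() {
    // Assumption: this file holds the same string that `uname -r` reports.
    match fs::read_to_string("/proc/sys/kernel/osrelease") {
        Ok(release) => println!(
            "kernel {}: seccomp possible = {}",
            release.trim(),
            kernel_may_support_seccomp(&release)
        ),
        Err(e) => println!("could not read kernel release: {}", e),
    }
}
```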

Seccomp is definitely the choice I'd use; however, there are some considerations to using the v1 version. Seccomp v1 (strict mode) only allows the 4 system calls mentioned in the article (read, write, _exit, and sigreturn), which means, in particular, that you can't map or unmap memory. Rust may actually make that easy to deal with, since it's generally simple to allocate ahead of time and then just use references on the stack.
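
To make the "allocate ahead of time" idea concrete, here is a minimal sketch. It does not call seccomp at all; the comment marks where the switch to strict mode would go. process_into is a hypothetical function name, and the uppercase transform is just a stand-in for real processing. The point is only that the output Vec never grows after the marked line, so the processing step itself never has to ask the allocator for more memory.

```rust
/// Write a transformed copy of `input` into `out` without growing `out`.
/// The caller must have reserved enough spare capacity beforehand.
fn process_into(input: &[u8], out: &mut Vec<u8>) -> Result<(), &'static str> {
    // Refuse instead of reallocating: growing the Vec could need mmap/brk,
    // which strict seccomp does not permit.
    if input.len() > out.capacity() - out.len() {
        return Err("output buffer too small");
    }
    for &byte in input {
        // Pushing within the reserved capacity does not allocate.
        out.push(byte.to_ascii_uppercase());
    }
    Ok(())
}

fn main() {
    let input = b"user supplied data";

    // Reserve everything up front, while allocation is still allowed.
    let mut out: Vec<u8> = Vec::with_capacity(64);
    let capacity_before = out.capacity();

    // <- hypothetical point where strict seccomp would be enabled

    process_into(input, &mut out).expect("buffer was sized for the input");

    // The buffer was filled in place; no reallocation happened.
    assert_eq!(out.capacity(), capacity_before);
    println!("{}", String::from_utf8_lossy(&out));
}
```

Whether a given allocation actually turns into a system call depends on the allocator, so treating "no allocation after the switch" as the rule seems like the safer assumption.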

If you do find the need for more system calls you can use seccomp v2. v2 also allows you to validate that the parameters passed to the system calls match a specific pattern.

You can find a crate for this here:
https://crates.io/crates/seccomp

I've used it to do some basic proof of concept work here:

Note that you should not do thread-level seccomp the way I've done it though - it is only effective at a process level, or through trickery that does not exist in Rust yet.

I took a quick look at your seccomp_rust repo and noted the warning regarding the isolation of trusted vs untrusted threads. Do you have any pointers to material that would help me better understand this issue in the context of Rust? I thought that seccomp applied to each thread individually, and so could properly isolate the untrusted element.

I also read Servo's design document and it talked about having separate trusted/untrusted processes communicating via IPC, but this is new ground for me. A set of untrusted workers spawned from a trusted daemon, each isolated with sandboxes and communicating with something like nanomsg looks very interesting though!

Yeah so the problem is that while seccomp prevents the thread from executing system calls, threads all live in the same address space within a single process.

So if you get arbitrary read/write in the sandboxed thread, you can simply manipulate a trusted thread's stack and have it execute what you need.
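
A toy illustration of the shared address space, under heavy assumptions: nothing here is sandboxed or exploited, and the flag is shared on purpose through an Arc, whereas an attacker with an arbitrary write primitive would not need that cooperation. It only shows that changing state the trusted side relies on is a plain memory store, which involves no system call for seccomp to block. run_demo and is_admin are made-up names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Returns the value of the "trusted" flag after the worker thread has run.
fn run_demo() -> bool {
    // State that the trusted part of the program relies on.
    let is_admin = Arc::new(AtomicBool::new(false));
    let shared = Arc::clone(&is_admin);

    // Stand-in for the sandboxed thread. The store below is an ordinary
    // memory write inside the same process, not a system call.
    let worker = thread::spawn(move || {
        shared.store(true, Ordering::SeqCst);
    });
    worker.join().unwrap();

    // The trusted side now observes the modified value.
    is_admin.load(Ordering::SeqCst)
}

fn main() {
    println!("trusted flag after worker ran: {}", run_demo());
}
```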

I don't know the details of how this can be prevented other than that it's a pain and they did it initially in Chromium.

In terms of process workers, this is a way better model if it works for you - at least it's better in terms of security, given the current tools we have.

This is how the sandbox repo I have (the first one, not the seccomp one) works: it runs code in separate processes. I'm currently rewriting that code to have an easier, less error-prone API as well as seccomp support, but it likely won't be done until next week or next month.

Dropbox implements a strong seccomp filter by having a child process listen on stdin and communicate via stdout. They do this in brotli.
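
A rough sketch of what such a worker could look like in Rust. This is not Dropbox's code, just the general shape: a loop that reads requests from one stream and writes replies to another, so the only descriptors it needs are the ones it was started with. The line-based protocol, handle_request, worker_loop and the --worker flag are all assumptions made up for this example, and the seccomp setup itself is left as a comment.

```rust
use std::env;
use std::io::{self, BufRead, Cursor, Write};

/// Hypothetical per-request work: report the length of the request line.
fn handle_request(line: &str) -> String {
    format!("{} bytes", line.len())
}

/// Read one request per line from `input` and write one reply per line to
/// `output`. Generic over the streams so it can run on stdin/stdout in the
/// child process and on in-memory buffers elsewhere.
fn worker_loop<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        writeln!(output, "{}", handle_request(&line))?;
        // Flush after each reply so the parent is not left waiting.
        output.flush()?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    if env::args().any(|arg| arg == "--worker") {
        // Child mode. A real worker would enable its seccomp filter here,
        // before reading any untrusted input.
        let stdin = io::stdin();
        let stdout = io::stdout();
        worker_loop(stdin.lock(), stdout.lock())
    } else {
        // Demo mode: drive the same loop with in-memory data.
        let mut replies = Vec::new();
        worker_loop(Cursor::new("abc\nhello\n"), &mut replies)?;
        print!("{}", String::from_utf8_lossy(&replies));
        Ok(())
    }
}
```

The trusted parent would spawn this binary with piped stdin and stdout and treat every reply as untrusted data. Note that reading lines into a String allocates, so a loop like this fits a seccomp-bpf filter that permits memory allocation better than strict mode.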

I recommend either using gaol (full disclosure: I maintain gaol) or using minijail (never used it, but heard good things).

@staticassert Thank you for the info and the link to the Dropbox article. I will set a watch on your repo so I can check it out when the rewrite is done.

I think gaol is most likely the right option as well. I'm still messing around with designs for my library, but there's a good chance it'll just be an interface for gaol.

@pcwalton @staticassert OK, thanks. I will have a closer look at Gaol this weekend.