Aren't there any efforts to bring Rust's dependency number down?

I agree. That's where the second part of one of the OP's comments comes in handy -- being able to split std into a couple of crates and whitelist them separately. If I look at my deps, most of them don't access the OS or the FS; they use collections, mutexes, etc. Very few use the FS or call the OS (except for heap allocation, for which I can provide my own global allocator and audit at runtime).
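As a rough sketch of what I mean by auditing heap allocation at runtime: a wrapper around the system allocator that keeps a running count of live allocations (the `CountingAlloc` name is made up; this is an illustration, not an existing crate):

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical auditing allocator: delegates to the system allocator
// and keeps a running count of live heap allocations.
struct CountingAlloc;

static LIVE_ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        LIVE_ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        LIVE_ALLOCATIONS.fetch_sub(1, Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn main() {
    let v = vec![1, 2, 3];
    println!(
        "live allocations: {}",
        LIVE_ALLOCATIONS.load(Ordering::Relaxed)
    );
    drop(v);
}
```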

If I have to manually audit only half of the deps, it's still a win for me.

I think there's a lot of misunderstanding happening in this chat. :slight_smile: I've been mainly commenting on whether unsafe can be used to increase security, while others have been almost exclusively commenting on whether it alone should be used to increase security.

I agree that unsafe alone can't guarantee all safety and other mechanisms would need to be in place.

This is a good point; I don't think it's been mentioned yet. I think macro-generated code would have to go through the same security checks as a 3rd-party library.

The way macros are written has always been beyond me, so I could be completely wrong here, but shouldn't it be possible to do static analysis on a macro to determine whether it can make calls to IO functions? I agree with what you said about unsafe not being the sole source of malware, it's just one of the things on the list, and an easy example to point to. Can you think of any way to write malware that doesn't use unsafe, and also doesn't use any form of IO, as provided by the standard library?

Proc macros are compiled and run, basically compiler plugins, so I'm sure analysis of those is (mathematically) impossible in the general case. "Macros by example" I'm a little less certain about, but I wouldn't be surprised at all if they are also Turing complete. You could, of course, take a best-effort approach.
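Even with plain `macro_rules!`, the code that matters only exists after expansion, so any such check would have to run on the expanded output rather than on what the programmer typed. A trivial illustration (the macro name is made up):

```rust
// Hypothetical macro: the call site looks harmless, but the expansion
// performs filesystem I/O.
macro_rules! innocent_looking {
    ($path:expr) => {
        std::fs::read_to_string($path).unwrap_or_default()
    };
}

fn main() {
    // Nothing at this call site mentions the filesystem.
    let contents = innocent_looking!("/etc/hostname");
    println!("{}", contents);
}
```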

Well, how are you defining malware, or, in your previous post, malicious code? How about a date-triggered fork bomb or OOM-inducer that DoSes you? Or, as others have mentioned, something like a voting-algorithm package that delivers unfair results. Or a crypto library that's deliberately weakened. A password generator whose output the implementers can predict. There are a lot of possibilities.
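For example, something like this is a denial-of-service payload written entirely in safe code (a sketch only; the trigger date is arbitrary):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

fn main() {
    // Hypothetical trigger: only misbehave after some future date.
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_secs();
    if now > 1_900_000_000 {
        // Plain safe code that exhausts memory: no unsafe anywhere.
        let mut hog: Vec<Vec<u8>> = Vec::new();
        loop {
            hog.push(vec![0u8; 1 << 20]);
        }
    }
}
```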

3 Likes

Probably. Even the type system is Turing complete: Rust's Type System is Turing-Complete – Recursive Descent into Madness – A countable set of sanities and insanities, by Shea Leffler.

Malware is just ordinary code that does things the users don't want. If you can write ordinary code that is useful, then you can write malware. Preventing malware by a language mechanism means preventing ordinary code from being written. (Which won't prevent anything, because people will just always opt-in to whatever mechanism they need to write ordinary code, which includes malware.)

2 Likes

They are.

2 Likes

Anyone who thinks that crates implementing "pure" algorithms can't be used for malware, please don't ever work in crypto.

2 Likes

I agree but...

I can't help thinking there are a number of different issues being discussed here as if they were all one.

Let's accept that "malware" can be introduced into any program or function and that it's very hard, perhaps impossible, to detect. Detecting it in general is futile; even human reviewers cannot reliably spot it. On that front we might as well give up.

On the other hand, this thread started out with a question about the number of dependencies required to create almost any Rust program, and how we can reduce that number or be more confident those dependencies have not been back-doored. If I understand correctly.

So let me propose a simple example....

I create a pseudo-random number generator crate. It's the best, most statistically sound PRNG ever invented. Cryptographically secure. It boils down to a couple of functions: one to seed the PRNG and one to pull out random values.

Now, if you are using my PRNG crate, you might like to know if a new version ever starts to use "unsafe" where it never did before. That might indicate that somebody has been tampering with it down the supply chain. Clearly a tool could alert you to that eventuality and you could investigate further.

Or perhaps a new version of my PRNG suddenly sprouts a dependency on something that does file-system access or networking, where it never did before and clearly never needed to in order to perform its function. Again, a tool could alert you to that new I/O dependency and you could investigate further.

So I suspect that such an "unsafe" and I/O dependency tracking tool could be useful in raising alarms for shenanigans going on in the crate supply chain.
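As a sketch of the core of such a tool, here is a pass over a single source file using the `syn` crate (it needs syn's "full" and "visit" features); a real tool would run this over the old and the new release of a dependency and flag any increase. It ignores `unsafe impl` and `unsafe trait`, so treat it purely as an illustration:

```rust
use syn::visit::Visit;

// Counts unsafe blocks and unsafe fn declarations in one parsed file.
#[derive(Default)]
struct UnsafeCounter {
    count: usize,
}

impl<'ast> Visit<'ast> for UnsafeCounter {
    fn visit_expr_unsafe(&mut self, node: &'ast syn::ExprUnsafe) {
        self.count += 1;
        syn::visit::visit_expr_unsafe(self, node);
    }

    fn visit_item_fn(&mut self, node: &'ast syn::ItemFn) {
        if node.sig.unsafety.is_some() {
            self.count += 1;
        }
        syn::visit::visit_item_fn(self, node);
    }
}

fn count_unsafe(source: &str) -> syn::Result<usize> {
    let file = syn::parse_file(source)?;
    let mut counter = UnsafeCounter::default();
    counter.visit_file(&file);
    Ok(counter.count)
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical layout: the previous and the candidate release,
    // unpacked side by side.
    let old = count_unsafe(&std::fs::read_to_string("old/src/lib.rs")?)?;
    let new = count_unsafe(&std::fs::read_to_string("new/src/lib.rs")?)?;
    if new > old {
        eprintln!("warning: unsafe usage grew from {old} to {new}");
    }
    Ok(())
}
```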

Now, of course, some cunning hacker could tweak my PRNG so that it is no longer cryptographically secure and hence gain easier access to any system you defend with it. Well, yep. But that does not negate the value of the checks I describe above.

Surely something is better than nothing?

2 Likes

Just pin the version.

There are two main principles for developing reliable systems:

  1. Redundancy. If your system fails, you should have an online backup which can take over.

  2. Recovery. If your system fails, you should have an offline backup of a good build, so you can restore to a known working state easily and quickly.

You cannot expect to solve malware problems by asking malware writers to declare their code to be malware, because they will not. You also cannot ask everyone else to declare their code not to be malware, because everyone will. unsafe does not mean 'malware here', and it is not a suitable proxy for this kind of decision.

I think tools to automate review of the crate supply chain like that are possible and of value.

Reproducible & check-summed builds are valuable. The ability to pin versions is valuable. Anything that supports the two points I outlined above is valuable. Trying to use unsafe as a declaration of 'bad code' is not.

Yes indeed. That would do it. If you know you have a "good" version, why move on?

However, I feel I did not make my argument clearly.

I'm not talking about "unsafe" as being a declaration of "bad code". Far from it. Nor am I talking about I/O dependencies being such.

What I am suggesting is that in the normal way of things people often want to adopt a new version of some dependency. For performance reasons, or bug fixes, or new features, whatever.

I assume they have reviewed it or somehow have confidence that the version they are using is malware-free. Perhaps they just trust me.

I suggest, then, that if some new version suddenly introduces "unsafe" when it never had any before and has no particular reason to have it, that is a red flag that something has been tampered with in the supply chain. Or that I have turned rogue.

Similarly for new versions that suddenly sprout file-system, network or other dependencies that they never had before and have no reason to need.

These things are a red flag that indicates some further investigation is in order.

Certainly I don't suggest this is a full-on analysis that is sure to find malware. I do suggest these are relatively simple checks that could be put in place with tools, and that they have value in detecting bad behaviour.

2 Likes

Or you're just optimizing it? If someone trusted you before, why wouldn't they trust you now? More importantly, you don't need unsafe to do this: maybe you already introduced unsoundness in your crate before people relied on it. Now you can do malicious things later using only safe code.
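To make that concrete, here is a toy version of the kind of latent unsoundness being described (the names are invented): version 1 ships a "safe" wrapper that skips bounds checks, and a later change written entirely in safe code starts feeding it out-of-range indices.

```rust
// Version 1: a "safe" API that is actually unsound, because safe callers
// can trigger undefined behaviour by passing an out-of-range index.
pub fn fast_get(data: &[u8], index: usize) -> u8 {
    // No bounds check: this is the latent hole.
    unsafe { *data.get_unchecked(index) }
}

// Version 2: a later change, containing no unsafe of its own, now reads
// out of bounds (the loop goes one past the end).
pub fn checksum(data: &[u8]) -> u8 {
    let mut acc = 0u8;
    for i in 0..=data.len() {
        acc = acc.wrapping_add(fast_get(data, i));
    }
    acc
}
```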

The fact is, if you are a malicious actor and people depend on your code, you have them in a tough spot. Unsafe is not the red flag here, your willingness to change your code without suitable explanation is the red flag. (And if the code you download isn't the code in the repository, that's a problem regardless of whether there's any unsafe in it.)

1 Like

Sure. That is OK. The tool raises a red flag: "warning: 'unsafe' introduced". A concerned user checks it out, perhaps discussing it with the author.

It's not all about trusting me. It's about trusting the supply chain. Not that I couldn't have gone rogue, of course.

I think we are finally talking the same language.

If your supply chain is compromised, are you going to sit around ignoring it because no unsafe was introduced? If your highly audited crypto code suddenly changes, are you going to overlook it because no unsafe was introduced?

If so, your process is primed to fail. Flagging unsafe is not helpful to solving any of these problems. You can write malware in completely safe code. Understand that, and this conversation might actually move forward in some way.

1 Like

I'd like to see some proposals/ideas/approaches that can be used, tiny wins that can be implemented that solve part of the problem, but so far the comments tend to be about how avoiding unsafe doesn't solve anything. I think everyone understands that it's not a complete solution, but the question remains -- what could be a (partial) solution? What would help?

I don't think that the goal really was to avoid malware but to reduce the manual effort of checking/auditing dependencies and scanning for exploits. If 70% of exploits happen because of OOB memory access, reducing the use of unsafe would reduce this vector.

Anecdotal evidence warning: At work we use Rust for one project and so far all 3 RUSTSEC advisories in the last year were OOB in unsafe code. It was never a direct dependency but a dependency of a well-known/widely-used dependency (like rand) and there was nothing much we could do about it except hope that a large project like warp gets updated soon which it never does because of the release schedule.

I'd prefer to use fractionally slower/less performant code which panics rather than clever/optimised unsafe code. I'd definitely benefit from not depending on unsafe code.

A feature flag where I can opt-in to optimised or safe code would help in my case. Maybe not doable in general as it'd have to be viral and descend to deps of deps wherever possible.
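Roughly what I have in mind, assuming a hypothetical `unsafe_opt` Cargo feature (the feature and function names are made up):

```rust
/// With the hypothetical `unsafe_opt` feature enabled, the crate compiles
/// an unchecked, optimised path; without it, only safe (possibly slower,
/// possibly panicking) code is compiled in.
#[cfg(feature = "unsafe_opt")]
pub fn sum(data: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..data.len() {
        // SAFETY: i is always in bounds because of the loop condition.
        total += unsafe { *data.get_unchecked(i) };
    }
    total
}

#[cfg(not(feature = "unsafe_opt"))]
pub fn sum(data: &[u64]) -> u64 {
    data.iter().copied().sum()
}
```

Cargo features can be forwarded to dependencies (a feature can enable `some-dep/unsafe_opt`), but only if every crate in the chain cooperates, which is the viral part.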

So there are two different problems being discussed here.

On the one hand, ensuring the code you rely on is sound is definitely served by reducing the amount of unsafe in your dependencies. It's not clear to me what more can be done, because proving soundness is extremely difficult. But, as I mentioned above, getting the build process to be stable and reproducible, and perhaps integrating with tools like miri by default, could make it much easier to study code and collaborate on finding unsoundness, alongside the ongoing efforts to define what assumptions unsafe code is allowed to make.

Anecdotal evidence warning: At work we use Rust for one project and so far all 3 RUSTSEC advisories in the last year were OOB in unsafe code.

Do you have any evidence that this unsafe code could have been replaced with safe alternatives? If not, then what would you have expected the developers to change? If so, then why did they make the change, and how did it get into your product?

As for the other problem, protecting yourself from mistakes in well-intentioned code is very different from protecting yourself from intentionally malicious agents -- you can't do that by attempting to target their code rather than by ensuring soundness in your own. If your code is unsound, a malicious agent doesn't need your permission to do anything. If you're pulling in dependencies written by malicious agents, you've effectively already given them your permission to do what they want.

I think this is the core of the issue. Much of what's going on here seems to me to just be people saying "we could make some cases better" and others saying "we can't fix all the cases by that", and they're both right.

I feel like there's a major 80/20 opportunity here. Probably 80% of the code I bring in will be from 20% of the dependencies, and those dependencies will be complicated & important ones, and they'll probably need unsafe code. That's fine. I'm not going to say that regex shouldn't use unsafe anywhere -- of course it (or a dependency) will somewhere.

But I also think there's 20% of the code coming from 80% of the dependencies that just do simple little convenience things. Do I want leftpad using unsafe code? No, I'd rather just write it myself than audit it for something simple like that. But I'd love it if there was a way for, say, rev_slice — Rust library // Lib.rs to be written without unsafe in such a way that it's obvious to static analysis that it's not doing anything unsound. Is that a critical library to anyone? No, and it never will be. But it sure would be nice to be able to use it in normal code with less concern because Rust can tell it's not sketchy.
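For that class of crate, `#![forbid(unsafe_code)]` already gives a compiler-enforced, machine-checkable version of "nothing sketchy in here". A convenience crate in that spirit might look like this (a sketch, not the actual rev_slice code):

```rust
// Crate-level: the compiler rejects any unsafe block in this crate,
// so "no unsafe here" is verifiable without reading every line.
#![forbid(unsafe_code)]

/// A reversed view of a slice, implemented purely with safe slice APIs.
pub struct RevSlice<'a, T>(&'a [T]);

impl<'a, T> RevSlice<'a, T> {
    pub fn new(inner: &'a [T]) -> Self {
        RevSlice(inner)
    }

    /// Element `index` counting from the back; bounds checking is done
    /// by the standard library, so out-of-range access returns None.
    pub fn get(&self, index: usize) -> Option<&T> {
        self.0.len().checked_sub(index + 1).and_then(|i| self.0.get(i))
    }
}
```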

(And yes, if you're writing crypto then you still couldn't trust it. But the number of people actually writing industrial-strength crypto is tiny, and they're probably not using little convenience API crates like that anyway, so I think that's just a distraction.)

So maybe it'd help in these comments if people were to put at the top of the posts what their expectation is from whatever's being discussed. Or maybe just to move all the conversation to threads with actual specific proposals as context, to reduce crosstalk.

1 Like

I, at least, have not made such a claim here. As far as I can tell everyone here understands the truth of what you say.

I'm not sure how to express what I/we are suggesting here any more clearly. It's not about totally assuring freedom from malware by looking for "unsafe". It's about raising red flags when "unsafe" or I/O is introduced where it is not expected.

It's about the fact that most code is never audited by anyone. That we can assume supply chains are compromised. So, if I happen to have had time to look it over at some point it might be useful to be alerted when something suspicious happens.

No perfection. But something is better than nothing.

3 Likes

The JavaScript community has taken a different approach with Secure EcmaScript (SES). SES is an object capability language, which is just a fancy way of saying that a method you call only gets access to the things you explicitly pass. There are no ambient authorities. For example, a module that needs access to a file does not have permission to open it; the open file handle (or whatever JavaScript calls it) must be passed in. SES also takes care of other JavaScript specific issues, such as making the global state immutable.

JavaScript is a garbage collected language, which makes implementing SES easier. However, it is possible to audit Rust for many of the same potential vulnerabilities. SES won't do anything to assure you that your crypto library correctly implements a good algorithm. However, it can prevent a module that gets your key from leaking it; SES even closes some side channels.
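A rough Rust analogue of the capability style is simply passing handles in instead of reaching for ambient APIs (the function name here is invented):

```rust
use std::fs::File;
use std::io::{self, Read, Write};

// Capability style: the function can only touch the resources it is
// handed; the caller decides which authorities to grant.
fn copy_config(mut input: impl Read, mut output: impl Write) -> io::Result<u64> {
    io::copy(&mut input, &mut output)
}

fn main() -> io::Result<()> {
    let source = File::open("config.toml")?;
    let dest = File::create("config.bak")?;
    copy_config(source, dest)?;
    Ok(())
}
```

The catch, and presumably part of why a Secure Rust is hard, is that nothing stops the body of copy_config from calling File::open itself; SES actually removes that ambient authority, while Rust today does not.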

I took a brief look at what it would take to create a Secure Rust, and decided it was too hard. However, experience with a similar idea for Java showed that an IDE can report violations of the relatively simple rules behind SES.

1 Like

Never updating your dependencies is "something."

There is also the possibility that introducing "security" auditing using unsafe as a crutch is going to lead people to believe in the fallacy that you need unsafe code to create malware and actually lead to less effective auditing. Apparently some developers already believe this to some degree.

You can have very big, complex dependencies without unsafe, and there will be more as frameworks mature and increasingly complex middleware develops. Example: how do you know that large web framework hasn't had a little route-bypass backdoor installed? No unsafe or heavy crypto implemented here, so no need to audit, right?
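A sketch of what that could look like, in entirely safe code (the request type and the header name are invented):

```rust
use std::collections::HashMap;

struct Request {
    headers: HashMap<String, String>,
}

// Hypothetical handler guard: entirely safe code, yet it quietly
// bypasses authentication when a magic header is present.
fn is_authorized(req: &Request, valid_token: &str) -> bool {
    if req.headers.get("x-debug-key").map(String::as_str) == Some("letmein") {
        return true; // the backdoor: one easy-to-miss extra condition
    }
    req.headers.get("authorization").map(String::as_str) == Some(valid_token)
}

fn main() {
    let mut headers = HashMap::new();
    headers.insert("x-debug-key".to_string(), "letmein".to_string());
    let req = Request { headers };
    // Authorized without ever presenting the real token.
    assert!(is_authorized(&req, "real-token"));
}
```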

Sometimes "Something is better than nothing." is false.