Why is real network IO in unit tests bad?

I've always taken the stance that this is bad, and that unit tests should be scoped to functions that do no IO, or maybe only some filesystem IO, depending on the test.

However, thinking more about the cargo testing ecosystem: cargo test doesn't differentiate between running unit tests and integ tests - it runs all tests at once. So if cargo doesn't care, why should I?

I'd think whether I care if a test (unit or otherwise) does real network IO only depends on the environment of the test (Does it have a network connection / creds? Could it mess up my laptop's filesystem?). I.e. if I'm fine with an integ test making network calls, why shouldn't I be fine with a unit test making network calls in the same environment?

I get that mocking is useful for testing control flow, like what happens if a network request fails, so I still see the use case for that.

Is the general aversion to network IO in unit tests only because it prevents testing all results of the IO call (e.g. network errors) / conditional logic?

In my last job we did tons of network IO in unit tests, since we were testing a replicated system. The only two things I remember being important were:

  • Always use localhost, of course.
  • Set the timeouts to much smaller values than those used in production, so that a test failing because of an unresponsive node shuts down quickly instead of hanging (rough sketch below).
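A minimal sketch of that second point, assuming a std-only client; the address, port, and timeout values here are made up for illustration:

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

#[test]
fn unresponsive_node_fails_fast() {
    // Localhost only, per the first point. The port is hypothetical and
    // would normally belong to a replica spawned by the test itself.
    let addr: SocketAddr = "127.0.0.1:4100".parse().unwrap();

    // Production might wait many seconds before giving up; the test uses a
    // far smaller timeout so a dead node fails the test quickly.
    let result = TcpStream::connect_timeout(&addr, Duration::from_millis(200));
    assert!(result.is_err(), "expected the downed replica to be unreachable");
}
```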
2 Likes

It's not that network IO is bad, it's that anything that needs to go off-box is bad because doing that in CI and such is bad.

If you want in a test to start a server on a random port and send requests to that port, then shut it down again, sure, whatever, you do you.
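A minimal sketch of that pattern using only std; the one-shot echo exchange is invented for illustration:

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

#[test]
fn round_trip_over_a_random_port() {
    // Binding to port 0 asks the OS for any free port, so parallel test
    // runs never collide.
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();

    // A tiny echo server that handles exactly one connection, then exits.
    let server = thread::spawn(move || {
        let (mut socket, _) = listener.accept().unwrap();
        let mut buf = [0u8; 5];
        socket.read_exact(&mut buf).unwrap();
        socket.write_all(&buf).unwrap();
    });

    let mut client = TcpStream::connect(addr).unwrap();
    client.write_all(b"hello").unwrap();
    let mut reply = [0u8; 5];
    client.read_exact(&mut reply).unwrap();
    assert_eq!(&reply, b"hello");

    // Joining the server thread is the "shut it down again" step.
    server.join().unwrap();
}
```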

2 Likes

It's convenient to be able to test parts of the program without having to set up dependencies.

If bringing up the network component is trivial, or can even be done by a factory method in the test suite itself, it can be OK.
If it takes scripting, installing packages, or acquiring credentials for external infrastructure, it makes life more difficult: things might simply not work for someone getting onboarded to the project, or in CI. And if the external APIs take a long time to respond, the test suite gets slow, too.

Where possible, self-contained tests are preferable IMO. But if I had to pick between no test at all and one involving some networking, the latter is fine.

1 Like

Cargo’s “integration tests” are indeed a misnomer — they are just a different way of compiling tests (fully separate crate & target vs. cfg(test)ed non-test crate or doctests) and you should use it or not as suits the tests’ technical needs, not the role they play.

However, you should think about categorizing your tests in terms of useful subsets. Sometimes it is very useful to be able to select “only the tests that compile and run very, very fast” when working on a problem that requires lots of iteration. A test that uses network system calls is not necessarily slower than a test that doesn’t, but it might be, in which case you might find value in putting those tests in a separate target even if there is no technical necessity.
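For instance, `#[ignore]` gives you a cheap subset split without even needing a separate target (the test bodies here are placeholders):

```rust
// Fast, IO-free test: runs on every plain `cargo test`.
#[test]
fn parses_request_line() {
    assert!("GET / HTTP/1.1".starts_with("GET"));
}

// Slower network test: skipped by default, selected with
// `cargo test -- --ignored` (or `-- --include-ignored` to run everything).
#[test]
#[ignore = "does real network IO; can be slow on loaded CI hosts"]
fn full_exchange_against_local_server() {
    // ...spin up a localhost server and talk to it...
}
```

Moving the slow tests into their own file under tests/ and selecting them with `cargo test --test <name>` gets you the same split at the target level.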

4 Likes

Why is doing that in CI bad? If it were, wouldn't that mean you also shouldn't integ / E2E test in CI?

Because unit tests should run on the build server, where the build isn't allowed to go off-box at all.

Then once you have a build, you deploy the build to various environments for non-unit testing.

1 Like

Network IO done wrong can cause all sorts of issues in tests:

  • Hard-coding listening ports: now your tests cannot be run in parallel any more, as there will be port conflicts.
  • Timing issues: sometimes tests will mysteriously time out, for any reasonable value you set. Yes, a CI host will be randomly overloaded, leading to flaky tests that you can never reproduce. This reduces overall confidence in the tests, making people say "oh, I guess it is just being flaky again", which will hide real sporadic issues.
  • Don't rely on any services outside the test itself. If you need a server, start a dummy server and dummy client as part of the test (see the sketch after this list). Don't rely on connecting to a server off localhost.
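A sketch combining all three points, assuming Tokio; the names and the four-byte echo are invented:

```rust
use std::time::Duration;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};
use tokio::time::timeout;

#[tokio::test]
async fn dummy_server_and_client_in_one_test() {
    // No hard-coded port: port 0 lets the OS pick, so parallel runs never
    // conflict. Localhost only.
    let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
    let addr = listener.local_addr().unwrap();

    // The dummy server lives entirely inside the test.
    let server = tokio::spawn(async move {
        let (mut socket, _) = listener.accept().await.unwrap();
        let mut buf = [0u8; 4];
        socket.read_exact(&mut buf).await.unwrap();
        socket.write_all(&buf).await.unwrap();
    });

    // A generous-but-bounded deadline: far larger than the exchange should
    // ever need (to survive an overloaded CI host), yet it still keeps the
    // test from hanging forever.
    let exchange = async {
        let mut client = TcpStream::connect(addr).await.unwrap();
        client.write_all(b"ping").await.unwrap();
        let mut reply = [0u8; 4];
        client.read_exact(&mut reply).await.unwrap();
        reply
    };
    let reply = timeout(Duration::from_secs(5), exchange).await.unwrap();
    assert_eq!(&reply, b"ping");

    server.await.unwrap();
}
```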

Can you do it right? Yes, but I have seen it done wrong more often than right.

1 Like

But why?

Because running tests on a box that is not isolated:

  • It can be a security issue: if there is a bootstrap mechanism in the code under test that downloads a genuinely malicious payload, isolating the box as a whole in terms of I/O prevents the bootstrap mechanism from downloading it.
  • It introduces dependencies on 3rd-party hosts, making the tests more fragile: if that 3rd party changes any relevant functionality (which you may not even be notified of), the test might break, sometimes silently.
  • If those tests are unit tests, having external dependencies is undesirable: you're only testing the unit, not the world at large. It's desirable to be able to test the unit from anywhere, on any supported hardware and OS, which requires as few test-time dependencies as possible. This is a large part of the reason that mocking exists.
1 Like

Using network IO in tests is fine. Tokio's test suite does it all over the place. Attempting to mock out a TcpStream when you could use a real one makes no sense in my book.
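For example, a small framing function (made up here as a stand-in for real code under test) can be exercised over a real loopback socket just as easily as over a mock:

```rust
use tokio::io::{AsyncReadExt, AsyncWrite, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};

// Hypothetical unit under test: length-prefixes a message onto any stream.
async fn write_frame<S: AsyncWrite + Unpin>(stream: &mut S, msg: &[u8]) -> std::io::Result<()> {
    stream.write_u32(msg.len() as u32).await?;
    stream.write_all(msg).await
}

#[tokio::test]
async fn write_frame_over_a_real_socket() {
    let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
    let addr = listener.local_addr().unwrap();

    // The peer decodes the frame, so the test checks real end-to-end bytes.
    let reader = tokio::spawn(async move {
        let (mut socket, _) = listener.accept().await.unwrap();
        let len = socket.read_u32().await.unwrap() as usize;
        let mut buf = vec![0u8; len];
        socket.read_exact(&mut buf).await.unwrap();
        buf
    });

    // A real TcpStream, not a mock: the exact API surface production sees.
    let mut client = TcpStream::connect(addr).await.unwrap();
    write_frame(&mut client, b"hello").await.unwrap();

    assert_eq!(reader.await.unwrap(), b"hello");
}
```

The function under test never needs to know it's talking to a test; the listener simply plays the other endpoint.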

2 Likes

I agree that it makes no sense to mock a TcpStream when both of its endpoints are under control of (i.e. set up as well as torn down by) the test.
If that is not true, however, then what I said in my previous post holds, and depending on a 3rd party service/API in a test is just asking for trouble down the road if you ask me.

1 Like

Congratulations, you've now stumbled into the job description of a typical build- or devops engineer. :slight_smile:

It's "bad" because if you have a set of developers working on new feature development, you're probably going to have multiple commits, several open PRs, and many code reviews related to this one feature in flight at once. If each of these needs to wait for external I/O resources, it can take a significant amount of time. (When I worked for Electriq Power, back in ye olden days, integration tests against live hardware could take up to 48 hours to complete.)

This is why most people provide levels of tests. Unit tests are, by their very name, defined to test individual units of code. That's it. Integration tests are where you start coupling things together; here's where configuration management starts to get tested. Systems testing is where you then deploy to a staging environment, perhaps with customer oversight, so that an entire system can be tested as a whole. Here's where things like user-interface interaction are considered, etc. All this stuff takes varying amounts of time. (This also explains why there are several hundred or more patches per alpha release, several tens of alpha releases per beta release, maybe a handful of beta releases per "release candidate", etc. Each aggregation of functionality includes a lot of hidden development most people don't get to see.)

If you can integration test in milliseconds to seconds, there's no point in isolating these levels of tests. But when seconds turn into minutes, and you've developed yourself into a corner because you don't have any structured testing in place, your development velocity is going to be severely and negatively impacted. AND, worse, it's a boiling-frog situation: you won't notice it at first because, well, "it was fast at first." It won't be until many patches later, after you've probably hired a few employees, acquired more than a few customers, and made serious investment in your existing testing infrastructure, that you'll find your velocity sluggish and heavy-weight. Then your devs will probably start blaming agile development practices for being over-rated and detrimental (yes, I've seen this happen for exactly these reasons).

And, let's not forget that CI/CD infrastructure costs money, especially if it's cloud-hosted. If you abuse GitHub Actions, for example, that can either throttle your development velocity due to API limits, or it can end up costing you money. From GitHub's docs:

For private repositories, each GitHub account receives a quota of free minutes and storage for use with GitHub-hosted runners, depending on the account's plan. Any usage beyond the included amounts is billed to your account.

Keep your unit tests fast. Keep your integration tests separate. Keep your systems tests separate from those. Etc. It's more work now, but it'll make development much smoother when your project grows in both manpower and resources later.
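In Cargo terms, that separation might look something like this (the target names and the feature gate are illustrative):

```toml
# Unit tests live in the library crate itself (#[cfg(test)] modules) and
# run on every `cargo test`.

# Integration tests get their own target:
#   cargo test --test integration
[[test]]
name = "integration"
path = "tests/integration.rs"

# Systems-level tests are gated behind a feature, so they only build and
# run when explicitly requested:
#   cargo test --test system --features system-tests
[[test]]
name = "system"
path = "tests/system.rs"
required-features = ["system-tests"]

[features]
system-tests = []
```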

2 Likes