Why can I call .iter() on an immutable empty array?

As I'm sure is the case with most new Rustaceans, I've developed a love/hate relationship with the compiler. For the first time, though, I wrote some code that compiled but ended up causing a problem: it calls an API, iterates over the response object, and does some stuff. However, sometimes that response object is empty. I felt a bit let down by my Rustacean sensei, the compiler - why does it allow this?

fn main() {
    let v = [1,2,3];
    
    let u: [u32; 0] = [];
    
    for n in v.iter() {
        println!("{}", n)
    }
    
    for _ in u.iter() {
        println!("checking")
    }
}

Standard Output
1
2
3

It prints 1,2,3 as expected but it doesn't print "checking" because there's nothing to iterate over in u. Why can I iterate over an empty array? Are there situations in which this behavior is valid/useful?

Why wouldn't it? An empty collection is still a collection. It would be really inconsistent if it was not allowed.


Indeed it is but is it possible to do something with nothing? (I have a feeling this could get super philosophical :smile:)

In my real-world case, it was obviously my fault for not handling the case in which the API returns an empty object, but I couldn't think of a situation in which I would want to do this - it seems to me like 9 times out of 10, if someone is trying to iterate over an empty collection and do something, it's going to result in an error.

fn print_all(strings: &[&str]) {
    for s in strings {
        println!("{}", s);
    }
}

Would you expect this to fail to compile unless strings is statically known to be non-empty? That would be really strange. And it would also complicate the language and the compiler immensely.

When you specifically need non-empty arrays, you can easily express that constraint using a newtype, but otherwise it would be an artificial restriction and an unnecessary special case to handle every single time you want to do something with an array or slice.
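To make the newtype idea concrete, here is a minimal sketch. The name `NonEmpty` and its methods are made up for illustration (they are not a standard library type); the point is only that the emptiness check happens once, in the constructor.

```rust
/// Hypothetical newtype: a slice known to hold at least one element.
#[derive(Debug, Clone, Copy)]
struct NonEmpty<'a, T>(&'a [T]);

impl<'a, T> NonEmpty<'a, T> {
    /// Returns `None` for an empty slice, so the check happens exactly once.
    fn new(items: &'a [T]) -> Option<Self> {
        if items.is_empty() {
            None
        } else {
            Some(NonEmpty(items))
        }
    }

    /// Cannot panic: the constructor rejected empty slices.
    fn first(&self) -> &'a T {
        &self.0[0]
    }

    /// Ordinary iteration over the wrapped slice.
    fn iter(&self) -> std::slice::Iter<'a, T> {
        self.0.iter()
    }
}

fn main() {
    let empty: [u32; 0] = [];
    // The empty case is rejected up front instead of at every use site.
    assert!(NonEmpty::new(&empty).is_none());

    let v = [1, 2, 3];
    let ne = NonEmpty::new(&v).expect("v has three elements");
    println!("first = {}", ne.first());
    for n in ne.iter() {
        println!("{}", n);
    }
}
```

Code that really needs at least one element can then take a `NonEmpty` parameter, while everything else keeps accepting plain slices.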


It seems to me that doing anything other than looping zero times given an empty array, vector or other collection would be obviously wrong.

There are lots of situations where a collection may or may not be empty, and if it is empty, you almost always want it to just iterate zero times.
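As a small illustration of "just iterate zero times" (the function name `total` is made up for this sketch), a loop over an empty slice leaves the accumulator at its starting value, which is usually exactly the answer you want:

```rust
/// Sums a slice with an explicit loop; an empty slice runs the body zero times.
fn total(values: &[u32]) -> u32 {
    let mut sum = 0;
    for v in values {
        sum += v;
    }
    sum
}

fn main() {
    let none: [u32; 0] = [];
    println!("{}", total(&[1, 2, 3])); // prints 6
    println!("{}", total(&none)); // prints 0
    // Adapters that cannot produce a value from nothing report it as None.
    println!("{:?}", none.iter().max()); // prints None
}
```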


Very good points, thank you. I guess my real question then is, why would I ever want to define an immutable, empty array? In my code above, could u ever be useful?

It might happen in the case where I am working on the code, and am going to modify the array later.

Another case where it could be useful would be e.g. where I want to build a json object using serde, but I know that in this case the list is always empty. This could be because I am calling an API and the field is required, but I never use it.

#[derive(Serialize)]
struct MyStruct {
    always_empty: [SomeType; 0],
    other_field_i_actually_use: String,
}

Or it could be useful with a generic function that takes something which can be turned into an iterator, and in one of the calls, I want to provide it an empty iterator.

Or it could be useful in code generation in the output of e.g. a macro.
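The generic-function case mentioned above might look roughly like this. `join_all` is a hypothetical helper, and passing arrays by value as iterators assumes Rust 1.53 or newer:

```rust
/// Concatenates anything that can be turned into an iterator of string-like items.
fn join_all<I>(parts: I) -> String
where
    I: IntoIterator,
    I::Item: AsRef<str>,
{
    let mut out = String::new();
    for p in parts {
        out.push_str(p.as_ref());
    }
    out
}

fn main() {
    // An immutable, empty array is a perfectly good argument here.
    let none: [&str; 0] = [];
    println!("{:?}", join_all(["a", "b"])); // prints "ab"
    println!("{:?}", join_all(none)); // prints ""
}
```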


It's not like you purposefully write empty literal arrays directly in code. What if you are parsing JSON and it contains an empty array? What if you are interfacing with a database and you get back an empty result set? Dynamically allowing an empty array is useful for these kinds of reasons, not because you necessarily want to write for item in &[].


I mean, there are other cases that are genuinely never useful such as:

1 + 1;

The above computes 2, then throws it away. But it is allowed because language-wise, it is the same as

a_method_that_first_prints_and_then_returns_an_int();

and the above is obviously sometimes useful. Disallowing 1 + 1; would be a weird special case, although I could imagine that you might add a warning for it.


Another thing to compare to is that I sometimes write this:

if false {
    some_code_i_want_to_disable_for_now();
}

The equivalent of the above for a loop is:

for _ in [(); 0] { // element type spelled out so the empty array type-checks
    some_code_i_want_to_disable_for_now();
}

Thanks! This is really just a testament to how awesome the compiler is. In my short time programming in Rust I've come to depend on the compiler to catch eventualities I haven't handled, so I was surprised when it let me iterate over a collection that could be empty, and do stuff with its non-existent items, without warning me that it could panic.

It cannot panic. Iterating over an array with 0 items performs the loop 0 times.

Empty sets are normal and common in everyday life. "You have no money in your account." "There are zero cars in this parking garage." "Yes, we have no bananas today." It's not a program error for a sequence to be empty; it would be far more weird if it was.

There's even a standard iterator whose job is to always be empty. This comes in handy, for example, in generic code where an iterator is required but the caller doesn't actually have any items to provide.
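The standard iterator in question is `std::iter::empty`. A minimal sketch of the generic-code use case, where `describe` is a made-up function standing in for any API that demands an iterator:

```rust
use std::iter;

/// Hypothetical generic function that requires an iterator of u32.
fn describe<I: Iterator<Item = u32>>(items: I) -> String {
    let collected: Vec<String> = items.map(|n| n.to_string()).collect();
    format!("{} item(s): [{}]", collected.len(), collected.join(", "))
}

fn main() {
    println!("{}", describe(vec![1, 2, 3].into_iter())); // prints 3 item(s): [1, 2, 3]
    // The caller has nothing to provide, so it hands over an always-empty iterator.
    println!("{}", describe(iter::empty())); // prints 0 item(s): []
}
```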


But it cannot panic. By the very definition of a for loop, if the iterated-over collection is empty, the loop body simply never executes. Why would it panic?

In my case the error was:

thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0',...

The code looked something like this:

#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Doc {
    pub dh_doc_id: String,
    pub doc_name: String
}

#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct Res {
    pub docs: Vec<Doc>,
}

let res: Vec<Res> = // call API, deserialize JSON to struct

for doc in res[0].docs.iter() {
    // do some stuff
}

The way that I "fixed" it was to simply wrap the for loop in an if statement that checks the length of res.

It's res[0] that causes that panic, not for doc in ... If you rewrote it like this

let r = &res[0];
for doc in r.docs.iter() {

the panic would happen on the first line because res is empty. .iter() never gets called in this code at all.
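One way to sidestep the panic is to ask for the first element with the slice method `first()`, which returns an Option instead of indexing. This is only a sketch: the structs are trimmed copies of the ones above without the serde derives, and `first_doc_names` is a made-up helper.

```rust
struct Doc {
    doc_name: String,
}

struct Res {
    docs: Vec<Doc>,
}

/// Collects the document names of the first result, or nothing if there is no result.
fn first_doc_names(res: &[Res]) -> Vec<&str> {
    match res.first() {
        // Only reached when res has at least one element.
        Some(r) => r.docs.iter().map(|d| d.doc_name.as_str()).collect(),
        // Empty response: no indexing happens, so nothing can panic.
        None => Vec::new(),
    }
}

fn main() {
    let empty: Vec<Res> = Vec::new();
    println!("{:?}", first_doc_names(&empty)); // prints []

    let one = vec![Res {
        docs: vec![Doc { doc_name: "a.pdf".to_string() }],
    }];
    println!("{:?}", first_doc_names(&one)); // prints ["a.pdf"]
}
```

Note that an empty `docs` vector needs no special handling at all: the inner iteration simply yields nothing.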


In the olden days of Fortran 66, the body of a loop was always executed.

      j = 0
      do 10 i = 1, 0
10    j = j + 1

The variable j would end up with the value 1. The result was some hard-to-find bugs. The problem was fixed with the introduction of Fortran 77. Unfortunately, that change broke all of my programs. (It didn't break any of my wife's because she was a professional programmer.)


For what it's worth, the indexing error you ran into is something I intellectually think shouldn't be allowed, because indexing is an operation that can panic. I think indexing should return an Option unless it is statically provable to be in bounds (e.g. correct bounds checking, something akin to what TypeScript can do with type guards). The slice .get method always gives an Option, so there's that.
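A quick sketch of the difference, using a made-up helper `nth_or`: `.get(i)` hands back an Option that has to be dealt with, while `items[i]` would panic when `i` is out of bounds.

```rust
/// Returns the element at index i, or a fallback when i is out of bounds.
fn nth_or(items: &[u32], i: usize, fallback: u32) -> u32 {
    match items.get(i) {
        Some(value) => *value,
        None => fallback,
    }
}

fn main() {
    let empty: [u32; 0] = [];
    println!("{}", nth_or(&[10, 20, 30], 1, 0)); // prints 20
    println!("{}", nth_or(&empty, 0, 0)); // prints 0, where empty[0] would panic
}
```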

Practically however, I understand that many many many people disagree with that, because array indexing is fundamental to (maybe) all imperative programming languages, and if you don't check your bounds you are a "bad person".

Anyway, there is a lint that can warn you in some cases: ALL the Clippy Lints. It's not enabled by default, so you have to add it.


Good idea :grinning_face_with_smiling_eyes:

warning: unused arithmetic operation that must be used
 --> src/main.rs:2:5
  |
2 |     1 + 1;
  |     ^^^^^
  |
  = note: `#[warn(unused_must_use)]` on by default

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9fa2d8226c624665d3f6e0db3ddd83e5

"Bad person" is definitely too strong. Of late I've been talking about places with algorithmic invariants as being where dogmatically returning Option/Result can hurt more than it helps. That's not unlike what you said about "statically provable", but acknowledging that actually putting that proof machinery in the language may be impractical.

Related previous conversations:

Or, more generally, To panic or to Result - #4 by scottmcm


I have places where I use .get and others indexing depending on the context. I don't like to have too many unwrap()s because I feel I must include a comment of why it's safe.

let foo = vec![1, 2, 3];
let second = foo.get(1).unwrap(); // Unwrap safe because of previous line

let bar = [1; 3];
let n = 1;
let my_index = MyIndex::new(n, bar.len())?;  // Error if n >= bar.len()
let second = bar[my_index.u()]; // my_index.u() returns a usize

Of course the first example is pretty silly, and the second example looks awkward. The issue in the first example is when foo is mutable and modified far from the get(). For the second example, in my real code I create an index once and can pass it around all over the place with the assurance that indexing will never panic.
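The post does not show `MyIndex` itself, so the following is only a guess at what such a type could look like. The `OutOfBounds` error type and the wrapper function `second_of` are assumptions added so the `?` has somewhere to go.

```rust
/// Hypothetical checked index: can only be built for a position below a given length.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MyIndex(usize);

/// Assumed error type; the original post does not say what `new` returns on failure.
#[derive(Debug, PartialEq)]
struct OutOfBounds {
    index: usize,
    len: usize,
}

impl MyIndex {
    /// Validates the index once, at construction time.
    fn new(n: usize, len: usize) -> Result<Self, OutOfBounds> {
        if n < len {
            Ok(MyIndex(n))
        } else {
            Err(OutOfBounds { index: n, len })
        }
    }

    /// Returns the validated position as a plain usize.
    fn u(self) -> usize {
        self.0
    }
}

/// Looks up position n in a fixed-size array, reporting bad positions as errors.
fn second_of(bar: &[i32; 3], n: usize) -> Result<i32, OutOfBounds> {
    let my_index = MyIndex::new(n, bar.len())?;
    Ok(bar[my_index.u()])
}

fn main() {
    let bar = [1; 3];
    println!("{:?}", second_of(&bar, 1)); // prints Ok(1)
    println!("{:?}", second_of(&bar, 3)); // prints Err(OutOfBounds { index: 3, len: 3 })
}
```

In this sketch the guarantee only holds as long as the index is used with a collection at least as long as the one it was validated against, which is why a fixed-size array is a comfortable fit.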