When the return keyword is needed

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if b.is_empty() {
        return a;
    }
    for elem in b {
        a.retain(|x| *x != elem)
    }
    a
}

Hi! Sorry if I am digging up a corpse by replying to such an old post. I was solving this problem, and as you can see, there is a return a; in the if block. I really love that Rust lets me omit the return keyword and just write the variable name, as in the last line of this function. Can someone explain why the return keyword is needed inside an if block?

A block expression like { a; b; c }, when it reaches its last expression c, evaluates to the result of that expression.

This is true not just for function bodies, but for any block. For example:

fn main() {
    let x = {
        println!("hello");
        1
    }; // This block evaluates to `1`.
    
    println!("x is {}", x); // prints "x is 1".
}

Note that the value of the inner block does not get returned from the main function; it is just an expression like any other. Here we assign it to a variable, and the function continues running.

However, the block will not reach its last expression if you break out of it early with a keyword like break or continue or return.

return foo in a function or closure causes the function/closure to return early with the value foo.

break foo in a loop expression causes the loop to stop running and evaluate to foo.
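A minimal sketch of break with a value. The function name first_power_of_two_at_least is made up for illustration, and the sketch assumes limit is small enough that doubling n never overflows:

```rust
/// Hypothetical helper: smallest power of two that is >= `limit`.
/// Assumes `limit` is small enough that `n *= 2` does not overflow.
fn first_power_of_two_at_least(limit: u32) -> u32 {
    let mut n = 1;
    // `loop` is an expression; `break n` stops it and makes it evaluate to `n`.
    let result = loop {
        if n >= limit {
            break n;
        }
        n *= 2;
    };
    // The function keeps running after the loop; `result` is the tail expression.
    result
}
```

Note that break n only leaves the loop. The function itself still returns through its last expression.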

You always need the return keyword (or the ? operator, which is shorthand for returning an error) when you want a function to return early (i.e., before reaching the end of the body).
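As a rough illustration of the ? shorthand, here is a sketch with a made-up helper, parse_and_add, that is not part of the original problem. Each ? either unwraps the Ok value or returns early from the function with the error:

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parses two integers and adds them.
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    // On `Err`, `?` returns early from `parse_and_add` with that error.
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    // Tail expression: only reached if both parses succeeded.
    Ok(x + y)
}
```

With valid input such as "2" and "3" the tail expression is reached. If either string fails to parse, the lines after the failing ? never run.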


Think of the function as

fn array_diff<GENERICS>(ARGS) -> RETURN_TYPE
IMPLEMENTATION

where the IMPLEMENTATION is a block of type RETURN_TYPE. So IMPLEMENTATION is

{ STATEMENT1; STATEMENT2; EXPRESSION }

i.e.

  STATEMENT1      STATEMENT2     EXPRESSION
  vvvvvvvvvv   vvvvvvvvvvvvvvvv  vvvvvvvvvv
{ if … { … } ; for … in … { … } ;     a     }

(Your actual source code doesn’t contain the semicolons because they are optional after if / for / while / match expressions (and a few more) under certain circumstances)

In line with what @mbrubeck mentioned above, such a block is executed by evaluating all the statements (ignoring their results), then evaluating the final expression, which then becomes the return value of the block.

The return value of the function body (which is a block) is the return value of the function; the return keyword is a different way to return from a function. It operates as an early return. Evaluating a return a expression short-circuits right out of the surrounding function with the effect of skipping evaluation of subsequent statements in a block, or also subsequent subexpressions in other situations.

The evaluation of

{ STATEMENT1; STATEMENT2; EXPRESSION }

first evaluates the first statement, then the second statement, then the final expression, and the block's value is the result of that final expression. The exception is when something evaluated along the way, such as a return expression, exits early. Here the first statement, if b.is_empty() { return a; }, can hit a return. If the condition is true and return a; is evaluated, STATEMENT2 and the final EXPRESSION of the function are skipped.
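To make the skipping visible, here is a small sketch with a hypothetical function, trace, that records which parts of its body ran. It has the same shape as the { STATEMENT1; STATEMENT2; EXPRESSION } block above:

```rust
/// Hypothetical helper that records which parts of the body were evaluated.
fn trace(early: bool) -> Vec<&'static str> {
    let mut steps = vec!["statement 1"];
    if early {
        return steps; // early return: nothing below this line runs
    }
    steps.push("statement 2");
    steps.push("final expression");
    steps // tail expression: the value of the body block
}
```

Calling trace(true) yields only the first entry, while trace(false) runs through to the tail expression and yields all three.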


If you feel like exploring how exactly return expressions work in Rust in terms of syntax and the type system, take a look at this recent topic (that got perhaps a bit too technical)


Edit: Another thing: if you want to avoid using return, you can rewrite your code example using an else block:

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if b.is_empty() {
        return a;
    } else {
        for elem in b {
            a.retain(|x| *x != elem)
        }
        a
    }
}

then the return a; actually is at the end of the function – no more need for an early return – so it can be rewritten as

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if b.is_empty() {
        a
    } else {
        for elem in b {
            a.retain(|x| *x != elem)
        }
        a
    }
}

Next, if you notice that both cases of the if/else end with the same final expression, you can refactor the code as follows, moving that final expression after the if/else statement (and removing the code duplication of the “a”)

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if b.is_empty() {
    } else {
        for elem in b {
            a.retain(|x| *x != elem)
        }
    }
    a
}

Then the empty block after if can be avoided by inverting the condition

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if !b.is_empty() {
        for elem in b {
            a.retain(|x| *x != elem)
        }
    } else {
    }
    a
}

and then removing the redundant else {}

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if !b.is_empty() {
        for elem in b {
            a.retain(|x| *x != elem)
        }
    }
    a
}

Finally, if you realize that a for loop over an empty Vec doesn’t do anything, you can drop the b.is_empty() check altogether and just use

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    for elem in b {
        a.retain(|x| *x != elem)
    }
    a
}

Going further, your function doesn’t really need an owned copy of b, because it really only needs to read the values in b

use std::cmp::*;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: &Vec<T>) -> Vec<T> {
    for elem in b { // here, “elem” is now “&T”, because
                    // “b” is “&Vec<T>” and no longer “Vec<T>”
        a.retain(|x| *x != *elem)
    }
    a
}

and in a further step, it’s common practice to replace &Vec<T> arguments with &[T] in Rust, as well as to avoid use ………::*; imports. Also references implement PartialEq, so *x != *elem can become x != elem:

use std::cmp::PartialEq;

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: &[T]) -> Vec<T> {
    for elem in b {
        a.retain(|x| x != elem)
    }
    a
}
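For what it's worth, here is a sketch of how the slice-based version might be called. The functions from_vec and from_array are made up for illustration. The point is that both a borrowed Vec and a borrowed array can be passed where &[T] is expected:

```rust
fn array_diff<T: PartialEq>(mut a: Vec<T>, b: &[T]) -> Vec<T> {
    for elem in b {
        a.retain(|x| x != elem);
    }
    a
}

/// Hypothetical caller: `b` is a `Vec`, borrowed as a slice.
fn from_vec() -> Vec<i32> {
    let b = vec![2, 3];
    array_diff(vec![1, 2, 2, 3, 4], &b) // `&Vec<i32>` coerces to `&[i32]`
}

/// Hypothetical caller: `b` is a fixed-size array literal.
fn from_array() -> Vec<i32> {
    array_diff(vec![1, 2, 2, 3, 4], &[2]) // `&[i32; 1]` coerces to `&[i32]`
}
```

Taking &[T] rather than &Vec<T> is what allows the second call. The caller also keeps ownership of b and can reuse it afterwards.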

Edit2: Ah… @tczajka wrote some similar suggestions while I was editing my comment xD


You can refactor your code to get rid of the return keyword.

Step 1. Don't return early, use else instead:

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if b.is_empty() {
        a
    } else {
        for elem in b {
            a.retain(|x| *x != elem)
        }
        a
    }
}

Step 2. Notice that the same value is returned in both cases:

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    if !b.is_empty() {
        for elem in b {
            a.retain(|x| *x != elem)
        }
    }
    a
}

Step 3. The if is superfluous since if b is empty the loop does nothing anyway:

fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    for elem in b {
        a.retain(|x| *x != elem)
    }
    a
}

It has nothing to do with whether it is inside an if expression; it has to do with whether it is an early return. Early returns require an explicit return keyword because just putting the value there would mean that the if expression itself evaluates to that value. That doesn't make the function return, though; it's just a simple expression in the middle of a function.

The last expression in a function is special, though: its value becomes the return value of the function. Conceptually, it's not like it causes an explicit return, either; it's merely by the fact that it is the last expression that this makes sense (since the function necessarily returns after its last expression).
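A small sketch of that distinction, using a made-up function describe. The value of the if expression is simply used in the middle of the function, and only the last expression of the body becomes the return value:

```rust
/// Hypothetical example: an `if` whose value is used, not returned.
fn describe(n: i32) -> String {
    // Each branch's last expression is the value of the `if`; no `return` involved.
    let sign = if n < 0 { "negative" } else { "non-negative" };
    // Execution continues here; this tail expression is the function's value.
    format!("{} is {}", n, sign)
}
```

Writing "negative" without return inside the branch does not leave describe. It only decides what sign is bound to.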


Ohkay ! I forgot that we could use those non-semicolon value based statements inside an if block or match block as well ! Thank you !
