Function input validation

Hi, I'm in the process of learning Rust, so please be patient. :)

I have a function that takes a vector of vectors.

For the input to be valid:

  1. there must be at least 2 inner vectors;
  2. none of them may be empty;
  3. they must all have the same length.

Of course, I can check this in the body of the function and handle the errors there.

But is there any other way to do this in Rust?

Any help is appreciated.

This is tricky because the number of rows and the length of each row in a Vec<Vec<T>> are only known at runtime, so the type system can't enforce these constraints for you.

If it's a blatant programming error (e.g. you document that one of your function's invariants is that it gets an even number of vectors) then it's perfectly fine to do an assertion and blow up loudly so the caller knows they need to fix their code.
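A minimal sketch of the assertion approach, assuming the inner vectors hold i32 values. The function name column_sums and what it computes are made up for illustration; only the three checks come from the question.

```rust
/// Sums each column of `rows` (hypothetical example function).
///
/// # Panics
///
/// Panics if there are fewer than 2 rows, if the rows are empty,
/// or if the rows do not all have the same length.
pub fn column_sums(rows: &[Vec<i32>]) -> Vec<i32> {
    assert!(rows.len() >= 2, "expected at least 2 rows, got {}", rows.len());
    let width = rows[0].len();
    assert!(width > 0, "rows must not be empty");
    assert!(
        rows.iter().all(|row| row.len() == width),
        "all rows must have the same length"
    );

    // From here on the preconditions hold, so plain indexing and zipping are safe.
    let mut sums = vec![0; width];
    for row in rows {
        for (sum, value) in sums.iter_mut().zip(row) {
            *sum += *value;
        }
    }
    sums
}
```

Documenting the panics in the doc comment tells callers that violating these conditions is a bug on their side.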

Otherwise if these sorts of errors are expected (e.g. you're handling user input) then the typical solution is to return a Result<T, Error> and handle the error like you would any other.
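As a rough sketch of the Result approach, the checks could be reported through a dedicated error type. The names InputError and validate are hypothetical, and i32 elements are assumed.

```rust
use std::fmt;

/// Reasons the input can be rejected (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum InputError {
    TooFewRows { found: usize },
    EmptyRow { index: usize },
    LengthMismatch { index: usize, expected: usize, found: usize },
}

impl fmt::Display for InputError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InputError::TooFewRows { found } => {
                write!(f, "expected at least 2 rows, found {}", found)
            }
            InputError::EmptyRow { index } => write!(f, "row {} is empty", index),
            InputError::LengthMismatch { index, expected, found } => write!(
                f,
                "row {} has length {}, expected {}",
                index, found, expected
            ),
        }
    }
}

impl std::error::Error for InputError {}

/// Checks all three conditions and returns the common row length on success.
pub fn validate(rows: &[Vec<i32>]) -> Result<usize, InputError> {
    if rows.len() < 2 {
        return Err(InputError::TooFewRows { found: rows.len() });
    }
    let expected = rows[0].len();
    for (index, row) in rows.iter().enumerate() {
        if row.is_empty() {
            return Err(InputError::EmptyRow { index });
        }
        if row.len() != expected {
            return Err(InputError::LengthMismatch { index, expected, found: row.len() });
        }
    }
    Ok(expected)
}
```

The caller can then use `?` or a `match` to decide what to do with bad input instead of crashing.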

Some other thoughts...

Instead of using a jagged array, can you store everything in one long Vec<T> and use index math to switch between rows? That way you'd be able to guarantee each row has the same length.
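To illustrate the index math, here is a small sketch that assumes a row-major layout and i32 elements. The helper names cell and rows are invented for this example.

```rust
use std::slice::ChunksExact;

/// Returns the element at (`row`, `column`) of a row-major flat buffer
/// in which every row holds `width` elements.
pub fn cell(cells: &[i32], width: usize, row: usize, column: usize) -> i32 {
    assert!(column < width, "column out of bounds");
    // Skip `row` complete rows, then move `column` elements into the row.
    cells[row * width + column]
}

/// Iterates over the rows of the flat buffer as slices of `width` elements.
/// Note that `chunks_exact` panics if `width` is 0.
pub fn rows(cells: &[i32], width: usize) -> ChunksExact<'_, i32> {
    cells.chunks_exact(width)
}
```

Because every row is a slice of the same buffer with the same width, rows of different lengths cannot be represented at all.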

Depending on how much code uses this vector of vectors, you may also want to pull it out into its own type. That way the inputs can be checked once during construction and everything can assume it's always valid. That's typically how you'd implement something like a ChessBoard.

You can create an intermediate type which represents the unvalidated form of the type you ultimately want to construct. Never construct the final type directly; always go through the validation step.

edit: see subsequent post for example code

First of all thanks for the kind reply.

Of course, I could store everything in a single array. That would be very efficient for me and for the function itself :) but it would feel strange or hard to understand for the end user of the function (the developer who calls it).

Your final thought is maybe what I was looking for, but I don't know how to do it: that is, build a new type wrapping the outer vector, so that its inner vectors are i32 vectors, there are at least 2 of them, they all have the same size, etc. Then I would simply pass a single argument of this custom type to the function. But at the moment I don't know where to start :)

Here's an example you might use when creating a 2D matrix type. It's probably not exactly what you need for your function, but it may give you an idea as to how it can be implemented.

Full example source code
use std::{
    error::Error,
    fmt::{self, Display, Formatter},
};

/// A 2D matrix type.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix {
    cells: Vec<f32>,
    width: usize,
    height: usize,
}

impl Matrix {
    /// Create a `Matrix` full of zeroes and with fixed dimensions.
    pub fn empty(width: usize, height: usize) -> Self {
        Matrix {
            cells: vec![0.0; width * height],
            width,
            height,
        }
    }

    /// Create a `Matrix` populated with some existing values.
    pub fn with_values(
        width: usize,
        height: usize,
        values: Vec<f32>,
    ) -> Result<Self, BadDimensions> {
        if values.len() == width * height {
            Ok(Matrix {
                cells: values,
                width,
                height,
            })
        } else {
            Err(BadDimensions {
                width,
                height,
                values_provided: values.len(),
            })
        }
    }

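    /// Get the value at the given row and column.
    ///
    /// # Panics
    ///
    /// Panics if `row` or `column` is out of bounds.
    pub fn get(&self, row: usize, column: usize) -> f32 {
        let index = self.index_for(row, column);
        self.cells[index]
    }

    /// Convert a (row, column) pair into an index into `cells`.
    fn index_for(&self, row: usize, column: usize) -> usize {
        assert!(row < self.height && column < self.width, "index out of bounds");
        // Row-major layout: each row holds `width` cells, so skip `row` full rows.
        row * self.width + column
    }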
}

/// The error type returned when the number of values and the dimensions provided
/// to [`Matrix::with_values()`] don't match.
#[derive(Debug)]
pub struct BadDimensions {
    width: usize,
    height: usize,
    values_provided: usize,
}

impl Display for BadDimensions {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Cannot create a {}x{} matrix from {} values",
            self.width, self.height, self.values_provided
        )
    }
}

impl Error for BadDimensions {}

Just a note: for Vec<_> I get an error (the _ placeholder is not allowed in type signatures).

You can use it inside functions:

// OK!
let a: Vec<_> = iter.collect();

You can't use it in function signatures or in the types of global items:

// Not OK!
fn foo_bar() -> Vec<_> {
    ...
}

The _ in a type name asks the compiler to fill in the correct type using type inference. Function signatures deliberately don't participate in type inference, for many good reasons (if they did, you could no longer just look at a function's signature to know what it accepts and returns, it would promote non-local reasoning, etc.).
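Put together, the usual pattern is to spell out the types in the signature and let inference work inside the body. A tiny sketch, with squares as a made-up function:

```rust
/// The signature names concrete types, so callers can read it on its own.
pub fn squares(values: &[i32]) -> Vec<i32> {
    // Inside the body `_` is fine: the element type is inferred as i32.
    let result: Vec<_> = values.iter().map(|v| v * v).collect();
    result
}
```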

It's not bad, it's unreadable.

Ah, yes, I should have called that out. I was providing a sketch where one could fill in any type.

This is probably a better example overall of what I was talking about: Rust Playground

It's essentially the same idea as @Michael-F-Bryan posted. One difference is that I'm specifically emphasizing the use of a wrapper type ValidatedVec, which exists only as a way of documenting and proving that a vector is validated when you use it.

That means any time you have a function which accepts a ValidatedVec as an argument, you know that it has the properties you defined without having to perform assertions again. If you have one function at the top of a call tree which validates the vector once, the functions down the tree don't have to check for themselves.

The type system here isn't guaranteeing that you can only construct a ValidatedVec using this function I wrote out, so it's not actually guaranteed to be correct as I've written it. But it does lessen the error surface a bit. It can increase your confidence in the code enough to not validate the input vector in every single function which receives it.
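One way to close that gap, sketched here under some assumptions, is to keep the wrapper's field private inside a module so that code outside the module has to go through the validating constructor. This is not the code from the playground link; the module name, the InvalidInput type, the i32 element type, and first_column are all invented for illustration.

```rust
mod validated {
    /// Nested vectors that have passed validation. The field is private,
    /// so code outside this module can only obtain a value through `new`.
    #[derive(Debug, Clone, PartialEq)]
    pub struct ValidatedVec {
        rows: Vec<Vec<i32>>,
    }

    /// Hypothetical error type carrying a short description of the problem.
    #[derive(Debug, PartialEq)]
    pub struct InvalidInput(pub &'static str);

    impl ValidatedVec {
        /// Validates `rows` once and wraps them on success.
        pub fn new(rows: Vec<Vec<i32>>) -> Result<Self, InvalidInput> {
            if rows.len() < 2 {
                return Err(InvalidInput("need at least 2 vectors"));
            }
            let width = rows[0].len();
            if width == 0 {
                return Err(InvalidInput("vectors must not be empty"));
            }
            if rows.iter().any(|row| row.len() != width) {
                return Err(InvalidInput("vectors must all have the same length"));
            }
            Ok(ValidatedVec { rows })
        }

        /// Read-only access, so the invariants cannot be broken afterwards.
        pub fn rows(&self) -> &[Vec<i32>] {
            &self.rows
        }

        /// The common length of every inner vector.
        pub fn width(&self) -> usize {
            self.rows[0].len()
        }
    }
}

/// A function further down the call tree: it relies on the invariants
/// and does not repeat the checks.
pub fn first_column(input: &validated::ValidatedVec) -> Vec<i32> {
    input.rows().iter().map(|row| row[0]).collect()
}
```

Code inside the validated module could still build the struct directly, so the guarantee is only as strong as that module is small.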

Use cases may vary, but I like this pattern a lot.
