Const Function that returns array size at compile time

All,

I need to create a function that takes the length of an array (usize) and, if it is not evenly divisible by 16, rounds it up to the next multiple of 16. This requires being able to evaluate the size of an array at compile time and, more importantly, being able to use the output of this function (usize) as the size of a new array. In theory all of the information is available at compile time, so I would imagine there is a way to do this. Here is a basic example using Rust nightly 1.47:

#![feature(const_generics, const_fn)]

const fn return_size(input: usize) -> usize {
    // Round up only when input is not already a multiple of 16.
    if input % 16 != 0 {
        input + (16 - input % 16)
    } else {
        input
    }
}

fn main() {
    let this_array: [f32; 2] = [0.0, 0.0];
    //let resized_array: [f32; return_size(this_array.len())] = [fill with zeros];
}

I have not looked into this too much, but I believe the answer lies somewhere in the typenum crate, based on what I've seen various n-dimensional array type crates do.

It basically makes common integers into compile-time known objects that can be used and reasoned about accordingly.

As in nalgebra, making a 2x3 matrix with:
type Matrix2x3f = Matrix<f32, U2, U3, MatrixArray<f32, U2, U3>>;
Where U2 and U3 tell it the size (as unsigned ints), but in type form so it can know at compile time.

You can do this (in nightly or beta):

const fn return_size(input: usize) -> usize {
    input +  (16 - input % 16)
}

fn main() {
    let _this_array: [f32; 2] = [0.0, 0.0];
    const N : usize = return_size( 8 );
    let _resized_array: [f32; N ] = [0.0; N];
}

But I don't think you can use len(). Incidentally, your function adds 16 if the number is divisible by 16, contrary to your description.
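For reference, here is a minimal sketch of a rounding function that leaves multiples of 16 unchanged. The name round_up_to_16 and the constants N and PADDED are made up for illustration, and it assumes the input is small enough that adding 15 does not overflow usize.

```rust
/// Round `input` up to the next multiple of 16.
/// Values that are already a multiple of 16 are returned unchanged.
/// (Hypothetical name; assumes `input + 15` does not overflow.)
const fn round_up_to_16(input: usize) -> usize {
    // Adding 15 pushes any non-multiple past the next boundary;
    // integer division then truncates back down to that boundary.
    (input + 15) / 16 * 16
}

// Because the function is const, its result can be used as an array length.
const N: usize = round_up_to_16(2);
const PADDED: [f32; N] = [0.0; N];
```

With this version, 16 stays 16 and 17 becomes 32, which matches the description in the original post.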

Well, you can do this as well:

const fn return_size(input: usize) -> usize {
    input +  (16 - input % 16)
}

fn main() {
    const A: [f32; 2] = [0.0, 0.0];
    const N : usize = return_size( A.len() );
    let _resized_array: [f32; N ] = [0.0; N];
}

I'm currently looking into generic_array crate. I would prefer to solve this within std Rust. But I am not hard set against using a crate. Thanks for pointing me that way. But how could I do this using std Rust?

geebee22, thanks for replying so quickly! I had seen some solutions involving making the "input" array a const. But I'm not sure that will work for what I need. I need to be able to pass in any array into this function, and create a new array that is a multiple of 16 with all the values from the old, padded with zeroes if necessary.
I'm new to Rust, but these arrays will be constructed from input that is read from a file, so I'm not sure the "input" array passed to the return_size function can be const if it is to solve my issue. Is there a way in nightly to pass the length of any array to return_size() and use the output of that function in the length field of another array?

It sounds to me like you want a Vec not an array.
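A rough sketch of what the Vec approach could look like when the length is only known at run time. The function name pad_to_multiple_of_16 is hypothetical, and f32 is assumed as the element type because that is what the original example uses.

```rust
/// Copy `data` into a new Vec whose length is the next multiple of 16,
/// filling the tail with zeros. (Hypothetical helper name.)
fn pad_to_multiple_of_16(data: &[f32]) -> Vec<f32> {
    // Same rounding as the const fn, but evaluated at run time.
    let padded_len = (data.len() + 15) / 16 * 16;

    // Reserve the final size up front so only one allocation is needed.
    let mut padded = Vec::with_capacity(padded_len);
    padded.extend_from_slice(data);

    // resize() appends zeros until the Vec reaches the padded length.
    padded.resize(padded_len, 0.0);
    padded
}
```

Calling it with a 2-element slice gives a 16-element Vec holding the two original values followed by zeros.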

Sorry, you are correct. When I was typing up the example vscode told me this would be a compile error with having if statements in a const fn, so I simplified it as that would be a separate problem. However, I have corrected it and it still builds and produces correct output.

I am trying to refactor a C++ library I wrote and I have this ability in C++ with constexpr and templates. I was hoping to duplicate that functionality in Rust. I would prefer to keep the ability to use arrays, I use an array with the maximum size of data I support. If there is more data I inform the user that the file cannot be processed. That array with data is then passed to this return_size function and a new array is created that is a multiple of 16 with all the data from the original array. I could use a vec, but it would be nice not to "lose" anything going from C++ to Rust.
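One way the "maximum size array" idea might be kept on plain arrays is to round the maximum up at compile time and track the used length at run time. This is only a sketch: MAX_LEN = 100, PADDED_MAX, round_up_to_16 and load_padded are all names and values invented for the example, not anything from the C++ library.

```rust
/// Assumed upper limit on the amount of data supported (illustrative value).
const MAX_LEN: usize = 100;

/// Round up to the next multiple of 16 (multiples of 16 are unchanged).
const fn round_up_to_16(n: usize) -> usize {
    (n + 15) / 16 * 16
}

/// The buffer size is fixed at compile time from the maximum.
const PADDED_MAX: usize = round_up_to_16(MAX_LEN);

/// Copy `data` into a zero-filled, fixed-size buffer.
/// Returns the buffer and the padded length actually in use,
/// or None when the input is larger than MAX_LEN.
fn load_padded(data: &[f32]) -> Option<([f32; PADDED_MAX], usize)> {
    if data.len() > MAX_LEN {
        return None; // the caller can report that the file cannot be processed
    }
    let mut buffer = [0.0_f32; PADDED_MAX];
    buffer[..data.len()].copy_from_slice(data);
    Some((buffer, round_up_to_16(data.len())))
}
```

The padded data is then `&buffer[..used]`, where `used` is the second element of the returned tuple. The array type itself never depends on a run-time value, which is what makes this compile.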

Ok, this may not be quite right, but when you have some byte data to process, the function would typically take a parameter input: &[u8].

You can pass either an array or a Vec to a function declared like this. Still, I am doubtful that in a case like this you would want an array at all, better to stick to Vec mostly.
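To illustrate, a small sketch where the same function is called with an array and with a Vec. The checksum function is just a stand-in for real processing, and demo is a hypothetical wrapper.

```rust
/// Stand-in for real processing: sums all bytes of the input.
fn checksum(input: &[u8]) -> u32 {
    input.iter().map(|&b| u32::from(b)).sum()
}

/// Calls `checksum` once with an array and once with a Vec.
fn demo() -> (u32, u32) {
    let array: [u8; 4] = [1, 2, 3, 4];
    let vector: Vec<u8> = vec![1, 2, 3, 4];

    // Both `&[u8; 4]` and `&Vec<u8>` coerce to `&[u8]` at the call site.
    (checksum(&array), checksum(&vector))
}
```

The function never needs to know which kind of container the bytes came from.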

[ I am very much a Rust learner, so don't take what I am saying as gospel ]

Put me very much in the still learning Rust category, so take anything I say with a grain of salt...

But, one of my favorite things I've found while trying to better understand Rust, is that you can go into the source for any crate available in Rust. Even better, that means you could directly grab any functional bits of code you want to borrow, if you want to keep your crate standard/minimal.

As for the how... It uses hard-coded named traits and implements some methods for replicating basic math with them. I'm not sure if there's a smaller/simpler version that limits it to sizes/operations you'd care about. But it appears the underlying principle is applying traits of terminating chains of bits to represent each int... Nifty! And it looks like they used a script to generate it (see the typenum const.rs source).

Rust doesn't have stable const generics yet, and in the first version of const generics all arithmetic on array sizes will be forbidden, so you can't do this in Rust yet.

Rust macros are also somewhat similar to C++ templates, and you can generate some array-manipulating functions this way.
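As a rough sketch of the macro route: the macro below stamps out one padding function per input size, so the output length is a plain constant expression rather than something computed from a generic parameter. The names impl_pad, pad_2 and pad_20 are invented for this example, and it assumes you know the set of input sizes up front.

```rust
/// Generate a function `$name` that pads an `[f32; $from]` with zeros
/// up to the next multiple of 16. (Hypothetical macro for illustration.)
macro_rules! impl_pad {
    ($name:ident, $from:expr) => {
        fn $name(input: [f32; $from]) -> [f32; ($from + 15) / 16 * 16] {
            // Start from an all-zero array of the rounded-up size...
            let mut out = [0.0; ($from + 15) / 16 * 16];
            // ...and copy the original values into the front.
            out[..$from].copy_from_slice(&input);
            out
        }
    };
}

// One invocation per input size that needs to be supported.
impl_pad!(pad_2, 2);
impl_pad!(pad_20, 20);
```

Here pad_2 returns a 16-element array and pad_20 a 32-element one. The obvious downside is that every size needs its own invocation.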

But in general arrays are rarely used in Rust, and have poor language support. Prefer Vec in almost all cases.

+1 vote for recommending Vec from me. I recall being worried about stack vs. heap issues on some nested Vec iterations at one point. So I tested out several instances of trying to slow down Rust's basic Vec iterator trying to force it to access heap references and compared that with trying the same things on an explicitly stack-allocated, small vector (using arrayvec), and couldn't find any difference in performance whatsoever. EDIT: And now I'm thinking of new ways to mess with it, so I probably can't guarantee it's perfect. But it is GOOD!

I think I've heard that's part of why Rust doesn't worry about having tail call optimization, because it can usually iterate as fast as it or anything else can recurse... (If anyone knows more or otherwise, please let me know!)

Thank you all for replying to my question. For now I suppose I can use a vector for this. I will compare it to my C++ implementation using arrays and see if I can find any performance difference between the two. If I can get it up and running fast enough, I'll post performance benchmarks. Thanks again for the help.

Look forward to seeing the results of that.

So far all the C/C++ codes I have reimplemented in Rust have resulted in performance that matches the original. Although it may take some rearranging to do things "the Rust way".

You mention heap-allocated arrays vs. arrays on the stack. Which reminds me I have yet to reimplement one of my C++ exercises in Rust. It does big integer multiplication, millions of digits, using the Karatsuba algorithm. It is optimized by keeping small arrays of digits on the stack rather than big vectors on the heap. Karatsuba is a divide and conquer algorithm, so you end up dealing with a lot of small arrays at the bottom of the recursion.

It's kind of a useless thing to do as we have big number libraries to do all that, but it was part of a coding challenge where everybody was using different languages and use of non-standard libraries was not allowed.

I was wondering if I would need that small vector optimization if I did this in Rust. And if so how on Earth would I do it?

I need a Rust solution as an entry to that challenge!

The small vector optimization is implemented by the smallvec crate.

Definitely have to be careful when testing. If you're lucky, multiple vectors might land near each other in RAM and the cache plays a very important role, too.

The main problem with vectors is that they're bad if you have to create and destroy them in a loop. If you can clear and reuse the same vector instead, the OS will be happy, because there's a minimum amount of allocation and deallocation happening, and the CPU will be happy, because it always accesses the same memory region, i.e. the vector is likely still in L1 or L2 cache, at which point there is no difference between operating on an array or a vector.
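A minimal sketch of the clear-and-reuse pattern, assuming some per-chunk work that needs a temporary buffer. The function process_chunks and the doubling/summing work are made up for illustration.

```rust
/// For each chunk, double every value in a scratch buffer and record the sum.
/// The scratch Vec is created once and reused for every chunk.
fn process_chunks(chunks: &[&[f32]]) -> Vec<f32> {
    // Created once, outside the loop.
    let mut scratch: Vec<f32> = Vec::new();
    let mut sums = Vec::with_capacity(chunks.len());

    for chunk in chunks {
        // clear() drops the contents but keeps the allocated capacity,
        // so later iterations only allocate if a chunk is larger than
        // any seen before.
        scratch.clear();
        scratch.extend(chunk.iter().map(|x| x * 2.0));

        let total: f32 = scratch.iter().sum();
        sums.push(total);
    }
    sums
}
```

The alternative of writing `let scratch = Vec::new()` inside the loop would allocate and free a buffer on every iteration.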