String conversion to [i32; 200]

I need to convert a large number of fixed-format strings to [i32; 200], and this is my current method:

// st = "12|3232|323...(*200)" 
let mut v: [i32; 200] = [0; 200];
let mut i = 0;

for n in st.split('|')
{
	v[i] = n.parse().unwrap();
	i += 1;
}

Is there a conversion method with better performance? Thanks.

You might want to look into rust-lexical (Alexhuszagh/rust-lexical on GitHub), which provides fast numeric to- and from-string conversion routines.

Also, it's probably better to use iterator methods:

let v: Result<Vec<i32>, _> = st.split('|').map(|n| n.parse()).collect();

That would allocate a new vector on each iteration. If the number of integers on each line is fixed (or at least bounded) and the number of lines is large, that would be a significant performance issue.

I suggest using the atoi crate for integer parsing, provided you don't need the more complex formatting features, such as digit separators, and you're sure that the file adheres to the specified format. In my local benchmarks it seems to be significantly faster than lexical, although lexical's own benchmarks are written in a way that makes the two somewhat harder to compare.

How so? It would create a vector for each st given, but this is just what the example did, too, with an array instead of a vector (I figured that's by design, so I kept that).

No, the FromIterator impl for Result<C: FromIterator> creates a single collection and adds all Ok values to it.
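To illustrate the point about a single collection, here is a minimal sketch using only the standard library. The function name parse_line is made up for this example; it is not from the original post.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse one '|'-separated line into a single Vec.
fn parse_line(st: &str) -> Result<Vec<i32>, ParseIntError> {
    // collect() into Result<Vec<_>, _> fills one Vec with the Ok values
    // and stops at the first Err, returning that error instead.
    st.split('|').map(|n| n.parse::<i32>()).collect()
}
```

Calling parse_line("12|3232|323") should give Ok with the three numbers in one vector, while a malformed field such as "12|oops|323" gives an Err rather than a partial result.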

Arrays are cheap. That's just 800 bytes on the stack, and it requires basically no extra work to deal with. There could be a memcpy after the array is parsed, but that depends on the surrounding code and can be avoided in many cases. Even if it happens, it's still much faster than memory allocation and deallocation. Copying 800 bytes will take on the order of 100 ns (probably less), while calling an allocator generally takes 1-10 µs (with potentially unbounded call time, which can happen in case of excessive memory fragmentation).
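For reference, a stack-only version without unwrap could look roughly like this. The name parse_fixed and the const generic N are assumptions made for this sketch (the original code hard-codes 200), and returning None on a wrong field count is one possible policy, not the only one.

```rust
// Hypothetical helper: parse exactly N '|'-separated integers into a stack array.
// Returns None if a field fails to parse or the number of fields is not N.
fn parse_fixed<const N: usize>(st: &str) -> Option<[i32; N]> {
    let mut out = [0i32; N];
    let mut fields = st.split('|');
    for slot in out.iter_mut() {
        // Too few fields or a malformed field both end with None.
        *slot = fields.next()?.parse().ok()?;
    }
    // Reject trailing extra fields instead of silently ignoring them.
    if fields.next().is_some() {
        return None;
    }
    Some(out)
}
```

For the case in the question this would be called as parse_fixed::<200>(st). No heap allocation is involved; whether the returned array gets copied depends on the caller and the optimizer.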

If moving [i32; 200] is an issue, one can reuse the same vector between the iterations, clearing the contents before parsing each line.
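A rough sketch of that buffer reuse, assuming the lines are already available as string slices. Both parse_into and sum_all are hypothetical names, and summing the values is only a stand-in for whatever the real per-line work is.

```rust
use std::num::ParseIntError;

// Hypothetical helper: refill a caller-owned buffer from one line.
fn parse_into(st: &str, buf: &mut Vec<i32>) -> Result<(), ParseIntError> {
    buf.clear(); // drops the old values but keeps the allocated capacity
    for n in st.split('|') {
        buf.push(n.parse()?);
    }
    Ok(())
}

// Hypothetical driver: process many lines with a single buffer.
fn sum_all(lines: &[&str]) -> Result<i64, ParseIntError> {
    // Allocate once up front; 200 matches the line width in the question.
    let mut buf = Vec::with_capacity(200);
    let mut total = 0i64;
    for line in lines {
        parse_into(line, &mut buf)?;
        total += buf.iter().map(|&x| i64::from(x)).sum::<i64>();
    }
    Ok(total)
}
```

With this shape the allocator is called once for the whole input, as long as no line has more than 200 fields.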

On the other hand, the compiler often optimizes iterator code better than manual iteration. I'd benchmark if performance is a concern. Maybe using take(200) will get the code to pre-allocate properly, and then the allocation might be amortized. Pretty much all of this depends on the surrounding use case, of course.
