It does seem like a reasonable RFE for ndarray to provide this -- it could easily preserve the data in case of a shape error.
OMG, I forgot "..." and only wanted to point OP that it's perhaps better to use Array::uninitialized(shape)
over mem::uninitialized
nothing more nothing less, and then continue with swap like done in the question ...
I precisely meant
fn inplace_reshape(array: &mut Array<f32, IxDyn>, shape: Vec<usize>) {
let mut ret = unsafe { Array::uninitialized(shape) };
mem::swap(&mut ret, array);
...
}
@cuviper complete solution is the way to go and @Yandros points are valid.
Also more about Array::uninitialized
:
it was defined like
pub struct OwnedRepr<A>(Vec<A>);
type Array<A, D> = ArrayBase<OwnedRepr<A>, D>;
impl<S, A, D> ArrayBase<S, D>
where S: DataOwned<Elem=A>,
D: Dimension,
{
pub unsafe fn uninitialized<Sh>(shape: Sh) -> Self
where A: Copy, // <--- important guarantee
Sh: ShapeBuilder<Dim=D>,
{ ...}
}
Ok I tried to make a benchmark to test the different ideas:
use ndarray::{Array, IxDyn};
fn reshape_inplace<T>(array: &mut Array<T, IxDyn>, shape: Vec<usize>) {
use std::mem;
unsafe {
let mut tmp = mem::uninitialized();
mem::swap(array, &mut tmp);
tmp = match tmp.into_shape(shape) {
Ok(tmp) => tmp,
Err(e) => {
mem::forget(mem::replace(array, Array::from_vec(vec![]).into_dyn()));
Err(e).unwrap()
}
};
mem::swap(array, &mut tmp);
mem::forget(tmp);
}
}
fn reshape_inplace2<T>(array: &mut Array<T, IxDyn>, shape: Vec<usize>)
where T: Copy {
use std::mem;
unsafe {
let mut tmp = Array::uninitialized(IxDyn(&[]));
mem::swap(array, &mut tmp);
tmp = match tmp.into_shape(shape) {
Ok(tmp) => tmp,
Err(e) => {
Err(e).unwrap()
}
};
mem::swap(array, &mut tmp);
}
}
fn reshape_inplace3<T>(array: &mut Array<T, IxDyn>, shape: Vec<usize>) {
use std::mem;
let mut tmp = Array::from_vec(vec![]).into_dyn();
mem::swap(array, &mut tmp);
tmp = match tmp.into_shape(shape.clone()) {
Ok(tmp) => tmp,
Err(e) => {
Err(e).unwrap()
}
};
mem::swap(array, &mut tmp);
}
#[cfg(test)]
mod bench {
extern crate test;
use test::Bencher;
use super::*;
fn array() -> Array<f32, IxDyn> {
Array::from_shape_vec(IxDyn(&[16, 16]), vec![2.5; 16 * 16]).unwrap()
}
#[bench]
fn bench_reshape_inplace(b: &mut Bencher) {
let mut x = array();
let mut y = false;
b.iter(|| {
let shape = if y { vec![16, 16] } else { vec![32, 8] };
y = !y;
reshape_inplace(&mut x, shape);
});
}
#[bench]
fn bench_reshape_inplace2(b: &mut Bencher) {
let mut x = array();
let mut y = false;
b.iter(|| {
let shape = if y { vec![16, 16] } else { vec![32, 8] };
y = !y;
reshape_inplace2(&mut x, shape);
});
}
#[bench]
fn bench_reshape_inplace3(b: &mut Bencher) {
let mut x = array();
let mut y = false;
b.iter(|| {
let shape = if y { vec![16, 16] } else { vec![32, 8] };
y = !y;
reshape_inplace3(&mut x, shape);
});
}
}
So it does not create new arrays every iteration just reshapes one.
Result:
Finished release [optimized] target(s) in 0.14s
Running target\release\deps\tensor-142d1c5c8a6e0b80.exe
running 3 tests
test backend::native::bench::bench_reshape_inplace ... bench: 149 ns/iter (+/- 28)
test backend::native::bench::bench_reshape_inplace2 ... bench: 246 ns/iter (+/- 107)
test backend::native::bench::bench_reshape_inplace3 ... bench: 277 ns/iter (+/- 106)
test result: ok. 0 passed; 0 failed; 0 ignored; 3 measured; 0 filtered out
Process finished with exit code 0
The first one seems to be always the fastest. Am I missing something?
(And how do I style code as rust code by the way?)
It doesn't necessarily mean the first one is the fastest. All three 95% confidence intervals have overlaps. Criterion gives you p-values and statistical significance, which are better measures though not perfect!
Also 300 ns is very fast, I don't think it would make a noticeable difference to have one over the other unless reshaping is a large part of what you app is doing.
This forum defaults to rust for ```
blocks, but I think indented blocks are treated as plain text. You can also specify the syntax style manually, like this:
```rust
// code
```
I took a shot at rewriting your benchmark with criterion, adding my own tweaks. I changed the mem::uninitialized()
version to use ptr::read
and write
instead, because there's not much difference in safety with that. The other two I just slimmed down to hopefully reduce copies.
use criterion::{Bencher, Criterion};
use ndarray::{Array, ArrayD, Ix, IxDyn, ShapeError};
use std::{mem, ptr};
type ShapeResult = Result<(), ShapeError>;
/// On error, `array` is empty.
/// On panic, accessing `array` is UB!
fn reshape_inplace<T>(array: &mut ArrayD<T>, shape: &[Ix]) -> ShapeResult {
unsafe {
match ptr::read(array).into_shape(shape) {
Ok(tmp) => {
ptr::write(array, tmp);
Ok(())
}
Err(e) => {
ptr::write(array, Array::from_vec(vec![]).into_dyn());
Err(e)
}
}
}
}
/// On error or panic, `array` is `Array::uninitialized` -- UB if accessed.
fn reshape_inplace2<T: Copy>(array: &mut ArrayD<T>, shape: &[Ix]) -> ShapeResult {
let tmp = unsafe { Array::uninitialized(IxDyn(&[])) };
*array = mem::replace(array, tmp).into_shape(shape)?;
Ok(())
}
/// On error or panic, `array` is empty.
fn reshape_inplace3<T>(array: &mut ArrayD<T>, shape: &[Ix]) -> ShapeResult {
let tmp = Array::from_vec(vec![]).into_dyn();
*array = mem::replace(array, tmp).into_shape(shape)?;
Ok(())
}
fn bench(b: &mut Bencher, f: fn(&mut ArrayD<f32>, &[Ix]) -> ShapeResult) {
let mut shape: &[Ix] = &[16, 16];
let mut shape2: &[Ix] = &[32, 8];
let mut x = Array::from_shape_vec(shape, vec![2.5; 16 * 16]).unwrap();
b.iter(|| {
mem::swap(&mut shape, &mut shape2);
f(&mut x, shape).unwrap();
});
}
fn main() {
let mut c = Criterion::default().configure_from_args();
c.bench_function("reshape_inplace", |b| bench(b, reshape_inplace));
c.bench_function("reshape_inplace2", |b| bench(b, reshape_inplace2));
c.bench_function("reshape_inplace3", |b| bench(b, reshape_inplace3));
c.final_summary();
}
On my i7-3770K on Fedora 29, rustc 1.34.0-nightly (7e001e5c6 2019-02-27)
, I get:
reshape_inplace time: [81.904 ns 82.311 ns 82.718 ns]
Found 16 outliers among 100 measurements (16.00%)
11 (11.00%) high mild
5 (5.00%) high severe
reshape_inplace2 time: [135.06 ns 135.56 ns 136.22 ns]
Found 13 outliers among 100 measurements (13.00%)
10 (10.00%) high mild
3 (3.00%) high severe
reshape_inplace3 time: [111.10 ns 111.49 ns 112.00 ns]
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
Awesome! could you show the p-values as well? also curious to know about Array::uninitialized(shape)
.
@cuviper could the exception (un)safety of the first function be fixed with a scope guard on panic doing what the err body does? If that is the case, could you benchmark the code with the added scope guard?
AIUI p-values only show up in comparison -- are you asking compared to @jan17's code?
Slows that one down about 14%. This is curious though, because I was actually creating an empty Array::uninitialized
, so there's no invalid data you could access for UB. I wonder why it's slower than the Array::from_vec(vec![])
?
Ok, I rewrote it like so:
/// On error or panic, `array` is empty.
fn reshape_inplace<T>(array: &mut ArrayD<T>, shape: &[Ix]) -> ShapeResult {
unsafe {
let array = array as *mut _;
let guard = scopeguard::guard(array, |array| {
ptr::write(array, Array::from_vec(vec![]).into_dyn());
});
ptr::write(array, ptr::read(array).into_shape(shape)?);
mem::forget(guard); // don't clobber on success
Ok(())
}
}
No significant change to its performance either, so that's nice.
I'm setting this aside now, so I'll invite others to experiment further.
Well, I think criterion has changed it. It used to specify them explicitly. I'll check to see what kind of statistics we can extract.
Only if run it more that once will show the p-values wrt individual previous results.