Hello,

I am trying to find a way to read all four f64 values out of a single __m256d.

How can I do this? Even better, is there a way to accumulate all four values into a single f64?

I am trying to compute the sum of products (the dot product) of two vectors, i.e.:

```
let mut result = 0.0;
for i in 0..1000 {
    result += vec1[i] * vec2[i];
}
```

I want to speed this up using something like this:

```
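use std::arch::x86_64::*;

// These intrinsics need the avx and fma target features (and unsafe on older compilers).
let a = _mm256_set_pd(1.0, 2.0, 3.0, 4.0);
let b = _mm256_set_pd(1.0, 2.0, 3.0, 4.0);
let mut acc = _mm256_set_pd(1.0, 2.0, 3.0, 4.0);
// Intended per-lane effect: acc = a * b + acc.
acc = _mm256_fmadd_pd(a, b, acc);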
```

I eventually need the values in acc as plain f64s. How do I get them out?
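From reading the std::arch docs, my best guess is that _mm256_storeu_pd can copy the four lanes into a plain array, and that the lanes can be added together with a few 128-bit intrinsics. Below is a minimal sketch of that idea, not a confirmed answer. The names lanes_and_sum and lanes_and_sum_avx are hypothetical helpers I made up, and I am assuming an x86_64 target with runtime AVX detection and a scalar fallback.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: builds a vector with _mm256_set_pd, then returns its
/// lanes (lowest lane first) together with the sum of all four lanes.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn lanes_and_sum_avx(e3: f64, e2: f64, e1: f64, e0: f64) -> ([f64; 4], f64) {
    let v = _mm256_set_pd(e3, e2, e1, e0);

    // Copy the four lanes to memory; storeu has no alignment requirement.
    let mut lanes = [0.0f64; 4];
    unsafe { _mm256_storeu_pd(lanes.as_mut_ptr(), v) };

    // Horizontal sum without going through memory.
    let lo = _mm256_castpd256_pd128(v); // lanes 0 and 1
    let hi = _mm256_extractf128_pd::<1>(v); // lanes 2 and 3
    let pair = _mm_add_pd(lo, hi); // (l0 + l2, l1 + l3)
    let high = _mm_unpackhi_pd(pair, pair); // (l1 + l3, l1 + l3)
    let total = _mm_cvtsd_f64(_mm_add_sd(pair, high));

    (lanes, total)
}

/// Safe wrapper (assumption: runtime detection, scalar fallback otherwise).
pub fn lanes_and_sum(e3: f64, e2: f64, e1: f64, e0: f64) -> ([f64; 4], f64) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was checked at runtime just above.
            return unsafe { lanes_and_sum_avx(e3, e2, e1, e0) };
        }
    }
    // Fallback mirrors the lane order: the last argument is lane 0.
    ([e0, e1, e2, e3], (e0 + e2) + (e1 + e3))
}
```

If this is right, lanes_and_sum(1.0, 2.0, 3.0, 4.0) should hand back the lanes in the order 4.0, 3.0, 2.0, 1.0, because _mm256_set_pd takes the highest lane as its first argument.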

Also, until I can read those values back, I can't tell whether acc is having its previous values overwritten or whether it is doing the result += vec1[i] * vec2[i] step that I'm assuming it does.
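My understanding of the intrinsic's documentation is that _mm256_fmadd_pd(a, b, c) computes a * b + c in each lane, so passing acc as the third argument and assigning the result back to acc should accumulate rather than overwrite. Here is a small sketch to check that, assuming an x86_64 target with AVX and FMA detected at runtime. The function names fmadd_lanes and fmadd_lanes_avx are hypothetical, and I load from arrays with _mm256_loadu_pd instead of _mm256_set_pd so that the lane order matches the array order.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: one fused multiply-add step on four lanes.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx,fma")]
unsafe fn fmadd_lanes_avx(a: &[f64; 4], b: &[f64; 4], acc: &[f64; 4]) -> [f64; 4] {
    let mut out = [0.0f64; 4];
    // SAFETY: every pointer refers to exactly four valid f64 values.
    unsafe {
        let va = _mm256_loadu_pd(a.as_ptr());
        let vb = _mm256_loadu_pd(b.as_ptr());
        let vacc = _mm256_loadu_pd(acc.as_ptr());
        // Per lane: result = va * vb + vacc.
        let r = _mm256_fmadd_pd(va, vb, vacc);
        _mm256_storeu_pd(out.as_mut_ptr(), r);
    }
    out
}

/// Safe wrapper (assumption: runtime detection, scalar fallback otherwise).
pub fn fmadd_lanes(a: [f64; 4], b: [f64; 4], acc: [f64; 4]) -> [f64; 4] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") && is_x86_feature_detected!("fma") {
            // SAFETY: AVX and FMA support were checked at runtime just above.
            return unsafe { fmadd_lanes_avx(&a, &b, &acc) };
        }
    }
    // Scalar fallback using the same fused operation per element.
    let mut out = [0.0f64; 4];
    for i in 0..4 {
        out[i] = a[i].mul_add(b[i], acc[i]);
    }
    out
}
```

With all three inputs set to [1.0, 2.0, 3.0, 4.0], I would expect [2.0, 6.0, 12.0, 20.0] if the previous acc values are kept and added to, and [1.0, 4.0, 9.0, 16.0] if they were being overwritten.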

I will likely have follow-up questions about how to load f64s from a Vec into these vectors quickly, without the loads themselves slowing things down too much.
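For the loading part, the direction I am considering is reading four elements at a time straight from the slices with _mm256_loadu_pd (the unaligned load) and handling any leftover elements with a scalar loop. This is only a sketch under my own assumptions: dot and dot_avx_fma are hypothetical names, the target is x86_64 with AVX and FMA detected at runtime, and if the slices differ in length only the common prefix is used. Note that the lanes are summed in a different order than the scalar loop, so the result can differ slightly in the last bits for general inputs.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical AVX/FMA dot product over the common prefix of `x` and `y`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx,fma")]
unsafe fn dot_avx_fma(x: &[f64], y: &[f64]) -> f64 {
    let n = x.len().min(y.len());
    let mut acc = _mm256_setzero_pd(); // start from zero, not from arbitrary values
    let mut i = 0;
    while i + 4 <= n {
        // SAFETY: i + 4 <= n, so both 4-element reads stay in bounds.
        let (a, b) = unsafe {
            (
                _mm256_loadu_pd(x.as_ptr().add(i)),
                _mm256_loadu_pd(y.as_ptr().add(i)),
            )
        };
        acc = _mm256_fmadd_pd(a, b, acc); // acc += a * b, per lane
        i += 4;
    }

    // Reduce the four partial sums to one f64.
    let mut lanes = [0.0f64; 4];
    unsafe { _mm256_storeu_pd(lanes.as_mut_ptr(), acc) };
    let mut sum = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);

    // Scalar tail for the last n % 4 elements.
    while i < n {
        sum += x[i] * y[i];
        i += 1;
    }
    sum
}

/// Safe entry point (assumption: runtime detection, scalar fallback otherwise).
pub fn dot(x: &[f64], y: &[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") && is_x86_feature_detected!("fma") {
            // SAFETY: AVX and FMA support were checked at runtime just above.
            return unsafe { dot_avx_fma(x, y) };
        }
    }
    x.iter().zip(y).map(|(a, b)| a * b).sum()
}
```

Usage would then be something like dot(&vec1, &vec2) on the two Vec<f64> values, since a &Vec<f64> coerces to &[f64].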

Thanks!