p1: *mut u8; p2: *mut u8;
Both p1 and p2 are aligned to 4096-byte boundaries.
Now we want to copy 4 KiB from p2[0 .. 4096] to p1[0 .. 4096].
What is the absolute fastest way to do this?
We can assume this is x86_64 Linux.
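For reference, the obvious baseline I would compare against is a single `std::ptr::copy_nonoverlapping` call, sketched below. The names `copy_page` and `PAGE_SIZE` are just mine for illustration, and I am assuming the two pages never overlap. I do not know whether this is already the fastest option.

```rust
use std::ptr;

/// Page size assumed in the question (4 KiB).
const PAGE_SIZE: usize = 4096;

/// Copies one 4 KiB page from `src` to `dst` (hypothetical helper).
///
/// # Safety
/// Both pointers must be valid for `PAGE_SIZE` bytes, and the two
/// regions must not overlap.
unsafe fn copy_page(dst: *mut u8, src: *const u8) {
    // Sanity-check the alignment stated in the question (debug builds only).
    debug_assert_eq!(dst as usize % PAGE_SIZE, 0);
    debug_assert_eq!(src as usize % PAGE_SIZE, 0);
    // Note the argument order: source first, then destination.
    unsafe { ptr::copy_nonoverlapping(src, dst, PAGE_SIZE) };
}
```

With the pointers from above this would be called as `unsafe { copy_page(p1, p2) }`.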
EDIT: As it turns out, the 'real' problem is that we have two lists of page pointers:
p1_0, p1_1, p1_2, ..., p1_{n-1}
p2_0, p2_1, p2_2, ..., p2_{n-1}
and we want to copy 4 KiB from p2_i to p1_i for all 0 <= i < n.
Basically, I want the fastest way to copy a bunch of (not necessarily contiguous) 4 KiB pages.
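To make the batched version concrete, here is a minimal sketch of the naive loop, one `copy_nonoverlapping` per page. The function name `copy_pages` and the slice-of-pointers signature are my own assumptions about how the pointers are stored, and I am assuming no destination page overlaps any source page. This is the baseline I would like to beat, not a claim that it is optimal.

```rust
use std::ptr;

/// Page size assumed in the question (4 KiB).
const PAGE_SIZE: usize = 4096;

/// Copies the page at `srcs[i]` into the page at `dsts[i]` for every `i`
/// (hypothetical helper).
///
/// # Safety
/// Every pointer must be valid for `PAGE_SIZE` bytes, and no destination
/// page may overlap any source page or any other destination page.
unsafe fn copy_pages(dsts: &[*mut u8], srcs: &[*const u8]) {
    // Both lists must describe the same number of pages.
    assert_eq!(dsts.len(), srcs.len());
    for (&dst, &src) in dsts.iter().zip(srcs) {
        // One fixed-size 4 KiB copy per pair; the pages need not be adjacent.
        unsafe { ptr::copy_nonoverlapping(src, dst, PAGE_SIZE) };
    }
}
```

Because the length is a compile-time constant, the compiler is free to pick whatever copy sequence it likes for each page; whether that beats a hand-written alternative is exactly what I am asking.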