Linux x86_64: absolute fastest way to copy 4kb of data

We have:

p1: *mut u8;
p2: *mut u8;

p1, p2 are both aligned at multiples of 4096 byte boundaries.

Now, we want to copy 4kb from p2[0 .. 4096] to p1[0 .. 4096].

What is the absolute fastest way to do this?

We can assume this is x86_64 linux.

EDIT: As it turns out, the 'real' problem is that we have:

p1_0, p1_1, p1_2, ..., p1_{n-1}
p2_0, p2_1, p2_2, ..., p2_{n-1}

and we want to copy p1_i <- 4kb from p2_i for all 0 <= i < n

Basically, I want the fastest way to copy a bunch of (not necessarily contiguous) 4kb pages.

Probably just memcpy (std::ptr::copy_nonoverlapping in Rust) plus some rayon would do the trick.
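For the batch case, a rough sketch of what that could look like. I'm using std::thread::scope as a stand-in for rayon so the example has no dependencies. Page, PAGE_SIZE and copy_pages are names I made up for illustration, and the pages are passed as references rather than raw pointers so the whole thing stays safe code.

```rust
use std::thread;

/// 4 KiB, the default page size on x86_64 Linux.
pub const PAGE_SIZE: usize = 4096;

/// Hypothetical page type; the alignment attribute gives 4096-byte-aligned storage.
#[repr(C, align(4096))]
pub struct Page(pub [u8; PAGE_SIZE]);

/// Copies srcs[i] into dsts[i] for every i, splitting the pairs across
/// up to `threads` scoped worker threads.
pub fn copy_pages(dsts: &mut [&mut Page], srcs: &[&Page], threads: usize) {
    assert_eq!(dsts.len(), srcs.len());
    if dsts.is_empty() {
        return;
    }
    let threads = threads.max(1);
    // Ceiling division so that every page pair lands in some chunk.
    let chunk = (dsts.len() + threads - 1) / threads;
    thread::scope(|s| {
        for (d, sr) in dsts.chunks_mut(chunk).zip(srcs.chunks(chunk)) {
            s.spawn(move || {
                for (dst, src) in d.iter_mut().zip(sr.iter()) {
                    // Fixed-size slice copy; this is a plain memcpy of 4096 bytes.
                    dst.0.copy_from_slice(&src.0);
                }
            });
        }
    });
}
```

Whether the threads help at all depends on n. Spawning a thread likely costs far more than copying a single page, so for a handful of pages a plain sequential loop is probably the better choice. Measure before committing to either.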

As has already been mentioned, memcpy() is normally the fastest method.
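In Rust the memcpy equivalent is std::ptr::copy_nonoverlapping. A minimal sketch for a single page, assuming the two pages never overlap (copy_page, Page and PAGE_SIZE are illustrative names, not anything from the original question):

```rust
use std::ptr;

/// 4 KiB, the default page size on x86_64 Linux.
pub const PAGE_SIZE: usize = 4096;

/// Hypothetical helper type, used only to obtain 4096-byte-aligned storage.
#[repr(C, align(4096))]
pub struct Page(pub [u8; PAGE_SIZE]);

/// Copies one 4 KiB page from `src` to `dst`.
///
/// # Safety
/// `src` must be valid for reads of PAGE_SIZE bytes, `dst` must be valid for
/// writes of PAGE_SIZE bytes, and the two regions must not overlap.
pub unsafe fn copy_page(dst: *mut u8, src: *const u8) {
    // Note the argument order: source first, destination second
    // (the reverse of C's memcpy). The length is a compile-time constant.
    unsafe { ptr::copy_nonoverlapping(src, dst, PAGE_SIZE) };
}
```

Because the length is a constant, the compiler is free to inline the copy or call the libc memcpy, whichever it judges better for the target.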

If you have control over one of the pointers, you could always play tricks with virtual memory and your address space to map p1 and p2 to the same physical memory. That way a write through either pointer is immediately visible through the other, and no copy is needed at all.
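A rough sketch of that aliasing trick, under a few assumptions of mine: the page is backed by a temporary file (memfd_create or shm_open would be other options), mmap and munmap are declared by hand instead of pulling in the libc crate, and the constants are the Linux x86_64 values. map_page_twice and unmap_page are made-up names.

```rust
use std::ffi::c_void;
use std::fs::OpenOptions;
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::path::Path;

// Hand-written declarations of the libc functions (the `libc` crate exposes
// the same ones). Edition 2024 requires writing `unsafe extern "C"` here.
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

// Linux x86_64 values.
const PROT_READ: c_int = 1;
const PROT_WRITE: c_int = 2;
const MAP_SHARED: c_int = 1;
const PAGE_SIZE: usize = 4096;

/// Maps the same file-backed page at two different virtual addresses.
/// A write through one returned pointer is visible through the other.
fn map_page_twice(path: &Path) -> io::Result<(*mut u8, *mut u8)> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    file.set_len(PAGE_SIZE as u64)?;
    let fd = file.as_raw_fd();

    let map = || -> io::Result<*mut u8> {
        // MAP_SHARED makes both mappings refer to the same underlying page.
        let p = unsafe {
            mmap(std::ptr::null_mut(), PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0)
        };
        // mmap reports failure by returning MAP_FAILED, i.e. (void *) -1.
        if p as isize == -1 {
            Err(io::Error::last_os_error())
        } else {
            Ok(p as *mut u8)
        }
    };

    let a = map()?;
    let b = match map() {
        Ok(b) => b,
        Err(e) => {
            unsafe { munmap(a as *mut c_void, PAGE_SIZE) };
            return Err(e);
        }
    };
    // The mappings remain valid after `file` is closed when it drops here.
    Ok((a, b))
}

/// Unmaps a page previously returned by `map_page_twice`.
unsafe fn unmap_page(p: *mut u8) {
    unsafe { munmap(p as *mut c_void, PAGE_SIZE) };
}
```

Setting this up costs several syscalls per page, so it only pays off if the mapping is created once and reused.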


@Michael-F-Bryan : Yeah, I was hoping for non-memcpy tricks. However, it's probably not worth it to do a mmap call for just one pair of 4kb pages, right? This seems only applicable in cases where we have huge contiguous blocks.


Yeah, tricks like mmap() and DMA might work to improve the transfer of large chunks of data, but for only a page or two it's probably faster to just do the memcpy() than to pay for a switch into kernel space.
