pub struct A {
len: usize,
data: [u8; 1300]
}
pub struct B {
len: usize,
data: Box<[u8; 1300]>
}
When dealing with lots of packets, is either preferred? A appears to have the flaw of "probably lots of moving / copying around"; B appears to have the flaw of "lots of small Box's fragmenting the heap; cost of allocating / dropping them"
Moving around isn't all that bad, and you can often avoid copying. Can you give some more context as to where you're using it and what sort of operations would you be doing on it?
Well, I don't know what most do. But let's look at it this way:
A UDP server receives packets (which I assume have a in-wire representation of either A or B).
You probably want to parse the packet - this doesn't require you to copy anything (pnet handles it quite well over any slice). So A will give better cache performance.
You may want to store the packet. B is better for that since you are effectively storing the data out-of-line, making iteration over the entire storage more cache-friendly
For replying, again, A is better since you won't need copying, and can create a packet on the stack, populate it properly (using pnet perhaps) and then send it off
So my advice would be use A for manipulation, parsing and responding while B is better for storage/caching.
I do not know your experience with high performance UDP servers in Rust.
Mine is 0. This is my first serious attempt.
I suspect there is a solution out there that combines the strengths of both approaches. In a very hand wavy manner, it would do something like this:
use some type of slab allocator to provide a buffer of packets [should be as cheap as a malloc + zero]
provides Boxish, which is like a Box, except, when we allocate a Boxish<Packet>, it pops one from the pool, and when we drop it, it pushes it back to the pool
this has the advantage that instead of moving/copying 1300 bytes around, we just move/copy a Boxish ptr, which is 8-16 bytes; the cost of alloc/drop is also just a vec pop/push
If anyone has built something like this, I would love to hear about the battle scars.
One issue is that Rust isn't great at avoiding copies when moving values. For instance, consider this function, which passes A to a function pointer by value:
You can easily solve this by pooling your boxes. It's probably even a good thing to limit the number of in-flight packets you're processing at once anyway.
The savings from not copying around 1308-byte objects is probably well worth it.
Here's an outline for how you can pool your boxes:
Create an ArrayQueue for storing boxes in the pool with a size equal to the bound on the number of packets.
Also create a Semaphore to wait for boxes to be returned when the bound is reached.
To acquire a box from the pool, you first acquire a permit from the semaphore and then immediately forget the permit. You then pop an item from the ArrayQueue, allocating a new box if it's empty.
The boxes you return should be wrapped in a custom struct with a destructor that returns the box when its no longer needed. To do this, first push it to the ArrayQueue, then add a permit to the semaphore.
Since neither ArrayQueue or Semaphore will allocate memory after they are created, this lets you reuse boxes without touching the allocator. The creation of a new box in step 3 happens only when you need more boxes than you have ever needed before, and you could also write it to allocate all of them up front when constructing the pool.
Are the contents of the Box contiguous in memory? I.e. suppose we have 1,000,000 Box<[u32; 1024]>, does this create 1,000,000 small objects on the heap or does it allocate 1GB of memory and chop it up into 1_000_000 parts ?
This might sound entitled: Is there not an existing crate for this? This seems like the type of primitive that many high networking systems would depend on, and thus it seems like there should be a well established / debugged solution.
Could the pass example be fixed by making callees responsible for the memcpy, at least with Rust ABI? Then the generated code could use the value in place and only really move it if necessary.
The only real way to do that is to pass a Box or &mut reference. Due to the differences in copy/move semantics, Rust doesn't really have an equivalent to C++'s && rvalue references.