I'm writing a program that dumps UDP packets from a 10GbE network card. I want to reach a speed of about 1 GByte/s. Each UDP packet is around 8 kBytes. The dumped data will be written to a RAM disk.
So my question is: is std::net::UdpSocket suitable for this purpose, or is there a better solution?
It generally takes some work to get to those speeds. I'm not very up to date with the libraries in this space, but one example is http://udt.sourceforge.net/. A few years ago I wrote Rust bindings for UDT: https://github.com/eminence/udt-rs. If you're interested, we can work to get them updated and tested with the latest version of Rust.
I'm not aware of any pure-Rust high-speed UDP libraries, but if someone wanted to work on this, it would be a great addition to the ecosystem.
I have taken a quick look at your UDT library, but I'm not sure it suits my situation.
I'm receiving the data from a data acquisition device whose internal program I cannot modify (it is actually FPGA-based).
As far as I can tell, UDT is connection-oriented, but my DAQ device does not establish any connection; it simply sends UDP packets to a fixed destination address and port.
I have tried using std::net::UdpSocket to receive the data, but about 10% of the packets are lost.
Can you show us your code that uses std::net::UdpSocket? I suspect the issue is that UdpSocket makes one syscall per packet, which is less efficient than whatever libpcap does under the hood.
Also try mio: being a low-level wrapper around epoll (if you are on Linux), it will process several packets per syscall.
use std::net::UdpSocket;

let socket = UdpSocket::bind("0.0.0.0:60000").unwrap();
socket.set_nonblocking(false).unwrap(); // blocking mode (the default)
let niter = 4096;
// Room for `niter` datagrams of up to 16 KiB each, stored back to back.
let mut buf = vec![0_u8; 16384 * niter];
let mut shift = 0_usize;
for _i in 0..niter {
    let (num_bytes, _src_addr) = socket.recv_from(&mut buf[shift..]).unwrap();
    shift += num_bytes;
    // no other code here
}
// Then check the received packets to find out how many were lost.
For comparison, here is my pcap code:
use pcap::Capture;

let dev = pcap::Device {
    name: dev_name.to_string(),
    desc: None,
};
let mut cap = Capture::from_device(dev)
    .unwrap()
    .timeout(1000000000)
    .buffer_size(512 * 1024 * 1024)
    .open()
    .unwrap();
cap.filter(&format!("dst port {}", port)).unwrap();
while let Ok(packet) = cap.next() {
    // Skip the 42 header bytes: 14 (Ethernet) + 20 (IPv4) + 8 (UDP).
    let data: &[u8] = &packet.data[42..];
    // Many other operations, including copying to a temp buffer and using a
    // crossbeam-channel to send the buffer to another thread, etc.
    // I skip them here.
}
P.S. I find that pnet also performs worse than pcap. Even with a rather big buffer (4 GBytes), pnet loses about 1% of the packets, while pcap loses fewer than 0.001%, if any.
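The loss check itself is left out of the UdpSocket snippet above. As a rough sketch only, assuming every datagram has the same fixed length and starts with a little-endian u64 sequence counter that increases by one per packet (both are assumptions; the real DAQ packet layout is not given in this thread), gaps could be counted like this. The name count_lost is made up for illustration.

```rust
/// Count missing packets in a buffer of back-to-back datagrams.
/// ASSUMPTION: each datagram is exactly `packet_len` bytes long and begins
/// with a little-endian u64 sequence counter (hypothetical layout).
fn count_lost(buf: &[u8], packet_len: usize) -> u64 {
    assert!(packet_len >= 8, "packet must be able to hold the counter");
    let mut lost = 0u64;
    let mut prev: Option<u64> = None;
    for packet in buf.chunks_exact(packet_len) {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&packet[..8]);
        let seq = u64::from_le_bytes(raw);
        if let Some(prev_seq) = prev {
            // A jump of more than one means the packets in between never
            // arrived; reordered or repeated counters contribute zero.
            lost += seq.saturating_sub(prev_seq).saturating_sub(1);
        }
        prev = Some(seq);
    }
    lost
}
```

With the receive loop above this would be called as count_lost(&buf[..shift], packet_len), where packet_len is whatever fixed size the device actually sends.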
Outside of platform-specific calls such as Linux's recvmmsg and sendmmsg, it is not possible to receive or transmit multiple UDP datagrams with one syscall. If you are writing portable code, you will always get only one datagram per recv call.
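For reference, here is a Linux-only sketch of what batching with recvmmsg might look like using nothing but std. The struct definitions are hand-written mirrors of the C layouts on Linux (the libc crate ships equivalent ones), and recv_batch is a hypothetical helper name, so treat this as an illustration rather than a tuned implementation.

```rust
use std::io;
use std::net::UdpSocket;
use std::os::raw::{c_int, c_uint, c_void};
use std::os::unix::io::AsRawFd;
use std::ptr;

// Hand-written mirrors of the Linux C structs (ASSUMPTION: Linux/glibc ABI).
#[repr(C)]
#[allow(dead_code)]
struct IoVec {
    iov_base: *mut c_void,
    iov_len: usize,
}

#[repr(C)]
#[allow(dead_code)]
struct MsgHdr {
    msg_name: *mut c_void,
    msg_namelen: u32,
    msg_iov: *mut IoVec,
    msg_iovlen: usize,
    msg_control: *mut c_void,
    msg_controllen: usize,
    msg_flags: c_int,
}

#[repr(C)]
struct MMsgHdr {
    msg_hdr: MsgHdr,
    msg_len: c_uint, // filled in by the kernel: bytes received
}

extern "C" {
    fn recvmmsg(
        fd: c_int,
        msgvec: *mut MMsgHdr,
        vlen: c_uint,
        flags: c_int,
        timeout: *mut c_void, // really `struct timespec *`; null = no timeout
    ) -> c_int;
}

// Block until at least one datagram is available, then take whatever else
// is already queued without blocking again (Linux value).
const MSG_WAITFORONE: c_int = 0x10000;

/// Receive up to `bufs.len()` datagrams with a single syscall.
/// Returns the length of each datagram received, in order.
fn recv_batch(socket: &UdpSocket, bufs: &mut [Vec<u8>]) -> io::Result<Vec<usize>> {
    let mut iovecs: Vec<IoVec> = bufs
        .iter_mut()
        .map(|b| IoVec { iov_base: b.as_mut_ptr() as *mut c_void, iov_len: b.len() })
        .collect();
    let mut msgs: Vec<MMsgHdr> = iovecs
        .iter_mut()
        .map(|iov| MMsgHdr {
            msg_hdr: MsgHdr {
                msg_name: ptr::null_mut(), // sender address not requested
                msg_namelen: 0,
                msg_iov: iov as *mut IoVec,
                msg_iovlen: 1,
                msg_control: ptr::null_mut(),
                msg_controllen: 0,
                msg_flags: 0,
            },
            msg_len: 0,
        })
        .collect();
    // SAFETY: every pointer refers to memory owned by `bufs`, `iovecs` or
    // `msgs`, all of which outlive the call.
    let n = unsafe {
        recvmmsg(
            socket.as_raw_fd(),
            msgs.as_mut_ptr(),
            msgs.len() as c_uint,
            MSG_WAITFORONE,
            ptr::null_mut(),
        )
    };
    if n < 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(msgs[..n as usize].iter().map(|m| m.msg_len as usize).collect())
}
```

Each buffer should be at least as large as the biggest expected datagram; anything longer than its buffer is truncated, as with a plain recv.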
What is the size of the UDP datagrams being sent? Is the link a standard 1500 MTU Ethernet link, or are jumbo frames being used?
It is possible that some of the loss is due to overhead in the kernel's IP/UDP stack and limitations of the UDP socket syscall interface. Also, your UDP socket has the default buffer size, whereas you have increased the pcap buffer size.
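To make the buffer-size point concrete: std's UdpSocket has no method for changing the receive buffer, so one option is to call setsockopt directly (the socket2 crate wraps the same option). The sketch below is Linux-only and declares the C functions by hand. The numeric constants are the Linux values on x86 and ARM and are an assumption for other platforms, and set_recv_buffer is a hypothetical helper name.

```rust
use std::io;
use std::mem;
use std::net::UdpSocket;
use std::os::raw::{c_int, c_void};
use std::os::unix::io::AsRawFd;

extern "C" {
    fn setsockopt(fd: c_int, level: c_int, name: c_int,
                  value: *const c_void, len: u32) -> c_int;
    fn getsockopt(fd: c_int, level: c_int, name: c_int,
                  value: *mut c_void, len: *mut u32) -> c_int;
}

// ASSUMPTION: Linux values on x86/ARM; other platforms use other numbers.
const SOL_SOCKET: c_int = 1;
const SO_RCVBUF: c_int = 8;

/// Ask the kernel for a receive buffer of `bytes` and return the size the
/// kernel reports afterwards.
fn set_recv_buffer(socket: &UdpSocket, bytes: c_int) -> io::Result<c_int> {
    let fd = socket.as_raw_fd();
    let len = mem::size_of::<c_int>() as u32;
    // SAFETY: `bytes` lives on the stack for the duration of the call.
    let rc = unsafe {
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &bytes as *const c_int as *const c_void, len)
    };
    if rc != 0 {
        return Err(io::Error::last_os_error());
    }
    // Read the value back, since the kernel may not grant what was requested.
    let mut actual: c_int = 0;
    let mut actual_len = len;
    let rc = unsafe {
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &mut actual as *mut c_int as *mut c_void, &mut actual_len)
    };
    if rc != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(actual)
}
```

On Linux the request is silently capped at net.core.rmem_max, and the value read back is double the granted size because the kernel accounts for bookkeeping overhead, so a large request only helps after raising that sysctl.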
It’s likely the “stock” kernel IP/UDP stack cannot keep up with the packet ingress rate if you’re pulling one packet at a time (ethtool should have stats on the types of drops, including ones on the NIC). The way UdpSocket is set up, it’ll be copying your jumbo frames from kernel buffers into your buffer. Instead, you’d probably want to use AF_PACKET and work with the raw packets coming off the interface. You can then create a setup whereby the kernel and your program share a buffer, and the kernel tells you when it has filled the buffer with packets. When that happens, you’ll want to copy them out to a background thread that does file I/O (possibly using AIO to speed up that path).
But this is a custom setup that is beyond what UdpSocket provides.
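The AF_PACKET ring itself is too involved for a short snippet, but the last step, handing filled buffers to a background thread that does the file I/O, can be sketched with std alone. This is only an illustration of the hand-off: it uses std::sync::mpsc rather than the crossbeam-channel mentioned earlier in the thread, and spawn_writer and depth are hypothetical names.

```rust
use std::io::{self, Write};
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// Spawn a thread that writes every buffer it receives to `sink`.
/// `depth` bounds how many buffers may be queued before `send` blocks.
/// The sink is handed back through the join handle once all senders are gone.
fn spawn_writer<W: Write + Send + 'static>(
    mut sink: W,
    depth: usize,
) -> (SyncSender<Vec<u8>>, JoinHandle<io::Result<W>>) {
    let (tx, rx) = sync_channel::<Vec<u8>>(depth);
    let handle = thread::spawn(move || -> io::Result<W> {
        // The loop ends when every sender has been dropped.
        for block in rx {
            sink.write_all(&block)?;
        }
        sink.flush()?;
        Ok(sink)
    });
    (tx, handle)
}
```

In the capture loop, tx.send(buffer) blocks once depth buffers are queued; tx.try_send(buffer) would instead return an error immediately, which lets the capture side count drops rather than stall.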