I have tried several tutorials, books, and other resources about networking, but none of them really satisfied me. I do not just want to understand how something works; I want to understand what problem it solves in the first place. I prefer learning from first principles.
This question is not specifically about Rust, but I previously had a good experience asking about how to implement a VM in Rust. People were very helpful, so I decided to ask here again.
For learning and as a hobby project, I want to rebuild the entire TCP/IP stack from scratch in Rust.
My plan is to start from the lowest level by simulating a physical layer that is unreliable. I have already written a small library that simulates unreliable communication, but I would like something that feels closer to real hardware. My idea is to simulate a very basic hardware interface in code and then build the higher layers on top of it.
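To make that concrete, the boundary I have in mind looks roughly like the sketch below; the trait name and methods are placeholders I made up, not an existing API:

```rust
/// A minimal simulated "hardware" boundary: the upper layers only ever see
/// raw frames, and everything below this trait is free to misbehave
/// (drop, delay, corrupt) without the layers above knowing how.
pub trait Phy {
    /// Hand a raw frame to the medium; it may be dropped or mangled in transit.
    fn transmit(&mut self, frame: &[u8]);
    /// Poll for a frame that has "arrived" from the medium, if any.
    fn receive(&mut self) -> Option<Vec<u8>>;
}
```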
However, I am not sure what the best approach is. The physical layer feels especially important since everything else is built on top of it.
Does anyone have recommendations for how to simulate a physical layer in a meaningful way for learning purposes? For example, what kind of abstractions or constraints should I model to make the simulation realistic enough to support building higher network layers?
Any advice or references would be greatly appreciated. Thank you.
smoltcp::phy might be of interest? I don't know anything about smoltcp really, but to me it seems they rely on fuzzing quite a bit for testing. I don't know if fuzzing alone qualifies as simulating an unreliable network device or if a more specialized simulation algorithm is needed.
From what I understand, smoltcp uses fuzzing to test that its protocol logic doesn't panic on malformed input. But what I'm trying to do is different: I want to simulate an unreliable channel at the physical layer, meaning I need to deliberately drop or corrupt packets and then verify that the layers above (like TCP) recover correctly.
It sounds like what you need is some kind of state machine with unpredictable aspects. I have no experience in this, but unless it becomes large-scale, it may be feasible to implement it directly as a state machine to which you feed random input to create that unpredictability.
The main questions would be: how large would this have to be (in terms of behaviour or API coverage) for it to be useful for implementing communication protocols on top of it? And how can you be sure that the artificially introduced "random points" are sufficiently representative?
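For what it's worth, one classic model of exactly this kind is the two-state Gilbert-Elliott loss model: random input drives transitions between a Good and a Bad state, which gives you bursty loss instead of uniform noise (and bursty loss is what tends to stress protocols like TCP). A sketch, assuming the rand crate, with all probabilities left as made-up parameters:

```rust
use rand::Rng; // assumes the `rand` crate in Cargo.toml

/// Two-state Gilbert-Elliott channel: a tiny state machine whose random
/// transitions produce bursts of loss rather than uniformly random loss.
#[derive(Clone, Copy)]
enum LinkState { Good, Bad }

struct GilbertElliott {
    state: LinkState,
    p_enter_burst: f64, // chance per frame of a burst starting
    p_leave_burst: f64, // chance per frame of the burst ending
    loss_in_burst: f64, // loss probability while inside a burst
}

impl GilbertElliott {
    /// Returns true if the next frame should be delivered.
    fn deliver(&mut self) -> bool {
        let mut rng = rand::thread_rng();
        // random transition first...
        self.state = match self.state {
            LinkState::Good if rng.gen_bool(self.p_enter_burst) => LinkState::Bad,
            LinkState::Bad if rng.gen_bool(self.p_leave_burst) => LinkState::Good,
            s => s,
        };
        // ...then the current state decides the frame's fate
        match self.state {
            LinkState::Good => true,
            LinkState::Bad => !rng.gen_bool(self.loss_in_burst),
        }
    }
}
```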
As part of your study project, maybe studying the hardware itself would be a way to learn more about these topics?
I suggest picking one physical layer and doing a little research before you embark on a simulator. For example, CAN bus is fairly simple as far as physical layers go. You will have to decide how to simulate differential signalling and bit stuffing, understand which CRC is used, simulate out-of-tune oscillators, etc.
The details are incredible. For example, as a fun added bonus, CAN bus can work without a common ground and, in some cases, with just a single communications wire. Will you be simulating that?
Or, are you thinking "at the adapter level"? That would be much easier.
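To give one concrete taste of that level of detail: the CRC in a CAN frame is a 15-bit CRC with polynomial 0x4599, computed bit by bit as the frame is clocked onto the wire. A bit-level simulator would need something like this sketch (my transcription of the shift-register description in the Bosch spec, so double-check it before trusting it):

```rust
/// CRC-15/CAN over a sequence of wire bits (polynomial 0x4599).
fn can_crc15(bits: &[bool]) -> u16 {
    let mut crc: u16 = 0;
    for &bit in bits {
        // feedback = next input bit XOR the register's top (15th) bit
        let feedback = ((crc >> 14) & 1 == 1) ^ bit;
        crc = (crc << 1) & 0x7FFF; // shift left, keep 15 bits
        if feedback {
            crc ^= 0x4599;
        }
    }
    crc
}
```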
I would recommend you start by wiring up an adapter for Linux XDP queues in your network stack and do your simulation there. You get a lot of simulation capability 'for free', plus customization with eBPF, tracing tools via tcpdump and netfilter, and a minimal jump into real deployment. That's what I should have done in my own TCP/IP-from-scratch project¹, and it will also help validate your abstractions end to end. XDP is reasonably close to what drivers provide for real hardware (or indeed the MMIO interaction with the hardware itself), especially given that eBPF hardware offloading is being discussed as the next state of the art for fast, efficient general-purpose networking.
Creating interfaces can be done with macvlan or veth pairs. Those will also let you run your own stack on the same machine in parallel with the normal networking, so you can test against real devices once your stack is there.
¹I'm excusing myself by saying that we did not have good enough Rust tooling for eBPF back then, but nowadays aya is stable and provides a lot of documentation.
Smoltcp has utils to spuriously drop frames too (phy::FaultInjector, I believe). And anyway, what's the problem? Buffer packets a bit and have a random value decide: a) drop or not, b) reorder or not, c) corrupt or not, d) delay or not, e) some combination of the above. Seems like an easy piece to write.
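If it helps, here's that piece as a plain-Rust sketch, assuming the rand crate; all the rates are parameters you'd tune, and delay falls out for free since frames sit in the queue until the receiver polls:

```rust
use rand::Rng; // assumes the `rand` crate
use std::collections::VecDeque;

/// The "easy piece": buffer frames and let a random draw decide whether each
/// one is dropped, corrupted, reordered, or just passed along.
struct FaultyWire {
    queue: VecDeque<Vec<u8>>,
    drop_rate: f64,
    corrupt_rate: f64,
    reorder_rate: f64,
}

impl FaultyWire {
    fn send(&mut self, mut frame: Vec<u8>) {
        let mut rng = rand::thread_rng();
        if rng.gen_bool(self.drop_rate) {
            return; // a) dropped on the floor
        }
        if rng.gen_bool(self.corrupt_rate) && !frame.is_empty() {
            // c) flip one random bit to simulate line noise
            let byte = rng.gen_range(0..frame.len());
            frame[byte] ^= 1u8 << rng.gen_range(0..8u32);
        }
        if rng.gen_bool(self.reorder_rate) && !self.queue.is_empty() {
            self.queue.push_front(frame); // b) jump ahead of an earlier frame
        } else {
            self.queue.push_back(frame);
        }
    }

    /// d) delay: a frame only "arrives" when the receiver polls for it.
    fn recv(&mut self) -> Option<Vec<u8>> {
        self.queue.pop_front()
    }
}
```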
If I understand correctly, what you want is to be able to simulate the actual bytes transmitted on the wire by something like IEEE 802.3, then write that protocol itself, and then IP, TCP, etc.?
In that case the solution is easy: use anything that can send a bytestream between processes, and place a basic middleman in between that can add delay and occasionally flip a bit. You don't need any special library for that. Once you have that simulated wire you can try any L2 protocol on it easily, and once you have a working L2 you can use it to make L3 work on top of it, and so on.
Consider experimenting with multiple processes competing over the wire; it will be chaotic. A minimal middleman is sketched below.
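Here's that middleman as a sketch, using local TCP sockets as the inter-process bytestream (a Unix socket or pipe works just as well; the addresses and the deterministic fault intervals are arbitrary placeholders, and only one direction is shown):

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

fn main() -> std::io::Result<()> {
    // one endpoint connects to 127.0.0.1:9000, the other listens on :9001;
    // this process sits in the middle and *is* the wire
    let listener = TcpListener::bind("127.0.0.1:9000")?;
    let (mut upstream, _) = listener.accept()?;
    let mut downstream = TcpStream::connect("127.0.0.1:9001")?;

    let mut buf = [0u8; 1500]; // roughly one Ethernet frame at a time
    let mut n_chunks: u64 = 0;
    loop {
        let n = upstream.read(&mut buf)?;
        if n == 0 {
            break; // sender hung up
        }
        n_chunks += 1;
        if n_chunks % 97 == 0 {
            buf[0] ^= 0x01; // occasionally flip a bit: line noise
        }
        if n_chunks % 53 == 0 {
            // occasionally stall: propagation delay / congestion
            std::thread::sleep(std::time::Duration::from_millis(5));
        }
        downstream.write_all(&buf[..n])?;
    }
    Ok(())
}
```

Run one copy per direction, point your two endpoints at it, and everything above can treat the relayed stream as the physical medium.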
I discussed the idea with some LLMs earlier and they suggested going down the IEEE 802.3 route or simulating something closer to a real network controller. I'm comfortable enough with the hardware side that I could simulate it, but I was worried it might consume too much time that could be better invested in the upper layers, which are the main goal of the project.
Your suggestion of treating a bytestream between processes as the “wire” is a nice abstraction. A simple middle layer that can add delay, drop data, or flip bits already seems enough to create the kind of unreliable environment needed to exercise the protocols.
The multiprocess aspect is particularly interesting. Having several processes sharing the same simulated wire and competing over it sounds like a good way to introduce real contention and unpredictable behavior.
Just to clarify, are you suggesting that I first build a simple simulator for the unreliable physical layer, then use smoltcp to test it? And if smoltcp works correctly over this simulator, I can be confident that the physical layer simulation is solid enough for building the higher layers on top?
What I’m aiming for is to build a simulator that will expose me to all the obstacles engineers faced when designing the TCP/IP stack, so I understand why certain solutions exist. From my perspective, the higher layers look like a dark forest — I don’t yet know what challenges will appear there. That’s why I focused on this specific layer. Some replies suggest it’s straightforward, but to me it feels like the most important part of the project.
I’m not planning to simulate the full hardware details like differential signaling or oscillators. I’m looking more at the “adapter level,” where the simulator behaves like a wire that can drop, delay, reorder, or corrupt packets. That abstraction is enough for me to explore how the higher layers, like TCP/IP, handle unreliable conditions, without getting lost in the full physical layer complexity.
And yet, that level of depth is worth exploring at another time.
Thanks, that’s a really valuable answer. I’m curious, how much time did it take you to build your stack from scratch? Did you keep any notes, documentation, or a public repository that I could look at? Also, are there any lessons or gotchas you learned along the way that you wish you knew before starting?
I would not bother simulating the hardware. I would stop at the packet level and focus on sending and receiving packets.
All hardware-related issues, such as missing packets, delayed packets, truncated packets, and packets with bit errors, can be simulated easily in the packet distribution layer.
If you work only with packets, it should be easy to bring in real network elements through the operating system on Linux.
It should also be easy to export to and import from Wireshark at the packet level. That means the problem of observing and analyzing packets is already solved with minimal effort.
That way, you can concentrate your resources on your actual goal, rebuilding a TCP/IP stack.
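On the Wireshark point: the classic pcap container is simple enough that you can write it by hand with nothing but std, which keeps even the simulated-wire traffic inspectable. A sketch (the tuple layout for frames is just an illustration):

```rust
use std::fs::File;
use std::io::{self, Write};

/// Write (timestamp_sec, timestamp_usec, frame) triples as a classic pcap
/// file that Wireshark can open. Values follow the libpcap file format.
fn write_pcap(path: &str, frames: &[(u32, u32, Vec<u8>)]) -> io::Result<()> {
    let mut f = File::create(path)?;
    // global header: magic, version 2.4, tz offset, sigfigs, snaplen,
    // link type 1 = Ethernet
    f.write_all(&0xa1b2_c3d4u32.to_le_bytes())?;
    f.write_all(&2u16.to_le_bytes())?;
    f.write_all(&4u16.to_le_bytes())?;
    f.write_all(&0i32.to_le_bytes())?;
    f.write_all(&0u32.to_le_bytes())?;
    f.write_all(&65_535u32.to_le_bytes())?;
    f.write_all(&1u32.to_le_bytes())?;
    for (sec, usec, frame) in frames {
        // per-record header: ts_sec, ts_usec, captured len, original len
        f.write_all(&sec.to_le_bytes())?;
        f.write_all(&usec.to_le_bytes())?;
        f.write_all(&(frame.len() as u32).to_le_bytes())?;
        f.write_all(&(frame.len() as u32).to_le_bytes())?;
        f.write_all(frame)?;
    }
    Ok(())
}
```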
I got to a working iperf3 over the course of roughly a trimester, spending on average maybe ~6 hours a week? I suppose you could go faster if you don't wrestle with the type system as much as I did. Expect 'research'-level code, in the sense that there are two approaches to XDP in there and neither is the final one (that one is in a private repository I'm not ready to share, sorry), but the exercise taught me what the approach can't do. E.g. efficient multi-buffer packets conflict with having a method that gives you the whole packet as a single byte slice, and I couldn't figure out a good way to fit hardware-offloaded checksums in. Routing packets directly between queues, without copying out of one buffer into another, never went beyond a draft; if you want that sort of efficiency you must plan for the operation beforehand.
You'll also see that I have a sketch of a virtual phy, but testing on real devices (even software devices like macvlan) got me much further. Being able to put any software on the other end and inspect the traffic with Wireshark worked nicely for debugging. (If you do have tracing in your stack, consider tagging your own logs with TCP seqs. I had no experience with the log crate at the time, and some packet traces got really long and annoying to search by hand.)
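For reference, the kind of seq tagging I mean is nothing fancier than this hypothetical helper (the log crate and the field names here are my own placeholders):

```rust
/// Tag every event with the TCP sequence numbers it touches, so a suspicious
/// segment spotted in Wireshark can be grepped straight to the matching log
/// lines. Assumes the `log` crate (plus some logger backend) is set up.
fn trace_segment(dir: &str, seq: u32, ack: u32, len: usize) {
    log::trace!("tcp {dir} seq={seq} ack={ack} len={len}");
}
```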