Background: going through quite a fascinating book on x86-64 Assembly at the moment. A lot of things are finally coming together in my mind, yet a select few still remain somewhat mysterious.
So far, I've gotten to understand that:
At the end of the day, we're just moving 1's and 0's around - with different levels of abstraction over the hardware, as appropriate for the issue at hand
C/C++ is, essentially, a handy wrapper for ASM
ASM itself is just a sequence of purely imperative instructions to the processor that swap one set of 0's and 1's for another, as needed for the calculation at hand - as is all the machine code that it maps to
What I still don't understand that well:
1) How is the processor able to interact with the underlying hardware and the OS - if all that it can do on the most basic level is swap bits in its registers and query the underlying bus for reads and writes?
2) What's the deal with syscalls? One post defines them as "glorified jumps" - which does help to clarify the overall context on a higher level - but what do they do behind the scenes, and how would you go about writing them yourself?
3) How can a machine run without any OS, if nothing useful can be done without syscall-ing it in the first place? Or am I missing something obvious here?
Rust feels (perhaps inappropriately - you tell me) like just the tool for the job. Yet where do you even start?
Any pointers in the right direction would be greatly appreciated - and pardon my glaring ignorance, when it comes to all matters low-level. JS/TypeScript land is where I come from.
For 1) imagine that some of those memory locations your machine code is swapping bits with are actually connected to external peripheral hardware. Then, as a simple example, writing to some memory address could cause the hardware connected to that address to print a character to a terminal.
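To make that concrete, here is a minimal sketch of what such a write looks like in Rust. The names `put_char` and `fake_tx_reg` are made up for illustration, and the "device register" is just an ordinary variable so the code runs on a normal hosted OS. On real hardware the pointer would instead be a fixed address taken from the device's datasheet.

```rust
use std::ptr;

/// Writes one byte to a (hypothetical) device transmit register.
///
/// # Safety
/// `tx_reg` must be valid for a one-byte write.
unsafe fn put_char(tx_reg: *mut u8, byte: u8) {
    // A volatile write tells the compiler that the store itself is the
    // point: it must not be optimized away or merged with other stores.
    unsafe { ptr::write_volatile(tx_reg, byte) }
}

fn main() {
    // Stand-in for the device register (an assumption of this sketch).
    let mut fake_tx_reg: u8 = 0;

    for &b in b"Hi" {
        // On real hardware, each of these stores would reach the device.
        unsafe { put_char(&mut fake_tx_reg, b) };
    }

    // Our fake "device" simply holds the last byte that was stored.
    assert_eq!(fake_tx_reg, b'i');
}
```

From the CPU's point of view this is a plain store instruction; whether it lands in RAM or in a peripheral depends only on how the address is wired up.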
For 2) imagine that an operating system itself has to run without an operating system, if you see what I mean. Application code makes system calls into the operating system code via "glorified jumps". In turn, the operating system swaps bits with those hardware-connected registers in memory described in 1).
The processor has some instructions (such as those for configuring the MMU, which controls what memory a process can access) that can only be used when running in a privileged context like an OS kernel. In addition, interaction with many external devices, like those connected through PCIe, happens using memory-mapped I/O (MMIO), which basically means that the device intercepts writes to certain parts of the address space and interprets them as instructions. The kernel generally configures the MMU such that userspace can't write to such MMIO regions and instead has to go through syscalls to ask the kernel to use MMIO on its behalf. The kernel itself can write to the MMIO regions directly.
A syscall is basically a regular call to a library function, except that the target is in the kernel and the call is performed using a different instruction than regular calls. The instruction that performs the syscall changes the privilege level stored in the CPU state and then jumps to a fixed address configured by the kernel (rather than an arbitrary address, as regular calls can specify). This fixed address is the start of a so-called syscall handler. The syscall handler is responsible for performing the requested action using system privileges, and for verifying at every step that the process is actually allowed to have access. At the end, the syscall handler executes a syscall return instruction, which causes the privilege level to drop back to unprivileged and execution to resume right after the syscall instruction.
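As a rough sketch of what "writing one yourself" can look like from the userspace side, the snippet below issues the `getpid` system call with the raw `syscall` instruction through Rust's inline assembly. It assumes x86-64 Linux, where `getpid` is syscall number 39 and the number goes in `rax`; the function name `raw_getpid` is made up, and on any other platform the sketch simply falls back to the standard library.

```rust
/// Asks the kernel for this process's ID using the raw `syscall` instruction.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn raw_getpid() -> u32 {
    use std::arch::asm;

    let ret: isize;
    // SAFETY: getpid takes no arguments and does not touch user memory.
    unsafe {
        asm!(
            "syscall",
            // rax carries the syscall number in and the result out.
            inlateout("rax") 39isize => ret, // 39 = getpid on x86-64 Linux
            // The syscall instruction itself overwrites these two registers
            // (rcx gets the return address, r11 the saved flags).
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    ret as u32
}

/// Fallback so the sketch still builds elsewhere (not a raw syscall).
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn raw_getpid() -> u32 {
    std::process::id()
}

fn main() {
    // The standard library asks the kernel the same question for us.
    assert_eq!(raw_getpid(), std::process::id());
    println!("pid from raw syscall: {}", raw_getpid());
}
```

A syscall that takes arguments works the same way, with the arguments placed in further registers (`rdi`, `rsi`, `rdx`, and so on in this calling convention) before the `syscall` instruction runs.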
So that's what the difference between user space and kernel space access is all about. I get it now. Given the kind of security holes that any standard user behavior naturally creates, it would be a disaster if any program could do whatever it liked with the underlying hardware - thus, the separation and "administrator privileges required". Makes sense.
And the "drivers" that an OS has for a device then basically tell it what type of device this is (input/output/both), and what kind of MMIO it's been configured to use. Things finally "clicked".
Well, the answer is either "that's exactly how" or "it's not the only thing it can do".
There is no fundamental reason why a processor couldn't physically do anything besides accessing memory. In fact, processors can do a great many other things – that's what I/O is. And even when a processor is in fact limited to accessing "memory", then memory-mapped IO can kick in, and some address ranges won't map to any physical RAM; instead, they will map to other communication channels towards other kinds of hardware.
It is not the case that "nothing useful can be done without syscalls". Syscalls are an abstraction, but the OS is just software, after all; it's not magic. Syscalls are just functions, basically. Modern OSes and CPUs offer a range of layers of abstraction, and security considerations complicate the picture (cf. "protected mode", "ring0", etc.), but at the end of the day, the code behind a syscall compiles down to the exact same kind of assembly that a regular, user-written function would.
If you want to understand how a computer without an OS works, read up on microcontrollers or MCUs. These are tiny CPUs that have a set of direct I/O ports (called GPIO for general-purpose I/O), and are usually programmed either directly in assembly or in C (and these days, Rust too). They have no OS; you program them by writing bit patterns into registers, and these registers instruct various pieces of hardware (timers, GPIO ports, USART communication) to do their own hardware-y things. No OS needed whatsoever.
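Here is a small sketch of what "writing bit patterns into registers" tends to look like in Rust. The register layout (`GpioRegs` with a `dir` and an `out` register, one bit per pin) is invented for illustration and does not match any particular chip. The register block is also an ordinary struct here so the code runs on a desktop, whereas on a real MCU it would sit at a fixed address given in the datasheet.

```rust
use std::ptr;

/// Hypothetical GPIO register block; real chips define their own layout.
#[repr(C)]
struct GpioRegs {
    dir: u32, // one bit per pin: 1 = pin is an output
    out: u32, // one bit per pin: 1 = drive the pin high
}

/// Sets or clears a single bit in a register (read-modify-write).
///
/// # Safety
/// `reg` must be valid for reads and writes, and `bit` must be below 32.
unsafe fn set_bit(reg: *mut u32, bit: u32, on: bool) {
    unsafe {
        let old = ptr::read_volatile(reg);
        let new = if on { old | (1 << bit) } else { old & !(1 << bit) };
        ptr::write_volatile(reg, new);
    }
}

fn main() {
    // Stand-in for the memory-mapped register block (an assumption).
    let mut gpio = GpioRegs { dir: 0, out: 0 };

    unsafe {
        set_bit(&mut gpio.dir, 5, true); // configure pin 5 as an output
        set_bit(&mut gpio.out, 5, true); // drive pin 5 high (LED on, say)
    }
    assert_eq!(gpio.dir, 0b10_0000);
    assert_eq!(gpio.out, 0b10_0000);

    unsafe { set_bit(&mut gpio.out, 5, false) }; // drive pin 5 low again
    assert_eq!(gpio.out, 0);
}
```

In practice, embedded Rust projects usually get these register definitions from a generated per-chip crate rather than writing them by hand, but the underlying operation is the same.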
Seconded. One of the most "bang for the buck" things I've done, with regard to how much I learned vs how much time I spent on it, was implementing software timers and a taskswitcher on top of hardware timers on a PIC microcontroller.
Easy to do, but provided a lot of valuable insights.
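For a flavor of the software-timer half of that exercise, here is a minimal sketch in Rust. It is not the PIC code described above, just one possible shape under my own assumptions: a hypothetical `SoftTimers` type whose `tick` method would be called from the hardware timer interrupt, counting down several independent software timers and reporting which ones expired as a bitmask.

```rust
/// Up to 32 software timers driven by one hardware tick (hypothetical type).
struct SoftTimers<const N: usize> {
    remaining: [u32; N], // ticks left per timer; 0 means "not running"
}

impl<const N: usize> SoftTimers<N> {
    fn new() -> Self {
        Self { remaining: [0; N] }
    }

    /// Starts (or restarts) timer `id` so it expires after `ticks` ticks.
    fn start(&mut self, id: usize, ticks: u32) {
        self.remaining[id] = ticks;
    }

    /// Call once per hardware timer interrupt.
    /// Returns a bitmask of the timers that expired on this tick.
    fn tick(&mut self) -> u32 {
        let mut expired: u32 = 0;
        for (id, left) in self.remaining.iter_mut().enumerate() {
            if *left > 0 {
                *left -= 1;
                if *left == 0 {
                    expired |= 1u32 << id;
                }
            }
        }
        expired
    }
}

fn main() {
    let mut timers = SoftTimers::<2>::new();
    timers.start(0, 2); // expires on the 2nd tick
    timers.start(1, 3); // expires on the 3rd tick

    assert_eq!(timers.tick(), 0b00);
    assert_eq!(timers.tick(), 0b01); // timer 0 fired
    assert_eq!(timers.tick(), 0b10); // timer 1 fired
    assert_eq!(timers.tick(), 0b00); // nothing left running
}
```

A task switcher builds on the same interrupt: instead of only counting down, the handler would also decide which task gets to run next.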