Yes! I meant precisely this.
That is, read bytes aligned to the maximum alignment and interpret them as some field. With such alignment, reading the field does not require two or more memory reads.
The OS doesn't implement virtual memory. VM is a hardware feature. What the OS does is manage buffers and mappings to implement process isolation. When you load from a memory address that is already allocated and valid to access, no OS code is invoked; it would be unacceptably slow to invoke OS code on every memory access.
I encounter "alignment" in Rust's documentation quite often, which is why this question arises. I'm not pursuing performance; this is mostly a matter of curiosity, most likely a useless one.
Oh, quite the opposite is true: it's all about the hardware, specifically the circuits.
The short version: real computer hardware requires aligned memory access for technical reasons. As a result, programming languages, compilers, and operating systems all have to accommodate this reality. Real, usable programs need to be written in real, usable programming languages and run on real hardware with its inherent technical constraints.
If you want to understand this rather than just accept it, studying computer architecture is a great starting point. A good place to begin is Wikipedia - Computer Architecture, though unfortunately, you seem to dismiss this valuable resource.
Yes. I understand. Within this illusion there is no notion of anything smaller than an addressable chunk. The chunk is the atom for the purpose of such an instruction, so the last two zeros simply can't mean anything observable, and therefore they do not exist.
What about the 288 connectors of a DDR5 DIMM? Why is that important?
Thank you for the correction. What I wrote was based on five seconds of reading the Wikipedia article on virtual memory; that is, I had no idea about it and just made an assumption. I didn't mean to sound rude.
Yes, I really understand now that it was silly to assume that accessing memory causes a system call. Is it called a "system call"?
Yes, @khimru clarified that the opposite is true.
@trentj has provided a valuable, everyman-understandable example, and this is enough for me at this point.
I appreciate Wikipedia, but for me it is more like a reference book overloaded with cross-links, which is very difficult to read without preliminary knowledge, without losing motivation or becoming gloomy realizing how scarce my knowledge is. The style of schoolbooks suits me much better, since I have no education in computer science or mathematics and cannot get one.
I have begun reading "Structured Computer Organization" by Andrew S. Tanenbaum, but switched to understanding abstract algorithms first. I'm going to return to it as soon as I finish "Introduction to Algorithms" by Cormen, Leiserson, et al.
Generally, Tanenbaum is a very valuable source.
In my experience, it is much simpler to go from the concrete to the abstract.
It's not "about such an instruction". It's simple: when the CPU requests something from memory, it gets back 16 bytes.
It doesn't even matter what the CPU does internally with its instructions; that's just how memory works.
The precise number is not important. The important thing is this simple fact: you can't request and get back a single byte from memory. Only a cache line, 16 bytes. One would need 16*8 wires for that… but in reality there are 288, more than enough for 16 bytes.
And all the dance that happens revolves around that: how do we organize our data structures so as not to force the CPU to "stitch together" a small piece of data from two different 4/8/16/32-byte chunks?
We know that these chunks are powers of two, but we couldn't realistically proclaim that it's precisely 16 bytes: yesterday it was 8 bytes; tomorrow it may become 32 bytes.
In reality, the designers of the 80386 (also known as x86 these days) C API made exactly such a mistake: since the 80386 accesses memory in chunks of 4 bytes, they declared that things only need to be aligned at 4 bytes. And even today, when the CPU uses 16-byte chunks, people access memory in that inefficient fashion (evidently one cannot be a proper god of war using just ordinary locks, so the game does a lot of split locking; that's an easy-to-remember key to the whole misery). That's why Rust doesn't cap alignment at 16 bytes (though in practice this only affects very exotic data types, because Rust doesn't try to make access to non-primitive types aligned).
In his book there is an ascent from the physical level to the level of application programming. In the preamble, the author offers to skip the parts relating to the digital logic level for speed, which I did. I went through the machine-instruction level to the operating-system level, but I was hindered by a weak understanding of how stack-based languages function and how the primitive data structures backing the memory model work. I did not know at all what a "stack" and a pointer are. Many sources assume elementary knowledge, and when it is absent, you have to switch. It may seem to you that you studied from the bottom up, but it seems that cannot always be the right way.
Yeah, starting from absolute zero in any new field is never easy, but over time, the pieces start to connect.
When I started with computers decades ago, I was still a child. The challenge today is that computer science has evolved so much, making it harder to break in.
But in general, from what I’ve observed about how people learn, regardless of the field or their age, they begin with simple, concrete concepts. Maybe my phrase "concrete to abstract" isn't precise English, but what I mean is this:
Take math as an example. You first learn to count, then to add numbers, then multiplication. Only later do you realize that addition and multiplication aren’t just for real numbers, they apply to anything that follows certain abstract rules.
Re-using an old post about why chips care about alignment:
If you're interested in the hardware-level reasons for constraints like this, I would strongly encourage you to play through Turing Complete. The pacing's a little uneven, but the game has you work your way up from single NAND gates all the way up to a working computer, and while it makes some simplifying assumptions along the way, they mostly serve to hide details like clocks and specific voltage levels.
Building a memory array from parts will teach you very effectively why word-level addressing works the way that it does. Doing it in a video game is significantly cheaper, faster, and easier than doing it with transistors (let alone integrated circuits).
Oh. You mention several things at once that are incomprehensible to me.
This discussion is going off topic. Probably, first of all, I would want you to confirm or refute my understanding of the foregoing.
- Does the C API for the x86 instruction set still declare that things should be aligned at 4 bytes today? Is that what you meant by "accessing in an inefficient fashion"?
- Rust does not declare such constraints; it only requires alignment to be at least 1 and a power of two. Right? That way, if in the future we get CPUs operating on 32-byte or larger chunks of memory, all Rust types will keep working, but compiler developers can implement more efficient operations for 32-byte-aligned types, making use of new instructions on new platforms. Yep?
- I read the post you mentioned. A "split lock" is described as an "architecturally legal operation"; that is, a "split lock" is a CPU instruction. Kernel designers provide access to this instruction through a system function, accessible via C or other languages' APIs (is calling such an API called a "system call"?). Then programmers leverage this API, performing the now-discouraged split-lock operation in their language of choice. Do I understand this right? My question is about this line:

  > In practice, that means locking the bus for the duration of the operation, which can stall every other processor in the system.

  Why does performing a "split lock" require special measures?
Yes, now I better understand what you mean. The book about algorithms is pretty "concrete" about entities in that way of thinking, just as discrete math is concrete as opposed to calculus. Anyway, it feels like it fills the gap that prevented me from moving forward.
It looks very promising. I have not used Steam. Can I somehow find out how much time it takes to complete the full game, among users who have finished it completely?
Judging by my own Steam profile, it took me about a week to complete the game (playing a couple of hours at a time), and then about a month to get bored of it afterwards. On the other hand, I have some prior background in computer engineering, so this may not be representative; a few comments in the reviews suggest that some people took much longer to complete it, and some completed it faster.
Looks like it usually takes 10-17 hours to complete the main part of the game.
I have a few questions about the cited reply.
- Is this MIR? Where can I read about its syntax?
- Do I understand the following correctly?
  - Line 12 calls `read_via_copy`, moving `_2` into it, which checks whether the type of the pointee is `Copy` and then goes to `bb1`, unwinding otherwise;
  - the patched version in lines 13-14 bypasses all checks, simply copies the value behind `_2` to `_0`, and goes to `bb1`.
- How does the input value from `_1` end up in `_2`? Is `_0` always used for the returned value?
- The topic author claims that `read` calls `drop` (on the value read). Are they wrong? Or do they mean that after `let x = p.read();`, `x` is dropped when it goes out of scope, and that this is no longer the case since `x` is somewhat like `ManuallyDrop`?