Rust doesn't have a runtime

My intention is not to be nitpicky. My intention is the opposite: to point out that the term "run-time" does indeed mean different things in different contexts.

And failed miserably.

I know of these things; I use Espruino, and it's wonderful. They certainly do need a "run time". So did BASIC, in the days when any computer people could buy had only kilobytes of memory. They are interesting outliers.

That's precisely the point. I don't think there are languages (except for assembler) which don't have any runtime at all.

But most languages with a runtime (Java, C#, Go, etc.) impose that runtime on you whether you use it or not. That's why a “slimmed down version of the language” is needed: the only way to reduce the runtime is to create a special version of the language. Otherwise, even if you run “Hello, world”, you get a lot of code.

Languages like Ada, C, C++, Pascal, Rust… they are different: if you don't use a certain facility, it won't be added to the binary. Even if you use the “full” version of the language.

Simplified runtimes exist (like Musl instead of GLibc) but they are not mandatory.

The C language does not have a run-time. You will not find any mention of such a thing in the C standard: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf.

There have been and are many such languages, PL/M, Coral, Pascal, etc, etc.

You mean: all these functions which it does mention (like memcpy or malloc/free) come from thin air?

At least Pascal includes functions read and write which are not functions at all, because they don't behave like regular functions. Not sure what PL/M or Coral say (do they even have a formal spec?) but I suspect that, like Rust and Pascal, they also have some things which are not possible to implement directly in these languages.

I don't think there is any formal definition which would give us a clear answer about whether a given language does or doesn't have a runtime.

It looks similar to CPU bitness: there are 8bit CPUs, 16bit CPUs, 32bit CPUs, 64bit CPUs… but how do you determine which is which?

8080 and z80 have 16bit instructions (like 8086), but they are 8bit because of what… because their ALU is only 8bit and 16bit instructions need to pass it twice?

Okay, but why then is Prescott, with 32bit registers and a 32bit ALU, 64bit? Heck, Intel calls what Prescott got IA-32e mode, clearly because it thought that it's still 32bit, isn't it?

Oh, it still calls it that! But the architecture is no longer IA-32e but Intel 64, thus we have this in the documentation: Intel 64 architecture increases the linear address space for software to 64 bits and supports physical address space up to 52 bits. The technology also introduces a new operating mode referred to as IA-32e mode.

Everything is clear as mud… and the issue of runtime is similar: the answer depends, to a large degree, on whom you are asking and what the marketing guys thought about all that when the issue was discussed.

Well, actually, yes, kind of! These operations are part of the Abstract Machine directly. Other functions defined by the standard are also technically baked in Abstract Machine operations, but are typically provided as C code implementations of the prescribed interface.

But the C standard doesn't actually have any concept of "compile time" either. It's merely a specification for what the result of interpreting some C code is; how the code actually gets executed is entirely up to the implementation.

Yes, "does language X have a runtime" is a terribly underspecified question until you define what a runtime is.

Fundamentally, if you're on a multitasking system, there's a "runtime" involved in executing any executable, even if it's written in pure machine code, which most people would consider as not having a runtime. There's still structure involved: there's some entry point defined as part of the executable format, perhaps even multiple, and perhaps even called in sequence, e.g. initialization routines called before jumping to main.
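To make the "initialization routines called before jumping to main" part concrete, here is a minimal sketch. It assumes GCC or Clang, because the constructor attribute is a GNU extension and not standard C. The names early_init and was_initialized_before_main are made up for illustration.

```c
/* Set by the constructor below, before main() starts. */
static int initialized_before_main = 0;

/* GNU C extension (GCC and Clang), not standard C: the startup code
   calls functions marked this way before it calls main(). */
__attribute__((constructor))
static void early_init(void) {
    initialized_before_main = 1;
}

/* Hypothetical helper: reports whether early_init() has already run. */
int was_initialized_before_main(void) {
    return initialized_before_main;
}
```

Calling was_initialized_before_main() from main should already report 1, even though nothing in main ever called early_init. Something outside the program's own code did.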

Do you consider the standard library "a" runtime? It's certainly support code which is provided with and versioned with the language. But it's also mostly not part of the language, but instead implemented in the language.

Because you can use Rust as #![no_core], it's one of the languages with the smallest language-level baked in "runtime," since "everything" beyond "execution starts here" (defined by the executable format) is defined in Rust code rather than as part of the compilation process. (Unwinding is sort of half-and-half.)

Even if not. Even the very first 8086 CPU had a tiny program in it. Only 512 21bit words long, but hey, it's still “a runtime”! Fully decoded and thoroughly explored, but… is it runtime already or not?

Ooh. That adds even more mud. You may say “if something is not part of the language but instead is implemented in the language” then it's not a runtime but mere library.

Okay. Well… what about Forth? Typically its runtime is implemented in the language yet… not implemented in the language.

Because if you do that you would, of course, need the full language to compile said runtime, and then you would have a bootstrapping problem. Forth sidesteps that: since the Forth compiler is so primitive, the initial core is usually hand-compiled by humans!

Is it a runtime because it was not, actually, produced from Forth sources, or is it a mere library since it was, actually, hand-compiled by a human and thus indistinguishable from library code?

No. I mean the C standard specification makes no mention of a "run time", as I said. As for memcpy, malloc/free and friends, none of those are part of the actual language specification (as opposed to the standard library specification). It is perfectly possible to run C code that does not use any of those. It is possible to run C code on systems that don't provide those. Basically I count them as just functions from a library rather than a run time.

You might be right about Pascal's read/write. I have used Pascal on systems that did not provide read/write but I guess it is defined in whatever specification Pascal has.

Coral certainly has a specification. It was a British MoD standard before Ada came along. PL/M...hmmm... was described in a book and a manual by Intel, about as formal a spec. as Rust's I think.

Yes, exactly. As I noted here: Rust doesn't have a runtime - #21 by ZiCog

They are most definitely part of the standard language specification. Simply because they can not be written in [standard] C.

An even better example would, probably, be type-generic math: functions which are defined as type-generic macros in a language which has no facilities to create type-generic macros.
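For anyone who hasn't met tgmath.h, here is a small sketch of what "type-generic" means in practice. The function names are hypothetical. It relies only on sizeof, which doesn't evaluate its operand, so no math function is actually called. Note that C11 later added _Generic, which gives user code a way to write such macros; the C99 draft linked earlier has nothing like it.

```c
#include <tgmath.h>

/* With <tgmath.h>, sqrt is a type-generic macro: the result type follows
   the argument type. sizeof does not evaluate its operand, so no call
   is actually made here. */
int sqrt_of_float_is_float(void) {
    return sizeof(sqrt(1.0f)) == sizeof(float);
}

int sqrt_of_long_double_is_long_double(void) {
    return sizeof(sqrt(1.0L)) == sizeof(long double);
}

/* Integer arguments are treated as double. */
int sqrt_of_int_is_double(void) {
    return sizeof(sqrt(1)) == sizeof(double);
}
```

An ordinary C function named sqrt could not behave like this, since it has exactly one parameter type and one return type.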

This would have been true if you had an ability to implement them in [standard] C. But that's impossible.

This makes things a bit similar to how other “runtime-based languages” work: you may have small runtime (without malloc/free) or you may have large runtime (with malloc/free) but these are different (but related) languages.

Because you couldn't just add the remaining part with code in [standard] C.

Not only are they defined in the specification, but they use special syntax which can not be used with user-provided functions.

This makes them procedures-in-name only: they look like procedures, but compiler treats them specially and you can't create anything like that in library.

Certainly memcpy, malloc and free can be written in C without any library support. If I recall correctly "The C programming language" by Brian Kernighan and Dennis Ritchie includes examples of doing so.
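For memcpy at least, this is easy to show. A minimal sketch, with my_memcpy as a made-up name so it doesn't clash with the library function:

```c
#include <stddef.h>

/* Byte-by-byte copy; as with memcpy, the two regions must not overlap. */
void *my_memcpy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}
```

This uses nothing beyond pointers and a loop. Whether the same holds for malloc and free is the contested part.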

Not sure I'm convinced by the type-generic math stuff. Surely that just generates code to call different versions of a function depending on type, at compile time, no run-time required.

What is it from the C standard library that I cannot implement in C?

I don't follow. There is only one C language standard hence only one C language.

The troublesome one is free. It can not be written in C. Because there are only two ways to end object lifetime in C:

  1. Exit function and then lifetime of all automatic variables ends.
  2. Call either free or realloc and then lifetime of heap-allocated object ends.

If you don't have free or realloc then it's not clear how to write one with just #1. You can, probably, do that with threads… but then, threads also need some runtime which can not be written in [standard] C!

Sure, but that wasn't what we call C today. That was K&R C, an entirely different language without a specification. And without the ability to create one, because if you tried… you would face the same dilemma: how to determine when you can use a certain pointer and when you can not?

Yes, but it's the same story as with Pascal: if you don't have a runtime which provides these functions and don't have magical header then you can not add that facility to the language.

You can create functions but not the magical macro. In fact that's how Turbo Pascal did it. Borland Pascal comes with full source for its runtime, but it doesn't include read or write. Instead it includes system.pas, which implements functions ReadInt, WriteInt and so on, and a magical binary system.tps file which is used to teach the compiler how to transform read and write into calls to these functions.

Without that magical file you can not create a working Pascal runtime.

C language standard says you have to have malloc, free and tgmath.h. Otherwise what you have is not [standard] C.

You say that we may throw that away and still have C. No, you can't, because these facilities, if not provided, can not be implemented in [standard] C.

This is, obviously, offtopic, but it's good to understand what, precisely, makes free and realloc special.

Consider the following infamous example:

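#include <stdio.h>
#include <stdlib.h>

int main() {
    int *p = (int*)malloc(sizeof(int));
    int *q = (int*)realloc(p, sizeof(int));
    /* p was passed to realloc: even if the addresses match, using p is UB. */
    if (p == q) {
        *p = 1;
        *q = 2;
        printf("%d %d\n", *p, *q);
    }
}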

The three most popular compilers think this program contains UB, and thus the output 1 2 is valid (as would be any other output). That's because the call to realloc made pointer p invalid.

But of course you can just replace realloc here with an identity function, and then all three would agree there is no UB anymore and the result would be 2 2.
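A sketch of that identity-function variant, for comparison. The names identity and identity_demo are made up, and the result is returned instead of printed so it is easy to check. Here p is never handed to free or realloc before it is used, so both reads are well defined.

```c
#include <stdlib.h>

/* Hypothetical stand-in for realloc: returns its argument unchanged. */
static int *identity(int *p) {
    return p;
}

/* Returns 1 if *p and *q both read 2, 0 otherwise, -1 if malloc fails. */
int identity_demo(void) {
    int *p = malloc(sizeof(int));
    if (p == NULL) return -1;
    int *q = identity(p);   /* q aliases p, and p stays valid */
    *p = 1;
    *q = 2;                 /* overwrites the same int */
    int ok = (*p == 2 && *q == 2);
    free(p);
    return ok;
}
```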

You can not replicate that behavior in [standard] C. You can create a memory allocator, but there is no way to convince the compiler that what you have is, actually, a memory allocator.

That's why the Linux kernel, e.g., is not written in [standard] C. It uses an extended version which does make it possible to declare such a function.

You can, then, say that standard C does have a runtime, but GNU C doesn't, and that would be closer to the truth, but still doesn't make situation clear: does it mean that GNU C (where you can declare kmalloc/kfree and vmalloc/vfree like kernel does) is language without runtime but MSVC C (where it's not possible, compiler just knows that functions called free and realloc are “magical”) is language with runtime? But then how was Microsoft able to write OS with the use of MSVC?

The more you dig the weirder the whole thing becomes. This rabbit hole is plenty deep.

At risk of staying off topic, I'm curious and perhaps others are, I don't understand what you are getting at here.

I see nothing in the language of the C language standard that prevents me from writing my own realloc in C. Neither do I see anything particularly odd about your example.

My understanding is that realloc will allocate some new memory, copy stuff from the old memory space to the new then deallocate the old memory space. That is what the standard says.

I see no reason why that cannot be written in C.
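A rough sketch of the "allocate, copy, deallocate" description, under some loud assumptions. It sits on top of the library's own malloc and free, so it only illustrates the copying logic. The names my_alloc, my_release and my_realloc and the size header are invented for the example, since a caller-written realloc has no other way to learn the old size. It needs C11 for max_align_t.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Size header stored in front of each block; the union pads it so the
   payload that follows stays suitably aligned for any type. */
typedef union {
    size_t size;
    max_align_t align;
} header_t;

void *my_alloc(size_t size) {
    header_t *h = malloc(sizeof(header_t) + size);
    if (h == NULL) return NULL;
    h->size = size;
    return h + 1;            /* payload starts right after the header */
}

void my_release(void *ptr) {
    if (ptr != NULL) free((header_t *)ptr - 1);
}

/* Allocate new storage, copy the old contents, release the old block. */
void *my_realloc(void *ptr, size_t new_size) {
    if (ptr == NULL) return my_alloc(new_size);
    size_t old_size = ((header_t *)ptr - 1)->size;
    void *fresh = my_alloc(new_size);
    if (fresh == NULL) return NULL;       /* old block is left untouched */
    memcpy(fresh, ptr, old_size < new_size ? old_size : new_size);
    my_release(ptr);
    return fresh;
}
```

The old pointer becomes invalid here only because my_release ends up calling the real free. That leaves open the question of what would end the object's lifetime if free itself were home made.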

In your example the memory at p is deallocated during the realloc, as per the standard, so I conclude dereferencing p with *p is undefined behaviour; that memory is gone.

From section 6.2.4 of the C standard:

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

Clearly deallocating an object ends its lifetime and the example is UB.

But how do you write the function which deallocates the memory (therefore complying with standard), without leaving the C land? That's the question, I guess.

My malloc and free can use bytes from a big global byte array, defined in C. All I'm doing is handing out pointers to different positions in that array, keeping track of which bytes from that array I have given to the caller and when I get them back again.
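Something like the following is presumably what is meant; it is a minimal sketch, not anyone's actual allocator. The names pool_malloc and pool_free and the block sizes are invented, and it uses fixed-size blocks to keep the bookkeeping trivial. It needs C11 for _Alignas and max_align_t, and it assumes the block size is a multiple of the strictest alignment, which holds on common platforms.

```c
#include <stddef.h>

#define BLOCK_SIZE  64   /* bytes per block (illustrative value) */
#define BLOCK_COUNT 16   /* number of blocks (illustrative value) */

/* The "big global byte array"; _Alignas (C11) aligns the first block. */
static _Alignas(max_align_t) unsigned char pool[BLOCK_COUNT][BLOCK_SIZE];
static unsigned char in_use[BLOCK_COUNT];   /* 1 = handed out */

/* Hand out the first unused block, or NULL if the request cannot be met. */
void *pool_malloc(size_t size) {
    if (size > BLOCK_SIZE) return NULL;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL;
}

/* Mark the block as unused again; ignores NULL and foreign pointers. */
void pool_free(void *ptr) {
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (ptr == (void *)pool[i]) {
            in_use[i] = 0;
            return;
        }
    }
}
```

From the compiler's point of view, pool_free only flips a flag. The array itself has static storage duration, so the bytes behind a "freed" pointer remain a live object.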

As above but with no byte array defined in C. All I need is a pointer containing the address of memory in my system that I know I can use.

You can not. Or, rather, you can, but you would have no way to tell the C compiler that the memory which you have freed is no longer valid to use.

Precisely. The object which was created by the call to malloc no longer exists after the call to free, but that's precisely because the 7.22.3 Memory management functions part of the standard explicitly says: The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object.

The standard goes further and adds that it's undefined behavior if the value of a pointer that refers to space deallocated by a call to the free or realloc function is used.

That's why that program has UB (note how I checked that p and q are equal before attempting to use them).

But for any other function which doesn't deal with pointers passed to free or realloc this comparison would be valid!

A call to free or realloc doesn't just do something to the object referred to by p; it also makes p invalid!

That is something you can not replicate in your library in [standard] C.

Tell me how that may happen if malloc/free and friends, none of those are part of the actual language specification (as opposed to standard library specification).

A silly positive example of what the compiler can do:

#include <stdlib.h>

int foo() {
    int *p = (int*)malloc(2*sizeof(int));
    p[0] = 40;
    p[1] = 2;
    int q = p[0] + p[1];
    free(p);
    return q;
}

Both clang and gcc can eliminate calls to malloc and free because they know what malloc and free do.

The compiler knows that the object referred to by p is only used in that function and will disappear after the call to free. Which means that this object is ephemeral and can be elided, and then the calls to malloc and free can be removed. That's not possible with a manually-created malloc and free, because the memory that they give out is still accessible after the call to the fake free.
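A tiny sketch of that last point, with invented names and a one-slot "allocator" that is deliberately simplistic. The read after fake_free is well defined, because the storage is a static array that lives for the whole program and is accessed as unsigned char. With the real free, the same read would be UB.

```c
#include <stddef.h>

static unsigned char slot[16];   /* static storage: lives for the whole run */
static int slot_in_use = 0;

unsigned char *fake_malloc(void) {
    if (slot_in_use) return NULL;
    slot_in_use = 1;
    return slot;
}

void fake_free(unsigned char *p) {
    if (p == slot) slot_in_use = 0;   /* bookkeeping only */
}

/* Reads the byte after fake_free(); defined, because slot[] still exists. */
int read_after_fake_free(void) {
    unsigned char *p = fake_malloc();
    if (p == NULL) return -1;
    p[0] = 42;
    fake_free(p);
    return p[0];
}
```

Since a conforming program is allowed to do this, the compiler can not treat the write to p[0] as dead or drop the storage the way it can with the real malloc/free pair.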

Compiler must know that, too.

Okay.

You never “get them back”. Pointers to live objects can not just “go bad” in C. They are still valid till you pass them to free or realloc.

Quite so. I still don't see how making my own homemade malloc and free is a problem. As a programmer using it I have two choices:

  1. Assume the original pointer p is still valid after calling free. As if free were any normal function that did not clobber p.

  2. Assume that the original pointer p is invalid after calling free. As if free were some wild function that could clobber p. Any use of it would be UB.

Now, the language of the C standard specifically tells me I cannot make the first assumption. So, it does not matter if my home made free does something weird with p or not, I'm told I should assume it does. And write code that uses it accordingly.

I agree, the compiler, the optimiser, can make use of the fact that p is always assumed to be clobbered to perform whatever optimisations it likes. It knows I'm not going to use that p again. That is great.

That does not make my home made free any less valid or violate any language of the standard. It's just that the optimiser may treat it as any other function and not optimise around my p. OK, I perhaps lose some performance by using my home made malloc and free.

It makes it different from what the standard provides. Which raises the question: the original language, which includes functions that can not be implemented in C without a runtime… does it include a runtime or not?

I don't believe it does make my home made free different. My home made free can do exactly what the language of the C standard says it should. As a programmer I have used it exactly in accordance with the standard regarding the lifetime of that original p.

As you note, the only difference is whether a compiler implementation can use the information in the standard, about the lifetime of p, to optimise code around my free. Whether it does or does not, as a programmer using that homemade free I have to abide by what the standard says, and it does not matter if optimisation happens or not.

I don't understand the question. What original language?

The one which permitted producing a “nonsense” value which is both 1 and 2 simultaneously, or eliding malloc/free from foo.