What is the biggest difference between Garbage Collection and Ownership?

That's the key fallacy and main reason I consider tracing GC a plague.

The programmer is part of the memory management system in most of today's languages, GC or no GC.

We have only started dabbling with some languages which have some rudimentary ideas about how one can free the programmer from this burden (Haskell or Prolog are examples of such languages), and, well… they don't even work.

I mean, yes, they kinda work as programming languages and you can write working programs in them, but even in languages like Haskell or Prolog the programmer is very much a part of the memory management system.

Only now the programmer no longer manages memory directly but manages the subsystem which was supposed to free him (or her) from the need to be a member of said team. In practice this usually works worse than manual management.

That is what people are missing when they say that:

The main overhead of GC comes not from the GC itself but from the desire to offload the responsibilities of the memory management system onto someone or something else. When people use GC-based languages they often feel that the GC frees them from the need to think about memory management. But nothing could be further from the truth.

I remember a story where a team was struggling with the speed of a certain component and I proposed to improve performance by rewriting it with careful attention to the details of how data is structured in memory. And then they asked me if I had used perf, flamegraph, etc., because they had used these and achieved pretty good progress, around 10-15% speedup each quarter for 4 years…

My answer was “I have achieved a 10x speedup with this rewrite, which is equivalent to all 4 years of your work… if I could have 4 more years to learn to use these perfectly I might squeeze a few more percent out of it”.

Somehow they were not amused and I didn't get permission to spend 4 more years trying to achieve additional speedup, but you get the idea: you cannot retrofit performance or reduce memory usage well if you haven't considered yourself a part of the memory management system from the very beginning.
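To put rough numbers on that comparison, here is a minimal sketch. It assumes the quarterly gains multiply (each quarter is 10-15% faster than the previous one) and that 4 years means 16 quarters; the function names are made up for illustration.

```rust
/// Cumulative speedup after `periods` rounds of `gain_per_period` improvement
/// (0.10 means 10% faster each round), assuming the gains multiply.
fn compound_speedup(gain_per_period: f64, periods: i32) -> f64 {
    (1.0 + gain_per_period).powi(periods)
}

/// Range of cumulative speedups for 4 years of quarterly 10-15% gains.
fn four_year_range() -> (f64, f64) {
    let quarters = 4 * 4; // 16 quarters in 4 years
    (compound_speedup(0.10, quarters), compound_speedup(0.15, quarters))
}
```

Under those assumptions the incremental work lands somewhere between roughly 4.6x and 9.4x, which is in the same ballpark as a single 10x rewrite.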

And if you do that then you don't really need a GC. Especially a tracing GC. In fact, if you do that you would actively prefer Rust to languages like Go or Java. Because Rust doesn't hide things from you (except for the cases where it usually does better than a human, like with the layout of structs).
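As a small illustration of the struct layout point, the sketch below compares the same three fields under `#[repr(C)]` (declaration order is kept) and under Rust's default representation (the compiler is free to reorder). The struct names are hypothetical, and the default layout is unspecified, so the exact size of the second struct is not guaranteed.

```rust
use std::mem::size_of;

/// Fields kept in declaration order, as a C compiler would lay them out.
#[repr(C)]
#[allow(dead_code)]
struct DeclaredOrder {
    a: u8,
    b: u32,
    c: u8,
}

/// Same fields with Rust's default representation, which may reorder them.
#[allow(dead_code)]
struct DefaultOrder {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns (size with repr(C), size with the default representation).
fn layout_sizes() -> (usize, usize) {
    (size_of::<DeclaredOrder>(), size_of::<DefaultOrder>())
}
```

On common targets where `u32` is 4-byte aligned, the `repr(C)` version needs padding after both `u8` fields, while the default layout can typically pack the two `u8` fields together and come out smaller.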

But yes, sometimes languages with GC are adequate. When you, somehow, know in advance that you would never need to try to fight excessive memory consumption or excessive slowness.

And my experience shows that people overestimate the capabilities of modern computers. A lot. Basically: 9 times out of 10 when someone was saying “let's write that part in Python, it wouldn't ever be a bottleneck”, a year or two down the road we needed to do something about resource consumption.

I guess if you work in a “modern” fashion and switch teams before the consequences of your choices can hurt you (so that they hurt someone else instead), GC may be a good thing.

Otherwise… no.

7 Likes

Hmm.... Thing is, as far as I can tell, this is not true. Even if one's programs are written in Python we have the facts that:

  1. Most Python code that is of any use is dependent on libraries, numeric, AI, or whatever, that are actually implemented in C or C++. So of course it is fast enough, only a few percent of Python is actually taking the time.

  2. Of course Python runtimes are written in C or C++.

So we can see, Python is slow and sucks memory. Thus consuming energy and contributing to global warming. Is that a problem? Perhaps not if it is restricted to its role of managing other code written in other languages that do most of the actual work.

By the way, your quote about theory and practice is much older than Einstein: In Theory There Is No Difference Between Theory and Practice, While In Practice There Is – Quote Investigator

4 Likes

This is so very true.
To me that means that Rust's way of managing memory also does a better job of managing programmers' expectations than runtime GC as a whole has since its emergence in real world systems.

I also just thought of something else. Runtime GC generally only manages memory, which makes it a half-measure and leads to language features such as with-style blocks in Lisps, which e.g. Python and Java have taken a page from.
And while that mostly works, I find Rust's philosophy on that both to be more elegant (1 system that manages all the resources e.g. handles to open files, not just memory) and more flexible, since in Rust I can open a file, bind that to a local, and then move the local e.g. by returning it.
When using with-style resource management, the resource binding is tied to the lexical scope introduced by the with-statement.
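A minimal sketch of that difference, with hypothetical function names: the file is opened in one function, bound to a local, and then moved to the caller by returning it, so its lifetime is not tied to the scope that opened it.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Opens a log file, writes a header, and hands the still-open handle
/// to the caller. Ownership moves out with the return value.
fn open_log(path: &Path) -> io::Result<File> {
    let mut file = File::create(path)?;
    file.write_all(b"# log start\n")?;
    Ok(file) // moved out of this scope, so it is not closed here
}

/// The caller decides how long the file lives.
fn write_entries(path: &Path) -> io::Result<()> {
    let mut log = open_log(path)?;
    log.write_all(b"entry 1\n")?;
    Ok(())
} // `log` goes out of scope here and the file is closed
```

With a with-style block the equivalent of `open_log` would have closed the file before returning, so the caller would have to reopen it or restructure the code.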

8 Likes

Yeah, the combination of RAII + affine types means stuff like closing files or freeing memory is handled really nicely in Rust.

I dislike how languages like C# or Python require you to use some sort of is_closed or is_disposed flag internally because you are manually disposing of resources and it's possible to call self.close() multiple times. C# even has System.ObjectDisposedException and the convoluted dispose pattern just for this.

C++ gets most of this right with RAII and destructors, but the presence of move constructors or close() methods means you can move something and still accidentally use the old (now logically disposed of) variable and run into issues. This doesn't really happen in practice, but that's more because of smart humans/best practices than the language stopping you from shooting yourself in the foot.
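Here is a small sketch of what RAII plus move semantics buys you. `Handle` is a hypothetical resource whose cleanup is counted so the effect is observable; nothing here is a real library API.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource; `closes` counts how many times cleanup ran.
struct Handle {
    closes: Rc<Cell<u32>>,
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Cleanup lives in one place and needs no `is_closed` flag.
        self.closes.set(self.closes.get() + 1);
    }
}

impl Handle {
    /// Takes `self` by value, so the caller's binding is consumed.
    fn close(self) {
        // `self` is dropped at the end of this function.
    }
}

/// Closes a handle once and reports how many times cleanup ran.
fn close_once() -> u32 {
    let closes = Rc::new(Cell::new(0));
    let handle = Handle { closes: Rc::clone(&closes) };
    handle.close();
    // handle.close(); // error[E0382]: use of moved value: `handle`
    closes.get()
}
```

Because `close` consumes the handle, calling it twice or touching the handle afterwards is rejected at compile time, and the destructor does not run again for the moved-from binding.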

8 Likes

Bear in mind, of course, that Rust still has to have such flags for cases where its hands are tied by the OS APIs.

fclose is a good example:

  • fclose can return failure, so it's necessary to expose an API for people who want to handle that failure somehow.
  • The fclose documentation says "In either case, any further access (including another call to fclose()) to the stream results in undefined behavior."

1 Like

I don't think so.

For a real world counter-example, the std::fs::File type manages the state of a file completely using the type system with no such flags.

On cfg(unix) targets,

  • std::fs::File is a wrapper around std::sys::unix::fs::File
  • std::sys::unix::fs::File is a newtype around FileDesc
  • std::sys::unix::fd::FileDesc is a newtype around OwnedFd
  • std::os::fd::owned::OwnedFd is a wrapper around RawFd, and
  • std::os::fd::raw::RawFd is just a type alias for std::os::raw::c_int

By using type states to represent whether a file is open or not, the standard library can avoid all the is_closed flag shenanigans (i.e. it's not possible to have an unopened std::fs::File and the only way for fclose() to be called is by dropping the File).

It's also impossible to use fclose() to break std::fs::File's invariants because as_raw_fd() just gives you a borrowed file descriptor (which means you can't satisfy fclose()'s safety requirement that it won't be touched again). Using into_raw_fd() to get the underlying file descriptor means the programmer takes it upon themselves to uphold the invariants (which is totally legitimate and consumes the std::fs::File object in the process).
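The difference between the two accessors can be sketched like this on a Unix target. The function name is hypothetical, and `from_raw_fd` is used only to hand the descriptor back to a `File` so that it still gets closed.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd};
use std::path::Path;

/// Returns true if both accessors reported the same descriptor number.
fn raw_fd_round_trip(path: &Path) -> io::Result<bool> {
    let file = File::create(path)?;

    // Peek at the descriptor: `file` still owns it and would close it on drop.
    let peeked = file.as_raw_fd();

    // Transfer ownership: `file` is consumed and its destructor will not run.
    let owned = file.into_raw_fd();

    // SAFETY: `owned` came from `into_raw_fd` just above, is still open, and
    // nothing else owns it, so handing it back to a `File` is sound.
    let mut restored = unsafe { File::from_raw_fd(owned) };
    restored.write_all(b"still open\n")?;

    Ok(peeked == owned)
} // `restored` is dropped here, closing the descriptor exactly once
```

After `into_raw_fd()` the original `File` binding no longer exists, so whoever holds the raw descriptor is the only party responsible for closing it.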

If we wanted to add an explicit fn close(self) -> Result<(), std::io::Error> method to File then it's pretty trivial to use into_raw_fd() to consume the File without calling its destructor then call fclose() ourselves. Note that this method takes self by value instead of by reference (which would require us to track the open/closed state at runtime).
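A portable sketch of the by-value `close` idea follows. `ClosableFile` is a hypothetical wrapper, not a standard library type. To stay within std it does not go through `into_raw_fd()` and a platform close call; instead it surfaces pending I/O errors with `sync_all()` and lets the normal drop close the handle, so it will not report an error from the close itself.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical wrapper, not part of the standard library.
struct ClosableFile {
    inner: File,
}

impl ClosableFile {
    fn create(path: &Path) -> io::Result<ClosableFile> {
        Ok(ClosableFile { inner: File::create(path)? })
    }

    fn write_all(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.inner.write_all(bytes)
    }

    /// Consumes the wrapper, so no runtime open/closed flag is needed.
    fn close(self) -> io::Result<()> {
        // Report errors from flushing data to disk before the handle goes away.
        self.inner.sync_all()?;
        Ok(())
        // `self.inner` is dropped on return, which closes the file.
    }
}

fn write_and_close(path: &Path) -> io::Result<()> {
    let mut file = ClosableFile::create(path)?;
    file.write_all(b"done\n")?;
    file.close()
    // A second `file.close()` would not compile: `file` was moved above.
}
```

Had `close` taken `&mut self` instead, the wrapper would need an `Option<File>` or a boolean to remember whether it had already been closed.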

6 Likes

Fair point. I'm waiting for some melatonin to kick in after waking up to remedy "too hot to stay asleep" mixed with "jet-lagged", so I didn't have the presence of mind to consider the typestate option.

4 Likes

You (and others in the discussion) might be interested in Racket's wills, which allow arbitrary finalizer procedures (and the accompanying executors, which allow some control over when the wills are executed, as long as it is after the associated value is unreachable according to the GC), or in custodians, which can manage or forcibly close other kinds of resources. Neither is limited to lexical scope, since the former associates a runtime will with a runtime value in a runtime executor, and the latter is controlled by the dynamically-scoped current-custodian.

Of course, between these and/or dynamic-wind, you can build with-style lexical resource-management forms (e.g., with-input-from-file).

Of particular interest is that all of these exist in a garbage-collected language. I think wills actually allow you to hook arbitrary resources into the GC (in the sense that those wills are ready to execute when the GC says so), and you have explicit control over when you choose to execute the wills (different from a traditional GC pause, over which you still have no control).

1 Like

Those are nice options to have available, if I were ever inclined to work with a member of the Lisp family again.

That said, the concept of finalizers has left a vile taste in my mouth from my Java days. The main problem there is that there was no absolute guarantee that finalize would run, thereby making the entire thing pointless since you may or may not leak resources depending on the "mood" of the JVM at that moment.

I don't know if the same is true for wills et al; they'd only be useful insofar as the finalizers/wills have a hard guarantee to be run.

3 Likes
