I do wish we would not refer to Rc and such as "garbage collection".
If some allocated data is freed at the moment such a reference count hits zero then there is no garbage anywhere to be collected. No more so than when I free() memory in C. Nobody calls that "garbage collection".
My understanding is that "garbage" is memory that is hanging around after the program has ceased using it and waiting to be returned to free memory for reuse.
According to the definition used in computer science, reference counting is a form of garbage collection. Tracing GC is just one kind of GC.
There are many other forms as well: epoch based reclamation, hazard pointers, etc are also forms of GC, and are used in for example the Linux kernel and other high performance environments.
Unless you suggest reserving the term GC for only tracing GC, you are going to have very blurry and arbitrary lines.
Here are some relevant links that may change your mind:
In computer science, reference counting is a programming technique of storing the number of references, pointers, or handles to a resource , such as an object, a block of memory, disk space, and others. In garbage collection algorithms, reference counts may be used to deallocate objects that are no longer needed.
But clearly reference counting does not mean garbage collection. Garbage collectors may use reference counting, not everything that reference counts is a garbage collector.
In the case of Rust we have reference counting in Rc and whatever. We never have any garbage though and there is no garbage collector.
Uh, no. Rc and Arc are examples of garbage collection. You collect the garbage immediately after the last user stops using it. When reference counting is used to manage memory to prevent leaks it is a forms of GC.
Reference counting garbage collection is where each object has a count of the number of references to it. Garbage is identified by having a reference count of zero. An object's reference count is incremented when a reference to it is created and decremented when a reference is destroyed. When the count reaches zero, the object's memory is reclaimed.
This is exactly how Rc and Arc works. And this is the definition used in compsci literature.
As to using reference counting for other uses than GC. That depend on how widely you define GC. What about open file descriptors? Or inodes on disk? All of those is about making sure some resource(s) are only cleaned up once the last user stops using it, and that it is cleaned up immediately after that. That could be said to be a form of GC.
There is precedence for using the term GC about things that are not RAM: Nix has a GC for unreferenced installed packages for example. (I do believe that implementstion is a tracing one, but it is clearly not for main memory.)
Reference counting is a concrete mechanism, which in and of itself is not GC, but when used to manage main memory (and possibly other resources) in order to prevent resource leaks it is GC.
All the examples on Reference counting - Wikipedia (with the possible exception of the file system use case, depending on how you define it) are for garbage collection.
You could even say that explicitly calling free is GC. Therefore all programs use GC except those that leak everything they allocate (or never allocate). This makes the term fairly meaningless.
It's an unfortunate terminology issue, since informally many people say GC when they mean tracing GC. The only solution I know of is to always say tracing GC when that's what you mean.
But I also think it is Ok for tracing GC to be implied by the context. If we're comparing programming languages, it does seem implied to me.
Not for any definition of GC that I'm aware of - GC automatically frees up objects some time after they stop being usable, without an explicit "finished" call such as free or drop.
Reference counting is one way to GC things, because it says that when the last reference to an object goes away (hence the object is not usable), it's immediately freed. Tracing GC is another way to GC things, because it says that when it cannot trace a path from the current stack to the object, it will be freed.
Arguably, the OS’ process cleanup could be considered a form of GC: It’s effectively a form of arena allocator which reclaims whatever resources you’ve chosen to “leak” once your program has completed its work and therefore don’t need them anymore.
For some reason, though, it’s often considered bad practice to rely on this particular OS service.
I was able to shave about 250 ms runtime by targeted use of std::mem::forget, not bad for a program with a total runtime of just under 4 seconds.
If you have a short running command line program that builds up expensive-to-drop collections (anything with many small allocations, any collection of a type with non-trivial drop and many elements) this seems like a clear win to me.
In my case I had a large HashMap of an enum type where some rare large cases required heap allocation (SmallVec in my case, but boxing large and rare variants would also do it). That means that the generated drop glue needs to check every element of the ~90k for if it needs to free that variant. Even though only like 5 of them had associated heap allocations.
I also had some large BTreeMap (needed the ordered property, so couldn't use HashMap). Those are just expensive to free in general.
Another trick I have used to remove freeing from the critical path is punt it to a background thread (in cases where letting the OS clean it up wasn't suitable).
I was once compiling a C++ program into asm.js to run in the browser. The nature of the project was that this program could be run many times interactively. I found that every time it was run a huge lot of memory was not freed when it exited and the asm.js runtime did not clean it up, eventually resulting in a dead web page. Luckily the programs creator fixed his program for me so that it destructed everything before it exited.
That's quite myopic. Rust programs can have plenty of garbage lying around. Reference cycles are one simple example, but one can get plenty of more complex ones with specialized data structures, or even just memory which isn't freed promptly enough. Anything which can be classified as a memory leak is an example of garbage.
Reference counting is an example of garbage collection. How do you even know what is garbage? There are two principled ways to do it: either you trace known live objects or known dead ones. In the former case you start from a set of known live roots, and consider everything unreachable as garbage. In the latter case you start from a known dead object, and consider as garbage anything which is reachable from a known dead object and is definitely not reachable from live ones.
These processes are entirely dual, and a real-world GC will use a mix of both strategies. If you think about the implementation of dead object tracing, you'll end up with a more or less standard reference counting. Naive reference counting can't deal with reference cycles, necessitating a tracing GC. While freeing refcounted data could in principle be deterministic, you don't actually want to do it in a general-purpose system, since deep nesting can cause arbitrarily long pauses. For this reason tracing of dead objects typically isn't deterministic either: the full tracing happens an indeterminate time after the roots are marked as dead, and the collection process runs concurrently with program's execution.
I don't think anyone would object to calling Python a garbage-collected language. But CPython primarily relies on reference counting for all its memory management, and the tracing GC is only occasionally run to collect cycles. In fact, for almost ten years after the first release Python didn't even have a tracing GC, relying purely on reference-counting until Python 2.0.
The core difference of manual vs automatic memory management isn't in determinism, or pauses. Either of those can vary significantly. The core difference is whether the application programmer needs to think about memory management at all, or whether memory management is entirely abstracted away, and memory can be treated as effectively infinite, with safe memory reclamation happening in some invisible way.
Well there we have it. In Rust we have to think about memory management, that what "fighting the borrow checker" is all about. I can't just create an object in a function and return a reference to it, for example. If Rust had a garbage collector then I could do that like I can in JS or whatever. Rust has no garbage collector.
Personally, I feel most people currently think of "GC" as memory reclamation that needs something we would classify as a runtime component; RC causing the compiler to insert a drop call that recurses generally isn't considered enough, but the inserted thread that runs over all the inserted metadata for your types of a tracing GC is well over the line. Certainly, local allocations getting "collected" by the stack pointer getting unwound is not!
If the other types of reclamation like hazard pointers were more commonly known we might see a more defined consensus.
I feel like I've seen ancient C documents referring to free as garbage collection, but a quick Google didn't find anything.
From the perspective of a language user, I have to agree with @afetisov:
Side note: AFAIK this is still the case for Perl.
So a key distinction between these reference-counting scripting languages and Rust is that the scripting language does it all for you invisibly, whereas with Rust you have to opt in with explicit types, clones, etc. It's still in your face. (Also you get a choice in Rust, and it's rare to use it for everything.)
Do you really never have to think about it when it's invisible? No! Sometimes the lack of cycle detection is a problem, and sometimes world pauses is a problem (albeit with a different probability of biting). Garbage collection changes the default of caring about memory management,[1] but can't remove it entirely.
If you wrote code for EPOC16, this was a recommended strategy for short-lived processes; you got signals for the OS running out of memory, and you were advised to delay freeing memory until you got a signal, because if you didn't get a signal, you could save runtime (and hence battery) by asking the OS to free the entire process as a unit.
Don't mix up a language based on a garbage collector, and a language which merely has one available. Rc/Arc are an example of garbage collector, although a rather degenerate one. Would you say that Rust still has no garbage collector even if I literally pull in the gc crate? After all, it isn't built into the language, and you still need to carefully decide which of your objects go into the GC or which are managed in the usual way. And you still can get some nasty borrow checker errors when using GC! Even though the reasons would be different from normal Rust usage.
On the other hand, even C famously has Boehm GC, which uses the usual malloc/free as its interface, but does the actual memory management in the background. That doesn't make C a garbage-collected language: the semantics of the language are entirely independent from an existence of GC, you can avoid using one, and there are actually some nasty interactions between the Boehm GC and the default language behaviour.