Part 3: The alloc module API

This seems to be too subtle for you so let me explain it so you understand :rofl:
When I looked at the API of alloc and saw its return type (yes, the concrete, very specific u8 behind the *mut u8 pointer), it seemed bad/wrong to me. It seemed that way because alloc should allocate for any type, not just u8. @ZiCog drew my attention to the fact that it isn't meant to allocate for that type specifically; it allocates bytes only, not storage for any specific type. And that made sense to me. That's why I said it was the most sensible answer: a change of perspective, from which that API isn't that bad after all.

Because it would mean that Rust is a language with a garbage collector, which is not true. It manages memory without the help of a garbage collector.

Some folks get the impression that Rust comes with "manual memory management", but in my view, nothing could be further from the truth. My position is that Rust comes with "declarative memory management" - and I'll show you why.

Terms are hard. The way I use the terms, Rust doesn't have a mandatory, pervasive, dynamic runtime garbage collector, but it does have static/compile-time garbage management, and std does offer dynamic reference counting.

Would you describe Swift as a garbage collected language? If so, note that its garbage collection is not cycle collecting nor is it deferred; it's just simple reference counting equivalent to Rust's Arc, but implicit, pervasive, and integrated into the language (which allows them to do cool things like +0, +1, and "+½" passing).

The magic of Rust isn't that it "doesn't have garbage collection." The magic of Rust is allowing you to only use dynamic garbage management techniques where they're beneficial, but use static resource management throughout most code. It's in allowing mixing levels; of seamlessly moving between super high-level orchestration code and low-level data structures in the same language.

So yes, both "Rust doesn't have a garbage collector" and "Rust has automatic garbage collection" are true at the same time. I'd perhaps even go as far as to say Rust has multiple garbage collectors: the static one in the borrow checker, and the dynamic one in reference counting.

5 Likes

You misinterpret/misunderstand the terms memory management and garbage collection.

I'm not sure what you mean by the +1/2, but in comparison, Rust can also do +0 (and actually defaults to it) by moving the (A)rc (and +1 by cloning it). In this sense, the fact that Swift's ref-counting is language-integrated doesn't come with any real advantages; after all, Rust's ref-counting library types also use the language-supplied Drop and destructive move mechanism.
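To make the +0/+1 point concrete, here is a minimal sketch using Rc (the same holds for Arc). The function names are made up for illustration; only Rc::new, Rc::clone and Rc::strong_count come from std.

```rust
use std::rc::Rc;

/// Moving an `Rc` hands over the existing reference: the count stays at 1 ("+0").
pub fn count_after_move() -> usize {
    let a = Rc::new(String::from("shared"));
    let b = a; // move, `a` is no longer usable and no counter is touched
    Rc::strong_count(&b)
}

/// Cloning an `Rc` creates a second owner: the count goes up by one ("+1").
pub fn count_after_clone() -> usize {
    let a = Rc::new(String::from("shared"));
    let _b = Rc::clone(&a); // `_b` stays alive until the end of the function
    Rc::strong_count(&a)
}
```

Calling count_after_move() should give 1 and count_after_clone() should give 2, since the clone is still alive when the count is read.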

1 Like

Let me rephrase it: everyone told you that this API was not meant to allocate for specific types, either u8 or any other, and that you would have to think of it as a low-level API to allocate a block of memory for anything, even types that do not have a size known at compile time.

But for some reason, this is now an explanation that you accept.

Maybe the truth is that you don't consider that some things can be opinionated and expressed in terms that are defined differently than in your mind. It does not mean that anyone is wrong, but to discuss this we have to be aligned on the definitions. The example regarding garbage collection is very representative of your attitude. Garbage collection is not specifically tracing garbage collection; reference counting is a type of garbage collection. Without any more details, you cannot assume that Herb Sutter said that shared_ptr is some kind of tracing garbage collector; he never said that. Actually he was specifically speaking about reference counting. But you still consider that this statement makes him a complete clown.
So, it seems that even if we take the time to make the definition of a term precise, you utterly ignore it and consider others complete fools.

For the discussion about alloc, you only now consider that the alloc API is somewhat opinionated and that arbitrary choices have been made with a different objective than the one you would have. @ZiCog found the words to make you accept that, but by explaining the same thing. In other words, he gave you an answer with a different form but the same meaning.

Too bad the first topic is gone now, but you should reread the discussion with your new perspective. Maybe you wouldn't find your attitude so gentle toward people who were trying to help you.

6 Likes

You should avoid such accusations, lest you make a clown of yourself. In all academic literature on the subject, reference counting is a kind of garbage collection, because it allows you to build arbitrary object graphs and avoids any issues with ownership or use-after-free. It's not a particularly efficient garbage collection, because it can't deal with reference cycles or fragmented memory, but those are optimization concerns and not a fundamental property of memory management strategies.

Entire languages, like Python or Swift, are built almost purely on reference counting. Nowadays Python has an extra GC which can collect unreachable reference cycles for you, but it was added long after Python's initial release, somewhere in Python 2 history.

The point of garbage collection is to provide an abstraction of infinite memory which never has to be freed, thus avoiding any memory management issues. In that regard memory leaks with Rc are a non-issue, since memory is infinite anyway. Actual real-world garbage collectors typically balance between deterministic collection with reference counting and nondeterministic but powerful reference tracing. Both can take an unbounded amount of time, depending on your object graph, and both have undesirable failure modes.
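For anyone who hasn't seen an Rc cycle leak in practice, here is a small sketch. Node and the two functions are hypothetical names invented for the illustration; the drop counter is only there so the leak can be observed.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Hypothetical graph node that counts how many nodes were actually dropped.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<Cell<usize>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn node(next: Option<Rc<Node>>, drops: &Rc<Cell<usize>>) -> Rc<Node> {
    Rc::new(Node { next: RefCell::new(next), drops: Rc::clone(drops) })
}

/// b -> a, no cycle: both nodes are freed when the inner scope ends.
pub fn drops_without_cycle() -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let a = node(None, &drops);
        let _b = node(Some(Rc::clone(&a)), &drops);
    }
    drops.get()
}

/// a -> b -> a: each node keeps the other's count at 1, so neither is freed.
pub fn drops_with_cycle() -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let a = node(None, &drops);
        let b = node(Some(Rc::clone(&a)), &drops);
        *a.next.borrow_mut() = Some(Rc::clone(&b));
    }
    drops.get()
}
```

The acyclic version reports two drops, the cyclic one reports none: plain reference counting never notices that the two nodes are unreachable. Replacing one of the links with std::rc::Weak is the usual way to break such a cycle.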

Here's an article for you if you want to study the issue. That link's paywalled, but it shouldn't be hard to find it in other places.

Rust is not C++ and doesn't have typed memory and TBAA. Making an allocator return typed memory could be confusing, implying that the memory could be used only for a single type, which is not the case.

alloc is a very low-level interface, whose only purpose is to define what "a global allocator" means. The only reason you should be interacting with it is if you're writing a global allocator, or if you're reimplementing the alloc crate itself. For all other purposes use higher-level primitives, including Box<T>, which is the primitive way to manage heap-allocated memory in Rust and has some features which are impossible to get otherwise. In my 4 years of writing Rust code, including low-level unsafe code and containers, I have never needed to interact with alloc::alloc directly.
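As a rough sketch of the difference, assuming you really did want to go through the raw API: std::alloc::alloc takes a Layout (size and alignment only) and returns *mut u8, and it is the caller who decides what type lives in those bytes. The two function names are made up for the example.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Round-trips a `u64` through the raw global allocator API.
pub fn raw_roundtrip(value: u64) -> u64 {
    let layout = Layout::new::<u64>();
    unsafe {
        // The allocator only sees size and alignment; it hands back untyped bytes.
        let bytes: *mut u8 = alloc(layout);
        if bytes.is_null() {
            handle_alloc_error(layout);
        }
        // The caller decides which type is stored in those bytes.
        let typed = bytes.cast::<u64>();
        typed.write(value);
        let out = typed.read();
        // Deallocation must be given the same layout that was used to allocate.
        dealloc(bytes, layout);
        out
    }
}

/// The same thing with `Box<T>`: no unsafe, no layout bookkeeping.
pub fn boxed_roundtrip(value: u64) -> u64 {
    let boxed = Box::new(value);
    *boxed
}
```

Both functions return the value they were given; the Box version just leaves the layout computation, the null check and the deallocation to the library.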

That's like saying that a for loop is automatic iteration, because you didn't have to write while let Some(item) = it.next(), or that ? is automated error propagation, because you didn't write the handling match expression.
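Spelled out, the two pairs look roughly like this. It is a simplified sketch: the real desugarings go through IntoIterator and the Try machinery, and the function names are invented for the example.

```rust
use std::num::ParseIntError;

/// Iteration with the `for` sugar.
pub fn sum_for(items: &[i32]) -> i32 {
    let mut total = 0;
    for item in items {
        total += item;
    }
    total
}

/// The same loop written by hand against the iterator.
pub fn sum_while_let(items: &[i32]) -> i32 {
    let mut total = 0;
    let mut it = items.iter();
    while let Some(item) = it.next() {
        total += item;
    }
    total
}

/// Error propagation with `?`.
pub fn double_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let n = s.trim().parse::<i32>()?;
    Ok(n * 2)
}

/// The same propagation written as an explicit `match`.
pub fn double_match(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.trim().parse::<i32>() {
        Ok(n) => n,
        Err(e) => return Err(e.into()),
    };
    Ok(n * 2)
}
```

In both pairs the sugared and the hand-written versions behave the same; the compiler only writes the boilerplate.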

The Rust compiler deterministically inserts drop glue at certain points in the code (at the end of the scope, unless the value was moved). You can't manually call the drop glue for safety reasons, but you can call mem::drop(), which is effectively the same. The compiler automates boilerplate for you, but the memory management is still manual: you need to decide when to allocate memory, how to pass around ownership, and where the memory should be deallocated.
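A small sketch of where the drop glue ends up. Guard, Log and drop_order are hypothetical names; the log only exists to make the order visible.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical guard that records when it is dropped.
struct Guard {
    name: &'static str,
    log: Log,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

pub fn drop_order() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let a = Guard { name: "a", log: Rc::clone(&log) };
        let _b = Guard { name: "b", log: Rc::clone(&log) };
        let c = Guard { name: "c", log: Rc::clone(&log) };

        drop(a); // mem::drop: the programmer picks an earlier point
        let _moved = c; // ownership moves, so no drop glue runs for `c` itself
        log.borrow_mut().push("end of scope".to_string());
        // Drop glue runs here, in reverse declaration order: `_moved`, then `_b`.
    }
    let out = log.borrow().clone();
    out
}
```

The recorded order is "drop a", "end of scope", "drop c", "drop b": every drop happens at a point that can be read off the source.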

Compare it with a garbage collector, where you never think about ownership, memory can be silently and entirely automatically allocated and deallocated at any time, and the value is collected "sometime after it becomes unused".

7 Likes

It seems that whether the compiler performs monomorphization is mainly an implementation detail. In particular, the compiler could decide to never monomorphize.

Never monomorphizing would probably be very bad for performance, as it would mean that for generic functions taking Sized arguments, or returning a Sized value, it would have to pass arguments via main memory, or add some other mechanism to the ABI for a function to dynamically determine the size of its arguments on the stack. This would likely be bad.

However, generic functions whose arguments and return values are all sized and have the same size regardless of generic instantiation are somewhat of a special case. The compiler can decide to monomorphize, or not, as it sees fit, e.g. for these sorts of signatures:

fn alloc<T>() -> *mut T;
fn alloc_array<T>(count: usize) -> *mut T;
fn alloc_aligned_array<T>(count: usize, align: usize) -> *mut T;
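For what it's worth, a typed front end like the signatures above can be sketched on top of the existing untyped API, which shows that the only thing T contributes here is a Layout. alloc_array and dealloc_array are illustrative names, not anything from std, and returning NonNull<T> instead of *mut T is my own choice for the sketch.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Allocates uninitialized storage for `count` values of `T`.
pub fn alloc_array<T>(count: usize) -> NonNull<T> {
    let layout = Layout::array::<T>(count).expect("array size overflows");
    if layout.size() == 0 {
        // The global allocator must not be asked for zero bytes.
        return NonNull::dangling();
    }
    let bytes = unsafe { alloc(layout) };
    match NonNull::new(bytes.cast::<T>()) {
        Some(ptr) => ptr,
        None => handle_alloc_error(layout),
    }
}

/// Frees storage obtained from `alloc_array::<T>(count)`.
///
/// # Safety
/// `ptr` must come from `alloc_array::<T>` with the same `count`,
/// and must not be used afterwards.
pub unsafe fn dealloc_array<T>(ptr: NonNull<T>, count: usize) {
    let layout = Layout::array::<T>(count).expect("array size overflows");
    if layout.size() != 0 {
        unsafe { dealloc(ptr.as_ptr().cast::<u8>(), layout) };
    }
}
```

Every instantiation does the same work with a different size and alignment, which is why a non-monomorphized version that takes those two numbers at run time is conceivable for this particular shape of function.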

To me it seems that the problem is not monomorphization, but whether there even exist any user-defined types which benefit substantially from a 'special' layout, which would be the sole reason to even want to create special monomorphizations of such functions.

That's not really true. This would only be possible if the behavior of such a generic function didn't depend on T. An allocator could, for example, decide to dispatch to specialized allocation algorithms based on the type (e.g., it could try to pack small allocations within a page but not care about that for allocations bigger than a page, or allocate some built-in types such as thread locals from special segments).
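Here is a sketch of the size-based half of that example. Strategy, PAGE_SIZE and both functions are hypothetical, and the 4096-byte page is an assumption.

```rust
use std::alloc::Layout;

/// Assumed page size, for illustration only.
pub const PAGE_SIZE: usize = 4096;

/// Hypothetical placement strategies an allocator might choose between.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    PackedInPage,
    WholePages,
}

/// The decision as an untyped allocator would make it, from the layout alone.
pub fn strategy_for_layout(layout: Layout) -> Strategy {
    if layout.size() < PAGE_SIZE {
        Strategy::PackedInPage
    } else {
        Strategy::WholePages
    }
}

/// The same decision phrased as a generic function whose result depends on `T`.
pub fn strategy_for<T>() -> Strategy {
    strategy_for_layout(Layout::new::<T>())
}
```

This particular dispatch only needs the size, so it can also be done from a Layout at run time. The thread-local case is different: a Layout carries only size and alignment, so dispatching on the identity of the type would need information the untyped API does not receive.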

In what sense are you using "layout"? The layout of a type is not dependent on the monomorphization of any particular generic function, it's decided by the definition of the type.

Yes, it seems to me that this is exactly what is being claimed - by having an allocator API which knows the static type for which you're allocating, you can make these sorts of specializations; and in particular, that such specializations might not depend only on the size and alignment, but also on the exact identity of the static type being allocated. I'm not making this claim, that's just my understanding of what the OP is claiming in this topic.

By "The compiler can decide to monomorphize, or not, as it sees fit" I only mean that in the case of a generic function whose argument and return value sizes don't depend on the generic instantiation, the compiler could monomorphize, or not, without changing the ABI or creating a fancy ABI for passing dynamically sized values; so for such functions, monomorphization is not a serious performance concern (again, not a claim I made, this is the claim made in the OP, and it seems to be a second-hand claim anyway; I don't see the original author of that claim referenced anywhere here).

Yes, I would agree with this. Once again I am responding to the OP, this is just a response to the claim being made here in the OP of this topic, I'm not saying I agree or disagree with that claim (I'm not sure I fully understand it).

All I'm trying to express here is that I don't believe there would be any substantial performance penalty from monomorphizing a function like alloc (as claimed by the OP), and that I would like to see examples of specific types which benefit from the monomorphizing of alloc in the way claimed by the OP (again, not my claim).

I'm very well aware of that. My reason for calling Herb Sutter a clown is the following:
In one of his speeches/talks he made that comparison and forced the notion on a person from the audience, which was unfair and not right. It is not right because Herb Sutter presented shared_ptr as a garbage collector of the same level/magnitude, if you like, as more advanced garbage collection mechanisms. It is like being put in a corner during a discussion about transport and how environmentally unfriendly it is (that's your opinion, and you are in favour of reducing the amount of transport, etc.), and some clown (like Herb Sutter) pops up with a smart-ass statement: "Hey, but don't forget, your own legs are also a form of transport, so if you wanna defend your position you should stop using them too."
And because he obviously has more clout, he is the person everyone knows while nobody knows you, and you most likely don't want or feel like getting into an argument in front of the entire audience, you just sit quietly in your place and that's it. Is such argumentation/behavior during a serious (or supposedly serious) discussion OK with you?

As for the rest of your post... I'm really not sure what your point was. You actually seem to agree with me and my notion of a garbage collector, but I'm sure you wouldn't:

FYI, because you seem to have it wrong: it was actually I who said that the info about a type's layout can be used to advantage during monomorphization, not @H2CO3, whose original claim was that monomorphization would be an actual disadvantage in this scenario.

Could you provide a link to this video and a time where he presents this? I probably saw the talk you are referencing (at CppCon 2016) but I didn't hear anything like this so I am perplexed.

I was looking for it for some time now, can't find it. It was so many years ago.

It's probably too late to matter, but here's one useful bit of terminology that might have helped avoid some of the confusion above:

https://en.wikipedia.org/wiki/Tracing_garbage_collection

Tracing garbage collection is the most common type of garbage collection – so much so that "garbage collection" often refers to tracing garbage collection, rather than other methods such as reference counting

As for "manual" vs "automatic" memory management, those terms definitely get used many different ways which makes them hard to ever use constructively (as this thread demonstrated), so I tend to avoid using them at all. I'm not personally aware of helpfully more specific terms for overall memory management strategies, only terms for specific mechanisms like GC and RAII which are useful but whose meanings subtly vary by language.

6 Likes

I wonder why people are trying to find justification for something that doesn't, really, need a justification.

If you want to see how a flexible, generic allocator API can be done, look at C++.

And if you do that, you'll find out that you can not do it without GATs. C++ used template struct rebind in the C++98 version (and it was quite a pain point, because it took years for compilers to implement GATs properly); C++11 switched to std::allocator_traits (also GAT-based).

And, well, GATs were experimental until Rust 1.65 (released only a couple of months ago).

I think this closes the question of “why Rust allocators are not typed” in a very straightforward fashion: backward compatibility is a bitch. GATs weren't available when allocators were designed, and without them you can not have typed allocators and yet hope to have a somewhat ergonomic interface for HashMap or BTreeMap.
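To show what the rebind business would look like in Rust terms, here is a rough sketch of a typed allocator trait using a GAT (needs Rust 1.65 or later). Everything in it (TypedAllocator, Heap, ListNode, node_allocator) is made up for illustration; nothing like this exists in std, and a real design would of course also have allocate and deallocate methods.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical typed allocator: it knows the element type it allocates.
pub trait TypedAllocator {
    type Item;
    /// The GAT: "the same allocator, but for another element type `U`".
    type Rebind<U>: TypedAllocator<Item = U>;

    fn rebind<U>(&self) -> Self::Rebind<U>;

    /// Stand-in for real allocation logic that depends on the item type.
    fn item_size(&self) -> usize {
        size_of::<Self::Item>()
    }
}

/// Hypothetical allocator handle parameterized by element type.
pub struct Heap<T>(PhantomData<T>);

impl<T> Heap<T> {
    pub fn new() -> Self {
        Heap(PhantomData)
    }
}

impl<T> TypedAllocator for Heap<T> {
    type Item = T;
    type Rebind<U> = Heap<U>;

    fn rebind<U>(&self) -> Self::Rebind<U> {
        Heap(PhantomData)
    }
}

/// A container's internal node type; users only ever name `T`.
pub struct ListNode<T> {
    pub value: T,
    pub next: usize,
}

/// What a list would do: turn the user's allocator for `T` into one for its nodes.
pub fn node_allocator<A: TypedAllocator>(a: &A) -> A::Rebind<ListNode<A::Item>> {
    a.rebind::<ListNode<A::Item>>()
}
```

A container is handed an allocator for T but actually needs to allocate its own node type, and Rebind<U> is the piece that lets it express that. Without GATs there is no direct way to write that associated type.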

And the nail in the coffin is the need not just for GATs but for specialization: in C++, with its duck-typing metaprogramming approach, it's plenty easy to create an allocator which deals with certain types in a special fashion while delegating the work for most other types to the “standard allocator” (e.g. you can create a pair of types: a vector which doesn't release memory in its destructor, and a special-purpose allocator for that vector which makes that memory available to its constructor; then you may hope to save some CPU cycles if you create and delete such vectors often).

There typed allocators are actually useful.

But in Rust, where your function can not be specialized, and especially in Rust before 1.65, where one allocator can not easily create another allocator when HashSet or BTreeSet needs it… this would be both almost impossible to do and also pretty pointless.

Maybe some imaginary Rust 2.0 in the bright future, where GATs are widely used and specialization is stable, will provide a different allocator interface… but today… I think what we have is adequate.

3 Likes
