Inversion-of-control resource management


#36

… more splitting? We’re talking about a function that takes a slice, and getting two slices back. You can then split those slices in half, too, until you’re just splitting a slice with one or zero items. :slight_smile: The Rust standard library has a pure safe implementation a binary search done like this.

Python does bounds checking, too. I’m going to take a wild guess that the Java services you’ve had the misfortune of working with were enterprise software, and were slow despite running on HotSpot, not because of it.

Pointer chasing that comes with using way too many objects will kill you, as will thrashing the GC by spawning way too many short-lived objects.


#37

Okay I didn’t understand that the implementation is employing a raw pointer which turns off all borrow checking. So that is why thought it was employing a borrow checked pair, but now I understand it instead turns off all safety in user land code.

If the implementation allows overlapping ranges then it is no longer safe. Perhaps some algorithm may require overlapping ranges.

The borrow checking is doing nothing to help insure safety with multiple mutable iterators. We are relying on the semantics of the implementation which the borrowck can’t check.

— no reply required to the following, just expressing my philosophy as it currently stands –

What I am trying to explain is that philosophically I support a PL design principle that type systems are valuable when they aid documentation, clarify, simplify, and promote consistency— not when they must be routinely avoided and casted away. Type systems can’t and shouldn’t have the goal of catching all bugs. Of course I want a compiler to check for errors that don’t require a globally enforced contract which is routinely casted away (especially silently).

Afaics, global lifetime and borrow checking is fundamentally fighting against the fact that the universe is composed of partial orders, not a total order. And for what gain? To guarantee non-aliasing contracts where it hasn’t been SILENTLY turned off with raw pointers. Where is the consistency, clarity, etc? Afaik, the significant gains from tracking mutability come from partitioning to enable parallelism or immutability which enables concurrent scaling, but these are typically high-level, macro level design decisions. A low-level check that attempts to enforce a global, total order seems to me to be incongruent with the physics of our existence.

I do roughly envision a possibly very significant advantage of compile-time checking the borrowing of memory lifetimes local to the function body, if that can help offload from GC the destruction of temporary objects and if it won’t require lifetime parameters. And that is because thrashing the GC with temporary objects appears to be one of the most significant weaknesses of GC. That is only enforcing a partial order, and not attempting to compile-time check a global consistency which can’t exist. In short, I don’t want a type system that silently lies to me very often. As another poignant example, subclassing is an anti-pattern because it routinely lies about the implicit invariants.

I like typeclasses without subtyping, because afaik it doesn’t ever silently lie, unless you cast it away somehow. The way that you cast away traits is by destructuring the trait object with RTTI employing some cast to data type such as a match-case or instanceOf. But at least it is explicit and not silent in every instance. This becomes an anti-pattern in the sense that it disallows extension in one dimension of the Expression Problem. That is why I devised that complete solution to both dimensions of the Expression Problem.

P.S. I was drawn to Rust by the excellent typeclasses and Crate system. It is unfortunate that I can’t adopt Rust for my current project because:

  1. The emphasis on these low-level global borrow checking for resources lifetimes and mutability.
  2. Lack of GC semantics.
  3. Lack of runtime for asynchronous programming.
  4. Lack of Javascript compile output target (which would provide the JIT compilation portability I need).

I do realize the above are due to Rust’s different focus and use case target.


#38

split_at_mut cannot be used to alias mutable pointers, and is thus a safe API. It turns off safety within itself, as part of the implementation of slices, but whatever algorithm you implement on top of it is safe.

split_at_mut cannot be used to alias mutable pointers, and is thus a safe API. The type system is not powerful enough to prove that the function itself is correct, but borrowck is verifying that the function that uses it is correct (because the returned two slices have the same lifetime as the one input lifetime, the two returned slices cannot be dangling).

The implementation of spit_at_mut is not being checked, but borrowck is still checking that whatever user code calls split_at_mut is correct.

Am I misunderstanding you, or are you claiming that Rust’s ownership model is a list rather than a tree? Because I’ve always understood that an owner can own multiple things, and I’ve actually worked on large Rust projects. Rooted trees are partial orders, not total orders.


#39

Actually, that’s fairly similar to how Servo manages its layout tree. The big persistent data structure is a root BlockFlow with some number of child Arc<Flow>s. It uses Arc instead of a true GC, but that’s a fairly minor difference.


#40

Since borrow checking is turned off by the raw pointers employed by split_at_mut, what can prevents invoking it more than once on the same slice? I presume the mutable reference to the slice is not borrowed by the split_at_mut since the raw pointers employed by split_at_mut are said to turn off borrowck.

How can borrowck know that split_at_mut mutably references the slice if it doesn’t check what a raw pointer references? Are you saying that split_at_mut inputs a move of a reference from the slice and indicates to the borrowck that it moves that reference to the output which contains two raw pointers?

Also I noticed the code sample you provided seemed to show the front and back raw pointers being assigned to two different instance variables. Since you say these can be split further, how can the borrowck track that these are only split once each, given your example split the mutable reference into two instances? It sounds to me like you are claiming that borrowck allows a tree of mutable reference duplicates for as long as the split_at_mut claims that these are disjoint?

If the last sentence of the prior paragraph is the way split_at_mut works, then I still ask how to implement an algorithm that requires overlapping mutable ranges? For example, if we invoke two separate algorithms (or same algorithm twice with different input criteria) in which each return a list of mutable references to the same collection, and we keep both results around, then how can we be sure that the two lists are disjoint?

The point is that afaics there is no way to express general abstraction by tracking the borrowing of mutability in the way that Rust wants to.

I don’t understand how it can do this, per my statements/questions/bewilderment above. Especially it seems to be impossible in the case of general abstraction.

I appear to be claiming the opposite, that Rust checks borrowing in a tree of disjointness, but that is a total ordering of disjointness whereas in real life we need multiple partial orders (e.g. a list or lists per my example above) on disjointness per the example I provided above.


#41

Afaics there is no programming language available that has all the following features:

  1. Typeclasses (Rust’s traits) instead of that anti-pattern subclassing. Including trait bounds on polymorphism and trait objects.
  2. Polymorphism of types and functions.
  3. GC every where except ability to use borrow checking to track lifetimes of temporary objects to free them without burdening the GC in many cases. And copying some of v8’s techniques for improving the robustness of handling temporary objects for those cases that fall through to the GC.
  4. Asynchronous programming, e.g. yield.
  5. Thread pools and work stealing.
  6. Compilation to a portable JIT target, such as Javascript, Java, or I believe now LLVM (but LLVM’s coverage is I think not yet stable on all architectures?). Also afaik LLVM doesn’t support yield.
  7. Preferably compilation to Javascript so it can also run in the browser for HTML5 games, etc…
  8. A module and packages system like Crates. Node.js and NPM don’t count for me, because there is no strong typing in Javascript.
  9. Preferrably the ability to native (without FFI) code in low-level similar to C. For example, this could be compiled to Javascript employing Emscripten and ASM.js. So perhaps I am just talking about unifying some steps that would be required to do that orthogonally with the tools and methods we have now. I haven’t actually dug into what the design would look like, but I think it is ridiculous that in Java and Javascript I can specify unsigned machine level data types such as u8 and u32. And in Javascript I can’t layout a struct.
  10. Not a gobbledygook clone of Haskell’s syntax and philosophy. Otherwise PureScript would be reasonably close to what I want, except lacking #4, #5, #9, and #10.

Those are the features I feel I need to see in order to be confident I am coding to the next mainstream programming language which will eat most of the other languages out there.

Go, Swift, and Rust each have parts of the above. On further reflection, Scala appears to be closest but with many of Java’s plus Scala’s additional warts (<— worth listening to 5 - 10 minutes).

  • PureScript is lacking #4, #5, #9, and #10. Is it also lacking #8? It obtains something like #4 with functional programming coroutine, but IMO that is more gobbledygook.
  • Go is lacking #1, #2, and #7. Is it also lacking #6 and #8?
  • Swift is lacking #3, #7, and #9. Is it also lacking #4, #5, and #8? And is LLVM not a robust #6?
  • Rust is lacking #3, #4, and #7. Is it also lacking #9? And is LLVM not a robust #6? And it adds total ordering on borrows which in my current understanding (open to change if I’m incorrect on some technical points) I don’t agree with.
  • Scala is lacking #9. Appears #4 and #5 are only attained via a library such as Akka, so it may add another layer of obtuseness. Afaics Scala sort of has modules but doesn’t have all of the features of Crates. Scala’s variant of typeclasses are achieved with implicits and objects.

I’m presuming the mainstream wants GC instead of Rc / Arc, because:

  • the latter burdens the programmer with reasoning about circular references
  • there is some reason to doubt whether the latter is more performant than the former (either throughput or memory consumption) in the single-threaded Rc case and certain the latter is less throughput in the multi-threaded Arc case. I am presuming GC optimizations cited in this thread, and also that refcounting will not free circular references so it can end up consuming memory that GC would deallocate. I mean to imply they are probably (guesstimated) roughly equivalent performant in the single-threaded case on average for all programs.
  • the latter can’t become more performant (at least not without changing all the code that is employing it), i.e. there is little improvement we can make. GC can undergo much optimization, research, and tuning between the language’s semantics and in many cases without needing to change the code that is employing it.
  • the latter can have huge latency pauses as well. GC can be designed to tradeoff throughput to make guarantees about maximum latency pauses.

I’d love to know what other people think. I need to find some forum where I can get feedback on programming languages. I lost access to LtU and I don’t think their focus is from the perspective of the actual programmers, i.e. they will give a theoretical bias and not a “what we need for our daily work” bias (and a poignant example). If anyone can suggest an appropriate forum for me, that would be much appreciated.

Edit: someone told me Haskell has everything I listed. I know it doesn’t have #10 and I just don’t think most programmers will shift to Haskell’s obtuseness, and also afaik the lazy evaluation and immutability every where fights against the way most programmers currently reason about programming. Also the compilation to Javascript probably looks unintelligible. Also I recently read that ghc can completely change the algorithm to try to optimize away the cost of lazy evaluation thunks. Also coding imperatively in Haskell with the monadic pattern and $ looks incredibly noisy to me.

We know Haskell won’t be preferred by many programmers, because it has been around a long time and is still stuck at 0.5% market share.

Edit#2: metrics of language popularity:



http://www.tiobe.com/tiobe_index
http://redmonk.com/jgovernor/2015/07/31/programming-language-rankings-summer-2015/

http://trendyskills.com/

I find it noteworthy to compare “IEEE” and “Trending” on only language for mobile, and to see that programmers are forced to trend to C/C++ to get around Java warts, and the only other trend is Swift/Objective C. This indicates to me that there is a hole that remains unfilled in this design space. Programmers aren’t satiated with one language on mobile.

Edit#3: I just realized Paul Graham writes some of the same points that have been in my mind:

…designing programming languages is like designing chairs: it’s all about dealing with human weaknesses.

Most of us hate to acknowledge this. Designing systems of great mathematical elegance sounds a lot more appealing to most of us than pandering to human weaknesses. And there is a role for mathematical elegance: some kinds of elegance make programs easier to understand. But elegance is not an end in itself.

And when I say languages have to be designed to suit human weaknesses, I don’t mean that languages have to be designed for bad programmers. In fact I think you ought to design for the best programmers, but even the best programmers have limitations.

[…]

…a lot of the best ones were languages designed for their own authors to use, and a lot of the worst ones were designed for other people to use.

The argument for designing languages for bad programmers is that there are more bad programmers than good programmers. That may be so. But those few good programmers write a disproportionately large percentage of the software.

[…]

Give the Programmer as Much Control as Possible.

Many languages (especially the ones designed for other people) have the attitude of a governess: they try to prevent you from doing things that they think aren’t good for you. I like the opposite approach: give the programmer as much control as you can.

[…]

Aim for Brevity.

Brevity is underestimated and even scorned. But if you look into the hearts of hackers, you’ll see that they really love it. How many times have you heard hackers speak fondly of how in, say, APL, they could do amazing things with just a couple lines of code? I think anything that really smart people really love is worth paying attention to.

And it’s not only programs that should be short. The manual should be thin as well. A good part of manuals is taken up with clarifications and reservations and warnings and special cases. If you force yourself to shorten the manual, in the best case you do it by fixing the things in the language that required so much explanation.

[…]

Speed Comes from Profilers.

Language designers, or at least language implementors, like to write compilers that generate fast code. But I don’t think this is what makes languages fast for users. Knuth pointed out long ago that speed only matters in a few critical bottlenecks. And anyone who’s tried it knows that you can’t guess where these bottlenecks are. Profilers are the answer.

Language designers are solving the wrong problem. Users don’t need benchmarks to run fast. What they need is a language that can show them what parts of their own programs need to be rewritten. That’s where speed comes from in practice. So maybe it would be a net win if language implementors took half the time they would have spent doing compiler optimizations and spent it writing a good profiler instead.

Edit#4: I’d like to correct the tradeoffs of programming language paradigms. I think typeclasses (TC) subsume OOP+AOP providing the benefits of both without their drawbacks, especially when my proposal for a complete solution to the Expression Problem is included:

Pure FP…Imperative…OOP…AOP…TC
…✓…✗…✓…✓…★…Composability/Encapsulation
…✓…✓…✓…✓…★…Extensibility
…✗…★…✓…✓…✓…Flexibility
…★…✗…ⁿ/ₐ…ⁿ/ₐ…ⁿ/ₐ…Scalability (e.g. parallelism, concurrency)
…★…✗…✗…✗…✓…Verifiability (algorithmic reasoning)

So this is why I prefer a design philosophy of encapsulation of interfaces with FP+TC where possible, yet retaining imperative where necessary. I am currently thinking best to throw OOP+AOP and even subtyping into the trash can.


Blog Post: Lifetime Parameters in Rust
#42

Raw pointers don’t “turn off” the borrow checker. Raw pointers are never borrow-checked, but all references are always borrow-checked. split_at_mut takes a reference and returns two references, and the borrow checker enforces the lifetimes of these references.

The type of split_at_mut is for<'a, T> fn(&'a mut [T], mid: usize) -> (&' mut [T], &' mut [T]), which means the mutable reference passed in must have the same lifetime as the mutable references that it returns. And therefore no other mutable borrow can be made of the same data during that lifetime. This code:

// DOES NOT COMPILE
fn f<T>(v: &mut [T]) {
    let len = v.len();
    let (a, b) = v.split_at_mut(len/2);
    let (c, d) = v.split_at_mut(len/3);
}

produces this error message at compile time:

error: cannot borrow `*v` as mutable more than once at a time [--explain E0499]
 --> main.rs:8:18
7 |>     let (a, b) = v.split_at_mut(len/2);
  |>                  - first mutable borrow occurs here
8 |>     let (c, d) = v.split_at_mut(len/3);
  |>                  ^ second mutable borrow occurs here
9 |> }
  |> - first borrow ends here

It sounds to me like you are claiming that borrowck allows a tree of mutable reference duplicates for as long as the split_at_mut claims that these are disjoint?

Right, split_at_mut is safe because it can only return disjoint slices. It is correct that safe Rust does not allow overlapping mutable slices to exist at the same time.


#43

Okay so then I can’t do some algorithms per the example I had provided.


#44

I have a question about this, how do you join the returned iterator to the collection?


#45

Have you read this about PureScript


#46

Just read it for the first time, but I had already shared my solution to accomplish the same thing in Javascript:

Refer to my JavaScript code examples near the end of the linked post.


#48

Borrow checking treats function signatures as opaque black boxes, so the fact that split_at_mut uses unsafe code internally isn’t really relevant. It’s still enforcing type safety at its boundaries because it says that it’s taking one reference and giving two references with the same lifetime (the backing array remains borrowed until the lifetime of the reference it returned ends, not just until the reference proper is unused):

fn split_at_mut<'a>(&'a self) -> (&'a Self, &'a Self)

And since that is, indeed, what it does, it’s not lying, any more than Haskell’s claim that + takes two operands of type Int and returns a third Int is lying. Sure, it’s implemented in assembler and thus not subject to Haskell’s type checking, but that’s not really relevant.


#49

I think you’ve been around here long enough to understand that not only is Rust not intended as the One True Programming Language, most of us don’t even think such a language can exist.

For example, you’re not going to get a no-overhead C FFI if you have a precise GC or stackless coroutines, because integrating C with a precise GC involves wrappers that look like JNI, and integrating C with stackless coroutines requires that you have two separate call stacks.


#50

The distinction of course is that the compiler checks what it is telling the programmer it checks, which is that both operands to + are of compatible types where a + operation is defined. The compiler is making no other contract. Whereas, the borrow checking is a contract about the inability to alias, yet the borrowck is not enforcing that when the programmer escapes to using raw pointers. The programmer of the raw pointer implementation is assured the borrowck won’t allow his interface to be aliased, but the user of that checked interface has to presume that the programmer of the implementation is compliant behind that interface which employs the raw pointer. But that isn’t even my main point, which is moreover that we are forcing the borrowck everywhere in a total ordering tsuris for the user, but then needing to escape out of it, and not only encapsulated in a tree of checked disjointness but also in cases where the borrowck will somehow have to be entirely subverted I guess by putting raw pointers in userland code (see my quote of myself below).

And when Circle / Square inherits from Ellipse / Rectangle in an OOP language, the subclassing paradigm is not lying about the fact that the invariants of Circle / Square are violated by the invariants of Ellipse / Rectangle in the mutable case (the error doesn’t exist in immutable case).

I don’t think there is one language that can satisfy everyone’s preferences. For example, programmers will have different preferences on the mix of FP, imperative, immutability, low-level control, and high-level reasoning. I did mention a set of features that I think would be popular in a language, that currently don’t exist as a complete set in any one language (although Haskelll and Scala get close but with idiosyncrasies and issues that cause me to think they are not going to be popular).

My apprehension regarding Rust was mostly merely arguing that I posit that aiming for total orders is an anti-pattern adding a lot of tsuris for very little gain{1}, and that it creates a frequent case where it is saying it has checked the invariants but has in fact delegated the checking to the programmer via the escape hatch of raw pointers, and in other cases we can’t even do certain algorithms except by putting raw pointers in userland code. I may be wrong about the relative gain and the relative tsuris. I will merely quote again my most relevant statements and leave it as an open question that has not been thoroughly quantified. The market will quantify it over time. I don’t think we are omniscient enough to know with certainty, but again I offer the following quotes as my insight or intuition on the matter and they may well be wrong or too one-sided or … please remember I am all thumbs up and ecstatic about Rust’s typeclasses and Crates.

{1} To reiterate or re-summarize again, appears to me that mutability control is a high-level concern, and most performance comes from high-level reasoning (profiling the code). Low-level performance is rarely the 80/20. Again I may be wrong about those priorities and it may vary by programmer and use case. YMMV.

Edit: we have all these wonderful PL advances that reduce tsuris and increase high-level expression such as first-class functions, lambdas (closures), type classes, and stackless coroutines or generators. I would like to see all those in a popular language asap. I am tired of waiting. Jettison all that OOP noise that is an anti-pattern. I realized that wasn’t the reason Graydon Hoare created Rust. I’ve been reading his blog. He created Rust because he believes in low-level safety and thought that could help to build a better browser. I criticized Typestate back in 2011 when I first read Rust was considering it because type systems should never be separated (two different type systems in the same language). Now I am dubious (almost criticizing but aware of my lack of omniscience) about total ordering, low-level safety checks, which are only open under a tree of disjointness thus incompatible with open expression of partial orders (that is a generative essence attribute that I posit will prove to be an anti-pattern, but I’m not 100% certain). It was also a small world when I read today that he had been working on Stellar’s consensus algorithm after he left Rust which I also analyzed a few months ago and had criticized as it would end up centralized (not realizing he was part of its implementation). And I see Graydon and I have entirely dipolar politics (I eat a lot of meat, love sports, and I’m not apologizing for being a man). So I don’t expect a very significant chance we will be working together in the future although he seems to be very knowledgeable. Enjoyed reading his blog today. He is extremely shy (he admitted in his blog he feels awkward in social gatherings), I can only find one photo of the man on the entire internet and he is looking at the ground. As you can probably tell, I am an extreme extrovert. I don’t get offended by Paul Phillips’ (Scala) brash style, at least he wasn’t as arrogant to me as others in the FP realm (Graydon also lamented the FP community tendency to arrogance or too serious in his blog). Rust community has treated me very respectfully and you also have my respect. I will exit with admiration of your tolerance.


#51

Remember upthread I mentioned the idea of possibly employing borrow checking in a very limited way (that would be silent no annotations in most cases) which didn’t extend its tentacles into a total ordering, yet afaics might enable to declare some references to be unique (non-aliased) mutable, so that these could be compile-time deallocated (not given to the GC) and afaics these guarantees would also eliminate the need for the immutable restriction on adding elements to collections under my proposed complete solution to the expression problem.

Thus I think the GC would be entirely out of the picture when interface to FFI in that case, but I need to dig into the details to be sure.

I don’t understand well the issue of integrating C with stackless corountines, but I am thinking about generators instead which means we only need to rewrite a function body as a state machine and return to the caller. But I need to study more about this and determine if full coroutines are also necessary and if so how this would impact calling to something like C code.

So I can’t precisely rebuke you at this time, but I also think it may be possible to do great and innovative things.


#52

Simple function rewriting can’t do this (using gen as the generator version of fn, and an imaginary yield statement):

gen my_generator(param: &T) -> U {
    param.higher_order_function(|x| yield x from my_generator)
}

Because that requires busting out of higher_order_function, which is not part of the state machine. You need spaghetti stacks to do that.


#53

Yeah I know that, but what I don’t yet have a good conceptualization of which use cases require jumping higher in the stack or to other stacks.

It seems the stack unwinding has more regular semantics. Throw generally leaves the program in an indeterminate state. I need to learn more about coroutines.


#54

I am not sure I understand the problem here. If the closure is implemented like a C++ function object, IE it captures all the variables it needs to operate on, then we can model a coroutine in simple closures. Consider the following JavaScipt:

function f() {
    var state = 0
    step() {
        state = state + 1
        return state;
    }
    return step;
}

we return a closure that captures the state and every time you call it, it increments. Compare with:

var state = 0;
function step() {
    while true {
        state = state + 1;
        yield state;
    }
}

They are actually more or less the same thing, if you define a lazy list as an ADT, then getting the head of the list can call the next ‘step’. So we can transform the latter which returns a lazy list type, to the former as a code-to-code transform in the compiler.


#55

So I presume the programmer would would still write yield, then the compiler would translate to a state machine of switch-case within the step function body.

So yes this is just like my prior comment about generators except you’ve enabled capturing the “stack” (i.e. state of) the closure function, and then any function calls within the closure function use the stack of the caller of the closure function. Now I remember that when I studied this a few weeks ago when I first learned Rust didn’t have stackless coroutines, this was precisely the way I also envisioned implementing it in Javascript.

You mean the latter code example to the former code example by creating the state machine within the function body?


#56

Afaik, the code-to-code transformation you described in your example is incomplete. The caller must be able to inject a return value for the yield, e.g. let x = yield state where the caller gets the value of state (and some indication that state is a yield and not a final result of the function) and when the caller calls the closure generator function again, the function input should be assigned to x.

Also I thought coroutines meant that a pair of functions can yield to each other, not one calling the other. Afaics some control function sets up the pairing and serves as the glue calling the two paired closure generator functions (or any pattern of pairing multiple closure generator functions). Thus I guess your formulation is the fundamental building block needed, but with my clarification in the prior paragraph. But a direct pairing at the language level would probably be more efficient.

Again I am not very knowledgeable about the use cases for these coroutines.