Code generation for iterator chains

Perhaps these are related:

For me the "chaining" syntax (using D universal function call syntax, or Rust iterator chains, or the F# operators like |>, and so on) is much simpler to read. It avoids nesting and makes the flow of data very easy to follow.
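For example, a typical Rust chain (a minimal sketch, not from any particular codebase) reads top to bottom, one step per line:

```rust
fn main() {
    let data = vec![3, 1, 4, 1, 5, 9, 2, 6];

    // The chain reads in the order the data flows:
    // keep the odd numbers, square them, sum the squares.
    let total: i32 = data
        .iter()
        .filter(|&&x| x % 2 == 1) // 3, 1, 1, 5, 9
        .map(|&x| x * x)          // 9, 1, 1, 25, 81
        .sum();

    assert_eq!(total, 117);
}
```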


If you can guess the invisible intermediate types passed between functions.

It's generally kind of impossible to guess those types; they sometimes grow to become very large and very complex. I don't need to know them.

You do if you are trying to read, understand, and debug some code. I don't want write-only code :)

I can't even tell if there's a definition of the algorithm in there?
Edit: looks like the API definition, I'll see if I can find the source.

To read and understand that kind of code you need to know what algorithms you are applying to the data, in what order, and where the data comes from, but most times you don't need to know the types. It's usually readable enough code. To debug it, you sometimes use some "peeking" functionality, printing intermediate stages. But it's usually not very bug-prone code, so there's less need to debug. Most bugs are elsewhere (the same happens with Rust iterator chains).
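In Rust, that "peeking" can be done with the standard `inspect` adapter, which prints intermediate stages without dismantling the chain (a minimal sketch):

```rust
fn main() {
    let data = vec![1, 2, 3, 4, 5];

    // inspect() observes each value as it passes through,
    // so you can debug a chain without rewriting it as loops.
    let total: i32 = data
        .iter()
        .map(|&x| x * 2)
        .inspect(|x| eprintln!("after map: {}", x))
        .filter(|&x| x > 4)
        .inspect(|x| eprintln!("after filter: {}", x))
        .sum();

    assert_eq!(total, 24); // 6 + 8 + 10
}
```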

Yes, sorry, that's the documentation, it shows the function signatures, gives some information on the computational complexity, input and outputs, and so on. Those docs don't show the algorithm. The source code of Phobos is on GitHub, but reading library-level code is not so easy, unless you know D well enough.

bool isPartitioned(alias pred, Range)(Range r)
    if (isForwardRange!(Range))
{
    for (; !r.empty; r.popFront())
    {
        if (unaryFun!(pred)(r.front)) continue;
        for (r.popFront(); !r.empty; r.popFront())
        {
            if (unaryFun!(pred)(r.front)) return false;
        }
        break;
    }
    return true;
}

Looks like they were unable to re-use the find_if and find_if_not definitions, but otherwise it looks similar, and it's readable because it's using loops :) Note the runtime check for isForwardRange, instead of using the generic trait overloading.
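For comparison, here is a sketch of the same check in Rust that does re-use iterator building blocks, with `skip_while` and `all` playing the roles of find_if and find_if_not (the function name is my own, not from any library):

```rust
// A range is partitioned by `pred` if, after skipping the leading
// elements that satisfy the predicate, no remaining element satisfies it.
fn is_partitioned<T>(items: &[T], pred: impl Fn(&T) -> bool) -> bool {
    items.iter().skip_while(|&x| pred(x)).all(|x| !pred(x))
}

fn main() {
    let even = |x: &i32| x % 2 == 0;
    assert!(is_partitioned(&[2, 4, 6, 1, 3], even));
    assert!(!is_partitioned(&[2, 1, 4, 3], even));

    let empty: &[i32] = &[];
    assert!(is_partitioned(empty, even)); // vacuously partitioned
}
```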

See if you can work out what this code (in JavaScript) does?

function mystery(x) {
    return x.reduce(function(a0, v0) {
        return a0.reduce(function(a1, v1) {
            return a1.concat(v0.map(function(v2) {
                return v1.concat([v2]);
            }));
        }, []);
    }, [[]]);
}

Yes, this kind of Phobos code often uses loops.

You're mistaken. That's a D template constraint (look closer: there's no { before that if). It's a purely compile-time feature, and it performs the specialization of templates. It's the way D implements a partially static type system at compile-time.

In D (and probably Rust too) you almost never write code like that (I think this is a nice example of: Straw man - Wikipedia).

Does unaryFun! check that the predicate is a unary function too? It's not a bad syntax, but my first guess was that it was a runtime check. Does that mean D ranges are not runtime bounds checked?

I take that as a no, you cannot :) I did actually write that code, but then thought better of it, so I at least have tried to write things like that.

Here's the code written using explicit looping:

function mystery(x) {
    var a = [[]];
    for (var i = 0; i < x.length; ++i) {
        var b = [];
        for (var j = 0; j < a.length; ++j) {
            for (var k = 0; k < x[i].length; ++k) {
                b.push(a[j].concat([x[i][k]]));
            }
        }
        a = b;
    }
    return a;
}

Is that any easier? So far, when I have asked other people, nobody has said the functional code was easier to understand, even when given both versions at the same time.

Edit: Perhaps we could agree that it is bad style to make your chains too long, or to try and be too clever with them? Best to stick to simple non-nested maps and reductions.

D templates allow you to do many different kinds of things. unaryFun!() is not a check; it creates (if necessary) a function from a string, after some compile-time processing:

It was useful in the past, before D gained lambda functions. Today it's not very useful anymore.

I don't fully understand this question. D ranges enforce and test compile-time requirements at compile-time, a bit like C++ concepts. But sometimes some bounds aren't known at compile-time, so you have to test at run-time (just like in Ada and C++, and in future Rust HKTs).

All language features could be misused; you need to keep your brain switched on and apply common sense and good judgement :)

I have realised this makes no sense: the r.empty is the runtime test, although it assumes the range is a subrange of the total container. Can you still break this if you create your own range with a range constructor and give it two iterators from different collections, or out-of-range iterators? I guess a range is no safer than a pair of iterators on its own; it all depends on how you control the construction of ranges.

I think I like the idea of making a range a pair of iterators, as a single object and then defining the algorithms that accept a range as methods on the range.
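Rust slices already work roughly this way: a pointer/length pair packaged as one object, with the algorithms available as methods on it, and with construction checked so an out-of-range "range" can't be built from safe code. A minimal sketch:

```rust
fn main() {
    let v = vec![5, 1, 4, 2, 3];

    // A slice is the "pair of iterators as a single object":
    // here it borrows elements 1..4 of v.
    let r = &v[1..4]; // [1, 4, 2]

    // The algorithms are methods on the range itself:
    assert_eq!(r.iter().min(), Some(&1));
    assert_eq!(r.iter().position(|&x| x == 4), Some(1));

    // Construction is bounds-checked: &v[1..10] would panic here,
    // so safe code can never hold an invalid range.
}
```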

It's a low-efficiency implementation of product(), in Rust:

fn product(pools: &[Vec<u32>]) -> Vec<Vec<u32>> {
    let mut result = vec![vec![]];
    for pool in pools {
        let mut temp = vec![];
        for x in &result {
            for &y in pool {
                let mut xc = x.to_vec();
                xc.push(y);
                temp.push(xc);
            }
        }
        result = temp;
    }
    result
}

In Python it's more readable because it has list comps (I hope Rust Prelude will gain a refined array comp macro soon):

def product(sequences):
    result = [[]]
    for pool in sequences:
        result = [x + [y] for x in result for y in pool]
    return result

You can write it with iterator chains, but it's much less readable than the Python version, because Rust lacks list comps and Rust vectors don't have a concatenation operator yet (yes, there's a .concat() function, but it usually leads to ugly code; in this case in D you write "x ~ y", in Python you write "x + [y]", and in Rust you need to write "[x.to_vec(), vec![y]].concat()" or to split that code into three lines and use a push()).
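For what it's worth, here is a sketch of that iterator-chain version; the to_vec/push pair is exactly the concatenation awkwardness described above:

```rust
fn product(pools: &[Vec<u32>]) -> Vec<Vec<u32>> {
    // fold replaces the outer loop; flat_map replaces the two inner loops.
    pools.iter().fold(vec![vec![]], |result, pool| {
        result
            .iter()
            .flat_map(|x| {
                pool.iter().map(move |&y| {
                    // the "x + [y]" step: copy x and append y
                    let mut xc = x.to_vec();
                    xc.push(y);
                    xc
                })
            })
            .collect()
    })
}

fn main() {
    let got = product(&[vec![1, 2], vec![3, 4]]);
    assert_eq!(got, vec![vec![1, 3], vec![1, 4], vec![2, 3], vec![2, 4]]);
}
```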

In general you use iterator chains where they make your code more readable and nicer. If you see that the resulting code is messier than using for loops, then use the loops :)

Yes, actually comprehensions are a very useful tool for making this kind of code easier to read. Did you google the LtU discussion? Your Python code looks very similar to the best solution we came up with there.

I am not sure why you say it is low-efficiency?

Please not a macro :) let's have a real comprehension syntax with an implementation trait. (Well, maybe a macro if it's done well.)

Nope, it comes from the Python docs:
https://docs.python.org/2/library/itertools.html#itertools.product

(I helped the design of D range-chained functions for Phobos. I have worked on this stuff for years).

It's not lazy and it copies too much data and it allocates too much.

I like Python/Haskell list comps, but I think a macro today allows you to write "good enough" list comps in Rust, so having a specific syntax may seem too much. This is a judgement call, and I can't decide now... but I think having well-designed macros (two macros, one for eager and one for lazy list comps, flexible and well designed) in the Rust Prelude is nice.

Have you read Elements of Programming? I really think it is essential to read and understand this book before writing generic libraries.

Lazy would be less efficient, as we want all permutations. I haven't really thought about writing a Rust implementation, but how would you implement a more efficient version that has to accept any number of input sequences, each of any length?
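Even when all combinations are consumed, one way to avoid the intermediate result vectors is an index-based "odometer" that produces each combination in place. A sketch under that assumption (the function name is my own, not from any library):

```rust
// Visit every combination of one element per pool, building each one
// in place by advancing a vector of indices like an odometer.
fn for_each_product(pools: &[Vec<u32>], mut f: impl FnMut(&[u32])) {
    if pools.iter().any(|p| p.is_empty()) {
        return; // any empty pool makes the whole product empty
    }
    let mut idx = vec![0usize; pools.len()];
    let mut current: Vec<u32> = pools.iter().map(|p| p[0]).collect();
    loop {
        f(&current);
        // Advance the rightmost index, carrying leftward on overflow.
        let mut i = pools.len();
        loop {
            if i == 0 {
                return; // every index wrapped: all combinations visited
            }
            i -= 1;
            idx[i] += 1;
            if idx[i] < pools[i].len() {
                current[i] = pools[i][idx[i]];
                break;
            }
            idx[i] = 0;
            current[i] = pools[i][0];
        }
    }
}

fn main() {
    let mut out = vec![];
    for_each_product(&[vec![1, 2], vec![3, 4]], |c| out.push(c.to_vec()));
    assert_eq!(out, vec![vec![1, 3], vec![1, 4], vec![2, 3], vec![2, 4]]);
}
```

Only one `current` buffer is live at a time, so memory stays proportional to the number of pools rather than the number of combinations.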