Two problems of tuple indexing syntax


#1

It’s probably a bit late to change this, but I’d like to ask the rationale of this situation (if there is one), that I see as one of the very few design mistakes of Rust, regarding the tuple indexing syntax, that has two problems shown below.

This is D code. D has compile-time tuples that can contain values or types, and it doesn’t have built-in run-time tuples as Rust. But in the standard library there is an implementation of tuples (in std.typecons):

import std.stdio, std.typecons;

int foo(size_t idx)(Tuple!(int, int) t2) {
    return t2[idx];
}

void main() {
    const t1 = tuple(10, tuple(20, 30));
    writeln(t1[1][0]); // prints: 20
    assert(foo!0(t1[1]) == 20);
    assert(foo!1(t1[1]) == 30);
}

In the main() the t1 tuple contains an int and a tuple of two ints. You access tuple items with [], the same syntax you use for arrays. This avoids one pitfall when you access inner tuples in Rust:

fn main() {
    let t1 = (10, (20, 30));
    // println!("{}", t1.1.0); // error
    println!("{}", (t1.1).0); // OK
}

The second problem of Rust tuple indexing syntax is shown in the function foo() of the D code. If Rust gains integers as generic arguments (like idx in the D code), I think you can’t use them to index tuples, unlike the D syntax that works still for this rather common situation. Unless Rust will allow you to use the syntax “t2.idx” in foo(), or you use some kind of macro (but using “t2.idx” in Rust probably leads to troubles. So I hope it will be possible to use a macro, like field!(t2, idx) to both read or write tuple fields).


#2
  1. IMO, tuples are more like structs (heterogenous) than they are like arrays, which is why you use the dot notation. The lexer thing is fun though (I think using a space to break the token would be a little bit clearer). Everything you can do with a tuple, you can do with a user-defined struct.
  2. What does D do if you try to use a variable index for a mixed tuple, e.g. Tuple!(int, long) ? Does D implement dependent types?
  3. Present-day Rust cannot support variable indexing of tuples because it lacks dependent types: the type would have to depend on the index. It is not sufficient to restrict to compile-time constants, because the value of a compile-time constant might not be known until after typeck. Generic integer arguments (when implemented) are actually a good example of this, because generic functions are typecked once, before instantiation, and the generic integer constants are at that point skolem variables, not actual values.
  4. If all of your values are the same type, then you could index them with a value unknown during typeck, and Rust already supports this using fixed array types. [i32; 3] has the same representation as (i32, i32, i32) but supports variable indexing.
  5. If you wanted to access a true tuple using a computed index, that index would need to be known before typeck. Macros seem like a natural fit here, although they cannot currently do arithmetic.

#3

Tuples are more like structs (and indeed, D tuples are implemented with structs) but that doesn’t require you to use the dot notation, and indeed in my post above I’ve shown that an array indexing syntax avoids two problems.

Do you mean like this?

println!("{}", t1 .1 .0);

It indeed compiles, I didn’t know that :slight_smile: But for me it’s less clear…

D doesn’t have dependent types. I am not sure what you are exactly asking, this D program shows an example of what you do, but this is getting away from the points of my post above, so this is not necessary in this discussion:

import std.stdio, std.typecons;

auto foo(size_t idx)(Tuple!(int, double) t2) {
    return t2[idx];
}

void main() {
    const t = tuple(10, 1.5);
    foo!0(t).writeln;
    foo!1(t).writeln;
}

In this program the templated function foo() is specialized two times, the first with idx=0 as “(i32,f64) -> i32” and the second with idx=1 as “(i31,f64) -> f64”.

In D the tuple index must be a compile-time constant (but in D you can do a lot of stuff at compile-time).

Right, but sometimes it’s less handy to use all the same types for the fields (in Rust you can do it using an enum).

Thank you for your answers :slight_smile:


#4

Thinking about this more, what you want are type-level integers (integers that are real during typeck). Those have been discussed before; I think there’s consensus we’ll have them eventually, but they can also be faked in userspace with the trait mechanism. I’ve found examples of doing arithmetic on them and tuples indexed using fake type integers, though these examples use incompatible definitions… Presumably once we have type-level integers in core you’ll be able to use them with core tuples with the simple ergonomic syntax.


#5

They are what I was referring with “If Rust gains integers as generic arguments” in my original post.

What simple ergonomic syntax?


#6

As an aside, the dot syntax is also a real pain in macros. In order for dot syntax to work, the tuple index must be an integer literal. This means that not only can you not use an expression, you cannot use regular macro expansion in “tuple index” position: you have to use raw tts and macro callbacks (say goodbye to your recursion limit!). This is why, for example, @with_bindings in scan-rules is limited to 32 bindings in a pattern: if it weren’t for the requirement for an actual literal, I could just use $last + 1 and go all day.

Well, I say “all day”, but even if I fixed that, the recursion limit would still be there to make my life hell… *sigh*


#7

Since this apparently hasn’t been suggested before:

Why not just modify . to accept one of: an integer literal, an identifier, a parenthesised expression which evaluates to a usize. So t1[1][0] in D would be t1.(1).(0) in Rust.

It’s marginally longer, but would maintain the distinction between dynamic and static indexing, whilst also giving us a way to expand tuple indexing to expressions and generic value parameters in the future.

Heck, if you allow the expression to resolve to a &str, and then you could do name lookups on structures in the future.


#8

I was about to suggest that as well. It’s not very pretty, but a nice generalization and shouldn’t be needed that often.


#9

I have written a Rust solution of the “Nuts and bolts” problem:

The problem asks to sort two different permutations of the same sequence of integers, but you can only compare one value with the values of the other sequence (so I think in Rust you can’t define Ord on the values).

In the end I’ve not used a tuple, but a more “readable” struct with two named fields, that contain a pair of the first and second sequence. At the end of the program the two fields must contain the same values.

While this code seems to work (I have used design by contracts, unittesting and fuzzying) it has lot of code duplication. In D language I can write code similar to this (representing Pair as a tuple, and indexing its fields with compile-time integers) using just one swap() and one three_way_partition() function. Can I reduce the code duplication in this Rust program? (I am feeling a lack of abstraction power in Rust).

use std::cmp::Ordering;

type Size = u32;

#[derive(Debug, Clone, Copy)] struct Bolt { size: Size }
#[derive(Debug, Clone, Copy)] struct Nut { size: Size }
#[derive(Debug)] struct Pair { n: Nut, b: Bolt }

impl Bolt {
    fn compare(&self, other: &Nut) -> Ordering {
        self.size.cmp(&other.size)
    }
}

impl Nut {
    fn compare(&self, other: &Bolt) -> Ordering {
        self.size.cmp(&other.size)
    }
}

fn swap_nuts(arr: &mut [Pair], i: usize, j: usize) {
    // Can't use arr.swap() here.
    assert!(i < arr.len() && j < arr.len());
    let aux = arr[i].n;
    arr[i].n = arr[j].n;
    arr[j].n = aux;
}

fn swap_bolts(arr: &mut [Pair], i: usize, j: usize) {
    assert!(i < arr.len() && j < arr.len());
    let aux = arr[i].b;
    arr[i].b = arr[j].b;
    arr[j].b = aux;
}

fn three_way_partition_nuts(arr: &mut [Pair], lo: usize, hi: usize, pivot: Bolt)
-> ((usize, usize), (usize, usize), (usize, usize)) {
    assert!(lo < arr.len() && hi < arr.len());
    if lo >= hi {
        return ((0, 0), (0, 0), (0, 0));
    }

    let mut lt = lo;
    let mut gt = hi;
    let mut i = lo + 1;

    while i <= gt {
        let t = arr[i].n;
        if t.compare(&pivot) == Ordering::Less {
            swap_nuts(arr, lt, i);
            lt += 1;
            i += 1;
        } else if t.compare(&pivot) == Ordering::Greater {
            swap_nuts(arr, i, gt);
            gt -= 1;
        } else {
            i += 1;
        }
    }

    ((lo, lt), (lt, gt + 1), (gt + 1, hi))
}

fn three_way_partition_bolts(arr: &mut [Pair], lo: usize, hi: usize, pivot: Nut)
-> ((usize, usize), (usize, usize), (usize, usize)) {
    assert!(lo < arr.len() && hi < arr.len());
    if lo >= hi {
        return ((0, 0), (0, 0), (0, 0));
    }

    let mut lt = lo;
    let mut gt = hi;
    let mut i = lo + 1;

    while i <= gt {
        let t = arr[i].b;
        if t.compare(&pivot) == Ordering::Less {
            swap_bolts(arr, lt, i);
            lt += 1;
            i += 1;
        } else if t.compare(&pivot) == Ordering::Greater {
            swap_bolts(arr, i, gt);
            gt -= 1;
        } else {
            i += 1;
        }
    }

    ((lo, lt), (lt, gt + 1), (gt + 1, hi))
}

fn inner(arr: &mut [Pair], lo: usize, hi: usize) {
    assert!(!arr.is_empty() && hi < arr.len());
    if lo >= hi {
        return;
    }

    let pivot1 = arr[lo].b;
    for p in lo .. hi + 1 {
        if arr[p].n.compare(&pivot1) == Ordering::Equal {
            swap_nuts(arr, lo, p);
            break;
        }
    }

    let (_, eq0, _) = three_way_partition_nuts(arr, lo, hi, pivot1);

    let pivot2 = arr[eq0.0].n;
    for p in lo .. hi + 1 {
        if arr[p].b.compare(&pivot2) == Ordering::Equal {
            swap_bolts(arr, lo, p);
            break;
        }
    }

    let (inf1, _, sup1) = three_way_partition_bolts(arr, lo, hi, pivot2);

    inner(arr, inf1.0, inf1.1);
    inner(arr, sup1.0, sup1.1);
}

fn solve(arr: &mut [Pair]) {
    if !arr.is_empty() {
        let hi = arr.len() - 1;
        inner(arr, 0, hi);
    }
}

fn main() {
    let nut_sizes = [2, 3, 5, 6, 2, 2, 1, 1, 3, 3];
    let bolt_sizes = [1, 3, 2, 2, 3, 3, 6, 5, 2, 1];
    let mut pairs = nut_sizes
                    .iter()
                    .zip(&bolt_sizes)
                    .map(|(&s1, &s2)| Pair{ n: Nut{ size: s1 }, b: Bolt{ size: s2 } })
                    .collect::<Vec<Pair>>();

    println!("{:?}\n", pairs);
    solve(&mut pairs);
    println!("{:?}\n", pairs);
}

This code is mutation-heavy (later I’ll write a blog post about this, with more functional alternative versions of the code that mutate less), and there are some invariants to keep in the three-way cross-partition algorithm. So in such code I’d like to use pre-conditions, post-conditions, loop invariants, and loop variants. So for such code I’d like Design by Contract built in Rust (in another version of this program I have used asserts to implement post-conditions, but if your function has more than one exit point, you have to duplicate the asserts or move them in an inner function).