Some notes part 2

A smaller batch of questions/ideas.


  1. This is an academic (useless) question, because Rust is already past version 1.0.

You can use a generics syntax like:

fn foo<T: Clone, K: Clone + Debug>(x: T, y: K) {...}

You can also use an alternative syntax like this, to declutter the signature when your types have complex trait constraints:

fn bar<T, K>(x: T, y: K)
  where
  T: Clone,
  K: Clone + Debug {...}

But isn't it "better" to always write signatures like this, with the type parameters always after the run-time arguments, avoiding the need for the "where" keyword?

fn foo(x: T, y: K)
   <T: Clone,
    K: Clone + Debug> {...}
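For reference, here is a minimal runnable sketch of the two forms that compile today, to show that they accept the same arguments and behave identically. The function names `describe_inline` and `describe_where` and their bodies are made up for illustration.

```rust
use std::fmt::Debug;

// Inline bounds: compact when the constraints are short.
fn describe_inline<T: Clone, K: Clone + Debug>(x: T, y: K) -> String {
    let _copy = x.clone(); // uses the `T: Clone` bound
    format!("{:?}", y.clone()) // uses `K: Clone + Debug`
}

// `where` clause: the same bounds, moved out of the angle brackets.
fn describe_where<T, K>(x: T, y: K) -> String
where
    T: Clone,
    K: Clone + Debug,
{
    let _copy = x.clone();
    format!("{:?}", y.clone())
}

fn main() {
    println!("{}", describe_inline(1u8, vec![1, 2]));
    println!("{}", describe_where(1u8, vec![1, 2]));
}
```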

  2. This code shows the data sizes of some types:

    fn main() {
        use std::mem::size_of;
        println!("{}", size_of::< Option<i32> >()); // 8
        println!("{}", size_of::< Option<Option<i32>> >()); // 12
        println!("{}", size_of::< Box<i32> >()); // 4
        println!("{}", size_of::< Option<Box<i32>> >()); // 4
        println!("{}", size_of::< Option<Option<Box<i32>>> >()); // 8
        println!("{}", size_of::< Option<u8> >()); // 2
        println!("{}", size_of::< Option<Option<u8>> >()); // 3
    }

I don't know how common such types are in Rust code, but I think the compiler could compress some of these representations. Some examples:

Option<Option<i32>> to just 8 bytes;
Option<Option<u16>> to just 4 bytes;
Option<Option<u8>>  to just 2 bytes.

Another example:

enum Foo<T> { A(T), B(T), C(T) }
enum Bar<T> { D(T), E(T) }
enum Spam { F, G, H }
fn main() {
    use std::mem::size_of;
    type T1 = Foo<Bar<i32>>;
    println!("{}", size_of::< T1 >()); // 12
    type T2 = Foo<Bar<Spam>>;
    println!("{}", size_of::< T2 >()); // 3
}

Perhaps the compiler can encode T1 in 8 bytes and T2 in 1 byte. This should reduce the memory used, but I don't know if this is going to speed up normal Rust programs.


  3. In the D standard library there's a second kind of Nullable:

It's similar to the Rust Option, but you can instantiate it by specifying a value that represents the None state (the first Nullable version is more conventional: it adds a second field, like Rust normally does). A usage example (it assumes the haystack array never has length size_t.max):

Nullable!(size_t, size_t.max) indexOf(string[] haystack, string needle) {...}

This has the advantage that:

Nullable!(size_t, size_t.max).sizeof == size_t.sizeof

Is there something like this in the Rust std lib? I think you can implement a similar Option2 in Rust too (passing the None state as a template argument is not strictly necessary; you can just use a different name for the structure).
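As a rough sketch of what such a hand-written type could look like, assuming the sentinel is hard-coded rather than passed as a parameter: the names `OptIndex` and `index_of` are hypothetical, not anything from the standard library.

```rust
/// Hypothetical index type that reserves `usize::MAX` to mean "no value".
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct OptIndex(usize);

impl OptIndex {
    const NONE: OptIndex = OptIndex(usize::MAX);

    fn some(i: usize) -> OptIndex {
        // The sentinel itself can never be stored as a real value.
        assert!(i != usize::MAX, "usize::MAX is reserved for the empty state");
        OptIndex(i)
    }

    /// Converts to an ordinary `Option` for pattern matching.
    fn get(self) -> Option<usize> {
        if self.0 == usize::MAX { None } else { Some(self.0) }
    }
}

/// Mirrors the D `indexOf` signature from the question.
fn index_of(haystack: &[&str], needle: &str) -> OptIndex {
    match haystack.iter().position(|&s| s == needle) {
        Some(i) => OptIndex::some(i),
        None => OptIndex::NONE,
    }
}

fn main() {
    let words = ["a", "b", "c"];
    println!("{:?}", index_of(&words, "b").get());
    println!("{:?}", index_of(&words, "z").get());
    // No separate tag field: the wrapper is exactly one usize wide.
    assert_eq!(std::mem::size_of::<OptIndex>(), std::mem::size_of::<usize>());
}
```

Converting to a plain `Option` at the point of use keeps the compact form for storage while still allowing normal `match` and `if let` handling.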

(Perhaps Rust should offer safe and clean ways to optionally specify the bit-level representation of enum fields).


  4. This little D program generates a lazy sequence (a Range), named "seq", and then prints it all:

    void main() {
        import std.stdio, std.range, std.algorithm;
        auto seq = 100
            .iota
            .map!(x => x ^^ 2)
            .filter!(x => x > 20)
            .take(10);
        seq.writeln;
    }

    Output:
    [25, 36, 49, 64, 81, 100, 121, 144, 169, 196]

I think the Rust println! should handle and print lazy iterables too. This similar code doesn't compile:

fn main() {
    let seq = (0 .. 100)
              .map(|x| x * x)
              .filter(|&x| x > 20i32)
              .take(10);
    println!("{:?}", seq);
}

To tell lazy sequences apart from arrays in the println!("{:?}") output, you could even use semicolons instead of commas:

[25; 36; 49; 64; 81; 100; 121; 144; 169; 196]

  5. Do you know if there's an (updated) Rust cheatsheet? This page seems dead:

http://static.rust-lang.org/doc/master/complement-cheatsheet.html


Thank you for your comments and answers.

Eh, I like having generic parameters introduced before they're used. Saves having to "reparse" a declaration.

Theoretically, yes. This is one reason why Rust's ABI is unstable. Rust already does this for types wrapped in NonZero and borrowed pointers; it's why Option<Box<T>> is the size of one pointer.

I suppose it's also kind of like that second Nullable in your next question. There's no general version.
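The pointer and NonZero cases mentioned above can be checked directly. This is a small sketch; the helper name `niche_sizes_match` is made up, and it only compares sizes that the standard library documents as equal, so it does not depend on the pointer width.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Returns true if `Option` adds no space to types that have an unused bit pattern.
fn niche_sizes_match() -> bool {
    // A Box and a reference are never null, so None can be encoded as null.
    let boxed = size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>();
    let reference = size_of::<Option<&i32>>() == size_of::<&i32>();
    // NonZeroU32 never holds 0, so None can be encoded as 0.
    let non_zero = size_of::<Option<NonZeroU32>>() == size_of::<u32>();
    boxed && reference && non_zero
}

fn main() {
    println!("{}", niche_sizes_match());
}
```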

Eeeeeh, I'm bearish on this one. I mean, iterables in Rust are, somewhat frequently, not copyable or cloneable for various reasons. That would make printing them like that destructive, which would probably be somewhat surprising since println! generally doesn't consume its input.

As an aside, I'm not sure how that'd even be done. I mean, you'd have to implement Debug manually for every iterator type. Once we have specialisation, you could probably do a blanket implementation, but I believe that'd be the only blanket Debug impl you could ever have.

Doesn't seem worth it. You should be able to fairly easily implement an iterator adapter that converts an arbitrary iterator into a Debug implementing value. You'd need an extra .debug() or .display(), but oh well. That happens a lot in Rust. :P
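One way such an adapter might look is sketched below. It sidesteps the destructive-printing problem by requiring the iterator to be Clone, which is an assumption of this sketch and narrower than "an arbitrary iterator". The names `DebugIter`, `DebugExt` and `squares` are hypothetical.

```rust
use std::fmt;

/// Hypothetical wrapper that formats a cloneable iterator as a list.
struct DebugIter<I>(I);

impl<I> fmt::Debug for DebugIter<I>
where
    I: Iterator + Clone,
    I::Item: fmt::Debug,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Walk a clone so the wrapped iterator itself is left untouched.
        f.debug_list().entries(self.0.clone()).finish()
    }
}

/// Hypothetical extension trait that adds `.debug()` to every iterator.
trait DebugExt: Iterator + Sized {
    fn debug(self) -> DebugIter<Self> {
        DebugIter(self)
    }
}

impl<I: Iterator> DebugExt for I {}

/// The sequence from the question; non-capturing closures are Clone.
fn squares() -> impl Iterator<Item = i32> + Clone {
    (0..100).map(|x| x * x).filter(|&x| x > 20i32).take(10)
}

fn main() {
    // prints [25, 36, 49, 64, 81, 100, 121, 144, 169, 196]
    println!("{:?}", squares().debug());
}
```

Iterators that cannot be cloned would need a different design, for example interior mutability to consume the iterator on first print.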

There was a cheat sheet?

  1. There was an RFC where part of the proposal was to include an abstract alias keyword. It's possible something like the following may exist some time in the future:
abstract type Clonable: Clone + Debug;
fn foo(x: Clonable, y: Clonable) { ... }

Something like that would go a long long way to decluttering generics.

The big advantage of where clauses is that you can have things other than a single generic parameter on the left, e.g. the values of associated types, without getting them mixed up with bindings of new generic parameter names (which can only appear within angle brackets). Most of the time this could theoretically be distinguished by whether the left side is a single identifier or not, but not always - today you can write:

fn foo<T>() where SomeExistingType: Add<T> {}

Personally I'm not a fan of using where clauses when the shorthand would work, but I guess some people think it looks better.
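To make the point about non-parameter types on the left-hand side concrete, here is a small sketch in which the bound is placed on the existing type i32 rather than on T. The function name `add_to_one` is made up for illustration.

```rust
use std::ops::Add;

// The bound constrains the concrete type `i32`, not the parameter `T`,
// so there is no place for it inside the angle brackets.
fn add_to_one<T>(x: T) -> <i32 as Add<T>>::Output
where
    i32: Add<T>,
{
    1i32 + x
}

fn main() {
    println!("{}", add_to_one(2i32)); // uses `i32: Add<i32>`
    println!("{}", add_to_one(&40i32)); // uses `i32: Add<&i32>`
}
```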


You should be able to fairly easily implement an iterator adapter that converts an arbitrary iterator into a Debug implementing value. You'd need an extra .debug() or .display()

Those debug()/display() adaptors add some noise to the code, but I think they are acceptable, once they are in the standard library:

fn main() {
    let seq = (0 .. 100)
              .map(|x| x * x)
              .filter(|&x| x > 20i32)
              .take(10);
    println!("{}", seq.display());
    println!("{}", seq.debug()); // Compile-time error?
    println!("{:?}", seq.display()); // Compile-time error?
    println!("{:?}", seq.debug());
}

Looks nice.

This is somewhat possible, using a macro, but you can't really get all the way there. One limitation in regular macros is that they don't play well with lifetime parameters, so storing references seems to be out of the question at the moment. Here is something I threw together for the fun of it: Rust Playground

  2. There is the More Exotic Enum Layout Optimizations RFC that covers some of these points.

I think your nested options are covered by the RFC and people are interested in experimenting in that direction. Your Foo<Bar<Spam>> example is harder because you have to be able to extract &Bar<Spam> and &Spam values, so including the tag for an outer enum in the same bytes that encode the value becomes problematic. The nested Option problem gets around this because you only have to be able to extract the contained value if all the options are Some, so you have freedom about how to use the bits in the None cases.
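The constraint described above can be seen in a short sketch: ordinary code is allowed to borrow the inner values, so each inner value has to exist in memory in its own normal representation. The helper names `inner_bar` and `inner_spam` are hypothetical.

```rust
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Spam { F, G, H }

#[allow(dead_code)]
enum Bar<T> { D(T), E(T) }

#[allow(dead_code)]
enum Foo<T> { A(T), B(T), C(T) }

// The returned reference must point at a valid, complete `Bar<Spam>`,
// so the tag of `Foo` cannot be mixed into the same bits.
fn inner_bar(foo: &Foo<Bar<Spam>>) -> &Bar<Spam> {
    match foo {
        Foo::A(b) | Foo::B(b) | Foo::C(b) => b,
    }
}

// Likewise, this must point at a valid, complete `Spam`.
fn inner_spam(bar: &Bar<Spam>) -> &Spam {
    match bar {
        Bar::D(s) | Bar::E(s) => s,
    }
}

fn main() {
    let value = Foo::B(Bar::E(Spam::G));
    println!("{:?}", inner_spam(inner_bar(&value)));
}
```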


I see. Something like a #[compact_enum] attribute (that statically disallows extracting inner parts through pointers) could be used where memory compactness is more important (like when you're storing many of them in an array, etc).

Yeah, I think that could work. It seems kind of analogous to packed structs, which I believe are also supposed to have restrictions on references or something. (I haven't been following that discussion and am not sure what the current state of that proposal is.)
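For comparison, this is roughly how the packed-struct trade-off looks in practice. It is a sketch assuming a target where u32 has 4-byte alignment; the type names `Padded` and `Packed` and the helper `read_value` are made up.

```rust
use std::mem::size_of;

// Normal C-style layout: padding is inserted after `tag` to align `value`.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

// Packed layout: no padding, so `value` may sit at an unaligned address.
#[allow(dead_code)]
#[repr(packed)]
struct Packed {
    tag: u8,
    value: u32,
}

fn read_value(p: &Packed) -> u32 {
    // Copying the field out by value is allowed. Taking `&p.value` is not,
    // because recent compilers reject references to possibly unaligned fields.
    p.value
}

fn main() {
    println!("{} {}", size_of::<Padded>(), size_of::<Packed>());
    println!("{}", read_value(&Packed { tag: 1, value: 7 }));
}
```

The compactness is paid for with a restriction on borrowing fields, which is the same kind of restriction a compact enum layout would need.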