White space dependent syntax

Presently, the syntax of Rust is filled with redundancy. The code is indented, curly braced, comma separated, bracketed, and parenthesized as if uncertain about the meaning of an affirmative. Space and Newline are punctuations relied upon to reveal structure but ignored for providing structure. It all seems rather silly considering that one of the reasons C itself is not white space dependent was apparently a need to be unreadably compact for more parsing efficiency back in the day of limited space and throughput.

In particular, the need to wrap arguments in parentheses eliminates the syntactic fluidity of currying and a meta language of transformation common in functional languages:

x        // x is some concept: value, function, operator, type, generic type, etc.
f        // f is some suitable transformer: function, operator, generic type, etc.
f x      // application produces a new concept
f x y    // equivalent to (f x) y
f (m, n) // where (m,n) is a tuple of the two concepts m and n.

White space is always implicitly used in Rust to reveal the structure of the code. The need to have a second structuring syntax of semicolons, braces, etc. on top of that is redundant and makes the code less self documenting; consequently, the visual structure is not the real structure but merely secondary documentation.

Explicit type annotations are also redundant. Redundancy makes code more robust to refactoring mishaps.

All too often, I cut and paste a line into a Python file in CLion and break it because the autoformatter decided I wanted a dedent somewhere. Or, when moving lines around in Atom, it always assumes I want the indent of the previous line, and then I end up with something accidentally nested in an if block.

15 Likes

Yeah, assistive editors can mangle your code with overenthusiastic formatting. At one point there were editors for semicolon languages that would add semicolons and commas for you as well. Very annoying having to delete those too. Similarly, auto-correcting spellcheckers can also be annoying.

Explicit type annotations are not really redundant documentation: since the compiler validates them, they effectively function as revelations about the state of the code. Visual formatting does not undergo the rigor of compiler verification and can readily be inconsistent with the curly braces and other structuring elements that do define the code. In effect, with curly brace languages we end up reading code in one syntax (indents and new lines) but writing it in another (curly braces and commas), for no other reason than convention, while eliminating the possibility of the linguistic brevity, consistency, and flexibility that white space significance affords.
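A small sketch of that mismatch in current Rust. The function `classify` is hypothetical and exists only for illustration: the second `push` is indented as if it belonged to the `if`, but the braces say otherwise, and the braces win.

```rust
/// Hypothetical example: the indentation suggests one structure,
/// the braces define another.
fn classify(n: i32) -> Vec<&'static str> {
    let mut log = Vec::new();
    if n < 0 {
        log.push("negative");
    }
        // Indented as if inside the `if`, but it runs for every input.
        log.push("checked");
    log
}

fn main() {
    println!("{:?}", classify(1));
    println!("{:?}", classify(-1));
}
```

The layout here is only documentation; the parser never looks at it.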

Finally, reading and editing white space dependent code is so much easier. What you see is what you get (provided you are using mono-spaced fonts and consistent white space characters, which can be compiler verified). It becomes almost pseudo code.

So what's your point? Rust has stability guarantees, so isn't going to change.

11 Likes

White space dependent syntax (WSDS) could and should coexist with the white space independent syntax (WSIS) version. It needn't be either/or. For instance, WSDS could be implemented with an out of band pluggable transpiler, perhaps triggered by a source code level pragma or a different source file extension, as mentioned elsewhere.

It is indicative to note that F# and Haskell both have a WSIS form; however, they are not popular because, like all such syntax, they become very cryptic to read, and edits become a matter of hunting commas, adding semicolons, and counting parentheses. Missing a comma or semicolon can transform the meaning dramatically and is harder to see than spaces delimiting words or line breaks. We are not computers.

For example, isn't the following syntax much easier to read and edit, as it follows our intuition like most good languages...

let aby = [
  "a"
  "b"
  f c
  ]

...than this with its dangling commas, semicolon, and superfluous parentheses (given that space is already semantically relevant as a word separator and is available)...

let aby = [
   "a",
   "b",
   f(c)
   ];

Of course, if you want WSDS with everything on one line it can become...

let aby = [ "a", "b", f c]

No. It is not. And that should be the end of it.

But, what you are suggesting

is that this should have one meaning

but 

this 

should 
have a totally
different 

meaning.

or this
    should have
    some totally other
        different
meaning

The road to chaos and insanity!

9 Likes

Whitespace-dependent semantics lead to subtle, easy-to-miss bugs, which is all the more dangerous in Rust due to unsafe code. The harder it is to audit unsafe code, the harder it is to adopt Rust.

To prevent these sorts of bugs, Haskell and other languages like it adopt a very different style of programming, and that would be hard to retrofit onto Rust.

5 Likes

This can be solved in a self-contained way by writing a tool that translates reversibly to/from your preferred syntax. Probably the source of rustfmt would be a good place to start. So you'll use Rust's standard syntax for storage and compiling, and your preferred syntax for editing. So I think if you really care about this you can make it happen.
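A minimal sketch of one direction of such a tool, under toy assumptions: four spaces per nesting level, nesting deepens one level at a time, a line followed by a deeper line is a block header, and every other line is a statement that gets a semicolon. The name `indent_to_braces` is hypothetical, and a real tool would presumably work on a syntax tree rather than raw lines.

```rust
/// Translate an indentation-structured sketch into brace-and-semicolon form.
/// Assumptions (illustrative only): 4 spaces per level, no tabs, nesting
/// grows by one level at a time, no multi-line expressions, and blank
/// lines are skipped.
fn indent_to_braces(src: &str) -> String {
    // Collect (depth, text) for every non-blank line.
    let lines: Vec<(usize, &str)> = src
        .lines()
        .filter(|l| !l.trim().is_empty())
        .map(|l| {
            let text = l.trim_start();
            ((l.len() - text.len()) / 4, text.trim_end())
        })
        .collect();

    let mut out = String::new();
    for (i, &(depth, text)) in lines.iter().enumerate() {
        // Depth of the following line; the end of input counts as depth 0.
        let next_depth = lines.get(i + 1).map_or(0, |&(d, _)| d);
        let pad = "    ".repeat(depth);
        if next_depth > depth {
            // A deeper line follows, so this line is a block header.
            out.push_str(&format!("{pad}{text} {{\n"));
        } else {
            out.push_str(&format!("{pad}{text};\n"));
            // Close every block that ends after this line, innermost first.
            for d in (next_depth..depth).rev() {
                out.push_str(&format!("{}}}\n", "    ".repeat(d)));
            }
        }
    }
    out
}

fn main() {
    let sketch = "if size == 0\n    panic!(\"Creating empty mesh!\")\nlet n = size * 2\n";
    print!("{}", indent_to_braces(sketch));
}
```

The reverse direction (braces back to indentation) is not shown; making the round trip lossless is where most of the real work would be.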

5 Likes

I would feel very uncomfortable if this

let list = [
    f
    g
]

was different from this

let list = [
    f g
]

There's a reason that even Haskell has commas in their lists.
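In current Rust, the comma and the call parentheses carry exactly this distinction, regardless of line breaks. A small sketch, where `f`, `g`, `two_items`, and `one_item` are hypothetical names chosen for illustration:

```rust
fn f(x: i32) -> i32 {
    x * 10
}

fn g() -> i32 {
    2
}

/// Two elements: the comma separates them, wherever the line breaks fall.
fn two_items() -> Vec<i32> {
    vec![
        f(1),
        g(),
    ]
}

/// One element: g's result is passed to f.
fn one_item() -> Vec<i32> {
    vec![f(g())]
}

fn main() {
    println!("{:?} {:?}", two_items(), one_item());
}
```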

7 Likes

The only syntax change that I would like to see is to use commas in multiline expressions the way Elm does at the beginning of the newline as opposed to the end of the previous line.

module Shapes exposing
    ( Point(..)
    , Shape(..)
    , surface
    , nudge
    , baseCircle
    , baseRect
    )

That's only because it makes the alignment easier to follow. In fact, I think this is more accurately a formatting preference than a syntax preference. I don't find your example

at all preferable.

2 Likes

To add to what has already been said: I've started working with a Python codebase at my job, and I don't find it at all intuitive. I am used to semicolons and such, because I learnt Delphi, C++ and Rust previously. For someone used to that style, your "improvements" are very detrimental to readability.

Also, I tend to forget the colons after function headers, for loops, if/else statements, etc. in Python, and since it is a dynamic, interpreted language, the painful reality is that you have to execute the whole program as many times as you have errors and wait all the time for execution to get to the point of the error before you see it. Aaargh! Rust, being a compiled language, can alleviate that somewhat, but still, don't just create a weird syntax just because you can.

Just my 2c.

5 Likes

Let's entertain the thought:
If I follow, my following code:

let size: usize = full_radius as _ * 2;
if size == 0 {
    panic!("Creating empty mesh!");
}
let mut value_map = HashMap::<(isize, isize), usize>::new();
let mut points = Vec::<[f32; 4]>::new();
let (sin, cos) = (PI / 4.0).sin_cos();
for i_idx in 0..size {
    let i = i_idx as f32;
    for j_idx in 0..size {
        let j = j_idx as f32;
        let point = [2.0 * (i * cos - j * sin) - full_radius, (i * sin + j * cos) - full_radius];
        let dist = (point[0] * point[0] + point[1] * point[1]).sqrt();
        if dist <= full_radius {
            value_map.insert((i_idx as isize, j_idx as isize), points.len());
            points.push([point[0], 0.0, point[1], 0.0]);
        }
    }
}

Would become the following:

let size = 2 * full_radius
if size == 0 
    panic "Creating empty mesh!"

let mut value_map = <
    (
         isize
         isize
    )
    usize
> HashMap new
let mut points = [f32 4] Vec new
let (sin cos) = sin_cos (PI / 4.0)
for i_idx in 0..size 
    let i = f32 i_idx
    for j_idx in 0..size 
        let j = f32 j_idx
        let point = [
            2.0 * (i * cos - j * sin) - full_radius
            (i * sin + j * cos) - full_radius
        ]
        let dist = (point[0] * point[0] + point[1] * point[1]) sqrt
        if dist <= full_radius 
            value_map insert (
                (
                    isize i_idx
                    isize j_idx
                )
                points len
            )

I'm going to be frank and say that that is simply unappealing to the eye; there seems to be a lack of a strong, typed (literally, as in character-wise) structure, and it relies on rather implicit rules. Writing this felt like dancing, something that feels elegant and cool to do but not so useful to get around with.

If I must be honest this reminds me of a push-pop/stack calculator.

5 Likes

If you want information on whitespace-based languages, including the reason why Python has those colons (it's based on a study run on students), you should probably read the Python-history blog. Similarly, if you want to argue in favor of more syntactic redundancy, I suggest some of Larry Wall's posts.

But even if it was scientifically proven that whitespace-based syntax was better, it is unlikely that it would be better enough for the Rust project to accept the cost of maintaining two parsers, the learning curve and fragmentation of having two dialects, and the even greater fragmentation that would result if fully custom syntaxes were supported and encouraged. Certainly, Rust is going to be subject to some dialectization, anyway, but there are advantages to delaying it.

Though I guess if I'm going to cite Perl as something to look at, arguing that "dialects are bad" isn't coming across as very convincing, since Wall makes a point of how pluralism and dialects are a good thing. But, then again, he also makes a point of how originality is overrated, and the Rust team are certainly willing to agree on how Rust liberally borrows from other languages (I'm pointing at Cyclone, here) and is intentionally designed to be easy to do FFI with. For someone who self-identifies as a postmodern linguist, it shouldn't be a surprise that the PM post can be read both ways.

Anyway, twelve posts that all say "I like this and not that" aren't really much fun to read, nor are they very convincing. I, at least, prefer my flame wars to provide deep rabbit-holes of links to get myself lost in.

Also, what you're asking for already exists. Because of course it does. Have fun. :smile:

11 Likes

That colon in Python sticks out like a giant wart on your friend's face.

In a language where white space is used for delimiting it serves no useful purpose and is very annoying.

For example, in an "if" construct there should never be anything after the conditional expression; it is terminated by a line end. The work of the if block is then an indented list of statements underneath, even if there is only one.

There would never be any confusion or need for a colon if the job was done properly. Python failed.

I have always disagreed with the Larry Wall idea of offering many ways to say the same thing. It creates a runaway explosion of chaos. For example, if there are three ways to write each of three statements in three lines of code, then we have twenty-seven ways to say the same thing. That is twenty-seven times more confusing than if there were only one!

1 Like

Let me put it this way:

The colon is not there for your benefit. It's there for CS students with no prior programming experience. The fact that it's not technically necessary (Guido says himself that it isn't) doesn't matter. It's used in English to denote block headers, so it's also present in Python; using enlarged text would've been better, but you can't do that in ASCII.

2 Likes

Edit:

My post below was flagged by "the community", whoever they are, "because the community feels it is offensive, abusive, or a violation of our community guidelines."

I hope we can agree there is nothing offensive or abusive in the text below. That would certainly be my last intention here.

My apologies if I have inadvertently transgressed some paragraph in said guidelines. I will read them again and try to find out what that transgression may be.

In the meantime, I'm at a loss to know what the issue is.

Clearly not. It has no benefit :slight_smile:

The fact that programming is a technical discipline, no matter who you are teaching it to, rather implies that it does matter.

So I have heard. This argument holds no water, for the following reasons:

  1. By that logic Python should also require periods at the end of each statement. It does not.

  2. Students of the arts use all kinds of weird and wonderful notations in their fields. Think musicians, linguists, choreographers, etc. They handle it. I see no reason why programming requires special treatment for them.

  3. There was no such terminator in BASIC, a language designed for exactly similar purposes, which famously did a very good job.

Clearly the colon in Python is just muddled thinking.

3 Likes

...but aren't things delimited by space 
semantically different than 
things delimited 
by 
lines
.

    Aren't things
   indented 
        different than 
   things that
        are not

Isn't it enough to note that we expect formatted code? Why not make the visual structures we expect semantic, instead of mere informal documentation that is easily in conflict with intent... a mere facade, if you will, however revealing, to the underlying meaning?

No, they are not semantically different, and it would be strange if they were different.

1 Like

I have a strange example. Consider the sentence:

This line is longer
than this one.

True or false?

1 Like

...actually, more like from this...

let size: usize = full_radius as _ * 2;
if size == 0 {
    panic!("Creating empty mesh!");
}
let mut value_map = HashMap::<(isize, isize), usize>::new();
let mut points = Vec::<[f32; 4]>::new();
let (sin, cos) = (PI / 4.0).sin_cos();
for i_idx in 0..size {
    let i = i_idx as f32;
    for j_idx in 0..size {
        let j = j_idx as f32;
        let point = [2.0 * (i * cos - j * sin) - full_radius, (i * sin + j * cos) - full_radius];
        let dist = (point[0] * point[0] + point[1] * point[1]).sqrt();
        if dist <= full_radius {
            value_map.insert((i_idx as isize, j_idx as isize), points.len());
            points.push([point[0], 0.0, point[1], 0.0]);
        }
    }
}

to this...

//isize, usize, and f32 symbols can serve as types and value constructors
//in F#, type constructors/adapters begin with an upper case letter: ISize, USize, F32
//HashMap and Vec are constructors that take type parameters but, 
//  in the case of, an initial capacity USize value
//The use of variables as value accessors as opposed to typed 
//  constrained value accessors is respected and admired
//  in this treatment
//'new' is used but it is somewhat redundant
//'(,)' tuple constructor, '[,]' array constructor 
//note: there is no treatment distinguishing static, module, 
//  and instance methods here, sorry.

let size = (USize 2) * full_radius
if size == 0 
    panic "Creating empty mesh!"

let mut value_map = new (HashMap (isize,isize) usize)
let mut points = new (Vec F32 4)
let (sin, cos) = sin_cos (PI / 4)
for i_idx in 0..size 
    let i = F32 i_idx
    for j_idx in 0..size 
        let j = F32 j_idx
        let (x,y) = (  
            //tuple split across lines need no comma
            (F32 2) * (i * cos - j * sin) - full_radius
            (i * sin + j * cos) - full_radius
            )
        let dist = sqrt (x * x + y * y)
        if dist <= full_radius 
            push value_map (i_idx, j_idx) points.len
            push points [x, 0.0, y, 0.0]

or (...using '|>' pipe operator with implicit stack usage optimization, of course where f '|>' g = g f = g(f) )...

match (USize 2) * full_radius with
0    => panic "Creating empty mesh!"
size =>
    let mut value_map = new (HashMap (isize,isize) usize)
    let mut points = new (Vec F32 4)
    let (vsin, vcos) = sin_cos (PI / 4)

    range 0..size
    |> map |ix| (ix, F32 ix)
    |> zip (range 0..size)
    |> map |iy, (ix, fx)| (ix, fx, iy, F32 iy)
    |> map |ix, fx, iy, fy| 
        let x = (F32 2) * (vcos * fx - vsin * fy) - full_radius
        let y = (vsin * fx + vcos * fy) - full_radius
        (ix, x, iy, y)
    |> filter |_, x, _, y| 
        let distance = sqrt (x * x + y * y)
        distance <= full_radius
    |> iterate |ix, x, iy, y|
        push value_map (ix, iy) points.len
        push points [x, 0.0, y, 0.0]
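
For comparison, a hedged sketch of how a similar pipeline can be written in current Rust with iterator adapters. The function name `build_mesh` and the `f32` type of `full_radius` are assumptions, since the quoted snippet does not show the declaration. It uses `flat_map` rather than a zip because the nested loops in the quoted code visit every (i, j) pair.

```rust
use std::collections::HashMap;
use std::f32::consts::PI;

/// Build the rotated grid from the quoted loops using iterator adapters.
/// Assumption: `full_radius` is an f32; the original snippet does not say.
fn build_mesh(full_radius: f32) -> (HashMap<(isize, isize), usize>, Vec<[f32; 4]>) {
    let size = (full_radius as usize) * 2;
    if size == 0 {
        panic!("Creating empty mesh!");
    }
    let (sin, cos) = (PI / 4.0).sin_cos();
    let mut value_map = HashMap::new();
    let mut points = Vec::new();

    (0..size)
        // Every (i, j) pair, matching the two nested `for` loops.
        .flat_map(|i| (0..size).map(move |j| (i, j)))
        // Rotate by 45 degrees, stretch x, and shift by the radius.
        .map(|(i, j)| {
            let (fi, fj) = (i as f32, j as f32);
            let x = 2.0 * (fi * cos - fj * sin) - full_radius;
            let y = (fi * sin + fj * cos) - full_radius;
            (i, j, x, y)
        })
        // Keep only points within `full_radius` of the origin.
        .filter(|&(_, _, x, y)| (x * x + y * y).sqrt() <= full_radius)
        // Record each kept point and its index in `points`.
        .for_each(|(i, j, x, y)| {
            value_map.insert((i as isize, j as isize), points.len());
            points.push([x, 0.0, y, 0.0]);
        });

    (value_map, points)
}

fn main() {
    let (value_map, points) = build_mesh(2.0);
    println!("{} points, {} map entries", points.len(), value_map.len());
}
```

The closures, commas, and parentheses stay, but the chain reads top to bottom much like the piped version.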