White space dependent syntax

[Moderator note: This is a classic flame war topic, so I want to lay out some ground rules. Significant whitespace and related syntactic choices have been debated extensively all over the internet, and we don't need to reproduce the whole debate here. Please focus on providing new information or reacting to specific Rust-related ideas, and do not repeat points that have already been made. Remember that large changes must meet a high threshold of cost/benefit and consensus. Everyone, please accept that there are pros and cons, and different people may put different weights on them.]


Apparently, we do think spacing is significant, since we indent and line-break our code despite its syntactic irrelevance, and we are shocked when that structure does not reflect what the symbols express.

However, your response has revealed to me that white space dependent languages have us write code in two dimensions while white space independent languages have us write code in one dimension only (linearly). Thanks for that insight. :slight_smile:

Consequently, white space independent languages basically throw away our visualization machinery and instead have us crawl through the code like a worm, collating meaning from one lone symbol to the next. Still, we reject this myopia, insist on proper formatting, and hope that it is consistent with that serpentine text.

I wonder what three dimensional code would look like. :slight_smile:

Syntax highlighting.


Whitespace is an esoteric programming language. It depends on spaces for syntax validity, as Python does, making the text polyglot. Recently I have been facing a problem with whitespace syntax while coding in Python. I have searched many websites, along with Facebook, to sort this out.

It probably involves futures, async and await. :slight_smile:

More seriously, when writing assembler for the Intel i860 RISC processor one had a few dimensions to play in:

1st Dimension - Instructions are listed down the page in execution order as normal

2nd Dimension - You could write two columns of instructions, integer operations on the left and floating point operations on the right, thus making use of the opportunities for parallel execution of integer and float ops.

3rd Dimension - When using instructions that made use of the pipeline, the result of an instruction on one line would actually appear in the destination register specified in another instruction three or four instructions down the page!

Needless to say, it was almost impossible for humans to write optimal code for that beast. As it happened, the compiler writers never figured out how to get their compilers to do it either. Hence the i860 never performed well and flopped.

Which is why modern processors do all that instruction reordering, parallel execution scheduling, pipeline management, etc on the fly in the hardware.

Oddly enough, Intel did not learn from this experience; they made much the same mistakes in the Itanium design.


and for those who don't have color then font types, font sizes, and font styles (bold, underline, italics, capitalization, etc.) can become the range of that third dimension. :wink:

On the contrary, you already 'properly format' your code with new lines and indents. The weird syntax is to then redundantly decorate it with semicolons and commas as though your intent were not clear. Strip away those semicolons and typically your intent remains crystal clear. Indents and new lines are really how we already read code, and thus how we require it to be laid out.

I don't do it, this is the job for rustfmt.
(well, it's not always true, but in many cases it is)

As a practical matter it is harder to work with white space blocked/delimited source code and more error prone:

When copying or cutting and pasting code around as you reorganize things, the block indentation can easily be messed up.

In the absence of braces around blocks and such, the semantics of the structure does not get copied around with the text.

When you are confronted with code that has indentation errors, they can be pretty hard to spot. The compiler will not know something is wrong, as it does not have the semantic support of the braces and other punctuation.

So many times we see Python code posted to the net that is impossible to understand or use without significant effort to recover the lost semantics.

All in all, those "redundant" semicolons, braces, etc. are very helpful.

Some call it "weird", I call it "fault tolerant".


Yes, tooling will have to be improved and the use of proportional (non-monospaced) fonts discouraged. Presently, a lot of editors already support block indenting, in particular after block pasting, by simply pressing tab or shift-tab. Additionally, most source-code markup already renders source code in monospaced fonts.

Missing a comma or semicolon is generally more difficult to spot than an actual incorrect indentation, especially with 'indent marker' tooling that shows a virtual vertical line of indent markers.
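One concrete case of a hard-to-spot missing comma, sketched in Python (the list and its contents are made up for illustration): adjacent string literals are concatenated, so the missing comma is not reported as an error at all. This is specific to languages with implicit literal concatenation; many compilers reject a missing separator outright.

```python
# Hypothetical list: the comma after "beta" is missing.
names = [
    "alpha",
    "beta"
    "gamma",
]

# Python joins the adjacent literals instead of raising an error.
print(names)       # ['alpha', 'betagamma']
print(len(names))  # 2
```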

...that's a lot of work to instruct rustfmt. :slight_smile:

Perhaps the reverse might be more expeditious...have a 'fmtrust' that instead adds all those semicolons, commas, and parentheses to source code that is already properly indented.
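A toy sketch of what such a tool might do, written in Python for brevity. Everything here is an assumption made for illustration: the name `add_braces`, the rule that a line followed by a deeper-indented line opens a block, space-only indentation in multiples of `width`, and indentation that never deepens by more than one level at a time. It is nowhere near a real Rust front end.

```python
def add_braces(src: str, width: int = 4) -> str:
    """Toy converter: indentation-structured text -> braces and semicolons."""
    # Drop blank lines; they carry no structure in this sketch.
    lines = [ln.rstrip() for ln in src.splitlines() if ln.strip()]
    # Indentation depth of each line, assuming spaces only.
    depths = [(len(ln) - len(ln.lstrip(" "))) // width for ln in lines]
    out = []
    for i, (ln, depth) in enumerate(zip(lines, depths)):
        next_depth = depths[i + 1] if i + 1 < len(lines) else 0
        pad = " " * (width * depth)
        if next_depth > depth:
            # A deeper line follows, so this line opens a block.
            out.append(f"{pad}{ln.strip()} {{")
        else:
            out.append(f"{pad}{ln.strip()};")
            # Close every block that ends after this line.
            for d in range(depth - 1, next_depth - 1, -1):
                out.append(" " * (width * d) + "}")
    return "\n".join(out)


print(add_braces("if a == b\n    x = 2\n    y = 3\nz = 4\n"))
```

Note that the output only encodes whatever the indentation already says: if a line was indented by mistake, the generated braces faithfully reproduce the mistake.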

My point was, to me, semicolons and curly braces are more familiar and thus more readable. My editor gives me red squiggles if I forget a comma or semicolon.

Thus, I am more productive when I must use curly braces and semicolons. I have to constantly remind myself not to type semicolons in Python. The IDE gets more tangled up trying to help me with auto-indenting there than it does in a language with curly braces (or begin and end) and semicolons.

I don't want to say semicolons and curly braces are objectively better, but they definitely are better for my productivity, and I suspect for that of a lot of experienced Rustaceans as well.


My compilers spot missing commas, semicolons, braces, etc. in no time.

Conversely a compiler cannot help you with a white space indentation error:

if a = b
    x = 2
    y = 3
z = 4

Should that z be where it is or did I absentmindedly lose an indent?
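To make the question concrete, here is the same shape as a minimal Python sketch (the function names `z_outside` and `z_inside` and the zero defaults are hypothetical). Both indentations are valid, so the interpreter accepts either one without complaint, yet they compute different results when the condition is false.

```python
def z_outside(a, b):
    # z is assigned unconditionally, as in the snippet as written.
    x = y = z = 0
    if a == b:
        x = 2
        y = 3
    z = 4
    return x, y, z


def z_inside(a, b):
    # z is assigned only when the condition holds (one extra indent).
    x = y = z = 0
    if a == b:
        x = 2
        y = 3
        z = 4
    return x, y, z


print(z_outside(1, 2))  # (0, 0, 4)
print(z_inside(1, 2))   # (0, 0, 0)
```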

Using proportional fonts for source code really does not appeal.


When I first learned 'C' I thought the semicolon was redundant, since a line break almost always followed it anyway and a line break was itself a symbol (even parentheses for application were dubious). I can understand the semicolon if you typically stacked a bunch of statements on the same line, but that is so rare. Consequently, an exception has established the rule. So, when Pascal also did it, I fell in line. When C#, Java, JavaScript, etc. also did it, I didn't even bat an eye. Now Rust, despite its advances (variables as gated access to values instead of values themselves), is also hauling along this legacy...

"The line must be drawn here...No further." - Picard :slight_smile:

Ending a block prematurely with a misplaced brace is like me explaining this possibility without code as a visual aid. You have to snake through the symbols. A casual visual inspection won't do. Consistent with this fact is that in your example the potential problem is readily apparent. Note that the code is not necessarily wrong, but it does mean a specific thing: x and y are conditionally set while z is not.

[Again, folks, no more back-and-forth about the general philosophy and trade-offs of significant whitespace. Feel free to have that discussion elsewhere. Perhaps at this point if there are people who agree on a need for Rust-specific tooling or changes, they can start working on specifying or prototyping those as a next step for testing consensus.]

