Lifetimes are whitespace-sensitive

Why does Rust require lifetimes to be on one line? I was just trying to meme around with formatting somebody's code and noticed that lifetimes spanning multiple lines raise a syntax error. Is this necessary for Rust, and does it add any value to the language?

#![feature(non_ascii_idents)]
#![allow(non_snake_case, non_camel_case_types, uncommon_codepoints, bindings_with_variant_name, unused_mut, dead_code)]

#[derive(Debug)]
enum 
m͏ut
{
m͏ut
}

impl
m͏ut
{
    fn
m͏ut
    <'
m͏ut
    >(&
mut 
    self,
m͏ut
    : &'
m͏ut
mut 
m͏ut
    ) -> &'
m͏ut
mut 
m͏ut
    {
m͏ut
    }
}

fn main() {
    let
mut 
mu͏t
    : &
mut 
m͏ut
    = &
mut 
m͏ut
    ::
m͏ut
    ;
    println!("{:?}", &
mut 
mu͏t
    );
}

Errors:

   Compiling playground v0.0.1 (/playground)
error: unterminated character literal
  --> src/main.rs:16:6
   |
16 |     <'
   |      ^

error: aborting due to previous error

error: could not compile `playground`.

To learn more, run the command again with --verbose.

A lifetime like 'a is a single token, so this is due to how the lexer is defined.
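As a small sketch of what that means in practice (the function names first and last are made up for illustration): whitespace, including newlines, is fine between a lifetime and its neighbouring tokens, just not between the ' and the name.

```rust
// `&` and `'a` are separate tokens, so a space between them is allowed.
fn first<'a>(items: & 'a [i32]) -> & 'a i32 {
    &items[0]
}

// A lifetime may sit on its own line as long as `'` and `a` stay together.
fn last<
    'a
>(
    items: &
        'a
        [i32],
) -> &'a i32 {
    &items[items.len() - 1]
}

fn main() {
    let v = [1, 2, 3];
    println!("{} {}", first(&v), last(&v)); // prints "1 3"
}
```

So the original snippet would presumably get past this particular error if each ' were kept on the same line as the name that follows it.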

Historically, I think this was influenced by the OCaml grammar, where identifiers can contain single quotes, so for example foo' is a valid variable name. (The Rust compiler was originally written in OCaml.)

I don't think there's any fundamental reason it had to be defined this way. As far as I can tell, allowing whitespace between the single-quote and the identifier would not have introduced any ambiguity to the Rust grammar. (Changing the definition of a token at this point might be backward-incompatible for macros, though.)
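To illustrate the macro concern, here is a rough sketch showing that macros can observe token boundaries. The count_tts macro and the two constants are hypothetical names invented for this example; the macro just counts the token trees it receives.

```rust
// Hypothetical helper: counts the token trees passed to it.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// `'a` arrives as one token tree, not as `'` followed by `a`.
const LIFETIME_ALONE: usize = count_tts!('a);
// `&`, `'a`, and `mut` are three separate token trees.
const REF_LIFETIME_MUT: usize = count_tts!(&'a mut);

fn main() {
    println!("{} {}", LIFETIME_ALONE, REF_LIFETIME_MUT); // prints "1 3"
}
```

If the quote and the name were ever split into two tokens, a macro like this would start seeing different counts, which is the kind of breakage I had in mind.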

Yep, there's even a lifetime fragment specifier so that macros can match on lifetimes directly.

macro_rules! take_lifetime {
    ($lt: lifetime) => {};
}

take_lifetime!{'a}

I think the succinct answer to the initial question is that lifetimes are special identifiers that start with '. In other words, the ' here is not a prefix operator; it is part of the name. In Rust, whitespace is not permitted in the middle of any identifier.
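This probably also explains the wording of the error. A ' can start either a lifetime or a character literal, and with nothing usable right after it the lexer seems to fall back to reporting a character literal that never closes. A minimal sketch of the two readings, using a made-up classify macro:

```rust
// Hypothetical macro: reports which kind of token it was given.
macro_rules! classify {
    ($lt:lifetime) => { "lifetime" };
    ($lit:literal) => { "literal" };
}

fn main() {
    println!("{}", classify!('a));  // prints "lifetime"
    println!("{}", classify!('a')); // prints "literal"
}
```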

They once almost were: r#ident with space#.

I don't consider that to be a counter-example; raw strings (and thus raw-string idents) can contain virtually anything, including Unicode confusables.

(Just to clarify, raw identifiers in Rust are restricted to the same characters as normal identifiers. They cannot contain whitespace or Unicode confusables. The link above was to an alternative proposal that was never accepted.)
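For reference, a short sketch of what raw identifiers do allow: using a keyword as an ordinary name. The function below is invented for illustration.

```rust
// `r#match` and `r#type` are raw identifiers: keywords used as ordinary names.
fn r#match(r#type: &str) -> usize {
    r#type.len()
}

fn main() {
    println!("{}", r#match("abc")); // prints "3"
}
```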
