Turbofish operator `::<>`, why is it ambiguous?

When calling a generic function with an explicit type argument, one needs to add :: before the type list, which differs from how the function is declared. The same thing happens for generic structs. E.g.:

fn f<T>() {}

fn main() {
    f::<i32>(); // instead of f<i32>(), must have `::`
    let v = Vec::<i32>::new(); // instead of Vec<i32>::new();
    let w = <Vec<i32>>::new(); // This also works.
}

I have asked about this before (newbie question link). There is also a related RFC.

My question:
The discussion around ::<> says that dropping the :: would make the syntax ambiguous in some cases, and that removing the :: requirement would have a non-negligible cost (to the compiler and so on) which needs to be carefully evaluated. These statements intrigue me, as someone who knows nothing about compiler implementation. Can someone explain why it can be ambiguous, and why it is not "that easy" to drop the :: requirement? How can a "<" after a generic struct or function be ambiguous? It's surely not a "less than" or (half of) a bit shift, right? And why would removing the restriction be non-trivial? After all, the Rust compiler can already produce an error message telling you to add ::. Doesn't that imply the compiler already knows what I want? Also, C++ has done it since day one, so what's really the hard part? (These are lame "criticisms" from a user who has never written a compiler ^_^. I'm just curious to know why.)

It totally can be a comparison or shift operator. For example, this is a valid program:


#[derive(PartialEq, PartialOrd)]
struct Vec;

fn main() {
    let x: bool = Vec < Vec;
}
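
The shift case works the same way. Here is a minimal sketch, assuming a hypothetical unit struct Marker (a name made up for illustration, not taken from the posts above) that implements std::ops::Shl. Because a unit struct's name is also a value, Name << Name is a perfectly ordinary expression:

```rust
use std::ops::Shl;

// Hypothetical unit struct, introduced only for this illustration.
#[derive(Debug, PartialEq)]
struct Marker;

// Implementing `Shl` makes `Marker << Marker` a valid shift expression.
impl Shl for Marker {
    type Output = Marker;

    fn shl(self, _rhs: Marker) -> Marker {
        Marker
    }
}

fn shifted() -> Marker {
    // The `<<` after the name `Marker` is a shift operator here,
    // not the start of a generic argument list.
    Marker << Marker
}

fn main() {
    assert_eq!(shifted(), Marker);
}
```

So the token after a name alone does not tell the parser whether a generic argument list is starting.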

C++ is in a really bad state in this regard. Apparently, parsing < there requires the parser to do name resolution, and in general, parsing C++ is undecidable.

But here Vec is not a generic struct, right? The compiler can tell that the Vec in main is just a unit struct value.

I see. So what you are saying is that compilation includes a "name resolution" step which determines what each symbol is (a generic struct? a variable? a function? a generic function?). Once the name is resolved, there is no ambiguity, since a "<" after a generic type is not ambiguous (but we need to know it's a generic type first). Doing name resolution before or during the early syntax parsing is somewhat expensive and messy, I guess? So the Rust compiler parses the syntax first, before knowing what each identifier really means, and only then does name resolution. At that stage, the parsing of "<" can be ambiguous. Am I getting the point?

Languages are easier to parse when the parser doesn't have to track names, scopes and types during parsing. Rust is like that, and what it sees when it first parses the syntax is:

#[attribute(Name, Name)]
struct Name;

fn name() {
    let name: name = Name < Name;
}

C made a mistake with typedef: a parser can't know whether foo * bar is a multiplication or a declaration of a pointer named bar without knowing whether foo is a type. So C parsers have to be more complicated and keep track of type names during parsing, and C syntax is sensitive to the order of definitions.

This is valid Rust too:

fn main() {
    let x: bool = Vec < Vec;
}

#[derive(PartialEq, PartialOrd)]
struct Vec;

and here the parser couldn't know the meaning of Vec < before reaching the struct Vec definition further down. In extreme situations, the earlier syntax could change the meaning of later syntax and create a chicken-and-egg problem in parsing.
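
To make the ambiguity concrete, here is a small sketch. The function pick and all variable names are hypothetical, chosen only for illustration. The token shape name < name , name > ( name ) is accepted today as a tuple of two comparisons, and it is the same shape a generic call would have if the :: were optional:

```rust
// Hypothetical generic function; the type parameters are unused on purpose.
fn pick<A, B>(x: i32) -> i32 {
    x
}

#[allow(unused_parens)]
fn two_comparisons(a: i32, b: i32, c: i32, d: i32) -> (bool, bool) {
    // Parsed as a 2-tuple: `a < b` and `c > (d)`.
    // Without the `::` rule, the same tokens could also be read as
    // a call of a generic `a` with arguments `<b, c>` applied to `(d)`.
    (a < b, c > (d))
}

fn one_generic_call(d: i32) -> i32 {
    // The turbofish tells the parser up front that `<` opens generic arguments.
    pick::<u8, u16>(d)
}

fn main() {
    assert_eq!(two_comparisons(1, 2, 3, 4), (true, false));
    assert_eq!(one_generic_call(7), 7);
}
```

Since the parser has not resolved names at this point, it cannot use the fact that a is a variable and pick is a function to choose between the two readings. The :: settles it syntactically.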

I see. Thanks for the detailed explanation!

I believe this is the best answer ever given to the "why is it ambiguous" question: https://github.com/rust-lang/rust/blob/master/src/test/ui/bastion-of-the-turbofish.rs .

Haha you got me!