Some questions about literals

Hello,
What is the difference between a "string" and a "literal string", and between an "integer" and a "literal integer"?
In the following line, which one is "integer" and which one is "literal integer"?

print!("{} + {} = {}", 34, 80, 80 + 34);

Thank you.


The expressions "string" and "integer" refer to types, or to any value of those types (including literals).

In contrast, a literal refers exclusively to a value that is spelled out explicitly in the code, and is thus known at compile time. So "a string literal" is also "a string", and "an integer literal" is also "an integer", but the converse is not true in general.


Following on from the previous response, in your example:

print!("{} + {} = {}", 34, 80, 80 + 34);

"{} + {} = {}" is a string literal that is 12 bytes long.
34, 80, and 80 + 34 are all integer literals.

Edit (see below): 34 and 80 are integer literals. 80 + 34 is an expression composed of two literals that evaluates to an integer value. Because the value 114 is not literally included in the source code, the expression 80 + 34 is not an integer literal.

On the other hand, if you had code like this:

let mut s = String::from("{} + {}");
s.push_str(" = {}");

// get x and y from the user at run-time
let i: usize = x + y;

then the non-literal string s now holds the same contents as the string literal "{} + {} = {}".

Likewise, if x and y are 34 and 80, the integer variable i now has the same value as evaluating the expression containing two integer literals: 80 + 34.
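A runnable sketch of the comparison above might look like the following. The function names are made up for illustration, and the fixed values 34 and 80 merely stand in for user input.

```rust
/// Builds the text at run time from two pieces (hypothetical helper).
fn build_at_runtime() -> String {
    // The pieces are literals, but the resulting `String` is a run-time value.
    let mut s = String::from("{} + {}");
    s.push_str(" = {}");
    s
}

/// Adds two values; inside the function nothing is a literal any more.
fn add(x: usize, y: usize) -> usize {
    x + y
}

fn main() {
    let s = build_at_runtime();
    // Same contents as the literal, even though `s` is not a literal.
    assert_eq!(s, "{} + {} = {}");
    assert_eq!(s.len(), 12);

    // 34 and 80 stand in for user input here (assumption for this sketch).
    let i = add(34, 80);
    assert_eq!(i, 80 + 34);
    println!("{s:?} {i}");
}
```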


Nitpick: 80 + 34 is an integer value (or, more specifically, an expression evaluating to an integer value) built from two integer literals. 80 and 34 are literals, but 80 + 34 is not a literal; it is a compound expression.


Point taken. In that case, 80 + 34 is an expression that evaluates to an integer value. The result of evaluation is not "literally" included in the source code, so it's not an integer literal in itself.

The print! macro and related ones are a good example of why this distinction can be relevant. print! expects, as its first argument, a string literal[1]. A string literal is the literal occurrence of text between quotation marks in your source code. A string literal in Rust evaluates to a string slice of type &'static str (or a “string slice reference”, depending on who you ask: some people call things like &[T] or &str a “slice”, while others reserve “slice” for the types [T] and str themselves).

If you try

let s = "hello";
print!(s);

then it won’t compile (though the compiler gives a good suggestion on how to fix the problem).

error: format argument must be a string literal
 --> src/lib.rs:3:12
  |
3 |     print!(s);
  |            ^
  |
help: you might be missing a string literal to format with
  |
3 |     print!("{}", s);
  |            +++++
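The fix suggested by the compiler can be sketched as follows. format! is used next to print! only so that the result can be checked; both take their format string the same way. The helper name show is an assumption for this example.

```rust
/// Hypothetical helper: the first argument of format! is the literal "{}",
/// and `s` is passed as an ordinary value to be formatted.
fn show(s: &str) -> String {
    format!("{}", s)
}

fn main() {
    let s = "hello"; // `s` is a variable of type &'static str, not a literal

    // print!(s);     // error: format argument must be a string literal
    print!("{}", s); // ok: "{}" is the literal, `s` is just an argument

    assert_eq!(show(s), "hello");
}
```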

Even more problematic is trying[2]

let s = "hello {} How are you?";
print!(s, "world!");

While something similar would work with printf in C (at least I believe so), the string literal passed to macros like print! is actually inspected at compile time. Macros are essentially functions that run at compile time and manipulate Rust syntax, so it’s unsurprising that they care about syntactic criteria such as a “string literal” versus other expressions (such as the path expression s in print!(s), which does not compile), even though both have the same type. The macro is not interested in the potential run-time value of the expression anyway: it reads the format string at compile time and uses it to build more complex code that combines the text parts of the format string, the formatting information between the {}s, and the function calls for printing the additional arguments.
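If the template really is only known at run time, one possible workaround is plain text substitution. This is a minimal sketch, not the standard formatting machinery: the helper fill_first_placeholder is hypothetical, and it handles neither format specifiers like {:?} nor escaped braces.

```rust
/// Hypothetical helper: replaces the first "{}" in `template` with `arg`.
/// This is ordinary string replacement, not format-string parsing.
fn fill_first_placeholder(template: &str, arg: &str) -> String {
    template.replacen("{}", arg, 1)
}

fn main() {
    let s = "hello {} How are you?";

    // print!(s, "world!"); // rejected: format argument must be a string literal
    let line = fill_first_placeholder(s, "world!");
    assert_eq!(line, "hello world! How are you?");
    println!("{}", line);
}
```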


  1. There are two types of string literals in Rust, “normal” ones and “raw string literals”: different syntax for the same kind of thing, where the latter can be nicer to write if you would otherwise have to do lots of escaping. print! and related formatting macros accept both styles of string literal as their format string argument.

  2. There is no good fix if you don’t know the format string at compile time; the Rust formatting API in the standard library simply does not support this.
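To illustrate the first footnote, here is a small sketch showing a normal and a raw string literal used as format strings. The function names and the example path are assumptions chosen for illustration.

```rust
/// Normal string literal: every backslash has to be escaped.
fn path_normal(name: &str) -> String {
    format!("C:\\Users\\{}\\notes.txt", name)
}

/// Raw string literal: backslashes are taken literally, `{}` is still a placeholder.
fn path_raw(name: &str) -> String {
    format!(r"C:\Users\{}\notes.txt", name)
}

/// Raw string literal with `#` delimiters, so it can contain double quotes.
fn quoted(word: &str) -> String {
    format!(r#"He said "{}"."#, word)
}

fn main() {
    assert_eq!(path_normal("ann"), path_raw("ann"));
    assert_eq!(quoted("hi"), "He said \"hi\".");
    println!("{}", path_raw("ann"));
}
```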


Yes, this works without compilation warnings on both GCC and Clang:

#include <stdio.h>

int main() {
    char format_string[] = "Hello %s, how are you?\n";
    char value[] = "world";
    printf(format_string, value);
    return 0;
}

This makes me curious. Why does Rust not allow us to do the same thing:

static format_string: &'static str = "Hello {}";

let value = "world";
println!(format_string, value);

Perhaps it's easier to require the programmer to always specify the string literal inline, but static values can be proven to have a fixed address and value at compile-time (or am I missing something here?)


Macros don't have access to the whole program; otherwise they would be too complex to implement efficiently. They can only use the exact tokens passed to them.
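A small macro_rules sketch can make this token-level view visible. The macro classify and the constant FORTY_TWO are hypothetical names for this example; the macro only reports which kind of syntax it was handed and never looks at a value.

```rust
// Classifies its input purely by syntax. The arms are tried in order.
macro_rules! classify {
    ($x:literal) => { "literal" };
    ($x:ident) => { "identifier" };
    ($x:expr) => { "other expression" };
}

// A compile-time constant; the macro still only sees the name, not the value 42.
const FORTY_TWO: i32 = 42;

fn classify_examples() -> [&'static str; 4] {
    [
        classify!(34),        // an integer literal
        classify!("hello"),   // a string literal
        classify!(FORTY_TWO), // a path to a const: just an identifier token
        classify!(80 + 34),   // built from literals, but not itself a literal
    ]
}

fn main() {
    println!("{:?} (FORTY_TWO = {})", classify_examples(), FORTY_TWO);
}
```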


static values cannot be proven to have a fixed value at compile-time; unsafe code can change the value of a static at runtime. const values can be proven to have a fixed value at compile time, but macros don't have access to the whole program, only the bits inside the parentheses (a macro invoked as macro!(macro can see this) can only see whatever replaces "macro can see this", and not the rest of the program).


Could you explain what you mean here? static items (not static mut) do have a fixed value; it's UB to try to change them (since every access goes through a shared reference).

Ah, right. Hence the importance of distinguishing "literal values" from non-literal ones.

I forgot about mutable static values. Non-mut static values cannot be mutated, I believe, but the point about macros only working with local code fragments is important.

Safety, as usual. This is a syntactically valid C program:

#include <stdio.h>
#include <string.h>

int main() {
    char buf[3000] = {0};
    for (int i=0; i < 1000; i++) {
        strcat(buf, "%d ");
        printf(buf, i);
        printf("\n");
    }
}

But whether it will finish execution or not depends, more-or-less, on the phase of the moon.

Rust doesn't like such things. And making it more flexible would require a significant change to the whole machinery.

That's not really enough, you would need const, not static. But yes, it can be done. That's how C++ std::format works. But Rust doesn't have template metaprogramming (TMP), and macros are not flexible enough to support that use case.


static mut items are a subset of static - you can currently treat any static mut item as a static item with extra abilities.

I’d call that an “impressive amount of UB”.[1] In fact, it’s so much UB that I believe even some novice C programmers might get a hunch something isn’t quite right with this code.


  1. Besides the more obvious and ever-growing number of missing variadic arguments in the printf calls, there’s also the fact that buf is one byte too small to hold the longest format string: 1000 copies of "%d " plus a final NUL byte.


Hello,
Thank you so much for your reply.
So, numbers like 34, 80, and 80 + 34 that are clearly written in the code are literals, and a number like 114 that is the result of an operation is not a literal. Am I right?

Yes, several things:

  1. Macros are a syntactic abstraction. Compilation has several phases, so "compile-time" is not one thing. Macros run after parsing but before type checking and const evaluation, so no matter how "compile-time known" something is, if it's not a literal, then macros don't stand a chance of evaluating it. The "compile-time known" is an imprecise description, as it confounds several levels of informedness; a literal is a stronger guarantee (i.e. "more compile-time known") than a const or static item.

  2. You can't even express "this must be a static or const" in the type system. A &'static str can be obtained in several ways, not only via string literals. For example, the following function produces a &'static str that depends on user input:

    fn return_leaky_user_input() -> &'static str {
        let mut s = String::new();
        std::io::stdin().read_line(&mut s).unwrap();
        Box::leak(s.into_boxed_str())
    }
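
A variant of the same idea that does not need stdin, so it can be checked directly, might look like this. The helpers leak_joined and takes_static are hypothetical names for this sketch.

```rust
/// Hypothetical helper: joins the pieces at run time and leaks the result,
/// yielding a `&'static str` that never appeared as one literal in the source.
fn leak_joined(parts: &[&str]) -> &'static str {
    Box::leak(parts.concat().into_boxed_str())
}

/// Accepts any `&'static str`; the signature cannot demand a literal.
fn takes_static(s: &'static str) -> usize {
    s.len()
}

fn main() {
    let from_literal: &'static str = "Hello {}";
    let from_runtime: &'static str = leak_joined(&["Hello ", "{}"]);

    // Same type and same contents, but only one of them is a literal.
    assert_eq!(from_literal, from_runtime);
    assert_eq!(takes_static(from_literal), takes_static(from_runtime));
}
```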
    

No, this is not quite right.

34 and 80 are integer literals (you can use them directly as integers).

34 + 80 is not an integer literal, because the compiler needs to evaluate it first.

An impressing case is that some built-in macros can be used as literals, which is an important fact when writing macros[1]: Rust Playground

print!(stringify!( a + b )); // ok
print!(include_str!("../Cargo.toml")); // ok

  1. For example, a common trick is to use include_str! with #[doc].
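
Another built-in macro that can sit in the format-string position is concat!, which joins literals at compile time. A minimal sketch, with format! standing in for print! so the result can be checked (the function names are assumptions for this example):

```rust
/// The format string is assembled from two literals by concat! at compile time.
fn sum_line() -> String {
    format!(concat!("{} + {}", " = {}"), 34, 80, 34 + 80)
}

/// stringify! turns the tokens `a + b` into a string literal; `a` and `b`
/// do not need to exist as variables.
fn stringified() -> String {
    format!(stringify!(a + b))
}

fn main() {
    assert_eq!(sum_line(), "34 + 80 = 114");
    println!("{}", stringified());
}
```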


I skipped over this detail deliberately, as it complicates the discussion, but indeed there’s some compiler magic here that softens the “print! needs a literal” rule in ways that are virtually impossible[1] to achieve for non-built-in macros (on either end, i.e. this relies on print! (or format_args!) being built-in just as much as on stringify! and include_str! being built-in).

So for purposes of learning the fundamentals/principles of how macros operate and what conceptually constitutes a “literal”, etc., it’s best to simply ignore all of this.


Edit: Wait… looking at that playground, I was not aware that macro_rules macros can be expanded, too. Does print! have the ability to expand all macros? (I haven’t tested function-style proc-macros yet.) So then print!(stringify!( a + b )) perhaps does not rely on stringify! being built-in? (In which case you got the description completely backwards: not “An impressing (sic) case is that some built-in macros can be used as literals”, but “An impressive case is that some built-in macros can expand other macros into literals”.)


  1. or at least highly nontrivial ↩︎
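
The macro_rules case mentioned in the edit above can be sketched like this: a user-defined macro that expands to a string literal is accepted in the format-string position. The macro name greeting_fmt and the function greet are hypothetical, and format! stands in for print! so the result can be checked.

```rust
// A user-defined macro that expands to a string literal.
macro_rules! greeting_fmt {
    () => { "Hello {}, how are you?" };
}

/// format! expands greeting_fmt!() first and then treats the resulting
/// literal as its format string.
fn greet(name: &str) -> String {
    format!(greeting_fmt!(), name)
}

fn main() {
    assert_eq!(greet("world"), "Hello world, how are you?");
    println!(greeting_fmt!(), "world");
}
```

This gives a named, reusable format string without passing a static or const item, which the formatting macros reject.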


print! can't expand all macros, but it works for function-style proc-macros: https://www.rustexplorer.com/b/w0r1ap

Thanks. That's accurate.