Out of bounds on &str - why does it run?

Given the following code:

fn main() {
    let text = "Rust";
    println!("{}", text);
    println!("'{}'", &text[1..3]);
    println!("'{}'", &text[1..10]);
}

I get:

Rust
'us'
thread 'main' panicked at 'byte index 10 is out of bounds of `Rust`', try.rs:5:23
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It is not clear to me why this problem isn't caught during compilation. After all, I have a hard-coded string, which is also immutable, so unless I misunderstand something, this could be caught at compile time.

Indexing str is done via a trait implementation, so the compiler would have to special-case str to understand that it would panic. (The implementation can also panic if you try to slice at an index that is not on a UTF-8 character boundary, not just when the index is out of bounds.)
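To make both panic conditions concrete, here is a minimal sketch using the non-panicking `str::get`, which returns `None` in the cases where indexing with a range would panic. The helper name `try_slice` is made up for illustration.

```rust
/// Hypothetical helper: returns the byte range `start..end` of `s`,
/// or `None` in the cases where `&s[start..end]` would panic.
fn try_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let text = "Rust";
    // In bounds, and both ends fall on character boundaries.
    println!("{:?}", try_slice(text, 1, 3));
    // The end index lies past the end of the string.
    println!("{:?}", try_slice(text, 1, 10));
    // U+00E9 occupies bytes 1..3 in UTF-8, so byte index 2 is inside a character.
    println!("{:?}", try_slice("h\u{e9}llo", 0, 2));
}
```

Both failure cases are only discovered when `get` runs, which is the same point at which the indexing version would panic.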


The amount of things that “could be” determined at compilation is unlimited, but also, there must be a limit, and it happens to be before any mechanism that could warn about out-of-bounds access in this case.

More specifically, what the compiler could do here is determine that the code unconditionally panics, where unconditionally panicking is likely not the intended behavior. Note that the only expected and correct behavior of this code is to compile successfully and panic at run-time; all that could be issued at compile-time is a lint (most commonly lints are warnings, though some are configured to error by default, unless you disable them or they fire in a dependency of your crate).

The condition such a lint would check for is also relatively imprecise: whilst “panics unconditionally” is quite precise, determining that panicking unconditionally is likely not the intended behavior might require more thought. In this case, though, it’s probably the part of determining that the code panics unconditionally that isn’t easy.

The problem is that Rust code is designed exclusively for execution at run-time. While there is ongoing work to enable more and more code to support compile-time execution (via const fn), slicing strings is not even part of this yet, and even operations that are const fn are actually designed exclusively for supporting explicitly being executed at compile-time, as doing so can have side effects like

  • taking a long time
  • erroring
  • having slightly different behavior compared to run-time
  • etc…

so that using this mechanism for determining whether something panics unconditionally would be non-trivial, too.
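As a rough illustration of what "explicitly being executed at compile-time" means, here is a minimal sketch. It indexes the string's bytes rather than slicing the str, and the names `byte_at`, `SECOND` and `BROKEN` are hypothetical.

```rust
/// Hypothetical helper: returns the byte at index `i` of `s`.
/// Plain slice indexing with a `usize` is allowed inside a `const fn`.
const fn byte_at(s: &str, i: usize) -> u8 {
    s.as_bytes()[i]
}

// Initializing a `const` item forces the call to be evaluated at compile time.
const SECOND: u8 = byte_at("Rust", 1);

// Uncommenting the next line should turn the out-of-bounds access into a
// compile-time error, because the panic then happens during const evaluation.
// const BROKEN: u8 = byte_at("Rust", 10);

fn main() {
    println!("{}", SECOND as char);
}
```

The same call written as `let b = byte_at("Rust", 10);` inside `main` is not required to be evaluated at compile time; the opt-in comes from the `const` item, not from the function.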

That only leaves the option of (more or less) hard-coding certain operations that can be checked for whether they unconditionally panic. (In the general case, one would probably also need to hard-code which operations are not used for intentionally creating panics, since otherwise it’d be impossible to know whether code was intended to unconditionally panic.) That involves manual work, so it’s even more limited than what automated approaches could achieve, and it isn’t surprising that the vast majority of code that unconditionally panics will not be detected by existing lints, including this code.


str doesn't have a compile-time known length. The variable you created has type &str, which can point at a slice of any length, and could even be reassigned (if you used a mutable binding). Thus the code works as expected. Arguably, there could be a lint which finds such errors, but slicing strings with constants is a bad idea anyway, and I almost never see it used. Strings are UTF-8 encoded, so a single Unicode code point can span several bytes, and code points don't necessarily correspond to visual characters. This means that manual indexing like yours is very brittle and error-prone. Use string operations (substring search, trimming, etc.) to properly get the desired substring.
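A minimal sketch of what such string operations might look like, using only standard library methods. The helper `first_chars` is hypothetical and counts Unicode scalar values (`char`s), which still need not match what a reader perceives as one character.

```rust
/// Hypothetical helper: the first `n` chars of `s`, or all of `s` if it is shorter.
fn first_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        // `byte_idx` is the byte offset where the (n+1)-th char starts,
        // so it is guaranteed to be a valid char boundary.
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}

fn main() {
    let text = "  Rust rocks  ";
    let trimmed = text.trim();
    // `find` returns a byte offset that always lies on a char boundary.
    if let Some(pos) = trimmed.find(' ') {
        let (first, rest) = trimmed.split_at(pos);
        println!("'{}' '{}'", first, rest.trim_start());
    }
    // `strip_prefix` returns `None` instead of panicking when there is no match.
    println!("{:?}", "Rust".strip_prefix("Ru"));
    println!("'{}'", first_chars("h\u{e9}llo", 2));
}
```

The offsets used for slicing here come from the string itself rather than from hard-coded constants, so they stay valid when the text changes.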

If you instead used a byte array of known length, you would indeed get a compile error.

(Clippy lint, not an error.)
