Reading Directly into Variables

If there is one feature I miss in Rust, it is something akin to C's scanf or D's readf, which lets you read directly into variables without an intermediate string representation of your input. I'm picturing something in Rust like a "read!" macro that would provide similar behavior. Has there ever been any discussion about adding such a feature and, if so, was it rejected as incompatible with the language in some way?

There are a number of crates implementing things like that. Personally, I prefer non-interactive interfaces with all options on the command line or in a configuration file, so I have never used them.

If you don't need formatted input, the helper function is really simple (I think most of us have written it at some point).

fn input() -> std::io::Result<String> {
    let mut buffer = String::new();
    std::io::stdin().read_line(&mut buffer)?;
    Ok(buffer)
}

let i: u64 = input()?.trim().parse()?;

For fun:
// renaming the function above to `input_string`
fn input<T>() -> anyhow::Result<T>
where
    T: std::str::FromStr,
    T::Err: std::error::Error + Send + Sync + 'static,
{
    Ok(input_string()?.trim().parse()?)
}

The thing is, this is something that people mainly use for solving toy problems, such as homework assignments and competitive programming puzzles. In real-world programs that read from standard input:

  1. the code should by and large be generic and accept any reader internally (so as to support standard input, file input, piping, fetching a remote URL, etc.); and
  2. the parsing tasks are usually far more complicated and subtle than needing "an integer" or "a string"; such programs usually feature full-fledged parsers for one format or another (JSON, CSV, you name it).

Since the standard library provides both building blocks (reading into an unstructured string and parsing it generically), and putting them together is trivial (as demonstrated above), it's not really warranted to increase the stdlib API surface with such functions.
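To make the "accept any reader" point concrete, here is a minimal sketch of the same helper written against BufRead instead of stdin. The name read_value and the choice of Box<dyn Error> as the error type are my own assumptions for illustration, not anything from std or an existing crate.

```rust
use std::error::Error;
use std::io::{BufRead, Cursor};
use std::str::FromStr;

/// Hypothetical helper: read one line from any buffered reader and parse it.
/// Both I/O and parse failures are boxed into a single error type.
fn read_value<R, T>(reader: &mut R) -> Result<T, Box<dyn Error>>
where
    R: BufRead,
    T: FromStr,
    T::Err: Error + 'static,
{
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line.trim().parse()?)
}

fn main() -> Result<(), Box<dyn Error>> {
    // An in-memory reader stands in for a file, a pipe, or a socket.
    let mut fake = Cursor::new("42\n3.5\n");
    let n: u64 = read_value(&mut fake)?;
    let x: f64 = read_value(&mut fake)?;
    println!("{n} {x}");

    // The same function should work on locked standard input:
    // let k: i32 = read_value(&mut std::io::stdin().lock())?;
    Ok(())
}
```

Because the function only asks for BufRead, tests can feed it a Cursor while the real program passes stdin or a BufReader around a file.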


Actually, one more thing that comes to my mind is this: what should the error type be? Since there are two types of errors that can occur, it's not possible to express them using either <T as FromStr>::Err or io::Error alone. One could either play fast-and-loose and wrap the <T as FromStr>::Err into an io::Error::new(io::ErrorKind::Other, …), which is not nice in a library, or create an entirely new error type just for this function, which offers a clear choice between the two cases. That would probably be overkill.
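For reference, the "entirely new error type" option could look roughly like the sketch below. ReadError and read_parsed are hypothetical names chosen for this example; the enum is generic over the parse error so the caller can tell the two failure modes apart.

```rust
use std::io::{self, BufRead};
use std::str::FromStr;

/// Hypothetical two-case error: either the read failed or the parse failed.
#[derive(Debug)]
enum ReadError<E> {
    Io(io::Error),
    Parse(E),
}

/// Hypothetical helper that reads one line and parses it, keeping the
/// two kinds of failure distinct in the return type.
fn read_parsed<R: BufRead, T: FromStr>(reader: &mut R) -> Result<T, ReadError<T::Err>> {
    let mut line = String::new();
    if let Err(e) = reader.read_line(&mut line) {
        return Err(ReadError::Io(e));
    }
    line.trim().parse().map_err(ReadError::Parse)
}

fn main() {
    let mut input = io::Cursor::new("12\nnot a number\n");
    let first: Result<u8, _> = read_parsed(&mut input);
    let second: Result<u8, _> = read_parsed(&mut input);
    println!("{first:?}");
    match second {
        Err(ReadError::Parse(e)) => println!("parse failure: {e}"),
        other => println!("{other:?}"),
    }
}
```

It works, but that is a whole public type (plus the Display and Error impls a library would want) for a five-line helper, which is the overkill being described.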


Wrote a crate especially for you :wink:

P.S. It's closer to modern Pascal's readstring and readint or C's gets, not scanf.

@H2CO3, about errors. At the beginning, I tried to implement the error like this:

enum Error<T: core::str::FromStr> {
    A(std::io::Error),
    B(T::Err)
}

but &str is !FromStr. Then I tried to write a trait with an associated type Err; that would be implemented for both T: core::str::FromStr and &str, using type Err = T::Err; and type Err = core::fmt::Error; respectively. The compiler told me that the implementations for T: core::str::FromStr and &'a str conflict :dotted_line_face:

So I just made the enum variants zero-sized.

No offense, but this is pretty bad.

let mut input = String::new();
if let Err(_) = std::io::stdin().read_line(&mut input) {
    $crate::ReadResult::Err($crate::ReadError::StdinError)
} else {
    unsafe {
        let ptr: *const str = input.trim_end();
        core::mem::forget(input);
        $crate::ReadResult::Ok(&*ptr)
    }
}

You're just allocating a string (same as the standard solution) and leaking memory. (The same goes for the other branch.)

So should we never use &'static str because it leaks memory? What's the problem? There is no use-after-free and there are no invalid references.

Just don't forget the string? There's no good reason to.

Leaking is ok in some specific situations, mostly when you're sure something will be used until the end of your program. Moreover it's generally well marked by using a function with "leak" in its name.

In your case readln might be used in a tight loop, in which case you would end up leaking a ton of memory. And the worst part is that this isn't even clear, but hidden behind a macro.

Also, there's no reason that needs to use unsafe.


Pointer deref

And yes, I understand what you meant. But won't using the standard Stdin::read_line have the same effect within one scope? Or did I misunderstand what you meant?

read_line takes a mutable reference to a String, and the String is freed when the caller drops it.
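A small sketch of what that means in a tight loop: the caller owns one String, clears it on every iteration, and it is freed once when it goes out of scope. sum_lines is a made-up function for illustration, and a Cursor stands in for stdin.

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical example: sum every line that parses as an integer,
/// reusing a single buffer for all reads.
fn sum_lines<R: BufRead>(mut reader: R) -> std::io::Result<i64> {
    let mut buffer = String::new();
    let mut total = 0;
    loop {
        buffer.clear(); // keep the allocation, discard the previous line
        if reader.read_line(&mut buffer)? == 0 {
            break; // zero bytes read means end of input
        }
        if let Ok(n) = buffer.trim().parse::<i64>() {
            total += n;
        }
    }
    // `buffer` is dropped when the function returns; nothing is leaked.
    Ok(total)
}

fn main() -> std::io::Result<()> {
    let total = sum_lines(Cursor::new("1\n2\n3\n"))?;
    println!("{total}");
    Ok(())
}
```

With a version that leaks each line, the same loop would keep every line of input alive until the process exits.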


Box::leak(string.into_boxed_str()) does about the same thing without unsafe, although leaking is not a nice design anyway.

I'm with @H2CO3 on this one. For tiny programs that read a few lines and print a number, Rust is never going to beat Python and other languages on elegant syntax. For Rust there's no value in adding syntax sugar for this case.

In larger programs reading ASCII decimal numbers from stdin quickly stops being sufficient. Bigger programs will want to support more input syntax, more complex data, faster parsing, and for that Rust does have nice solutions, e.g. serde and a ton of parser generators.

A DIY parser with sscanf destroyed the performance of GTA, and this wouldn't have been an issue if they had serde :slight_smile:


Right, I forgot about that. Still, "you're the caller, so that's your problem" doesn't seem great. Should I rewrite this to return a String, then, or is that not a solution either?

@kornel, doesn't Box::leak return &mut T?

It returns &'static mut str, which you can cast to &'static str if you want to.
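For completeness, a minimal sketch of the safe version of the leak, assuming you really do want a &'static str. leak_line is a hypothetical name; it uses String::into_boxed_str and Box::leak, and the leak is just as real as with mem::forget, only without unsafe.

```rust
/// Hypothetical helper: deliberately leak an owned line to get `&'static str`.
fn leak_line(line: String) -> &'static str {
    // `Box::leak` hands back `&'static mut str`.
    let leaked: &'static mut str = Box::leak(line.into_boxed_str());
    // Returning it as `&'static str` is an implicit `&mut T` to `&T` coercion.
    leaked
}

fn main() {
    let owned = String::from("hello\n");
    // Trimming a `&'static str` yields another `&'static str`.
    let s: &'static str = leak_line(owned).trim_end();
    println!("{s}");
}
```

The memory is never reclaimed, so this only makes sense for data that should live until the program ends.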

Not "the problem", but "the privilege". Callers are allowed to hold the data exactly as long as they need, without suffering from someone else holding it forever at the same time. And, thanks to automatic Drop, that's not a burden in the vast majority of cases (and when it is, well, there's something highly specific going on, so they might reach for unsafe themselves).


Yes, but that's not a problem because you can't read into a str anyway. You have to create a String, which is FromStr.
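A quick sketch of that last point: String implements FromStr with Infallible as its error type, so a design that is generic over T: FromStr already covers "give me the line as text". parse_owned is a hypothetical name used only for this example.

```rust
use std::convert::Infallible;

/// Hypothetical helper: go through `FromStr` to get an owned `String`.
fn parse_owned(s: &str) -> String {
    let parsed: Result<String, Infallible> = s.parse::<String>();
    match parsed {
        Ok(value) => value,
        // `Infallible` has no variants, so this arm can never run.
        Err(never) => match never {},
    }
}

fn main() {
    let line = parse_owned("plain text");
    println!("{line}");
}
```

So an enum like Error<T: FromStr> can be instantiated as Error<String>, and its parse variant simply can never be constructed.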