Interpret the path enclosed in double quotes on Windows

Hi.
I want to create a Windows CLI tool that takes a directory path as an argument, interprets it, and processes it. After some experimentation, I noticed that some directory paths were not interpreted correctly.

For example:

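fn main() -> std::io::Result<()> {
    // First argument after the program name: the directory path.
    let arg = std::env::args().nth(1).expect("expected a directory path");
    // Returns an OS error if the path is malformed or does not exist.
    let metadata = std::fs::metadata(&arg)?;
    assert!(metadata.is_dir());
    Ok(())
}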

The result of passing each of the following paths to this code is as follows.
Because the directory path contains a space, it is enclosed in double quotes.

  • cargo run -- "foo bar"
    • Ok
  • cargo run -- ".\foo bar"
    • Ok
  • cargo run -- ".\foo bar\"
    • Ouch!
    • Err(Os { code: 123, kind: Other, message: "The filename, directory name, or volume label syntax is incorrect." })

Why does the last experiment fail? What workarounds are available?

When using double quotes you need to escape the backslashes (all of them); alternatively, you can use forward slashes in the path. Windows has learnt to cope well with them.

".\foo bar\" means foo bar" folder, and you don't have a folder with " in its name.

It's not even searching for that folder, but complaining that the quotation hasn't been closed.

Oh, my mistake was passing a string that still contained a double quote as the path.
I trimmed the double quotes before passing it, and it worked!

I assumed that it would convert properly as well...

This was my misinterpretation... :disappointed_relieved:

Umm, \" means "...

Is it wrong in the first place to pass a path that ends with \?

AFAIK it's okay to put a backslash at the end if it's a folder.

As pointed out by NobbZ and kornel, you need to escape the trailing \, i.e. cargo run -- ".\foo bar\\".

Feels very weird, but this seems to be how cmd works...

To save a directory path with a trailing backslash (\) requires adding a second backslash to 'escape the escape'
so for example instead of "C:\My Docs\" use "C:\My Docs\\"
https://ss64.com/nt/syntax-esc.html
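
If you build such command lines from code, the same rule can be applied mechanically. Here is a minimal sketch, assuming the path contains no double quote (Windows paths cannot contain one). `quote_path_arg` is a hypothetical helper made up for illustration, not a standard library function.

```rust
/// Wrap `path` in double quotes for a Windows command line, doubling any
/// trailing backslashes so that the closing quote is not escaped.
/// Hypothetical helper; assumes `path` itself contains no `"`.
fn quote_path_arg(path: &str) -> String {
    // Number of backslashes at the very end of the path.
    let trailing = path.chars().rev().take_while(|&c| c == '\\').count();

    let mut quoted = String::with_capacity(path.len() + trailing + 2);
    quoted.push('"');
    quoted.push_str(path);
    // Repeat the trailing run once more, so n backslashes become 2n.
    quoted.extend(std::iter::repeat('\\').take(trailing));
    quoted.push('"');
    quoted
}

fn main() {
    println!("{}", quote_path_arg(r".\foo bar\"));
    println!("{}", quote_path_arg(r"C:\My Docs"));
}
```

Backslashes in the middle of the path are left alone, because they are only special when they come directly before a double quote. When spawning a process from Rust with std::process::Command, arguments passed through .arg() are quoted for you, so a helper like this only matters when you assemble the command-line string by hand.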

Even though it works in this case, I'd advise against using single backslashes as folder separators in double-quoted arguments in CMD or PowerShell.

If the letter after the backslash isn't f but t, it will result in a tab being passed in. Likewise, n will result in a newline being passed to the program.

I didn't really understand the specification of interpreting arguments specified on the Windows command line... :disappointed_relieved:

Thank you for your advice, everyone!

https://docs.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=vs-2019

Too complicated ... :expressionless:

It's complicated, but not quite as complicated as it looks if you're only handling paths. Basically \ is always treated literally, except when it's followed by a ". Only then do special rules apply, so you only need to remember to close the quoted path with \\" instead of \" (or just not end paths with a \).
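
To make that rule concrete, here is a rough sketch of the splitting logic in Rust. It is not the actual C runtime or std implementation: `split_args` is a name made up for illustration, and the sketch ignores the `""` special case inside a quoted string as well as the separate rules for the program name.

```rust
/// Split a raw command line into arguments using a simplified form of the
/// Microsoft C runtime rules: a backslash is literal unless a run of
/// backslashes is directly followed by a `"`.
/// Illustrative sketch only; the `""`-inside-quotes case is not handled.
fn split_args(line: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    let mut has_arg = false; // true once the current argument has started
    let mut chars = line.chars().peekable();

    while let Some(c) = chars.next() {
        match c {
            '\\' => {
                // Count the whole run of consecutive backslashes.
                let mut count = 1;
                while chars.peek() == Some(&'\\') {
                    chars.next();
                    count += 1;
                }
                has_arg = true;
                if chars.peek() == Some(&'"') {
                    // 2n backslashes + quote: n backslashes, quote is a delimiter.
                    // 2n+1 backslashes + quote: n backslashes and a literal quote.
                    current.extend(std::iter::repeat('\\').take(count / 2));
                    if count % 2 == 1 {
                        chars.next();
                        current.push('"');
                    }
                } else {
                    // Not followed by a quote: every backslash is literal.
                    current.extend(std::iter::repeat('\\').take(count));
                }
            }
            '"' => {
                in_quotes = !in_quotes;
                has_arg = true;
            }
            ' ' | '\t' if !in_quotes => {
                // Unquoted whitespace ends the current argument, if any.
                if has_arg {
                    args.push(std::mem::take(&mut current));
                    has_arg = false;
                }
            }
            _ => {
                current.push(c);
                has_arg = true;
            }
        }
    }
    if has_arg {
        args.push(current);
    }
    args
}

fn main() {
    for line in [r#""\test ing\""#, r#""\test ing\\""#] {
        println!("{} -> {:?}", line, split_args(line));
    }
}
```

With this sketch, a quoted path ending in a single backslash swallows the closing quote and ends up with a literal " at the end, while the doubled form yields the path with one trailing backslash.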

The thing that might make it trickier is that different shells can have slightly different parsing rules in addition to the normal command line parsing. E.g. powershell and cmd may (or may not) behave slightly differently.

Here's a quick test program to show how arguments are parsed:

fn main() {
    println!("Raw commandline:");
    println!("\t{}", get_raw_command_line());
    
    println!("Parsed arguments:");
    for argument in std::env::args().skip(1) {
        println!("\t{}", argument);
    }
}

// Get the command line as it was before any parsing is applied
fn get_raw_command_line() -> String {
    unsafe {
        let line = GetCommandLineW();
        
        let mut cursor = line;
        let mut length = 0;
        while *cursor != 0 {
            length += 1;
            cursor = cursor.add(1);
        }
        let array: &[u16] = std::slice::from_raw_parts(line, length);
        
        String::from_utf16(array).expect("Invalid unicode")
    }
}

#[link(name="kernel32")]
extern "system" {
    fn GetCommandLineW() -> *const u16;
}

For example, the following is a quick summary of the behaviour on my Windows 10 box.

Cmd

Shell input    | Raw commandline | Parsed argument
---------------+-----------------+----------------
\testing\      | \testing\       | \testing\
"\testing\"    | \testing\"      | \testing"
"\testing\\"   | \testing\       | \testing\
"\test ing\"   | "\test ing\""   | \test ing"
"\test ing\\"  | "\test ing\\"   | \test ing\

Powershell

Shell input    | Raw commandline | Parsed argument
---------------+-----------------+----------------
\testing\      | \testing\       | \testing\
"\testing\"    | \testing\       | \testing\
"\testing\\"   | \testing\\      | \testing\\
"\test ing\"   | "\test ing\""   | \test ing"
"\test ing\\"  | "\test ing\\"   | \test ing\

Note that both PowerShell and cmd remove unneeded quotes around arguments. However, PowerShell appears to simply strip the quotes, whereas cmd applies the command-line parsing rules when doing so.

Conclusion

All programmers can really do is handle the arguments in the standard way. It's unfortunately up to the user to know how to handle their shell. I know this isn't a very satisfying answer in the face of seemingly weird shell behaviour and confusing parsing rules. But it's usually better to be consistent with other programs and in any case you can't assume that all versions of all Windows shells will all behave exactly the same way.

Possibly the best thing is to try to produce more detailed error messages to the end user, if possible. A double quote isn't a valid character for Windows paths so if it's at the end of an argument that expects a path it's a good indication that the shell isn't behaving as the user expected.
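
As a rough sketch of that idea, a check like the following could run before the path is handed to the filesystem. `check_path_arg` and the wording of the message are made up for illustration.

```rust
/// Reject a path argument that contains a double quote, which usually
/// means the shell treated `\"` as an escaped quote.
/// Hypothetical helper for illustration.
fn check_path_arg(arg: &str) -> Result<&str, String> {
    if arg.contains('"') {
        Err(format!(
            "invalid path `{}`: Windows paths cannot contain a double quote. \
             If the quoted path ended in a backslash, double it or remove it.",
            arg
        ))
    } else {
        Ok(arg)
    }
}

fn main() {
    // Fall back to the current directory when no argument is given.
    let arg = std::env::args().nth(1).unwrap_or_else(|| String::from("."));
    match check_path_arg(&arg) {
        Ok(path) => println!("using path: {}", path),
        Err(message) => eprintln!("{}", message),
    }
}
```

This keeps the standard argument parsing untouched and only adds a hint for the user, which matches the point above about staying consistent with other programs.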

@chrisd
I understand very well. Thank you! :smiley: