Reading Arguments From File

My program is getting close to 200 lines of code, so I'm going to try to post only what is relevant. I have two functions that read the lines of a file. Let's say the lines of the file look like this:

-u test -l 192.168.1.x -w cheese
-u user -l 192.168.1.x -w "superduper"

The arguments in the file are the same as for the program itself. I thought about using args[0], but if there are lots of lines I don't want to overload things with sub-processes. So I'm trying to get the parameters like -u, -l, -w, etc. and pass them to a function I already have that would do the same thing with these arguments. Below are the functions I'm currently using to get the lines of the file:

use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;

fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where
    P: AsRef<Path>,
{
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

fn read_it(fname: &str, fexec: &str) {
    if let Ok(lines) = read_lines(fname) {
        // Each item produced by the iterator is an io::Result<String>.
        for line in lines {
            if let Ok(ip) = line {
                let split = ip.split(" ");
                let more = fexec.split(" ");
                let number = ip.split(" ");

                // Print the values of the options (every second token, starting at the second).
                for s in split.skip(1).step_by(2) {
                    println!("{}", s);
                }

                // Print the commands.
                for m in more {
                    println!("{}", m);
                }

                // Print the option flags themselves (-u, -l, -w).
                for n in number.step_by(2) {
                    println!("{}", n);
                }
            }
        }
    }
}

If you run this with a similar file to read and pass it another argument of "date" you should get output like the following:

$ ./target/release/split_it -f hosts -e date
test
192.168.1.x
cheese
date
-u
-l
-w
ph33r
192.168.1.x
superduper
date
-u
-l
-w

As you may notice, I was able to split on spaces and then print out the values of -u, -l, -w and -e with two loops. Then a third loop prints out the literal parameters -u, -l and -w. I need to do this with multiple combinations of arguments, which will probably use a match statement to call different functions I already have.

What I actually need to do, and am having trouble with:

Somehow I need to take the values from the output above and put them into variables or into the elements of a vector or array. Since the number of arguments could be different each time, I'm guessing a vector would be best. I'm just not sure if I need to put the parameters like -u, -l, -w into a vector, or if I could have a match statement check those as-is. If they can't be used as-is, I know I can read those values in with vec.push() within those for loops. I'm just unsure how to use the indexes of a vector with a match statement as well.

I also know there may be another solution to this I'm not thinking of. If anyone has suggestions on the best way to do this I would be grateful.

Thanks

I’m not sure from your description what you’re trying to do. It looks like most of those positional arguments are supposed to be parameters to the flag that comes immediately before them. If it were me, I’d work towards building a structure that has an Option field for each possible argument:

struct Args {  // TODO: more descriptive names
    u: Option<String>,
    l: Option<String>,
    w: Option<String>,
    e: Option<String>
}

Once you have those, you can match on them like this (.. ignores unlisted fields):

match args {
    Args { u: None, .. } => { Err("No User Specified") },
    Args { u: Some(u), l: Some(ip), .. } => { Ok( login(u,ip)? ) },
    /* ... */
    _ => { unimplemented!() }
}
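
To make the two snippets above concrete, here is a minimal, self-contained sketch. Only the Args struct and the shape of the match come from the post; the login function, the dispatch wrapper, and the String error type are hypothetical stand-ins chosen for illustration.

```rust
#[allow(dead_code)] // w and e are not used in this small sketch
#[derive(Debug, Default)]
struct Args {
    u: Option<String>,
    l: Option<String>,
    w: Option<String>,
    e: Option<String>,
}

// Hypothetical stand-in for whatever the real program does with a user and host.
fn login(user: &str, host: &str) -> Result<String, String> {
    Ok(format!("{}@{}", user, host))
}

// Hypothetical wrapper: picks an action based on which fields are present.
fn dispatch(args: Args) -> Result<String, String> {
    match args {
        Args { u: None, .. } => Err("No User Specified".to_string()),
        Args { u: Some(u), l: Some(ip), .. } => login(&u, &ip),
        _ => Err("unsupported combination of arguments".to_string()),
    }
}

fn main() {
    let args = Args {
        u: Some("test".to_string()),
        l: Some("192.168.1.x".to_string()),
        ..Default::default()
    };
    println!("{:?}", dispatch(args));
    println!("{:?}", dispatch(Args::default()));
}
```

Because the arms are tried in order, the most specific combinations should come first and the catch-all last.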

If it were me, and it is, I would use toml (https://docs.rs/toml/0.5.6/toml/) to read a config.toml file with all one's arguments in it.

I would use clap to get the same arguments from the command line.

Arguments given on the command line would override those from the toml file, or whatever you want.

Unless you are writing this from scratch just for a fun learning exercise.
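
The override rule (command-line values win, file values fill the gaps) could be sketched with Option::or. The Settings struct, its field names, and the overriding method are made up for illustration; actually reading the TOML file and the command line is left out here.

```rust
#[derive(Debug, Default, PartialEq)]
struct Settings {
    user: Option<String>,
    host: Option<String>,
    port: Option<u16>,
}

impl Settings {
    // Keeps every value set in `self` and falls back to `file` otherwise.
    fn overriding(self, file: Settings) -> Settings {
        Settings {
            user: self.user.or(file.user),
            host: self.host.or(file.host),
            port: self.port.or(file.port),
        }
    }
}

fn main() {
    // Pretend these came from config.toml and from the command line.
    let from_file = Settings {
        user: Some("user".to_string()),
        host: Some("192.168.1.x".to_string()),
        port: Some(22),
    };
    let from_cli = Settings {
        user: Some("test".to_string()),
        ..Default::default()
    };

    let merged = from_cli.overriding(from_file);
    println!("{:?}", merged);
}
```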

That makes sense, but would I need to use a loop like I was? Or would there be a better way to read those when I first read the line?

I'm already using clap in this code:

fn main() {

    let matches = App::new("Test")
        .version("0.1.0")
        .author("me <somebody@mail.ru>")
        .about("Work In Progress")
        .arg(Arg::with_name("File")
                 .short("f")
                 .takes_value(true)
                 .help("Path to file of hosts."))
        .arg(Arg::with_name("Exec")
                 .short("e")
                 .takes_value(true)
                 .help("Commands to be run. Use quotes for multiple commands."))
        .arg(Arg::with_name("Host")
                 .short("l")
                 .takes_value(true)
                 .help("Ip or domain of host"))
        .arg(Arg::with_name("User")
                 .short("u")
                 .takes_value(true)
                 .help("Username. Should have sudo, but not enforced."))
        .arg(Arg::with_name("Port")
                 .short("p")
                 .takes_value(true)
                 .help("Specify port if different from 22."))
        .arg(Arg::with_name("Pass")
                 .short("w")
                 .takes_value(true)
                 .help("Password if SSH keys are not configured."))
        .arg(Arg::with_name("Sync")
                 .short("s")
                 .takes_value(true)
                 .help("Sync script or binary to be run on host(s)."))
        .get_matches();


    let status = Command::new("ssh-add")
                     .arg("-l")
                     .stdout(Stdio::null())
                     .stderr(Stdio::null())
                     .status()
                     .expect("failed to execute process");

    if !status.success() {
        println!("The ssh-agent is not installed, running or configured correctly.");
        println!("Try running \"eval $(ssh-agent -s); ssh-add;\" or configuring .bash_profile");
        process::exit(1);
    }

    let values: Vec<_> = ["User", "Host", "Pass", "Port", "Exec", "File"]
        .iter()
        .map(|a| matches.value_of(a))
        .collect();

    // The slice always has six entries, so every pattern needs six positions;
    // a five-element pattern would never match.
    match values.as_slice() {
        [Some(user), Some(host), Some(pass), Some(port), Some(exec), _] => uhwpe(user, host, pass, port, exec),
        [Some(user), Some(host), None, Some(port), Some(exec), _] => uhpe(user, host, port, exec),
        [Some(user), Some(host), Some(pass), None, Some(exec), _] => uhwe(user, host, pass, exec),
        [Some(user), Some(host), None, None, Some(exec), _] => uhe(user, host, exec),
        [None, None, None, None, Some(exec), Some(file)] => read_it(file, exec),
        _ => println!("Garbage in, garbage out... Try --help"),
    }
}

Would it be possible to reuse this? I understand how it takes in arguments to the program. I'm unsure how to implement this when reading from a file though.

I'd prefer to not use toml if I don't have to. I'm not opposed to it if there's no other option though.

If you’re already using clap, you can use get_matches_from to parse arguments from a source other than the command line.

If you still want to do this yourself, you could call a function like this after splitting the line apart:

fn parse_args(params: impl Iterator<Item=&str>) -> Args {
    let mut result: Args = Default::default();  // NB: requires #[derive(Default)]

    loop {
        match params.next() {
            None => { return result; },
            Some("-u") => {
                result.u = Some(
                    params.next()
                          .expect("-u requires an argument")
                          .into()
                );
            },
            /* ... */
            Some(flag) => { panic!("Unknown option: {}", flag); }
        }
    }
}

Well, that looks nice, but let me ask, is "impl Iterator<Item=&str>" to iterate over the parameters and values passed to it? If so, could I just pass it the arguments like "parse_args(&ip, &fexec);" from read_it function I posted earlier? I would assume if it reads in with clap, I won't need to break up the lines with split. I would also assume I could match multiple arguments like "Some("-u", "-l", "-w") => {". In my last post I used clap like this to map the value of -u to the variable "User":

    .arg(Arg::with_name("User")
             .short("u")
             .takes_value(true)
             .help("Username. Should have sudo, but not enforced."))

Would this need to be done? Or would -u, -l, etc. become variable names themselves?

Sorry for all the questions. I just want to ensure I don't become a script kiddie with Rust.

Thanks!

Well, I decided to try the function that 2e71828 suggested. I made sure to put #[derive(Default)] at the top of my source file, but I got a lot of errors from it:

$ cargo build --release
   Compiling glue v0.1.0 (/home/user/rust/glue)
error: expected pattern, found `.`
  --> src/main.rs:36:9
   |
36 |         .arg(Arg::with_name("File")
   |         ^ expected pattern

error: `derive` may only be applied to structs, enums and unions
 --> src/main.rs:1:1
  |
1 | #[derive(Default)]
  | ^^^^^^^^^^^^^^^^^^

error[E0412]: cannot find type `Args` in this scope
  --> src/main.rs:21:52
   |
21 |   fn parse_args(params: impl Iterator<Item=&str>) -> Args {
   |                                                      ^^^^
   | 
  ::: /root/.cargo/registry/src/github.com-1ecc6299db9ec823/clap-2.33.2/src/args/arg.rs:43:1
   |
43 | / pub struct Arg<'a, 'b>
44 | | where
45 | |     'a: 'b,
46 | | {
...  |
56 | |     pub r_ifs: Option<Vec<(&'a str, &'b str)>>,
57 | | }
   | |_- similarly named struct `Arg` defined here
   |
help: a struct with a similar name exists
   |
21 | fn parse_args(params: impl Iterator<Item=&str>) -> Arg {
   |                                                    ^^^
help: consider importing this struct
   |
3  | use std::env::Args;
   |

error[E0412]: cannot find type `Args` in this scope
  --> src/main.rs:22:21
   |
22 |       let mut result: Args = Default::default();  // NB: requires #[derive(Default)]
   |                       ^^^^
   | 
  ::: /root/.cargo/registry/src/github.com-1ecc6299db9ec823/clap-2.33.2/src/args/arg.rs:43:1
   |
43 | / pub struct Arg<'a, 'b>
44 | | where
45 | |     'a: 'b,
46 | | {
...  |
56 | |     pub r_ifs: Option<Vec<(&'a str, &'b str)>>,
57 | | }
   | |_- similarly named struct `Arg` defined here
   |
help: a struct with a similar name exists
   |
22 |     let mut result: Arg = Default::default();  // NB: requires #[derive(Default)]
   |                     ^^^
help: consider importing this struct
   |
3  | use std::env::Args;
   |

error[E0106]: missing lifetime specifier
  --> src/main.rs:21:42
   |
21 | fn parse_args(params: impl Iterator<Item=&str>) -> Args {
   |                                          ^ expected named lifetime parameter
   |
help: consider introducing a named lifetime parameter
   |
21 | fn parse_args<'a>(params: impl Iterator<Item=&'a str>) -> Args {
   |              ^^^^                            ^^^

error: aborting due to 5 previous errors

Some errors have detailed explanations: E0106, E0412.
For more information about an error, try `rustc --explain E0106`.
error: could not compile `glue`.

To learn more, run the command again with --verbose.

I noticed "`derive` may only be applied to structs, enums and unions", so maybe this is wrong? Does anyone see how I could correct this?

Thanks

Just a side question. I've been doing something similar with serde. Should I switch to toml?

Toml was only a suggestion. There are many other formats one could use for which there are conveniently available crates to parse I guess.

Back in my node.js days all my config files were in JSON format because that was so trivially easy to deal with from JavaScript. Indeed that is very common; the settings files for MS VS Code are in JSON, for example.

The Java guys likely use XML as that is a big thing in the Java world.

So yes, it would be easy to use JSON format config files and parse them with serde.

I just thought that as toml is used by cargo that might be a suitably 'rusty' thing to do.

Anyone have suggestions on that function?

What's wrong with using a proper serialization format with a well-tested parser and a common understanding across people and languages? Why do you want to roll your own serialization format?

No, not as-is, but since it's a lot of boilerplate, consider using structopt directly, and then you can have both structopt and serde fill in the same structure, from the command line and from a file, respectively.

The idea is if someone is new to, or doesn't know Toml they could learn one syntax instead of two. What I'm including in the file being read are the same arguments as the program itself. I know there's sometimes a hatred for noobs that won't read the docs, but I thought being so easy a cave-man could do it would have an appeal too. Like I said, if I have to use Toml, it won't be that bad, it just wasn't my first choice.

Do you have suggestions for making this work with Toml? If so I'm not opposed to listening.

Thanks

But then this argument also applies to the format you are trying to come up with. Tokenizing command line arguments is not trivial – the shell can (and does) perform all sorts of transformations on them, so a naïve split-on-whitespace approach is not sufficient (what about quoted arguments containing whitespace, for example?).

A serialization format solves these problems properly.

I wasn't directing hatred towards anyone. I strongly deny being hateful just because I'm coming up with technical and professional arguments. I didn't even mention beginners or the issue of not reading the documentation. So please refrain from putting words into my mouth.

I can only repeat my suggestion above: you can use a config struct with structopt and serde so that it could be populated by command line arguments as well as by any serde-compatible format (including TOML).

The main thing you'd do is define the type and use derive macros like this:

#[derive(Debug, Clone, StructOpt, Serialize, Deserialize)]
struct Config {
    infile: String,
    outfile: Option<String>,
    captain_age: u32,
}

That wasn't meant to be directed at you or anyone in particular. My apologies if you felt it was, but that was simply a re-telling of my experiences in other forums, not here.

I'll look into StructOpt and the derive macros you mentioned. For now I guess I'll wait and see if 2e71828 or anyone wants to comment on the function they provided.

Thanks

I had meant that to go along with (and produce) the Args struct that I had posted earlier, and it was meant more for illustration than drop-in code, so I didn’t vet it as well as I maybe should have.

Yes, you’d need to call it with something like ["-u", "username", "-l", "127.0.0.1"].iter() (or some other code that produces a similar iterator). I deliberately skipped writing out that part because doing it right is inherently complex, like @H2CO3 mentioned. For a simple project, str::split_whitespace() will get you pretty far, but there’s not a great path to go from there to a more complete solution.

If so, could I just pass it the arguments like "parse_args(&ip, &fexec);" from read_it function I posted earlier?

That function is designed to replace and be a more robust version of what you wrote originally. Instead of assuming that options and their arguments will alternate, it reads an option from the iterator and then takes as many parameters as that option calls for. It is also roughly equivalent to what clap’s get_matches_from will do, except that clap produces its own generic result object.

I would assume if it reads in with clap, I won't need to break up the lines with split.

This is unfortunately not the case. clap only works with arguments after they’ve been pre-parsed into separate strings. That’s usually done by the shell when invoking your program; if you’re getting arguments from somewhere else, you’ll need to break them up yourself somehow.
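
As a rough illustration of "breaking them up yourself", here is a sketch of a splitter that keeps double-quoted text together as one argument. The function name split_args and its String error type are my own choices, and this is deliberately not a full shell parser: it ignores escapes, single quotes, and variable expansion.

```rust
// Splits a line on whitespace, treating text inside double quotes as part
// of a single argument. Returns an error if a quote is left open.
fn split_args(line: &str) -> Result<Vec<String>, String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    let mut has_token = false; // true once the current argument has started

    for c in line.chars() {
        match c {
            '"' => {
                in_quotes = !in_quotes;
                has_token = true; // lets "" produce an empty argument
            }
            c if c.is_whitespace() && !in_quotes => {
                if has_token {
                    args.push(std::mem::take(&mut current));
                    has_token = false;
                }
            }
            c => {
                current.push(c);
                has_token = true;
            }
        }
    }

    if in_quotes {
        return Err(format!("unterminated quote in: {}", line));
    }
    if has_token {
        args.push(current);
    }
    Ok(args)
}

fn main() {
    let line = r#"-u user -l 192.168.1.x -w "super duper""#;
    println!("{:?}", split_args(line));
}
```

The resulting strings have the quotes stripped, which a plain split on spaces would not do.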

If you’re using clap, yes; that’s how you configure it. It should be possible to share the configuration code between both paths of reading arguments, but I’m not familiar enough with clap to advise you here.

The function I posted writes all of this in the program text itself, and then produces a struct with members u, l, w, ... that you can access like parse_result.u. They’re all Options, so you’ll need to use match, unwrap, or similar to handle some being missing.
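
For reference, a version of that function that should compile might look like the sketch below. It makes a few changes that are my own assumptions rather than part of the original post: the derive attribute sits directly on the Args struct (not at the top of the file), the iterator parameter is mut and carries a named lifetime, and errors are returned as a String instead of panicking. Every option here is assumed to take exactly one value.

```rust
// The derive attribute goes directly above the struct it applies to.
#[derive(Debug, Default, PartialEq)]
struct Args {
    u: Option<String>,
    l: Option<String>,
    w: Option<String>,
    e: Option<String>,
}

fn parse_args<'a>(mut params: impl Iterator<Item = &'a str>) -> Result<Args, String> {
    let mut result = Args::default();

    while let Some(flag) = params.next() {
        // Pick the field that this flag fills in.
        let slot = match flag {
            "-u" => &mut result.u,
            "-l" => &mut result.l,
            "-w" => &mut result.w,
            "-e" => &mut result.e,
            other => return Err(format!("Unknown option: {}", other)),
        };
        // Each flag consumes the next item as its value.
        let value = params
            .next()
            .ok_or_else(|| format!("{} requires an argument", flag))?;
        *slot = Some(value.to_string());
    }

    Ok(result)
}

fn main() {
    let line = "-u test -l 192.168.1.x -w cheese";
    match parse_args(line.split_whitespace()) {
        Ok(args) => println!("{:?}", args),
        Err(msg) => eprintln!("{}", msg),
    }
}
```

Note that calling .iter() on an array of string literals yields &&str items, so with this signature an array would need something like .iter().copied(), while line.split_whitespace() can be passed directly.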
