I'm trying to split a string that contains quoted text so I can use it as command args later in a separate process.
// this: "bash -c \"rm -f tmp/pids/server.pid && bundle exec rails s -p 3000 -b '0.0.0.0'\""
// to this:
args = [
"bash",
"-c",
"rm -f tmp/pids/server.pid && bundle exec rails s -p 3000 -b '0.0.0.0'",
]
I've tried using regex for this, but I haven't managed to split only on the whitespace that falls outside the quotes. The only way I found to get the result above was to first split on the quote character and walk the pieces with a peekable iterator: every piece except the last is split on whitespace into a temporary vector, and once the last piece is reached, the temporary vector is flattened and the last piece is pushed without being split. This works for cases like the example above, where everything unquoted can be an arg and the last part is the "command" being passed to bash with the -c flag, but it would break in other scenarios, for example when the quoted text is not the last argument.
This is what I've tried so far:
use regex::Regex;

fn main() {
    let cmd = "bash -c \"rm -f tmp/pids/server.pid && bundle exec rails s -p 3000 -b '0.0.0.0'\""
        .to_string();

    // This doesn't work because the quoted text itself matches the pattern and is
    // treated as a delimiter, so the iterator yields everything except the quoted text.
    let re = Regex::new(r#""[^"]*"|\s+"#).unwrap();
    let args: Vec<&str> = re.split(&cmd).collect();
    dbg!(args);

    // This doesn't work because I'm splitting on spaces so the quoted text also
    // gets split.
    let re = Regex::new(r#"\s+"#).unwrap();
    let args: Vec<&str> = re.split(&cmd).collect();
    dbg!(args);

    // This works as desired but seems overly engineered. Maybe I'm just overthinking?
    // Perhaps this is almost good enough and I'm just missing some more idiomatic way
    // of expressing this?
    let cmd_split_quotes: Vec<&str> = cmd.split_terminator('"').collect();
    let mut cmd_split_spaces = cmd_split_quotes.iter().peekable();
    let mut arg_vec_temp = vec![];
    let mut arg_vec = vec![];
    while let Some(chunk) = cmd_split_spaces.next() {
        if cmd_split_spaces.peek().is_some() {
            // Not the last chunk: split it on whitespace and stash the pieces.
            let split_chunk: Vec<String> = chunk.split_whitespace().map(String::from).collect();
            arg_vec_temp.push(split_chunk);
        } else {
            // Last chunk: flatten what was collected and push this chunk unsplit.
            arg_vec = arg_vec_temp.iter().flatten().map(String::from).collect();
            arg_vec.push(chunk.to_string());
        }
    }
    dbg!(arg_vec);
    println!("{}", cmd);
}
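For comparison, here is a rough sketch of a more general direction: scan the string character by character and toggle an "inside quotes" flag, so whitespace only separates args while the flag is off. `split_args` is a made-up name, not something from a crate, and the sketch assumes that only double quotes group text, that there are no backslash escapes, and that an unterminated quote just runs to the end of the input. I'm not sure it is the idiomatic way to do it.

```rust
/// Hypothetical helper (not from any library): splits `input` on whitespace,
/// but keeps text inside double quotes together as a single argument.
/// Assumptions: only `"` groups text, there are no backslash escapes, and an
/// unterminated quote runs to the end of the input.
fn split_args(input: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    // True once the current argument has started, so `""` yields an empty arg.
    let mut has_token = false;

    for ch in input.chars() {
        match ch {
            // A double quote toggles quoted mode and is not kept in the output.
            '"' => {
                in_quotes = !in_quotes;
                has_token = true;
            }
            // Whitespace outside quotes ends the current argument, if any.
            c if c.is_whitespace() && !in_quotes => {
                if has_token {
                    args.push(std::mem::take(&mut current));
                    has_token = false;
                }
            }
            // Everything else, including whitespace inside quotes, is kept.
            c => {
                current.push(c);
                has_token = true;
            }
        }
    }
    // Flush the final argument if the input did not end with whitespace.
    if has_token {
        args.push(current);
    }
    args
}

fn main() {
    let cmd = "bash -c \"rm -f tmp/pids/server.pid && bundle exec rails s -p 3000 -b '0.0.0.0'\"";
    let args = split_args(cmd);
    dbg!(&args);
}
```

Unlike the peekable-iterator version, this does not depend on the quoted part being the last argument, but it still ignores single-quote grouping and escaped quotes, which a real shell would handle.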
How would you guys go about doing this? Is this possible?
(Cross posted on reddit for more visibility, so here's the link for that just in case: https://www.reddit.com/r/rust/comments/tlvf5h/splitting_string_on_white_space_but_preserving/)