Problem with regex capture? What am I missing?

Greetings!

I've decided to start learning Rust and before I dive into a couple courses I found on Udemy, I thought I'd attempt to convert a script I made in Ruby into Rust.

Unfortunately, and as you may imagine, I have come across a few issues that I'm unable to solve and/or google.

What my goal is with this script is to list the text between two curly braces {...}

Here is my file contents:

define contactgroup {
    contactgroup_name acmeco
    alias Admin On Call
    members John
}

This is what I've done via ruby:

#!/usr/bin/env ruby
# Load oncall configuration file
config_file   = 'oncall.cfg'
config        = File.read(config_file)

contacts = config.match(/define contactgroup {(.*)}/m)[1].strip
contacts.each_line do |line|
  line.chomp!
  puts line
end

The results:

contactgroup_name acmeco
    alias Admin On Call
    members John

The following is what I have in my rust script:

extern crate regex;
use regex::Regex;

use std::fs::File;
use std::io::prelude::*;

fn main() {
    let contacts_config = "oncall.cfg";

    let mut file = File::open(contacts_config).expect("Cant Open File");
    let mut contents = String::new();
    file.read_to_string(&mut contents).expect("Cannot Find File");

    let re = Regex::new(r"define contactgroup \{\n(.*)").unwrap();
    for caps in re.captures_iter(&contents) {
        println!("{:?}\n", caps);
    }
}

The results:

Captures({0: Some("define contactgroup {\n    contactgroup_name acmeco"), 1: Some("    contactgroup_name acmeco")})

What am I missing?

Essentially, what I will be doing with the captured content is looping through each line until I find a matching line, such as members John, and replacing it with members Tim. But that's for after I solve this issue :slight_smile:

Any guidance would be appreciated. I am open and wanting to learn so anything helps, such as links to examples, the proper sections in the docs, etc.

I'm really digging the language and want to see what it can do!

Thank you for your time.

Not sure what you expect, the regex apparently is matching what you need. If you only need the string inside a specific capture group, you can use .get() or .name(). You are doing this in the Ruby script by writing [1] but I don't see any equivalent (e.g. the above two methods) being used in your Rust code.


At a glance, I think one difference between the two programs is that you explicitly split your result by lines in the ruby program, but don't do this in Rust.

In the Rust program, you use captures_iter, which according to the docs (linked) returns an iterator over all non-overlapping matches in the text, yielding one Captures value per match. Just like in Ruby, each Captures holds group 0 for the whole match and group 1 for the group you declared with (). Your input contains only one match, so the loop runs once and debug-prints both groups together.

I could definitely be missing something, but if you want to match the behavior of the Ruby program, how about starting out doing roughly the same thing - grabbing the first capture, and then splitting it by line?

let re = Regex::new(r"define contactgroup \{\n(.*)").unwrap();
let contacts = re.captures(&contents).unwrap().get(1).unwrap();
for line in contacts.as_str().lines() {
    println!("{}", line);
}

Note that I've also changed your println!("{:?}\n", ...); into println!("{}", ...) to match the Ruby behavior: {:?} debug-prints and includes the surrounding quotes, and \n is redundant with the ln in println.

This should be closer, but it doesn't exactly match the behavior. For me, it outputs a single line:

    contactgroup_name acmeco

This brings us to the second difference: you've used the m flag in Ruby, which makes . match newlines as well. To do the same in Rust, use the s flag:

let re = Regex::new(r"(?s)define contactgroup \{\n(.*)").unwrap();

See https://docs.rs/regex/1.3.9/regex/#grouping-and-flags - m would allow ^/$ to be line start/end, and s allows . to match \n.

This now outputs:

    contactgroup_name acmeco
    alias Admin On Call
    members John
}

(code on playground)

It's not exactly right, but maybe this is enough to get on the right track?


On the Regex::new line you are missing the s flag, which allows . to match newlines. You are also missing the closing } after the capture group, which your Ruby code has.

This is what the line should look like:
let re = Regex::new(r"(?s)define contactgroup \{\n(.*)}").unwrap();

You can find more details on flags for groups here: https://docs.rs/regex/1.3.9/regex/#grouping-and-flags

Edit: Oops typed basically the same solution as daboross but just slower so I didn’t see the post before adding the reply :slight_smile:


Thank you all for your very, very helpful advice!

When compiling, it was complaining about the ending } being redundant (or something like that), and well, who was I to question it...

Thank you (and daboross) for the links to grouping and flags. I'll definitely be dissecting that as I go.

I originally tried .get(1), but it wasn't working for me either. It kept outputting as if no matches were found when I did something like this:

    match re.captures_read(&contents){
        Some(caps) => println!("Found match: ",&caps[1]),
        None => println!("Could not find match..." )
    } 

Glad to see that I was at least on the right track.

Originally, I was just getting the single line as you were, but changed something and got the output I posted above. If I recall, that was before I learned of captures_iter though.

That was my intended goal. To start out with splitting them line by line. Figured once I was able to access the captured group, I could then figure out how to loop through the lines.
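A minimal sketch of that later step, using only the standard library: loop over the captured lines and swap members John for members Tim. The function name replace_line is made up, and this assumes the captured block is already available as a string:

```rust
/// Replaces any line that reads `from` (ignoring surrounding whitespace) with `to`,
/// keeping the line's original indentation.
/// (`replace_line` is a hypothetical helper for this example.)
fn replace_line(block: &str, from: &str, to: &str) -> String {
    block
        .lines()
        .map(|line| {
            if line.trim() == from {
                // Reuse the leading whitespace of the original line.
                let indent = &line[..line.len() - line.trim_start().len()];
                format!("{}{}", indent, to)
            } else {
                line.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    // Sample text standing in for the captured group.
    let block = "    contactgroup_name acmeco\n    alias Admin On Call\n    members John";
    println!("{}", replace_line(block, "members John", "members Tim"));
}
```

Note that joining with "\n" drops a trailing newline if the block had one, so it may need to be added back before writing the file.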

Good to know it's similar in implementation to other languages.

Again, thank you all for your help and advice. Figured I was missing something simple, and I was. I am glad this community exists!

