Equivalent to Python's `str.partition` method?

So I'm trying to port a parser from Python to Rust, and the Python version uses str.partition(). The Python code looks like:

  for f in inputs:
    for line in f:
      line = line.partition('#')
      tokens = line[0].split()

I'm thinking an equivalent (and idiomatic) version might be something like:

for f in inputs.iter() {
    for line in f.lines()  {
        let (line, old): (Vec<char>, Vec<char>) = line.chars().partition(|c| c == '#');
        let tokens = old.split(|c| c.is_whitespace()).collect::<Vec<_>>();

But I might be wrong, so I thought I'd ask (I've never ported Python programs to Rust -- at least ones like this, so I'm not sure how to get the same behavior idiomatically). Thoughts?

Equivalent to Python's str.partition() method in Rust

So, Python's str.partition(sep) is a minor convenience wrapper around .split(sep, 1).

Mainly, one could polyfill it as:

def partition(s: str, sep: str):
    if not sep:
        raise ValueError("Empty separator")
    (before, *mb_after) = s.split(sep, 1)
    return (before, *((sep, mb_after[0]) if mb_after else ("", "")))

Note how the core logic is still .split(): str.partition() tries to find a point in the string at which to split / cut the string in two parts.

In Rust, however, Iterator::partition does not do that. Instead, given a predicate, it dispatches each element of the iterator to one of the two collections ("buckets") it returns: the left bucket contains the elements that matched (so just the '#'s in your case), and the right bucket contains the others.
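To make the difference concrete, here is a minimal sketch of what Iterator::partition does to the characters of a line. The helper name bucket_hashes is hypothetical, made up for illustration only:

```rust
/// Sorts every character of `line` into one of two buckets, the way
/// `Iterator::partition` does. Nothing is cut at a position.
/// (`bucket_hashes` is a hypothetical name used only for this sketch.)
fn bucket_hashes(line: &str) -> (String, String) {
    // Left bucket: characters for which the predicate is true (the '#'s).
    // Right bucket: every other character, in its original order.
    line.chars().partition(|c| *c == '#')
}
```

Calling it on a line with two '#' characters gives a left bucket holding just those two characters and a right bucket holding the rest of the line with the '#'s removed, which is not what the Python code does.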

So, if you want to implement .partition() in Rust, just rewrite the Python function above in Rust :slightly_smiling_face:

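fn partition<'s, 'sep> (s: &'s str, sep: &'sep str)
  -> (&'s str, &'sep str, &'s str)
{
    if sep.is_empty() {
        panic!("Empty separator");
    }
    // Rust's argument order is `splitn(n, pattern)`, and `n` is the maximum
    // number of pieces returned, so cutting once means n = 2.
    let mut iter = s.splitn(2, sep);
    // `splitn` always yields at least one item, so this cannot panic.
    let before = iter.next().unwrap();
    if let Some(after) = iter.next() {
        (before, sep, after)
    } else {
        // Separator not found: mimic Python's `(s, "", "")`.
        (before, "", "")
    }
}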

Although using sentinel values (such as "") to signal an error / special case is not very idiomatic in Rust, where we have the wonderful Option enum:

fn partition<'s, 'sep> (s: &'s str, sep: &'sep str)
  -> (&'s str, Option<&'s str>)
{
    if sep.is_empty() {
        panic!("Empty separator");
    }
    // `splitn(n, pattern)`: at most 2 pieces, i.e. a single cut.
    let mut iter = s.splitn(2, sep);
    (
        iter.next().unwrap(),
        iter.next(),
    )
}

EDIT: at that point we are very close to the signature of @steffahn's very aptly suggested .split_once() function :ok_hand:
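As a rough sketch of how close the two are, a Python-style three-tuple can be layered on top of str::split_once. The name partition3 is hypothetical, and the explicit empty-separator check is an assumption made to mirror Python's ValueError, since split_once itself accepts an empty pattern:

```rust
/// Python-like `str.partition` built on `str::split_once`.
/// (`partition3` is a hypothetical name used only for this sketch.)
fn partition3<'a>(s: &'a str, sep: &'a str) -> (&'a str, &'a str, &'a str) {
    // Python raises ValueError here; this sketch panics instead.
    assert!(!sep.is_empty(), "Empty separator");
    match s.split_once(sep) {
        // Separator found: text before it, the separator, text after it.
        Some((before, after)) => (before, sep, after),
        // Separator not found: the whole string followed by two empty strings.
        None => (s, "", ""),
    }
}
```

In most Rust code the Option returned by split_once is preferable to the empty-string sentinels; the tuple form is mainly useful when porting Python logic line by line.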

Python's partition seems to be doing something entirely different from Rust's Iterator::partition, judging by the docs I could find. For a better Rust equivalent you could use e.g. str::split_once. Something like line.split_once('#').map_or(line, |(before, _)| before) (assuming line is a &str) should give you the whole string if there is no # and the string up to (not including) the first # otherwise.

You could then proceed using the str::split_whitespace method
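Put together, those two steps might look like the following minimal sketch. The function name tokens_before_comment is hypothetical, and it assumes each line is already available as a &str:

```rust
/// Returns the whitespace-separated tokens that precede the first '#'.
/// (`tokens_before_comment` is a hypothetical name used only for this sketch.)
fn tokens_before_comment(line: &str) -> Vec<&str> {
    // `split_once` returns `None` when there is no '#',
    // so fall back to the whole line in that case.
    let code = line.split_once('#').map_or(line, |(before, _comment)| before);
    // `split_whitespace` skips runs of whitespace, like Python's `split()`
    // called with no arguments.
    code.split_whitespace().collect()
}
```

A line consisting only of a comment yields an empty vector, matching what the Python loop produces for such lines.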

So, deviating from the approach others seem to be taking, it looks to me like you 1) want to ignore everything after # on a line, and 2) split everything before it on whitespace. Right?

Here is how I would do it

Pretty sure this is what I intended to do, we'll see.
