Setting three strings from a Vec, with Default fallback

Preliminary note: Here is just a curiousity question.

Is there a better way to implement impl From<Vec<String>> for Parts given below?

For a "better way", I mean:

  • Keeping the same performance or improving it;
  • Cosmetics: avoiding repeating strings.next().unwrap_or_default() three times, without using macros;
  • Still passing all the given tests.

Here is the code:

#[derive(Debug, PartialEq)]
pub struct Parts {
    left: String,
    middle: String,
    right: String,
}

impl From<Vec<String>> for Parts {
    fn from(strings: Vec<String>) -> Parts {
        // CODE REVIEW PART: ################################################################
        let mut strings = strings.into_iter();
        Parts {
            left:   strings.next().unwrap_or_default(),
            middle: strings.next().unwrap_or_default(),
            right:  strings.next().unwrap_or_default(),
        }
        //###################################################################################
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    macro_rules! s { ($s:expr) => { String::from($s) } }

    macro_rules! string_with_ptr {
        ($s:expr) => {{
            let s = String::from($s);
            let ptr = s.as_ptr();
            (s, ptr)
        }};
        ($($s:expr),*) => { ( $( string_with_ptr!($s), )* ) };
    }

    #[test]
    fn test_empty_input() {
        assert_eq!(
            Parts::from(vec![]),
            Parts { left: s!(""), middle: s!(""), right: s!("") }
        );
    }

    #[test]
    fn test_small_input() {
        let (left_string, left_ptr) = string_with_ptr!("left");

        // Object to test:
        let parts = Parts::from(vec![left_string]);

        // Test that strings has been moved, not cloned:
        assert_eq!(left_ptr, parts.left.as_ptr());

        // Test the behaviour:
        assert_eq!(
            parts,
            Parts { left: s!("left"), middle: s!(""), right: s!("") }
        );
    }

    #[test]
    fn test_big_input() {
        let (
            (  left_string,   left_ptr),
            (middle_string, middle_ptr),
            ( right_string,  right_ptr),
        ) = string_with_ptr!("left", "middle", "right");

        // Object to test:
        let parts = Parts::from(vec![left_string, middle_string, right_string, s!("foo"), s!("bar")]);

        // Test that strings has been moved, not cloned:
        assert_eq!(left_ptr, parts.left.as_ptr());
        assert_eq!(middle_ptr, parts.middle.as_ptr());
        assert_eq!(right_ptr, parts.right.as_ptr());

        // Test the behaviour:
        assert_eq!(
            parts,
            Parts { left: s!("left"), middle: s!("middle"), right: s!("right") }
        );
    }
}

You could do:

let mut strings = strings.into_iter();
let mut next = move || strings.next().unwrap_or_default();
Parts {
    left:   next(),
    middle: next(),
    right:  next(),
}

but I think that your version is already very simple and clear.

Edit: As expected Rust optimizes the closure, so same performance (Original, New)

First, let me just say thank you for making an amazing SSCCE. Very clear where to look, and tests that caught my first attempt being wrong. Perfect.

Here's a very different way to do it:

let mut parts = [""; 3].map(String::from);
for (p, x) in parts.iter_mut().zip(strings) {
    *p = x;
}
let [left, middle, right] = parts;
Parts { left, middle, right }

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=774659303feec31b0d35020a939cf3d4

I don't know what I'd actually do, but hopefully that gives you some ideas to chew on.

3 Likes

FWIW I tried this very verbose option, and found that it generated longer assembly, that appears, to my fairly untrained eye, to be doing extra work in the len >= 3 case, for some reason.

(This post previously had a version without the truncate, which was even worse.)

impl From<Vec<String>> for Parts {
    fn from(mut strings: Vec<String>) -> Parts {
        match strings.len() {
            0 => Parts {
                right:  String::new(),
                middle: String::new(),
                left:   String::new(),
            },
            1 => Parts {
                right:  String::new(),
                middle: String::new(),
                left:   strings.remove(0),
            },
            2 => Parts {
                right:  String::new(),
                middle: strings.remove(1),
                left:   strings.remove(0),
            },
            _ => {
                strings.truncate(3);
                Parts {
                    right:  strings.remove(2),
                    middle: strings.remove(1),
                    left:   strings.remove(0),
                }
            },
        }
    }
}

Playground
Compiler Explorer

It's not length == 3, it's length >= 3 -- if greater, it will shift all the elements after index 2, etc., if it can't determine it's okay to leave the vector in an invalid state for awhile. The consuming methods have an advantage in this sense.

(But I didn't check the compiler output.)

Right after posting I realized that, and came back to edit it, adding a truncate call. It now seems less obviously worse, but the assembly output is still slightly longer.

Hmm, that inspires another one from me. I forgot to take advantage of having ownership of the Vec in my previous one.

strings.resize_with(3, Default::default);
let [left, middle, right]: [_; 3] = strings.try_into().unwrap();
Parts { left, middle, right }

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=cc2793f27d50bad348d9fcebeae8148d

4 Likes

I think it should be possible to use array::map for this purpose.

let mut strings = strings.into_iter();
let [left, middle, right] = [(); 3].map(|()| strings.next().unwrap_or_default());
Parts {
    left,
    middle,
    right,
}
1 Like

Thank you all for your answers, it's very instructive to look at all the approaches which can be considered for such a simple problem, and it also shows how great is the Rust community!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.