Iterate over chars of a string and take ownership

I want to iterate over the digits of base-10 numbers given in an iterator, e.g. [123, 437, ...] -> "123437...". My first approach was:

iter.flat_map(|x| x.to_string().chars())

However, that fails with "returns a value referencing data owned by the current function", which makes sense, since chars() borrows the newly created string rather than taking ownership of it. Is it possible to create an iterator over the chars that takes ownership of the string? Something like into_chars()?

I don't think such an iterator currently exists, but in this case I would recommend fold instead.

use std::fmt::Write;

fn main() {
    let xs = [123, 437, 100];
    
    let s = xs.iter().fold(String::new(), |mut s, x| {
        write!(&mut s, "{}", x).expect("Cannot fail when writing to string.");
        s
    });
    
    println!("{}", s);
}


Wouldn't this require reallocating potentially multiple times? The nice thing about using iterators is that Rust (I assume) implicitly uses the length of the iterator to avoid reallocation.

Iterators cannot always automatically predict the length accurately, and this is one of those cases.


Ah, I see. So folding and writing into a String is effectively what an iterator would be doing anyway?

Close to it. With fold, no size estimate is made at all unless you reserve capacity yourself when creating the String. collect() is not much different here, because the size_hint() function it uses to estimate the allocation size cannot always compute the size accurately. In this case, to use collect() you would need a flat_map(), and that combinator can provide little or no size hint.
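To make the point about size hints concrete, here is a minimal sketch. The helper names concat_digits and flat_map_hint are made up for illustration, and the "about 4 digits per number" capacity guess is only an assumption, not something from this thread. It reserves capacity by hand for the fold and prints whatever size_hint() a flat_map-based iterator reports, so you can check it yourself:

```rust
use std::fmt::Write;

/// Hypothetical helper: concatenates the decimal digits of `xs`,
/// reserving capacity up front instead of relying on a size hint.
fn concat_digits(xs: &[u32]) -> String {
    // Assumption for illustration only: roughly 4 digits per number.
    let buf = String::with_capacity(xs.len() * 4);
    xs.iter().fold(buf, |mut s, x| {
        write!(s, "{}", x).expect("writing to a String cannot fail");
        s
    })
}

/// Hypothetical helper: reports the size hint of a flat_map-based
/// digit iterator before any element has been pulled from it.
fn flat_map_hint(xs: &[u32]) -> (usize, Option<usize>) {
    xs.iter()
        .flat_map(|x| x.to_string().into_bytes())
        .size_hint()
}

fn main() {
    let xs = [123, 437, 100];
    println!("{}", concat_digits(&xs));
    // The lower bound is what collect() can use to pre-allocate.
    println!("{:?}", flat_map_hint(&xs));
}
```

If the capacity guess is too small, the String still grows as needed, so the guess only affects how many reallocations happen.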


This sidesteps the question a bit (XY-problem style), but you can exploit the fact that digits are ASCII and go through the Vec<u8> view of a String:

iter.flat_map(|x| Vec::from(x.to_string()).into_iter()).map(char::from)

EDIT: edited the code for more clarity

I think you meant Vec::from_iter. into_iter is an instance method that takes no arguments. Also, it takes IntoIterator, so you can omit the .into().

Hmmm, my code might indeed have been less idiomatic when written in that fashion, so I have rewritten it, while keeping the same semantics:

iter.flat_map(|x|
-   Vec::into_iter(x.to_string().into())
+   Vec::from(x.to_string()).into_iter()
).map(char::from)

The idea is that a String is a Vec<u8> with the added invariant that the sequence of bytes is valid UTF-8. Since we are dealing with ASCII bytes, there are no UTF-8 shenanigans whatsoever, so we can "drop" that invariant and handle the bytes directly by transforming the String .into() the Vec<u8> it wraps (a zero-cost transformation). And now we can .into_iter() it.

Finally, for a solution compliant with the OP, I go back to the char world from the u8, since ASCII bytes are a subset of chars (Unicode Scalar Values).
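Putting the pieces together, the byte-based approach might look like the sketch below. The function name digits is hypothetical, and it uses String::into_bytes for the String to Vec<u8> step. It is only correct because decimal digits are ASCII; for arbitrary text, mapping each byte to a char would produce wrong characters.

```rust
/// Hypothetical helper: lazily yields the decimal digits of each number.
fn digits<I>(iter: I) -> impl Iterator<Item = char>
where
    I: Iterator<Item = u32>,
{
    // String -> Vec<u8> moves the buffer without copying it, and the
    // resulting vec::IntoIter owns the bytes, so nothing is borrowed
    // from a temporary.
    iter.flat_map(|x| x.to_string().into_bytes())
        // u8 -> char is only right here because every byte is ASCII.
        .map(char::from)
}

fn main() {
    let s: String = digits(vec![123, 437, 100].into_iter()).collect();
    println!("{}", s);

    // The iterator is lazy, so an unbounded source works as well.
    let first: String = digits(1u32..).take(12).collect();
    println!("{}", first);
}
```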

In Rust you can always call any method as a function that expects its receiver as a first argument:

  • So, for instance, instead of 42.to_string(), you can write i32::to_string(&42)

  • Much like in Python, by the way:

    • str.format("Hello, {}", "World") is the same as "Hello, {}".format("World")
    # Rust mode = ON
    println = lambda *args, **kw: print(str.format(*args, **kw), end="\n")
    
    println("Hello, {}", "World")
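On the Rust side, the two call styles can be compared directly. This is just a small sketch with made-up function names, showing that the method form and the function form give the same result:

```rust
/// Method-call syntax.
fn method_style(n: i32) -> String {
    n.to_string()
}

/// Same call, written as a function taking the receiver as first argument.
fn function_style(n: i32) -> String {
    i32::to_string(&n)
}

/// String -> Vec<u8> -> owning iterator, using method-call syntax.
fn bytes_method_style(s: String) -> Vec<u8> {
    Vec::from(s).into_iter().collect()
}

/// The same chain, with into_iter called as a function.
fn bytes_function_style(s: String) -> Vec<u8> {
    let v: Vec<u8> = s.into();
    Vec::into_iter(v).collect()
}

fn main() {
    assert_eq!(method_style(42), function_style(42));
    assert_eq!(
        bytes_method_style(String::from("123")),
        bytes_function_style(String::from("123"))
    );
    println!("both styles agree");
}
```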
    

Or you can just write slice.iter().map(|n| n.to_string()).collect(), thanks to the impl FromIterator<String> for String in std.
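As a complete program, that suggestion could look like this. The function name concat is made up for the example; the return type is what tells collect() to build a String:

```rust
/// Hypothetical helper: joins the decimal representations of `xs`.
fn concat(xs: &[u32]) -> String {
    // collect() can build a String from an iterator of Strings.
    xs.iter().map(|n| n.to_string()).collect()
}

fn main() {
    println!("{}", concat(&[123, 437, 100]));
}
```

Note that this builds one temporary String per number before appending it, which is fine for a finite slice but does not give you a lazy stream of chars.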


Ah, right. I knew that about Rust methods, but I didn't realize you were doing it there.

You can also use x.into_bytes() to turn it into its inner Vec<u8>.

Is there any reason why there is no into_chars() method on String? Would it make sense to propose one?

Is there any real use case for consuming the String and returning an iterator over its chars? OP's problem can be solved without iterating over chars, as in my previous post.

In my case, the stream is infinite and is not terminated by a collect operation. @Yanderos's code is exactly what I want, but it only works for ASCII strings (which is OK for my use case). Other data structures also have methods that return iterators that take ownership of the data structure.
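For what it's worth, an owning char iterator that also handles non-ASCII text is not hard to write by hand. Below is a minimal sketch; the IntoChars type and the into_chars function are hypothetical names for this example, not an existing std API as far as this thread is concerned. It keeps the String together with a byte offset and decodes one char per call:

```rust
/// Hypothetical owning iterator over the chars of a String.
struct IntoChars {
    s: String,
    /// Byte offset of the next char; always on a char boundary.
    pos: usize,
}

impl Iterator for IntoChars {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Decode the next char from the remaining slice, if any.
        let c = self.s[self.pos..].chars().next()?;
        self.pos += c.len_utf8();
        Some(c)
    }
}

/// Hypothetical free function standing in for `String::into_chars`.
fn into_chars(s: String) -> IntoChars {
    IntoChars { s, pos: 0 }
}

fn main() {
    // Works on an unbounded stream, since nothing is borrowed from
    // the temporary String created inside the closure.
    let digits: String = (1u32..)
        .flat_map(|x| into_chars(x.to_string()))
        .take(12)
        .collect();
    println!("{}", digits);
}
```

Compared with the byte-based version, this costs a UTF-8 decode per char, but it does not rely on the input being ASCII.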