What is the simplest way to create a derived iterator from a first iterator when the derived iterator returns values aggregated from the first one?

For instance, I read text from a file and expose it as an iterator, lines.

If I want to suppress empty lines, I can use filter(): lines.filter(...), etc.

But what if I want to return paragraphs as vectors of consecutive non-empty lines, as in the following Python example:

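def get_paragraphs(lines):
    """Group consecutive non-empty lines into paragraphs."""
    paragraph = []
    paragraph_list = []
    for line in lines:
        if line.strip():
            paragraph.append(line)
        elif paragraph:
            # A blank line ends the current paragraph.
            paragraph_list.append(paragraph)
            paragraph = []
    # Flush the last paragraph if the file does not end with a blank line.
    if paragraph:
        paragraph_list.append(paragraph)
    return paragraph_list

with open("some_text.txt") as lines:
    paragraph_list = get_paragraphs(lines)

for item in paragraph_list:
    print(item)
    print("- - -")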

The itertools crate has some useful functions for this, like batching and group_by (renamed chunk_by in itertools 0.13).


The main things Itertools::batching gets you (both perfectly valid considerations) are less boilerplate and the ability to just use a closure inline. Here's an example where you can compare.
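
As a rough illustration of that trade-off (not the poster's original example), here is a minimal sketch of the paragraph grouping written two ways with only the standard library. Paragraphs is a hand-written adapter struct, and paragraphs builds the same iterator from an inline closure. Both names are made up for this sketch, and std::iter::from_fn is used as a standard-library stand-in for Itertools::batching so that no external crate is needed.

```rust
use std::iter;

/// Hand-written adapter: wraps an iterator of lines and yields paragraphs,
/// i.e. vectors of consecutive non-empty lines. (Hypothetical name.)
struct Paragraphs<I: Iterator<Item = String>> {
    lines: I,
}

impl<I: Iterator<Item = String>> Iterator for Paragraphs<I> {
    type Item = Vec<String>;

    fn next(&mut self) -> Option<Vec<String>> {
        let mut paragraph = Vec::new();
        for line in self.lines.by_ref() {
            if !line.trim().is_empty() {
                paragraph.push(line);
            } else if !paragraph.is_empty() {
                // A blank line closes the paragraph collected so far.
                return Some(paragraph);
            }
            // Blank lines before a paragraph starts are skipped.
        }
        // Input exhausted: emit the trailing paragraph, if there is one.
        if paragraph.is_empty() {
            None
        } else {
            Some(paragraph)
        }
    }
}

/// Same logic as an inline closure: no struct and no trait impl.
/// (Hypothetical helper; from_fn stands in for Itertools::batching.)
fn paragraphs<I>(mut lines: I) -> impl Iterator<Item = Vec<String>>
where
    I: Iterator<Item = String>,
{
    iter::from_fn(move || {
        let mut paragraph = Vec::new();
        for line in lines.by_ref() {
            if !line.trim().is_empty() {
                paragraph.push(line);
            } else if !paragraph.is_empty() {
                return Some(paragraph);
            }
        }
        if paragraph.is_empty() {
            None
        } else {
            Some(paragraph)
        }
    })
}
```

Either version can be fed any iterator of String values, for example text.lines().map(String::from). Lines read through BufRead::lines arrive as io::Result<String>, so they would need their errors handled before being passed in.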

