Cannot move out of struct

I'm trying to implement an EDI spec in Rust, namely CWR (the Common Works Registration format). The purpose of the spec isn't really important here. It consists of a number of different 'record types', all of which share one common field and most of which share two more. Here's the gist of it:

  1. A file has a header record, followed by one or more groups, followed by a trailer record
  2. A group has a header record, followed by one or more transactions, followed by a trailer record
  3. A transaction consists of a set of records that describe a transaction

All of the records, from the file level down to the transaction level, have a record type, so I tried modelling this with a trait:

pub trait CwrRecord : AsCwrRecord + std::fmt::Debug {
    fn get_record_type(&self) -> RecordType;
}

pub enum RecordType {
    HDR,
    TRL,
    GRH,
    GRT,
    SPU
    // Rest omitted for brevity
}

All of the records at the transaction level have a record type as well as a transaction number and a record sequence number, so I also modelled this with a trait:

pub trait CwrTransactionRecord: CwrRecord {
    fn get_transaction_number(&self) -> u32;
    fn get_record_sequence_number(&self) -> u32;
}

Because you can't directly cast between trait objects, even within a hierarchy, I also had to add a helper trait with a blanket implementation for every type that implements CwrRecord:

pub trait AsCwrRecord {
    fn as_cwr_record(&self) -> &dyn CwrRecord;
}

impl <T: CwrRecord> AsCwrRecord for T {
    fn as_cwr_record(&self) -> &dyn CwrRecord {
        self
    }
}
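To illustrate what the helper trait buys you, here is a minimal sketch of going from &dyn CwrTransactionRecord to &dyn CwrRecord through as_cwr_record. The trait and type names come from the post; the trimmed-down SpuRecord and the record_type_of function are hypothetical and exist only for this example.

```rust
use std::fmt::Debug;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RecordType {
    SPU,
}

pub trait AsCwrRecord {
    fn as_cwr_record(&self) -> &dyn CwrRecord;
}

pub trait CwrRecord: AsCwrRecord + Debug {
    fn get_record_type(&self) -> RecordType;
}

pub trait CwrTransactionRecord: CwrRecord {
    fn get_transaction_number(&self) -> u32;
}

// Blanket impl: every sized type implementing CwrRecord can hand out
// a `&dyn CwrRecord` pointing at itself.
impl<T: CwrRecord> AsCwrRecord for T {
    fn as_cwr_record(&self) -> &dyn CwrRecord {
        self
    }
}

// Trimmed-down record type, for illustration only.
#[derive(Debug)]
pub struct SpuRecord {
    pub transaction_number: u32,
}

impl CwrRecord for SpuRecord {
    fn get_record_type(&self) -> RecordType {
        RecordType::SPU
    }
}

impl CwrTransactionRecord for SpuRecord {
    fn get_transaction_number(&self) -> u32 {
        self.transaction_number
    }
}

// Hypothetical helper: starts from the more specific trait object and
// reaches the base trait object through the supertrait method.
pub fn record_type_of(record: &dyn CwrTransactionRecord) -> RecordType {
    let base: &dyn CwrRecord = record.as_cwr_record();
    base.get_record_type()
}
```

As far as I know, recent compilers (Rust 1.86 and later) can coerce &dyn CwrTransactionRecord to &dyn CwrRecord directly, in which case the helper trait would no longer be needed.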

All of these work great so far. Using them we can define a few record types (I'm omitting some of this for brevity).

Here is a file trailer record, for instance:

#[derive(Debug)]
pub struct TrlRecord {
    pub group_count: u32,
    pub transaction_count: u32,
    pub record_count: u32,
}

impl CwrRecord for TrlRecord {
    fn get_record_type(&self) -> RecordType {
        RecordType::TRL
    }
}

Here is one of the transactional record types:

#[derive(Debug)]
pub struct SpuRecord {
    pub some_other_data: i32,
    pub transaction_number: u32,
    pub record_sequence_number: u32
}

impl CwrRecord for SpuRecord {
    fn get_record_type(&self) -> RecordType { RecordType::SPU }
}

impl CwrTransactionRecord for SpuRecord {
    fn get_transaction_number(&self) -> u32 {
        self.transaction_number
    }

    fn get_record_sequence_number(&self) -> u32 {
        self.record_sequence_number
    }
}

All of this works great! But looking at the hierarchy from the beginning, we need to wrap these record types up into groups and transactions (I am ignoring the file level for now):

#[derive(Debug)]
pub struct CwrTransaction<'a> {
    records: Vec<&'a dyn CwrTransactionRecord>,
}

#[derive(Debug)]
pub struct CwrGroup<'a> {
    pub group_header: GrhRecord,
    pub transactions: Vec<CwrTransaction<'a>>,
    pub group_trailer: GrtRecord,
}

Alright, so now we are at the part where I'm stuck. I want to be able to treat a transaction as something that can be iterated over to get its records. No problem, I can implement IntoIterator for it:

impl<'a> IntoIterator for CwrTransaction<'a> {
    type Item = &'a dyn CwrTransactionRecord;
    type IntoIter = CwrTransactionIntoIter<'a>;

    fn into_iter(self) -> Self::IntoIter {
        CwrTransactionIntoIter {
            transaction: self,
            index: 0,
        }
    }
}

pub struct CwrTransactionIntoIter<'a> {
    transaction: CwrTransaction<'a>,
    index: usize,
}

impl<'a> Iterator for CwrTransactionIntoIter<'a> {
    type Item = &'a dyn CwrTransactionRecord;

    fn next(&mut self) -> Option<&'a dyn CwrTransactionRecord> {
        // `get` returns None once the index runs past the end of the Vec;
        // `copied` turns the `&&dyn` into the stored `&'a dyn`.
        let record = self.transaction.records.get(self.index).copied()?;
        self.index += 1;
        Some(record)
    }
}
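As an aside, the hand-written index tracking above may not be necessary. The following is a sketch, not the code from the repository: it delegates to the iterators that Vec and slices already provide. The one-method CwrTransactionRecord trait, the trimmed SpuRecord and the transaction_numbers helper are simplified stand-ins assumed for this example.

```rust
// Simplified stand-in for the trait in the post.
pub trait CwrTransactionRecord {
    fn get_transaction_number(&self) -> u32;
}

pub struct SpuRecord {
    pub transaction_number: u32,
}

impl CwrTransactionRecord for SpuRecord {
    fn get_transaction_number(&self) -> u32 {
        self.transaction_number
    }
}

pub struct CwrTransaction<'a> {
    pub records: Vec<&'a dyn CwrTransactionRecord>,
}

// Consuming iteration: reuse the Vec's own by-value iterator.
impl<'a> IntoIterator for CwrTransaction<'a> {
    type Item = &'a dyn CwrTransactionRecord;
    type IntoIter = std::vec::IntoIter<&'a dyn CwrTransactionRecord>;

    fn into_iter(self) -> Self::IntoIter {
        self.records.into_iter()
    }
}

// Borrowing iteration: copy each stored `&'a dyn ...` out of the slice,
// so the items keep the long lifetime 'a rather than the borrow 't.
impl<'t, 'a> IntoIterator for &'t CwrTransaction<'a> {
    type Item = &'a dyn CwrTransactionRecord;
    type IntoIter = std::iter::Copied<std::slice::Iter<'t, &'a dyn CwrTransactionRecord>>;

    fn into_iter(self) -> Self::IntoIter {
        self.records.iter().copied()
    }
}

// Hypothetical helper showing the consuming impl in use.
pub fn transaction_numbers(transaction: CwrTransaction<'_>) -> Vec<u32> {
    transaction
        .into_iter()
        .map(|record| record.get_transaction_number())
        .collect()
}
```

With both impls in place, a `for record in &transaction` loop and a `for record in transaction` loop should each work, and flatten() over a Vec<CwrTransaction> or over its slice iterator picks the matching impl.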

This works great as well! The real trouble starts when I try to treat CwrGroup as something that can be iterated over too. I saw that there is a flatten() function, and I have successfully flattened a free-standing Vec<CwrTransaction> into just CwrTransactionRecords. However, when I try to do the same from inside a struct I run into issues:

impl<'a> IntoIterator for &'a CwrGroup<'a> {
    type Item = &'a dyn CwrRecord;
    type IntoIter = CwrGroupIntoIter<'a>;

    fn into_iter(self) -> Self::IntoIter {
         CwrGroupIntoIter {
            has_returned_header: false,
            has_returned_trailer: false,
            group: &self,
            transaction_iter: &mut self.transactions.into_iter().flatten()
        }
    }
}

pub struct CwrGroupIntoIter<'a> {
    has_returned_header: bool,
    has_returned_trailer: bool,
    group: &'a CwrGroup<'a>,
    transaction_iter: &'a mut dyn Iterator<Item = &'a dyn CwrTransactionRecord>
}

impl<'a> Iterator for CwrGroupIntoIter<'a> {
    type Item = &'a dyn CwrRecord;

    fn next(&mut self) -> Option<&'a dyn CwrRecord> {
        if !self.has_returned_header {
            self.has_returned_header = true;
            return Some(&self.group.group_header);
        }

        if let Some(record) = self.transaction_iter.next() {
            return Some(record.as_cwr_record());
        }

        todo!()
    }
}

Some of the veterans can probably guess what happens next. I hit two issues here:

error[E0515]: cannot return value referencing temporary value
  --> src/cwr/group.rs:16:10
   |
16 | /          CwrGroupIntoIter {
17 | |             has_returned_header: false,
18 | |             has_returned_trailer: false,
19 | |             group: &self,
20 | |             index: 0,
21 | |             transaction_iter: &mut self.transactions.into_iter().flatten()
   | |                                    --------------------------------------- temporary value created here
22 | |         }
   | |_________^ returns a value referencing data owned by the current function

error[E0507]: cannot move out of `self.transactions` which is behind a shared reference
  --> src/cwr/group.rs:21:36
   |
21 |             transaction_iter: &mut self.transactions.into_iter().flatten()
   |                                    ^^^^^^^^^^^^^^^^^ move occurs because `self.transactions` has type `Vec<transaction::CwrTransaction<'_>>`, which does not implement the `Copy` trait

error: aborting due to 2 previous errors; 1 warning emitted

How can I store an iterator for use in my CwrGroupIntoIter, and am I going about this the correct way (to have CwrGroup::into_iter give me back what I want - an iterator over &dyn CwrRecord)?

You're constructing the iterator as a local variable in the into_iter method, and you need the CwrGroupIntoIter object to own the iterator it creates.

Declare it as

transaction_iter: Box<dyn Iterator<Item = &'a dyn CwrTransactionRecord>>

Then construct it with

Box::new(self.transactions.into_iter().flatten())

I feel like that is super close to a solution. I had tried Box at one point and ran into this same issue, although things were structured slightly differently then. This is the new error:

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter `'a` due to conflicting requirements
  --> src/cwr/group.rs:15:42
   |
15 |       fn into_iter(self) -> Self::IntoIter {
   |  __________________________________________^
16 | |          CwrGroupIntoIter {
17 | |             has_returned_header: false,
18 | |             has_returned_trailer: false,
...  |
21 | |         }
22 | |     }
   | |_____^
   |
note: first, the lifetime cannot outlive the lifetime `'a` as defined on the impl at 11:6...
  --> src/cwr/group.rs:11:6
   |
11 | impl<'a> IntoIterator for &'a CwrGroup<'a> {
   |      ^^
note: ...so that the types are compatible
  --> src/cwr/group.rs:15:42
   |
15 |       fn into_iter(self) -> Self::IntoIter {
   |  __________________________________________^
16 | |          CwrGroupIntoIter {
17 | |             has_returned_header: false,
18 | |             has_returned_trailer: false,
...  |
21 | |         }
22 | |     }
   | |_____^
   = note: expected `IntoIterator`
              found `IntoIterator`
   = note: but, the lifetime must be valid for the static lifetime...
note: ...so that the expression is assignable
  --> src/cwr/group.rs:20:31
   |
20 |             transaction_iter: Box::new(self.transactions.into_iter().flatten())
   |                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   = note: expected `Box<(dyn Iterator<Item = &dyn records::CwrTransactionRecord> + 'static)>`
              found `Box<dyn Iterator<Item = &dyn records::CwrTransactionRecord>>`

I assume this is because it can't figure out the lifetime of the iterator without it being a reference, and trying to make it a reference gets us back to the previous error.
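The "expected Box<(dyn Iterator<...> + 'static)>" note in the error points at what is probably going on: a boxed trait object with no explicit bound defaults to 'static, so it cannot hold an iterator that borrows from shorter-lived data. Here is a minimal sketch of adding an explicit lifetime bound, using plain u32 values and hypothetical names (Numbers, sum_all) rather than the CWR types:

```rust
pub struct Numbers<'a> {
    // Without `+ 'a` this field would mean
    // `Box<dyn Iterator<Item = &'a u32> + 'static>`,
    // which an iterator borrowing from `groups` cannot satisfy.
    iter: Box<dyn Iterator<Item = &'a u32> + 'a>,
}

impl<'a> Numbers<'a> {
    pub fn new(groups: &'a [Vec<u32>]) -> Self {
        Numbers {
            // `iter()` borrows the outer slice and `flatten()` walks each
            // inner Vec by reference, so nothing is moved out of `groups`.
            iter: Box::new(groups.iter().flatten()),
        }
    }
}

impl<'a> Iterator for Numbers<'a> {
    type Item = &'a u32;

    fn next(&mut self) -> Option<&'a u32> {
        self.iter.next()
    }
}

// Hypothetical helper: sums every number reachable through the boxed iterator.
pub fn sum_all(groups: &[Vec<u32>]) -> u32 {
    Numbers::new(groups).copied().sum()
}
```

The same sketch also sidesteps the earlier E0507: calling iter() instead of into_iter() on a Vec that sits behind a shared reference borrows the elements instead of trying to move them.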

The full source can be found here if it helps (check out the adding_groups branch): https://github.com/aqez/cwr_highlighter

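That's probably related to this line:

impl<'a> IntoIterator for &'a CwrGroup<'a> {

This says that your trait will only be implemented if the CwrGroup effectively contains a reference to itself, which is invalid.

You probably want this:

impl<'a: 'b, 'b> IntoIterator for &'b CwrGroup<'a> {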

This indicates that the references within CwrGroup live at least as long as the reference to CwrGroup. You can always think of a generic lifetime parameter on a struct as indicating that the struct exists as a 'view' over some data elsewhere. Your reference is a view of the struct, and your struct is a view of some other data, so you need the struct to outlive the reference, or you'll allow a dangling reference.

Now the CwrGroupIntoIter object is acting as a view of the struct again, so you'll also need to bound it by something that doesn't outlive the CwrGroup it points to. So I'm thinking something like this, though I'm not absolutely sure I've got the bounds correct:

impl<'a: 'b, 'b> IntoIterator for &'b CwrGroup<'a> {
    type Item = &'a dyn CwrRecord;
    type IntoIter = CwrGroupIntoIter<'a, 'b>;
    ...
}

pub struct CwrGroupIntoIter<'a, 'b> {
    has_returned_header: bool,
    has_returned_trailer: bool,
    // We hold a reference that won't outlive the source data.
    group: &'b CwrGroup<'a>,
    // Here, we hold an iterator that returns references to the source data, but the iterator itself must not outlive our group reference above.
    transaction_iter: Box<dyn Iterator<Item = &'a dyn CwrTransactionRecord> + 'b>
}

I think we're getting super close but still having slight trouble. I've added good names for the lifetimes now:

#[derive(Debug)]
pub struct CwrGroup<'group> {
    pub group_header: GrhRecord,
    pub transactions: Vec<CwrTransaction<'group>>,
    pub group_trailer: GrtRecord,
}

// 'group outlives 'iterator. implemented for references that live at least as
// long as 'iterator with a lifetime parameter of 'group
impl<'group: 'iterator, 'iterator> IntoIterator for &'iterator CwrGroup<'group> {
    type Item = &'group dyn CwrRecord;
    type IntoIter = CwrGroupIntoIter<'iterator>;

    fn into_iter(self) -> Self::IntoIter {
        CwrGroupIntoIter {
            has_returned_header: false,
            has_returned_trailer: false,
            group: &self,
            transaction_iter: Box::new(self.transactions.into_iter().flatten()),
        }
    }
}

pub struct CwrGroupIntoIter<'iterator> {
    has_returned_header: bool,
    has_returned_trailer: bool,
    group: &'iterator CwrGroup<'iterator>,
    transaction_iter: Box<dyn Iterator<Item = &'iterator dyn CwrTransactionRecord>>
}

impl<'iterator> Iterator for CwrGroupIntoIter<'iterator> {
    type Item = &'iterator dyn CwrRecord;

    fn next(&mut self) -> Option<&'iterator dyn CwrRecord> {
        if !self.has_returned_header {
            self.has_returned_header = true;
            return Some(&self.group.group_header);
        }

        if let Some(record) = self.transaction_iter.next() {
            return Some(record.as_cwr_record());
        }

        if !self.has_returned_trailer {
            self.has_returned_trailer = true;
            return Some(&self.group.group_trailer);
        }

        None
    }
}

And getting this compiler error:

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter `'iterator` due to conflicting requirements
  --> src/cwr/group.rs:13:36
   |
13 | impl<'group: 'iterator, 'iterator> IntoIterator for &'iterator CwrGroup<'group> {
   |                                    ^^^^^^^^^^^^
   |
note: first, the lifetime cannot outlive the lifetime `'iterator` as defined on the impl at 13:25...
  --> src/cwr/group.rs:13:25
   |
13 | impl<'group: 'iterator, 'iterator> IntoIterator for &'iterator CwrGroup<'group> {
   |                         ^^^^^^^^^
note: ...but the lifetime must also be valid for the lifetime `'group` as defined on the impl at 13:6...
  --> src/cwr/group.rs:13:6
   |
13 | impl<'group: 'iterator, 'iterator> IntoIterator for &'iterator CwrGroup<'group> {
   |      ^^^^^^
note: ...so that the types are compatible
  --> src/cwr/group.rs:13:36
   |
13 | impl<'group: 'iterator, 'iterator> IntoIterator for &'iterator CwrGroup<'group> {
   |                                    ^^^^^^^^^^^^
   = note: expected `IntoIterator`
              found `IntoIterator`

error: aborting due to previous error

I've also tried impl<'group: 'iterator, 'iterator> IntoIterator for &'group CwrGroup<'group> which makes more sense to me logically, but then it says 'iterator is not constrained.

Here's what I was able to manage.

There are a few problems with your design. First,

pub struct CwrGroupIntoIter<'g, 'a> {
    group: &'g CwrGroup<'a>,
    transaction_iter: Box<dyn Iterator<Item=&'a dyn CwrTransactionRecord> + 'g>
    ...
}

This looks suspiciously like a self-borrow: transaction_iter wants to borrow from group, and if so the two can't really live in the same struct. Because both borrow from the transactions Vec, I got around that by cloning the whole Vec. What you should probably do instead is not store &'g CwrGroup<'a> internally, but break it apart:

pub struct CwrGroupIntoIter<'g, 'a> {
    has_returned_header: bool,
    has_returned_trailer: bool,
    group_header: &'a GrhRecord, // taken from CwrGroup
    group_trailer: &'a GrtRecord, // taken from CwrGroup
    // transactions: .. gone. It was consumed and made into the iterator below.
    transaction_iter: Box<dyn Iterator<Item=&'a dyn CwrTransactionRecord> + 'g>
}

Now, here you'll notice that I'm storing references to the group header and trailer. This is because in your Iterator impl for CwrGroupIntoIter, you're doing this:

        if !self.has_returned_header {
            self.has_returned_header = true;
            return Some(&self.group.group_header);
        }

Here you're constructing a reference to return from this method, but according to type Item = &'iterator dyn CwrRecord; you must return a reference scoped to 'iterator. You cannot construct this reference in this function, because &mut self is a temporary borrow that only exists while next is being called. So I stored the already-constructed reference in the CwrGroupIntoIter. You might want to construct it elsewhere.
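A minimal sketch of the idea, with hypothetical names (HeaderThenBody, and u32 values standing in for the records): a reference that is already stored in the iterator can simply be copied out with its full lifetime 'a, because the returned value does not borrow from &mut self.

```rust
pub struct HeaderThenBody<'a> {
    has_returned_header: bool,
    // A stored reference: it already carries the long lifetime 'a.
    header: &'a u32,
    body: std::slice::Iter<'a, u32>,
}

impl<'a> HeaderThenBody<'a> {
    pub fn new(header: &'a u32, body: &'a [u32]) -> Self {
        HeaderThenBody {
            has_returned_header: false,
            header,
            body: body.iter(),
        }
    }
}

impl<'a> Iterator for HeaderThenBody<'a> {
    type Item = &'a u32;

    fn next(&mut self) -> Option<&'a u32> {
        if !self.has_returned_header {
            self.has_returned_header = true;
            // Copies the `&'a u32` out of `self`. If `header` were an owned
            // `u32` field, `Some(&self.header)` would be tied to the short
            // `&mut self` borrow and would not satisfy `&'a u32`.
            return Some(self.header);
        }
        self.body.next()
    }
}
```

Because the items are not tied to the iterator, a returned reference stays usable after the iterator itself has been dropped.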

Thanks! This makes sense, I was trying to avoid clone but maybe that was for no good reason. The code you gave works great with clone, and now compiles and runs correctly for me, thank you for the help!


I did end up being able to get this to work without a clone - it took some pretty deep thinking about it though! I realized that having the CwrGroup own the transactions was not what I wanted, it just needs to reference them:

#[derive(Debug)]
pub struct CwrGroup<'group> {
    pub group_header: &'group GrhRecord,
    pub transactions: &'group Vec<CwrTransaction<'group>>,
    pub group_trailer: &'group GrtRecord,
}

Then the clone can just fall away naturally:

impl<'group: 'iterator, 'iterator> IntoIterator for &'group CwrGroup<'iterator> {
    type Item = &'group dyn CwrRecord;
    type IntoIter = CwrGroupIntoIter<'group, 'iterator>;

    fn into_iter(self) -> Self::IntoIter {
        CwrGroupIntoIter {
            has_returned_header: false,
            has_returned_trailer: false,
            group: self,
            transaction_iter: Box::new(self.transactions.iter().flatten())
        }
    }
}

But this meant I needed to implement IntoIterator for &CwrTransaction instead of CwrTransaction, and change the iterator's ownership as well so that it just holds a reference:

pub struct CwrTransactionIntoIter<'a> {
    transaction: &'a CwrTransaction<'a>,
    index: usize,
}
impl<'a> IntoIterator for &'a CwrTransaction<'a> {
    type Item = &'a dyn CwrTransactionRecord;
    type IntoIter = CwrTransactionIntoIter<'a>;

    fn into_iter(self) -> Self::IntoIter {
        CwrTransactionIntoIter {
            transaction: self,
            index: 0,
        }
    }
}

Now with all that done we can iterate through it all with no copies or clones. Still, I wouldn't have come up with this without your help @skysh, especially the realization that I needed to place the iterator in a Box so that it would live long enough to be useful in the next(&mut self) method on Iterator! The same goes for the strange syntax here, which I hadn't come across before: Box<dyn Iterator<Item = &'iterator dyn CwrTransactionRecord> + 'group>.
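For comparison, here is a sketch of an alternative that keeps the header and trailer owned by CwrGroup and builds the whole sequence from standard iterator adapters instead of a hand-written iterator struct. This is not the code from the repository: the records method, the unit-struct record types and the empty CwrTransactionRecord trait are simplifications assumed for this example.

```rust
use std::fmt::Debug;
use std::iter;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RecordType { GRH, GRT, SPU }

pub trait AsCwrRecord {
    fn as_cwr_record(&self) -> &dyn CwrRecord;
}
pub trait CwrRecord: AsCwrRecord + Debug {
    fn get_record_type(&self) -> RecordType;
}
pub trait CwrTransactionRecord: CwrRecord {}

impl<T: CwrRecord> AsCwrRecord for T {
    fn as_cwr_record(&self) -> &dyn CwrRecord { self }
}

// Unit structs stand in for the real record types.
#[derive(Debug)] pub struct GrhRecord;
#[derive(Debug)] pub struct GrtRecord;
#[derive(Debug)] pub struct SpuRecord;

impl CwrRecord for GrhRecord {
    fn get_record_type(&self) -> RecordType { RecordType::GRH }
}
impl CwrRecord for GrtRecord {
    fn get_record_type(&self) -> RecordType { RecordType::GRT }
}
impl CwrRecord for SpuRecord {
    fn get_record_type(&self) -> RecordType { RecordType::SPU }
}
impl CwrTransactionRecord for SpuRecord {}

#[derive(Debug)]
pub struct CwrTransaction<'a> {
    pub records: Vec<&'a dyn CwrTransactionRecord>,
}

#[derive(Debug)]
pub struct CwrGroup<'a> {
    pub group_header: GrhRecord,
    pub transactions: Vec<CwrTransaction<'a>>,
    pub group_trailer: GrtRecord,
}

impl<'a> CwrGroup<'a> {
    // Hypothetical method: header, then every transaction record, then trailer.
    // The `+ 'g` bound lets the boxed iterator borrow from `self`.
    pub fn records<'g>(&'g self) -> Box<dyn Iterator<Item = &'g dyn CwrRecord> + 'g> {
        let header: &'g dyn CwrRecord = &self.group_header;
        let trailer: &'g dyn CwrRecord = &self.group_trailer;
        let body = self
            .transactions
            .iter()
            // Borrow each transaction and copy out its stored references.
            .flat_map(|transaction| transaction.records.iter().copied())
            .map(|record| record.as_cwr_record());
        Box::new(iter::once(header).chain(body).chain(iter::once(trailer)))
    }
}
```

In this sketch the iterator only holds borrows of the group, so there is no clone and no self-referential struct. Whether a method like this or an IntoIterator impl reads better is a matter of taste.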

