Declaring two versions of a struct to cope with ownership

nroi · November 15, 2020, 5:17pm

When dealing with ownership and references, I sometimes feel tempted to either use clone(), or create a new struct which is almost identical to an existing struct except for the fact that some fields include a lifetime annotation. This way, I can have ownership when I need to, but at the same time, use reference types when possible to avoid cloning.

But it seems inelegant to clutter the codebase with things like:

struct MyStruct<T> {
    value: T,
}

struct MyStructRef<'a, T> {
    value: &'a T,
}

when there is absolutely no difference between the two structs except that one deals with lifetimes and the other doesn't.

So I'm wondering: is there something I'm missing that actually makes this pattern completely unnecessary? Or is this something that even experienced Rust programmers deal with regularly?

I've included some code to help illustrate the problem I'm having:

// We intend to marshall a Vec<i32> and include a timestamp of the time when the Vec was marshalled:
struct ItemsTimestamped {
    items: Vec<i32>,
    timestamp_unix_epoch: u64,
}

const TIMESTAMP_UNIX_EPOCH: u64 = 1600000000; // just use a constant value for demo purposes.

fn unmarshall() -> ItemsTimestamped {
    // Let's just pretend we're using something like serde to deserialize the struct from a JSON file.
    ItemsTimestamped {
        items: vec![1, 2, 3],
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    }
}

// Marshalling requires only a reference to the struct.
fn marshall(_my_struct_timestamped: &ItemsTimestamped) {}


// However, even if the struct itself is just a reference, the fields within this struct are owned.
// But we want to include the Vec<i32> in the struct without taking ownership of the
// original Vec<i32>. So we want to be able to write something like the following:

fn solution_1() {
    // obtain ownership of the items.
    let items = unmarshall().items;
    // marshall the items without losing ownership:
    marshall_cloned(&items);
    // just to illustrate that we haven't lost ownership:
    take_ownership(items);
}


// 1st solution: Just clone the Vec. This can be inefficient if the Vec is large.
fn marshall_cloned(items: &Vec<i32>) {
    let timestamped = &ItemsTimestamped {
        items: items.clone(),
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    };
    marshall(timestamped)
}

// 2nd solution: Take ownership of the Vec to serialize, and then return this exact same Vec.
// So the caller does lose ownership of the Vec (which we actually want to avoid), but the caller
// can then just use the return value in place of the Vec that was moved.
// The disadvantage is that, when reading the code, it is not immediately obvious why a Vec<i32>
// is passed and also returned. Usually, for a function with a type signature like (T) -> T, you would
// expect that the function does not simply return the input without making any modifications.
fn marshall_ref_return_original(items: Vec<i32>) -> Vec<i32> {
    let timestamped = ItemsTimestamped {
        items,
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    };
    marshall(&timestamped);
    timestamped.items
}

// 3rd solution: Introduce a new type that basically has the same meaning as ItemsTimestamped,
// but includes lifetimes. This also works and does not involve cloning, but now we need
// a new type just to cope with lifetimes.

struct ItemsTimestampedRef<'a> {
    items: &'a Vec<i32>,
    timestamp_unix_epoch: u64,
}

fn marshall_ref(_my_struct_timestamped: ItemsTimestampedRef) {}

fn marshall_new_type(items: &Vec<i32>) {
    let timestamped = ItemsTimestampedRef {
        items,
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    };
    marshall_ref(timestamped)
}

fn solution_2() {
    let items = unmarshall().items;
    let items = marshall_ref_return_original(items);
    take_ownership(items);
}

fn solution_3() {
    let items = unmarshall().items;
    marshall_new_type(&items);
    take_ownership(items);
}

fn take_ownership(_items: Vec<i32>) {}

fn main() {
    solution_1();
    solution_2();
    solution_3();
}

alice · November 15, 2020, 5:23pm

It's unclear where the issue is. This should work perfectly fine:

fn solution_1() {
    // obtain ownership of the items.
    let items = unmarshall().items;
    // marshall the items without losing ownership:
    marshall_cloned(&items);
    // just to illustrate that we haven't lost ownership:
    take_ownership(items);
}

It sounds like you think ownership is transferred in some situation where it is not actually transferred. Perhaps post the code that you wanted to compile but that doesn't?

nroi · November 15, 2020, 5:37pm

I've described the issue in the comment of the function marshall_cloned():

Just clone the Vec. This can be inefficient if the Vec is large.

The code that I posted does compile. But each of the three solutions in this code has its own disadvantage, so my question is whether there are better approaches.

alice · November 15, 2020, 5:40pm

Ah, I see. You want to go from &Vec<i32> to &ItemsTimestamped. Then you probably want a separate struct, yes.

steffahn · November 15, 2020, 5:56pm

There’s the option of using generics to avoid writing a separate struct by hand, while in fact (after monomorphization) still having a separate struct. (The main advantage is if you have functionality that is supposed to work on both versions.)

#![allow(unused)]

// generic struct
struct ItemsTimestamped<Items> {
    items: Items,
    timestamp_unix_epoch: u64,
}

// type synonyms (if needed)
type ItemsTimestampedOwned = ItemsTimestamped<Vec<i32>>;
type ItemsTimestampedRef<'a> = ItemsTimestamped<&'a [i32]>;

// everything still working
fn unmarshall() -> ItemsTimestampedOwned {
    // Let's just pretend we're using something like serde to deserialize the struct from a JSON file.
    ItemsTimestamped {
        items: vec![1, 2, 3],
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    }
}

const TIMESTAMP_UNIX_EPOCH: u64 = 1600000000;

fn marshall_ref(_my_struct_timestamped: ItemsTimestampedRef) {}
fn marshall(items: &[i32]) {
    let timestamped = ItemsTimestamped {
        items,
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    };
    marshall_ref(timestamped)
}


// functionality that is supposed to work on both versions
// can use generics
fn common_function(my_struct_timestamped: ItemsTimestamped<impl AsRef<[i32]>>) {
    // access item
    my_struct_timestamped.items.as_ref()[0];
}

// could also define generic borrowing function
impl<T: AsRef<[i32]>> ItemsTimestamped<T> {
    fn as_timestamped_ref(&self) -> ItemsTimestampedRef<'_> {
        ItemsTimestamped {
            items: self.items.as_ref(),
            timestamp_unix_epoch: self.timestamp_unix_epoch,
        }
    }
}

// some random code using common_function
fn use_common_function() {
    let owned = unmarshall();
    common_function(owned.as_timestamped_ref());
    common_function(owned)
}

SkiFire13 · November 15, 2020, 8:05pm

Another solution would be to use Cow<[i32]> for the items field, which can hold either a borrowed slice or an owned Vec.
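Here is a rough sketch of how that could look for the struct from the original post. The helper name timestamped_view and the String returned by marshall are made up for illustration; real serialization would go where the format! call is.

```rust
use std::borrow::Cow;

// One struct serves both cases: `items` is either borrowed or owned.
struct ItemsTimestamped<'a> {
    items: Cow<'a, [i32]>,
    timestamp_unix_epoch: u64,
}

const TIMESTAMP_UNIX_EPOCH: u64 = 1_600_000_000; // constant for demo purposes

// Owned variant: roughly what a deserializer would hand back.
fn unmarshall() -> ItemsTimestamped<'static> {
    ItemsTimestamped {
        items: Cow::Owned(vec![1, 2, 3]),
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    }
}

// Borrowed variant (hypothetical helper): wraps the caller's slice without cloning it.
fn timestamped_view(items: &[i32]) -> ItemsTimestamped<'_> {
    ItemsTimestamped {
        items: Cow::Borrowed(items),
        timestamp_unix_epoch: TIMESTAMP_UNIX_EPOCH,
    }
}

// Stand-in for real serialization; it only reads through the Cow.
fn marshall(timestamped: &ItemsTimestamped) -> String {
    format!("{:?}@{}", timestamped.items, timestamped.timestamp_unix_epoch)
}

fn main() {
    // into_owned() returns the Vec as-is here because the Cow is already Owned.
    let items: Vec<i32> = unmarshall().items.into_owned();
    println!("{}", marshall(&timestamped_view(&items)));
    drop(items); // the caller still owns the Vec
}
```

The trade-off is that the struct still carries a lifetime parameter, and whether it is borrowed or owned is only known at run time.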

kornel · November 16, 2020, 10:36am

In the case of a growable Vec vs. a slice &[], the layout in memory is different. Rust is too low-level and strictly typed to magically insert any code to abstract away such a difference.

There is also a semantic difference that a struct containing a temporary borrow (<'a>) has to be limited to the scope of the borrow and never try to free borrowed data. OTOH a self-contained struct isn't limited by any scope, but it does have to free the memory after its last use. So they can't be treated the same way.

It is relatively common that you get owning and borrowing versions of a type (String/&str, PathBuf/Path, Vec/&[], CString/CStr, etc.)
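A small sketch of how those pairs are typically used together: code written against the borrowed form accepts data from either kind of owner, and going back to the owned form is an explicit copy. The function name describe and the values are made up for illustration.

```rust
use std::path::{Path, PathBuf};

// Written against the borrowed forms, so it accepts data from any owner.
fn describe(name: &str, dir: &Path, values: &[i32]) -> String {
    format!("{} in {} has {} values", name, dir.display(), values.len())
}

fn main() {
    // Owned forms: each one frees its buffer when it goes out of scope.
    let name: String = String::from("report");
    let dir: PathBuf = PathBuf::from("/tmp");
    let values: Vec<i32> = vec![1, 2, 3];

    // Owned -> borrowed is a cheap deref coercion; nothing is copied.
    println!("{}", describe(&name, &dir, &values));

    // Borrowed -> owned allocates and copies.
    let name_copy: String = name.as_str().to_owned();
    let dir_copy: PathBuf = dir.as_path().to_path_buf();
    let values_copy: Vec<i32> = values.as_slice().to_vec();
    assert_eq!((name_copy, dir_copy, values_copy), (name, dir, values));
}
```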

You can make one type that can dynamically contain either variant with an enum like Cow, but generally whether something is borrowed or owned is fundamental in Rust. It's necessary for the borrow checker to check validity of the code, and necessary for the compiler to insert appropriate Drop for memory management.
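To make the "enum like Cow" idea concrete, here is a simplified, hand-rolled stand-in. MaybeOwnedItems is a hypothetical type written only for this illustration; the real std::borrow::Cow is generic over any ToOwned type and has more methods.

```rust
use std::ops::Deref;

// Hypothetical, simplified stand-in for Cow<'a, [i32]>; the variant
// records at run time whether the data is borrowed or owned.
enum MaybeOwnedItems<'a> {
    Borrowed(&'a [i32]),
    Owned(Vec<i32>),
}

impl Deref for MaybeOwnedItems<'_> {
    type Target = [i32];

    // Readers get a slice regardless of which variant is inside.
    fn deref(&self) -> &[i32] {
        match self {
            MaybeOwnedItems::Borrowed(slice) => *slice,
            MaybeOwnedItems::Owned(vec) => vec.as_slice(),
        }
    }
}

impl MaybeOwnedItems<'_> {
    // Produces an owned Vec, copying only if the data was borrowed.
    fn into_owned(self) -> Vec<i32> {
        match self {
            MaybeOwnedItems::Borrowed(slice) => slice.to_vec(),
            MaybeOwnedItems::Owned(vec) => vec,
        }
    }
}

fn sum(items: &[i32]) -> i32 {
    items.iter().sum()
}

fn main() {
    let original = vec![1, 2, 3];
    let borrowed = MaybeOwnedItems::Borrowed(&original);
    let owned = MaybeOwnedItems::Owned(vec![4, 5]);
    println!("{} {}", sum(&borrowed), sum(&owned));
    // Dropping `borrowed` frees nothing; dropping `owned` frees its Vec.
}
```

The compiler still tracks the lifetime 'a for the Borrowed variant, so the enum does not escape the borrow checker; it only defers the owned-or-borrowed decision to run time.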

Fun fact: this distinction exists in C and is similarly strict, but isn't enforced in the type system. In C it's about whether you call free on the data or not. If you call free on borrowed data, you'll get a crash or double-free. If you don't call free on owned data, you'll get a memory leak. If you call free sometimes depending on a flag, that's like Rust's Cow.

system · February 14, 2021, 10:36am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
