Moving out of a function both owned and borrowed entities?

Thanks. Interesting; does it fix the original problem? I will try it. Is that solution based on the assumption?

It avoids the need for such a conversion, yes.

Basically, you moved dst_buffer_bytes out of the structure, if I understand it correctly.

Basically as_byte_slice just reinterprets any kind of structure as bytes.

That should be a feasible solution as well. If you want a still-maintained crate alternative, try ouroboros.

Oh, cool! Looking into that too.

Sorry, but your assumption is incorrect. If field F of the struct is at address A, and another field points to address A, then it will still point to address A even after you have moved the struct to another place (say, address B). Hence, it will be a dangling pointer. If you somehow force the compiler to accept the code anyway, you will have problems with that dangling pointer, and therefore you should not do that.

But the owner of the data is moved together with the reference. Hence no deallocation of memory happens, and there is no dangling pointer.

At least in theory, with zero knowledge of how the compiler actually implements this, that is how it looks.

The dangling reference isn't created by deallocation. It is created by the fact that the value is moved out of the place where it keeps pointing. That's what a move means: moving the value from one place to another. The place from where it is moved is invalidated, considered uninitialized, and may not be accessed again.

But both dst_buffer_bytes and dst_cursor point to the same data, which continues to be available, and both are returned. I don't understand why that should not work. I thought it was something Rust has out of the box. Anyway, thanks for the explanations.

I see no theoretical reason why that code should not work. It's a pity Rust does not currently support it.

Simplified problem

That is true, and it is not the issue. Those are both borrows. There is no problem with putting two simultaneous borrows into the same struct.

The issue is that the owned buffer, dst_buffer, is also inside the same struct that contains borrows to itself. Consider the following, simplified scenario which only involves primitive types for the sake of understanding. I am using raw pointers instead of references, so that I can intentionally shoot myself in the foot despite the attempts of the type system to prevent me from doing exactly that:

#[derive(Debug)]
struct SelfRef {
    value: usize,
    pointer: *const usize,
}

fn main() {
    let mut x = SelfRef { value: 42, pointer: std::ptr::null() };
    x.pointer = &x.value;
    
    println!("&x.value = {:p}; x = {:#?}", &x.value, x);
    
    // Let's move `x` to another place, say `y`.
    let y = x;
    
    println!("&y.value = {:p}; y = {:#?}", &y.value, y);
}

This prints the following:

&x.value = 0x7fff29c73fa8; x = SelfRef {
    value: 42,
    pointer: 0x00007fff29c73fa8,
}
&y.value = 0x7fff29c74020; y = SelfRef {
    value: 42,
    pointer: 0x00007fff29c73fa8,
}

As you can see, the address of the field y.value is different from the address of the field x.value. This happens because x and y are distinct variables and live in two different locations in memory. However, moves in Rust are just byte-wise copies where the source will be considered invalid (uninitialized) after the move is completed. Therefore, the pointer-valued field isn't magically updated to point to the address of the field inside the new variable, y.value. So y.pointer will still point to x.value, which, however, doesn't exist anymore, precisely because it has been moved out.

To further illustrate this with ASCII art: this is what memory looks like after x is initialized but y is not yet:

+-----------+-------------------------+-----+---------------+---------------+
| value: 42 | pointer: 0x7fff29c73fa8 | ... | uninitialized | uninitialized |
+-----------+-------------------------+-----+---------------+---------------+
^           ^                               ^               ^
|           +-- address 0x7fff29c73fb0      |               +-- address 0x7fff29c74028
+-- address 0x7fff29c73fa8                  +-- address 0x7fff29c74020

And this is what it looks like after the contents of variable x have been moved to variable y:

+---------------+---------------+---------------+-----------+-------------------------+
| uninitialized | uninitialized |      ...      | value: 42 | pointer: 0x7fff29c73fa8 |
+---------------+---------------+---------------+-----------+-------------------------+
^               ^                               ^           ^
|               +-- address 0x7fff29c73fb0      |           +-- address 0x7fff29c74028
+-- address 0x7fff29c73fa8                      +-- address 0x7fff29c74020

As you can see, the values (the actual numerical contents of value and the contents of pointer) didn't change, but the location of the value in memory (i.e., the "identity" of which variable they are stored in) did. That's what a move is.

However, the pointer not having changed means that it still points inside the now-defunct variable. This in turn means that dereferencing it and using the referred (non-)value is Undefined Behavior. (If you had a reference instead of a raw pointer, the situation would be even worse, because you wouldn't even need to dereference it to cause instant UB: the mere existence of a reference that points to uninitialized memory is considered UB in itself.)

This is not something that needs to be "supported". This is a programmer error that needs to be detected, and Rust is super successful in detecting this kind of mistake.


Of course, when an owning heap allocation (e.g. a Box) is involved, you could technically argue that the location of the heap-allocated object doesn't change because things move around on the stack. That is true. However, due to the way references are related to each other, any reference derived from the heap allocation would also be considered invalid after the heap-allocating smart pointer or container has itself moved.
This in turn can be worked around by using raw pointers and unsafe code in a way that is correct and sound. However, doing so is strongly discouraged, because needing a self-referential type is a code smell. There are much, much better and cleaner ways of solving this kind of problem safely; the usual recommendation is that you split up the type into an owner and a view type. This also helps with decoupling, encapsulation, and upholding the single responsibility principle.
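
A minimal sketch of what such an owner/view split might look like. The names `DstBuffer` (owner) and `DstView` (view) are made up for illustration and are not from the original code. To stay dependency-free, the view borrows the `f32` samples directly; with byte_slice_cast it could presumably hold the `&[u8]` returned by `as_byte_slice` instead.

```rust
/// Owner: the only type that holds the allocation. (Hypothetical name.)
#[derive(Debug)]
struct DstBuffer {
    samples: Box<[f32]>,
}

/// View: borrows from an owner that lives elsewhere. (Hypothetical name.)
#[derive(Debug)]
struct DstView<'a> {
    samples: &'a [f32],
    cursor: usize,
}

impl DstBuffer {
    fn new(len: usize) -> DstBuffer {
        DstBuffer { samples: vec![0_f32; len].into_boxed_slice() }
    }

    /// The borrow is tied to `&self`, so the compiler rejects moving or
    /// dropping the owner while a view is still in use.
    fn view(&self) -> DstView<'_> {
        DstView { samples: &self.samples, cursor: 0 }
    }
}

impl<'a> DstView<'a> {
    /// Returns the next sample and advances the cursor, or `None` at the end.
    fn next_sample(&mut self) -> Option<f32> {
        let sample = self.samples.get(self.cursor).copied()?;
        self.cursor += 1;
        Some(sample)
    }
}

fn main() {
    // A constructor returns only the owner; the caller creates the view.
    let buffer = DstBuffer::new(2);
    let mut view = buffer.view();
    println!("{:?}", view.next_sample());
    println!("{:?}", view);
}
```

With this layout nothing self-referential is ever returned from a function: the owner can be moved freely as long as no view exists, and a view is created only where the owner already has a stable home.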

Thanks. Interesting. I am trying to understand.

What about this?

use byte_slice_cast::*;
use owning_ref::*;

fn main() {
    let context = Context::new();
    dbg!(&context);
    dbg!(context.dst.as_owner());
    dbg!(&*context.dst);
}

//

#[derive(Debug)]
struct Context {
    // pub dst_buffer : Box::< [ f32 ] >,
    // pub dst_buffer_bytes : &'a [ u8 ],
    pub dst: OwningRef<Box<[f32]>, [u8]>,
}

//

impl Context {
    fn new() -> Context {
        let len: usize = 2;
        let dst_buffer: Box<[f32]> = vec![0_f32; len].into_boxed_slice();
        // let dst_buffer_bytes = dst_buffer.as_byte_slice();

        let dst = OwningRef::new(dst_buffer);
        let dst = dst.map(|dst_buffer| dst_buffer.as_byte_slice());

        Context { dst }
        // Context { dst_buffer, dst_buffer_bytes }
    }
}

Do you see any problem with such solution?

OwningRef works by pretty much unconditionally requiring a heap-allocated container. In addition, it uses raw pointers and encapsulates any unsafe code, so that provenance rules aren't violated.
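
Why a heap-allocated container matters can be seen in a small sketch. It is not taken from the owning_ref implementation, and the helper name `heap_address` is made up for illustration. Moving a Box moves only the pointer that lives on the stack, while the heap data stays where it is:

```rust
/// Address of the first element inside the heap allocation. (Hypothetical helper.)
fn heap_address(buffer: &[f32]) -> *const f32 {
    buffer.as_ptr()
}

fn main() {
    let a: Box<[f32]> = vec![0_f32; 2].into_boxed_slice();
    let before = heap_address(&a);

    // Moving the Box copies only the (pointer, length) pair to a new place.
    let b = a;
    let after = heap_address(&b);

    // The heap data did not move. The raw pointers are only compared,
    // never dereferenced.
    assert_eq!(before, after);
    println!("heap data stayed at {:p}", after);
}
```

This only shows that the address is stable. Whether a reference derived before the move may still be used afterwards is a separate question, which is why a crate like this has to go through raw pointers internally.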

I'm pretty confident it should be safe to use, because it has been around for a long time, and probably many people have scrutinized its implementation.


I don't use it myself, though, and I generally dislike the whole idea of owning references, purely from an architectural point of view. And while I have the utmost respect for people who contribute correct abstractions over unsafe implementations to the community, I still prefer not to rely on unsafe except as a last resort. Your mileage may vary, but I feel that Rust's ownership model (and the escape hatches provided by std) fits the overwhelming majority of the real-world code that programmers need to write. Nowadays, I prefer to use unsafe purely for FFI interaction, and basically never for manual memory management.

Sounds like you prefer Type Theory, Lambda Calculus and Math over Turing Machine and Computer Architecture centered approaches. Thanks for the recommendation.

Actually, there are multiple open soundness issues.

Most of them should be fixable, some of them are super old; the crate probably could be more actively maintained than it currently is.
