Moving out of a function both owned and borrowed entities?

If you move the struct around, then any pointer to its fields wouldn't be updated, and would be dangling.

If you move the struct around, then any pointer to its fields wouldn't be updated, and would be dangling.

Can I make the struct owner of the data?

You can, that's not the point. The point is that you can't make a reference-typed field of a struct point to itself.

Because the format is [f32]: it is a sound wave. But [u8] is what Cursor expects.
These are two representations of the same data.

You can, that's not the point. The point is that you can't make a reference-typed field of a struct point to itself.

My understanding is that I move both entities, the owner and the borrower, together in Context { dst_cursor, dst_buffer }, so nothing dangles. Perhaps there is some kind of marker to tell the compiler about that?

Hmm, I think converting between Box<[f32]> and Box<[u8]> is not actually possible due to alignment differences. I guess you could create your own wrapper for Cursor that implements Write (and forwards the Seek implementation) anyways...

use byte_slice_cast::*;
use std::{
    cmp,
    io::{self, Seek, Write},
};

fn main() {
    let context = context_make();
    dbg!(&context);
}

//

fn context_make() -> Context {
    Context::new()
}

//

#[derive(Debug)]
struct Context {
    pub dst_cursor: MyCursor,
}

//

impl Context {
    fn new() -> Context {
        // 1024 zeroed samples, owned by the cursor wrapper built below.
        let dst_buffer = vec![0_f32; 1024].into_boxed_slice();

        // Move the buffer itself into the cursor instead of borrowing its bytes,
        // so `Context` holds no reference into its own fields.
        Context {
            dst_cursor: MyCursor(io::Cursor::new(WrappedBox(dst_buffer))),
        }
    }
}

#[derive(Debug)]
struct MyCursor(io::Cursor<WrappedBox>);
impl MyCursor {
    fn into_inner(self) -> Box<[f32]> {
        self.0.into_inner().0
    }
}

#[derive(Debug)]
struct WrappedBox(Box<[f32]>);
impl AsRef<[u8]> for WrappedBox {
    fn as_ref(&self) -> &[u8] {
        self.0.as_byte_slice()
    }
}
impl AsMut<[u8]> for WrappedBox {
    fn as_mut(&mut self) -> &mut [u8] {
        self.0.as_mut_byte_slice()
    }
}

// implementation adapted from standard library
impl io::Write for MyCursor {
    #[inline]
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let original_pos = self.0.position();
        let slice = self.0.get_mut().as_mut();
        let pos = cmp::min(original_pos, slice.len() as u64);
        let amt = (&mut slice[(pos as usize)..]).write(buf)?;
        self.0.set_position(original_pos + amt as u64);
        Ok(amt)
    }

    #[inline]
    fn write_vectored(&mut self, bufs: &[io::IoSlice<'_>]) -> io::Result<usize> {
        let mut nwritten = 0;
        for buf in bufs {
            let n = self.write(buf)?;
            nwritten += n;
            if n < buf.len() {
                break;
            }
        }
        Ok(nwritten)
    }

    #[inline]
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl io::Seek for MyCursor {
    #[inline]
    fn seek(&mut self, style: io::SeekFrom) -> Result<u64, std::io::Error> {
        self.0.seek(style)
    }

    #[inline]
    fn stream_position(&mut self) -> io::Result<u64> {
        self.0.stream_position()
    }
}

(code compiles, but otherwise untested)
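
To make the alignment remark above a bit more concrete, here is a minimal std-only sketch. It assumes the relevant problem is the allocation layout: a Box must be freed with the same size and alignment it was allocated with, and a Box<[u8]> would free with the alignment of u8. The helper name layouts_match is made up for this illustration.

```rust
use std::alloc::Layout;
use std::mem::{align_of, size_of};

/// Hypothetical helper: returns true if `n` f32 samples and a byte buffer of
/// the same size in bytes would be allocated with an identical layout.
fn layouts_match(n: usize) -> bool {
    let as_f32 = Layout::array::<f32>(n).expect("layout overflow");
    let as_u8 = Layout::array::<u8>(n * size_of::<f32>()).expect("layout overflow");
    // Both layouts have the same size in bytes, but `Layout` also records
    // the alignment, so the comparison fails when the alignments differ.
    as_f32 == as_u8
}

fn main() {
    println!("align_of::<f32>() = {}", align_of::<f32>());
    println!("align_of::<u8>()  = {}", align_of::<u8>());
    println!("layouts match: {}", layouts_match(1024));
}
```

On typical desktop targets the two alignments differ, which is why the wrapper approach above keeps the Box<[f32]> as the owner and only hands out byte views of it.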

Thanks. Interesting; I will try it and see whether it fixes the original problem. Is that solution based on the assumption that converting between Box<[f32]> and Box<[u8]> is not possible?

It avoids the need for such a conversion, yes.

So basically you moved dst_buffer_bytes out of the structure, if I understand it correctly.

Basically as_byte_slice just reinterprets any kind of structure as bytes.

That should be a feasible solution as well. If you want a still-maintained crate alternative, try ouroboros.

Oh, cool! Looking into that too.

Sorry, but your assumption is incorrect. If field F of the struct is at address A, and another field points to address A, then it will still point to address A even after you have moved the struct to another place (say, address B). Hence, it will be a dangling pointer. If you somehow force the compiler to accept the code anyway, you will have problems with that dangling pointer, and therefore you should not do that.

But the owner of the data is moved together with the reference. Hence no deallocation of memory happens, and there is no dangling pointer.

At least in theory, with zero knowledge of how the compiler actually implements this, it looks like that.

The dangling reference isn't created by deallocation. It is created by the fact that the value is moved out of the place where it keeps pointing. That's what a move means: moving the value from one place to another. The place from where it is moved is invalidated, considered uninitialized, and may not be accessed again.
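
As a small illustration of that rule in safe code, here is a sketch (the function name is made up) in which the borrow checker only permits the move once the borrow is no longer used:

```rust
/// Hypothetical example: borrow from a buffer, finish using the borrow,
/// and only then move the buffer to a new place.
fn move_after_borrow_ends() -> (f32, usize) {
    let buffer = vec![1.0_f32, 2.0, 3.0];
    let first = &buffer[0];

    // Uncommenting the next line is rejected with error[E0505]: cannot move
    // out of `buffer` because it is borrowed (`first` is still used below).
    // let moved = buffer;

    let copied = *first; // last use of the borrow

    // The borrow has ended, so the move is accepted. After this line,
    // `buffer` is considered uninitialized and may not be accessed again.
    let moved = buffer;
    (copied, moved.len())
}

fn main() {
    let (first, len) = move_after_borrow_ends();
    println!("first = {first}, len = {len}");
}
```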

But both dst_buffer_bytes and dst_cursor point to the same data, which continues to be available, and both are returned. I don't understand why that should not work. I thought it was something Rust has out of the box. Anyway, thanks for the explanations.

I see no theoretical reason why that code should not work. It's a pity Rust does not currently support it.

Simplified problem

That is true, and it is not the issue. Those are both borrows. There is no problem with putting two simultaneous borrows into the same struct.

The issue is that the owned buffer, dst_buffer, is also inside the same struct that contains borrows of it. Consider the following simplified scenario, which only involves primitive types for the sake of understanding. I am using raw pointers instead of references, so that I can intentionally shoot myself in the foot despite the attempts of the type system to prevent me from doing exactly that:

#[derive(Debug)]
struct SelfRef {
    value: usize,
    pointer: *const usize,
}

fn main() {
    let mut x = SelfRef { value: 42, pointer: std::ptr::null() };
    x.pointer = &x.value;
    
    println!("&x.value = {:p}; x = {:#?}", &x.value, x);
    
    // Let's move `x` to another place, say `y`.
    let y = x;
    
    println!("&y.value = {:p}; y = {:#?}", &y.value, y);
}

This prints the following:

&x.value = 0x7fff29c73fa8; x = SelfRef {
    value: 42,
    pointer: 0x00007fff29c73fa8,
}
&y.value = 0x7fff29c74020; y = SelfRef {
    value: 42,
    pointer: 0x00007fff29c73fa8,
}

As you can see, the address of the field y.value is different from the address of the field x.value. This happens because x and y are distinct variables and live in two different locations in memory. However, moves in Rust are just byte-wise copies where the source will be considered invalid (uninitialized) after the move is completed. Therefore, the pointer-valued field isn't magically updated to point to the address of the field inside the new variable, y.value. So y.pointer will still point to x.value, which, however, doesn't exist anymore, precisely because it has been moved out.

To further illustrate this with ASCII art: this is what memory looks like after x is initialized but y is not yet:

+-----------+-------------------------+-----+---------------+---------------+
| value: 42 | pointer: 0x7fff29c73fa8 | ... | uninitialized | uninitialized |
+-----------+-------------------------+-----+---------------+---------------+
^           ^                               ^               ^
|           +-- address 0x7fff29c73fb0      |               +-- address 0x7fff29c74028
+-- address 0x7fff29c73fa8                  +-- address 0x7fff29c74020

And this is what it looks like after the contents of variable x have been moved to variable y:

+---------------+---------------+---------------+-----------+-------------------------+
| uninitialized | uninitialized |      ...      | value: 42 | pointer: 0x7fff29c73fa8 |
+---------------+---------------+---------------+-----------+-------------------------+
^               ^                               ^           ^
|               +-- address 0x7fff29c73fb0      |           +-- address 0x7fff29c74028
+-- address 0x7fff29c73fa8                      +-- address 0x7fff29c74020

As you can see, the values (the actual numerical contents of value and the contents of pointer) didn't change, but the location of the value in memory (i.e., the "identity" of which variable they are stored in) did. That's what a move is.

However, the pointer not having changed means that it still points inside the now-defunct variable. This in turn means that dereferencing it and using the referred (non-)value is Undefined Behavior. (If you had a reference instead of a raw pointer, the situation would be even worse, because you wouldn't even need to dereference it to cause instant UB: the mere existence of a reference that points to uninitialized memory is considered UB in itself.)
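
If you prefer a check that does not depend on reading printed addresses, here is a variant sketch of the same experiment. It moves the value into a Box, so the new location is a fresh heap allocation, and it only compares addresses without ever dereferencing the stale pointer. The function name is made up for this illustration.

```rust
struct SelfRef {
    value: usize,
    pointer: *const usize,
}

/// Hypothetical check: does `pointer` still equal the address of `value`
/// after the struct has been moved to the heap?
fn pointer_follows_move() -> bool {
    let mut x = SelfRef { value: 42, pointer: std::ptr::null() };
    x.pointer = &x.value;

    // Moving `x` into a Box copies its bytes into a new heap allocation.
    // The `pointer` field is copied verbatim and still holds the old address.
    let boxed = Box::new(x);

    // Compare addresses only; the stale pointer is never dereferenced.
    std::ptr::eq(boxed.pointer, &boxed.value)
}

fn main() {
    println!("pointer follows the move: {}", pointer_follows_move());
}
```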

This is not something that needs to be "supported". This is a programmer error that needs to be detected, and Rust is super successful in detecting this kind of mistake.


Of course, when an owning heap allocation (e.g. a Box) is involved, you could technically argue that the location of the heap-allocated object doesn't change because things move around on the stack. That is true. However, due to the way references are related to each other, any reference derived from the heap allocation would also be considered invalid after the heap-allocating smart pointer or container has itself moved.
This in turn can be worked around by using raw pointers and unsafe code in a way that is correct and sound. However, doing so is strongly discouraged, because needing a self-referential type is a code smell. There are much, much better and cleaner ways of solving this kind of problem safely; the usual recommendation is that you split up the type into an owner and a view type. This also helps with decoupling, encapsulation, and upholding the single responsibility principle.
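
A minimal sketch of what such an owner/view split could look like for the buffer-plus-cursor case from this thread. The names ContextWriter and writer are made up for this illustration, and the byte view is created with a small unsafe block instead of a helper crate, so treat it as one possible shape rather than the definitive design.

```rust
use std::io::{Cursor, Write};

/// Owner: holds the sample buffer and nothing that borrows from it,
/// so it can be moved and returned freely.
struct Context {
    dst_buffer: Box<[f32]>,
}

/// View (hypothetical name): borrows the owner's bytes for as long as it lives.
struct ContextWriter<'a> {
    dst_cursor: Cursor<&'a mut [u8]>,
}

impl Context {
    fn new(len: usize) -> Self {
        Context { dst_buffer: vec![0.0_f32; len].into_boxed_slice() }
    }

    /// Creates a short-lived byte cursor over the f32 buffer.
    fn writer(&mut self) -> ContextWriter<'_> {
        let len_bytes = std::mem::size_of_val(&*self.dst_buffer);
        let ptr = self.dst_buffer.as_mut_ptr().cast::<u8>();
        // SAFETY: the pointer and length cover exactly the buffer's memory,
        // u8 has alignment 1, every bit pattern is a valid f32, and the
        // returned slice mutably borrows `self`, so nothing else can touch
        // the buffer while the view exists.
        let bytes = unsafe { std::slice::from_raw_parts_mut(ptr, len_bytes) };
        ContextWriter { dst_cursor: Cursor::new(bytes) }
    }
}

fn main() {
    let mut context = Context::new(4);
    {
        let mut writer = context.writer();
        writer.dst_cursor.write_all(&1.5_f32.to_ne_bytes()).unwrap();
    } // the view ends here

    // With no view alive, moving the owner is unproblematic.
    let moved = context;
    println!("{:?}", moved.dst_buffer); // prints [1.5, 0.0, 0.0, 0.0]
}
```

The view is created where the writing happens and dropped before the owner moves, so no struct ever has to contain a reference to itself.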

Thanks. Interesting. I am trying to understand.