Use references instead of ids in struct

I want to try replacing ids with references, to not need to look up (and possibly fail) ids, when I know that the reference is valid (which is kept track of by Rust).

My setup is that I have events that can occurr multiple times at some locations. So my current model is (irrelevant fields omitted)

struct Location {
  name: String,
}

struct Event<'a> {
  name: String,
  occurrences: Vec<Occurrence<'a>>,
}

struct Occurrence<'a> {
  start: NaiveDateTime,
  location: &'a Location,
}

So instead of the occurrence storing an id to the location, it keeps a reference, which I know is valid thanks to Rust.

This does not work, however. I have to keep a hold of both the locations and the events for all references to possibly be valid, so I created the following

struct Store<'a> {
  locations: Vec<Location>,
  events: Vec<Event<'a>>,
}

I get compiler errors about the lifetime of the locations, when I try to build a Store:

let bar = Location {
    name: "Some bar".to_string(),
};
let locations = vec![bar];
let events = vec![Event {
    name: "Social Dance".to_string(),
    occurrences: vec![Occurrence {
        start: NaiveDate::from_ymd(2019, 4, 1).and_hms(20, 30, 00),
        location: &locations[0],
    }],
}];
let store = Store {
    locations: locations,
    events: events,
};

The error I get is that the occurrence borrows the location, but then locations is moved into store even though the borrow events is also used in the store. I'm not sure whether the issue is that moving a variable might change the addresses, therefore the references might become invalid, or whether vectors are mutable and therefore the addresses are not guaranteed to remain valid.

Is it possible to create a datastructure, where the occurrences can reference the locations?

I think this is the problem of self-referential structs, where you are storing both an object and references to it in the same struct. In general this isn't safe, because moving the struct can invalidate references (This is one of the uses for the new Pin api). In your case what happens if the Vec gets bigger, and has to reallocate? All the references would now be invalid, and so break Rust's memory safety guarantee.

The way to work round all of these issues is to use IDs

1 Like

Correct, see

for furhter details :slight_smile:

2 Likes

Rust cannot keep track of the validity of safe-referential structs. In these cases, unsafe is needed, whereby the programmer promises to be keeping track of it by themselves. Up until recently, it wasn't even possible to have a generic abstraction of this.

Thus, keep using indexing, which has the advantage of being checked at runtime at least.


Implementing it with Pin-ning

To prove how complex the code gets when using Pin-ning for self-referential structs, I have implemented what you want. I have tried to be as sound as possible, so there will be many intermediate structures. At some point I unsafe-ly promote a reference to 'static, since I think this is sound given the lack of a public (non-unsafe) API abusing it, but I am not sure about it.

The idea of pinning is that you tag a pointer with the Pin<_> wrapper to signify that the pointee is pinned, unless the pointee is "unaware of / able to ignore such property" (Unpin marker trait). Thus, we need an intermediate (#[repr(transparent)]) struct wrapping our Locations so that they become sensitive to pinning.
By construction, any slice of PinSensitiveLocations will be sensitive to pinning too.

Then, the owning pointer type used will be Box (since it can be Pin tagged with safe code), which can be created from a Vec with .into_boxed_slice() method.

Now, only remains one delicate situation: when moving both the locations and the referring events, the events need to be "downgraded to a raw unrelated reference / pointer", before plugging them together again ("reupgrading them to valid references") . There is no way at this step to statically guarantee that the events do relate to the given locations. It is thus unsafe.

Usage

use lib::*;

fn main ()
{
    let bar = Location {
        name: String::from("Some bar"),
    };
    let foo = Location {
        name: String::from("Some foo"),
    };
    let locations = Locations(vec![bar, foo]).pinned();
    let events = vec![Event {
        name: "Social Dance".into(),
        occurrences: vec![Occurrence {
            start: NaiveDate::from_ymd(2019, 4, 1).and_hms(20, 30, 00),
            location: locations.get(0).unwrap(),
        }],
    }];
    let mut store = unsafe {
        // safety: it is indeed from `locations` that `events` does borrow
        Store::with_events(events)
            .and_locations(locations)
    };
    dbg!(&store);

    // Adding an event afterwards
    let (locations, mut events) = store.get_locations_and_take_events();
    let start = NaiveDate::from_ymd(2019, 4, 1).and_hms(20, 30, 00);
    events.push(Event {
        name: "Pin-pong".into(),
        occurrences: vec![
            Occurrence { start, location: locations.get(0).unwrap(), },
            Occurrence { start, location: locations.get(1).unwrap(), },
        ],
    });
    let store_events = Store::with_events(events);
    let Store { locations, .. } = store;
    let store = unsafe {
        // safety: it is indeed from `locations` that `events` does borrow
        store_events.and_locations(locations)
    };
    dbg!(&store);
}

lib

#![deny(
    elided_lifetimes_in_paths,
)]

#![allow(
    dead_code,
)]

// 0.4.6
use ::chrono::*;

mod lib {
    use core::{*,
        marker::PhantomPinned as PinSensitive,
        pin::Pin,
    };

    #[derive(Debug)]
    pub
    struct Location {
        pub
        name: String,
    }

    #[repr(transparent)]
    pub
    struct PinSensitiveLocation {
        location: Location,
        _pin_sensitive: PinSensitive,
    }

    impl ops::Deref for PinSensitiveLocation {
        type Target = Location;

        fn deref (self: &'_ Self) -> &'_ Self::Target
        {
            &self.location
        }
    }

    impl fmt::Debug for PinSensitiveLocation {
        fn fmt (
            self: &'_ Self,
            stream: &'_ mut fmt::Formatter<'_>,
        ) -> fmt::Result
        {
            // write!(stream, "PinSensitive")?;
            fmt::Debug::fmt(&self.location, stream)
        }
    }
    
    #[derive(Debug)]
    pub
    struct Locations /* = */ (
        pub
        Vec<Location>
    );

    #[derive(Debug)]
    pub
    struct PinnedLocations /* = */ (
        Pin<Box< [PinSensitiveLocation] >>
    );

    impl PinnedLocations {
        pub
        fn get (
            self: &'_ Self,
            index: usize,
        ) -> Option< Pin<&'_ PinSensitiveLocation> >
        {
            if index < self.0.len() {
                Some(unsafe {
                    // safety: index / field access is a valid pin projection
                    self.0
                        .as_ref()
                        .map_unchecked(move |locations| locations.get_unchecked(index))
                })
            } else {
                None
            }
        }
    }

    impl Locations {
        pub
        fn pinned (self: Self) -> PinnedLocations
        {
            let mut boxed_locations: Box<[Location]> = self.0.into_boxed_slice();
            let pinned_locations: Box<[PinSensitiveLocation]> = unsafe {
                // Safety: PinSensitiveLocation is #[repr(transparent)]
                let (len, ptr): (usize, *mut PinSensitiveLocation) = (
                    boxed_locations.len(),
                    boxed_locations.as_mut_ptr() as *mut PinSensitiveLocation,
                );
                mem::forget(boxed_locations);
                Box::from_raw(
                    slice::from_raw_parts_mut(
                        ptr,
                        len,
                    )
                )
            };
            PinnedLocations(pinned_locations.into())
        }
    }
    
    #[derive(Debug)]
    pub
    struct Event<'loc> {
        pub
        name: String,
        
        pub
        occurrences: Vec<Occurrence<'loc>>,
    }
    
    #[derive(
        Debug,
        // ensures there is no drop glue:
        // `WithEvents` transmute requires that the &'loc reference must not be dereferenced when dropped
        Clone, Copy,
    )]
    pub
    struct Occurrence<'loc> {
        pub
        start: ::chrono::NaiveDateTime,
        
        pub
        location: Pin<&'loc PinSensitiveLocation>,
    }
    
    #[derive(Debug)]
    pub
    struct Store {
        pub
        locations: PinnedLocations,

        events: Vec<Event<'static>>,
    }

    impl Store {
        pub
        fn events (
            self: &'_ Self,
        ) -> &'_ Vec<Event<'_>>
        {
            &self.events
        }

        /// API to allow mutation and adding events
        /// (reconstructing the Store is needed)
        pub
        fn get_locations_and_take_events (
            self: &'_ mut Self,
        ) -> (&'_ PinnedLocations, Vec<Event<'_>>)
        {
            (&self.locations, mem::replace(&mut self.events, Vec::default()))
        }
    }
    
    impl Store {
        pub
        fn with_events (
            events: Vec<Event<'_>>,
        ) -> WithEvents
        {
            // safety:
            //   - no public non-`unsafe` API
            //   - Event : Copy thus drop does not deref
            unsafe {
                WithEvents(mem::transmute(events))
            }
        }
    }

    pub
    struct WithEvents /* = */ (
        /// has to be private
        Vec<Event<'static>>
    );
    impl WithEvents {
        /// # Safety
        ///
        /// the given `events` must refer to the input `locations` only
        pub
        unsafe fn and_locations (
            self: Self,
            locations: PinnedLocations,
        ) -> Store
        {
            let WithEvents(events) = self;
            Store {
                locations,
                events,
            }
        }
    }
}
1 Like

Thanks for your reply! Very interesting to see your explanation and code.

I guess I will simply wrap it up in an abstraction and use ids and hash sets or similar for now.

There are some crates to help with self-referential structs. I haven't used them myself so I'm not sure if they match your use case, but here they are in any case: owning_ref and rental.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.