How to create a struct with fields which reference each other

I am trying to create a struct of the type:

struct System {
    ver: Version,
    logger: Logger,
    computer: Computation,
}

where Logger requires a reference to Version and Computation requires a reference to Logger, so that:

impl System {
    pub fn new() -> System {

        let ver = Version::new();
        let logger = Logger::new(&ver);
        let computer = Computer::new(&logger);

        Self { ver, logger, computer }
    }
}

This does not work because the compiler complains about the fact that ver is borrowed, logger is borrowed and so I cannot move them in the last statement. However, this is the kind of struct I would like to have: various fields where some of them have as a reference another field of the same struct. How can I do this?

Thanks in advance.

There is no direct way to have self-referential types in Rust. Your Version, Logger, and Computer would need to be in different structs in order to use actual references (&).

References are usually for short-term borrows, not long-term data structures.
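
A minimal sketch of the "different structs" layout, assuming hypothetical bodies for Version, Logger and Computer (the `number` field and the `format`/`compute` methods are made up for illustration). Each value is owned by a separate local, and the borrowing types carry a lifetime parameter:

```rust
struct Version {
    number: u32,
}

// Logger borrows a Version that is owned somewhere else.
struct Logger<'a> {
    ver: &'a Version,
}

impl<'a> Logger<'a> {
    fn new(ver: &'a Version) -> Self {
        Logger { ver }
    }

    fn format(&self, msg: &str) -> String {
        format!("[v{}] {}", self.ver.number, msg)
    }
}

// Computer borrows a Logger, which in turn borrows a Version.
struct Computer<'a> {
    logger: &'a Logger<'a>,
}

impl<'a> Computer<'a> {
    fn new(logger: &'a Logger<'a>) -> Self {
        Computer { logger }
    }

    fn compute(&self, x: i32) -> String {
        self.logger.format(&format!("result = {}", x * 2))
    }
}

fn main() {
    // Each owner is its own local variable, so nothing borrows from
    // a sibling field and nothing is moved while it is borrowed.
    let ver = Version { number: 1 };
    let logger = Logger::new(&ver);
    let computer = Computer::new(&logger);
    println!("{}", computer.compute(21));
}
```

This compiles because ver, logger and computer stay where they were created for as long as the borrows are alive.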

For a first prototype, I always use an Arc, an atomic reference counted type.
This replaces your reference. If you need mutability, it becomes Arc<Mutex<T>>.

This typically works well enough, but it is not the most performant solution. It also lets you check later whether this is a bottleneck or not.
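
A rough sketch of what the Arc approach could look like for the struct from the question. The fields and methods of Version, Logger and Computer are assumptions made up for illustration; only Arc and Mutex come from the standard library:

```rust
use std::sync::{Arc, Mutex};

struct Version {
    number: u32,
}

struct Logger {
    ver: Arc<Version>, // shared ownership instead of &Version
    lines: Vec<String>,
}

impl Logger {
    fn new(ver: Arc<Version>) -> Self {
        Logger { ver, lines: Vec::new() }
    }

    fn log(&mut self, msg: &str) {
        self.lines.push(format!("[v{}] {}", self.ver.number, msg));
    }
}

struct Computer {
    // Mutex because logging mutates the shared Logger.
    logger: Arc<Mutex<Logger>>,
}

impl Computer {
    fn new(logger: Arc<Mutex<Logger>>) -> Self {
        Computer { logger }
    }

    fn compute(&self, x: i32) -> i32 {
        let result = x * 2;
        self.logger.lock().unwrap().log(&format!("computed {}", result));
        result
    }
}

struct System {
    ver: Arc<Version>,
    logger: Arc<Mutex<Logger>>,
    computer: Computer,
}

impl System {
    fn new() -> Self {
        let ver = Arc::new(Version { number: 1 });
        let logger = Arc::new(Mutex::new(Logger::new(Arc::clone(&ver))));
        let computer = Computer::new(Arc::clone(&logger));
        // Only reference-counted handles are moved here, never borrowed data.
        System { ver, logger, computer }
    }
}

fn main() {
    let system = System::new();
    println!("{}", system.computer.compute(21));
    println!("{:?}", system.logger.lock().unwrap().lines);
    println!("{}", system.ver.number);
}
```

In single-threaded code, Rc and RefCell could stand in for Arc and Mutex.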

If this kind of structure is not possible then one has to create all the objects in the main program, using the references required. In this case and for a large application with many different objects the main function can easily become very large in size and complexity like a master constructor for everything. Is this a good approach?

Thank you. I need to take a look at that but I had the impression that this was trivial but not obvious since I am still learning Rust.

There are also crates such as ouroboros that offer a macro for creating such "self-referencing" data types, which would otherwise only really be possible to pull off using unsafe code.

I think that this crate is for the case where one wants to have a self-referential struct. What I want is a struct whose fields require, for their construction, a reference to another field of the struct. This means that the creation of an instance of this struct has to be done in steps, starting with one field and then continuing with the next one using the previous field as an input reference, etc. In a language like C++ this is trivial to do in the class constructor.

That isn't a problem in practice. Usually, if you need self-referential structs, then it's more likely than not a design issue. The "tree of owned objects" approach that Rust's ownership and borrowing system enforces is actually a very natural thing if you architect your types correctly.

(The lack of self-references is not an artificial restriction anyway, it's a hard technical necessity – if a field referred to another field, then that reference would be invalidated when the whole struct is moved.)
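
To make the move problem concrete, here is a small sketch (with a hypothetical one-field System) that records the address of a field, moves the struct to the heap, and compares. The address is only treated as an integer and never dereferenced, so no unsafe code is needed:

```rust
struct Version {
    number: u32,
}

struct System {
    ver: Version,
}

// Address of the `ver` field as a plain integer (never dereferenced).
fn ver_addr(system: &System) -> usize {
    &system.ver as *const Version as usize
}

fn main() {
    let system = System { ver: Version { number: 1 } };
    let addr_before = ver_addr(&system);

    // Moving the struct into a Box relocates it from the stack to the heap.
    let boxed = Box::new(system);
    let addr_after = ver_addr(&boxed);

    println!("before: {:#x}, after: {:#x}", addr_before, addr_after);
    println!("field relocated: {}", addr_before != addr_after);
    println!("version: {}", boxed.ver.number);
}
```

A reference to ver stored in a sibling field would still hold the "before" address after such a move, which is exactly the dangling pointer the borrow checker rules out.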

I am not talking about structs which reference themselves. In the example I gave I have a struct with fields which require for their initialisation/creation a reference to another field of the same struct.

That is exactly what a self-referential type is.

If you wrote the equivalent in C++, it would compile, but it would lead to dangling pointers and undefined behavior.

For me, this is not obvious.
In Rust, it is very easy to slip into premature optimization. I always need to force myself to write simple code, not optimal code. So I like Arc a lot.

That’s actually what is typically meant by a “self-referencing” struct, and also what ouroboros caters to. It’s not necessarily that any given field references itself, but the struct as a whole does when one field contains a reference to another field (in some form). Note that this is only true if I interpret what you’re saying correctly, namely that not only does the construction require a reference to another field, but those references are also required to stay live beyond the construction.

In Rust syntax this would mean a setting that looks less like

struct System {
    ver: Version,
    logger: Logger,
    computer: Computation,
}

and more like

struct System {
    ver: Version,
    logger: Logger<'a>,
    computer: Computation<'b>,
}

where instead of 'a (or 'b) what we “want” to say is something like “lifetime of field ver” (or logger) or “this struct borrows from the field ver” (or logger).

If on the other hand the types of Logger or Computation don’t have lifetime parameters, there should be no problem in the first place, so I doubt that’s the case, given that you wouldn’t have opened this thread in this case.
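
For comparison, a sketch of the unproblematic case, where the reference is only needed during construction and each type copies out what it needs. The `prefix` and `banner` fields and the `format` method are hypothetical:

```rust
struct Version {
    number: u32,
}

// No lifetime parameter: Logger keeps its own copy of the data it needs.
struct Logger {
    prefix: String,
}

impl Logger {
    fn new(ver: &Version) -> Self {
        Logger { prefix: format!("[v{}]", ver.number) }
    }

    fn format(&self, msg: &str) -> String {
        format!("{} {}", self.prefix, msg)
    }
}

struct Computer {
    banner: String,
}

impl Computer {
    fn new(logger: &Logger) -> Self {
        Computer { banner: logger.format("computer ready") }
    }
}

struct System {
    ver: Version,
    logger: Logger,
    computer: Computer,
}

impl System {
    fn new() -> Self {
        let ver = Version { number: 1 };
        let logger = Logger::new(&ver); // borrow of `ver` ends after this call
        let computer = Computer::new(&logger); // likewise for `logger`
        // Nothing is borrowed any more, so all three values can be moved.
        System { ver, logger, computer }
    }
}

fn main() {
    let system = System::new();
    println!("{}", system.ver.number);
    println!("{}", system.logger.format("hello"));
    println!("{}", system.computer.banner);
}
```

Here the step-by-step construction from the question works as written, because no field holds on to a reference afterwards.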

Ouroboros can express this e.g. as follows:

use ouroboros::self_referencing;

#[self_referencing]
struct SystemFields {
    ver: Version,

    #[borrows(ver)]
    #[covariant]
    logger: Logger<'this>,

    #[borrows(logger)]
    #[covariant]
    computer: Computer<'this>,
}
// wrapper so we can define our own "new" function, using
// the ouroboros-generated one internally
struct System(SystemFields);

impl System {
    fn new() -> Self {
        Self(
            // let's use the more fancy …Builder API instead of
            // `SystemFields::new` (both are essentially equivalent though)
            SystemFieldsBuilder {
                ver: Version::new(),
                logger_builder: |version| Logger::new(version),
                computer_builder: |logger| Computer::new(logger),
            }
            .build(),
        )
    }
}

(full code in the “Rust Explorer”)

Note by the way that ouroboros will implicitly add a level of indirection to the borrowed fields ver and logger (i.e. as if those were Box<Version> and Box<Logger<'_>>), otherwise this System struct could not be moved, even though Rust typically assumes every type can be moved. (This is a significant difference to C++ by the way, where “moving”, if it’s even supported by a given type, only happens through customizable “move constructors”.) There are alternative approaches in Rust where the safe API around such a struct would utilize Pin so that users can never move it, but that’s a bit more complicated and I’m not aware of existing crates akin to ouroboros that would help you with building such a struct without touching the unsafety yourself.

Thank you for your posting. This looks like a lot of complicated work for something which should be simple to do.

Oh it is definitely not easy to do in a language that doesn't have a VM. I would love to know a way to do this "simply" in C or C++, which isn't also UB.

Why dangling pointers? If class System creates and owns the Version, Logger and Computer objects and deallocates them in its destructor I don't see the problem. Furthermore, System could be made a singleton class and created once in the beginning with its contained objects so that it gets destroyed at the end of running the application.

The dangling pointers would come when you tried to move the System struct (e.g. by returning it from new) - this would invalidate the references contained within Logger and Computer, as they'd still be pointing at the System's old location.

I think it'd be helpful to give a bit more context on what Version/Logger/Computation are and how they use each other's data - it's hard to say what the right solution to your problem is without context.

#include <iostream>
#include <string>
#include <memory>

class Version {
public:
    int ver;
    Version (int _ver) : ver(_ver) {}
};

class Logger {
public:
    std::shared_ptr<Version> version;
    Logger (std::shared_ptr<Version> _version) : version(_version) {}
    void log(std::string& msg) { std::cout << msg << std::endl; }
};

class Computer {
public:
   std::shared_ptr<Logger> logger;

   Computer (std::shared_ptr<Logger> _logger) : logger(_logger) {}
};

class System {
public:
    std::shared_ptr<Version> version;
    std::shared_ptr<Logger> logger;
    std::shared_ptr<Computer> computer;

    System() {
        version = std::make_shared<Version>(1);
        logger = std::make_shared<Logger>(version);
        computer = std::make_shared<Computer>(logger);
    }
};

int main (void) {
    auto system = std::make_unique<System>();
    auto msg = std::string("Version : ") + std::to_string(system->version->ver);
    system->computer->logger->log(msg);
}

Well of course it works - the Rust equivalent of std::shared_ptr would be Rc.
You need to define your structs as:

use std::rc::Rc;

struct System {
    ver: Rc<Version>,
    logger: Rc<Logger>,
    computer: Rc<Computation>,
}

impl System {
    pub fn new() -> System {
        // Each Rc::clone only bumps a reference count; the data is shared.
        let ver = Rc::new(Version::new());
        let logger = Rc::new(Logger::new(Rc::clone(&ver)));
        let computer = Rc::new(Computation::new(Rc::clone(&logger)));

        Self { ver, logger, computer }
    }
}

Yeah, Rc and Arc are the equivalents for shared_ptr in Rust, so if you want a direct port of the code, that would be the way to go.

I feel like Rust code tends to be more explicit about dependencies, rather than storing pointers all over the place, though - e.g. if Computer needs a Logger, people would probably just pass &mut Logger to the function that needs it.
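
A sketch of that explicit style, with hypothetical method names (`log`, `compute`, `run`): System owns all three values as plain fields, and the dependencies are handed over per call instead of being stored:

```rust
struct Version {
    number: u32,
}

struct Logger {
    lines: Vec<String>,
}

impl Logger {
    fn log(&mut self, ver: &Version, msg: &str) {
        self.lines.push(format!("[v{}] {}", ver.number, msg));
    }
}

struct Computer;

impl Computer {
    // The dependencies are parameters of the call, not stored fields.
    fn compute(&self, logger: &mut Logger, ver: &Version, x: i32) -> i32 {
        let result = x * 2;
        logger.log(ver, &format!("computed {}", result));
        result
    }
}

struct System {
    ver: Version,
    logger: Logger,
    computer: Computer,
}

impl System {
    fn new() -> Self {
        System {
            ver: Version { number: 1 },
            logger: Logger { lines: Vec::new() },
            computer: Computer,
        }
    }

    fn run(&mut self, x: i32) -> i32 {
        // Borrowing distinct fields at the same time is allowed.
        self.computer.compute(&mut self.logger, &self.ver, x)
    }
}

fn main() {
    let mut system = System::new();
    println!("{}", system.run(21));
    println!("{:?}", system.logger.lines);
}
```

System stays freely movable this way, since none of its fields points at another.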

If a field of a struct points to another field of the same struct, and you move the whole struct, then the original location is invalidated/deinitialized. The pointer-/reference-typed field would not be magically updated, so it would still point to the old, invalid location.


You should at least consider that Rust is 15 years old, its designers and implementors are seasoned compiler developers, and they are very well-versed in memory management questions. Maybe if the language works this way, and several people explain to you the reasoning behind this, then it is not some superficial restriction or oversight that you should be pointing out or arguing with…
