Help with data structures with self-referential nested elements

I am working on a 2D CAD program that will be used to display wiring diagrams.

I am trying to decide on a good data structure for wires and cables. (I am a completely self-taught coder and never took a Data Structures or Algorithms class.)

Currently I have a WireType struct and a CableType struct (Library Types) that store common properties of each object, such as manufacturer, color, overall structure, etc., and then a Wire struct and a Cable struct (Project Types). The Project Type structs are instances of the types represented by the Library Type structs within the program. Each Project Type struct stores a string key that refers back to its associated Library Type.

All of this data currently gets read in from TOML files via Serde and stored in a master Library or Project struct.

I just switched to using separate structs to represent the data read in from the TOML files vs the data stored in memory, as there are way more data fields on the in-memory versions. (thanks to an answer on a different question.)

Where I am questioning things is the nested nature of wires and cables. As an example, picture a cable with 4 cables inside it, each with 3 wires.

In my program, I need to be able to reference each individual core of a cable, so I can link it to specific connection points on other entities.

I have a CableCore enum that can contain either a Wire or a Cable, and each cable can have any number of CableCores. This means that CableCores can be nested until recursion depth is reached...
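For readers following along, here is a minimal sketch of what such a recursive type might look like. Only the CableCore, Wire and Cable names come from my description above; the field names and the count_wires helper are made up for illustration and are not my actual code.

```rust
/// A core is either a single wire or a nested cable.
#[derive(Debug)]
enum CableCore {
    Wire(Wire),
    Cable(Cable),
}

#[derive(Debug)]
struct Wire {
    /// Key into the library's wire types (hypothetical field).
    wire_type: String,
}

#[derive(Debug)]
struct Cable {
    /// Key into the library's cable types (hypothetical field).
    cable_type: String,
    /// Named cores. Vec keeps its elements on the heap, so the
    /// recursive type still has a known size.
    cores: Vec<(String, CableCore)>,
}

/// Count the wires (leaf conductors) at any nesting depth.
fn count_wires(cable: &Cable) -> usize {
    cable
        .cores
        .iter()
        .map(|(_, core)| match core {
            CableCore::Wire(_) => 1,
            CableCore::Cable(inner) => count_wires(inner),
        })
        .sum()
}
```

Every operation on this shape has to recurse like count_wires does, which is part of why it feels awkward to me.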

One core of my question (no pun intended) is whether I should keep Wires separated out, or just treat them as Cables with 1 core within my internal data structures, since I can have different data structures in memory than in the file.

The other core of my question is how to handle nested cables/cores like the above when a user needs to refer to a specific core or set of cores in a Project file to define connections between other entities. It would be much easier if I had the user define everything via a GUI and kept track of everything behind the scenes, but I want to have the users define everything in TOML files.

Essentially the workflow for the end user of the software is the following:

  1. Define WireTypes and CableTypes in Library files
  2. Define individual Wires and Cables in Project files
  3. Define the Equipment the wires and cables are connecting to in other Project files
  4. Define the Connections between Equipment using a specific Wire/Cable/Core of a Cable. The connection would include what Equipment is on each end, what ConnectionPoint is in use on each Equipment for each end, and then which Wire/Cable/Core of a Cable is connecting those two Equipment instances.

The last step is the piece I am struggling with the most, as I have no easy way to associate user visible IDs to nested cores within a cable, unless I essentially hard code a specific algorithm which is inflexible.

I have an implementation currently that "Works" TM, but the implementation doesn't feel like I am doing things the correct way, repeating myself a lot and backing myself into corners. It also relies on the users defining the core ids in a very specific way that is inflexible.

I apologize for the wall of text, and thank you to anyone in advance who has suggestions / feedback on this.

This is more of a schema design problem than a Rust problem — you’d have the same problem in your data files in any language, even if the implementation of processing those files looked different.

If I were solving this problem for my own use, I would say that each cable definition in a file should be able to give names or numbers to each of its internal cables or the cables/wires named in those cables — pretty much exactly like how in Rust, a module can choose to pub use (and rename) items from inside its child modules. Thus, for example, a CAT-5 (Ethernet) cable could have wires numbered 1-8 even if it is made up of four twisted-pair sub-cables which each label their wires 1 and 2 or A and B or something, but it could also provide names for the pairs if desired.
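A rough sketch of that aliasing idea, in case it helps. All of the names here (CableDef, aliases, expand, cat5_example, the pair_a..pair_d keys) are hypothetical, and the assignment of numbers 1-8 to pairs is arbitrary for illustration, not a real pinout.

```rust
use std::collections::HashMap;

/// A cable definition that can expose short names for nested cores,
/// loosely analogous to `pub use child::item as name` in a module.
struct CableDef {
    /// alias -> full path of core names inside this cable
    aliases: HashMap<String, Vec<String>>,
}

impl CableDef {
    /// Expand a user-written dotted reference into a full path.
    /// If the first segment is an alias it is replaced by the aliased
    /// path; otherwise the reference is kept as written.
    fn expand(&self, reference: &str) -> Vec<String> {
        let mut segments = reference.split('.');
        let first = segments.next().unwrap_or("");
        let mut path = match self.aliases.get(first) {
            Some(target) => target.clone(),
            None => vec![first.to_owned()],
        };
        path.extend(segments.map(str::to_owned));
        path
    }
}

/// Four pairs whose wires are locally named "1" and "2",
/// re-exported by the outer cable as conductors "1" through "8".
fn cat5_example() -> CableDef {
    let mut aliases = HashMap::new();
    let pairs = ["pair_a", "pair_b", "pair_c", "pair_d"];
    for (i, pair) in pairs.iter().enumerate() {
        aliases.insert((2 * i + 1).to_string(), vec![pair.to_string(), "1".to_owned()]);
        aliases.insert((2 * i + 2).to_string(), vec![pair.to_string(), "2".to_owned()]);
    }
    CableDef { aliases }
}
```

With this, a connection could say "3" or spell out "pair_b.1" and both would expand to the same path.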

Agreed that this is more of a schema design issue than a Rust-specific issue. I posted it here since folks have been helpful in the past, and Rust has some quirks with data structure design (self-referential structs, ownership, etc.).

Do you have any thoughts on how you would implement that specifically?

It depends on what operations the data structure needs to support, and which ones need to be efficient.

But the problem is made much simpler by the fact that you are not, as far as you have said, writing an editor. This means you can — and probably should — build data structures that represent the data loaded from the TOML files in multiple ways.

For example, you might, while reading a cable definition, traverse it recursively and come up with a unique ID for every sub-cable and every wire in the cable, and for each such ID, store a path whose components identify where, in the tree of nested cables, that specific wire or cable can be found. That way, the in-memory representation of Connections only needs to refer to these unique IDs which can be just numbers, which makes things easier and cheaper to work with; the Connection representations never need to care about the structure of cables, just point to “that part”.
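Something along these lines is what I have in mind. This is only a sketch under assumptions: CoreId, Core, CoreIndex and their methods are names I am inventing here, IDs are handed out in pre-order, and paths are written with dots.

```rust
use std::collections::HashMap;

/// Opaque, cheap-to-copy identifier for one core.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct CoreId(u32);

/// Simplified stand-in for a loaded cable definition.
enum Core {
    Wire,
    Cable(Vec<(String, Core)>),
}

/// Two "indexes" over the same tree: ID -> path and path -> ID.
#[derive(Default)]
struct CoreIndex {
    paths: Vec<Vec<String>>, // indexed by CoreId.0
    by_path: HashMap<Vec<String>, CoreId>,
}

impl CoreIndex {
    /// Walk the tree once and give every sub-cable and wire an ID.
    fn build(root: &[(String, Core)]) -> Self {
        let mut index = CoreIndex::default();
        let mut prefix = Vec::new();
        index.visit(root, &mut prefix);
        index
    }

    fn visit(&mut self, cores: &[(String, Core)], prefix: &mut Vec<String>) {
        for (name, core) in cores {
            prefix.push(name.clone());
            let id = CoreId(self.paths.len() as u32);
            self.paths.push(prefix.clone());
            self.by_path.insert(prefix.clone(), id);
            if let Core::Cable(children) = core {
                self.visit(children, prefix);
            }
            prefix.pop();
        }
    }

    /// Resolve a dotted path such as "card_reader.red".
    fn lookup(&self, dotted: &str) -> Option<CoreId> {
        let key: Vec<String> = dotted.split('.').map(str::to_owned).collect();
        self.by_path.get(&key).copied()
    }

    /// Recover the path for an ID, e.g. for error messages.
    fn path(&self, id: CoreId) -> Option<&[String]> {
        self.paths.get(id.0 as usize).map(|p| p.as_slice())
    }
}
```

The recursion happens once, at load time; after that, everything else can work with CoreId values and only go back to paths when talking to the user.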

If you’re familiar with database servers, think of this as indexing — you can have as many indexes as you need for different kinds of queries.

For automatic numbering, I think labeling each bundle of wires with a letter and each conductor with a number might be worthwhile. For CAT-5 cable, that would look like four pairs labeled A-D (or A-E if you count the cable as a whole), with eight conductors labeled 1-8.
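As a sketch of that scheme (the Core type and auto_labels function are hypothetical names, and this version simply wraps around after 26 bundle letters):

```rust
/// Simplified stand-in for a cable's contents.
enum Core {
    Wire,
    Cable(Vec<Core>),
}

/// Label every element in pre-order: bundles get A, B, C, ...
/// and conductors get 1, 2, 3, ... regardless of nesting depth.
fn auto_labels(cores: &[Core]) -> Vec<String> {
    fn walk(cores: &[Core], bundles: &mut u32, wires: &mut u32, out: &mut Vec<String>) {
        for core in cores {
            match core {
                Core::Wire => {
                    *wires += 1;
                    out.push((*wires).to_string());
                }
                Core::Cable(children) => {
                    // Sketch limitation: letters repeat after Z.
                    let letter = char::from(b'A' + (*bundles % 26) as u8);
                    *bundles += 1;
                    out.push(letter.to_string());
                    walk(children, bundles, wires, out);
                }
            }
        }
    }
    let mut out = Vec::new();
    walk(cores, &mut 0, &mut 0, &mut out);
    out
}
```

For four pairs of two wires each, this labels the pairs A through D and the conductors 1 through 8, interleaved in traversal order.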

Agreed. There is no need to specifically represent exactly what is in the TOML files on the display. I am thinking of this more like a Markdown or LaTeX renderer than an actual CAD program. I do want to give users the flexibility to move objects around on screen once the objects are rendered, but the on disk data structure doesn't need to include that data directly.

I have been trying to implement something similar, but I am running into a problem: I need to either generate those IDs in a predictable and documentable fashion, so that someone can define which core goes where in the TOML file, or map from predictable IDs to internal IDs, which just seems to add its own layer of headaches.

Right now, this is how I am defining a connection in the TOML file, where end1 and end2 are the string key in a map of Equipment, directly read in from the TOML file, and connection is an inline table like { Cable = { cable_id = PLACEHOLDER, core_id = PLACEHOLDER } } where the ID of the cable is also a string key in the map of Cables read in from the TOML file.

end1 = "PLACEHOLDER"

end2 = "PLACEHOLDER"

connection = "PLACEHOLDER"

An example of a specific cable definition currently is shown below:

Library definition

# This is just one cable, it was chosen in my test project to be complex to test the import functionality. Usually, there would be fewer definitions and more reuse.

[cable_types]

# only defining required values
[cable_types.belden_638AFJ]
cross_sect_area = {value = [688, 1], original_unit="square millimeter"}
cross_section = "Circular"

[cable_types.belden_638AFJ.cores]

[cable_types.belden_638AFJ.cores.card_reader]
CableType="belden_638AFJ_card_reader_3_pair_inner"
[cable_types.belden_638AFJ.cores.door_contact]
CableType="belden_638AFJ_door_contact_inner"
[cable_types.belden_638AFJ.cores.rex]
CableType="belden_638AFJ_rex_inner"
[cable_types.belden_638AFJ.cores.lock_power]
CableType="belden_638AFJ_lock_power_inner"

[[cable_types.belden_638AFJ.layers]]
layer_number=1
layer_type="Jacket"
material = "Polyvinyl Chloride"
ac_electric_potential_rating = {value = [300, 1], original_unit = "volt"}
dc_electric_potential_rating = {value = [300, 1], original_unit = "volt"}
temperature_rating = {value = [75,1], original_unit = "degree Celsius"}
rating = "UL Listed, Plenum, Flamarrest®"
color = "Yellow"

[cable_types.belden_638AFJ.line_style]
color = "Yellow"
line_thickness = {value = [20, 1], original_unit = "point (computer)"}
line_appearance = [4, 2]

[cable_types.belden_638AFJ.dimensions]
height = {value = [74,5], original_unit = "mm"}
width = {value = [74,5], original_unit = "mm"}
diameter = {value = [74,5], original_unit = "mm"}

[cable_types.belden_638AFJ.catalog]
manufacturer = "Belden"
model = "638AFJ"
description = "Access Control, 16c (#18-3pr, #16-4c, #18-6c), Shielded, Outer Jacket, CMP"
# totally made up
part_number = "00-1500-16"
manufacturer_part_number = "638AFJ"
supplier = "Digikey"
supplier_part_number = "BEL8706-1000-ND"

[cable_types.belden_638AFJ_card_reader_3_pair_inner]
cross_sect_area = {value = [1683, 10], original_unit="square millimeter"}
cross_section = "Circular"

[cable_types.belden_638AFJ_card_reader_3_pair_inner.cores.black]
WireType = "belden_638AFJ_18AWG_black_inner"
[cable_types.belden_638AFJ_card_reader_3_pair_inner.cores.red]
WireType = "belden_638AFJ_18AWG_red_inner"
[cable_types.belden_638AFJ_card_reader_3_pair_inner.cores.white]
WireType = "belden_638AFJ_18AWG_white_inner"
[cable_types.belden_638AFJ_card_reader_3_pair_inner.cores.green]
WireType = "belden_638AFJ_18AWG_green_inner"
[cable_types.belden_638AFJ_card_reader_3_pair_inner.cores.brown]
WireType = "belden_638AFJ_18AWG_brown_inner"
[cable_types.belden_638AFJ_card_reader_3_pair_inner.cores.orange]
WireType = "belden_638AFJ_18AWG_orange_inner"
[[cable_types.belden_638AFJ_card_reader_3_pair_inner.layers]]
layer_number = 1
layer_type = "Shield"
material = "Bi-Laminate (Alum+Poly) Tape"
[[cable_types.belden_638AFJ_card_reader_3_pair_inner.layers]]
layer_number = 2
layer_type = "Jacket"
material = "Polyvinyl Chloride"
rating = "Flamarrest®"
color = "Orange"
[cable_types.belden_638AFJ_card_reader_3_pair_inner.dimensions]
height = {value = [183,25], original_unit = "mm"}
width = {value = [183,25], original_unit = "mm"}
diameter = {value = [183,25], original_unit = "mm"}

[cable_types.belden_638AFJ_door_contact_inner]
cross_sect_area = {value = [573, 10], original_unit="square millimeter"}
cross_section = "Circular"

[cable_types.belden_638AFJ_door_contact_inner.cores.black]
WireType = "belden_638AFJ_18AWG_black_inner"
[cable_types.belden_638AFJ_door_contact_inner.cores.red]
WireType = "belden_638AFJ_18AWG_red_inner"
[[cable_types.belden_638AFJ_door_contact_inner.layers]]
layer_number = 1
layer_type = "Shield"
material = "Bi-Laminate (Alum+Poly) Tape"
[[cable_types.belden_638AFJ_door_contact_inner.layers]]
layer_number = 2
layer_type = "Jacket"
material = "Polyvinyl Chloride"
rating = "Flamarrest®"
color = "White"

[cable_types.belden_638AFJ_door_contact_inner.dimensions]
height = {value = [427,100], original_unit = "mm"}
width = {value = [427,100], original_unit = "mm"}
diameter = {value = [427,100], original_unit = "mm"}

[cable_types.belden_638AFJ_rex_inner]
cross_sect_area = {value = [3927, 50], original_unit="square millimeter"}
cross_section = "Circular"

[cable_types.belden_638AFJ_rex_inner.cores.black]
WireType = "belden_638AFJ_18AWG_black_inner"
[cable_types.belden_638AFJ_rex_inner.cores.red]
WireType = "belden_638AFJ_18AWG_red_inner"
[cable_types.belden_638AFJ_rex_inner.cores.white]
WireType = "belden_638AFJ_18AWG_white_inner"
[cable_types.belden_638AFJ_rex_inner.cores.green]
WireType = "belden_638AFJ_18AWG_green_inner"
[[cable_types.belden_638AFJ_rex_inner.layers]]
layer_number = 1
layer_type = "Shield"
material = "Bi-Laminate (Alum+Poly) Tape"
[[cable_types.belden_638AFJ_rex_inner.layers]]
layer_number = 2
layer_type = "Jacket"
material = "Polyvinyl Chloride"
rating = "Flamarrest®"
color = "Blue"


[cable_types.belden_638AFJ_rex_inner.dimensions]
height = {value = [5,1], original_unit = "mm"}
width = {value = [5,1], original_unit = "mm"}
diameter = {value = [5,1], original_unit = "mm"}


[cable_types.belden_638AFJ_lock_power_inner]
cross_sect_area = {value = [1017, 10], original_unit="square millimeter"}
cross_section = "Circular"

[cable_types.belden_638AFJ_lock_power_inner.cores.black]
WireType = "belden_638AFJ_16AWG_black_inner"
[cable_types.belden_638AFJ_lock_power_inner.cores.red]
WireType = "belden_638AFJ_16AWG_red_inner"
[cable_types.belden_638AFJ_lock_power_inner.cores.white]
WireType = "belden_638AFJ_16AWG_white_inner"
[cable_types.belden_638AFJ_lock_power_inner.cores.green]
WireType = "belden_638AFJ_16AWG_green_inner"
[[cable_types.belden_638AFJ_lock_power_inner.layers]]
layer_number = 1
layer_type = "Shield"
material = "Bi-Laminate (Alum+Poly) Tape"
[[cable_types.belden_638AFJ_lock_power_inner.layers]]
layer_number = 2
layer_type = "Jacket"
material = "Polyvinyl Chloride"
rating = "Flamarrest®"
color = "Grey"


[cable_types.belden_638AFJ_lock_power_inner.dimensions]
height = {value = [569,100], original_unit = "mm"}
width = {value = [569,100], original_unit = "mm"}
diameter = {value = [569,100], original_unit = "mm"}

[wire_types]

[wire_types.belden_638AFJ_18AWG_black_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [8229, 10000], original_unit = "square millimeter"}
stranded = true
num_strands = 7
strand_cross_sect_area = {value = [1281, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Black"
[wire_types.belden_638AFJ_18AWG_red_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [8229, 10000], original_unit = "square millimeter"}
stranded = true
num_strands = 7
strand_cross_sect_area = {value = [1281, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Red"
[wire_types.belden_638AFJ_18AWG_white_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [8229, 10000], original_unit = "square millimeter"}
stranded = true
num_strands = 7
strand_cross_sect_area = {value = [1281, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "White"
[wire_types.belden_638AFJ_18AWG_green_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [8229, 10000], original_unit = "square millimeter"}
stranded = true
num_strands = 7
strand_cross_sect_area = {value = [1281, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Green"
[wire_types.belden_638AFJ_18AWG_brown_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [8229, 10000], original_unit = "square millimeter"}
stranded = true
num_strands = 7
strand_cross_sect_area = {value = [1281, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Brown"
[wire_types.belden_638AFJ_18AWG_orange_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [8229, 10000], original_unit = "square millimeter"}
stranded = true
num_strands = 7
strand_cross_sect_area = {value = [1281, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Orange"

[wire_types.belden_638AFJ_16AWG_black_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [327, 250], original_unit = "square millimeter"}
stranded = true
num_strands = 19
strand_cross_sect_area = {value = [647, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Black"
[wire_types.belden_638AFJ_16AWG_red_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [327, 250], original_unit = "square millimeter"}
stranded = true
num_strands = 19
strand_cross_sect_area = {value = [647, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Red"
[wire_types.belden_638AFJ_16AWG_white_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [327, 250], original_unit = "square millimeter"}
stranded = true
num_strands = 19
strand_cross_sect_area = {value = [647, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "White"
[wire_types.belden_638AFJ_16AWG_green_inner]
material = "Copper"
insulated = true
insulation_material = "Polyvinyl Chloride"
conductor_cross_sect_area = {value = [327, 250], original_unit = "square millimeter"}
stranded = true
num_strands = 19
strand_cross_sect_area = {value = [647, 10000], original_unit = "square millimeter"}
insulation_rating = "Flamarrest®"
insulation_color = "Green"

Project Type

# This hasn't been fully vetted yet and is more an example of what I am trying to do.

[cables]

[cables.cable1]
cable_type = "belden_638AFJ"
length = {value = [20, 1], original_unit = "meter"}

[equipment]
[equipment.switch1]
equipment_type = "spst_toggle_switch"
identifier = "switch1"

[equipment.switch2]
equipment_type = "spst_toggle_switch"
identifier = "switch2"


[[connections]]

end1 = {Equipment = {equipment_id = "switch2", connection_point_id = "right"}}
end2 = {Equipment = {equipment_id = "switch1", connection_point_id = "left"}}
# This core_id is what I am trying to find a good solution for. This is just an idea.
connection = {Cable = {cable_id = "cable1", core_id = "card_reader.card_reader_3_pair_inner.black"}}

[[connections]]

end1 = {Equipment = {equipment_id = "switch2", connection_point_id = "left"}}
end2 = {Equipment = {equipment_id = "switch1", connection_point_id = "right"}}
# This core_id is what I am trying to find a good solution for. This is just an idea.
connection = {Cable = {cable_id = "cable1", core_id = "card_reader.card_reader_3_pair_inner.red"}}

Automatic numbering is fine, but it has to account for several layers of nested wires/cables.

I also want to be able to have users use human parsable identifiers if possible, rather than just assigning numbers to things.

Sure, you should definitely allow for customizing the labels. But unless you force the user to label each core themselves, you'll need to do something automatically. Your options are to label sequentially, with no regard to nesting; label with full nesting (e.g. 0.2.4); or something in between (like my suggestion, only differentiating between the bottom layer (conductors) and the upper layers). There's no magic algorithm to assign semantically meaningful names to arbitrary bunches of wires.
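To make the "full nesting" option concrete, here is a small sketch. Core and nested_labels are hypothetical names; positions are zero-based and joined with dots. The purely sequential option falls out of the same traversal, since it is just each element's index in the returned vector.

```rust
/// Simplified stand-in for a cable's contents.
enum Core {
    Wire,
    Cable(Vec<Core>),
}

/// Label every element with its position at each nesting level,
/// e.g. "1.1.0" is the first core of the second core of the second core.
fn nested_labels(cores: &[Core]) -> Vec<String> {
    fn walk(cores: &[Core], prefix: &str, out: &mut Vec<String>) {
        for (i, core) in cores.iter().enumerate() {
            let label = if prefix.is_empty() {
                i.to_string()
            } else {
                format!("{}.{}", prefix, i)
            };
            out.push(label.clone());
            if let Core::Cable(children) = core {
                walk(children, &label, out);
            }
        }
    }
    let mut out = Vec::new();
    walk(cores, "", &mut out);
    out
}
```

Either way the labels are predictable from the definition order, but they shift if someone inserts a core in the middle, which is one reason to let users override them with names.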

I agree that I need an automatic mechanism for core identification (kinda what I am asking for advice about in the OP).

The key problem that I am trying to solve here is making that automatic labeling mechanism predictable, so a user can manually enter the IDs of the cores into the connection definitions (see the TOML files I posted in response to kpreid).

I am fine separating out the display identifier and the in-file identifier (and in fact I already have) but that doesn't solve the issue of having predictable identifiers for users to use in the connection definitions.

What about the various methods I described doesn't solve that problem?

I didn’t mean that the users should ever work with these unique assigned IDs. They should be an internal implementation detail, because they aren’t stable over edits. I think your users should be working with paths — like, as I said, Rust paths to items in modules. The paths can be explicitly shortened by a cable definition giving an alias for one of its components’ components.

I can see the paths method as a path forward (no pun intended) for the users in the TOML files.

So the idea would be to assign each core, and each core within a core, a generated ID, so I could have the equivalent of a flat map of cores without having to recurse each time?

I am still a little bit confused about the benefits of the generated IDs, if I still have the nesting there.

Sorry, I am a bit tired and not tracking as well today.

That's right. The generated IDs can be Copy (numbers or newtypes of numbers), which makes them a lot easier to work with in code. They are also harder to misinterpret: the paths will be relative to “the current cable”, but the program needs to look at the whole picture, and it will be easier to get that picture right if the IDs of the wires at each end of a connection don't need to be interpreted relative to their cable definition.
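Roughly like this, as a sketch. Every name below (CoreId, ConnectionPointId, Connection, CoreTable, intern, describe, reused_cores) is made up for illustration, and I am assuming the loader has already turned each (cable_id, core path) pair from the project file into a project-wide ID.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct CoreId(u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ConnectionPointId(u32);

/// A connection holds only Copy IDs; it knows nothing about nesting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Connection {
    end1: ConnectionPointId,
    end2: ConnectionPointId,
    core: CoreId,
}

/// Hands out one project-wide ID per (cable_id, core path) pair.
#[derive(Default)]
struct CoreTable {
    ids: HashMap<(String, String), CoreId>,
    names: Vec<(String, String)>, // indexed by CoreId.0
}

impl CoreTable {
    /// Return the existing ID for this core, or allocate a new one.
    fn intern(&mut self, cable_id: &str, core_path: &str) -> CoreId {
        let key = (cable_id.to_owned(), core_path.to_owned());
        if let Some(&id) = self.ids.get(&key) {
            return id;
        }
        let id = CoreId(self.names.len() as u32);
        self.names.push(key.clone());
        self.ids.insert(key, id);
        id
    }

    /// Turn an ID back into something a user would recognize.
    fn describe(&self, id: CoreId) -> Option<String> {
        self.names
            .get(id.0 as usize)
            .map(|(cable, path)| format!("{}:{}", cable, path))
    }
}

/// Whole-picture check: cores used by more than one connection.
/// No cable structure is consulted, only ID equality.
fn reused_cores(connections: &[Connection]) -> Vec<CoreId> {
    let mut seen = HashSet::new();
    connections
        .iter()
        .map(|c| c.core)
        .filter(|id| !seen.insert(*id))
        .collect()
}
```

A check like reused_cores never has to ask which cable a path is relative to, which is the point: two IDs are either the same core or they are not.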

That makes sense. I will play around with that tonight and see where I get.

Thanks