Hi,
Happy holidays to all!
I am reading data from an XML file (using the roxmltree crate) and building a data dictionary from it. Many strings are duplicated across XML blocks, so instead of copying them I was thinking of storing each string once and keeping references to it in multiple places. Below is some code with comments explaining what I am trying to do:
use std::collections::{HashMap, HashSet};
use std::fs::File;
use std::io::Read;
use roxmltree::{Document, Node, NodeType};

#[derive(Debug)]
pub struct DataDictionary<'a> {
    // As I get a field name, I store the owned String in this set;
    // everywhere else I want to use a reference to this string.
    field_set: HashSet<String>,
    fields_by_tag: HashMap<u32, FieldEntry<'a>>,
    tag_name_map: HashMap<&'a String, u32>,
    components: HashMap<String, ComponentEntry<'a>>,
    groups: HashMap<String, Group<'a>>,
}
impl<'a> DataDictionary<'a> {
    fn new() -> Self {
        Self {
            field_set: HashSet::new(),
            fields_by_tag: HashMap::new(),
            tag_name_map: HashMap::new(),
            components: HashMap::new(),
            groups: HashMap::new(),
        }
    }
    fn add_field(&mut self, fld: FieldEntry<'a>) {
        // TODO: check that the tag is not already present (duplicate tag means invalid XML)
        self.fields_by_tag.insert(fld.number, fld);
    }
    fn get_field(&self, tag: &u32) -> Option<&FieldEntry> {
        self.fields_by_tag.get(tag)
    }
    fn add_field_tag(&mut self, name_ref: &'a String, tagnum: u32) {
        self.tag_name_map.insert(name_ref, tagnum);
    }
}
#[derive(Debug)]
struct FieldEntry<'a> {
    number: u32,
    name: &'a String,
    ftype: String,
    values: HashSet<FieldValueEntry>,
}
impl<'a> FieldEntry<'a> {
    fn new(number: u32, name: &'a String, ftype: &str) -> Self {
        Self {
            number,
            name,
            ftype: ftype.to_string(),
            values: HashSet::new(),
        }
    }
    fn set_valid_value(&mut self, val: FieldValueEntry) {
        self.values.insert(val);
    }
}
pub fn create_data_dict(fix_xml: &str) -> Option<DataDictionary> {
    let mut file_data = String::with_capacity(1024*64);
    let mut file = File::open(fix_xml).unwrap();
    file.read_to_string(&mut file_data).unwrap();
    let doc = Document::parse(&file_data).unwrap();
    let mut dictionary = DataDictionary::new();
    for root_child in doc.root_element().children().filter(|node| node.node_type() == NodeType::Element) {
        match root_child.tag_name().name() {
            "fields" => {
                field_handler(root_child, &mut dictionary);
            },
            "components" => {
                component_handler(root_child, &mut dictionary);
            },
            _ => {
                println!("No processing this");
            }
        }
    }
    Some(dictionary)
}
fn field_handler(field_node: Node, dict: &mut DataDictionary) {
    for node in field_node.children().filter(|n| n.node_type() == NodeType::Element) {
        let fname = node.attribute("name").unwrap();
        let fnum = node.attribute("number").unwrap().parse::<u32>().unwrap();
        let ftype = node.attribute("type").unwrap();
        let mut f_entry: FieldEntry;
        match dict.get_field_set(fname) {
            Some(name_ref) => {
                dict.add_field_tag(name_ref, fnum);
                f_entry = FieldEntry::new(fnum, name_ref, ftype);
            },
            None => {
                dict.insert_field_set(fname);
                let name_ref = dict.get_field_set(fname).unwrap();
                dict.add_field_tag(name_ref, fnum);
                f_entry = FieldEntry::new(fnum, name_ref, ftype);
            }
        }
        for child in node.children().filter(|n| n.node_type() == NodeType::Element && n.has_tag_name("value")) {
            let fvalue_entry = FieldValueEntry::new(
                child.attribute("enum").unwrap(),
                child.attribute("description").unwrap()
            );
            f_entry.set_valid_value(fvalue_entry);
        }
        dict.add_field(f_entry);
    }
}
When I compile this, I get the following error:
error[E0623]: lifetime mismatch
   --> src/codegen.rs:189:22
    |
181 | fn field_handler(field_node: Node, dict: &mut DataDictionary) {
    |                                          -------------------
    |                                          |
    |                                          these two types are declared with different lifetimes...
...
189 |                 dict.add_field_tag(name_ref, fnum);
    |                      ^^^^^^^^^^^^^ ...but data from `dict` flows into `dict` here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0623`.
I am not able to understand this lifetime requirement. I assumed that because the fields of the data dictionary hold references to each other, they would share the same lifetime, but it seems I am wrong. Also, when I try:
rustc --explain E0623
I get "error: no extended information for E0623"
Could someone help explain what is going on here? Also, is it a good idea to store this kind of reference, or should I just duplicate the strings to keep things simple?
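A possible middle ground between those two options, as a minimal sketch: share each name through a reference-counted `Rc<str>` handle instead of a `&'a String`, assuming single-threaded use is fine. `Interner`, `intern`, and the simplified `Dictionary` are hypothetical names for illustration and are not part of my code above.

```rust
use std::collections::{HashMap, HashSet};
use std::rc::Rc;

/// Hypothetical string pool: each distinct name is allocated once and
/// handed out as a reference-counted handle rather than a borrowed reference.
#[derive(Debug, Default)]
struct Interner {
    pool: HashSet<Rc<str>>,
}

impl Interner {
    /// Returns the shared handle for `name`, allocating only the first time
    /// that name is seen.
    fn intern(&mut self, name: &str) -> Rc<str> {
        if let Some(existing) = self.pool.get(name) {
            return Rc::clone(existing);
        }
        let handle: Rc<str> = Rc::from(name);
        self.pool.insert(Rc::clone(&handle));
        handle
    }
}

/// Simplified stand-in for the data dictionary: both maps hold handles to
/// the same allocation, so the struct needs no lifetime parameter.
#[derive(Debug, Default)]
struct Dictionary {
    names: Interner,
    name_by_tag: HashMap<u32, Rc<str>>,
    tag_by_name: HashMap<Rc<str>, u32>,
}

impl Dictionary {
    /// Registers a field under its tag number and its (shared) name.
    fn add_field(&mut self, number: u32, name: &str) {
        let shared = self.names.intern(name);
        self.name_by_tag.insert(number, Rc::clone(&shared));
        self.tag_by_name.insert(shared, number);
    }
}
```

With this layout, calling `dict.add_field(35, "MsgType")` stores one allocation for the name and two extra handles to it. Each additional handle costs a reference-count increment instead of a string copy.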