"Producer" method with argument lifetime linked

Hello everyone!

I'm very new to Rust, and while working on a simple pet project I got stuck on a potentially trivial question.

The context is my lack of understanding of lifetimes (what a surprise...).

I'm using the roxmltree library to parse XML. (The problem is not specific to this library; it is about a general lifetime pattern.)

It has a parse method, which produces a structure containing a reference to the source string slice.

Method signature is:

https://github.com/RazrFalcon/roxmltree/blob/master/src/parse.rs#L256

And the document structure is:
https://github.com/RazrFalcon/roxmltree/blob/master/src/lib.rs#L74

My problem arises when I want to obtain a string inside a function, build the document right there, and return the constructed document from that function, e.g.:

fn some_string_producer() -> String {
    return String::from(r"<aaa></aaa>");
}

fn some_method<'s>() -> Document<'s> {
    let src = some_string_producer();
    let doc = Document::parse(&src).expect("Failed to parse xml");
    return doc; // rejected by the compiler, and fairly so... but what should I do instead?
}

I do understand why I cannot return an object that refers to a function-local variable, but I can't find a way out of this trap without an alternative signature for the parse method.

It looks like in such patterns I must keep the source string alive alongside the constructed document (or for longer), but what if I don't need this string on the caller's side? I think I'm just missing something basic here, something like passing a Box to be consumed by the document.

PS
I saw a user with a similar problem in the issues for that library, but I see nothing useful in the answer: https://github.com/RazrFalcon/roxmltree/issues/34

Thanks in advance

The method signature is easier to understand as rendered in the generated documentation for roxmltree::Document:

pub fn parse(text: &str) -> Result<Document<'_>, Error>

This uses lifetime elision and is short for

pub fn parse<'a>(text: &'a str) -> Result<Document<'a>, Error>

which means the lifetime parameter of the Document is linked to the lifetime of the text reference, i.e. the Document cannot outlive the string. This is usually because the Document doesn’t copy the string data but instead just contains references into the original string. (This helps performance, since there’s less copying and allocating.)
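
To make the link between the two lifetimes concrete, here is a minimal sketch. It does not use roxmltree: `Doc` is a hypothetical stand-in for a borrowing type like `Document`, and its single field and toy parsing logic are assumptions made purely for illustration. The point is only that the result stores a slice of the input rather than a copy.

```rust
/// Hypothetical stand-in for a borrowing parse result (not roxmltree's real type).
#[derive(Debug)]
struct Doc<'a> {
    /// Points into the caller's string; nothing is copied.
    root_name: &'a str,
}

impl<'a> Doc<'a> {
    /// The explicit form; with elision the lifetimes could be left out,
    /// as in `fn parse(text: &str) -> Option<Doc<'_>>`.
    fn parse(text: &'a str) -> Option<Doc<'a>> {
        // Toy logic: take whatever sits between the first '<' and the next '>'.
        let start = text.find('<')? + 1;
        let end = start + text[start..].find('>')?;
        Some(Doc { root_name: &text[start..end] })
    }
}

/// Fine: the caller owns the text, so the returned `Doc` may borrow from it.
fn parse_borrowed(text: &str) -> Doc<'_> {
    Doc::parse(text).expect("Failed to parse xml")
}
```

Calling `parse_borrowed(&src)` works as long as `src` is still alive wherever the result is used. Building the `String` inside the function and returning the `Doc` is what the compiler rejects, because the slice would point into freed memory.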

One way to work with this limitation is to re-structure the API so that you’re taking some String buffer to work with. E.g.

fn some_string_producer(buf: &mut String) {
    buf.clear();
    buf.push_str(r"<aaa></aaa>");
    // or, something like `*buf = String::from(r"<aaa></aaa>")` would work, too
    // instead of clearing, you could also `assert!` emptiness, which avoids users accidentally passing in
    // data they didn’t want deleted
}

// caller is expected to provide an empty buffer
fn some_method<'s>(buf: &'s mut String) -> Document<'s> {
    some_string_producer(buf);
    // pass `buf` directly: the reborrow then lasts for all of `'s`
    let doc = Document::parse(buf).expect("Failed to parse xml");
    doc
}

Other, more complicated, approaches would need self-referencing data-types in order to bundle up the String with the Document somehow. This is not something Rust supports out-of-the-box, but there are crates that can help; if you can live without them, that’s usually more straightforward though.
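
One possible way to live without a self-referencing type is to return a value that owns only the String and hands out the borrowed view on demand. The following is a rough sketch under assumptions: `Doc` is a hypothetical stand-in for a borrowing type like `Document` (it is not roxmltree's API), and `Source` is a made-up wrapper name. The trade-off is that every call to `doc()` parses again.

```rust
/// Hypothetical stand-in for a borrowing parse result (not roxmltree's real type).
struct Doc<'a> {
    root_name: &'a str,
}

impl<'a> Doc<'a> {
    fn parse(text: &'a str) -> Option<Doc<'a>> {
        // Toy logic: take whatever sits between the first '<' and the next '>'.
        let start = text.find('<')? + 1;
        let end = start + text[start..].find('>')?;
        Some(Doc { root_name: &text[start..end] })
    }
}

/// Owns the text and holds no references, so it can be returned freely.
struct Source {
    text: String,
}

impl Source {
    /// Parse once up front so that later `doc()` calls are known to succeed.
    fn new(text: String) -> Option<Source> {
        Doc::parse(&text)?;
        Some(Source { text })
    }

    /// Re-parse on demand; the result borrows from `self`.
    fn doc(&self) -> Doc<'_> {
        Doc::parse(&self.text).expect("validated in `new`")
    }
}

fn some_method() -> Source {
    Source::new(String::from("<aaa></aaa>")).expect("Failed to parse xml")
}
```

The caller keeps the `Source` and calls `source.doc()` wherever a document is needed. Whether re-parsing is acceptable depends on input size and on how often the document is requested.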

Some parsing crates also contain owning versions of their parsing outputs, and/or a way to convert them to an owned variant that doesn’t depend on an existing string somewhere anymore; e.g. by using Cow<'a, str> internally instead of &'a str, and providing a to_owned(self) -> Document<'static>-style method. This crate does not appear to do such a thing though.
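
As an illustration of that Cow-based pattern, here is a minimal sketch. It is not roxmltree's API: `Doc`, its field, and the method name `into_owned` are all assumptions chosen for the example.

```rust
use std::borrow::Cow;

/// Hypothetical document type that can either borrow or own its text.
#[derive(Debug)]
struct Doc<'a> {
    root_name: Cow<'a, str>,
}

impl<'a> Doc<'a> {
    /// Borrowing parse: nothing is copied, and the result is tied to `text`.
    fn parse(text: &'a str) -> Option<Doc<'a>> {
        // Toy logic: take whatever sits between the first '<' and the next '>'.
        let start = text.find('<')? + 1;
        let end = start + text[start..].find('>')?;
        Some(Doc { root_name: Cow::Borrowed(&text[start..end]) })
    }

    /// Copy the borrowed data so the result no longer depends on the input.
    fn into_owned(self) -> Doc<'static> {
        Doc { root_name: Cow::Owned(self.root_name.into_owned()) }
    }
}

fn some_string_producer() -> String {
    String::from("<aaa></aaa>")
}

fn some_method() -> Doc<'static> {
    let src = some_string_producer();
    let doc = Doc::parse(&src).expect("Failed to parse xml");
    // `src` is dropped when this function returns, which is fine because
    // `into_owned` copies everything the document needs before that.
    doc.into_owned()
}
```

The cost is one allocation and copy per stored string at the point of conversion, while callers that can keep the source alive still get the cheap borrowing path.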

Thanks for fast reply.

Yes, I understand why the compiler complains, and I understand the scenario where the source reference is passed into the method by the caller.

I will remember the little trick with the empty string, but it does not remove the need to keep a value further up the stack that the caller no longer needs.

Seen this way, the library's API looks a bit restrictive, but maybe the further I go in Rust, the more comfortable I will feel with such decisions.

PS: Thanks for the hint about linking to signatures in the documentation; they are much easier to read there.

I mean, if you’re curious, here’s a self-referencing-struct solution using ouroboros:

/*
[dependencies]
roxmltree = "0.14"
ouroboros = "0.15"
*/

use roxmltree::Document;
use ouroboros::self_referencing;

#[self_referencing]
struct OwnedDocument {
    buffer: String,
    #[borrows(buffer)]
    #[covariant]
    document: Document<'this>,
}

impl OwnedDocument {
    // not necessary, as `borrow_document` already exists
    // but demonstrates the kind of access this `OwnedDocument` provides
    fn get(&self) -> &Document<'_> {
        self.borrow_document()
    }
}

fn some_string_producer() -> String {
    String::from(r"<aaa></aaa>")
}

fn some_method() -> OwnedDocument {
    let src = some_string_producer();
    let doc = OwnedDocumentBuilder {
        buffer: src,
        document_builder: |src| Document::parse(&src).expect("Failed to parse xml"),
    };
    doc.build()
}

fn main() {
    let doc = some_method();
    dbg!(doc.get());
}

In the Rust Explorer

Thanks for the practical illustration; I've bookmarked it and will explore it. For now I'll stick to your recommendation

if you can live without them, that’s usually more straightforward though.

just so I don't overwhelm myself on a pet project =)
