Error: Vec<Node> returns a value referencing data owned by the current function

Hey there,

I've spent a few hours on this compilation error, read through various documentation and tried several different ways to get my code to compile.

I'd really appreciate a push in the right direction.
Thanks

Here is my code:

#[macro_use]
extern crate prettytable;
extern crate reqwest;
extern crate select;

use prettytable::Table;
use select::document::Document;
use select::node::Node;
use select::predicate::Name;

fn main() {
    let hrefs = scrape_hrefs("https://www.jasperdunn.com").unwrap();
    create_table(hrefs);
}

fn scrape_hrefs(url: &str) -> Result<Vec<Node>, reqwest::Error> {
    let response = reqwest::blocking::get(url)?;
    assert!(response.status().is_success());

    let links = Document::from_read(response)
        .unwrap()
        .find(Name("a"))
        .filter(|link| link.attr("href").is_some())
        .collect();

    Ok(links)
}

fn create_table(links: Vec<Node>) {
    let mut table = Table::new();

    table.set_titles(row!["href", "innerText", "innerHTML"]);

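    for link in links {
        let href = link.attr("href").unwrap();
        // as_text() is only Some for text nodes, so unwrapping it on an <a>
        // element would panic; text() collects the text of the descendants.
        let inner_text = link.text();
        let inner_html = link.inner_html();

        table.add_row(row![href, inner_text, inner_html]);
    }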

    table.printstd();
}

Here is the error I'm getting:

error[E0515]: cannot return value referencing temporary value
  --> src/main.rs:26:5
   |
20 |       let links = Document::from_read(response)
   |  _________________-
21 | |         .unwrap()
   | |_________________- temporary value created here
...
26 |       Ok(links)
   |       ^^^^^^^^^ returns a value referencing data owned by the current function

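I'm not at all familiar with this library, but it looks like the Node returned by find borrows from its parent Document: its type carries a lifetime tied to the Document, and it holds a reference to it. In your code the Document is a temporary created inside scrape_hrefs and dropped at the end of the let links statement, so any Node you returned would point at data that no longer exists. In short, you can't return a Node from the function that owns its Document.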

This is a typical case where you should convert the borrowed values into something that owns its content. Apparently there's a Node::data() method that returns a reference to the node's Data, and Data is cloneable, so cloning it gives you an owned value representing the node.
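To make the borrowing relationship concrete without pulling in select, here is a minimal std-only sketch. Doc, NodeRef and scrape are hypothetical stand-ins for Document, Node and scrape_hrefs, not the library's real types; the point is only to show that copying the borrowed data into owned Strings lets it outlive the document.

```rust
// Hypothetical stand-in for select's Document: it owns the parsed data.
struct Doc {
    hrefs: Vec<String>,
}

// Hypothetical stand-in for select's Node: it only borrows from a Doc.
struct NodeRef<'a> {
    doc: &'a Doc,
    index: usize,
}

impl Doc {
    // Pretend "parsing": every whitespace-separated word is one href.
    fn parse(input: &str) -> Doc {
        Doc { hrefs: input.split_whitespace().map(str::to_owned).collect() }
    }

    // Yields one borrowing NodeRef per stored href.
    fn find(&self) -> impl Iterator<Item = NodeRef<'_>> + '_ {
        (0..self.hrefs.len()).map(move |index| NodeRef { doc: self, index })
    }
}

impl<'a> NodeRef<'a> {
    fn href(&self) -> &'a str {
        &self.doc.hrefs[self.index]
    }
}

// The borrow checker rejects this version for the same reason as the
// original code: the NodeRefs would outlive the `doc` they point into.
//
// fn scrape_borrowed(input: &str) -> Vec<NodeRef<'_>> {
//     let doc = Doc::parse(input);
//     doc.find().collect()
// }

// Copying the borrowed data into Strings yields values that own their content.
fn scrape(input: &str) -> Vec<String> {
    let doc = Doc::parse(input);
    let owned: Vec<String> = doc.find().map(|node| node.href().to_owned()).collect();
    owned // `doc` is dropped here, but `owned` no longer refers to it
}

fn main() {
    let hrefs = scrape("/about /blog /contact");
    println!("{:?}", hrefs); // prints ["/about", "/blog", "/contact"]
}
```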

You can even clone all the attrs() and the name() of each Node in a similar manner, and maybe return them in a convenient OwnedNode structure like this:

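use select::node::Data;

// An owned copy of a Node; it borrows nothing from the Document.
struct OwnedNode {
    // None for text and comment nodes, since name() returns an Option.
    name: Option<String>,
    data: Data,
    attrs: Vec<(String, String)>,
}

impl<'a> From<Node<'a>> for OwnedNode {
    fn from(node: Node<'a>) -> Self {
        OwnedNode {
            // Convert the borrowed &str inside the Option into a String.
            name: node.name().map(str::to_owned),
            data: node.data().clone(),
            attrs: node.attrs().map(|(k, v)| (k.to_owned(), v.to_owned())).collect(),
        }
    }
}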

fn scrape_hrefs(url: &str) -> Result<Vec<OwnedNode>, reqwest::Error> {
    let response = reqwest::blocking::get(url)?;
    assert!(response.status().is_success());

    let links = Document::from_read(response)
        .unwrap()
        .find(Name("a"))
        .filter(|link| link.attr("href").is_some())
        .map(OwnedNode::from)
        .collect();

    Ok(links)
}
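Another option that follows from the same lifetime rule, though nobody suggested it above, is to return the document itself and let the caller borrow from it for as long as it needs. The sketch below is std-only and uses hypothetical names (Doc, load, relative_links) in place of Document and the scraping functions, so treat it as an illustration of the ownership pattern rather than of select's API.

```rust
// Hypothetical stand-in for select's Document: it owns the parsed data.
struct Doc {
    hrefs: Vec<String>,
}

impl Doc {
    // Pretend "parsing": every whitespace-separated word is one href.
    fn parse(input: &str) -> Doc {
        Doc { hrefs: input.split_whitespace().map(str::to_owned).collect() }
    }

    // Borrowed views into the document, valid as long as `self` is alive.
    fn links(&self) -> Vec<&str> {
        self.hrefs.iter().map(String::as_str).collect()
    }
}

// Returning the owner is fine: ownership simply moves to the caller.
fn load(input: &str) -> Doc {
    Doc::parse(input)
}

// Borrowing from a document the caller owns is fine too, because the
// returned references are tied to `doc`, which outlives this call.
fn relative_links(doc: &Doc) -> Vec<&str> {
    doc.links().into_iter().filter(|href| href.starts_with('/')).collect()
}

fn main() {
    let doc = load("/about https://example.com /blog");
    let links = relative_links(&doc);
    println!("{:?}", links); // prints ["/about", "/blog"]
}
```

The trade-off is that the caller has to keep the document alive for as long as it uses the borrowed links, whereas the OwnedNode approach pays for the copies once and has no such constraint.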
