Returning &str from a local variable in match

I’m trying to get a list of links from a web page. The problem is that when I select the href attribute from the a tags, I get relative URLs. I want to join each relative URL to the URL of the page I fetched to get the absolute version. But I cannot return the &str from the parsed url::Url, because that variable is local.

use error_chain::quick_main;
use select::{document::Document, predicate::Name};
use url::Url;

mod errors {
    error_chain::error_chain! {
        foreign_links {
            ParseError(url::ParseError);
            ReqError(reqwest::Error);
            IoError(std::io::Error);
        }
    }
}

fn run() -> errors::Result<()> {
    let base = Url::parse("https://www.rust-lang.org")?;
    let resp = reqwest::get(base.as_str())?;
    Document::from_read(resp)?
        .find(Name("a"))
        .filter_map(|n| n.attr("href"))
        .map(|n| match base.join(n) {
            Ok(u) => u.as_str(),
            Err(_) => n,
        })
        .for_each(|url| println!("{}", url));

    Ok(())
}

quick_main!(run);

When I try to build this, I get the following:

$ cargo run
   Compiling getlinks v0.1.0 (/me/code/rust/getlinks)
error[E0515]: cannot return value referencing local variable `u`
  --> src/main.rs:22:22
   |
22 |             Ok(u) => u.as_str(),
   |                      -^^^^^^^^^
   |                      |
   |                      returns a value referencing data owned by the current function
   |                      `u` is borrowed here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0515`.
error: Could not compile `getlinks`.

To learn more, run the command again with --verbose.

I could clone the string, but that just seems like a bad idea. I could also probably get rid of the map and move the match into the for_each, and use println! twice, but map just seems more intuitive.

Edit: I’m aware there’s probably a bug where I’ll get something like https://www.rust-lang.org/https://www.other-url.example.com/ if I try joining two absolute URLs. I’ll figure that out next.
Edit 2: Nope, actually url::Url takes care of that for you.

println!("{}", Url::parse("https://www.rust-lang.org/").unwrap().join("https://www.example.com/resource.html").unwrap());

prints https://www.example.com/resource.html.

Move the match into the for_each and get rid of the map

That’s what I was thinking too, but suppose I wanted to turn that list of URLs into a collection. Is it not possible to use map in this way?

Well, no, you can’t use map like this. The reason is that base.join(...) returns a new, owned value. That value is a local of the closure and is dropped when the closure returns, so you can’t return a reference into it.
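
To make the drop timing visible, here is a minimal sketch using only the standard library. `Joined` and `trace` are hypothetical names made up for illustration; `Joined` stands in for the `Url` that `base.join(n)` returns and simply records when it is dropped. It is not how the url crate is implemented.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for the value returned by `base.join(n)`:
// it owns its text and logs a message when it is dropped.
struct Joined<'log> {
    text: String,
    log: &'log RefCell<Vec<String>>,
}

impl Drop for Joined<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.text));
    }
}

// Runs a `map` closure that builds one `Joined` per item and returns
// only an owned `usize`. Returns the log of what happened, in order.
fn trace(hrefs: &[&str]) -> Vec<String> {
    let log = RefCell::new(Vec::new());
    let lens: Vec<usize> = hrefs
        .iter()
        .map(|&href| {
            let u = Joined {
                text: format!("https://example.com/{}", href),
                log: &log,
            };
            // Returning `u.text.as_str()` here would be rejected with E0515:
            // `u` is dropped at the closing brace of this closure, before
            // the caller could ever look at the reference.
            u.text.len()
        })
        .collect();
    log.borrow_mut().push(format!("collected {} lengths", lens.len()));
    log.into_inner()
}
```

Calling `trace(&["a", "b"])` logs both drops before the "collected" entry, which is why only owned data (here a length) can leave the closure.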

That makes perfect sense. It’s just a shame, because I can foresee a lot of instances when a method will return a reference, and it would be good to map to that reference.

I actually just solved this particular case by using into_string() instead of as_str(); it consumes the Url and hands back the String it owns, so nothing borrows from a local anymore. I then had to change the type in the other match arm to String as well, and it works now.

// --snip--
    .map(|n| match base.join(n) {
        Ok(u) => u.into_string(),
        Err(_) => String::from(n),
    })
// --snip--
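
If the extra allocation in the fallback arm bothers you, one possible variation is to return `std::borrow::Cow<str>`: owned when the join succeeds, borrowed from the original attribute when it fails. The sketch below is standard-library only so it can be run on its own. The `join` function is a hypothetical stand-in for `base.join(href)` and does not follow real URL resolution rules, and `absolute_links` is a name made up for this example. With the url crate, the `Ok` arm would presumably become `Cow::Owned(u.into_string())`, or `String::from(u)` in newer releases, but I have not checked that against every version.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for `base.join(href)`. It only glues the two parts
// together and rejects hrefs containing a space; real URL joining is richer.
fn join(base: &str, href: &str) -> Result<String, ()> {
    if href.contains(' ') {
        Err(())
    } else {
        Ok(format!(
            "{}/{}",
            base.trim_end_matches('/'),
            href.trim_start_matches('/')
        ))
    }
}

// Collects the links into a Vec. Each item is owned when the join worked
// and borrowed from the input slice when it did not, so the fallback arm
// does not allocate.
fn absolute_links<'a>(base: &str, hrefs: &[&'a str]) -> Vec<Cow<'a, str>> {
    hrefs
        .iter()
        .map(|&href| match join(base, href) {
            Ok(u) => Cow::Owned(u),
            Err(_) => Cow::Borrowed(href),
        })
        .collect()
}
```

Since `Cow<str>` implements `Display` and derefs to `str`, the result can be printed or passed to anything taking `&str` just like the `String` version. Whether the saved allocation is worth the extra type is a judgment call; for a handful of links the plain `String` approach is probably fine.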

But maybe I’m being ridiculous, and moving it to the for_each with two println!s is just fine.

Document::from_read(resp)?
    .find(Name("a"))
    .filter_map(|n| n.attr("href"))
    .for_each(|n| {
        let url = base.join(n);
        let url = match url.as_ref() {
            Ok(u) => u.as_str(),
            Err(_) => n,
        };
        println!("{}", url);
    });

Does this work?

Yes. That looks nicer. Thank you.

Actually no, u does not live long enough.

Sorry, forgot an as_ref, now it should work.

That does.
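
For anyone reading later and wondering why the as_ref matters: matching on the Result directly moves the joined value into `u`, which only lives for that match arm, so the borrow cannot escape. Keeping the Result in its own `let` and matching on `as_ref()` leaves the owned value in that local, and the `&str` then borrows from something that lives until the end of the block. Here is a rough standard-library sketch of the same shape. `join` is a hypothetical stand-in for `base.join(n)` and `display_link` is a made-up name.

```rust
// Hypothetical stand-in for `base.join(href)`; not real URL resolution.
fn join(base: &str, href: &str) -> Result<String, ()> {
    if href.contains(' ') {
        Err(())
    } else {
        Ok(format!(
            "{}/{}",
            base.trim_end_matches('/'),
            href.trim_start_matches('/')
        ))
    }
}

fn display_link(base: &str, href: &str) -> String {
    // The owned result stays in `joined` until the end of this function.
    let joined = join(base, href);
    // `as_ref` turns `&Result<String, ()>` into `Result<&String, &()>`,
    // so the match borrows from `joined` instead of moving out of it.
    let shown: &str = match joined.as_ref() {
        Ok(u) => u.as_str(),
        Err(_) => href,
    };
    format!("link: {}", shown)
}
```

Both arms now produce a `&str` that is valid for the rest of the function, which is the same reason the for_each version compiles once the as_ref is in place.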