What are pipes and underscores doing in Rust?


#1

Hi guys,
I’m pretty new to Rust. I want to use the curl crate, and I came across this code example:

use std::io::{stdout, Write};
use curl::easy::Easy;

let mut handle = Easy::new();
handle.url("https://www.rust-lang.org/").unwrap();
handle.write_function(|data| {
    Ok(stdout().write(data).unwrap())
}).unwrap();
handle.perform().unwrap();

This code makes sense; I think write_function takes a callback. But what the hell are those “|” pipes doing in (|data|)? I only know them as the OR operator.

Or a small code snippet from gtk-rs:

window.connect_delete_event(clone!(window => move |_, _| {
    window.destroy();
    Inhibit(false)
}));

What does |_, _| do?

I can’t find any information about pipes and underscores in Rust. Can anyone help me?


#2

“Pipes” are used to define closures: https://doc.rust-lang.org/book/second-edition/ch13-01-closures.html

_ is used when there is an input argument that you simply don’t use; adding a _ is the way to tell the compiler about it. You can also write _my_var for the same result.
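
To make the syntax concrete, here is a minimal sketch using only the standard library. The helpers apply and apply_pair are made up for illustration; they are not part of curl or gtk-rs, they just stand in for any function that accepts a callback.

```rust
// Hypothetical helper: calls the callback with one argument.
fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

// Hypothetical helper: calls the callback with two arguments.
fn apply_pair<F: Fn(i32, i32) -> i32>(f: F, a: i32, b: i32) -> i32 {
    f(a, b)
}

fn main() {
    // |x| declares a closure with one parameter named x.
    let plus_one = apply(|x| x + 1, 41);
    // |_, _| declares a closure with two parameters and ignores both.
    let constant = apply_pair(|_, _| 7, 1, 2);
    // |_, b| ignores only the first parameter.
    let second = apply_pair(|_, b| b, 1, 2);
    println!("{} {} {}", plus_one, constant, second);
}
```

The parameter list between the pipes plays the same role as the parenthesized parameter list of a normal fn.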


#3

Hey

That’s not a pipe, that’s the closure syntax. Also underscores are for wildcards. You can use wildcards here because you are not using the variables passed so it doesn’t matter.


#4

Thanks for the quick answer! Makes sense now :slight_smile:


#5

You can also use underscores to throw away type annotations you don’t care about, for example:

let list = &[1u8, 2, 3, 4];
let v: Vec<_> = list.iter().map(|item| item + 1).collect();

collect turns an iterator of something into a collection, but it needs to know what type of collection to create. The compiler can already infer the type of the item, so the only additional info it needs is the type of the container. In this case, we’re giving it the annotation “vector of type ‘whatever, I don’t care’”, and letting the compiler fill in the “whatever” with the type it inferred already.

It’s less to type, and less you need to change if you ever decide to change the type of list.
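
As a rough sketch of that last point, the same iterator chain can be collected into different containers while the element type stays as _ in each annotation. The function names below are made up for illustration.

```rust
use std::collections::HashSet;

// The container is named explicitly; the element type is left to inference.
fn bump_to_vec(list: &[u8]) -> Vec<u8> {
    let v: Vec<_> = list.iter().map(|item| item + 1).collect();
    v
}

// Same chain, different container; duplicates collapse in a set.
fn bump_to_set(list: &[u8]) -> HashSet<u8> {
    let s: HashSet<_> = list.iter().map(|item| item + 1).collect();
    s
}

// The annotation can also go on collect itself (the "turbofish" form).
fn bump_turbofish(list: &[u8]) -> Vec<u8> {
    list.iter().map(|item| item + 1).collect::<Vec<_>>()
}

fn main() {
    let list = &[1u8, 2, 3, 4];
    println!("{:?}", bump_to_vec(list));
    println!("{}", bump_to_set(list).len());
    println!("{:?}", bump_turbofish(list));
}
```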


#6

I believe _ is a little more special – it’s not a variable binding at all, but rather drops the assigned value right away. Whereas _my_var is a real binding that just doesn’t issue any warning about being unused, but still doesn’t drop until it goes out of scope (like any other variable).


#7

Hmm, I’ve never heard of this distinction and a quick test seems to show that _ is dropped at the end also.


#8

@vitalyd _ is dropped right away.

struct NoisyDrop(&'static str);

impl Drop for NoisyDrop {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

fn main() {
    let _a = NoisyDrop("a");
    let _ = NoisyDrop("b");
    println!("let's see what happens");
}

Output:

dropping b
let's see what happens
dropping a

#9

Interesting - I tried a similar test using _ as a function parameter and it was dropped at the end.

Is this actually specified somewhere? I don’t see why either matters in terms of drop order or optimization.

Edit: for locals I guess this just leaves the rvalue alone and the _ isn’t considered an lvalue. But for function args it appears to make no difference.
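
For reference, a test along these lines might look like the sketch below. It is a reconstruction, not the exact code that was run; the Noisy type and the function names are made up, and drops are recorded in a log instead of printed so the order can be checked.

```rust
use std::cell::RefCell;

// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// The parameter pattern is `_`, so the argument is never bound to a name.
fn ignores_arg<'a>(_: Noisy<'a>, log: &'a RefCell<Vec<&'static str>>) {
    log.borrow_mut().push("body");
}

// Event order when `_` is used as a function parameter.
fn param_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    ignores_arg(Noisy { name: "arg", log: &log }, &log);
    log.into_inner()
}

// Event order when `_` is used in a let statement, for comparison.
fn let_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _ = Noisy { name: "local", log: &log };
        log.borrow_mut().push("body");
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", param_order()); // ["body", "arg"]
    println!("{:?}", let_order()); // ["local", "body"]
}
```

The argument is still owned by the callee for the whole call, so it is dropped when the function returns, whereas the let version never takes ownership and the temporary goes away at the end of the statement.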


#10

The way I always phrase this is, _ never binds in the first place. That is, it’s a pattern that explicitly doesn’t bind to anything.

This is documented but it’s Christmas and I’m lazy so I don’t have the link
handy :laughing:


#11

But what about in function args, which is what @emoon was referring to (I believe)? I think for locals it makes sense - it’s truly just syntax to ignore the value and no different than just creating an rvalue without binding - it’s dropped at the end of the statement.

Oh, and Merry Christmas to whoever is celebrating it! :slight_smile:


#12

_ is just special in general. In all contexts where _ is allowed (which IIRC is just patterns and types), _ will pretty much always have behavior that is in some way distinct from e.g. _a, because _ is not an identifier. (you can’t even match it with ident in a macro)

I did some more testing:

  • It seems the auto-dropping is only for let statements. match patterns of _ don’t appear to drop anything, nor do if let or for (which are often understood to desugar into some form of match anyways).
  • It applies recursively to subpatterns. let (_, a) = tuple; will drop the first element.

Personally, I’ve never understood why let bindings even do this, since it can be a hazard for things like RAII locks if you’re not aware of it. Besides, we have drop in the prelude, don’t we?
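
A sketch for checking these observations, with a made-up Noisy type that logs its drops. As far as I can tell, the tuple case depends on whether the right-hand side is a temporary or a named variable, so both are included.

```rust
use std::cell::RefCell;

type Log = RefCell<Vec<&'static str>>;

// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a Log,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// `_` in a match arm does not move or drop the matched variable.
fn match_on_variable() -> Vec<&'static str> {
    let log = Log::new(Vec::new());
    {
        let value = Noisy { name: "value", log: &log };
        match value {
            _ => { log.borrow_mut().push("arm"); }
        }
        log.borrow_mut().push("after match");
    }
    log.into_inner()
}

// Right-hand side is a temporary: the ignored element goes away with it.
fn tuple_from_temporary() -> Vec<&'static str> {
    let log = Log::new(Vec::new());
    {
        let (_, _kept) = (
            Noisy { name: "first", log: &log },
            Noisy { name: "second", log: &log },
        );
        log.borrow_mut().push("after let");
    }
    log.into_inner()
}

// Right-hand side is a variable: the ignored element stays inside `pair`.
fn tuple_from_variable() -> Vec<&'static str> {
    let log = Log::new(Vec::new());
    {
        let pair = (
            Noisy { name: "first", log: &log },
            Noisy { name: "second", log: &log },
        );
        let (_, _kept) = pair;
        log.borrow_mut().push("after let");
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", match_on_variable());
    println!("{:?}", tuple_from_temporary());
    println!("{:?}", tuple_from_variable());
}
```

In the temporary case "first" is logged before "after let"; in the variable case nothing is dropped until the block ends.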


#13

For let statements, I think this is no different than rvalues being dropped at the end of the statement.

What kind of RAII hazard do you envision? If anything, early drop is usually desirable.


#14

_ is a keyword that means ignore.

As for function parameters, you should think of functions roughly like this:

fn f(a: i32, _: i32) {
  ...
}
f(0, 1);

// should be thought of, wrt RAII, as

fn f(tup: (i32, i32)) {
  let (a, _) = tup;
  {
    ...
  }
}

f((0, 1));

#15

That would imply that a _ param would be dropped on entry to the function, which doesn’t appear to be the case.


#16

No, it doesn’t - that’s not how _ works. let _ = <expr>; is exactly equivalent to <expr>;. There are some value category things going on here. I’d recommend trying out the following examples, to get more intuition:

struct D(i32);
impl Drop for D {
  fn drop(&mut self) { println!("dropped {}", self.0); }
}

fn main() {
  {
    D(0);
    let x = D(1);
    let _ = D(2);
    D(3);
    let y = D(4);
    let _ = D(5);
  }

  {
    let var = D(0);
    let _ = var;
    var;
    D(1);
  }
}

I don’t really feel like explaining what’s going on here with the value categories, because that would require synthesis of difficult wording on my part. However, if you know C++, the rules are similar.
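
For anyone who wants to check their answer, here is a variant of the two blocks that records drops in a log instead of printing them. It is only a sketch: D carries an extra log reference, and x and y are renamed _x and _y to silence unused-variable warnings; otherwise the statements are the same.

```rust
use std::cell::RefCell;

// Pushes its number onto a shared log when dropped.
struct D<'a>(i32, &'a RefCell<Vec<i32>>);

impl<'a> Drop for D<'a> {
    fn drop(&mut self) {
        self.1.borrow_mut().push(self.0);
    }
}

fn first_block() -> Vec<i32> {
    let log = RefCell::new(Vec::new());
    {
        D(0, &log);          // temporary: dropped at the end of this statement
        let _x = D(1, &log); // bound: lives until the end of the block
        let _ = D(2, &log);  // not bound: dropped at the end of this statement
        D(3, &log);
        let _y = D(4, &log);
        let _ = D(5, &log);
    } // _y is dropped here, then _x (reverse declaration order)
    log.into_inner()
}

#[allow(path_statements)]
fn second_block() -> Vec<i32> {
    let log = RefCell::new(Vec::new());
    {
        let var = D(0, &log);
        let _ = var; // does not move `var`, so nothing is dropped here
        var;         // moves `var` out, and the value is dropped right away
        D(1, &log);
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", first_block()); // [0, 2, 3, 5, 4, 1]
    println!("{:?}", second_block()); // [0, 1]
}
```

The fact that var; still compiles after let _ = var; shows that the _ pattern never took ownership of var.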


#17

use std::sync::Mutex;
use lazy_static::lazy_static;

extern "C" fn thread_unsafe_c_func() { }

lazy_static! {
    static ref LOCK: Mutex<()> = Default::default();
}

pub fn safe_wrapper() {
    // Prevent multiple threads executing at once.
    // (BUG: We drop this too soon!)
    let _ = LOCK.lock();
    
    unsafe { thread_unsafe_c_func() };
}
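
The difference can be observed with try_lock from the standard library. This is a minimal sketch with made-up function names; it uses a plain Mutex passed by reference rather than lazy_static.

```rust
use std::sync::Mutex;

// Recent compilers flag `let _ = <lock guard>` with the `let_underscore_lock`
// lint, so it is allowed explicitly here to keep the demonstration compiling.
#[allow(unknown_lints, let_underscore_lock)]
fn held_with_underscore(m: &Mutex<()>) -> bool {
    let _ = m.lock().unwrap(); // guard is dropped at the end of this statement
    let busy = m.try_lock().is_err();
    busy // false: the lock was already released
}

fn held_with_named(m: &Mutex<()>) -> bool {
    let _guard = m.lock().unwrap(); // guard lives until the end of the function
    let busy = m.try_lock().is_err();
    busy // true: the lock is still held
}

fn main() {
    let m = Mutex::new(());
    println!("held with _: {}", held_with_underscore(&m));
    println!("held with _guard: {}", held_with_named(&m));
}
```

So renaming the binding from _ to something like _guard keeps the lock held for the rest of the function.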

#18

I’d say that’s poor code :slight_smile:. Instead, wrapping that function into a struct that can be owned by the Mutex is more appropriate. Otherwise the two are disconnected and you can make an RAII mistake anyway without the compiler shouting about it.
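
Roughly what I have in mind, as a sketch only. CApi and SafeWrapper are hypothetical names, and the counter stands in for the real call into C.

```rust
use std::sync::Mutex;

// Hypothetical handle for the thread-unsafe C function.
struct CApi {
    calls: u32,
}

impl CApi {
    // Needs &mut self, which is only reachable through a held lock guard.
    fn call(&mut self) -> u32 {
        self.calls += 1; // stand-in for the actual FFI call
        self.calls
    }
}

// The Mutex owns the handle, so lock and resource cannot get out of sync.
pub struct SafeWrapper {
    api: Mutex<CApi>,
}

impl SafeWrapper {
    pub fn new() -> Self {
        SafeWrapper { api: Mutex::new(CApi { calls: 0 }) }
    }

    pub fn call(&self) -> u32 {
        // Dropping the guard early would also take away access to the API.
        let mut guard = self.api.lock().unwrap();
        guard.call()
    }
}

fn main() {
    let wrapper = SafeWrapper::new();
    wrapper.call();
    println!("calls so far: {}", wrapper.call());
}
```

With this shape, writing let _ = self.api.lock(); by mistake leaves you with no way to reach the API at all, so the compiler catches the error for you.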