Possible to access borrow checker data (ideally in proc_macro)?

#1

Is there any way to get the dataflow graph from the borrow checker as a lightweight substitute for symbolic execution? Ideally, this would be possible from a proc_macro but this would require using the borrow checker as a standalone library.

I see Polonius but it doesn’t look like it’s usable as a library.

#2

A proc macro couldn’t do this without knowing the signatures of all functions called and the variances of lifetime parameters for every type.

#3

Sure it could. It’d just have to compile the code beforehand. The question is whether the information generated by the compiler can be easily consumed by not-the-compiler.

#4

Okay. Let’s say that code does something as simple as calling a method. In order to resolve a method call, the compiler must:

  • Know all things that have been imported into the current module via use.
  • Know all traits in scope that have a similarly named method.
  • Know all impls of these traits, including those defined by the current crate.
  • Know all inherent methods of the type (which for a local type could be defined anywhere in the crate).
  • Know all function signatures in scope and all impls of all other traits in existence, because even just determining the type of the method receiver may require some level of trait solving. (Rust can infer some type parameters of traits if they are uniquely determined by the set of impls in existence.)
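
To make a couple of these points concrete, here is a minimal sketch. Every name in it (shapes, Circle, Describe, Convert, Meters) is made up for illustration and does not come from any real crate. It shows a method call that only resolves because a trait is in scope, and a type parameter that is inferred only because exactly one impl exists.

```rust
mod shapes {
    pub struct Circle;

    // An inherent method; for a local type this impl could live anywhere in the crate.
    impl Circle {
        pub fn name(&self) -> &'static str {
            "circle"
        }
    }

    pub trait Describe {
        fn describe(&self) -> String;
    }

    impl Describe for Circle {
        fn describe(&self) -> String {
            format!("a {}", self.name())
        }
    }
}

// Without this import, the call to `describe()` below does not resolve.
use shapes::Describe;

fn describe_circle() -> String {
    let c = shapes::Circle;
    c.describe()
}

// Hypothetical trait with a type parameter.
trait Convert<T> {
    fn convert(&self) -> T;
}

struct Meters(f64);

impl Convert<f64> for Meters {
    fn convert(&self) -> f64 {
        self.0
    }
}

fn meters_as_text() -> String {
    let m = Meters(2.5);
    // `T` is inferred as `f64` because this is the only `Convert` impl for `Meters`.
    // Adding `impl Convert<i32> for Meters` anywhere in the crate would make
    // this line ambiguous.
    let x = m.convert();
    format!("{}", x)
}

fn main() {
    println!("{}", describe_circle());
    println!("{}", meters_as_text());
}
```

None of this information is visible from the tokens of describe_circle or meters_as_text alone.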

But proc_macros run when the compiler hasn’t even finished expanding macros! This is a problem, because a macro invocation that appears after the current macro in source order could do all sorts of things:

  • It could emit a use statement to import a trait.
  • It could emit an inherent impl for the type containing that method.
  • It could emit an impl of one of the traits that defines the method.
  • It could emit an impl for some other trait that causes type inference to become ambiguous in a place earlier in the method body where it used to be uniquely determined.
  • It could be what emits the struct or enum definition for the type of the receiver!

This is part of the trade-off Rust made by being designed to be almost entirely agnostic of source order (in contrast to, say, C++).
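
A small sketch of the source-order point, using macro_rules rather than a proc macro to keep it self-contained. The macro name add_double is hypothetical. The function doubled comes first in the file, yet the method it calls does not exist until the later macro invocation has been expanded.

```rust
struct Foo(u32);

// Appears first in source order, but depends on an impl that is only
// produced by the macro invocation further down.
fn doubled() -> u32 {
    Foo(21).double()
}

// Hypothetical macro that emits an inherent impl for the given type.
macro_rules! add_double {
    ($ty:ident) => {
        impl $ty {
            fn double(&self) -> u32 {
                self.0 * 2
            }
        }
    };
}

// Only after this expands does `Foo::double` exist.
add_double!(Foo);

fn main() {
    println!("{}", doubled());
}
```

A macro that ran while processing doubled would have no way of knowing that double is going to exist.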

#5

Sure, but say I’m the author of the proc_macro. Let’s go one step further and say that the macro invocation looks like this:

src/lib.rs

my_macro! { // because proc_macro inner attributes aren't supported

use something::not::in::prelude;

struct Foo(u32);

impl Foo {
    fn bar(&self) {
        let val = self.0;
        println!("{}", val);
    }
}

}

Now what I could do (when running the proc_macro) is take the inner source code, compile it, get the rustc movement data, and figure out that val was borrowed out of self. The question is how to get that movement data!

(I appreciate your detailed responses, by the way)

#6

What I’m saying is you can’t just “compile the inner source code!” That source code may depend on other source code that does not even exist in the AST at the time the proc macro is called.

And even when that code does exist, it does not matter because the unit of compilation in Rust is the entire crate. The compiler does not figure out anything related to the type system until long after all macros are fully expanded.
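
As a rough illustration of "macros run before the type system", here is a sketch that again uses macro_rules as a stand-in for a proc macro (the macro source_of and the names NoSuchType and no_such_method are invented for the example). The macro happily accepts tokens that could never type check, because all it receives is a token stream.

```rust
// Hypothetical macro: turns whatever tokens it is given into a string.
macro_rules! source_of {
    ($($tokens:tt)*) => {
        stringify!($($tokens)*)
    };
}

fn captured() -> &'static str {
    // `NoSuchType` and `no_such_method` are not defined anywhere. This still
    // compiles, because the macro consumes the tokens and nothing is resolved
    // or type checked at macro expansion time.
    source_of!(let val: NoSuchType = foo.0.no_such_method();)
}

fn main() {
    println!("{}", captured());
}
```

A proc macro is in the same position: it gets tokens in and emits tokens out, with no names resolved and no types known.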

#7

Now, there is a way to do the sort of thing you’re talking about with rustc plugins. This is how e.g. clippy is implemented.

Unfortunately, there is currently no stable interface in the compiler for giving plugins access to type system information and the like. So if you want to write a plugin of your own, you’ll only be able to use it on a nightly compiler, where you’re allowed to depend on compiler internals.