Mysterious procedural macro error

We're stuck with a completely mysterious compilation problem with a procedural macro. By mysterious I mean that:

  • The error points at the name of the procedural macro in the #[derive] invocation, and even with -Zproc-macro-backtrace there is no hint as to the nature of the problem.

  • We know exactly which part of the generated code causes the error, and if we comment it out the compiler does not complain anymore, but the same exact code copied and pasted in another file compiles without problems.

If you want to see it happen, just

git clone https://github.com/vigna/epserde-rs
cd epserde-rs/epserde
RUSTFLAGS="-Zproc-macro-backtrace" cargo +nightly t --test test_types

If you want to see the code causing the problem, just

cargo expand --test test_types

and have a look at lines 1446-1463 and 1475-1492.

Any suggestion accepted. Are there diagnostic tools we are not aware of for procedural macros?

Can you at least point out in which test function the error is? I'm not willing to clone a whole repository for this.

Good point. The function is test_enum_deep in tests/test_types.rs. The generated code that does not compile is

        fn _deserialize_full_inner(
            backend: &mut impl epserde::deser::ReadWithPos,
        ) -> core::result::Result<Self, epserde::deser::Error> {
            use epserde::deser::DeserializeInner;
            match usize::deserialize_full(backend)? {
                0usize => Ok(Self::A {}),
                1usize => {
                    Ok(Self::B {
                        0: <u64>::_deserialize_full_inner(backend)?,
                    })
                }
                2usize => {
                    Ok(Self::C {
                        0: <u64>::_deserialize_full_inner(backend)?,
                        1: <Vec<usize>>::_deserialize_full_inner(backend)?,
                    })
                }
                3usize => {
                    Ok(Self::D {
                        a: <i32>::_deserialize_full_inner(backend)?,
                        b: <V>::_deserialize_full_inner(backend)?,
                    })
                }
                tag => Err(epserde::deser::Error::InvalidTag(tag)),
            }
        }

The compiler complains as follows:

error: expected identifier, found `{`
   --> epserde/tests/test_types.rs:240:14
    |
240 |     #[derive(Epserde, Clone, Debug, PartialEq)]
    |              ^^^^^^^
    |              |
    |              expected identifier
    |              while parsing this struct
    |
    = note: this error originates in the derive macro `Epserde` (in Nightly builds, run with -Z macro-backtrace for more info)
error: proc-macro derive produced unparsable tokens
   --> epserde/tests/test_types.rs:240:14
    |
240 |     #[derive(Epserde, Clone, Debug, PartialEq)]
    |              ^^^^^^^

Just from looking at the snippet you posted above, this seems impossible. You can't have numbers as field names like you have in your Self::B and Self::C variants.

Yes, you can:

enum Foo {
    A(i32),
}

fn x() -> Foo {
    Foo::A { 0: 100 }
}

It's deliberately universal syntax to allow constructing/matching any type of declared struct or enum using braced syntax, introduced by RFC 1506 to allow macros to generate the same code structure regardless of the type's declaration syntax.
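
As a small sketch of that uniformity, the braced form can be used for unit, tuple, and named-field variants alike, both when constructing and when matching. The enum Shape and the functions build and first_number are hypothetical names made up for this illustration; they are not part of epserde.

```rust
// Hypothetical enum with one variant of each declaration style.
#[derive(Debug, PartialEq)]
pub enum Shape {
    Unit,
    Tuple(u64, i32),
    Named { a: i32 },
}

// Construct every variant with braced syntax, as a macro might emit it.
pub fn build(tag: usize) -> Option<Shape> {
    match tag {
        0 => Some(Shape::Unit {}),               // unit variant, empty braces
        1 => Some(Shape::Tuple { 0: 7, 1: -1 }), // tuple variant, numeric field names
        2 => Some(Shape::Named { a: 3 }),        // ordinary named fields
        _ => None,
    }
}

// Braced syntax also works in patterns, whatever the declaration style.
pub fn first_number(s: &Shape) -> i64 {
    match s {
        Shape::Unit {} => 0,
        Shape::Tuple { 0: x, .. } => *x as i64,
        Shape::Named { a } => *a as i64,
    }
}
```

In particular, Shape::Unit {} is accepted for a unit variant, so an expression like Self::A {} is not a syntax error by itself.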

Yes, you can if it's a tuple struct. A(100) is sugar for A{0:100}.

Thanks, I wasn't aware of this.

As I said, this compiles without a glitch if I copy-paste it and replace the function calls with constants. Which might mean the problem is with the function calls, but then I can't understand how...

As to the original problem —

How did you determine this?

Assuming it's accurate, I would suggest working on a minimal reproduction — modify your macro to produce only the erroneous code, and strip away all unnecessary elements even if this stops the macro from serving any useful purpose. Keep only the confusing part. That way, more people will be able to think about your now-smaller problem, and it makes good material for a bug report if this turns out to be a compiler bug.

We know because if you eliminate from the procedural macro the branches 0usize =>, etc., leaving just the tag =>, the error disappears. The problem is that we are unable to isolate the problem because, as I wrote above, in isolation it works.

You wrote

the same exact code copied and pasted in another file compiles without problems.

The problem with this strategy is that text does not always reproduce an identical token stream, so it isn't an accurate test. Instead, make your macro simpler while checking that it still produces the syntactically-broken code. Keep going until you can't find anything to remove. Minimize your original code into a repro.
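
A minimal way to see this without any procedural macro, as a sketch: a macro_rules! expression fragment is kept as a single unit in the token stream, so a naive textual rendering of the expansion can mean something different once it is re-parsed. The names times_two, expanded_by_macro, and pasted_as_text are hypothetical and exist only for this illustration.

```rust
// A declarative macro that splices an expression fragment into a larger expression.
macro_rules! times_two {
    ($e:expr) => {
        $e * 2
    };
}

// The fragment `1 + 2` stays grouped inside the token stream,
// so this evaluates as (1 + 2) * 2.
pub fn expanded_by_macro() -> i32 {
    times_two!(1 + 2)
}

// The same tokens written out as flat text and parsed again:
// the grouping is gone, so this evaluates as 1 + (2 * 2).
pub fn pasted_as_text() -> i32 {
    1 + 2 * 2
}
```

Here expanded_by_macro() returns 6 while pasted_as_text() returns 5. Whether a given tool prints the grouping is a separate question; the point is only that the printed text and the underlying tokens are two different things.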

BTW, the problem persists even if you reduce the enum to just the variant A, so all branches 1usize =>, etc., disappear. This is why it's really weird: how can 0usize => Ok(Self::A {}), give a syntax error?

Good point. This won't compile:


use epserde::prelude::*;

#[derive(Epserde)]
enum Data {
    A,
}

The generated code is

#![feature(prelude_import)]
#![cfg(test)]
#[prelude_import]
use std::prelude::rust_2021::*;
#[macro_use]
extern crate std;
use epserde::prelude::*;
enum Data {
    A,
}
#[automatically_derived]
impl epserde::traits::CopyType for Data {
    type Copy = epserde::traits::Deep;
}
#[automatically_derived]
impl epserde::ser::SerializeInner for Data {
    const IS_ZERO_COPY: bool = false;
    const ZERO_COPY_MISMATCH: bool = !false;
    #[inline(always)]
    fn _serialize_inner(
        &self,
        backend: &mut impl epserde::ser::WriteWithNames,
    ) -> epserde::ser::Result<()> {
        epserde::ser::helpers::check_mismatch::<Self>();
        match self {
            Self::A => {
                backend.write("tag", &0usize)?;
            }
        }
        Ok(())
    }
}
#[automatically_derived]
impl epserde::deser::DeserializeInner for Data {
    fn _deserialize_full_inner(
        backend: &mut impl epserde::deser::ReadWithPos,
    ) -> core::result::Result<Self, epserde::deser::Error> {
        use epserde::deser::DeserializeInner;
        match usize::deserialize_full(backend)? {
            0usize => Ok(Self::A {}),
            tag => Err(epserde::deser::Error::InvalidTag(tag)),
        }
    }
    type DeserType<'epserde_desertype> = Data;
    fn _deserialize_eps_inner<'a>(
        backend: &mut epserde::deser::SliceWithPos<'a>,
    ) -> core::result::Result<Self::DeserType<'a>, epserde::deser::Error> {
        use epserde::deser::DeserializeInner;
        match usize::deserialize_full(backend)? {
            0usize => Ok(Self::DeserType::<'_>::A {}),
            tag => Err(epserde::deser::Error::InvalidTag(tag)),
        }
    }
}

The problem is with the two structure creation expressions after 0usize => .

Great. Now start making your macro simpler. For example: convert it from a derive macro into a function-like macro, so it doesn't require a struct as input. Stop emitting an impl; just emit functions. Remove the use epserde:: — since this is a syntax error, the produced code does not need to be valid beyond the syntactic level.

Make your large macro smaller by all possible means, not just reasonable ones that fit your original use-case. At the end of this process, if you actually follow it to the end, you will have something with which you can obtain understanding (or a bug report).

If you are trying to make integers into field names, you have to use Index, not a number literal. I don't know if you are already doing this, but it's a common mistake.

I think we're doing so: var_fields_vars.push(syn::Index::from(field_idx));.

On the other hand, a wrong type of token would explain everything. I think you're right.

Valentin Lorentz found the solution: the problem is in the lines

variant_full_des.push(quote!{ {} });
variant_eps_des.push(quote!{  {} });

which should be

variant_full_des.push(quote!{});
variant_eps_des.push(quote!{});

The result of cargo expand in both cases is character-by-character the same, but clearly the generated tokens must be different.
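
To make the difference between the two bodies concrete, here is a rough stand-in that uses macro_rules! instead of quote!, so it needs no external crates. The macro count_tts and the two functions are hypothetical helpers for this illustration; they only count top-level token trees and are not the epserde code.

```rust
// Count the top-level token trees passed to the macro.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// Analogue of quote!{}: an empty body contributes no tokens.
pub fn tokens_in_fixed_body() -> usize {
    count_tts!()
}

// Analogue of quote!{ {} }: the body is one brace-delimited group,
// which would end up nested inside the variant's own braces.
pub fn tokens_in_buggy_body() -> usize {
    count_tts!({})
}
```

The empty body counts as 0 token trees and the { {} } body counts as 1, which is the extra group that has no valid meaning inside a struct expression.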

It is actually a cargo-expand problem.

Printing the output of our quote! using eprintln!, one clearly sees the problem: A { {} }. For some reason, cargo expand turns this into A {}, making it impossible to diagnose the problem.

UPDATE: After opening an issue, David Tolnay noticed that the same happens with cargo rustc -- -Zunpretty=expanded, so it's not a specific cargo-expand problem.
