Relative newbie: E0503 errors when building a parser by hand

Hello everyone.

For pedagogical reasons, I've decided to try and write for myself a simple, LL(1)-type programming language targeting the 65816 processor with a syntax not entirely dissimilar to that of Rust. Think something like Tiny-C, except with a Rust-y syntax. I decided to call it ROIL, for Rust/Oberon Inspired Language.

I've been following a technique used in the "15 compilers in 15 days" paper, namely building the compiler up in incremental pieces using test-driven programming techniques. As I work through the project so far, I've been getting encouraging results.

Until now. After refactoring some code that belongs in a parser into the parser.rs module proper, I'm now receiving a slew of E0503 errors. I've pushed my code to my "incubator" repository so everyone can see my work so far (and clone and poke around if they want; I welcome this, in fact). I apologize if the code is not well structured; please remember this is all work-in-progress material.

https://git.sr.ht/~vertigo/incubator/tree/2d9c0625be35224b3a149d77ee5ef9e60906d3ed/item/compiler/roil/src/parser.rs#L84-102

As the code comment indicates, I'm baffled about why I'm getting an E0503 error mid-way through the grammar rule method that handles let-bindings (g_let).

If anyone has any recommendations on how to restructure the code so as to work around this issue, I'd be appreciative. Thanks in advance.

Prior research: I've looked here on the forums for similar issues involving E0503, but while there were many articles which popped up, none of them seemed to touch on my specific set of circumstances. rustc --explain E0503 yielded similarly "this is obvious, but I don't know how it relates" answers to my problem. So, I've resorted to posting a new topic here in the hopes that this adds to the knowledge base somehow.

NOTE: Because I know at least one person will recommend using a parser generator or combinator library, please don't. The reason I'm not is because I want the experience of having done it myself first. This is an exercise for myself to learn how to write non-trivial code in Rust. If I were writing for a commercial-grade product, or contributing to someone else's project, etc., then of course I'd use or recommend pre-existing tools. Thanks!

Going purely by the code and the comment within it:

        let rval;
        if let Some(Token::Char('=')) = self.next {
            self.skip();
            rval = self.g_expr();  // <-- borrows mutably you said
        } else {
            return Item::Error;
        }

        // The error is here, self.next probably?
        if let Some(Token::Char(';')) = self.next {
            self.skip();
        } else {
            return Item::Error;
        }

        Item::DeclareLocal(id, &rval)
// rval must be live:           ^^^^

rval is an Item<'x> where 'x is due to an exclusive borrow of self. That means as long as you want rval to be alive, self must remain exclusively borrowed -- you can't use it. Or to view it another way, as soon as you use self again, rval becomes invalid.


Incidentally, borrow conundrums will be easier to spot if you enable this lint:

#![deny(elided_lifetimes_in_paths)]

It will force you to rewrite this:

    pub fn g_expr(&mut self) -> Item {

As this:

    pub fn g_expr(&mut self) -> Item<'_> {

Which means the same as this:

    pub fn g_expr<'a>(&'a mut self) -> Item<'a> {

(which is why rval holds onto an exclusive borrow of self).
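
To make the consequence concrete, here is a minimal sketch. The `Parser` and `Item` below are hypothetical stand-ins rather than the code in your repository, and `parse_value` is a made-up helper; only the shape of the `g_expr` signature is taken from the thread. It shows the ordering that the borrow checker accepts: finish with the returned `Item` first, then touch the parser again.

```rust
// Hypothetical stand-ins for the types in parser.rs; only the shapes matter.
#[derive(Debug, PartialEq)]
pub enum Item<'a> {
    Error,
    ConstInteger(u16),
    // A borrowing variant is what gives `Item` its lifetime parameter.
    DeclareLocal(String, &'a Item<'a>),
}

pub struct Parser {
    pub next: Option<char>,
}

impl Parser {
    // Elided form of `fn g_expr<'a>(&'a mut self) -> Item<'a>`.
    pub fn g_expr(&mut self) -> Item<'_> {
        self.next = Some(';');
        Item::ConstInteger(7)
    }
}

// Made-up helper: extracts the integer, then checks for the terminator.
pub fn parse_value(p: &mut Parser) -> Option<u16> {
    let rval = p.g_expr(); // `*p` is exclusively borrowed from here...
    let value = match rval {
        Item::ConstInteger(n) => Some(n),
        _ => None,
    }; // ...until this last use of `rval`.

    // Fine: the borrow has ended. Moving this check above the `match`
    // would be rejected, because `rval` would still be live.
    if p.next == Some(';') {
        value
    } else {
        None
    }
}
```

In g_let the order is the other way around: rval is still needed for the final DeclareLocal after self.next has been read, which is why the compiler complains.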

and...

OK, let me see if I have an understanding of what you're saying.

Because of the way that g_expr() function is declared, the Item it returns is bounded implicitly by the lifetime of self. Since g_expr() takes a mutable reference to self, it's assumed by the compiler that the resulting Item also has an exclusive dependency upon self. Is that a correct understanding?

What would you recommend as a way to break this dependency? I suppose I could pass in a mutable item reference, like so: my_parser.g_expr(&mut my_item) and have it fill it out; that way, g_expr() doesn't return anything, and the lifetimes of the Items can be more independently specified. Do you think that would work?

Yes. It's not really an assumption, it's explicitly what the function signature means. Think of the signature as an API contract the compiler has been tasked with enforcing. It enforces it both for the function writer (function body) and function callers.

It's hard to say without knowing the use case behind Item::DeclareLocal, which contains a borrowing field. If a given method never returns anything that borrows, it could return Item<'static>; that would break the dependency.

Speaking very generally, abstracting over "borrowed or owned" tends to get complicated, so you may be better off splitting off the enum variants that need to borrow, or having a borrowing and non-borrowing enum.
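
As a rough illustration of the "borrowing and non-borrowing enum" idea, here is a sketch. The names `Value`, `Decl`, `declare`, and `demo` are invented for this example and the parser is reduced to a counter, so treat it as one possible shape under those assumptions, not a recommendation for how ROIL should be laid out.

```rust
// Hypothetical split, for illustration only.

// Owns everything, so it has no lifetime parameter.
#[derive(Debug, PartialEq)]
pub enum Value {
    Error,
    ConstInteger(u16),
}

// The only type that borrows.
#[derive(Debug, PartialEq)]
pub enum Decl<'a> {
    Local(&'a str, &'a Value),
}

pub struct Parser {
    pub pos: usize,
}

impl Parser {
    // Returns an owned `Value`, so nothing ties the result to `&mut self`.
    pub fn g_expr(&mut self) -> Value {
        self.pos += 1;
        Value::ConstInteger(5)
    }
}

// The borrow lives in the caller, next to the data being borrowed.
pub fn declare<'a>(name: &'a str, value: &'a Value) -> Decl<'a> {
    Decl::Local(name, value)
}

pub fn demo() -> bool {
    let mut p = Parser { pos: 0 };
    let rval = p.g_expr();
    p.pos += 1; // still allowed: `rval` does not borrow `p`
    let decl = declare("x", &rval);
    decl == Decl::Local("x", &Value::ConstInteger(5)) && p.pos == 2
}
```

Note that `Decl` still cannot be returned from a function that owns `rval` as a local; the split only keeps the borrow away from the parser methods.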

This is not what I gathered from reading Rust: The Programming Language or from my previous coding experiences. I thought lifetimes only scoped data (e.g., in this case, that self must at least out-live Item), but did not also convey mutability dependency.

But, wouldn't that then imply that the Item lasts as long as the program is running?

The use-case I had in mind was that handing a Item::DeclareLocal variant to a code generator would emit the machine code necessary to initialize a local variable. So, the parser would create the Item, and the code generator would consume (and, thus, drop) it later on.

While Item does have borrowing fields, they are to Strings and references to sub-Items. They do not have (to my knowledge at least) references to anything in the Parser or any of its methods.

Thank you for the explanations. This is very helpful for my understanding of what's happening. It looks like I have some major refactoring to do.

Lifetimes don't convey information about mutability, they convey information about how long a borrow lasts. The signature of the function says that the returned Item<'_> requires the borrow of &mut self to be "held" for as long as the Item<'_> exists.

If you constructed an Item::<'static>::DeclareLocal variant, the reference you used would need to live for the rest of the program, yes. But you aren't currently using DeclareLocal in the g_expr, so you can use Item<'static> without any issues. If you eventually need to use DeclareLocal there, you will likely need to choose a different strategy.

You have a larger architectural problem here though, because you're trying to return Item::DeclareLocal(id, &rval) from that function, but rval is a local. You said in your original post that that error wasn't related to this which is technically true, but the root cause is the same: the fact that Item contains a reference.

You can fix both by making Item own the subitem:

pub enum Item {
    Error,
    ConstInteger(TargetUInt),
    DeclareLocal(String, Box<Item>),
}

The Box is necessary since the enum contains itself, though you could use any kind of indirection like Rc or Arc to the same effect.

Updating everything to work with that change fixes all of the errors in the file.
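
For reference, here is a cut-down sketch of what g_let can look like with the owning enum. The `Token` variants, the `Parser` fields, the `TargetUInt = u16` alias, and passing `id` in as an argument are all assumptions made to keep the example self-contained; the real definitions are in the repository.

```rust
// Hypothetical stand-ins; the real definitions live in the ROIL repository.
type TargetUInt = u16;

enum Token {
    Char(char),
    Number(TargetUInt),
}

#[derive(Debug, PartialEq)]
pub enum Item {
    Error,
    ConstInteger(TargetUInt),
    DeclareLocal(String, Box<Item>),
}

struct Parser {
    tokens: std::vec::IntoIter<Token>,
    next: Option<Token>,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        let mut tokens = tokens.into_iter();
        let next = tokens.next();
        Parser { tokens, next }
    }

    // Advance the one-token lookahead.
    fn skip(&mut self) {
        self.next = self.tokens.next();
    }

    // No lifetime in the return type, so the result does not borrow `self`.
    fn g_expr(&mut self) -> Item {
        match self.next {
            Some(Token::Number(n)) => {
                self.skip();
                Item::ConstInteger(n)
            }
            _ => Item::Error,
        }
    }

    // Simplified: parses `= <expr> ;` for an identifier read by the caller.
    fn g_let(&mut self, id: String) -> Item {
        if let Some(Token::Char('=')) = self.next {
            self.skip();
        } else {
            return Item::Error;
        }
        let rval = self.g_expr();

        // `rval` is owned, so `self` is free to be used again here.
        if let Some(Token::Char(';')) = self.next {
            self.skip();
        } else {
            return Item::Error;
        }

        // The sub-item moves into the box; no reference to a local escapes.
        Item::DeclareLocal(id, Box::new(rval))
    }
}
```

The code generator can then take the Item by value and drop it when it is done, which matches the use case described above.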

Sometimes, it is the simplest things that get me. I completely forgot about Box.

Thank you so much!

When you tie lifetimes together (which can happen with elided lifetimes in function signatures, as happened here), it means that they are related borrows -- like you're transferring a borrow, something like that. When you see

fn method(&self) -> ThingWithLifetime<'_>

It means that the ThingWithLifetime<'_> is (notionally if not literally) holding onto a borrow from &self.

The mutability / exclusiveness aspect isn't part of the lifetime itself, but given

fn method<'a>(&'a mut self) -> ThingWithLifetime<'a>

There is no way to get the ThingWithLifetime<'a> from the method without first exclusively borrowing self for 'a as well. Once that &'a mut self is created -- which you must do to call the method -- self is exclusively borrowed for the entirety of 'a.
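
Here is a small sketch contrasting the two cases. `Lexer`, `Word`, and `demo` are hypothetical and unrelated to the ROIL sources; the point is only that the same return type behaves differently depending on whether the method takes `&self` or `&mut self`.

```rust
// Hypothetical types for illustration; not taken from the ROIL sources.
pub struct Lexer {
    src: String,
    pos: usize,
}

#[derive(Debug, PartialEq)]
pub struct Word<'a>(pub &'a str);

impl Lexer {
    pub fn new(src: &str) -> Self {
        Lexer { src: src.to_string(), pos: 0 }
    }

    // Shared borrow: the returned `Word` keeps `self` borrowed, but other
    // shared uses of `self` are still allowed while it is alive.
    pub fn peek_word(&self) -> Word<'_> {
        let rest = &self.src[self.pos..];
        let end = rest.find(' ').unwrap_or(rest.len());
        Word(&rest[..end])
    }

    // Exclusive borrow: while the returned `Word` is alive, `self` cannot
    // be used at all.
    pub fn take_word(&mut self) -> Word<'_> {
        let start = self.pos;
        let rest = &self.src[start..];
        let end = start + rest.find(' ').unwrap_or(rest.len());
        // Step past the word and one separating space, if there is one.
        self.pos = (end + 1).min(self.src.len());
        Word(&self.src[start..end])
    }
}

pub fn demo() -> (String, String) {
    let mut lx = Lexer::new("let x");
    let a = lx.peek_word();
    let b = lx.peek_word(); // ok: two shared borrows coexist
    assert_eq!(a, b);
    // Copying the text out ends each exclusive borrow within its statement.
    let first = lx.take_word().0.to_string();
    let second = lx.take_word().0.to_string();
    (first, second)
}
```

Holding on to the result of the first `take_word` call while making the second would be rejected, for the same reason as in g_let.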


Somehow I didn't notice your return in the problematic function was Item::DeclareLocal and @semicoleon has given a better response on that part. If I had noticed I wouldn't have mentioned the option of returning Item<'static> probably.

But what I meant was that if you had something like this:

fn method(&mut self) -> Item<'_> {
    // n.b. I'm never going to return a borrow-carrying variant
    //      like Item::DeclareLocal and I'm willing to promise
    //      that in my API
    Item::Error
}

You could change it to this:

// The input lifetime is no longer tied to the output lifetime
fn method(&mut self) -> Item<'static> {
    Item::Error
}

And it doesn't mean that the returned value has to live for the entirety of the program, it just means that if there's a borrowed variant, it's borrowed for 'static. [1] But there is no borrowed variant in this example.

See also common lifetime misconception #2.
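
A slightly fuller sketch of the same idea, assuming a hypothetical `Name` variant as the borrowing one and a parser reduced to a call counter. It shows both halves of the claim: the returned value can outlive the call without locking the parser, and a borrowing variant is still allowed when what it borrows is itself `'static`.

```rust
// Hypothetical reduction; `Name` stands in for any variant that borrows.
#[derive(Debug, PartialEq)]
pub enum Item<'a> {
    Error,
    ConstInteger(u16),
    Name(&'a str),
}

pub struct Parser {
    pub calls: u32,
}

impl Parser {
    // The output lifetime is spelled out, so it is not tied to `&mut self`.
    pub fn g_expr(&mut self) -> Item<'static> {
        self.calls += 1;
        Item::ConstInteger(self.calls as u16)
    }

    // A borrowing variant is still possible, as long as what it borrows is
    // itself `'static`, such as a string literal.
    pub fn g_keyword(&mut self) -> Item<'static> {
        self.calls += 1;
        Item::Name("let")
    }
}

pub fn demo() -> (Item<'static>, Item<'static>, u32) {
    let mut p = Parser { calls: 0 };
    let first = p.g_expr();
    // `first` is still alive here, yet `p` can be used again.
    let second = p.g_keyword();
    (first, second, p.calls)
}
```

Both items are ordinary values that get dropped when the tuple is dropped; nothing here has to live until the program exits except the string literal.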


  1. Whatever is borrowed would have to live for the entirety of the program. ↩︎

It is crucial to understand that lifetimes in Rust do not describe how long data lives. They describe how long borrows are held.

Think of it this way: whenever you borrow something, you create a reference with a freshly minted lifetime annotation 'a. The borrow will then be held for as long as any value whose type mentions 'a exists[1], whether it's an &'a T or a Thing<'a>.

'static is the "empty annotation" that doesn't describe a borrow of anything, and is therefore permitted to exist forever.


  1. This is a bit of an oversimplification; technically the borrow is held until the last usage of that value, rather than when it falls out of scope ↩︎
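
Two tiny sketches of those points, with made-up function names. The first shows a borrow ending at the last use of the reference rather than at the end of its scope; the second shows a `'static` reference that borrows from no local at all.

```rust
pub fn last_use_ends_borrow() -> Vec<i32> {
    let mut v = vec![1, 2, 3];
    let first = &v[0]; // shared borrow of `v` starts
    let copy = *first; // last use of `first`: the borrow ends here
    v.push(copy + 10); // ok, even though `first` is still in scope
    v
}

pub fn static_needs_no_borrow() -> &'static str {
    // A string literal lives in the program's static data, so the reference
    // can be returned without borrowing from any local.
    "roil"
}
```

Using `first` again after the `push` would extend the borrow past the mutation, and the compiler would reject it.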

Thanks everyone for helping to explain where I am getting hung up. This is exactly why I undertook this toy compiler project: if I hadn't, I'd've not learned anything new about how Rust lifetime annotations work. There is much food for thought here, which I'll need some time to properly digest.

So much confusion could be avoided if elided_lifetimes_in_paths was at least warn-by-default...

A warning would not have cleared up my misunderstanding of how lifetimes worked here.
