Parse SQL statements

Hi, I'm working on a CLI tool, which allows a user to gather system information via a SQL statement, e. g.

SELECT * FROM users;

would return local users' details. This is a very simple example! However, the tool should also be able to handle complex SQL statements, e.g. joining multiple tables, statements prefixed with a WITH clause.

(1) Table names are then mapped to "plugins", which (2) gather the information. (3) Information is then written to a SQLite database. (4) Finally the SQL statement is executed against the database and (5) the result provided to the user.

I've implemented (2) to (5), but I'm struggling with (1). So, how to parse the SQL statement to identify the plugin to execute? I've looked into sqlite3-parser. However, it looks like it does not support to get the table names.

Hey, how would you approach this? Any idea, no matter how mature, appreciated.

Thanks.

Maybe it is just difficult to find in the doc. I see that ResultColumn has a table name. Perhaps it's done that way since columns from multiple tables can be selected.

The FromClause also has the table.

@jumpnbrownweasel, yes, I'm aware of those structs. I'm using the Parser to analyse the SQL statement. I see the information I need in the debug output but I can't find a way to access it via the crate's API.

Select { distinctness: None, columns: [Star], from: Some(FromClause { 
    select: Some(Table(QualifiedName { 
        db_name: None, 
        name: Name("users"), 
        alias: None 
    }, 
    None, 
    None)), 
    joins: None, op: None }), where_clause: None, group_by: None, window_clause: None }

Going through the source code of sqlite3-parser ... getting closer.

Table(QualifiedName { db_name: None, name: Name("users"), alias: None }, None, None)

Got it. The key to this was reading the source code not the docs and finally the enum OneSelect. I was not familiar with a struct being a variant.

pub enum OneSelect {
    /// `SELECT`
    Select {
        /// `DISTINCT`
        distinctness: Option<Distinctness>,
        /// columns
        columns: Vec<ResultColumn>,
        /// `FROM` clause
        from: Option<FromClause>,
        /// `WHERE` clause
        where_clause: Option<Expr>,
        /// `GROUP BY`
        group_by: Option<GroupBy>,
        /// `WINDOW` definition
        window_clause: Option<Vec<WindowDef>>,
    },
    /// `VALUES`
    Values(Vec<Vec<Expr>>),
}

Thank you @jumpnbrownweasel. Your input gave me confidence to try harder.

I guess I will have some fun the next days to find all locations where table names could show up. :slight_smile:

1 Like

Hi Tokcum, still a little confused on how you are getting the table names with OneSelect. So that would only get the table names for "Select" statements, correct? And to get all the OneSelect objects it looks like you need to check SelectBody has both a OneSelect object and can have a Vec that in turn has OneSelect objects. So looks like you have to check in two location. Would you mind describing the steps here. Thanks in advance for any light you can shed on this.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.