Code Tip #1: PEAR SQL Parser

In developing Dataface, I have had to use a number of existing libraries to make my life easier.  One such library is the SQL Parser library in PEAR.  This class will parse an SQL query into a data structure that can be manipulated quite easily.  The library also comes with the complementary SQL_Compiler class that can compile these data structures back into an SQL statement.  This can be handy if you want to be able to add filters, column lists, or sort orders to sql queries, or if you want to break a query down into its component parts.

 Simple Usage

require_once ‘SQL/Parser.php’;
$parser = new SQL_Parser();
$struct = $parser->parse("SELECT a,b,c FROM Foo");
print_r($struct);

The above code sample will output the following:

Array
(
    [command] => select
    [column_tables] => Array
        (
            [0] =>
            [1] =>
            [2] =>
        )

    [column_names] => Array
        (
            [0] => a
            [1] => b
            [2] => c
        )

    [column_aliases] => Array
        (
            [0] =>
            [1] =>
            [2] =>
        )

    [table_names] => Array
        (
            [0] => foo
        )

    [table_aliases] => Array
        (
            [0] =>
        )

    [table_join_clause] => Array
        (
            [0] =>
        )

)

 

This is a fairly simple representation and it should be straight forward what all this means, but I’ll go over it here.  The statement SELECT a, b, c FROM Foo is parsed into the above array with the following keys:

  • command – The SQL command (eg: SELECT, UPDATE, INSERT, etc..)
  • column_names – The names of the columns that are included in the query.
  • column_aliases – If you have a column name of the form foo as bar, indicating that you are aliasing that column, this will be an array of the aliases for each column. 
  • column_tables – If you have a column name of the form Foo.bar (indicating column bar from table Foo) then this is an array of the table names.
  • table_names – The names of the tables in the FROM clause.
  • table_aliases – The aliases of the tables in the FROM clause.  This array has corresponding indices to the table_names array.
  • table_join_clause – A structure to represent the join clause of the query.
  • where_clause – A structure to represent the where clause of the query.
  • sort_order – A structure to represent the ORDER BY clause of the query.

How about a slightly more complex query as an example to see how these structures work.

require_once ‘SQL/Parser.php’;
$parser = new SQL_Parser(null, ‘MySQL’);
$struct = $parser->parse("SELECT F.a as column1, B.b as column2 FROM Foo F inner join Bar B on F.c=B.c where column1 = ‘val1’ and column2 = ‘val2’");
print_r($struct);

This would output:

Array
(
    [command] => select
    [column_tables] => Array
        (
            [0] => F
            [1] => B
        )

    [column_names] => Array
        (
            [0] => a
            [1] => b
        )

    [column_aliases] => Array
        (
            [0] => column1
            [1] => column2
        )

    [table_names] => Array
        (
            [0] => Foo
            [1] => Bar
        )

    [table_aliases] => Array
        (
            [0] => F
            [1] => B
        )

    [table_join_clause] => Array
        (
            [0] =>
            [1] => Array
                (
                    [arg_1] => Array
                        (
                            [value] => F.c
                            [type] => ident
                        )

                    [op] => =
                    [arg_2] => Array
                        (
                            [value] => B.c
                            [type] => ident
                        )

                )

        )

    [table_join] => Array
        (
            [0] => inner join
        )

    [where_clause] => Array
        (
            [arg_1] => Array
                (
                    [arg_1] => Array
                        (
                            [value] => column1
                            [type] => ident
                        )

                    [op] => =
                    [arg_2] => Array
                        (
                            [value] => val1
                            [type] => text_val
                        )

                )

            [op] => and
            [arg_2] => Array
                (
                    [arg_1] => Array
                        (
                            [value] => column2
                            [type] => ident
                        )

                    [op] => =
                    [arg_2] => Array
                        (
                            [value] => val2
                            [type] => text_val
                        )

                )

        )

)

Comments

As you can see the column_names, column_aliases, and column_tables arrays all have corresponding indicesi.e. $struct[‘column_aliases’][$i] contains the alias (if it is specified) for the column whose name is stored in $struct[‘column_names’][$i] and the table for the column can be found in $struct[‘column_tables’][$i].

There is a similar correspondence between the table_names and table_aliases arrays.

comments powered by Disqus