1. Overview
A parser for SQL in Haskell. Also includes a pretty printer which formats output nicely. Current target is to parse most SQL:2011 queries, plus a good subset of DDL, non-query DML, transaction management, access control and session management.
This is the documentation for version 0.6.0. Documentation for other versions is available here: http://jakewheat.github.io/simple-sql-parser/.
Status: covers a lot of queries already, but the public API is probably not very stable, since adding support for all the not-yet-supported ANSI SQL syntax, then other dialects of SQL is likely to change the abstract syntax types considerably.
Tested with GHC 8.8.1, 8.6.5, 8.4.4 and 8.2.1.
2. Links
-
Haddock: haddock/index.html
-
Supported SQL: supported_sql.html
-
Test cases: simple-sql-parser test cases
-
Homepage: http://jakewheat.github.io/simple-sql-parser/latest
-
Hackage: http://hackage.haskell.org/package/simple-sql-parser
-
Repository: https://github.com/JakeWheat/simple-sql-parser
-
Bug tracker: https://github.com/JakeWheat/simple-sql-parser/issues
-
Changes: https://github.com/JakeWheat/simple-sql-parser/blob/master/changelog
-
Other versions: http://jakewheat.github.io/simple-sql-parser/
-
Parent project: http://jakewheat.github.io/
-
Contact: jakewheatmail@gmail.com
The simple-sql-parser is a lot less simple than it used to be. If you just need to parse much simpler SQL than this, or want to start with a simpler parser and modify it slightly, you could also look at the basic query parser in the intro_to_parsing project, the code is here: SimpleSQLQueryParser.
3. Feature support
-
query expressions
-
select lists
-
from clause
-
where clause
-
group by clause
-
having clause
-
order by clause
-
offset and fetch
-
set operators
-
common table expressions
-
wide range of scalar expressions
-
-
DDL (ansi dialect)
-
create,drop schema
-
create, alter, drop table
-
create, drop view
-
create, alter, drop domain
-
create, drop assertion
-
create, alter, drop sequence
-
-
non-query DML
-
delete
-
truncate
-
insert
-
update
-
-
Access control
-
grant, revoke - permissions and roles
-
create, drop role
-
-
Transaction management
-
begin, commit, rollback, savepoints
-
See the supported_sql.html page for details on the supported SQL.
Here is all the simple-sql-parser test cases rendered in a webpage so you can get an idea of what it supports.
4. Examples
Simple expression:
select a + b * c
Parsed AST:
Select{qeSetQuantifier = All,
qeSelectList =
[(BinOp (Iden (Name "a")) (Name "+")
(BinOp (Iden (Name "b")) (Name "*") (Iden (Name "c"))),
Nothing)],
qeFrom = [], qeWhere = Nothing, qeGroupBy = [], qeHaving = Nothing,
qeOrderBy = [], qeOffset = Nothing, qeFetchFirst = Nothing}
TPC-H query 21:
select
s_name,
count(*) as numwait
from
supplier,
lineitem l1,
orders,
nation
where
s_suppkey = l1.l_suppkey
and o_orderkey = l1.l_orderkey
and o_orderstatus = 'F'
and l1.l_receiptdate > l1.l_commitdate
and exists (
select
*
from
lineitem l2
where
l2.l_orderkey = l1.l_orderkey
and l2.l_suppkey <> l1.l_suppkey
)
and not exists (
select
*
from
lineitem l3
where
l3.l_orderkey = l1.l_orderkey
and l3.l_suppkey <> l1.l_suppkey
and l3.l_receiptdate > l3.l_commitdate
)
and s_nationkey = n_nationkey
and n_name = 'INDIA'
group by
s_name
order by
numwait desc,
s_name
fetch first 100 rows only;
Parsed:
Select{qeSetQuantifier = All,
qeSelectList =
[(Iden (Name "s_name"), Nothing),
(App (Name "count") [Star], Just (Name "numwait"))],
qeFrom =
[TRSimple (Name "supplier"),
TRAlias (TRSimple (Name "lineitem")) (Alias (Name "l1") Nothing),
TRSimple (Name "orders"), TRSimple (Name "nation")],
qeWhere =
Just
(BinOp
(BinOp
(BinOp
(BinOp
(BinOp
(BinOp
(BinOp
(BinOp (Iden (Name "s_suppkey")) (Name "=")
(BinOp (Iden (Name "l1")) (Name ".")
(Iden (Name "l_suppkey"))))
(Name "and")
(BinOp (Iden (Name "o_orderkey")) (Name "=")
(BinOp (Iden (Name "l1")) (Name ".")
(Iden (Name "l_orderkey")))))
(Name "and")
(BinOp (Iden (Name "o_orderstatus")) (Name "=") (StringLit "F")))
(Name "and")
(BinOp
(BinOp (Iden (Name "l1")) (Name ".") (Iden (Name "l_receiptdate")))
(Name ">")
(BinOp (Iden (Name "l1")) (Name ".")
(Iden (Name "l_commitdate")))))
(Name "and")
(SubQueryExpr SqExists
(Select{qeSetQuantifier = All, qeSelectList = [(Star, Nothing)],
qeFrom =
[TRAlias (TRSimple (Name "lineitem"))
(Alias (Name "l2") Nothing)],
qeWhere =
Just
(BinOp
(BinOp
(BinOp (Iden (Name "l2")) (Name ".")
(Iden (Name "l_orderkey")))
(Name "=")
(BinOp (Iden (Name "l1")) (Name ".")
(Iden (Name "l_orderkey"))))
(Name "and")
(BinOp
(BinOp (Iden (Name "l2")) (Name ".")
(Iden (Name "l_suppkey")))
(Name "<>")
(BinOp (Iden (Name "l1")) (Name ".")
(Iden (Name "l_suppkey"))))),
qeGroupBy = [], qeHaving = Nothing, qeOrderBy = [],
qeOffset = Nothing, qeFetchFirst = Nothing})))
(Name "and")
(PrefixOp (Name "not")
(SubQueryExpr SqExists
(Select{qeSetQuantifier = All, qeSelectList = [(Star, Nothing)],
qeFrom =
[TRAlias (TRSimple (Name "lineitem"))
(Alias (Name "l3") Nothing)],
qeWhere =
Just
(BinOp
(BinOp
(BinOp
(BinOp (Iden (Name "l3")) (Name ".")
(Iden (Name "l_orderkey")))
(Name "=")
(BinOp (Iden (Name "l1")) (Name ".")
(Iden (Name "l_orderkey"))))
(Name "and")
(BinOp
(BinOp (Iden (Name "l3")) (Name ".")
(Iden (Name "l_suppkey")))
(Name "<>")
(BinOp (Iden (Name "l1")) (Name ".")
(Iden (Name "l_suppkey")))))
(Name "and")
(BinOp
(BinOp (Iden (Name "l3")) (Name ".")
(Iden (Name "l_receiptdate")))
(Name ">")
(BinOp (Iden (Name "l3")) (Name ".")
(Iden (Name "l_commitdate"))))),
qeGroupBy = [], qeHaving = Nothing, qeOrderBy = [],
qeOffset = Nothing, qeFetchFirst = Nothing}))))
(Name "and")
(BinOp (Iden (Name "s_nationkey")) (Name "=")
(Iden (Name "n_nationkey"))))
(Name "and")
(BinOp (Iden (Name "n_name")) (Name "=") (StringLit "INDIA"))),
qeGroupBy = [SimpleGroup (Iden (Name "s_name"))],
qeHaving = Nothing,
qeOrderBy =
[SortSpec (Iden (Name "numwait")) Desc NullsOrderDefault,
SortSpec (Iden (Name "s_name")) Asc NullsOrderDefault],
qeOffset = Nothing, qeFetchFirst = Just (NumLit "100")})
Output from the simple-sql-parser pretty printer:
select s_name, count(*) as numwait
from supplier,
lineitem as l1,
orders,
nation
where s_suppkey = l1.l_suppkey
and o_orderkey = l1.l_orderkey
and o_orderstatus = 'F'
and l1.l_receiptdate > l1.l_commitdate
and exists (select *
from lineitem as l2
where l2.l_orderkey = l1.l_orderkey
and l2.l_suppkey <> l1.l_suppkey)
and not exists (select *
from lineitem as l3
where l3.l_orderkey = l1.l_orderkey
and l3.l_suppkey <> l1.l_suppkey
and l3.l_receiptdate > l3.l_commitdate)
and s_nationkey = n_nationkey
and n_name = 'INDIA'
group by s_name
order by numwait desc, s_name
fetch first 100 rows only;
Parsing some SQL and printing the AST:
import System.Environment
import Text.Show.Pretty
import System.IO
import Language.SQL.SimpleSQL.Parse
(parseStatements
,ParseError
,peFormattedError)
import Language.SQL.SimpleSQL.Syntax (ansi2011, Statement)
main :: IO ()
main = do
args <- getArgs
case args of
[] -> do
-- read from stdin
c <- getContents
doIt c
["-s", sql] -> do
-- parse arg given
doIt sql
[f] ->
-- read file
withFile f ReadMode (\h -> do
x <- hGetContents h
doIt x)
_ -> do
putStrLn "use no arguments to stream sql from stdin, e.g.:\n\
\ cat some.sql | SimpleSQLParserExample\n\
\n\
\use -s to parse sql on command line, e.g.:\n\
\ SimpleSQLParserExample -s \"select * from t\"\n\
\use a single arg to parse a file, e.g.\n\
\ SimpleSQLParserExample some.sql"
doIt :: String -> IO ()
doIt src = do
let parsed :: Either ParseError [Statement]
parsed = parseStatements ansi2011 "" Nothing src
either (error . peFormattedError)
(putStrLn . ppShow)
parsed
5. Installation
Installing the latest release from Hackage.
cabal v2-update && cabal v2-install simple-sql-parser
Working with the latest development version:
git clone https://github.com/JakeWheat/simple-sql-parser.git cd simple-sql-parser cabal v2-build
5.1. Running the tests
Get the source using 'cabal unpack' or 'git clone', then change to the source directory.
You can run the tests using cabal:
cabal v2-test
Or you can run them directly which gives more options. The tests use tasty, which provides the command line options. --hide-successes with --ansi-tricks=false so it works is a good option to use:
cabal v2-run test:Tests -- --hide-successes --ansi-tricks=false
6. Reporting bugs
Please report bugs here:
A good bug report (or feature request) should have an example of the SQL which is failing.
Feature requests are welcome, but please note that there is no-one generally available to work on these, so you should either make a pull request, or find someone willing to write the fixes and make a pull request.
There is a related tutorial on implementing a SQL parser here: http://jakewheat.github.io/intro_to_parsing/