Parsing SQL code in C# [closed]
Asked Answered
L

8

80

I want to parse SQL code using C#.

Specifically, is there any freely available parser which can parse SQL code and generate a tree or any other structure out of it? It should also generate the proper tree for nested structures.

It should also return which kind of statement the node of this tree represents.

For example, if the node contains a loop condition then it should return that this is a "loop type" of a node.

Or is there any way by which I can parse the code in C# and generate a tree of the type I want?

Letsou answered 26/2, 2009 at 4:20 Comment(1)
stackoverflow.com/questions/34102835Geo
D
11

[Warning: answer may no longer apply as of 2021]

Use Microsoft Entity Framework (EF).

It has a "Entity SQL" parser which builds an expression tree,

using System.Data.EntityClient;
...
EntityConnection conn = new EntityConnection(myContext.Connection.ConnectionString);
conn.Open();
EntityCommand cmd = conn.CreateCommand();
cmd.CommandText = @"Select t.MyValue From MyEntities.MyTable As t";
var queryExpression = cmd.Expression;
....
conn.Close();

Or something like that, check it out on MSDN.

And it's all on Ballmers tick :-)

There is also one on The Code Project, SQL Parser.

Good luck.

Danseur answered 26/2, 2009 at 4:43 Comment(7)
That takes "Entity-SQL", a dialect of SQL; I believe the OP means regular SQL, such as "Transact-SQL" (SQL Server's dialect). In short; this won't work.Effectual
As far as I can tell there are no clues in orgional question either way? So before we right this off shall we wait for @aaCog to confirm?Danseur
Well, I meant SQL and not T-SQL. Also, i do not want to submit any commands to the db server but just want to do the processing over the SQL code which you consider as a simple text.Letsou
I can't find an Expression property for EntityCommand on msdn. msdn.microsoft.com/en-us/library/…Section
For the -ve 'ers. This answer was valid in 2009. EF has moved on, and this no longer appliesDanseur
@Danseur - The fact that this answer is no longer correct is however a valid reason for downvoting. As you know, this is how Stackoverflow is meant to work. It relies on people either updating the answer, or accepting that it will be voted down in preference for more correct answers to be voted above it.Cremona
"it's all on Ballmers tick :-)" what is this supposed to mean? I'm not familiar with the idiom. I assume, given the reference to Ballmer, that it's intended to mean that it is a part of a Microsoft product, or something?Churl
M
37

Specifically for Transact-SQL (Microsoft SQL Server) you can use the Microsoft.SqlServer.Management.SqlParser.Parser namespace available in Microsoft.SqlServer.Management.SqlParser.dll, an assembly included with SQL Server and which can be freely distributed.

Here's an example method for parsing T-SQL as a string into a sequence of tokens:

IEnumerable<TokenInfo> ParseSql(string sql)
{
    ParseOptions parseOptions = new ParseOptions();
    Scanner scanner = new Scanner(parseOptions);

    int state = 0,
        start,
        end,
        lastTokenEnd = -1,
        token;

    bool isPairMatch, isExecAutoParamHelp;

    List<TokenInfo> tokens = new List<TokenInfo>();

    scanner.SetSource(sql, 0);

    while ((token = scanner.GetNext(ref state, out start, out end, out isPairMatch, out isExecAutoParamHelp)) != (int)Tokens.EOF)
    {
        TokenInfo tokenInfo =
            new TokenInfo()
            {
                Start = start,
                End = end,
                IsPairMatch = isPairMatch,
                IsExecAutoParamHelp = isExecAutoParamHelp,
                Sql = sql.Substring(start, end - start + 1),
                Token = (Tokens)token,
            };

        tokens.Add(tokenInfo);

        lastTokenEnd = end;
    }

    return tokens;
}

Note that the TokenInfo class is just a simple class with the above-referenced properties.

Tokens is this enumeration:

and includes constants like TOKEN_BEGIN, TOKEN_COMMIT, TOKEN_EXISTS, etc.

UPDATE: it is now a separate nuget: https://www.nuget.org/packages/Microsoft.SqlServer.Management.SqlParser

Maggot answered 22/2, 2013 at 15:34 Comment(1)
It looks like the best one. But I seem to have to manually build a expression tree.Metanephros
G
12

Scott Hanselman recently featured the Irony project which includes a sample SQL parser.

Geo answered 25/10, 2011 at 18:25 Comment(1)
(becouse CodePlex is shutting down, there is updated link: github.com/IronyProject/Irony )Basilio
D
11

[Warning: answer may no longer apply as of 2021]

Use Microsoft Entity Framework (EF).

It has a "Entity SQL" parser which builds an expression tree,

using System.Data.EntityClient;
...
EntityConnection conn = new EntityConnection(myContext.Connection.ConnectionString);
conn.Open();
EntityCommand cmd = conn.CreateCommand();
cmd.CommandText = @"Select t.MyValue From MyEntities.MyTable As t";
var queryExpression = cmd.Expression;
....
conn.Close();

Or something like that, check it out on MSDN.

And it's all on Ballmers tick :-)

There is also one on The Code Project, SQL Parser.

Good luck.

Danseur answered 26/2, 2009 at 4:43 Comment(7)
That takes "Entity-SQL", a dialect of SQL; I believe the OP means regular SQL, such as "Transact-SQL" (SQL Server's dialect). In short; this won't work.Effectual
As far as I can tell there are no clues in orgional question either way? So before we right this off shall we wait for @aaCog to confirm?Danseur
Well, I meant SQL and not T-SQL. Also, i do not want to submit any commands to the db server but just want to do the processing over the SQL code which you consider as a simple text.Letsou
I can't find an Expression property for EntityCommand on msdn. msdn.microsoft.com/en-us/library/…Section
For the -ve 'ers. This answer was valid in 2009. EF has moved on, and this no longer appliesDanseur
@Danseur - The fact that this answer is no longer correct is however a valid reason for downvoting. As you know, this is how Stackoverflow is meant to work. It relies on people either updating the answer, or accepting that it will be voted down in preference for more correct answers to be voted above it.Cremona
"it's all on Ballmers tick :-)" what is this supposed to mean? I'm not familiar with the idiom. I assume, given the reference to Ballmer, that it's intended to mean that it is a part of a Microsoft product, or something?Churl
M
8

You may take a look at a commerical component: general sql parser at http://www.sqlparser.com It supports SQL syntax of Oracle, T-SQL, DB2 and MySQL.

Masson answered 9/3, 2010 at 13:15 Comment(2)
Generate internal query parse tree in xml output for further processing. Support nested subquery, complex join and statements in stored procedure. dpriver.com/blog/…Masson
Yup, I'm using it, the API is a tad confusing, but definitely gets the job done. Pretty reasonable price as well. Support is very helpful.Darvon
M
6

Try ANTLR - There are a bunch of SQL grammars on there.

Mathematical answered 26/2, 2009 at 5:25 Comment(3)
any sample application in .NEt -with source code if is possible- that use AntLR ??Headlock
@PuterdoBorato link seems to be dead :(Lard
@Chris I was able to generate some SQL parsers using Antler 4.5.3 and . This something does not seem to be compilable though. Compiled parsers: github.com/another-guy/SqlSchemer/tree/… Steps you can follow to compile SQL parsers yourself (maybe you have more luck resolving the issues): github.com/another-guy/SqlSchemer/issues/2 If you do, please let me know. I still need a working parser...Lichenology
S
4

VSTS 2008 Database Edition GDR includes assemblies that handle SQL parsing and script generation that you can reference from your project. Database Edition uses the parser to parse the script files to represent in-memory model of your database and then uses the script generator to generate SQL scripts from the model. I think there are just two assemblies you need to have and reference in your project. If you don't have the database edition, you may install the trial version to get the assemblies or there might be another way to have them without installing the database edition. Check out the following link. Data Dude:Getting to the Crown Jewels .

Sacci answered 30/4, 2009 at 2:10 Comment(1)
Mehmet, VSTS 2008 Database Edition GDR manages Sql parsing for Oracle ??Headlock
B
2

Try GOLD Parser, it's a powerful and easy to learn BNF engine. You can search the grammars already made for what you want (ie: SQL ANSI 89 Grammar).

I started using this for HQL parsing (the NHibernate query language, very similar to SQL), and it's awesome.

UPDATE: Now the NH dev team has done the HQL parsing using ANTLR (which is harder to use, but more powerful AFAIK).

Braided answered 1/3, 2009 at 5:1 Comment(0)
U
1

As Diego suggested, grammars are the way to go IMHO. I've tried Coco/r before, but that is too simple for complex SQL. There's ANTLR with a number of grammars ready.

Someone even tried to build a SQL engine, check the code if there's something for you in SharpHSQL - An SQL engine written in C#.

Unhand answered 3/11, 2010 at 8:39 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.