Recursive expression parsing in Sprache
Clash Royale CLAN TAG#URR8PPP
Recursive expression parsing in Sprache
I am building a Sprache parser to parse expressions similar to SQL search conditions. e.g Property = 123
or Property > AnotherProperty
Property = 123
Property > AnotherProperty
So far both of those examples work, however I am struggling to figure out what I need to do to allow ANDing/ORing conditions and parenthesis.
Basically I have this so far:
private static readonly Parser<string> Operators =
Parse.String("+").Or(Parse.String("-")).Or(Parse.String("="))
.Or(Parse.String("<")).Or(Parse.String(">"))
.Or(Parse.String("<=")).Or(Parse.String(">=")).Or(Parse.String("<>"))
.Text();
private static readonly Parser<IdentifierExpression> Identifier =
from first in Parse.Letter.Once()
from rest in Parse.LetterOrDigit.Many()
select new IdentifierExpression(first.Concat(rest).ToArray());
public static readonly Parser<Expression> Integer =
Parse.Number.Select(n => new IntegerExpression Value = int.Parse(n));
public static readonly Parser<SearchCondition> SearchCondition =
from left in Identifier.Or(Number)
from op in Operators.Token()
from right in Identifier.Or(Number)
select new SearchCondition Left = left, Right = right, Operator = op;
This works for the simple cases above, but now I need a pointer on how to implement conditions like:
PropertyX = PropertyY OR PropertyX = PropertyZ
PropertyA > PropertyB AND (OtherAnotherProperty = 72 OR OtherAnotherProperty = 150)
Can anyone give me an idea on how to structure the parsers for this sort of thing?
IdentifierExpression
1 Answer
1
What you have so far is a basic comparison expression parser. It looks like you want to wrap that in a parser that handles logical expressions (and
, or
, etc.) with sub-expression support.
and
or
The code I posted at first was ripped from poorly-tested code I was still working on, which didn't handle statements with multiple terms. My understanding of the ChainOperator
method was clearly incomplete.
ChainOperator
Parse.ChainOperator
is the method that lets you specify your operators and have them appear 0-to-many times in the expression. I was making assumptions about how it worked that turned out to be just wrong.
Parse.ChainOperator
I've rewritten the code and added a few bits to make it easier to use:
// Helpers to make access simpler
public static class Condition
// For testing, will fail all variable references
public static Expression<Func<object, bool>> Parse(string text)
=> ConditionParser<object>.ParseCondition(text);
public static Expression<Func<T, bool>> Parse<T>(string text)
=> ConditionParser<T>.ParseCondition(text);
public static Expression<Func<T, bool>> Parse<T>(string text, T instance)
=> ConditionParser<T>.ParseCondition(text);
public static class ConditionParser<T>
static ParameterExpression Parm = Expression.Parameter(typeof(T), "_");
public static Expression<Func<T, bool>> ParseCondition(string text)
=> Lambda.Parse(text);
static Parser<Expression<Func<T, bool>>> Lambda =>
OrTerm.End().Select(body => Expression.Lambda<Func<T, bool>>(body, Parm));
// lowest priority first
static Parser<Expression> OrTerm =>
Parse.ChainOperator(OpOr, AndTerm, Expression.MakeBinary);
static Parser<ExpressionType> OpOr = MakeOperator("or", ExpressionType.OrElse);
static Parser<Expression> AndTerm =>
Parse.ChainOperator(OpAnd, NegateTerm, Expression.MakeBinary);
static Parser<ExpressionType> OpAnd = MakeOperator("and", ExpressionType.AndAlso);
static Parser<Expression> NegateTerm =>
NegatedFactor
.Or(Factor);
static Parser<Expression> NegatedFactor =>
from negate in Parse.IgnoreCase("not").Token()
from expr in Factor
select Expression.Not(expr);
static Parser<Expression> Factor =>
SubExpression
.Or(BooleanLiteral)
.Or(BooleanVariable);
static Parser<Expression> SubExpression =>
from lparen in Parse.Char('(').Token()
from expr in OrTerm
from rparen in Parse.Char(')').Token()
select expr;
static Parser<Expression> BooleanValue =>
BooleanLiteral
.Or(BooleanVariable);
static Parser<Expression> BooleanLiteral =>
Parse.IgnoreCase("true").Or(Parse.IgnoreCase("false"))
.Text().Token()
.Select(value => Expression.Constant(bool.Parse(value)));
static Parser<Expression> BooleanVariable =>
Parse.Regex(@"[A-Za-z_][A-Za-z_d]*").Token()
.Select(name => VariableAccess<bool>(name));
static Expression VariableAccess<TTarget>(string name)
MemberInfo mi = typeof(T).GetMember(name, MemberTypes.Field
// Helper: define an operator parser
static Parser<ExpressionType> MakeOperator(string token, ExpressionType type)
=> Parse.IgnoreCase(token).Token().Return(type);
And some examples:
static class Program
static void Main()
// Parser with no input
var condition1 = Condition.Parse("true and false or true");
Console.WriteLine(condition1.ToString());
var fn1 = condition1.Compile();
Console.WriteLine("t=0", fn1(null));
// Parser with record input
var record1 = new a = true, b = false ;
var record2 = new a = false, b = true ;
var condition2 = Condition.Parse("a and b or not a", record);
Console.WriteLine(condition2.ToString());
var fn2 = condition2.Compile();
Console.WriteLine("t0 => 1", record1.ToString(), fn2(record1));
Console.WriteLine("t0 => 1", record2.ToString(), fn2(record2));
You will still need to add your own parsers for handling comparison expressions and so on. Plug those into the BooleanValue
parser after the existing terms:
BooleanValue
static Parser<Expression> BooleanValue =>
BooleanLiteral
.Or(BooleanVariable)
.Or(SearchCondition);
I'm doing something similar with a more C#-style filter specification with type checking during the parse phase and separate parsers for strings vs numbers.
Thanks @Corey... unless I am missing something this will not parse expressions like
true and false or true
- it will only handle two expressions (obviously two is fine for true/false but I would want to handle any number of conditions (e.g Prop1 = A AND Prop2 = B AND Prop3 = C
)– Simon
Aug 8 at 18:39
true and false or true
Prop1 = A AND Prop2 = B AND Prop3 = C
Well damn. I thought I tested that. Will fix and update the answer.
– Corey
Aug 8 at 23:02
Done. Incidentally this is based on the structure used in SimpleCalc. I just changed some things to handle boolean conditions instead of arithmetic. At some point I'll post the stuff I'm working on to GitHub.
– Corey
Aug 9 at 6:08
Well, thank you Sir. That is very good. I also found this github.com/anpv/SpracheTest which has a similar approach to you - now I just need to figure out how it all works :) Thanks very much
– Simon
Aug 9 at 18:30
Fairly sure the recursive magic happens in the
Parse.ChainOperator
code. I haven't read the Sprache code enough to figure out how it works however.– Corey
Aug 9 at 23:40
Parse.ChainOperator
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
What is
IdentifierExpression
? Custom LINQ Expression to access your data?– Corey
Aug 8 at 1:59