Syntax Guide
otac0n edited this page Sep 18, 2012 · 18 revisions
A Pegasus grammar consists of a text file with two sections, in order:
- The "Settings" section.
- The "Rules" section.
Settings are specified in one of three ways:
- `@setting value`: For simple values, just write the setting value out. This is parsed as a type name.
- `@setting { value }`: For more complex values, wrap the setting value in curly braces. This is parsed as a code section.
- `@setting "value"`: An alternative to using curly braces is to use a string.

The following settings are supported:
- `@namespace`: Specifies the namespace in which the parser class will be placed.
- `@accessibility`: Specifies the accessibility of the generated class.
- `@classname`: Specifies the name of the generated class.
- `@using`: Adds a using directive to the generated class file. (Multiple allowed.)
- `@members`: Allows for the definition of additional class members.

For example:

    @namespace MyProject.Parsers
    @accessibility internal
    @classname MyParser
    @using System.Linq
    @using { Foo = System.String }
    @members
    {
        private static bool HelperFunction()
        {
            // Placeholder body so the member compiles; replace with real logic.
            return true;
        }
    }

The basic syntax of a rule is:

    name = expression

By default, rules have a return type of `string`. This can be modified by specifying a type for the rule, like so:

    name <type> = expression

Rule flags are Boolean settings that are enabled on a per-rule basis. Flags come after the rule type, if there is one:

    rule -flag = expression
    rule <type> -flag = expression

Currently, the only supported rule flag is the `-memoize` flag, which enables memoization for a particular rule.
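
To make the effect of memoization concrete, here is a minimal Python sketch. It is not Pegasus's generated C# code, and the names `memoize_rule`, `digits`, and `call_log` are hypothetical. It assumes memoization means caching a rule's result per start position, so that a rule retried at the same position during backtracking is not evaluated again.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A rule takes (text, pos) and returns (value, new_pos) on success or None on failure.
Result = Optional[Tuple[str, int]]
Rule = Callable[[str, int], Result]

def memoize_rule(rule: Rule) -> Rule:
    """Wrap a rule so that each (text, start position) pair is evaluated only once."""
    cache: Dict[Tuple[str, int], Result] = {}

    def wrapper(text: str, pos: int) -> Result:
        key = (text, pos)
        if key not in cache:
            cache[key] = rule(text, pos)  # failures (None) are cached too
        return cache[key]

    return wrapper

call_log: List[int] = []  # start positions at which the underlying rule actually ran

def digits(text: str, pos: int) -> Result:
    """Match one or more ASCII digits starting at pos."""
    call_log.append(pos)
    end = pos
    while end < len(text) and text[end] in "0123456789":
        end += 1
    return (text[pos:end], end) if end > pos else None

memo_digits = memoize_rule(digits)
```

Calling `memo_digits` twice at the same position runs `digits` only once; the second call is answered from the cache.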
- String `'foo'` or `"bar"`: String expressions match a string literally.
- Character Class `[a-z]`: Matches a single character that is within the character class.
- Wildcard `.`: Wildcard expressions match any single character.

Strings and character classes can be marked as case-insensitive by suffixing the string or class with the letter `i`. For example: `"foo"i`, `'bar'i`, `[baz]i`.
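
The three terminal expressions and the `i` suffix can be illustrated with a small Python sketch. This shows the intended matching behavior only, not how Pegasus implements it. The function names are hypothetical, and a character class is passed as an already expanded string of its members (range expansion such as `a-z` is left out).

```python
from typing import Optional

def match_string(text: str, pos: int, literal: str, ignore_case: bool = False) -> Optional[int]:
    """String expression: return the position after `literal` if it occurs at `pos`."""
    chunk = text[pos:pos + len(literal)]
    if ignore_case:
        ok = chunk.lower() == literal.lower()
    else:
        ok = chunk == literal
    return pos + len(literal) if ok else None

def match_class(text: str, pos: int, members: str, ignore_case: bool = False) -> Optional[int]:
    """Character class: match exactly one character that is in `members`."""
    if pos >= len(text):
        return None
    ch = text[pos]
    if ignore_case:
        ok = ch.lower() in members.lower()
    else:
        ok = ch in members
    return pos + 1 if ok else None

def match_any(text: str, pos: int) -> Optional[int]:
    """Wildcard: match any single character; fails only at end of input."""
    return pos + 1 if pos < len(text) else None
```

With these definitions, `match_string("FooBar", 0, "foo")` fails, while the same call with `ignore_case=True` succeeds, which mirrors the difference between `"foo"` and `"foo"i`.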
- Name `a`: Name expressions refer to a rule by name.
- Labeled `foo:a`: Labeled expressions store a parse result for use in code assertions and expressions.
- Sequence `a b c`: Sequence expressions match each component consecutively.
- Choice `a / b / c`: Choice expressions provide options for parsing. The options are evaluated in order.
- Assertions `!a &b`: Assertion expressions act as look-aheads. They only peek at the parsing subject; they do not advance the cursor.
- Code Assertions `!{foo} &{bar}`: Code assertions are similar to regular assertions. They represent C# code that returns a Boolean value, rather than performing a look-ahead.
- Repetition `a? b+ c*`: Repetition expressions allow another expression to be repeated.
- Parentheses `( ... )`: Parentheses are used to group expressions.
- Type `(<type> ... )`: Type expressions allow part of a rule to have a certain return type. This has the same meaning as having a type for a rule, except that it is constrained to the expression wrapped by the parentheses.
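
The way sequences, choices, assertions, and repetitions combine can be sketched with a few parser combinators in Python. This is an illustration that assumes standard PEG semantics (options tried in order, first success wins, look-aheads consume nothing); it is not Pegasus's implementation, and every function name here is hypothetical.

```python
from typing import Callable, Optional

# A parser takes (text, pos) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """String expression: match `s` literally."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def seq(*parts: Parser) -> Parser:
    """Sequence `a b c`: every part must match, one after the other."""
    def run(text: str, pos: int) -> Optional[int]:
        cur: Optional[int] = pos
        for part in parts:
            cur = part(text, cur)
            if cur is None:
                return None
        return cur
    return run

def choice(*options: Parser) -> Parser:
    """Choice `a / b / c`: try options in order and return the first success."""
    def run(text: str, pos: int) -> Optional[int]:
        for option in options:
            end = option(text, pos)
            if end is not None:
                return end
        return None
    return run

def and_(p: Parser) -> Parser:
    """Assertion `&a`: succeed if `a` matches, without advancing the cursor."""
    return lambda text, pos: pos if p(text, pos) is not None else None

def not_(p: Parser) -> Parser:
    """Assertion `!a`: succeed if `a` does not match, without advancing the cursor."""
    return lambda text, pos: pos if p(text, pos) is None else None

def star(p: Parser) -> Parser:
    """Repetition `a*`: match `a` as many times as possible, possibly zero."""
    def run(text: str, pos: int) -> Optional[int]:
        while True:
            end = p(text, pos)
            if end is None or end == pos:  # stop on failure or on no progress
                return pos
            pos = end
    return run

def plus(p: Parser) -> Parser:
    """Repetition `a+`: one match followed by zero or more further matches."""
    return seq(p, star(p))

def opt(p: Parser) -> Parser:
    """Repetition `a?`: match `a` if possible, otherwise succeed without consuming."""
    return choice(p, lambda text, pos: pos)
```

Under these assumptions, `choice(lit("a"), lit("ab"))` applied to `"ab"` stops after the first option succeeds and never tries `"ab"`, so the order of the options matters.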
- Code `{ code }`: Code expressions contain C# code that specifies the result of an expression. Code expressions must come at the end of a sequence.
- Error `#ERROR{ code }`: Error-type code expressions are a special type of code expression. The result of an error expression becomes an error message for an exception. Error-type code expressions must also come at the end of sequences.
- State `#STATE{ code; }`: State-type code expressions allow for stateful parsing. The code in a state-type code expression is allowed to modify the `state` object in a way that supports backtracking and memoization. State expressions may appear anywhere in a rule definition.
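
One way for state changes to coexist with backtracking is to carry a state snapshot alongside the cursor, so that a failed alternative simply discards its snapshot. The Python sketch below illustrates that idea only. It is an assumption about the general technique, not a description of how Pegasus handles its `state` object, and all names in it are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, int]
Cursor = Tuple[int, State]  # (position, state snapshot)
Parser = Callable[[str, Cursor], Optional[Cursor]]

def lit(s: str) -> Parser:
    """Match `s` literally; the state snapshot is passed through unchanged."""
    def run(text: str, cur: Cursor) -> Optional[Cursor]:
        pos, state = cur
        return (pos + len(s), state) if text.startswith(s, pos) else None
    return run

def set_state(key: str, value: int) -> Parser:
    """State action: consume nothing and return an updated copy of the state."""
    def run(text: str, cur: Cursor) -> Optional[Cursor]:
        pos, state = cur
        return (pos, {**state, key: value})  # the caller's snapshot is not mutated
    return run

def seq(*parts: Parser) -> Parser:
    """Run each part in turn, threading the cursor (and its state) through."""
    def run(text: str, cur: Cursor) -> Optional[Cursor]:
        current: Optional[Cursor] = cur
        for part in parts:
            current = part(text, current)
            if current is None:
                return None
        return current
    return run

def choice(*options: Parser) -> Parser:
    """Try options in order; each one starts from the same original cursor."""
    def run(text: str, cur: Cursor) -> Optional[Cursor]:
        for option in options:
            result = option(text, cur)
            if result is not None:
                return result
        return None
    return run
```

For instance, with `p = choice(seq(set_state("depth", 1), lit("x")), lit("y"))`, parsing `"y"` returns a cursor whose state is still empty: the `depth` change made by the failed first option does not leak into the second.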

Comments can be written in two forms:
- `/* ... */`: Multi-line comment
- `// ...`: Single-line comment