Parsy is an easy way to combine simple, small parsers into complex, larger parsers. If it means anything to you, it's a monadic parser combinator library for LL(infinity) grammars in the spirit of Parsec, Parsnip, and Parsimmon.
Parsy requires Python 3.3 or greater.
This repo is no longer maintained. Instead, see https://github.com/python-parsy/parsy
- `string(expected_string)`

  Returns a parser that expects `expected_string` and produces that string value.

- `regex(exp, [flags=0])`

  Returns a parser that expects the given `exp` and produces the matched string. `exp` can be a compiled regular expression, or a string which will be compiled with the given `flags`.

- `success(val)`

  Returns a parser that does not consume any of the stream, but produces `val`.
- `parser.parse(string)`

  Attempts to parse the given `string`. If the parse is successful and consumes the entire string, the result is returned; otherwise, a `ParseError` is raised.

- `parser.parse_partial(string)`

  Similar to `parse`, except that it does not require the entire string to be consumed. Returns a tuple of `(result, rest_of_string)`, where `rest_of_string` is the part of the string that was left over.

- `parser | other_parser`

  Returns a parser that tries `parser` and, if it fails, backtracks and tries `other_parser`. These can be chained together.

  The resulting parser will produce the value produced by the first successful parser.

      >>> parser = string('x') | string('y') | string('z')
      >>> parser.parse('x')
      'x'
      >>> parser.parse('y')
      'y'
      >>> parser.parse('z')
      'z'
- `parser.then(other_parser)` (also `parser >> other_parser`)

  Returns a parser which, if `parser` succeeds, will continue parsing with `other_parser`. This will produce the value produced by `other_parser`.

      >>> (string('x') >> string('y')).parse('xy')
      'y'
- `parser.skip(other_parser)` (also `parser << other_parser`)

  Similar to `then` (or `>>`), except the resulting parser will use the value produced by the first parser.

      >>> (string('x') << string('y')).parse('xy')
      'x'
- `parser.many()`

  Returns a parser that expects `parser` 0 or more times, and produces a list of the results. Note that this parser can never fail; at worst it produces an empty list.

      >>> parser = regex(r'[a-z]').many()
      >>> parser.parse('')
      []
      >>> parser.parse('abc')
      ['a', 'b', 'c']
- `parser.times(min [, max=min])`

  Returns a parser that expects `parser` at least `min` times, and at most `max` times, and produces a list of the results. If only one argument is given, the parser is expected exactly that number of times.

- `parser.at_most(n)`

  Returns a parser that expects `parser` at most `n` times, and produces a list of the results.

- `parser.at_least(n)`

  Returns a parser that expects `parser` at least `n` times, and produces a list of the results.

- `parser.map(fn)`

  Returns a parser that transforms the produced value of `parser` with `fn`.

      >>> regex(r'[0-9]+').map(int).parse('1234')
      1234
- `parser.result(val)`

  Returns a parser that, if `parser` succeeds, always produces `val`.

      >>> string('foo').result(42).parse('foo')
      42
- `parser.bind(fn)`

  Returns a parser which, if `parser` is successful, passes the result to `fn`, and continues with the parser returned from `fn`. This is the monadic binding operation.
The most powerful way to construct a parser is to use the `generate` decorator. `parsy.generate` creates a parser from a generator that should yield parsers. These parsers are applied successively and their results are sent back to the generator using the `.send()` protocol. The generator should return the result or another parser, which is equivalent to applying it and returning its result.

    from parsy import generate, regex, string

    # lparen, rparen and expr are parsers assumed to be defined elsewhere.

    @generate
    def form():
        """
        Parse an s-expression form, like (a b c).
        An equivalent to lparen >> expr.many() << rparen
        """
        yield lparen
        exprs = yield expr.many()
        yield rparen
        return exprs

    @generate
    def exact_number():
        """
        Parse specified number of expressions, like
        4: a b c d
        """
        num = int(yield regex(r'[0-9]+'))  # or .map(int)
        yield string(':')
        return expr.times(num)

Note that there is no guarantee that the entire function is executed: if any of the yielded parsers fails, parsy will try to backtrack to an alternative parser.