Description
I've tried parser combinator libraries across multiple languages, and I've never seen the kind of hard distinction between tokens and parsers that this library has. Perhaps I wasn't paying attention (this library is actually the one I've been least frustrated with), but it's interesting and comes with a set of advantages and disadvantages. I'm interested in why this decision was made. I'd also like to get feedback on my own understanding of the concepts. This might also help you write good docs, or I'd be willing to write them and open a PR if my understanding is good enough. Feel free to close and ignore if neither of these discussions interests you.
The advantage is that it's very, very clear (after a bit of conceptual learning) what each piece of a grammar is for. Tokens are specifically about character sequence recognition, while parsers are about token sequence recognition and mapping. Once you get the distinction, it's easy to write grammars.
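To check my understanding of that distinction, here is a toy sketch in plain Kotlin (this is my own model, not better-parse's actual types or API): a token is purely about recognizing a character run, and the tokenizer's output is a token sequence that parsers would then match and map.

```kotlin
// Toy model of the token side only (NOT better-parse's API):
// a Token recognizes a character sequence at a position in the input.
data class Token(val name: String, val pattern: Regex)

// A tokenizer turns the raw character stream into (token name, text) pairs;
// parsers would then operate on this token sequence, never on raw characters.
fun tokenize(input: String, tokens: List<Token>): List<Pair<String, String>> {
    val result = mutableListOf<Pair<String, String>>()
    var pos = 0
    while (pos < input.length) {
        val match = tokens.firstNotNullOfOrNull { t ->
            t.pattern.find(input, pos)
                ?.takeIf { it.range.first == pos }  // must match at the cursor
                ?.let { t.name to it.value }
        } ?: error("no token matches at position $pos")
        if (match.first != "ws") result += match  // skip whitespace tokens
        pos += match.second.length
    }
    return result
}

fun main() {
    val tokens = listOf(
        Token("id", Regex("\\w+")),
        Token("ws", Regex("\\s+")),
    )
    println(tokenize("foo bar", tokens))  // [(id, foo), (id, bar)]
}
```

In this model the layering is explicit: character recognition is finished before any parser runs, which is exactly what makes it clear what each piece of a grammar is for.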
I see two main disadvantages.
- Being forced to declare tokens separately from parsers feels redundant. Consider `val id by regexToken("\\w+") use { text }`. This creates both a token and a parser, but only registers the parser. The workaround of `val idToken by regexToken(...)` followed by `val idParser by idToken use { text }` is fine, but feels very clunky.
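To make the redundancy point concrete, here is a toy sketch of what a combined declaration could look like. Everything here is hypothetical (the `Registry` class and `tokenParser` helper are my inventions, not better-parse's API): a single helper that both registers the token for the tokenizer and returns the mapped parser.

```kotlin
// Toy sketch only: Registry and tokenParser are hypothetical,
// illustrating a single declaration that does both jobs.
class Registry {
    // Patterns registered for the tokenizer (the character-level side).
    val tokenPatterns = mutableListOf<String>()

    // Registers the token AND returns a parser-like mapping over the
    // matched text (the token-level side), in one declaration.
    fun <T> tokenParser(pattern: String, map: (String) -> T): (String) -> T {
        tokenPatterns += pattern
        return map
    }
}

fun main() {
    val g = Registry()
    val id = g.tokenParser("\\w+") { it.uppercase() }
    println(g.tokenPatterns)  // token side is registered: [\w+]
    println(id("hello"))      // parser side maps the text: HELLO
}
```

If something like this fits the library's design, it would remove the two-declaration workaround; if it can't (e.g. because token registration must happen through delegation), that constraint would be worth spelling out in the docs.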
- It's not very easy to combine grammars, or to reuse grammars as parsers in a parent grammar, specifically because tokens are separate entities. Consider two grammars A and B. If I want a third grammar C that expresses `A or B`, simply setting C's rootParser to that expression is insufficient, because C doesn't have the tokens defined in A and B; in fact, it has no tokens at all. This problem gets worse with more grammars and deeper nesting. It's also not clear from your docs how such a merge operation should work.
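The merge problem can be sketched in isolation (again a toy model, not better-parse's types): if a grammar owns both a token set and a root parser, then composing C = A or B only works if C also takes the union of A's and B's token sets, not just the disjunction of their root parsers.

```kotlin
// Toy model: a grammar owns its tokens as well as its root parser,
// so composing grammars must merge token sets, not just parsers.
data class ToyGrammar(val tokenNames: Set<String>, val root: String)

// Hypothetical merge: C recognizes A-or-B and carries tokens from both.
fun or(a: ToyGrammar, b: ToyGrammar): ToyGrammar =
    ToyGrammar(a.tokenNames + b.tokenNames, "(${a.root}) | (${b.root})")

fun main() {
    val a = ToyGrammar(setOf("id"), "id")
    val b = ToyGrammar(setOf("num"), "num")
    val c = or(a, b)
    println(c.tokenNames)  // [id, num]: C needs the tokens of both A and B
    println(c.root)        // (id) | (num)
}
```

Even in this trivial model, open questions appear that the docs could address: what happens when A and B define overlapping or conflicting tokens, and whose token ordering (and therefore matching priority) wins in C.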
Are these assessments fair? Am I missing something?
Thanks.