A parser for Julia using Tokenize that aims to extend the built-in parser by providing additional meta information along with the resultant AST.
using Pkg
Pkg.add("CSTParser")
using CSTParser
CSTParser.parse("x = y + 123")
CSTParser.EXPR nodes are broadly equivalent to Base.Expr in structure. The key differences are additional fields that store, for each expression:
- trivia tokens such as punctuation or keywords that are not stored as part of the AST but are needed for the CST representation;
- the span measurements for an expression;
- the textual representation of the token (only needed for certain tokens including identifiers (symbols), operators and literals);
- the parent expression, if present; and
- any other meta information (this field is untyped and is used within CSTParser to hold errors).
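As a rough sketch of how these fields can be inspected, the snippet below parses a short expression and prints the node type's field names before reading a few of them. The names `span`, `fullspan`, `parent` and `meta` are assumptions inferred from the list above, so compare them with the `fieldnames` output for your installed version.

```julia
using CSTParser

src = "x = y + 123"
cst = CSTParser.parse(src)

# Print the actual field names of the node type for the installed version.
println(fieldnames(CSTParser.EXPR))

# Assumed field names (verify against the output above):
println("span     = ", cst.span)       # width of the expression itself
println("fullspan = ", cst.fullspan)   # width including trailing whitespace
println("parent   = ", cst.parent)     # expected to be `nothing` for the root
println("meta     = ", cst.meta)       # untyped; CSTParser uses it for errors
```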
All .head values used in Expr are used in EXPR. Unlike in the AST, tokens (terminal expressions with no child expressions) are stored as EXPR, and additional head types are used to distinguish between different types of token. These possible head values include:
:IDENTIFIER
:NONSTDIDENTIFIER (e.g. var"id")
:OPERATOR
# Punctuation
:COMMA
:LPAREN
:RPAREN
:LSQUARE
:RSQUARE
:LBRACE
:RBRACE
:ATSIGN
:DOT
# Keywords
:ABSTRACT
:BAREMODULE
:BEGIN
:BREAK
:CATCH
:CONST
:CONTINUE
:DO
:ELSE
:ELSEIF
:END
:EXPORT
:FINALLY
:FOR
:FUNCTION
:GLOBAL
:IF
:IMPORT
:LET
:LOCAL
:MACRO
:MODULE
:MUTABLE
:NEW
:OUTER
:PRIMITIVE
:QUOTE
:RETURN
:STRUCT
:TRY
:TYPE
:USING
:WHILE
# Literals
:INTEGER
:BININT (0b0)
:HEXINT (0x0)
:OCTINT (0o0)
:FLOAT
:STRING
:TRIPLESTRING
:CHAR
:CMD
:TRIPLECMD
:NOTHING
:TRUE
:FALSE
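To see which of these head values turn up for a given input, one option is to walk the tree and collect every terminal node. The helper below, `collect_tokens!`, is hypothetical (it is not part of CSTParser), and it assumes that tokens have neither `.args` nor `.trivia`, that their text lives in a field called `val`, and that a head may itself be an EXPR (for example an operator node).

```julia
using CSTParser

const Token = Tuple{Symbol,Union{Nothing,String}}

# Recursively gather (head, text) for every terminal node under `x`.
function collect_tokens!(out::Vector{Token}, x::CSTParser.EXPR)
    h = x.head
    # Assumption: the head may itself be an EXPR (e.g. an operator node); visit it too.
    h isa CSTParser.EXPR && collect_tokens!(out, h)
    if x.args === nothing && x.trivia === nothing
        # No children: treat as a token. `val` is an assumed field name.
        h isa Symbol && push!(out, (h, x.val))
    else
        x.args === nothing || foreach(c -> collect_tokens!(out, c), x.args)
        x.trivia === nothing || foreach(c -> collect_tokens!(out, c), x.trivia)
    end
    return out
end

tokens = collect_tokens!(Token[], CSTParser.parse("x = y + 123"))
foreach(println, tokens)
```

Tokens are listed in traversal order (head, then `.args`, then `.trivia`), not in source order.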
The ordering of .args members matches that in Base.Expr, and members of .trivia are stored in the order in which they appear in text.
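A small comparison can illustrate the two orderings. This is a sketch: `val` is an assumed field name for a token's text, and the CSTParser output depends on the installed version.

```julia
using CSTParser

src = "f(x, 1)"
cst = CSTParser.parse(src)
ast = Meta.parse(src)

println(ast.args)                       # Any[:f, :x, 1]
println([a.val for a in cst.args])      # text of each .args member, same order as above
println([t.head for t in cst.trivia])   # punctuation heads, in source order
```

For a call such as this one, the parentheses and comma should appear only in `.trivia`, leaving `.args` aligned with the Base.Expr arguments.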