ANTLR4-Java Tutorial - Parsing Markup

We are defining a sort of BBCode markup, with tags delimited by square brackets.

The project was set up following the Java Setup section of the ANTLR Mega Tutorial. Original repository: https://github.com/unosviluppatore/antlr-mega-tutorial

The project can be built and run with gradle run, and the JUnit tests can be run with gradle test.

Parsing Markup

ANTLR can parse many things, including binary data; in that case tokens are made up of non-printable characters. A more common problem, however, is parsing markup languages such as XML or HTML. Markup is also a useful format to adopt for your own creations, because it lets you mix unstructured text content with structured annotations. Such formats fundamentally represent a form of smart document, containing both text and structured data. The technical term that describes them is island languages. This category is not restricted to markup, and sometimes it's a matter of perspective.

For example, you may have to build a parser that ignores preprocessor directives. In that case, you have to find a way to distinguish proper code from directives, which obey different rules.

In any case, the problem with parsing such languages is that there is a lot of text that we don't actually have to parse, but that we cannot ignore or discard, because the text contains useful information for the user and is a structural part of the document. The solution is lexical modes, a way to parse structured content inside a larger sea of free text.
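To make the idea of a lexical mode more concrete, here is a minimal hand-written sketch in plain Java. It is not the lexer that ANTLR generates and it is not taken from this project; the class name MarkupModesSketch, the two modes TEXT and TAG, and the token labels are all hypothetical. It only illustrates the principle: the lexer stays in a free-text mode until it meets an opening square bracket, switches to a tag mode, and switches back at the closing bracket.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration of lexical modes; NOT the lexer ANTLR generates.
class MarkupModesSketch {

    // Two hypothetical modes: free text (the "sea") and tag content (the "island").
    enum Mode { TEXT, TAG }

    // Splits the input into labelled tokens such as "TEXT(world)" or "TAG(b)".
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Mode mode = Mode.TEXT;
        for (char c : input.toCharArray()) {
            if (mode == Mode.TEXT && c == '[') {
                // Entering a tag: emit the text collected so far and switch mode.
                flush(tokens, "TEXT", current);
                mode = Mode.TAG;
            } else if (mode == Mode.TAG && c == ']') {
                // Leaving a tag: emit its content and return to free text.
                flush(tokens, "TAG", current);
                mode = Mode.TEXT;
            } else {
                current.append(c);
            }
        }
        // Whatever is left is emitted under the mode we ended in
        // (an unterminated tag is therefore still reported as TAG).
        flush(tokens, mode == Mode.TEXT ? "TEXT" : "TAG", current);
        return tokens;
    }

    // Emits the buffer as one token, skipping empty buffers, then clears it.
    private static void flush(List<String> tokens, String type, StringBuilder buffer) {
        if (buffer.length() > 0) {
            tokens.add(type + "(" + buffer + ")");
            buffer.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello [b]world[/b]!"));
    }
}
```

The free text is kept as tokens rather than discarded, which is the point made above: it is part of the document even though it has no internal structure to parse.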

Besides markup languages, lexical modes are typically used to deal with string interpolation, that is, when a string literal can contain more than simple text, for instance arbitrary expressions.
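The string interpolation case can be sketched in the same hand-written style. The ${...} syntax, the class name InterpolationSketch, and the STRING and EXPR labels are assumptions made for illustration only; the source does not specify any of them. The sketch keeps the current mode on a stack, loosely mirroring the pushMode and popMode commands available in ANTLR lexer grammars, and it does not handle nested braces inside an expression.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hand-written illustration of a mode stack; NOT code generated by ANTLR.
class InterpolationSketch {

    // Hypothetical modes: literal string text and an embedded expression.
    enum Mode { STRING, EXPR }

    // Splits the body of a string literal into STRING(...) and EXPR(...) tokens.
    static List<String> tokenize(String literal) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Deque<Mode> modes = new ArrayDeque<>();
        modes.push(Mode.STRING);
        int i = 0;
        while (i < literal.length()) {
            char c = literal.charAt(i);
            if (modes.peek() == Mode.STRING && literal.startsWith("${", i)) {
                // Assumed opening delimiter: push the expression mode.
                flush(tokens, "STRING", current);
                modes.push(Mode.EXPR);
                i += 2;
            } else if (modes.peek() == Mode.EXPR && c == '}') {
                // Closing delimiter: pop back to the enclosing mode.
                flush(tokens, "EXPR", current);
                modes.pop();
                i++;
            } else {
                current.append(c);
                i++;
            }
        }
        // Emit any remaining characters under the mode on top of the stack.
        flush(tokens, modes.peek() == Mode.STRING ? "STRING" : "EXPR", current);
        return tokens;
    }

    // Emits the buffer as one token, skipping empty buffers, then clears it.
    private static void flush(List<String> tokens, String type, StringBuilder buffer) {
        if (buffer.length() > 0) {
            tokens.add(type + "(" + buffer + ")");
            buffer.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hi ${name}, total: ${a + b}"));
    }
}
```

In a real grammar the expression mode would produce the ordinary tokens of the host language instead of one opaque EXPR token; the sketch only shows where the mode switches happen.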
