- Distilled from https://gcc.gnu.org/wiki/GFortranStandards
- All PDF drafts carry line numbers in the left-hand margin.
- https://github.com/fujitsu/compiler-test-suite/tree/main/Fortran
- https://fortran-lang.discourse.group/t/fortran-compiler-testing-framework/1573
- https://github.com/scivision/fortran2018-examples
- https://github.com/fortran-lang/test-drive
- https://github.com/llvm/llvm-test-suite/tree/main/Fortran/gfortran
- https://github.com/OpenFortranProject/open-fortran-parser
- From ISO/IEC 1539-1:2023:
"The syntax rules are not a complete and accurate syntax description of Fortran, and cannot be used to generate a Fortran parser automatically; where a syntax rule is incomplete, it is restricted by corresponding constraints and text."
- The main scraping script is extract.sh. It calls tritext to extract the text from a PDF file, then runs a custom program, extraction, written solely to pull the EBNF out of that text. The script also uses an EBNF grammar written in ANTLR to parse and modify the scraped Fortran EBNF.
- There are many rules that end in `-list`. These are currently enumerated by hand in the extract.sh script, but they should be generated.
- The generated grammar is based on https://github.com/AkhilAkkapelli/Fortran2023Grammar, but there are several issues that need explanation.
- NAME is an identifier. PROGRAM is a keyword. The grammar does not distinguish keywords from non-keywords, but it should. In other words, NAME can remain NAME because that is what the spec calls it, but PROGRAM should be KW_PROGRAM so that the two uses can be told apart.
- NAME is defined here: https://github.com/AkhilAkkapelli/Fortran2023Grammar/blob/553123a023f70e9a524e2a4036be128978834c42/Fortran2023Lexer.g4#L502. But the keyword "NAME" is defined as NAAM here: https://github.com/AkhilAkkapelli/Fortran2023Grammar/blob/553123a023f70e9a524e2a4036be128978834c42/Fortran2023Lexer.g4#L152. The naming needs to be standardized.
- Beyond camel-case naming, `program : program_unit ( program_unit )*;` contains useless parentheses. These need to be removed.
- The order of the parser rules does not correspond to the spec.
- `mult_op : ASTERIK | SLASH ;` misnames a terminal. Terminal symbols should use the established Unicode character name: '*' is named ASTERISK, not ASTERIK.
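
The naming conventions proposed in these notes, and the generation of the `-list` rules, could be automated along the following lines. This is only a minimal Python sketch: the helper names (`terminal_name`, `keyword_token`, `list_rule`, `list_rules`) are hypothetical and are not part of extract.sh or the extraction program. The expansion used for `-list` rules assumes the standard's own convention that `xyz-list` is `xyz [ , xyz ] ...`. Note that under Unicode naming '/' comes out as SOLIDUS rather than SLASH.

```python
import re
import unicodedata


def terminal_name(ch):
    """Derive a lexer token name from a single character's Unicode name."""
    # unicodedata.name("*") is "ASTERISK". Spaces and hyphens are not valid
    # in ANTLR token names, so runs of them become a single underscore.
    # Only single characters are handled; multi-character operators such
    # as "**" would need their own naming rule.
    return re.sub(r"[^A-Z0-9]+", "_", unicodedata.name(ch))


def keyword_token(word):
    """Name the token for a keyword literal, kept apart from NAME by KW_."""
    return "KW_" + word.upper()


def list_rule(item):
    """Build the ANTLR rule for 'item-list' as item ( COMMA item )*."""
    rule = item.replace("-", "_")
    return f"{rule}_list : {rule} ( COMMA {rule} )* ;"


def list_rules(ebnf_text):
    """Find every symbol ending in -list and generate its rule once."""
    # Lookarounds keep the match from starting or ending mid-symbol.
    pattern = r"(?<![a-z0-9-])([a-z][a-z0-9-]*?)-list(?![a-z0-9-])"
    items = sorted(set(re.findall(pattern, ebnf_text)))
    return [list_rule(item) for item in items]


if __name__ == "__main__":
    for ch in "*/,":
        print(repr(ch), terminal_name(ch))
    print(keyword_token("program"))
    for rule in list_rules("call-stmt uses actual-arg-spec-list"):
        print(rule)
```

Unlike the parentheses in `program_unit ( program_unit )*`, the parentheses in the generated list rules are required, because they group two elements under the `*` operator.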