Since parser.c and friends can be generated from grammar.js (or grammar.json), wouldn't it be ok to not commit them to one's grammar / parser repository?
On a few occasions (1, 2, 3), maxbrunsfeld has recommended including the generated parser source.
Also, currently, various parties (e.g. nvim-treesitter, tree-sitter-langs, Emacs 29+, Cursorless, difftastic, helix-editor, semgrep, etc.) assume the inclusion of these files to varying degrees.
In late 2020, maxbrunsfeld sketched out a draft plan to move away from doing this. A few months later(?), a form of this was added to the Tree-sitter 1.0 Checklist (search for "Mergeable Git Repos").
Of the checked repositories, it appears that around 93% have src/parser.c committed.
At the moment, then, it appears most folks commit these files, and numerous projects that use tree-sitter assume this kind of setup.
Not doing so probably means that it's less likely for the grammar / parser in question to get used as widely.
There appear to be what might be considered compromise options:
- https://github.com/alex-pinkus/tree-sitter-swift#where-is-your-parserc (related issue)
- https://github.com/DerekStride/tree-sitter-sql#installation (related issue)
It's probably worth noting that there are security implications for what most folks are doing. (There might be some related activity attempting to address some of the concerns.)
See the section of the corresponding name in the repository README.
- Ensure the current working directory is the repository root directory.
cd questions/should-parser-source-be-committed
- Invoke
sh ./script/list-provides-parser-c.sh
- Observe output that ends like:
Minimum number of repositories with parser.c: 287
Number of repositories: 308
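As a quick check of the "around 93%" figure, the two counts in this output can be divided directly. This is a minimal sketch that assumes awk is available; the function name share_with_parser_c is made up for illustration and is not part of the repository's scripts.

```shell
# Hypothetical helper: percentage of repositories that have parser.c.
# $1 = number of repositories with parser.c, $2 = total number of repositories.
share_with_parser_c() {
    awk -v have="$1" -v total="$2" 'BEGIN { printf "%.1f\n", have / total * 100 }'
}

# Counts taken from the output above.
share_with_parser_c 287 308   # prints 93.2
```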
Tip: to get a list (possibly imperfect) of which repositories do not have a parser.c generated by the tree-sitter cli, invoke:
sh ./script/list-no-parser.c.sh
This should produce output that ends something like:
tree-sitter-swift.alex-pinkus
tree-sitter-systemrdl.SystemRDL
tree-sitter-teal.euclidianAce
tree-sitter-vcd.wavedrom
tree-sitter-zeek.zeek
Did not find parser.c in: 21
Number of repositories: 308
Note that this search may not be perfect because:
- A repository may be providing generated files in a separate branch (e.g. tree-sitter-swift).
- Some repositories contain a file that happens to be named parser.c that isn't generated by the tree-sitter cli (e.g. some things by stsewd). To account for this, the search code doesn't use the find command but rather globs and for loops. However, the chosen method might fail in some cases -- though that doesn't seem to have happened yet.
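The glob-and-loop idea can be sketched roughly as follows. This is not the repository's actual script. It assumes every grammar is cloned as a direct subdirectory of one root directory and that a generated parser lives at src/parser.c. The function name list_missing_parser_c is hypothetical.

```shell
# Hypothetical sketch: list repositories under "$1" that lack src/parser.c.
# Only the conventional path is tested, so an unrelated file named parser.c
# elsewhere in a repository (which find would report) is not counted.
list_missing_parser_c() {
    root=$1
    total=0
    missing=0
    for repo in "$root"/*/; do
        # An unmatched glob is left as the literal pattern; skip it.
        [ -d "$repo" ] || continue
        total=$((total + 1))
        if [ ! -f "${repo}src/parser.c" ]; then
            missing=$((missing + 1))
            name=${repo%/}              # drop the trailing slash
            printf '%s\n' "${name##*/}" # print only the directory name
        fi
    done
    printf 'Did not find parser.c in: %s\n' "$missing"
    printf 'Number of repositories: %s\n' "$total"
}

# Usage (the directory name is an assumption):
#   list_missing_parser_c ./repos
```

A check like this shares the first limitation listed above: a repository that keeps its generated files on a separate branch is still reported as missing parser.c.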