Should Generated Parser Source Be Committed?

Since parser.c and friends can be generated from grammar.js (or grammar.json), would it be OK to leave them out of one's grammar / parser repository?

Discussion

On a few occasions (1, 2, 3), maxbrunsfeld has recommended including the generated parser source.

Also, currently, various parties (e.g. nvim-treesitter, tree-sitter-langs, Emacs 29+, Cursorless, difftastic, helix-editor, semgrep, etc.) assume the inclusion of these files to varying degrees.

In late 2020, maxbrunsfeld sketched out a draft plan to move away from doing this. A few months later(?), a form of this was added to the Tree-sitter 1.0 Checklist (search for "Mergeable Git Repos").

Of the 308 repositories checked, at least 287 (around 93%) have src/parser.c committed.

At the moment, then, most grammar repositories appear to commit the generated source, and numerous projects that use tree-sitter assume this kind of setup.

Not committing it probably makes the grammar / parser in question less likely to be used as widely.

There appear to be what might be considered compromise options:

It's probably worth noting that the common practice of committing generated source has security implications. (There might be some related activity attempting to address some of the concerns.)

Prerequisites for Demo

See the section of the same name in the repository README.

Demo Steps

  • Ensure the current working directory is the repository root directory.
  • cd questions/should-parser-source-be-committed
  • Invoke sh ./script/list-provides-parser-c.sh
  • Observe output that ends like:
Minimum number of repositories with parser.c: 287
Number of repositories: 308
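
The "around 93%" figure follows from these two counts. A minimal sketch of the arithmetic, using integer math only; the `percent` helper is a hypothetical name for illustration and is not part of the repository's scripts:

```shell
#!/bin/sh
# Hypothetical helper (not from the repository's scripts).
# percent PART TOTAL: print PART as a whole-number percentage of TOTAL,
# rounded to the nearest integer using shell integer arithmetic.
percent() {
  echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

# Figures taken from the sample output above.
percent 287 308   # repositories with parser.c
percent 21 308    # repositories without parser.c (308 - 287)
```

With the sample counts this prints 93 and then 7. Different counts from a later run would of course give different percentages.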

Tip: to get a list (possibly imperfect) of which repositories do not have a parser.c generated by the tree-sitter cli, invoke:

sh ./script/list-no-parser.c.sh

This should produce output that ends something like:

tree-sitter-swift.alex-pinkus
tree-sitter-systemrdl.SystemRDL
tree-sitter-teal.euclidianAce
tree-sitter-vcd.wavedrom
tree-sitter-zeek.zeek

Did not find parser.c in: 21
Number of repositories: 308

Note that this search may not be perfect because:

  1. A repository may be providing generated files in a separate branch (e.g. tree-sitter-swift).

  2. Some repositories contain a file named parser.c that was not generated by the tree-sitter cli (e.g. some repositories by stsewd). To account for this, the search code uses globs and for loops rather than the find command. The chosen method might still fail in some cases, though that doesn't seem to have happened yet.
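
The glob-and-loop approach might look roughly like the sketch below. This is not the repository's actual script. It assumes a hypothetical layout with one cloned repository per subdirectory, and it assumes that "generated" can be approximated by checking only the conventional src/parser.c location, so that a file named parser.c elsewhere in a tree is ignored. The function name `count_parser_c` is also made up for illustration.

```shell
#!/bin/sh
# Sketch only: layout, function name, and the "src/parser.c only" rule
# are assumptions, not taken from the repository's scripts.

# count_parser_c DIR: print "<with> <total>", where <total> is the number
# of subdirectories of DIR and <with> is how many contain src/parser.c.
count_parser_c() {
  repos_dir=$1
  with=0
  total=0
  for repo in "$repos_dir"/*/; do
    # If the glob matches nothing it stays literal; skip that case.
    [ -d "$repo" ] || continue
    total=$((total + 1))
    # Check a fixed path instead of searching the whole tree, so an
    # unrelated parser.c in another directory is not counted.
    if [ -f "${repo}src/parser.c" ]; then
      with=$((with + 1))
    fi
  done
  echo "$with $total"
}
```

For example, `count_parser_c ./repos` (with `./repos` standing in for wherever the repositories were cloned) prints the two counts separated by a space. A fixed-path check like this still cannot tell a hand-written src/parser.c from a generated one, which is one way such a method could fail.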

References