Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

SPM support #53

Closed
wants to merge 1 commit into from
Closed

Conversation

mattmassicotte
Copy link

This adds Swift bindings and Swift Package Manager (SPM) support. This should require no maintenance on your part, and should not impact the actual parsers in any way. Other similar changes were made here:

https://github.com/tree-sitter/tree-sitter-go
https://github.com/tree-sitter/tree-sitter-c
https://github.com/tree-sitter/tree-sitter-haskell

Happy to answer any questions!

@sogaiu
Copy link
Owner

sogaiu commented Aug 24, 2023

I'll have to think it over a bit as I've been moving toward removing all bindings.

See here and here for some more info.

The former is mostly about testing and responsibility of maintenance + trying to not be misleading about what is being provided.

The latter tries to make the point about an eventual transition to not having generated source in grammar repositories (at least not directly in one's git (or other VCS) repository) and that there are security implications for what folks are currently doing.

@sogaiu
Copy link
Owner

sogaiu commented Aug 25, 2023

IIUC, this PR at the tree-sitter repository may make it possible to generate Swift bindings. I don't know if it's ascertainable at this point, but may be it's possible to set up CI / automation to make use of that functionality after it lands.

@dannyfreeman
Copy link
Collaborator

This should require no maintenance on your part,

I would hope not, as I have no way to validate that anything related to swift even works, let alone confidently make any updates should something about this change. That said, I don't quite buy that no maintenance would be required, some would be required at some point. I'm not sure when because I don't follow anything in the apple development ecosystem.

Is it possible to publish this in another repository that somehow "consumes" this one, like as a submodule? I don't know if we should be tracking bindings for every language under the sun here. The node and rust ones I'm not a fan of having, but at least they are generated. Having a second repository seems preferable, and we could give it an official "blessing" by linking to it from the README.

If another repo is not a feasible solution, would we be able to count on you for any maintenance these bindings might require? We could certainly give you a heads up before any changes to this grammar are released.

@mattmassicotte
Copy link
Author

I have to apologize for not reading the documentation more closely to understand your thinking here. I've added C and Swift bindings to many other parsers and frankly, I've become lazy.

I was not aware of the PR to include more bindings when generating projects, thank you for pointing it out!

If another repo is not a feasible solution, would we be able to count on you for any maintenance these bindings might require? We could certainly give you a heads up before any changes to this grammar are released.

Of course! A heads up isn't even necessary. It will get noticed eventually by consumers, and they can pin to specific versions/commits until fixed. Should it even happen. There are currently 32 other parsers that have merged in SPM support, and I'm only aware of one that has required updates to the binding.

Changes to the contents of the src directory are the most likely breaking changes. Needing a scanner.c, or changing from a c to cpp based-one.

I'm very in favor of the philosophy you've adopted here. A few others have gone in a similar direction, including the parser for Swift itself. I'd really love to see systemic changes make to ease the burden for both parser maintainers as well as consumers. Both currently involve far too much pain. That binding PR is a step in that direction, but won't help here as it would just result in the same changes I've made.

Deviation from a "standard" tree-sitter parser structure does make using this parser more difficult, but I also respect what you want to do. Not to mention that this is your repo and you should manage in the way you thing works best.

For now, I can just maintain my fork. I've had to do this for a few others too, and in general it isn't too bad.

@dannyfreeman
Copy link
Collaborator

dannyfreeman commented Aug 25, 2023

Well lets keep an eye on that PR. I have subscribed to it, so if it gets merged we can get the bindings here and you can stop maintaining the fork.

I'd really love to see systemic changes make to ease the burden for both parser maintainers as well as consumers. Both currently involve far too much pain. That binding PR is a step in that direction, but won't help here as it would just result in the same changes I've made.

It is a really tough problem. To ease the burden of parser maintainers and get rid of all of this source controlled generated code, we would have to put more burden on the consumers because that code needs to be generated at some point. Right now I lean more towards source controlling the generated code. Unless tree-sitter gets some major overhaul in how it works I don't see that reality changing.

@mattmassicotte
Copy link
Author

Well lets keep an eye on that PR. I have subscribed to it, so if it gets merged we can get the bindings here and you can stop maintaining the fork.

Unfortunately all that PR will do is generate the code that is in this one. That binding code will still need to live in this repo, and will still need to be manually maintained should any changes be made that require it. It does not address any of the concerns brought up (if I understand them correctly).

@sogaiu
Copy link
Owner

sogaiu commented Aug 25, 2023

If we are to steer toward providing bindings, I'm leaning toward having CI do the generating and not have the generated files live in the git repository itself [1].

What bothers me the most is that the generated source and bindings are similar to build artifacts and thus the security-related issues are not the same as plain source files. At least in my case, I expect source files to be mostly scrutinized by folks making commits and there is a kind of audit trail (assuming we're not rewriting history -- we started protecting our master branch some time back FWIW).

Having generated files (parser.c and bindings) live in our VCS repositories is very similar to distributing binaries. Distributing such things is not the same kind of arrangement as for source files that humans have reviewed. For at least the following reasons, I prefer the artifacts to be built on machines (e.g. CI) that are not (solely) the developers':

  • reproducibility
  • possibility of preparing for more architectures / platforms
  • less risk to dev machines (in this case the machines on which tree-sitter-clojure commiters work)

The second point may not be too important here, but I think the first and third ones are.

The risk to dev's machine point may not be obvious, so I will spell out a bit of what I think that is. If people are known to be depending on files that are not being scrutinized so closely then the source (where they come from) of those files (in this case a dev's machine) will be a juicier target.

This might not seem like much of an issue but consider that nvim-treesitter depends on (by now I imagine) > 100 parsers looked after by different folks AND the users of such parsers are likely to be developers working on other projects.

Other grammars may be much more of a risk than one for Clojure, but I'm not excited about increasing my own machine's risk nor do I want to exhibit behavior that makes it seem like it's unproblematic.

One more thing I want to add is that I don't think a single CI doing the generating is healthy either. The risk to the party providing the CI is elevated as a result too. I might investigate doing the generation elsewhere as well to reduce the attraction of going after one particular CI-providing party's machines [2].

I don't really want to be in the business of distributing "binary-like things" though, various operating system distributions are already far better prepared and equipped to do this.


[1] GH Releases might be an alternative location for bindings and as spelled out a bit by mattmassicotte earlier, the Swift grammar has an alternate arrangement.

[2] I see that reproducibility (the potential) is important, but it can also help security-wise if exercised (e.g. actually reproducing in multiple locations for distribution or verifying via one's own building whether a build done elsewhere is sound). Of course, this is not cost-free...

@mattmassicotte
Copy link
Author

You absolutely do not need to convince me that the status quo stinks. I don't think the tree-sitter maintainers need to be convinced either, though they may need a little more prodding to prioritize the work.

Moving the generated artifacts out into a branch updated by CI would be great too. I'll try to make anything work, and will keep my eye on the repo for any changes.

@sogaiu
Copy link
Owner

sogaiu commented Aug 26, 2023

Thanks for the response and sharing your thoughts. I really appreciate it!

For clarification, I wrote my comment above for my own sake as well as for other participants to this discussion (including potential future folks). I should have stated that explicitly to reduce the chance of confusion. Sorry about that.

Regarding:

I don't think the tree-sitter maintainers need to be convinced either

Yes, I think that's correct. I've discussed this issue with some of them before and have also been doing so in the tree-sitter discord recently :)

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants