Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Upgrade Parser to 4.08 AST and add 4.08 features #2487

Merged
merged 15 commits into from
Feb 20, 2020

Conversation

anmonteiro
Copy link
Member

@anmonteiro anmonteiro commented Nov 4, 2019

This diff upgrades the parser to the OCaml 4.08 AST and enables a few features that have landed upstream. Most notably these are now supported:

Regarding this last addition, the following example is now supported in this diff, and fixes #2455:

let (let+) = (x, f) => List.map(f, x);

let (and+) = List.map2((x, y) => (x, y));

let x = {
  let+ x = [2]
  and+ y = [3];

  (x, y);
};

P.S.: This PR supersedes the upgrade to the 4.06 AST (#2036)

@sikanhe
Copy link
Contributor

sikanhe commented Nov 7, 2019

Would this enable us to compile let+ to JS via BuckleScript, or do we need support for let+ on BuckleScript side as well?

@mrmurphy
Copy link

mrmurphy commented Nov 7, 2019

@sikanhe in order to use it in JS we’d have to wait for a bucklescript version that is either based on OCaml 4.08, or backfills ‘let+’ support with a built-in ppx.

Is that correct @anmonteiro?

@sikanhe
Copy link
Contributor

sikanhe commented Nov 7, 2019

How do we make the choice to wait or implement the PPX? Last BS syntax upgrade took a long time, and the lack of built-in syntax for async/await/option has been painful for years now.

@mrmurphy
Copy link

mrmurphy commented Nov 7, 2019

It probably has a lot to do with what Bob has on his plate. Probably if someone knew enough about BS to make a PR with a patch that back-ported support for let+ it would be easier to get forward motion. But I'm not sure.

@@ -320,6 +320,10 @@ let operator_chars =
['!' '$' '%' '&' '+' '-' ':' '<' '=' '>' '?' '@' '^' '|' '~' '#' '.'] |
( '\\'? ['/' '*'] )

let dotsymbolchar =
['!' '$' '%' '&' '*' '+' '-' '/' ':' '=' '>' '?' '@' '^' '|']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do y'all think about also supporting alphanumeric characters in the let bindings? I think this is a great opportunity for us to "make it our own" -- ocaml's syntax tends to be more terse and rely on sigils, whereas (imo) reason goes more for descriptive names.
For example, I would much rather be able to define

let (let+async) = ...

than

let (let+<==>&) = ...
Suggested change
['!' '$' '%' '&' '*' '+' '-' '/' ':' '=' '>' '?' '@' '^' '|']
['!' '$' '%' '&' '*' '+' '-' '/' ':' '=' '>' '?' '@' '^' '|' 'a'-'z' 'A'-'Z' '_' '0'-'9']

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good suggestion -- I'm happy to add that if it doesn't cause parser conflicts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great idea @jaredly. And at the same time it's compatible with the existing OCaml toolchain/pattern.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this still be printed to .ml with same AST? Or refmt will do the AST transformation on its own? It seems that there is no way to express let+opt in OCaml source according to documentation on binding operators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lupus the ml printer will happily print it, it just won't be parseable :)
We could easily add a rewrite step, to encode e.g. let+opt into something valid, if that's a use case that people want. I believe that currently, due to bugs in the core ocaml printer, there are already cases where unparseable code is output.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And as you say, the other option is to use the "backport" logic when printing to .ml if the extended character range is used. (note that the binary ast format, which is used by dune and bsb under normal reason development, will happily accept the extended character range)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to at least get some kind of loose commitment that the compiler will not one day end up rejecting alphanumeric identifiers in this portion of the AST.

@mrmurphy
Copy link

👏 👏 👏 👏 👏 👏 I love that change.

jaredly added a commit to jaredly/bucklescript that referenced this pull request Nov 28, 2019
…o enable let+ and friends!

If you want some cutting-edge technology, look no further.
This uses the code in this pr - reasonml/reason#2487
and the letop backport from this pr - anmonteiro/reason#3

And brings it all together!
@jaredly
Copy link
Contributor

jaredly commented Nov 29, 2019

Just in case, uh, anyone wants to try this out with bucklescript, I made a fork that you can npm install right now! Put "bs-platform": "jaredly/bucklescript#letop" in your dependencies and you're good to go! (this on top of 7.0.0-dev.2)
And here's an example repo that's using the let+ syntax.

How does it work, when bucklescript is on 4.06 and doesn't support that syntax? Inspired by dune's "future_syntax" flag that allows older ocaml version to use the monadic let, I added a little routine to refmt that backports the syntax when exporting a binary ast.

@mrmurphy
Copy link

Wow, yes yes yes! I love this approach for the following reasons:

  • The syntax is similar enough to the existing let-anything ppx that we'll be able to completely refactor our existing projects with little more than find and replace.
  • Having this syntax built-in to the language makes this a much easier sell to developers coming from JS where they are used to having the async / await keyword. Especially when a strong promise library rises to the surface in the community and offers pre-built “let” functions. I just love that, if this gets merged I can teach JS people how to use this feature right away without having to drag them into ppx-land.
  • So far, in almost every one of my projects, async sugar has been something I reach for early on. Maybe less often in my client apps, but even when writing in React-native and trying to be minimal I’ve found myself going to the trouble of installing the PPX. But the place where it is most important to me is on the server. ppx-let is one of the most used tools in our production server and we love it. The web framework we built on top of express is built upon it. And an open source version of that framework would be much more approachable, I imagine, if we have this syntax in core.

@bobzhang and @chenglou, I know you’ve been reluctant to move forward on this in the past, wanting to take the time to get the best thing. I beg you to take this PR and Antonio’s into serious consideration. This would be a huge boon to the server-side Bucklescript users, present and (hopefully many more) future.

Here are some stats about our usage of let-anything in our main project at work (this is in-production, not a side project)

  • 788 total uses in 126 different files.
  • 21 of those usages is on the front-end, but this project is mainly a server. It has a few pages managing account information and doing administration tasks that are built with Reason-React.

Anyone else who is using let-anything regularly, I invite you to post some stats about your usages below.

@jordwalke
Copy link
Member

jordwalke commented Nov 29, 2019

Thank you for the feedback @mrmurphy. I think that @anmonteiro / @jaredly's approach here of avoiding obscure/cryptic symbols, while still remaining compatible with existing OCaml libraries that do use those symbols is a major selling point.

@bobzhang
Copy link
Contributor

bobzhang commented Dec 2, 2019

@mrmurphy I don't have strong opinions on syntax in general. What concerns me most is less dependency and if it has a robust implementation.

@mrmurphy
Copy link

mrmurphy commented Dec 2, 2019

Ah okay, thanks for letting me know @bobzhang!

And thanks for giving us an idea of where your thinking is with this @jordwalke! Is this the kind of change that you want to sit on a bit more, thinking through the implications, or do you feel like it could potentially be merged in time for the Bucklescript 7.0 release?

(I've excitedly started preparing our Node server library for public release on top of Jared's fork. Hence the annoying question ;). Even though it's just a syntax change, it sort of feels like a whole new language with this sugar built-in!)

@anmonteiro
Copy link
Member Author

@mrmurphy I'm not sure if this will make it to BuckleScript v7. I think the plan for now is to cut a release of Reason with re-error first, and then merge this in to allow for testing in pre-releases.

@jaredly
Copy link
Contributor

jaredly commented Dec 2, 2019

@mrmurphy @anmonteiro I don't think it would be a "breaking change" to bucklescript, per se, so I could imagine doing a fast-follow with a 7.1 that includes these parser changes

@anmonteiro
Copy link
Member Author

I agree. I think this is strictly additive, but it includes an upgrade of 4 versions of the AST and I’d feel more at ease if we tested it first.

@mrmurphy
Copy link

@anmonteiro @jordwalke how are we feeling about this now that 7.0.1 is shipped and default? Would it be appropriate at this point to begin consider releasing a new patch with this merged? What other testing or approvals are we waiting on to move forward?

mrmurphy added a commit to mrmurphy/bucklescript that referenced this pull request Dec 14, 2019
This is obviously not ready to be merged yet. But my hope is that, once reasonml/reason#2487 is merged, we can get default support for monadic sugar with JS promises in Reason. Expected usage would be like this:

```reason
open Js.Promise;

{
    let.async name = resolve("amazeface");
    Js.log2("The user's name  is " ++ name); 
}
```

Except, I just realized that I don't think this will work, because the source file is in OCaml syntax, and (let.async) isn't valid OCaml syntax. So this PR may not even be possible.

Would you consider the possibility of changing the js_promise bindings source file to be a Reason file after rescript-lang#2748 is merged? I can see that you might be trying to keep Reason out of the source so that pure OCaml users of Bucklescript don't have to mess with Reason. But according to my understanding bsb should be able to compile both Reason and OCaml files in the same project without an issue, right?
@jordwalke
Copy link
Member

I'm feeling really good about this diff - the only thing is that I want to be able to thoroughly test the human-readable syntactic extension in a real production use case before encouraging it widely, so if we merge this we should not commit to supporting that specific feature (no documenting it publicly, no promise to not break people, etc).

@mrmurphy
Copy link

mrmurphy commented Dec 17, 2019

Amazing. Well I just started a new real-world server project at work. Real-world enough that it's already caused me to make a bunch of improvements to https://github.com/mrmurphy/serbet. It's using bs-let, but if we could get a merge, and a developer pre-release of bs-platform I'd be glad to upgrade the server and share it's current state for you and others to see how this feels in a real setting.

@mrmurphy
Copy link

For that matter, Serbet's design relies on the syntax sugar. I'd be happy to make an upgraded branch of that project as well for public examination.

@mrmurphy
Copy link

@jordwalke think there’s any chance we might sneak this into a dev release before the holidays so that we can get people to experiment and provide feedback when they’re hacking on their breaks? I’m concerned that the momentum will die otherwise and it’ll take a while to ramp back up in the new year.

@anmonteiro
Copy link
Member Author

I just rebased this on current master. The Esy build is apparently failing because Dune 2.X is not supported in OCaml <= 4.06.

I don't use esy so I'd welcome a patch updating the esy files / pinning to Dune 1.X

@jordwalke
Copy link
Member

Thank you everyone for collaborating on this. I think this is a great step towards ensuring compatibility with the OCaml ecosystem. I'm going to merge this and test more thoroughly my self, and I will report back here with any issues I find. I encourage you all to pin your projects to this commit once merged and provide some feedback/examples.

Only thing this was missing is an entry in the change log mentioning the names of the people who collaborated on this!

@Timbus
Copy link

Timbus commented Jun 15, 2020

I really like this new syntax, but do you always need to enter a block to use your let* binds? Seems like it, which is a bit unexpected since it's not needed in ocaml.
On that same note, is there a way to 'leave' the current scope that you're in aside from closing the block? In ocaml you can just drop a semicolon down and you're now outside of the callback/bind/whatever. When you're nesting a lot of smaller callbacks it works wonders.
Reason doesn't seem to fare so well, but since it's undocumented I could just be missing something here.

@anmonteiro
Copy link
Member Author

but do you always need to enter a block to use your let* binds

@Timbus no, what gives you that impression?

@Timbus
Copy link

Timbus commented Jun 16, 2020

Well if I golf it down, this is what I do and it works:

// Define a callback-style binding:
let (let*)(f, x) = f(x);
// Sample library function that takes a callback:
let get42 = (callback) => {
  callback(42)
}; // <- Huh.. semicolon is needed here?

// Invoking it:
{ // <- This is needed here?
  let* number = get42   // Wait a minute, no () ?
  print_endline(number |> Int.to_string)
};

Without the {brackets} around the usage it does not seem to work. Though nested let* binds will work within that scope so that's still nicer than nothing.

I annotated the example with some other interesting quirks..

Please correct me if I am wrong though -- in fact I hope I am. Right now it feels like I'm just randomly attaching semicolons and brackets until the compiler stops complaining. And I still can't seem to easily 'leave' the nested scope/binding/whatevs it's called if I have multiple let* scopes.

@Timbus
Copy link

Timbus commented Jun 16, 2020

Oh the semicolon isn't needed for get42 if I let-bind the next bit. I was golfing too hard. My bad.

@anmonteiro
Copy link
Member Author

You can’t use binding operators in structure items. That’s also how they work in OCaml. It’s not about blocks or brackets. It’s about the top level scope or not.

@Timbus
Copy link

Timbus commented Jun 17, 2020

Oh okay.
Well I'm just porting some code over from ocaml to reason, I guess I must have confused myself a bit here since I'm not familiar with the gritty details, apologies.

@anmonteiro anmonteiro deleted the anmonteiro/ocaml-408 branch April 3, 2021 01:39
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet