fix: properly parse octal escape sequences #1484
Fixes brownplt#1480. Currently, octal escape sequences are parsed incorrectly because only part of the sequence is evaluated. The "\" delimiter in an octal sequence like "\101" is already consumed during escape sequence matching, so it is enough to parse the matched numeric sequence "101".
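To illustrate the idea, here is a minimal standalone sketch, not Pyret's actual tokenizer code. The names `ESCAPES`, `SIMPLE`, and `unescapeString` are hypothetical, and the assumption that an octal escape is one to three digits is mine. The point it shows is that once the escape pattern has consumed the backslash, the captured group holds only the digits, so the whole capture can be parsed as base 8.

```javascript
// Hypothetical sketch, not the Pyret implementation.
// The capture group holds the escape body WITHOUT the leading backslash.
// Assumption: an octal escape is one to three octal digits.
const ESCAPES = /\\([0-7]{1,3}|[\\nrt"'])/g;

// Single-character escapes handled by table lookup.
const SIMPLE = { n: "\n", r: "\r", t: "\t", "\\": "\\", '"': '"', "'": "'" };

function unescapeString(raw) {
  return raw.replace(ESCAPES, (match, body) => {
    if (/^[0-7]+$/.test(body)) {
      // `body` is already just the digits (e.g. "101"),
      // so parse all of it as base 8 instead of slicing it further.
      return String.fromCharCode(parseInt(body, 8));
    }
    return SIMPLE[body];
  });
}

console.log(unescapeString("\\101")); // the raw text \101 decodes to "A"
```

Slicing the capture a second time, as if the backslash were still attached, would drop a digit and produce a different character, which is the kind of partial evaluation described above.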
You're right. Should be fixed now. Right now the test method is rather crude: we check the string representation of the entire AST. Please let me know if you'd prefer a tree visitor or tests specific to the tokenizer rather than the parser; the latter may be more appropriate in this case.
Mm, that is slightly awkward, yes. Seems to me that an easier way to do this would be a pair of tests: one that tests only parsing, and only of string literals (rather than let-bound string literals), and a second one that evaluates the string literal and checks that it has the desired value (since Pyret string expressions evaluate to JS strings, in the end).
@blerner is there already infrastructure for testing the tokenizer? If not, is there documentation on setting up new test files? I would like to get some idea of how you all prefer to do that.
Yup: look higher in this file, at pyret-lang/tests/parse/parse.js, lines 115 to 131 in 7656aa3: lex and then examine the toks array.
@blerner please take a look
Have you run |
Yep.
Oh, my bad; I misread the test code. I got concerned by the single quotes in the string literals, since Pyret always prints its string literals with double quotes, but you're checking the output of the lexer, which preserves the quotes of the actual token. Yep, this looks good, thanks!
Thanks @blerner, @ayazhafiz!
Closes #1480.
Currently, octal escape sequences are parsed incorrectly because the
sequence is evaluated only partially. The "\" delimiter in an octal
sequence like "\101" is already siphoned off during escape sequence
matching, so it is enough to simply parse the match of the numeric
sequence "101".