Skip to content

fix: properly parse octal escape sequences #1484

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 4 commits into from
Sep 9, 2019

Conversation

ayazhafiz
Copy link
Contributor

@ayazhafiz ayazhafiz commented Sep 9, 2019

Closes #1480.

Currently, octal escape sequences are parsed incorrectly because the
sequence is evaluated only partially. The "\" delimiter in an octal
sequence like "\101" is already siphoned off during escape sequence
matching, so it is enough to simply parse the match of the numeric
sequence "101".

Fixes brownplt#1480.

Currently, octal escape sequences are parsed incorrectly because the
sequence is evaluated only partially. The "\" delimiter in an octal
sequence like "\101" is already siphoned off during escape sequence
matching, so it is enough to simply parse the match of the numeric
sequence "101".
@ayazhafiz
Copy link
Contributor Author

I think this one should be an error, since \88 is not a valid octal number.

You're right. Should be fixed now.

Right now the test method is rather crude, we check the string representation of the entire AST. Please let me know if you'd prefer a tree visitor or tests specific to the tokenizer rather than the parser; the latter may be more appropriate in this case.

@blerner
Copy link
Member

blerner commented Sep 9, 2019

Mm, that is slightly awkward, yes. Seems to me that an easier way to do this would be a pair of tests: one that only tested parsing and only tested string literals (rather than let-bound string literals), and a second one that evaluated the string literal and checked that it had the desired value (since Pyret string expressions evaluate to JS strings, in the end).

@ayazhafiz
Copy link
Contributor Author

@blerner is there already infrastructure for testing the tokenizer? If not, is there documentation of setting up new test files? I would like to get some idea of how you all prefer to do that.

@blerner
Copy link
Member

blerner commented Sep 9, 2019

Yup: look higher in this file, at

it("should have tight lexical extents for all tokens", function() {
lex("");
expect(numToks).toBe(1);
expect(toks[0].name).toBe("EOF");
for (var ws1 = 0; ws1 < ws.length; ws1++) {
for (var i = 0; i < all_toks.length; i++) {
for (var ws2 = 0; ws2 < ws.length; ws2++) {
var str = "" + ws[ws1] + all_toks[i] + ws[ws2]
lex(str);
test(numToks, 2, str, toks) &&
testPos(toks[0], all_toks[i], str, toks) &&
test(toks[1].name, "EOF", str, toks);
}
}
}
console.log("Finished single token tests");
});
. Basically, call lex and then examine the toks array.

@ayazhafiz
Copy link
Contributor Author

@blerner please take a look

@blerner
Copy link
Member

blerner commented Sep 9, 2019

Have you run make parse-test on this? (The parse tests don't run automatically in make test.)

@ayazhafiz
Copy link
Contributor Author

Have you run make parse-test on this? (The parse tests don't run automatically in make test.)

Yep.

@blerner
Copy link
Member

blerner commented Sep 9, 2019

Oh my bad; I misread the test code -- I got concerned by the single-quotes in the string literals, since Pyret always prints its string literals with double-quotes...but you're checking the output of the lexer, which preserves the quotes of the actual token.

Yep, this looks good, thanks!

@jpolitz jpolitz merged commit 8eff402 into brownplt:horizon Sep 9, 2019
@jpolitz
Copy link
Member

jpolitz commented Sep 9, 2019

Thanks @blerner, @ayazhafiz !

@ayazhafiz ayazhafiz deleted the fix/parse-escapes branch September 10, 2019 00:05
@blerner blerner linked an issue Jun 28, 2020 that may be closed by this pull request
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Octal string escapes don't seem to work, should this slice from zero?
3 participants