Changed lexers to use Sedlex #392

Open
wants to merge 9 commits into
base: dev-0-1-0

puripuri2100
Contributor

@puripuri2100 puripuri2100 commented Feb 24, 2023

@gfngfn
Owner

gfngfn commented Feb 26, 2023

Memo:

satysfi doc-lang.saty -o doc-lang.pdf
 ---- ---- ---- ----
  target file: 'doc-lang.pdf'
  dump file: 'doc-lang.satysfi-aux' (will be created)
  parsing 'doc-lang.saty' ...
satysfi: internal error, uncaught exception:
         Failure("TODO: MATHCHARS (\"![\", \"doc-lang.saty\", line 18, characters 19-21)")
         Raised at Stdlib.failwith in file "stdlib.ml", line 29, characters 17-33
         Called from Main__Parser.Tables.semantic_action.(fun) in file "src/parser.mly", line 1273, characters 21-101
         Called from MenhirLib.Engine.Make.reduce in file "lib/pack/menhirLib.ml", line 1416, characters 16-42
         Called from MenhirLib.Engine.Make.loop in file "lib/pack/menhirLib.ml", line 1702, characters 25-52
         Called from MenhirLib.Convert.Simplified.traditional2revised in file "lib/pack/menhirLib.ml" (inlined), line 193, characters 4-144
         Called from Main__ParserInterface.process in file "src/frontend/parserInterface.ml", line 16, characters 6-18
         Called from Main__FileDependencyResolver.register_document_file in file "src/frontend/fileDependencyResolver.ml", line 86, characters 4-118
         Called from Main__FileDependencyResolver.main in file "src/frontend/fileDependencyResolver.ml", line 150, characters 10-49
         Called from Main.build.(fun) in file "src/frontend/main.ml", line 1158, characters 17-55
         Called from Main.error_log_environment in file "src/frontend/main.ml", line 382, characters 4-16
         Called from Cmdliner_term.app.(fun) in file "cmdliner_term.ml", line 24, characters 19-24
         Called from Cmdliner_eval.run_parser in file "cmdliner_eval.ml", line 34, characters 37-44

@gfngfn
Owner

gfngfn commented Feb 26, 2023

Thank you so much for developing a lexer using Sedlex!

What motivated me to consider reimplementing the lexer with Sedlex was that it lets us make every character token in math formulae carry exactly one code point, by replacing %token<Range.t * string> MATHCHARS with %token<Range.t * Uchar.t> MATHCHAR.

Would you mind modifying MATHCHARS as mentioned above? (Of course I will try it myself if you do not have time, so feel free to decline!) It will probably remedy the Failure("TODO: MATHCHARS (\"![\", ...") exception.
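
For illustration, here is a minimal sketch of what "one code point per token" means for an input such as `![`. It does not use Sedlex or the real SATySFi types: the `range` record, the `token` type, and `mathchars_of_string` are hypothetical stand-ins, and only `Uchar` and `String.get_utf_8_uchar` (OCaml 4.14 or later) are actual standard-library names.

```ocaml
(* Hypothetical stand-in for Range.t: columns counted in code points. *)
type range = { start_col : int; end_col : int }

(* Hypothetical stand-in for the proposed token:
   %token<Range.t * Uchar.t> MATHCHAR *)
type token =
  | MATHCHAR of range * Uchar.t

(* Split a UTF-8 string into one MATHCHAR token per code point.
   Invalid bytes decode to U+FFFD and still advance the position. *)
let mathchars_of_string (s : string) : token list =
  let len = String.length s in
  let rec go byte_pos col acc =
    if byte_pos >= len then List.rev acc
    else
      let d = String.get_utf_8_uchar s byte_pos in
      let u = Uchar.utf_decode_uchar d in
      let tok = MATHCHAR ({ start_col = col; end_col = col + 1 }, u) in
      go (byte_pos + Uchar.utf_decode_length d) (col + 1) (tok :: acc)
  in
  go 0 0 []

let () =
  mathchars_of_string "!["
  |> List.iter (fun (MATHCHAR (r, u)) ->
       Printf.printf "MATHCHAR U+%04X at %d-%d\n"
         (Uchar.to_int u) r.start_col r.end_col)
```

With this shape the parser never receives a multi-character string such as `"!["` as a single math token; it sees `!` and `[` as two separate tokens, each with its own range.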

@puripuri2100
Contributor Author

I changed MATHCHARS of type Range.t * string to MATHCHAR of type Range.t * Uchar.t.

@leque
Contributor

leque commented Feb 28, 2023

Great! This PR will also fix #312. Would you mind adding parser tests for positions around multi-byte characters?

@puripuri2100
Contributor Author

I added a parser test for multi-byte characters (2c4a50d).

@puripuri2100
Contributor Author

This PR fixes #313:

$ cat e.saty
あ

$ satysfi e.saty
 ---- ---- ---- ----
  target file: 'e.pdf'
  dump file: 'e.satysfi-aux' (will be created)
  parsing 'e.saty' ...
! [Syntax Error at Lexer] at "e.saty", line 1, characters 0-1:
    illegal token 'あ' in a program area
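
The range `characters 0-1` above is what one would expect if columns are counted in code points rather than bytes, since 'あ' occupies three bytes in UTF-8. As a rough sketch of that distinction (not the lexer's actual code; `column_of_byte_offset` is a hypothetical helper, and OCaml 4.14 or later is assumed for `String.get_utf_8_uchar`):

```ocaml
(* Count the code points contained in the first [byte_off] bytes of the
   UTF-8 string [s]. Hypothetical helper, for illustration only. *)
let column_of_byte_offset (s : string) (byte_off : int) : int =
  let limit = min byte_off (String.length s) in
  let rec go pos col =
    if pos >= limit then col
    else
      let d = String.get_utf_8_uchar s pos in
      go (pos + Uchar.utf_decode_length d) (col + 1)
  in
  go 0 0

let () =
  (* "\xe3\x81\x82" is the UTF-8 encoding of 'あ' (U+3042). *)
  let line = "\xe3\x81\x82" in
  Printf.printf "bytes: 0-%d, code points: 0-%d\n"
    (String.length line)
    (column_of_byte_offset line (String.length line))
```

For this one-character line the byte-based range is 0-3 while the code-point-based range is 0-1.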

@gfngfn gfngfn modified the milestones: v0.0.12, v0.1.0 Apr 6, 2024
3 participants