String parsing incorrect #1969

Open
triska opened this issue Aug 20, 2023 · 7 comments

Comments

@triska
Contributor

triska commented Aug 20, 2023

Currently, I get:

$ scryer-prolog
?- use_module(library(lists)).
   true.
?- length("\x0\\x0\", L).
   L = 1, unexpected.
   L = 2. % expected
@triska
Contributor Author

triska commented Aug 20, 2023

Also:

?- "\x0\\x0\" = [_,_].
   false, unexpected.
   true. % expected

@UWN

UWN commented Aug 20, 2023

?- "\x0\" = [Null].
   Null = '\x0\'. % ok
?- "ab\x0\" = [A,B,Null].
   false, unexpected.
   A = a, B = b, Null = '\x0\'.
?- "ab\x0\" = [A,B].
   A = a, B = b, unexpected.
   false. % expected, but not found

triska changed the title from "length/2 incorrect" to "String parsing incorrect" on Aug 23, 2023
@triska
Contributor Author

triska commented Aug 23, 2023

This issue may be confined to parsing strings; at least the following works exactly as expected:

?- A = '\x0\\x0\', atom_chars(A, Cs), Cs = [_,_].
   A = '\x0\\x0\', Cs = "\x0\\x0\".

So, 0-bytes are correctly handled in this case.
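
The observation above suggests a cross-check that does not rely on double-quoted literals for the reference value: build the list with atom_chars/2, which the query above shows to handle null characters correctly, and compare it with the literal. The following is only a minimal sketch, assuming the double_quotes flag is set to chars; the predicate names two_nulls/1 and literal_agrees/0 are hypothetical and not part of Scryer Prolog.

```prolog
:- use_module(library(lists)).

% two_nulls(-Cs): Cs is the list of two null characters, obtained from a
% quoted atom via atom_chars/2 rather than from a double-quoted literal.
% (Hypothetical helper, introduced for illustration only.)
two_nulls(Cs) :-
    atom_chars('\x0\\x0\', Cs).

% literal_agrees: succeeds iff the double-quoted literal is read as the
% same two-element list. Assumes double_quotes is set to chars.
% (Hypothetical helper, introduced for illustration only.)
literal_agrees :-
    two_nulls(Cs),
    length(Cs, 2),
    Cs == "\x0\\x0\".
```

On a system affected by this issue, `?- literal_agrees.` would be expected to fail, because the literal is read with fewer than two elements; with a corrected reader it should succeed.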

@UWN

UWN commented Aug 24, 2023

So in a (single) quoted token (* 6.4.2 *) the null character is handled correctly whereas in a double quoted list token (* 6.4.6 *) it is ignored, but only if double quotes denote a list of characters! So it is not the parsing process as such which is incorrect, but rather this conversion.

?- set_prolog_flag(double_quotes,chars).
   true.
?- Cs = "a\x0\b".
   Cs = "ab", unexpected.
?- set_prolog_flag(double_quotes,codes).
   true.
?- Cs = "a\x0\b".
   Cs = [97,0,98].
?- set_prolog_flag(double_quotes,atom).
   true.
?- Cs = "a\x0\b".
   Cs = 'a\x0\b'.

@bakaq
Contributor

bakaq commented Mar 2, 2025

> [...] So it is not the parsing process as such which is incorrect, but rather this conversion.

Nah, I've dealt with the parser before (#2254) and I'm pretty sure this is just a parsing problem. The parser handles the double_quotes flag at parsing time. This actually seems like an easy fix, and I think I just found where the problem is. Will report back if I find anything.

@bakaq
Contributor

bakaq commented Mar 3, 2025

Yeah, I solved it. Sending a PR in a moment.

@triska
Contributor Author

triska commented Mar 5, 2025

Thank you a lot! This is solved in rebis-dev, and the issue can be closed once rebis-dev is merged into master.
