json: document schema conversion in GBNF readme, align manual grammar examples & converters #7841
@@ -57,7 +57,7 @@ std::unordered_map<std::string, BuiltinRule> PRIMITIVE_RULES = {
     {"object", {"\"{\" space ( string \":\" space value (\",\" space string \":\" space value)* )? \"}\" space", {"string", "value"}}},
     {"array", {"\"[\" space ( value (\",\" space value)* )? \"]\" space", {"value"}}},
     {"uuid", {"\"\\\"\" [0-9a-fA-F]{8} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{12} \"\\\"\" space", {}}},
-    {"char", {"[^\"\\\\] | \"\\\\\" ([\"\\\\/bfnrt] | \"u\" [0-9a-fA-F]{4})", {}}},
+    {"char", {"[^\"\\\\\\x7F\\x00-\\x1F] | [\\\\] ([\"\\\\bfnrt] | \"u\" [0-9a-fA-F]{4})", {}}},
Should these possibly be updated to use the new `.` operator, or is now not the time for that?
Oh I see, we can't do that, because we need to exclude backslashes from this list. Nevermind, carry on! :)
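To make the point above concrete, here is a minimal sketch of what the new `char` rule accepts. The `.` operator would match any character, whereas the plain-character branch has to reject quotes, backslashes, DEL, and control characters. The function names are hypothetical and not part of the PR, and the check works bytewise, so multibyte UTF-8 input is assumed to pass through as plain characters.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: mirrors the negated class [^"\\\x7F\x00-\x1F].
// Bytewise check (assumption): bytes >= 0x80 count as plain characters.
bool is_plain_char(unsigned char c) {
    return c != '"' && c != '\\' && c != 0x7F && c > 0x1F;
}

// Hypothetical helper: true if `s` is zero or more repetitions of the new rule
//   char ::= [^"\\\x7F\x00-\x1F] | [\\] (["\\bfnrt] | "u" [0-9a-fA-F]{4})
bool is_char_sequence(const std::string & s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (is_plain_char(c)) {
            ++i;
            continue;
        }
        // Anything else must be a backslash followed by at least one more byte.
        if (c != '\\' || i + 1 >= s.size()) {
            return false;
        }
        const char e = s[i + 1];
        if (std::string("\"\\bfnrt").find(e) != std::string::npos) {
            i += 2;  // single-character escape
            continue;
        }
        // The only remaining option is \u followed by exactly four hex digits.
        if (e != 'u' || i + 6 > s.size()) {
            return false;
        }
        for (std::size_t k = i + 2; k < i + 6; ++k) {
            if (!std::isxdigit(static_cast<unsigned char>(s[k]))) {
                return false;
            }
        }
        i += 6;
    }
    return true;
}

// Usage examples for the sketch above.
void char_rule_examples() {
    assert(is_char_sequence("say \\\"hi\\\""));  // escaped quotes are accepted
    assert(is_char_sequence("\\u00e9"));         // \u with four hex digits
    assert(!is_char_sequence("a\nb"));           // raw newline is a control char
    assert(!is_char_sequence("\""));             // bare quote is excluded
}
```

Note that the escape set in the new line of this hunk is `["\\bfnrt]`, so under this sketch `\/` is rejected even though the old line listed it.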
grammars/json.gbnf (Outdated)
 # Optional space: by convention, applied in this grammar after literal chars when allowed
-ws ::= ([ \t\n] ws)?
+ws ::= [ \t\n]{,20}
This feels like a good change -- I like constraining the output in this way. Could even consider limiting it to something more restrictive like `{,4}` or `{,8}`, but this is a good start.
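As a rough illustration of the bound discussed here, the sketch below checks a string against `[ \t\n]{,N}`. The old recursive rule `([ \t\n] ws)?` corresponds to having no upper limit at all. The function name and the `max_len` parameter are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `s` matches [ \t\n]{,max_len}, i.e. at most
// `max_len` characters, each of which is a space, tab, or newline.
bool matches_bounded_ws(const std::string & s, std::size_t max_len = 20) {
    if (s.size() > max_len) {
        return false;
    }
    return s.find_first_not_of(" \t\n") == std::string::npos;
}

// Usage examples for the sketch above.
void bounded_ws_examples() {
    assert(matches_bounded_ws("\n    "));                  // newline plus indent
    assert(matches_bounded_ws(std::string(20, ' ')));      // exactly at the bound
    assert(!matches_bounded_ws(std::string(21, ' ')));     // one past {,20}
    assert(!matches_bounded_ws(std::string(5, ' '), 4));   // tighter {,4} variant
}
```

A tighter bound such as `{,4}` or `{,8}` only changes `max_len`; the accepted character set stays the same.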
I've drafted an updated space rule in #7866. No matter what the bound is with this current syntax, models like Llama-3-8B & Phi-3-mini seem keen to misuse it. But given near-unlimited indent space only (and only 1 newline at a time), they're very sensible.
Looks good to me! 👍
Adds a `JSON Schemas → GBNF` section to the grammar readme.

cc/ @HanClinto @ExtReMLapin