Parse, don’t validate #610

wismill · 2025-01-24T06:30:26Z

As a general note, if I had to rewrite the parser, I would opt for the “parse, don’t validate” approach.

The article is really good. But a yacc grammar is pretty much "parse" no? I'm curious which part you consider the "validate".

@bluetech My point is that we use a bunch of Expr that require further check on their type after the yacc parsing. So Expr is too general and we may end up with error with no reference to erroneous file/line. Consider the silly:

xkb_keymap { };
    xkb_keycodes { };
    xkb_types {
        type "XXX" {
        modifiers[1234] = Shift;
        map[Shift] = Level2;
        level_name[Level1] = "Base";
        level_name[Level2] = "Number";
        };
    };
    xkb_compat { };
    xkb_symbols { };
};

You would just get the following warning, without any reference to the erroneous line:

xkbcommon: WARNING: The modifiers field of a key type is not an array; Illegal array subscript ignored

Imagine now having this in an included key type file: really not ideal for debugging. But at this stage we do not have parser context anymore!

On the other hand, the following file will fail with a better (but still incomplete¹) error²:

xkb_keymap {
    xkb_keycodes { 
      <};> = 123;
      <🦆> = 456;
    };
    xkb_types { };
    xkb_compat { };
    xkb_symbols { };
};

That said, I guess that fully parsing a section is slower than the current approach so any include would get slower with the parse, don’t validate approach applied strictly.

Originally posted by @wismill in #598 (comment)

(unknown file) instead of the path of the relevant file. ↩
Guess what line is erroneous? The <🦆> one, not the <};> one… ↩

The text was updated successfully, but these errors were encountered:

bluetech · 2025-01-24T11:09:11Z

I understand what you mean now. But it's a bit hard for me to imagine how it will look.

Do you mean, get rid of the intermediate AST layer, and construct the semantic types (e.g. SymbolsInfo) directly from the parser? This is basically a "one-pass compiler", but I don't know if the XKB text format is amenable to this, unless you defer validations to a second pass.

Or do you mean, make the parser more specific, so we don't use e.g. Expr so much but e.g. more specific expressions in the grammar rules. This would blow up the amount of grammar rules but it should be possible I think.

wismill · 2025-01-24T12:11:27Z

I consider 4 main aspects:

Correctness and security (no leak, no NULL deference, etc.)
Easy code maintenance
Good error messages
Runtime efficiency

Observations:

1 is obviously the most important but difficult to achieve with the current general data types.
1 and 2 somehow go hand in hand.
3 and 4 may be competing: good logging is costly.
1 and 3 would benefit from a more specific parser (parse, don’t validate), but it may affect 4 with the blow-up of grammar rules?.
4 may benefit from a one-pass compiler, but what if we want a section that come after a lot of others in the same file? What if the keymap components do not come in order?

I would combine more specific rules (e.g. no array nor string where we expect a number) with maybe the immediate computation of whatever we can: masks, indexes. We want to have the error in context (file + position), not lost in postprocessing. This would also maybe help with 4 by allocating less, with as much as possible results directly usable by the semantic types.

bluetech · 2025-01-24T19:20:14Z

For fun, I did a quick test for one expr type, action/action list:
bluetech@211ac59
This does indeed make one place error early on invalid input - actions in symbols.
But we still need a check for interpret.actions = ... in xkb_compat files. If we want to make this safe, this when the "explosion" begins. We'd need to split the XkbMapConfig rule such that xkb_compat defines its own rule. Then we need to add CompatDeclList so it can define its own specific rule for this interpret global. Etc.

It should be possible but I can see the appeal of how it's currently done. Possibly with a custom recursive descent parser it would be more natural.

Arguably this small change is already an improvement.

I'll play with this more if I have some time.

wismill added enhancement Indicates new feature requests compile-keymap Indicates a need for improvements or additions to keymap compilation needs info labels Jan 24, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parse, don’t validate #610

Parse, don’t validate #610

wismill commented Jan 24, 2025 •

edited

Loading

bluetech commented Jan 24, 2025

wismill commented Jan 24, 2025

bluetech commented Jan 24, 2025

Parse, don’t validate #610

Parse, don’t validate #610

Comments

wismill commented Jan 24, 2025 • edited Loading

Footnotes

bluetech commented Jan 24, 2025

wismill commented Jan 24, 2025

bluetech commented Jan 24, 2025

wismill commented Jan 24, 2025 •

edited

Loading