[FLINK-38205][format][pb] Discard unknown fields by default #26881
+8
−4
What is the purpose of the change
Currently, the Flink Protobuf format does not discard unknown fields, and it does not even provide an option to do so. When the user's PB IDL is a subset of the full IDL, a considerable amount of CPU time is spent parsing and retaining fields that Flink SQL drops anyway when converting to RowData. We should make the format discard unknown fields by default.
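To make the difference between retaining and discarding unknown fields concrete, here is a minimal, JDK-only sketch of a toy decoder for protobuf-style varint fields. It is not Flink or protobuf-java code: `UnknownFieldSketch`, `Input`, `Parsed`, `Parser`, `forSchema` and `wrapDiscarding` are hypothetical names invented for illustration, and the sketch only handles wire type 0. The wrapper follows the same general idea as protobuf-java's `com.google.protobuf.DiscardUnknownFieldsParser.wrap(parser)`: set a discard flag on the input, delegate to the wrapped parser, then restore the flag.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of unknown-field handling; all names here are hypothetical. */
final class UnknownFieldSketch {

    /** Input cursor that carries the "discard unknown fields" flag. */
    static final class Input {
        final byte[] data;
        int pos = 0;
        boolean discardUnknown = false;

        Input(byte[] data) {
            this.data = data;
        }

        /** Reads one base-128 varint, least significant group first. */
        long readVarint() {
            long result = 0;
            for (int shift = 0; shift < 64; shift += 7) {
                byte b = data[pos++];
                result |= (long) (b & 0x7F) << shift;
                if ((b & 0x80) == 0) {
                    return result;
                }
            }
            throw new IllegalArgumentException("malformed varint");
        }
    }

    /** Parse result: fields declared in the schema, plus retained unknown ones. */
    static final class Parsed {
        final Map<Integer, Long> known = new LinkedHashMap<>();
        final Map<Integer, Long> unknown = new LinkedHashMap<>();
    }

    interface Parser {
        Parsed parse(Input in);
    }

    /** Parser for a schema that declares only the given varint field numbers. */
    static Parser forSchema(Set<Integer> declaredFields) {
        return in -> {
            Parsed out = new Parsed();
            while (in.pos < in.data.length) {
                long tag = in.readVarint();
                int fieldNumber = (int) (tag >>> 3);
                if ((tag & 7) != 0) {
                    throw new IllegalArgumentException("sketch handles wire type 0 only");
                }
                long value = in.readVarint();
                if (declaredFields.contains(fieldNumber)) {
                    out.known.put(fieldNumber, value);
                } else if (!in.discardUnknown) {
                    out.unknown.put(fieldNumber, value); // retained: extra work and memory
                }
                // Otherwise the unknown value has been read past and is dropped.
            }
            return out;
        };
    }

    /** Wraps a parser so that unknown fields are dropped while it runs. */
    static Parser wrapDiscarding(Parser delegate) {
        return in -> {
            boolean previous = in.discardUnknown;
            in.discardUnknown = true;
            try {
                return delegate.parse(in);
            } finally {
                in.discardUnknown = previous;
            }
        };
    }

    public static void main(String[] args) {
        // Field 1 = 150 (bytes 08 96 01), field 2 = 7 (bytes 10 07).
        byte[] wire = {0x08, (byte) 0x96, 0x01, 0x10, 0x07};
        Parser plain = forSchema(Set.of(1)); // the schema only declares field 1
        Parser discarding = wrapDiscarding(plain);
        System.out.println(plain.parse(new Input(wire)).unknown);      // {2=7}
        System.out.println(discarding.parse(new Input(wire)).unknown); // {}
    }
}
```

In both cases the declared field 1 is decoded identically. Only the handling of field 2, which the subset schema does not know about, differs.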
Brief change log
Use PB's DiscardUnknownFieldsParser to wrap the parser.
Verifying this change
This change is already covered by existing PB format tests.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes / no)
Documentation