-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
libbeat: Refactor error handling in schema.Apply() #7335
Conversation
libbeat/common/schema/schema.go
Outdated
@@ -55,7 +51,7 @@ func (conv Conv) HasKey(key string) bool { | |||
// implements Mapper interface for structure | |||
type Object map[string]Mapper | |||
|
|||
func (o Object) Map(key string, event common.MapStr, data map[string]interface{}) *Errors { | |||
func (o Object) Map(key string, event common.MapStr, data map[string]interface{}) multierror.Errors { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported method Object.Map should have comment or be unexported
libbeat/common/schema/error.go
Outdated
key string | ||
message string | ||
errorType ErrorType | ||
type KeyNotFoundError struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported type KeyNotFoundError should have comment or be unexported
+100 on that change. Now that we have a result struct we can also add potentially more / other data in the future if we need it. |
Two other options I was thinking of on what we could return:
It would make error checking more go like and often people would probably skip the result.
An other option is to have event as part of the result. When I read the code I asked myself why is event not part of the result as it's kind of the result. I'm good with all 3 option as all 3 are much better then we have now. Thanks for coming up with this. |
The main reason why I chose to keep the event as first returned value and the "result" as second was to keep it more similar to the current interface so it continue working the same as before for the majority of cases where the error is being ignored. But it is true that in any case the signature is changing and maybe it is a good idea to force the review of uses of this function, specially where the error is being ignored. I like the |
libbeat/common/schema/error.go
Outdated
} | ||
|
||
func (err *Error) IsType(errorType ErrorType) bool { | ||
return err.errorType == errorType | ||
func NewWrongFormatError(key string, msg string) *WrongFormatError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported function NewWrongFormatError should have comment or be unexported
libbeat/common/schema/error.go
Outdated
} | ||
|
||
func (err *Error) SetType(errorType ErrorType) { | ||
err.errorType = errorType | ||
type WrongFormatError struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported type WrongFormatError should have comment or be unexported
libbeat/common/schema/error.go
Outdated
key string | ||
message string | ||
errorType ErrorType | ||
func NewKeyNotFoundError(key string) *KeyNotFoundError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported function NewKeyNotFoundError should have comment or be unexported
libbeat/common/schema/error.go
Outdated
k.key = v | ||
} | ||
|
||
type KeyNotFoundError struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported type KeyNotFoundError should have comment or be unexported
libbeat/common/schema/error.go
Outdated
) | ||
|
||
type ErrorType int | ||
type KeyError interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported type KeyError should have comment or be unexported
for _, err := range errors { | ||
if err, ok := err.(schema.KeyError); ok { | ||
err.SetKey(convMap.Key + "." + err.Key()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to have more meaningful errors on nested objects, previously a missing key would report something like Key "segments" not found
, now it reports key "total.segments" not found
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, that was not very nice in the past.
@@ -16,68 +16,67 @@ var ( | |||
"docs": c.Dict("docs", s.Schema{ | |||
"count": c.Int("count"), | |||
"deleted": c.Int("deleted"), | |||
}), | |||
}, c.DictOptional), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added these optional modifiers because there are test files that don't contain these fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of that, added option to Apply
to fail only on required fields.
errors, ok := errs.(*s.Errors) | ||
if ok { | ||
assert.True(t, errors.HasRequiredErrors(), "mapping error: %s", errors) | ||
assert.EqualError(t, errors, "Required fields are missing: ,source") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin this is the only place where I have seen that we'd make use of ApplyResult.NotFound
, to check if source
is in the list of missing fields, but to take advantage of this we'd have to complicate the eventsMapping
function to return the result and not only the event and the errors. Also, I think that here it'd be enough with testing that it returns an error for the case of having a missing field, going further looks like if we were testing here the behaviour of schema.Apply
, that is already being tested in the schema
package.
Given that, I wonder again if it worths it to have the ApplyResult
object. From the perspective of a consumer of the API, it is cleaner to just have event, err := schema.Apply(data)
, with an event and an error that can be used directly. I think it is also more valuable to be able to give directly the semantics of the use case, as in event.MetricSetFields, err = schema.Apply(index)
or nodeData, _ := schemaXpack.Apply(node)
instead of obtaining the result and then assigning result.Event
to a meaningful variable. From our perspective, if we want to test more particular cases of missing fields, I think that a better place would be the schema
package.
I'd go back to the approach in #7167, but with the additional improvements in error handling added here too, and with ApplyTo
still returning a list of errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that from a user perspective it is much more convenient if the event can be assigned directly. It's also the reason why I started to think about 3 arguments.
For testing for the missing fields which are optional I think this is something specific to metricsets and not necessarly internal testing of Apply. The reason it only exists in one place so far I think is more that it was too complicated to do it. When we start to add more versions and have very subtle differences between each version I think tests for missing fields become more common.
I assume moving forward we will make optional fields the default as we discussed in the past and not required. In this scenario most fields will be optional. So I our testing will potentially more look like: max 3 missing fields and we get an error if there are more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the approach again, to event, err := schema.Apply(data, options...)
, options
is optional, and can be used to select the mode of operation of Apply
(so for example we can make all fields optional by default). Or to access unfiltered errors, so additional tests can be done on the missing fields for example. The test that was checking the missing fields has been refactored to this approach.
I have updated the description of the PR accordingly.
} | ||
|
||
func TestInvalidJsonForBadFormatShouldThrowError(t *testing.T) { | ||
t.Skip("the code is not really checking for uknown fields") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was passing because schema.Apply()
returned the empty schema.Errors
, it was not detecting the unknown field in the (valid) JSON file. This test can be probably removed if the missing field is optional, if is not optional, an explicit check should be added in eventsMapping
to check that it exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this is not optional, explicit check added and skip removed.
"total": c.Int("total"), | ||
"disconnects": c.Int("disconnects"), | ||
"total": c.Int("total", s.Optional), | ||
"disconnects": c.Int("disconnects", s.Optional), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fields don't exist in the test input:
Error: Received unexpected error:
2 errors: key `requests.total` not found; key `requests.disconnects` not found
Test: TestStats
Messages: _meta/test/stats.700.json
keyErr, ok := errs[0].(s.KeyError) | ||
if assert.True(t, ok, c.Description) { | ||
assert.Equal(t, c.Expected, keyErr.Key(), c.Description) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ApplyTo
can be used as a more advanced method for special cases, it still returns the full list of errors, that can be further analyzed if needed.
jenkins, test this again, please |
libbeat/common/schema/options.go
Outdated
"github.com/elastic/beats/libbeat/common" | ||
) | ||
|
||
var DefaultApplyOptions = []ApplyOption{AllRequired} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported var DefaultApplyOptions should have comment or be unexported
libbeat/common/schema/schema.go
Outdated
@@ -130,6 +135,13 @@ func Optional(c Conv) Conv { | |||
return c | |||
} | |||
|
|||
// The required flag provokes an error in case the key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment on exported function Required should be of the form "Required ..."
@@ -294,6 +312,12 @@ func DictOptional(c ConvMap) ConvMap { | |||
return c | |||
} | |||
|
|||
// The optional flag provokes an error in case the key doesn't exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment on exported function DictRequired should be of the form "DictRequired ..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this approach and now that I see it I'm quite happy that we kept iterating on it.
This will give us a lot of flexibility on how we can test things and making the switch to optional by default really easy.
for _, err := range errors { | ||
if err, ok := err.(schema.KeyError); ok { | ||
err.SetKey(convMap.Key + "." + err.Key()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, that was not very nice in the past.
@@ -66,25 +66,24 @@ var ( | |||
} | |||
) | |||
|
|||
func eventMapping(r mb.ReporterV2, info elasticsearch.Info, content []byte) []error { | |||
func eventMapping(r mb.ReporterV2, info elasticsearch.Info, content []byte) error { | |||
var all struct { | |||
Data map[string]interface{} `json:"_all"` | |||
} | |||
|
|||
err := json.Unmarshal(content, &all) | |||
if err != nil { | |||
r.Error(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nice thing that we can do now is that we can check for the error from eventMapping
and report it in only 1 place. The current code with reporting and returning the error now is only returning the error as there is only 1 type of errors left.
@@ -29,28 +31,34 @@ var ( | |||
schema = s.Schema{ | |||
"insert_order": c.Int("insert_order"), | |||
"priority": c.Str("priority"), | |||
"source": c.Str("source"), | |||
"source": c.Str("source", s.Required), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you pick this field as required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a test failing if this field is not present, but I'm thinking now that all fields here seem to be required, I'm going to revert this part.
4f3231c
to
b63168f
Compare
`schema.Apply()` returned a `schema.Errors` object as error, with this object is not easy to check if there was really an error, specially when optional fields are involved. This leads to lots of cases where these errors were being ignored. This change introduces a series of changes to have better control on the errors happening during the application of an schema. The interface is kept as `event, err := schema.Apply(data)`, but now the error returned is always a plain error in case of one or more errors happened, or nil if no error happened. `schema.ApplyTo()` returns the list of errors, and can be used where multiple schemas are applied for the same event. `Apply()` and `ApplyTo()` can now receive a list of options, these options are functions that can be used to filter the errors returned or to access the unfiltered list of errors. This allows different combinations of resulting errors in case of optional or required fields. It also allows to add additional checks on all the errors happened, even on the ones ignored for the final result. Three options are added by now: * `AllRequired` to fail if any non-optional field is missing, this mimics the current behaviour, and is set by default if no other option is set. * `FailOnRequired` to fail if any required field is missing. A new `Required` option is also added to be able to declare required fields. * `NotFoundKeys(cb func([]string))` is an option builder that receives a function, this function is called with the list of missing keys, so additional custom checks or tests can be done. For lists of errors, the `multierror` library is used now, some custom errors have been added to help on previous modifications and to improve some error messages. Errors on structured data contain now the full paths of the fields.
@exekias @andrewkroh if you have time, could you also take a look to this PR? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work here! I think error handling has improved a lot with this, thank you for taking it!
schema.Apply()
returned aschema.Errors
object as error, with thisobject is not easy to check if there was really an error, specially when
optional fields are involved. This leads to lots of cases where these
errors were being ignored.
This change introduces a series of changes to have better control on the
errors happening during the application of an schema.
The interface is kept as
event, err := schema.Apply(data)
, but now theerror returned is always a plain error in case of one or more errors
happened, or nil if no error happened.
schema.ApplyTo()
returns the list of errors, and can be used wheremultiple schemas are applied for the same event.
Apply()
andApplyTo()
can now receive a list of options, theseoptions are functions that can be used to filter the errors returned or
to access the unfiltered list of errors. This allows different
combinations of resulting errors in case of optional or required fields.
It also allows to add additional checks on all the errors happened, even
on the ones ignored for the final result. Three options are added by now:
AllRequired
to fail if any non-optional field is missing, thismimics the current behaviour, and is set by default if no other option
is set.
FailOnRequired
to fail if any required field is missing. A newRequired
option is also added to be able to declare required fields.NotFoundKeys(cb func([]string))
is an option builder that receivesa function, this function is called with the list of missing keys, so
additional custom checks or tests can be done.
For lists of errors, the
multierror
library is used now, some customerrors have been added to help on previous modifications and to improve
some error messages. Errors on structured data contain now the full
paths of the fields.