-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Support fast Zap logging #366
Conversation
Sets and maps have nondeterministic behavior in iteration, but Zap logs/JSON has arrays that are orderly, causing issues in testing. Instead, we only write one element for those that are affected and test the loop logic separately in TestNondeterministicZapLogging.
CHANGELOG.md
Outdated
- No changes yet. | ||
- Added `zapcore.MarshalLogObject/Array` method to all generated structs | ||
and its underlying components, enums, and non-primitive typedefs, implementing | ||
Zap marshalers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change this to,
### Added
- Generated types now implement zapcore.ObjectMarshaler or
zapcore.ArrayMarshaler where appropriate. This should lead to much faster
logging of these objects.
gen/field.go
Outdated
@@ -1,4 +1,4 @@ | |||
// Copyright (c) 2018 Uber Technologies, Inc. | |||
// Copyright (c) 2018 Uber Technologies, Inc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
gen/field.go
Outdated
<- end -> | ||
} | ||
<- end> | ||
<end> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use <end ->
for to end the range so that if .Fields is empty, we don't get an empty line.
gen/field.go
Outdated
<$encAdd> | ||
<- end -> | ||
<- else -> | ||
<- $encAdd := printf "%s.Add%s(%q, %s)" $enc (zapEncoder .Type) .Name (zapMarshalerPtr .Type $fval) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can reduce some of the duplication here with a TemplateFunc:
TemplateFunc("fieldMarshaler", func(required bool, t compile.TypeSpec, fval string) (string, error) {
if required {
return zapMarshaler(g, t, fval)
}
return zapMarshalerPtr(g, t, fval)
})
Then the template becomes something like,
<- $encAdd := printf "%s.Add%s(%q, %s)" ... (fieldMarshaler .Required .Type $fval) ->
<- if .Required ->
<if (zapCanError .Type) ->
...
<- else ->
<$encAdd>
<- end _>
<- else ->
...
<end>
gen/field.go
Outdated
func (<$v> *<.Name>) MarshalLogObject(<$enc> <$zapcore>.ObjectEncoder) error { | ||
<range .Fields> | ||
<- $fname := goName . -> | ||
<- $fval := printf "%s.%s" $v $fname -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can inline $fname
in the $fval
declaration
gen/map.go
Outdated
if err := g.EnsureDeclared( | ||
` | ||
type <.Name> <typeReference .Type> | ||
<$zapcore := import "go.uber.org/zap/zapcore"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imports at the top please
gen/map.go
Outdated
<- $encAdd := printf "%s.Add%s((string)(%s), %s)" $enc (zapEncoder .Type.ValueSpec) $k (zapMarshaler .Type.ValueSpec $v)> | ||
<if (zapCanError .Type.ValueSpec) -> | ||
if err := <$encAdd>; err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Damn, this is getting repetitive. I wonder if we can clean up this err != nil
business.
Can we maybe do something like,
<zapEncodeBegin .Type.ValueSpec>
<$enc>.Add<zapEncoder ..>(..., <zapMarshaler .Type.ValueSpec $v>)
<zapEncodeEnd .Type.ValueSpec>
Where zapEncodeBegin and zapEncodeEnd add the if err != nil
, etc. if needed?
This will handle all these cases where we're doing if zapCanErr
and
$errAdd
again and again.
gen/typedef.go
Outdated
@@ -103,6 +103,26 @@ func typedef(g Generator, spec *compile.TypedefSpec) error { | |||
return <equals .Target $lhs $rhs> | |||
<- end> | |||
} | |||
|
|||
<if (eq (zapEncoder .Target) "Object") -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The template format supports template-level comments. Let's add a couple to explain what's going on here.
</* ... */>
gen/typedef.go
Outdated
<$enc := newVar "enc"> | ||
func (<$v> <$typedefType>) MarshalLogObject(<$enc> <$zapcore>.ObjectEncoder) error { | ||
<if isStructType . -> | ||
return ((<typeReference .Target>)(<$v>)).MarshalLogObject(<$enc>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this case handled by zapMarshaler . $v
too?
gen/zap.go
Outdated
*compile.StructSpec: | ||
return true | ||
} | ||
panic(root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's drop the panic. You can add checks for the true cases since there are
fewer of those and return false by default for all other cases.
Alternatively, you can also just call zapEncoder
on the type and return true
only if the output is "Object" or "Array".
(If you do the zapEncodeBegin/zapEncodeEnd business, you can probably drop
this whole function.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you want to drop panic for the other functions as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the cases we're doing exhaustive checks, let's keep the panic (but with a better error message).
For cases like this where it's more of an explicit list of things that can error out, we can just do catch-all for the false case rather than panicking.
This reverts the last 6 commits.
zapCanError has been replaced by zapEncodeBegin/zapEncodEnd. This allows a more readable template while achieving the same behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor issue with the map key's RootTypeSpec, but LGTM otherwise (besides nits).
gen/type.go
Outdated
@@ -193,6 +193,10 @@ func valueListName(g Generator, spec compile.TypeSpec) string { | |||
return fmt.Sprintf("_%s_ValueList", g.MangleType(spec)) | |||
} | |||
|
|||
func zapperName(g Generator, spec compile.TypeSpec) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// zapperName returns the name that should be used for wrapper types that
// implement zap.ObjectMarshaler or zap.ArrayMarshaler for the provided
// Thrift type.
gen/list.go
Outdated
@@ -177,3 +181,40 @@ func (l *listGenerator) Equals(g Generator, spec *compile.ListSpec) (string, err | |||
|
|||
return name, wrapGenerateError(spec.ThriftName(), err) | |||
} | |||
|
|||
func (l *listGenerator) zapMarshaler( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's document what we're logging things as.
// Slices are logged as JSON arrays.
gen/map.go
Outdated
@@ -281,3 +281,147 @@ func (m *mapGenerator) equalsUnhashable(g Generator, spec *compile.MapSpec) (str | |||
|
|||
return name, wrapGenerateError(spec.ThriftName(), err) | |||
} | |||
|
|||
func (m *mapGenerator) zapMarshaler( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should document this one in detail because it has non-obvious behavior.
// Maps are logged as objects if the key is a string or a typedef of a
// string. If the key is not a string, maps are logged as arrays of
// objects with a key and value.
//
// map[string]int32{"foo": 1, "bar": 2}
// => {"foo": 1, "bar": 2}
//
// map[int32]string{1: "foo", 2: "bar"}
// => [{"key": 1, "value": "foo"}, {"key": 2, "value": "bar"}]
gen/list.go
Outdated
|
||
func (l *listGenerator) zapMarshaler( | ||
g Generator, | ||
root *compile.ListSpec, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/root/spec
In this context, there's only the "spec", no "root".
gen/map.go
Outdated
}, | ||
); err != nil { | ||
return "", err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional nit: can return err with the sprintf. There's no chance of that
panicking because name and fieldValue are strings (non-nillable).
err := g.EnsureDeclared(`
...
`, ...)
return fmt.Sprintf(...), err
gen/map.go
Outdated
g Generator, | ||
keySpec compile.TypeSpec, | ||
keyVar string, | ||
valueSpec compile.TypeSpec, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional: instead of passing keySpec and valueSpec, can pass in just the MapSpec.
gen/map.go
Outdated
valueSpec compile.TypeSpec, | ||
valueVar string, | ||
) (string, error) { | ||
name := fmt.Sprintf("_MapItem_%s_%s_Zapper", g.MangleType(keySpec), g.MangleType(valueSpec)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional: can mangle the MapSpec and append _Item_Zapper
to that instead.
gen/zap.go
Outdated
|
||
// Containers | ||
case *compile.MapSpec: | ||
switch t.KeySpec.(type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compile.RootTypeSpec(t.KeySpec)
because the map key may be a typedef of a
string. If we don't have tests for this, we should add a test case to
typedefs.thrift.
gen/zap.go
Outdated
// zapMarshaler takes a TypeSpec, evaluates whether there are underlying elements | ||
// that require more Zap implementation to log everything, and returns a string | ||
// that properly casts the fieldValue, if needed, for logging. | ||
func (z *zapGenerator) zapMarshaler(g Generator, spec compile.TypeSpec, fieldValue string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Can you add,
// This should be used in conjunction with zapEncoder,
//
// v := ...
// enc.Add<zapEncoder .Type>("foo", <zapMarshaler .Type "v">)
// There is no AppendBinary for ArrayEncoder, so we opt for encoding it ourselves and | ||
// appending it as a string. We also use AddString instead of AddBinary for ObjectEncoder | ||
// for consistency. | ||
base64 := g.Import("encoding/base64") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akshayjshah FYI. All binary will be logged as base64. We will never rely on AddBinary because there's no AppendBinary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Left some optional nits & questions, but I'm really excited about this.
func (<$m> <.Name>) MarshalLogObject(<$enc> <$zapcore>.ObjectEncoder) error { | ||
for <$k>, <$v> := range <$m> { | ||
<zapEncodeBegin .Type.ValueSpec -> | ||
<$enc>.Add<zapEncoder .Type.ValueSpec>((string)(<$k>), <zapMarshaler .Type.ValueSpec $v>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If $k
is already a string, do we need the (string)
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be a typedef of a string or a typedef of a typedef of a string.
<range .UniqueItems -> | ||
case <.Value>: | ||
enc.AddString("name", "<.Name>") | ||
<end -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that there's no default
case in the switch because there's no safe default name, correct? If so, can we leave a comment in the generated code explaining that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick update: Comments in the code aren't propagated by the templating logic
we have here. Godocs are propagated, though, so @mh-park added a comment about
how enums are serialized as a godoc.
func (<$l> <.Name>) MarshalLogArray(<$enc> <$zapcore>.ArrayEncoder) error { | ||
for _, <$v> := range <$l> { | ||
<zapEncodeBegin .Type.ValueSpec -> | ||
<$enc>.Append<zapEncoder .Type.ValueSpec>(<zapMarshaler .Type.ValueSpec $v>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abhinav This is fine, but it has the minor downside of bailing out on the first error. This potentially leaves other errors lurking later in the array. How complex would it be to use multierr
here and collect all the errors in one pass?
Completely fine if we leave this for a follow-up (or never do it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'll be a bit complex because of the different corner cases with some
encoders returning errors, some not.
I'll create a follow-up task to explore that in the future but for v1 of this,
we can skip it IMO.
// MarshalLogArray implements zapcore.ArrayMarshaler, enabling | ||
// fast logging of <.Name>. | ||
func (<$m> <.Name>) MarshalLogArray(<$enc> <$zapcore>.ArrayEncoder) error { | ||
<- if isHashable .Type.KeySpec -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😭 Thrift 😭
// There is no AppendBinary for ArrayEncoder, so we opt for encoding it ourselves and | ||
// appending it as a string. We also use AddString instead of AddBinary for ObjectEncoder | ||
// for consistency. | ||
base64 := g.Import("encoding/base64") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦♂️ My fault, noted this in uber-go/zap#388.
Concurrent review :) Yeah, this is totally my fault - I added a note to zap's 2.0 wishlist about this. For now, it's fine as-is. |
This changes the generated MarshalLogObject for enums (thriftrw#366) to support the `go.label` annotation for enum items (thriftrw#363).
#366 added fast zap logging by generating extra `MarshalLog...` methods on generated types. For typedefs in thrift, other types that log them will cast to the underlying type in their `MarshalLog...` method, causing the generated code to need to import multiple go packages generated from both direct and indirect thrift dependencies. This should not be necessary, as typedefs of complex types (anything that cannot be mapped directly to a go stdlb type) get their own `MarshalLog...` methods, which cast to underlying type when necessary. This PR changes the logging for typedef'ed fields to: - cast to the underlying type inline when the underlying thrift type directly maps to a go type. - defer to the typedef's generated `MarshalLog...` method otherwise. This results in having more predictable importpaths (eg knowing that one thrift import will lead to one go import in generated code) makes implementing good bazel rules for thriftrw easier. See internal issue GO-362 for a more elaborate example of how the current behaviour complicates our Bazel setup.
#366 added fast zap logging by generating extra `MarshalLog...` methods on generated types. For typedefs in thrift, other types that log them will cast to the underlying type in their `MarshalLog...` method, causing the generated code to need to import multiple go packages generated from both direct and indirect thrift dependencies. This should not be necessary, as typedefs of complex types (anything that cannot be mapped directly to a go stdlb type) get their own `MarshalLog...` methods, which cast to underlying type when necessary. This PR changes the logging for typedef'ed fields to: - cast to the underlying type inline when the underlying thrift type directly maps to a go type. - defer to the typedef's generated `MarshalLog...` method otherwise. This results in having more predictable importpaths (eg knowing that one thrift import will lead to one go import in generated code) makes implementing good bazel rules for thriftrw easier. See internal issue GO-362 for a more elaborate example of how the current behaviour complicates our Bazel setup.
#366 added fast zap logging by generating extra `MarshalLog...` methods on generated types. For typedefs in thrift, other types that log them will cast to the underlying type in their `MarshalLog...` method, causing the generated code to need to import multiple go packages generated from both direct and indirect thrift dependencies. This should not be necessary, as typedefs of complex types (anything that cannot be mapped directly to a go stdlb type) get their own `MarshalLog...` methods, which cast to underlying type when necessary. This PR changes the logging for typedef'ed fields to: - cast to the underlying type inline when the underlying thrift type directly maps to a go type. - defer to the typedef's generated `MarshalLog...` method otherwise. This results in having more predictable importpaths (eg knowing that one thrift import will lead to one go import in generated code) makes implementing good bazel rules for thriftrw easier. See internal issue GO-362 for a more elaborate example of how the current behaviour complicates our Bazel setup.
#366 added fast zap logging by generating extra `MarshalLog...` methods on generated types. For typedefs in thrift, other types that log them will cast to the underlying type in their `MarshalLog...` method, causing the generated code to need to import multiple go packages generated from both direct and indirect thrift dependencies. This should not be necessary, as typedefs of complex types (anything that cannot be mapped directly to a go stdlib type) get their own `MarshalLog...` methods, which cast to underlying type when necessary. This PR changes the logging for typedef'ed fields to: - cast to the underlying type inline when the underlying thrift type directly maps to a go type. - defer to the typedef's generated `MarshalLog...` method otherwise. This results in having more predictable importpaths (eg knowing that one thrift import will lead to one go import in generated code) makes implementing good bazel rules for thriftrw easier. See internal issue GO-362 for a more elaborate example of how the current behaviour complicates our Bazel setup. Co-authored-by: Abhinav Gupta <abg@uber.com>
#366 added fast zap logging by generating extra `MarshalLog...` methods on generated types. For typedefs in thrift, other types that log them will cast to the underlying type in their `MarshalLog...` method, causing the generated code to need to import multiple go packages generated from both direct and indirect thrift dependencies. This should not be necessary, as typedefs of complex types (anything that cannot be mapped directly to a go stdlib type) get their own `MarshalLog...` methods, which cast to underlying type when necessary. This PR changes the logging for typedef'ed fields to: - cast to the underlying type inline when the underlying thrift type directly maps to a go type. - defer to the typedef's generated `MarshalLog...` method otherwise. This results in having more predictable importpaths (eg knowing that one thrift import will lead to one go import in generated code) makes implementing good bazel rules for thriftrw easier. See internal issue GO-362 for a more elaborate example of how the current behaviour complicates our Bazel setup. Co-authored-by: Abhinav Gupta <abg@uber.com>
(This supersedes #361.)
This adds a
zapcore.MarshalLogObject/Array
method to all generated structs and its underlying components, enums, and non-primitive typedefs, implementing Zap marshalers (i.e. ObjectMarshaler, ArrayMarshaler). For typedefs of primitives, the logging is up to the user, simply by casting it down to the root type and using the respectiveAdd/Append...
method of the Zap encoder.