Remove blob of zlib data from state machine errors #583

Merged May 5, 2024 (1 commit)

jb3 (Collaborator) commented May 4, 2024

Because of how zlib and the state machine interact, when the state machine
crashed with an error it would log the last blob it had received.

This blob cannot be decoded from an error trace because decoding depends on
the zlib context that received it. We also cannot decode it as an error
logging step: once a context has decoded the blob, it cannot be inflated
again.
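
To illustrate the context dependence, here is a minimal sketch using Python's standard `zlib` module (Nostrum itself uses Erlang's zlib, so this is only an analogy). It assumes the sender compresses every message with one shared stream and a sync flush; the message contents and the helper names `send` and `inflate_standalone` are made up for the example.

```python
import zlib

# One compressor for the whole connection: every message shares its state.
compressor = zlib.compressobj()


def send(message: bytes) -> bytes:
    """Compress one message and sync-flush so it can be inflated on arrival."""
    return compressor.compress(message) + compressor.flush(zlib.Z_SYNC_FLUSH)


# Hypothetical messages, for illustration only.
first_message = b'{"op": 10, "d": {"heartbeat_interval": 41250}}'
second_message = b'{"op": 11, "d": null}'
first_blob = send(first_message)
second_blob = send(second_message)

# A receiver that saw the stream from the start can inflate each blob in order.
receiver = zlib.decompressobj()
decoded_first = receiver.decompress(first_blob)
decoded_second = receiver.decompress(second_blob)


def inflate_standalone(blob: bytes):
    """Try to inflate a blob with a fresh context; return None if zlib refuses."""
    try:
        return zlib.decompressobj().decompress(blob)
    except zlib.error:
        return None


# The second blob carries no stream header of its own, so a fresh context
# (which is all someone reading an error trace has) cannot make sense of it.
standalone_result = inflate_standalone(second_blob)
```

Here `standalone_result` is `None`, while the receiver that kept its context recovers both messages. A blob copied out of a crash log is in the same position as `second_blob`.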

Removing the blob is the best solution here: it shrinks state machine
errors and makes them easier to read for users and the Nostrum team.

In state machine traces, the payload is now shown as "PAYLOAD REMOVED". If
the last event was not a payload, the event remains unmodified.
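
The rule above can be sketched roughly as follows. This is not the code from this PR (which lives in the Elixir state machine module); it is a hypothetical Python rendering in which Erlang tuples are modelled as Python tuples, and `redact_last_event` and `REDACTED` are names invented for the example.

```python
REDACTED = b"PAYLOAD REMOVED"


def redact_last_event(event):
    """Replace the binary payload of a gun_ws frame; leave anything else as-is.

    Expected shape, mirroring the trace: ("info", ("gun_ws", conn, stream, ("binary", blob)))
    """
    if isinstance(event, tuple) and len(event) == 2 and event[0] == "info":
        inner = event[1]
        if (
            isinstance(inner, tuple)
            and len(inner) == 4
            and inner[0] == "gun_ws"
            and isinstance(inner[3], tuple)
            and len(inner[3]) == 2
            and inner[3][0] == "binary"
        ):
            conn, stream = inner[1], inner[2]
            return ("info", ("gun_ws", conn, stream, ("binary", REDACTED)))
    # Not a websocket binary frame: return the event untouched.
    return event


# Hypothetical events, for illustration only.
payload_event = ("info", ("gun_ws", "conn_pid", "stream_ref", ("binary", b"\x78\x9c...")))
other_event = ("state_timeout", "send_heartbeat")

redacted = redact_last_event(payload_event)
untouched = redact_last_event(other_event)
```

Only the blob is swapped out; the connection and stream references stay in place, so the rest of the event is still useful for debugging.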

jb3 (Collaborator, Author) commented May 4, 2024

An example state machine trace now looks like the following (it's still long, but most of the content is now relevant):


21:51:19.695 shard=0 [error] ** State machine <0.471.0> terminating
** Last event = {info,{gun_ws,<0.473.0>,#Ref<0.1439088919.2704539650.19368>,
                              {binary,<<"PAYLOAD REMOVED">>}}}
** When server state  = {connected,#{stream =>
                                         #Ref<0.1439088919.2704539650.19368>,
                                     seq => nil,
                                     '__struct__' =>
                                         'Elixir.Nostrum.Struct.WSState',
                                     session => nil,
                                     gateway => <<"gateway.discord.gg">>,
                                     shard_num => 0,conn_pid => <0.471.0>,
                                     total_shards => 1,resume_gateway => nil,
                                     conn => <0.473.0>,
                                     zlib_ctx =>
                                         #Ref<0.1439088919.2704670722.19387>,
                                     last_heartbeat_ack =>
                                         #{microsecond => {447937,6},
                                           second => 19,
                                           calendar => 'Elixir.Calendar.ISO',
                                           month => 5,
                                           '__struct__' => 'Elixir.DateTime',
                                           day => 4,year => 2024,minute => 51,
                                           hour => 20,
                                           time_zone => <<"Etc/UTC">>,
                                           zone_abbr => <<"UTC">>,
                                           utc_offset => 0,std_offset => 0},
                                     heartbeat_ack => true,
                                     heartbeat_interval => 41250,
                                     last_heartbeat_send => nil}}
** Reason for termination = error:badarg
** Callback modules = ['Elixir.Nostrum.Shard.Session']
** Callback mode = [state_functions,state_enter]
** Stacktrace =
**  [{erlang,binary_to_atom,[0],[{error_info,#{module => erl_erts_errors}}]},
     {'Elixir.Nostrum.Shard.Session',connected,3,
                                     [{file,"lib/nostrum/shard/session.ex"},
                                      {line,258}]},
     {gen_statem,loop_state_callback,11,[{file,"gen_statem.erl"},{line,1395}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]
** Time-outs: {1,[{state_timeout,send_heartbeat}]}

jb3 force-pushed the jb3/statem-cleaner-errors branch from 5d81920 to 10988e0 on May 5, 2024 03:41
jchristgit (Collaborator) left a comment

Man, I LOVE this one!

@jb3 jb3 merged commit f7d1139 into master May 5, 2024
9 checks passed
@jb3 jb3 deleted the jb3/statem-cleaner-errors branch May 5, 2024 11:52