Consider adding a config option to specify failure recovery policy #112
cam-schultz started this conversation in Ideas
Replies: 1 comment
-
What if you could specify a config value per chain for |
-
Context and scope
AWM Relayer spins up a listener goroutine for each source chain specified in the config. Currently, when an unrecoverable error occurs in any of these goroutines, the application is marked as unhealthy, which most often results in the application being killed and restarted.
If the cause of the unrecoverable error is isolated to a single chain (e.g. the configured API node for that chain becomes unreachable), the relayer is still marked unhealthy as a whole. This may be desirable for some use cases, but in others there may be flexibility to allow for downtime on one chain while still relaying between the others.
Discussion and alternatives
One possible solution would be to add a per-source-chain configuration option to specify a failure policy, e.g. `kill_on_error | allow_failure`. For example, a user could specify that the relayer should cease to function altogether if Chain A becomes unreachable (or otherwise produces critical relayer errors) by setting the Chain A config to `kill_on_error`, but allow Chain B to fail without interrupting the rest of the relayer process by specifying `allow_failure`.
For the `allow_failure` case, we should make it very obvious that the relayer is in a valid but not fully functional state.