Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Modification of the pod observation #14

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

janLo
Copy link

@janLo janLo commented Dec 16, 2020

This is mainly a RFC as I'm aware that it changes the current behavior quite a bit - and I'm fine to keep it as a fork. Still I'm curious what others think of these changes:

  • I've removed the PodFailed branch from the pod observation as in my case, when I hit this branch, the reason and message was always empty. So I got a lot of events "Pod/xyz: " without any further helpful metadata. If there is a pod-change I always search for terminated containers.
  • Every failed container causes a sentry event now and not only the first one. They will be grouped by a common fingerprint though.
  • I've used the generateName field instead of uuid for the fingerprint (if the object has no owner) as this is the stable for a lot of solutions working with some kind of pod template (in my case Jenkins and Gitlab-Runner). If the generateName is not provided it uses the name in a mangled form like the sentry-kubernetes implementation did to allow better grouping.
  • I took the message out of the fingerprints as its often containing very specific information like the container ids preventing any grouping.

The podRailed case seemed pretty useless to me as the message and reason
was always empty so I got a lot of meaningless events. I also think that
any Pod-Failure that is not due to failed containers will show up as
events and get better reported as such.

Another disadvantage of the former implementation was, that it reported
only ever one container failure per pod.
@janLo janLo changed the title Refactor pod observation Modification of the pod observation Dec 16, 2020
janLo and others added 11 commits December 17, 2020 10:30
This introduces a name mangleing like the sentry-kubernets
implementation does for more aggressive issue grouping.
The message often contains unique information like container-ids or node
names causing basically no grouping to happen.
…VENT_LEVELS...

- Leaving SKIP_EVENT_LEVELS unset
  Defaults to old behavior, "normal" type events will be skipped

- Setting SKIP_EVENT_LEVELS to empty
  No event type will be skipped

- Setting SKIP_EVENT_LEVELS to a comma separated event type list like "SKIP_EVENT_LEVELS=normal,warning"
  All events of those types will be skipped. The filtering mechanism is case-insensitive, so "normal,warning" is the same as "nOrMal,WARning".
Fix Pod Term Detection and Make SkipLevels Configurable
feat: favor a namespace's skipLevel annotation for the global skipLevel env
added:
- NS annotation secunet.sentry/skip-event-reasons:
  allow configuration of skip event reasons per namespace

- env SKIP_EVENT_REASONS:
  global skip by reason config if NS has no skip config annotation

- Pod annotation secunet.sentry/ignore-pod-updates=true:
  allows for suppression of pod update event handling through the forwarder

changed:
- skip config declaration format now supports specifying a resource type before a criteria:
  [involved object type:]criteria[,...]
  E.g. normal,Pod:warning,Service:error
  See: parseSkipConfig(...) at skip.go:71
feat: skip by reason and skip pod updates...
@salzig
Copy link

salzig commented Jul 28, 2021

@wichert what's your opionion on this?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants