Document how to monitor warning logs #1

Closed
stealthybox opened this issue Oct 4, 2019 · 6 comments
Labels: documentation, enhancement

@stealthybox

Hi there 👋
Nice tool you have here!

I found this feature interesting:

a warning percentage: When your RAM usage hits this threshold, kubemem will log the warning

I thought the program might be doing this by creating an Event in the k8s API.
However, I found it logs the warning and returns 0. This is quick and simple:
https://github.com/16Bitt/kubemem/blob/14a1c13/main.c
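
As a rough illustration of the behavior described above (log a warning, still exit 0), here is a minimal shell sketch. It is not the code from main.c: the function name, the message format, and the separate failure threshold are assumptions made for the example.

```shell
# check_memory USED_PERCENT WARN_PERCENT FAIL_PERCENT
# Hypothetical helper: mimics a probe that warns without failing.
check_memory() {
  used="$1"
  warn="$2"
  fail="$3"
  if [ "$used" -ge "$fail" ]; then
    # A non-zero exit is what makes the kubelet treat the probe as failed.
    echo "failure: memory usage ${used}% >= ${fail}%"
    return 1
  fi
  if [ "$used" -ge "$warn" ]; then
    # The warning is only printed; the probe still reports success.
    echo "warning: memory usage ${used}% >= ${warn}%"
  fi
  return 0
}
```

Because the warning path exits 0, the kubelet sees a successful probe, which is why the message never reaches an Event.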

LivenessProbe logs get recorded by the Kubelet in an Event.
However, this happens only on failure, not on success.
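
For reference, those failure Events can be listed with a field selector on the Event reason. The sketch below assumes the kubelet's usual `Unhealthy` reason for failed probes; the function name and the `KUBECTL` override variable are hypothetical and exist only to allow a dry run.

```shell
# unhealthy_events POD [NAMESPACE]
# Lists probe-failure Events for one pod.
# KUBECTL is a hypothetical override hook (e.g. KUBECTL=echo for a dry run).
unhealthy_events() {
  pod="$1"
  namespace="${2:-default}"
  "${KUBECTL:-kubectl}" get events \
    --namespace "$namespace" \
    --field-selector "involvedObject.name=${pod},reason=Unhealthy"
}
```

A successful probe that only prints a warning produces no such Event, so this query returns nothing for it.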

There aren't many useful places probe logs actually end up.
They don't show up in container logs.
This stackoverflow response is still effectively correct today:
https://stackoverflow.com/a/34599554

Modern k8s now supports creating a Warning Event for probes:
https://github.com/kubernetes/kubernetes/blob/v1.16.0/pkg/kubelet/prober/prober.go#L123-L130
This wasn't true in 2016 when that SO answer was written.

However, the Exec Prober doesn't support returning probe.Warning, so a Warning Event can't be created:
https://github.com/kubernetes/kubernetes/blob/v1.16.0/pkg/probe/exec/exec.go#L41-L55

The kubelet starts logging all Exec Probe command output at `-v=4`, and that's the only way you could monitor those messages.

It might be worth documenting this?
Perhaps there is another logging/event mechanism I missed.

How are you using these warning messages at $work?
Are you collecting your kubelet logs in something like ElasticSearch/Datadog/Loki and then monitoring for them?

Cheers :)


Unrelated:
Starting in v1.16.1, probe output is limited to 10 KB:
kubernetes/kubernetes#82514
https://github.com/kubernetes/kubernetes/blob/v1.16.1/pkg/probe/exec/exec.go#L48-L72
(just a neat thing I learned)

@16Bitt
Owner

16Bitt commented Oct 4, 2019

Hi there! Thanks for pointing this out! I’ve primarily been using the logs for debugging the tool, but I had assumed the probe logs would show within the pod logs.

I’ll put together an option to manually create the event through the REST API, which may add a bit more complexity but would certainly be worth the investment.

As for the logging infrastructure we use at $work, we use Sumologic-fluentd to aggregate pod and API server logs. A significant drawback is that bumping up verbosity even one level in the cluster can rack up your bills super quickly, so I’d rather not burden users with a verbosity increase. Adding audit logging to our clusters added 2Gi a day in Sumo Logic.

I haven’t rolled this out in prod yet (I wrote this on my vacation) but I’m very certain a lot of changes will happen as I start to use this in production worker pods. Unfortunately my company doesn’t allow open source contributions, so this will have to be done after hours once I’m back in the office.

16Bitt added the documentation and enhancement labels Oct 4, 2019
16Bitt self-assigned this Oct 4, 2019
@stealthybox
Author

stealthybox commented Oct 7, 2019

Appending to PID 1 stdout might work:

echo warning >> /proc/1/fd/1

It's worth testing in a Pod to see.

That might be a sensible default behavior in a Pod, or it could be opt-in and shown in the example.
It should be possible to disable as it relies on the probe having write access.

A more generic flag might work too:

--log-file=/proc/1/fd/1  # append to PID 1 stdout

@stealthybox
Author

I tested the idea with a basic shell script and an exec, and it seems to work:

terminal 1

# Start a pod that prints some numbers
kubectl run testpid1 --image busybox -- sh -c 'for i in $(seq 1 3600); do sleep 1; echo $i; done'
kubectl logs -f deploy/testpid1

terminal 2

# Run a separate process in that pod that appends to the pod log once
kubectl exec -it deploy/testpid1 -- sh -c 'echo helloworld >> /proc/1/fd/1'

This seems simpler than creating Events.
The program can keep logging to both places so that the Failure Events still have logs.
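
A minimal sketch of logging to both places, assuming a shell wrapper rather than kubemem's actual implementation. The function name and the `LOG_FILE` variable are hypothetical. The append is best effort, since it depends on the probe process having write access to the target.

```shell
# log_warning MESSAGE
# LOG_FILE is a hypothetical setting; it defaults to PID 1's stdout.
log_warning() {
  message="$1"
  target="${LOG_FILE:-/proc/1/fd/1}"
  # Always write to our own stdout so the kubelet can still attach
  # the output to a failure Event.
  echo "$message"
  # Best effort: append to the target so the line shows up in the pod log.
  if ! { echo "$message" >> "$target"; } 2>/dev/null; then
    echo "could not write to $target" >&2
  fi
}
```

If the target is not writable, the probe's own output is unaffected and only a note is printed on stderr.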

@16Bitt
Owner

16Bitt commented Oct 7, 2019

Great suggestions! Thank you so much for your input!

Here's a PR that allows setting the logfile along with updated documentation: #2

It also fixes a really dumb bug on my part, as I was assuming that sysinfo accounted for cgroups.
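
To illustrate the cgroup point: sysinfo reports host-wide memory, while a container's limit and usage live in the cgroup filesystem. The sketch below reads the standard cgroup v2 files (`memory.current`, `memory.max`) and falls back to the v1 files (`memory.usage_in_bytes`, `memory.limit_in_bytes`). It is not the code from the PR; the function name and the optional root argument are assumptions for the example.

```shell
# cgroup_memory_percent [CGROUP_ROOT]
# Prints memory usage as an integer percentage of the cgroup limit.
# CGROUP_ROOT is a hypothetical parameter, mainly useful for testing.
cgroup_memory_percent() {
  root="${1:-/sys/fs/cgroup}"
  if [ -r "$root/memory.max" ]; then
    # cgroup v2 layout
    usage_file="$root/memory.current"
    limit_file="$root/memory.max"
  else
    # cgroup v1 layout
    usage_file="$root/memory/memory.usage_in_bytes"
    limit_file="$root/memory/memory.limit_in_bytes"
  fi
  usage="$(cat "$usage_file")" || return 1
  limit="$(cat "$limit_file")" || return 1
  # cgroup v2 reports "max" when no limit is set, so no percentage exists.
  if [ "$limit" = "max" ]; then
    return 1
  fi
  echo $(( usage * 100 / limit ))
}
```

On cgroup v1 an unlimited cgroup reports a very large number instead of "max", so the result is simply close to 0 in that case.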

@16Bitt
Owner

16Bitt commented Dec 30, 2019

Forgot to close this!

16Bitt closed this as completed Dec 30, 2019
@darioleanbit

Hi @16Bitt, if I try to run

# echo warning >> /proc/1/fd/1
bash: /proc/1/fd/1: Permission denied

I get this error. Is there any workaround? Thanks!
