
[DNM] Horizon k8s cluster logging #399

Open
wants to merge 1 commit into base: main

Conversation

mcgonago

Add support to Horizon operator for k8s cluster logging.

@openshift-ci openshift-ci bot requested review from dprince and viroel December 17, 2024 22:23
Contributor

openshift-ci bot commented Dec 17, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mcgonago
Once this PR has been reviewed and has the lgtm label, please assign dprince for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

func GetLogVolumeMount() corev1.VolumeMount {
	return corev1.VolumeMount{
		Name:      logVolume,
		MountPath: "/var/log/manila",
Collaborator

/var/log/horizon ?

// Horizon is the global ServiceType that refers to all the components deployed
// by the horizon-operator
Horizon storage.PropagationType = "Horizon"

//LogFile -
LogFile = "/var/log/horizon/horizon.log"
Collaborator

Are we also planning to capture Apache logs as part of this? Or do you feel that just the Horizon application logs are sufficient for any support / debugging requirements your team may have?

Current logging for Apache is just to the stdout of the container, so that might be sufficient:
https://github.com/openstack-k8s-operators/horizon-operator/blob/main/templates/horizon/config/httpd.conf#L25-L26

It just means logs will be lost when/if the pod is evicted from a node.

This can probably be a separate PR and topic, but just asking to make sure it has been considered.

Collaborator

The other thing about writing it to a specific log file is that we won't be able to see the logs simply with oc logs on the Horizon pod. So we will also need to add a sidecar container to the pod which will just run tail -f /var/log/horizon/horizon.log. That architecture is defined under this section:
https://kubernetes.io/docs/concepts/cluster-administration/logging/#sidecar-container-with-logging-agent

We can just add a new container to the pod called horizon-logs or something. That way, users will be able to clearly tell which container they can check to get Horizon Django application logs.
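Something along these lines would do it; just a rough sketch assuming the logVolume and LogFile constants from this patch and the existing corev1 import, with the helper name, image handling, and tail flags being illustrative rather than anything from this PR:

// LogSidecarContainer is a sketch of the suggested "-log" sidecar: it only
// tails the Horizon log file so `oc logs <pod> -c horizon-log` (or whatever
// the container ends up being called) shows the Django application logs.
func LogSidecarContainer(name string, image string) corev1.Container {
	return corev1.Container{
		Name:    name + "-log",
		Image:   image,
		Command: []string{"/usr/bin/tail", "-n+1", "-F", LogFile},
		VolumeMounts: []corev1.VolumeMount{
			{
				Name:      logVolume,
				MountPath: "/var/log/horizon",
				ReadOnly:  true,
			},
		},
	}
}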

@mcgonago
Author

recheck

Comment on lines +101 to +107
func GetLogVolumeMount() corev1.VolumeMount {
	return corev1.VolumeMount{
		Name:      logVolume,
		MountPath: "/var/log/horizon",
		ReadOnly:  false,
	}
}
Collaborator

It would be easier to return a slice of corev1.VolumeMount here. Then you can just call it from within the pod definition above without having to append the result to a slice.

Suggested change

-func GetLogVolumeMount() corev1.VolumeMount {
-	return corev1.VolumeMount{
-		Name:      logVolume,
-		MountPath: "/var/log/horizon",
-		ReadOnly:  false,
-	}
-}
+func GetLogVolumeMount() []corev1.VolumeMount {
+	return []corev1.VolumeMount{
+		{
+			Name:      logVolume,
+			MountPath: "/var/log/horizon",
+			ReadOnly:  false,
+		},
+	}
+}

RunAsUser: &runAsUser,
},
Env: env.MergeEnvs([]corev1.EnvVar{}, envVars),
VolumeMounts: []corev1.VolumeMount{GetLogVolumeMount()},
Collaborator

With the change suggested to the GetLogVolumeMount() function below, you can just call the function here:

Suggested change

-VolumeMounts: []corev1.VolumeMount{GetLogVolumeMount()},
+VolumeMounts: GetLogVolumeMount(),

@@ -146,6 +162,7 @@ func Deployment(
},
Env: env.MergeEnvs([]corev1.EnvVar{}, envVars),
VolumeMounts: volumeMounts,
[]corev1.VolumeMount{GetLogVolumeMount()}...),
Collaborator

Same here, you can just call the function once it's changed to return a slice.

@@ -158,6 +175,9 @@ func Deployment(
},
},
}
deployment.Spec.Template.Spec.Volumes = append(GetVolumes(
instance.Name,
instance.Spec.ExtraMounts), GetLogVolume())
Collaborator

The suggested change to GetLogVolume() means that this would require merging the two slices at this point though.

The other way you could do it is to keep the function the same and then just append this to the volumeMounts variable defined on line 99, so that everything uses the same mounts.

I would probably opt for the first option though, just to keep the mounts minimal on the log pod. It just means you'll need to merge the slices here, rather than appending, since appending would now give you a slice of slices.
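If you do go with slice-returning helpers, the merge here is just an append with the spread operator; a rough sketch, assuming GetLogVolume() is changed to return []corev1.Volume:

// "..." spreads GetLogVolume() into individual elements; appending the slice
// itself would produce a slice of slices and fail to compile.
deployment.Spec.Template.Spec.Volumes = append(GetVolumes(
	instance.Name,
	instance.Spec.ExtraMounts), GetLogVolume()...)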

Collaborator

Actually, sorry, just realised this is the Volume, not the VolumeMount. The only thing missing here is the HorizonPropagation variable, so it should be:

Suggested change

-instance.Spec.ExtraMounts), GetLogVolume())
+instance.Spec.ExtraMounts, HorizonPropagation), GetLogVolume())

@@ -158,6 +175,9 @@ func Deployment(
},
},
}
deployment.Spec.Template.Spec.Volumes = append(GetVolumes(
Collaborator

getVolumes? GetVolumes is undefined.

Comment on lines 910 to 912
templateParameters := map[string]interface{}{
	"LogFile": horizon.LogFile,
}
Collaborator

Oh, this will override the entire templateParameters and you will just end up with LogFile in there. So you want to just set the LogFile key:

Suggested change

-templateParameters := map[string]interface{}{
-	"LogFile": horizon.LogFile,
-}
+templateParameters["LogFile"] = horizon.LogFile

Contributor

@fmount fmount left a comment

Other than what @bshephar already suggested, to have a green run we should:

  1. rebase the patch on main as a lot of things changed because of the Topology related patches
  2. add the missing GetLogVolume() and GetLogVolumeMount() to volume and volumeMount of deployment.go

I can see envTests passing locally after the two changes above:

Will run 29 of 29 specs
•••••••••••••••••••••••••••••
W0228 14:45:07.689610  507081 reflector.go:462] pkg/mod/k8s.io/client-go@v0.29.14/tools/cache/reflector.go:229: watch of *v1beta1.Horizon ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
W0228 14:45:07.689610  507081 reflector.go:462] pkg/mod/k8s.io/client-go@v0.29.14/tools/cache/reflector.go:229: watch of *v1beta1.Topology ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding


Ran 29 of 29 Specs in 18.622 seconds
SUCCESS! -- 29 Passed | 0 Failed | 0 Pending | 0 Skipped
PASS

@@ -109,6 +109,22 @@ func Deployment(
Spec: corev1.PodSpec{
ServiceAccountName: instance.RbacResourceName(),
Containers: []corev1.Container{
// the first container in a pod is the default selected
Contributor

I don't know how to comment on different lines other than the modified ones, but L90 and L91 should be:

volumes := append(getVolumes(instance.Name, instance.Spec.ExtraMounts, HorizonPropagation), GetLogVolume())
volumeMounts := append(getVolumeMounts(instance.Spec.ExtraMounts, HorizonPropagation), GetLogVolumeMount())

@@ -132,6 +148,27 @@ func Deployment(
},
},
}
deployment.Spec.Template.Spec.Volumes = getVolumes(instance.Name, instance.Spec.ExtraMounts, HorizonPropagation)
/*
* +++owen - when looking at how manila did this - we see care taken with a GetVolumes that handles two pods of the same service?
Contributor

remove the comment from L153 to L160

// If possible two pods of the same service should not
// run on the same worker node. If this is not possible
// the get still created on the same worker node.
deployment.Spec.Template.Spec.Affinity = affinity.DistributePods(
Contributor

remove this function call, as the Topology support has already merged

// the first container in a pod is the default selected
// by oc log so define the log stream container first.
{
Name: instance.Name + "-log",
Contributor

you need to update envTests to point to Containers[1] when checking Horizon, and not the sidecar with the logs.

--- a/tests/functional/horizon_controller_test.go
+++ b/tests/functional/horizon_controller_test.go
@@ -580,7 +580,7 @@ var _ = Describe("Horizon controller", func() {
                        th.AssertVolumeExists(CABundleSecretName, d.Spec.Template.Spec.Volumes)
                        th.AssertVolumeExists(InternalCertSecretName, d.Spec.Template.Spec.Volumes)

-                       svcC := d.Spec.Template.Spec.Containers[0]
+                       svcC := d.Spec.Template.Spec.Containers[1]

                        // check TLS volume mounts
                        th.AssertVolumeMountExists(CABundleSecretName, "tls-ca-bundle.pem", svcC.VolumeMounts)

According to [1], it looks like good practice to introduce a sidecar container
for logging purposes. By doing this we can rotate, process, and even forward
logs elsewhere without affecting the original service pod.

[1] https://kubernetes.io/docs/concepts/cluster-administration/logging/

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
Contributor

openshift-ci bot commented Feb 28, 2025

@mcgonago: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/horizon-operator-build-deploy-kuttl
Commit: 9c13c9f
Required: true
Rerun command: /test horizon-operator-build-deploy-kuttl

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
