Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

chore: add cluster, nodes that are sending incorrect host metrics #6969

Merged
merged 4 commits into from
Jan 29, 2025

Conversation

srikanthccv
Copy link
Member

@srikanthccv srikanthccv commented Jan 29, 2025

Summary

Part of #6890

In our old charts, we were sending the metrics of the container where the agent is running instead of the host machine it is running. We ask users to upgrade to get the correct results. However, that info is not very useful if they upgraded in some clusters but not in others. This change adds cluster and node names that are sending the incorrect host metrics so they can know which cluster needs to be updated.


Important

Add functionality to identify clusters and nodes sending incorrect host metrics by updating IsSendingK8SAgentMetrics and HostListResponse.

  • Behavior:
    • IsSendingK8SAgentMetrics in hosts.go now returns lists of cluster and node names sending incorrect host metrics.
    • Updates GetHostList in hosts.go to populate ClusterNames and NodeNames in the response.
  • Models:
    • Adds ClusterNames and NodeNames fields to HostListResponse in infra.go.

This description was created by Ellipsis for 5f3524e. It will automatically update as commits are pushed.

@github-actions github-actions bot added the chore label Jan 29, 2025
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 5f3524e in 1 minute and 20 seconds

More details
  • Looked at 98 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. pkg/query-service/app/inframetrics/hosts.go:336
  • Draft comment:
    Avoid adding non-ClickHouse related logic to the ClickHouseReader interface. Use the DAO in the telemetry instance for such logic instead. This is also applicable to the DidSendHostMetricsData function.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The functions in question are actually doing ClickHouse-specific queries - they're querying metric tables with ClickHouse-specific SQL syntax. There isn't really any non-ClickHouse logic here that could be moved to a DAO. The functions are appropriately placed in this repo since they deal directly with querying ClickHouse for metrics data.
    Could there be some business logic mixed in with the ClickHouse queries that I'm missing? The functions do some data processing after getting results.
    The post-processing is minimal and directly related to the ClickHouse query results - just transforming the raw data into the expected return format. This is typical for a repository layer.
    The comment should be deleted because it incorrectly suggests moving ClickHouse-specific query logic out of the ClickHouseReader interface.

Workflow ID: wflow_pjwrefrk46Ec6Hx4


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

pkg/query-service/app/inframetrics/hosts.go Outdated Show resolved Hide resolved
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
@srikanthccv srikanthccv enabled auto-merge (squash) January 29, 2025 14:07
@srikanthccv srikanthccv merged commit 88084af into main Jan 29, 2025
15 of 16 checks passed
@srikanthccv srikanthccv deleted the missing-data-hosts branch January 29, 2025 14:15
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants