Dashboard forwarding from cos-configuration-k8s can be unreliable #312

Open
Batalex opened this issue Mar 22, 2024 · 2 comments
Comments

@Batalex

Batalex commented Mar 22, 2024

Bug Description

Custom dashboards from cos-configuration-k8s are not appearing in the Grafana interface.

I managed to pinpoint the source of the issue to this charm because the custom dashboards are present in the relation databag, as well as in the grafana container.

juju ssh --container grafana grafana/0 ls -1 /etc/grafana/provisioning/dashboards

default.yaml
juju_alertmanager-k8s_e9224b0.json
juju_cos-configuration-k8s_043a2b3.json
juju_cos-configuration-k8s_af3132d.json
juju_grafana-agent_0def0c2.json
juju_grafana-agent_6545430.json
juju_grafana-agent_ab32508.json
juju_grafana-agent_feefa09.json
juju_loki-k8s_0804127.json
juju_prometheus-k8s_35dd368.json
self_dashboard.json

Note that two cos-configuration-k8s files are present in the output above, yet the corresponding dashboards do not appear in Grafana.
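
To make this check easier to repeat, here is a minimal sketch that groups the provisioned files by the charm that supplied them. It assumes the `juju_<charm>_<hash>.json` naming seen in the listing above; the function name `group_by_charm` and the regular expression are my own and do not come from the charm's code.

```python
import re
import sys
from collections import defaultdict

# Assumed pattern, inferred from the listing above: juju_<charm>_<short-hash>.json
PROVISIONED_NAME = re.compile(r"^juju_(?P<charm>.+)_(?P<digest>[0-9a-f]+)\.json$")


def group_by_charm(listing: str) -> dict[str, list[str]]:
    """Group provisioned dashboard file names by the charm that supplied them.

    Names that do not follow the assumed pattern (for example default.yaml
    or self_dashboard.json) are ignored.
    """
    groups = defaultdict(list)
    for line in listing.splitlines():
        name = line.strip()  # also drops a trailing "\r" from ssh output
        match = PROVISIONED_NAME.match(name)
        if match:
            groups[match.group("charm")].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Reads the `ls -1` output from stdin and prints one summary line per charm.
    for charm, files in sorted(group_by_charm(sys.stdin.read()).items()):
        print(f"{charm}: {len(files)} file(s)")
```

Saved under a hypothetical name such as `group_dashboards.py`, it can be fed the output of the `juju ssh ... ls -1` command above through a pipe. It only reports what is on disk in the container, so the result still has to be compared by hand against what Grafana actually displays.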

Scaling grafana up and then back down sometimes resolves the issue, but it is not a reliable fix.

To Reproduce

I have not been able to find a way to consistently reproduce the issue. However, in every case where it occurred, I had multiple grafana agents related to the monitoring stack.

COS - juju export bundle
bundle: kubernetes
saas:
  remote-8ae57c5a420b4e8c889fd8eba6c28be9: {}
  remote-57789c2419f64cb8874a0822ebaa787b: {}
applications:
  alertmanager:
    charm: alertmanager-k8s
    channel: stable
    revision: 101
    resources:
      alertmanager-image: 87
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,2048M
    trust: true
  catalogue:
    charm: catalogue-k8s
    channel: stable
    revision: 33
    resources:
      catalogue-image: 32
    scale: 1
    options:
      description: "Canonical Observability Stack Lite, or COS Lite, is a light-weight,
        highly-integrated, \nJuju-based observability suite running on Kubernetes.\n"
      tagline: Model-driven Observability Stack deployed with a single command.
      title: Canonical Observability Stack
    constraints: arch=amd64
    trust: true
  cos-configuration-k8s:
    charm: cos-configuration-k8s
    channel: stable
    revision: 45
    resources:
      git-sync-image: 32
    scale: 1
    options:
      git_branch: main
      git_repo: https://github.com/batalex/cos-rules
      grafana_dashboards_path: grafana/dashboards/
      prometheus_alert_rules_path: rules/
    constraints: arch=amd64
    storage:
      content-from-git: kubernetes,1,1024M
    trust: true
  grafana:
    charm: grafana-k8s
    channel: stable
    revision: 105
    resources:
      grafana-image: 68
      litestream-image: 43
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,2048M
    trust: true
  loki:
    charm: loki-k8s
    channel: stable
    revision: 118
    resources:
      loki-image: 91
    scale: 1
    constraints: arch=amd64
    storage:
      active-index-directory: kubernetes,1,2048M
      loki-chunks: kubernetes,1,10240M
    trust: true
  prometheus:
    charm: prometheus-k8s
    channel: stable
    revision: 170
    resources:
      prometheus-image: 139
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,10240M
    trust: true
  traefik:
    charm: traefik-k8s
    channel: stable
    revision: 169
    resources:
      traefik-image: 158
    scale: 1
    constraints: arch=amd64
    storage:
      configurations: kubernetes,1,1024M
    trust: true
relations:
- - traefik:ingress-per-unit
  - prometheus:ingress
- - traefik:ingress-per-unit
  - loki:ingress
- - traefik:traefik-route
  - grafana:ingress
- - traefik:ingress
  - alertmanager:ingress
- - prometheus:alertmanager
  - alertmanager:alerting
- - grafana:grafana-source
  - prometheus:grafana-source
- - grafana:grafana-source
  - loki:grafana-source
- - grafana:grafana-source
  - alertmanager:grafana-source
- - loki:alertmanager
  - alertmanager:alerting
- - prometheus:metrics-endpoint
  - traefik:metrics-endpoint
- - prometheus:metrics-endpoint
  - alertmanager:self-metrics-endpoint
- - prometheus:metrics-endpoint
  - loki:metrics-endpoint
- - prometheus:metrics-endpoint
  - grafana:metrics-endpoint
- - grafana:grafana-dashboard
  - loki:grafana-dashboard
- - grafana:grafana-dashboard
  - prometheus:grafana-dashboard
- - grafana:grafana-dashboard
  - alertmanager:grafana-dashboard
- - catalogue:ingress
  - traefik:ingress
- - catalogue:catalogue
  - grafana:catalogue
- - catalogue:catalogue
  - prometheus:catalogue
- - catalogue:catalogue
  - alertmanager:catalogue
- - grafana:grafana-dashboard
  - remote-57789c2419f64cb8874a0822ebaa787b:grafana-dashboards-provider
- - loki:logging
  - remote-57789c2419f64cb8874a0822ebaa787b:logging-consumer
- - prometheus:receive-remote-write
  - remote-57789c2419f64cb8874a0822ebaa787b:send-remote-write
- - grafana:grafana-dashboard
  - remote-8ae57c5a420b4e8c889fd8eba6c28be9:grafana-dashboards-provider
- - loki:logging
  - remote-8ae57c5a420b4e8c889fd8eba6c28be9:logging-consumer
- - prometheus:receive-remote-write
  - remote-8ae57c5a420b4e8c889fd8eba6c28be9:send-remote-write
- - cos-configuration-k8s:grafana-dashboards
  - grafana:grafana-dashboard
- - cos-configuration-k8s:prometheus-config
  - prometheus:metrics-endpoint
--- # overlay.yaml
applications:
  alertmanager:
    offers:
      alertmanager-karma-dashboard:
        endpoints:
        - karma-dashboard
        acl:
          admin: admin
  grafana:
    offers:
      grafana-dashboards:
        endpoints:
        - grafana-dashboard
        acl:
          admin: admin
  loki:
    offers:
      loki-logging:
        endpoints:
        - logging
        acl:
          admin: admin
  prometheus:
    offers:
      prometheus-receive-remote-write:
        endpoints:
        - receive-remote-write
        acl:
          admin: admin

Environment

  • multipass using charm-dev blueprint
  • juju version 3.1.7
  • COS stack deployed using cos-lite bundle

Relevant log output

unit-grafana-0: 14:44:51 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 14:49:04 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7f8f52db4090> returned None; continuing with tracing DISABLED.
unit-grafana-0: 14:49:05 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 14:53:29 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7fc7421142c0> returned None; continuing with tracing DISABLED.
unit-grafana-0: 14:53:29 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 14:57:51 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7fa9c039e220> returned None; continuing with tracing DISABLED.
unit-grafana-0: 14:57:52 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 15:02:21 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7fd1153ac2c0> returned None; continuing with

Additional context

No response

@lucabello
Contributor

We are probably missing a restart on that hook!

@michaeldmitry
Contributor

@Batalex
Can you please check whether the issue still exists in grafana-k8s edge 137 and, if you need it, grafana-agent edge 417?
