RIP 46 Observability Improvement for RocketMQ
- Current State: development
- Authors: SSpirits aaron-ai WentingYang
- Shepherds: yukon@apache.org lizhanhui@apache.org
- Mailing List discussion: dev@rocketmq.apache.org
- Pull Request:
- Released: no
- Will we add a new module? -- No.
- Will we add new APIs? -- No.
- Will we add new features? -- Yes.
Currently, the RocketMQ kernel doesn't support metrics natively. The RocketMQ community maintains a project, rocketmq-exporter, that collects broker runtime data and exports it to Prometheus. This project has several limitations:
- Inconvenience: users who want to build a monitoring system must deploy a standalone component.
- Performance: rocketmq-exporter collects metrics data using mqadmin tools, which puts additional pressure on the broker and client.
- Standard: rocketmq-exporter doesn't follow the Prometheus metrics naming convention, which confuses users who want to build their own monitoring system.
- Applicability: rocketmq-exporter only supports broker metrics. With the release of RocketMQ 5.0, more modules also need to expose metrics.
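To illustrate the naming point above, here is a minimal sketch of normalizing a raw statistic name into a Prometheus-style metric name. The function name `to_prometheus_name`, the `rocketmq` prefix, and the sample input names are assumptions made for illustration; this RIP does not define the actual metric names. The validation pattern is the metric-name grammar from the Prometheus data model.

```python
import re

# Metric-name grammar from the Prometheus data model.
PROMETHEUS_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def to_prometheus_name(raw: str, prefix: str = "rocketmq") -> str:
    """Convert a camelCase or dotted name into prefixed snake_case (hypothetical helper)."""
    # Insert an underscore at each lower-to-upper transition: "putTps" -> "put_Tps".
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", raw)
    # Replace every character outside [a-zA-Z0-9_] with an underscore, then lowercase.
    snake = re.sub(r"[^a-zA-Z0-9_]", "_", snake).lower()
    # Collapse repeated underscores and trim them from both ends.
    snake = re.sub(r"_+", "_", snake).strip("_")
    if not snake:
        raise ValueError(f"cannot derive a metric name from {raw!r}")
    # Add the prefix only when it is not already present.
    name = snake if snake.startswith(prefix + "_") else f"{prefix}_{snake}"
    if not PROMETHEUS_NAME.fullmatch(name):
        raise ValueError(f"not a valid Prometheus metric name: {name!r}")
    return name
```

For example, under these assumptions `to_prometheus_name("broker.putTps")` yields `rocketmq_broker_put_tps`.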
- Provide out-of-the-box metrics for the broker and proxy, with no need to deploy any other component.
- Conform to the community's observability standards.
- Offer more metrics with better accuracy.
- Redesign metrics. Embrace the OpenTelemetry specification and community.
- Implement a metrics architecture for RocketMQ modules, such as broker, proxy, etc.
- Support both pull mode (scraped by Prometheus) and push mode (sent to an OpenTelemetry collector) for retrieving metrics data.
- Maintain compatibility. Users of rocketmq-exporter should be able to migrate seamlessly to the new metrics system.
This RIP covers metrics only. Tracing and logging are out of scope.
Nothing specific.
This RIP does not involve major architectural changes. It implements a metrics manager, similar to BrokerStatsManager, to collect and export metrics. The metrics manager exports metrics in one of two ways: pull or push.
Pull mode is designed to be compatible with Prometheus. Typically, in a Kubernetes deployment, Prometheus can pull metrics data directly from an endpoint provided by the broker, so no additional components need to be deployed.
Push mode is the approach recommended by OpenTelemetry. It requires deploying a collector to forward metrics data.
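The two export paths can be sketched with a small in-process registry. This is an illustrative sketch only, not the design proposed by this RIP: the class `MetricsManager`, its methods, and the metric name used in the usage note are hypothetical, and the push transport is reduced to a caller-supplied callback rather than a real OpenTelemetry exporter. The pull side renders counters in the Prometheus text exposition format.

```python
from typing import Callable, Dict, List, Tuple

LabelSet = Tuple[Tuple[str, str], ...]


def _escape(value: str) -> str:
    # Label values escape backslash, double quote, and newline in the text format.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


class MetricsManager:
    """Hypothetical registry of counters with a pull path and a push path."""

    def __init__(self) -> None:
        self._counters: Dict[Tuple[str, LabelSet], float] = {}

    def increment(self, name: str, amount: float = 1.0, **labels: str) -> None:
        # Sort labels so the same label set always maps to the same series key.
        key = (name, tuple(sorted(labels.items())))
        self._counters[key] = self._counters.get(key, 0.0) + amount

    def render_prometheus(self) -> str:
        """Pull mode: the text body a scrape endpoint could return."""
        lines: List[str] = []
        seen = set()
        for (name, labels), value in sorted(self._counters.items()):
            if name not in seen:
                lines.append(f"# TYPE {name} counter")
                seen.add(name)
            label_text = ",".join(f'{k}="{_escape(v)}"' for k, v in labels)
            series = f"{name}{{{label_text}}}" if label_text else name
            lines.append(f"{series} {value}")
        return "\n".join(lines) + "\n" if lines else ""

    def push(self, send: Callable[[List[dict]], None]) -> None:
        """Push mode: hand a snapshot to a transport callback (assumed interface)."""
        snapshot = [
            {"name": name, "labels": dict(labels), "value": value}
            for (name, labels), value in sorted(self._counters.items())
        ]
        send(snapshot)
```

In this sketch, a broker would call `increment("rocketmq_messages_in_total", topic="orders")` on each message, serve `render_prometheus()` from an HTTP endpoint for pull mode, and call `push(...)` on a timer for push mode. Both paths read the same registry, so switching modes does not change how metrics are recorded.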
This RIP does not discuss the details of the metrics specification, which will be addressed in follow-up issues.
- Method signature changes -- Nothing specific.
- Method behavior changes -- Nothing specific.
- CLI command changes -- Nothing specific.
- Log format or content changes -- Nothing specific.
Some users currently rely on rocketmq-exporter, so the new metrics must remain compatible with existing usage. In addition, the control panel, such as Prometheus, is not necessarily deployed on the same network as the broker. It is therefore also worthwhile to design a proxy mode for rocketmq-exporter that provides access to the new metrics data.
We could implement an OpenTelemetry collector in rocketmq-exporter: the broker exports metrics data to rocketmq-exporter, and rocketmq-exporter provides a new endpoint for Prometheus to scrape.
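The proxy idea can be sketched as a relay that accepts pushed samples and re-exposes them for scraping. This is a simplified illustration under stated assumptions: the class `ExporterRelay`, its methods, the `broker` label, and the sample metric name are hypothetical, and a real implementation would receive data over the OpenTelemetry protocol rather than through a direct method call.

```python
from typing import Dict, Tuple


class ExporterRelay:
    """Hypothetical stand-in for rocketmq-exporter acting as a push receiver."""

    def __init__(self) -> None:
        # Latest value per (metric name, broker name).
        self._latest: Dict[Tuple[str, str], float] = {}

    def receive(self, broker: str, samples: Dict[str, float]) -> None:
        """Called when a broker pushes a batch; newer values overwrite older ones."""
        for name, value in samples.items():
            self._latest[(name, broker)] = value

    def scrape(self) -> str:
        """Text returned to Prometheus when it scrapes the relay's endpoint."""
        lines = [
            f'{name}{{broker="{broker}"}} {value}'
            for (name, broker), value in sorted(self._latest.items())
        ]
        return "\n".join(lines) + "\n" if lines else ""
```

With this shape, brokers only need outbound connectivity to the relay, and Prometheus only needs to reach the relay, which addresses the case where the two sit on different networks.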
We split this proposal into several tasks:
- Task1: Define the metrics specification.
- Task2: Implement metrics collector and exporter framework in broker and proxy.
- Task3: Develop metrics for broker and proxy.
- Task4: Backport new metrics to rocketmq-exporter.
- Task5: Other ecosystem work, such as documentation, Grafana dashboard templates, etc.
Keep the status quo, and continue to use rocketmq-exporter.
Copyright © 2016~2022 The Apache Software Foundation.