Add Alternator TTL metrics to Alternator dashboard #1783
Comments
I tried to see those metrics with 5.1 and 2022.2 and couldn't. Is there something specific I should do? What should the user look for in a metric?
As you can see in the above-linked commit, it did reach 5.1.
@nyh my point was that the metrics were missing, not that they were zero. Please see my other comment about the user perspective; the main question is whether it will be helpful, and how.
You're right. I checked, and today, for the "expiration service" to start at all, you need (1) Alternator to be enabled (alternator port configured) and (2) the TTL experimental feature to be turned on. If either of these isn't on, the "expiration service" is never started, and it never registers these Alternator TTL metrics. Is this a problem? I was under the assumption that a missing metric is basically the same thing as a zero metric, especially after your recent patch which (if I remember correctly) drops zero metrics from the output.
That's a good question. Here is what I think:
Maybe the |
Most of the time it's fine not to report counters that are never used. After enabling and running the test I got: My option to remove empty counters is done explicitly, but the idea is the same: don't report what is not needed.
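To make the "missing versus zero" distinction discussed above concrete, here is a minimal sketch that classifies metrics in a Prometheus text-format scrape as missing, zero, or nonzero. The function name `classify_metrics` and the metric names in the sample are assumptions made for illustration, not names taken from this thread; substitute whatever names your Scylla version actually exports.

```python
from typing import Dict, Iterable


def classify_metrics(exposition_text: str, names: Iterable[str]) -> Dict[str, str]:
    """Classify each requested metric as 'missing', 'zero' or 'nonzero'.

    Values are summed over all label sets (for example, over all shards).
    """
    totals: Dict[str, float] = {}
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # Sample with labels: name{label="x"} value [timestamp]
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            # Sample without labels: name value [timestamp]
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            value = float(fields[0])
        except ValueError:
            continue  # not a numeric sample, ignore it
        totals[name] = totals.get(name, 0.0) + value

    result: Dict[str, str] = {}
    for name in names:
        if name not in totals:
            result[name] = "missing"  # never registered, e.g. service not started
        elif totals[name] == 0:
            result[name] = "zero"     # registered, but nothing counted yet
        else:
            result[name] = "nonzero"
    return result


# Hypothetical scrape output and metric names, for illustration only.
SAMPLE = """\
# TYPE scylla_expiration_scan_passes counter
scylla_expiration_scan_passes{shard="0"} 3
scylla_expiration_scan_passes{shard="1"} 2
scylla_expiration_items_deleted{shard="0"} 0
"""

status = classify_metrics(
    SAMPLE,
    [
        "scylla_expiration_scan_passes",
        "scylla_expiration_items_deleted",
        "scylla_expiration_scan_table",
    ],
)
```

A dashboard panel treats "missing" and "zero" alike in most cases, but a check like this can tell a user whether the expiration service was ever started.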
Two of the other metrics, |
This is how I run it: There's only one alternator |
@amnonh I know what happened :-) The tests in test_ttl.py are all very slow, so they are skipped by default; you need to add the "--runveryslow" option to pytest to actually run them :-) I just wrote a test that verifies that these two metrics actually work when an item expires. The new test takes around one second. I think I'll put it in, and also consider reducing the TTL scanning period to below one second to make these tests even faster. I'll open an issue about these tests being skipped.
Fixed by #1782 |
…metrics)' from Nadav Har'El

We had quite a few tests for Alternator TTL in test/alternator, but most of them did not run as part of the usual Jenkins test suite, because they were considered "very slow" (and require a special "--runveryslow" flag to run).

In this series we enable six tests which run quickly enough to run by default, without an additional flag. We also make them even quicker - the six tests now take around 2.5 seconds.

I also noticed that we don't have a test for the Alternator TTL metrics - and added one.

Fixes #11374.
Refs scylladb/scylla-monitoring#1783

Closes #11384

* github.com:scylladb/scylladb:
  test/alternator: insert test names into Scylla logs
  rest api: add a new /system/log operation
  alternator ttl: log warning if scan took too long.
  alternator,ttl: allow sub-second TTL scanning period, for tests
  test/alternator: skip fewer Alternator TTL tests
  test/alternator: test Alternator TTL metrics
In Scylla commit scylladb/scylladb@c262309 we added four metrics for the new Alternator TTL feature. The Alternator TTL feature runs background threads which look for expired items and delete them, and these metrics can be used to see that these threads have indeed been running, how often they scanned the table, how many items got deleted, etc. (the commit message linked above contains a longer description of each metric).
We should probably (?) add an Alternator TTL tab in the Alternator dashboard, with these metrics.
Please note that Alternator TTL is currently an "experimental" feature, so in the default case, all these metrics will be zero. I don't know how this experimental-ness should, or should not, affect the design of the monitoring dashboard.