Add a graph for scylla_io_queue_flow_ratio #2306

vladzcloudius · 2024-06-03T22:16:36Z

System information

Scylla version (you are using): 2022.x+

Describe the feature and the current behavior/state.
In light of fixing scylladb/seastar#1641, a new metric was added: scylla_io_queue_flow_ratio.
Here is a patch with its description: scylladb/seastar@dd6b20d

Correlating other metrics with this value may be helpful when debugging I/O-related performance issues.

cc @xemul


amnonh · 2024-06-07T16:38:32Z

It has both mountpoint and iogroup labels. Do you want everything on the same panel?
This is an example from a three-node cluster.

What about aggregation? Naturally, the sum is meaningless, but do we want to have an option to aggregate?

vladzcloudius · 2024-06-07T17:09:13Z

> It has both mountpoint and iogroup labels. Do you want everything on the same panel? This is an example from a three-node cluster.
>
> What about aggregation? Naturally, the sum is meaningless, but do we want to have an option to aggregate?

AFAIU for this metric any aggregation is meaningless.
@xemul, could you please confirm?

amnonh · 2024-06-07T18:01:07Z

@vladzcloudius Naturally, there is no point in summing over it. Would max or min be helpful? Do you want to filter out values that are equal to 1?
I'm thinking about a cluster with a few hundred cores. How will you make sense of a graph like this?

vladzcloudius · 2024-06-07T18:11:38Z

> @vladzcloudius Naturally, there is no point in summing over it. Would max or min be helpful? Do you want to filter out values that are equal to 1? I'm thinking about a cluster with a few hundred cores. How will you make sense of a graph like this?

Good point.
A special thing about this metric is that we want to see the "outliers": the mins and maxes.
However, seeing a single maximum or minimum value is also not very useful.
Ideally we'd be able to toggle the sort order, so that one can see all the maximum values first and then all the minimum values.
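
To make the "several maxima, then several minima" idea concrete, here is a minimal Python sketch. It is not a Grafana feature or a dashboard query; the function name `extremes`, the shard labels, and the default `k=3` are assumptions made up for illustration.

```python
import heapq

def extremes(values_by_shard, k=3):
    """Return the k highest and k lowest flow-ratio readings.

    values_by_shard maps a hypothetical shard label to its current
    scylla_io_queue_flow_ratio value.
    """
    items = list(values_by_shard.items())
    # Largest values first, then smallest values first.
    highest = heapq.nlargest(k, items, key=lambda kv: kv[1])
    lowest = heapq.nsmallest(k, items, key=lambda kv: kv[1])
    return highest, lowest
```

Showing a handful of values from each end keeps the outliers visible without drawing one line per shard.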

amnonh · 2024-06-07T18:32:29Z

> Good point. A special thing about this metric is that we want to see the "outliers": the mins and maxes. However, seeing a single maximum or minimum value is also not very useful. Ideally we'd be able to toggle the sort order, so that one can see all the maximum values first and then all the minimum values.

Sorry, still no easy way to play with the sort order in Grafana.

I want to include this panel in the next release, but I'm afraid that, as is, it will not be useful with so many lines in the graph. If we do not come up with a good idea, I will add it anyway, but I hope we can find something better.

If you only care about the outliers, I can show the average of everything as a baseline (maybe per mountpoint and iogroup), plus anything that falls outside of, let's say, two standard deviations (with some minimal threshold to remove noise).
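
A minimal sketch of this outlier idea, written in Python rather than as a dashboard query. The sample layout `(mountpoint, iogroup, shard, value)`, the function name `find_outliers`, and the defaults `num_sd=2.0` and `min_dev=0.05` are assumptions for illustration; no threshold has been agreed on here.

```python
from collections import defaultdict
from statistics import mean, pstdev

def find_outliers(samples, num_sd=2.0, min_dev=0.05):
    """Group samples by (mountpoint, iogroup) and flag outlying shards.

    samples: iterable of (mountpoint, iogroup, shard, value) tuples.
    Returns (baselines, outliers): the per-group average, and the
    samples further from it than max(num_sd * SD, min_dev).
    """
    groups = defaultdict(list)
    for mountpoint, iogroup, shard, value in samples:
        groups[(mountpoint, iogroup)].append((shard, value))

    baselines = {}
    outliers = []
    for key, members in groups.items():
        values = [v for _, v in members]
        avg = mean(values)
        sd = pstdev(values)  # population standard deviation of the group
        baselines[key] = avg
        # min_dev (hypothetical) suppresses noise when the SD is tiny.
        limit = max(num_sd * sd, min_dev)
        for shard, value in members:
            if abs(value - avg) > limit:
                outliers.append((key, shard, value))
    return baselines, outliers
```

Note that with only a handful of shards in a group, no single value can deviate from the mean by more than two population standard deviations, so a filter like this only becomes meaningful on larger groups.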

amnonh · 2024-06-09T23:22:29Z

@vladzcloudius take a look at my latest comments. If we can't find something better, I'll include a panel with all the graphs in it, but I'm afraid it will not be useful for clusters with many cores.

This patch adds a panel that shows scylla_io_queue_flow_ratio.

Fixes scylladb#2306

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

vladzcloudius · 2024-06-10T21:46:35Z

We should probably think about it a bit more: having a graph for all shards will be quite a hassle indeed.
Probably we should show some statistical function expressing the level of noise in this metric, e.g. the standard deviation (SD) itself.
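
As a rough illustration of that statistic, the sketch below computes one standard deviation per (mountpoint, iogroup) pair, which collapses hundreds of per-shard lines into one value per group. The function name `flow_ratio_spread` and the input layout are assumptions, not part of any existing dashboard.

```python
from collections import defaultdict
from statistics import pstdev

def flow_ratio_spread(samples):
    """Return the population SD of the flow ratio per (mountpoint, iogroup).

    samples: iterable of (mountpoint, iogroup, value) tuples, one per shard.
    """
    groups = defaultdict(list)
    for mountpoint, iogroup, value in samples:
        groups[(mountpoint, iogroup)].append(value)
    # An SD near 0 means all shards in the group report similar ratios.
    return {key: pstdev(values) for key, values in groups.items()}
```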

amnonh · 2024-06-10T22:47:52Z

I think we should filter out all values that are close to 1 (i.e., close to 100%). I'm not sure what the threshold should be, but there is no point in scrolling through a few hundred values that are all 1.
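
A minimal sketch of such a filter. The `tolerance` default of 0.05 is a placeholder, since the threshold is still open, and treating deviations below 1 the same as deviations above 1 is also an assumption; `far_from_one` is a hypothetical helper name.

```python
def far_from_one(values_by_shard, tolerance=0.05):
    """Keep only the readings that differ from 1.0 by more than tolerance.

    values_by_shard maps a hypothetical shard label to its flow-ratio value.
    """
    return {
        shard: value
        for shard, value in values_by_shard.items()
        if abs(value - 1.0) > tolerance
    }
```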

vladzcloudius added the enhancement label Jun 3, 2024

amnonh added this to the Monitoring 4.8 milestone Jun 7, 2024

amnonh mentioned this issue Jun 10, 2024

scylla-advanced: Add a panel for scylla_io_queue_flow_ratio #2312

Merged

amnonh closed this as completed in #2312 Jun 17, 2024

amnonh closed this as completed in 28f43db Jun 17, 2024
