Show `scylla_sstables_bloom_filter_memory_size` on one of the dashboards #2219

michoecho · 2024-03-11T11:14:38Z

We have seen production nodes OOM due to bloated bloom filters several times lately (twice in the last week).

And every time, people seem to forget that we have a metric for this, and they waste time on things like logging into the cluster and running `du` on the bloom filter files.

Since the metric proved useful many times, maybe we should put it into one of the dashboards.

Perhaps it should be normalized by total memory, e.g. `sum by (...) (scylla_sstables_bloom_filter_memory_size) / sum by (...) (scylla_memory_total_memory)`.


amnonh · 2024-03-11T14:47:22Z

@michoecho can you also suggest an alert based on that?

amnonh · 2024-03-11T15:01:30Z

@michoecho can you look at the memory metrics you put? It's the same one twice.

michoecho · 2024-03-11T15:04:15Z

> @michoecho can you also suggest an alert based on that?

@avikivity @denesb @mykaul What do you think? Should we have an alert if bloom filters exceed some fraction of memory? And if yes, what should be the threshold? 0.1, or is that too aggressive?

avikivity · 2024-03-11T15:11:22Z

> > @michoecho can you also suggest an alert based on that?
>
> @avikivity @denesb @mykaul What do you think? Should we have an alert if bloom filters exceed some fraction of memory?

yes

> And if yes, what should be the threshold? 0.1, or is that too aggressive?

I think it's a reasonable starting point.

mykaul · 2024-03-12T06:57:45Z

I've asked @d-helios to scan our cloud to see where we are right now. I think we can determine the alert threshold based on the results.

amnonh · 2024-03-12T08:59:54Z

@michoecho ping, please look at the metric you put, one is missing

michoecho · 2024-03-12T09:56:21Z

> @michoecho ping, please look at the metric you put, one is missing

missing

If you mean the fact that I typed

`sum by (...) (scylla_sstables_bloom_filter_memory_size) / sum by (...) (scylla_sstables_bloom_filter_memory_size)`

then the answer is that I meant

`sum by (...) (scylla_sstables_bloom_filter_memory_size) / sum by (...) (scylla_memory_total_memory)`

Otherwise I don't understand what you are asking for.

amnonh · 2024-03-12T10:24:41Z

@michoecho yes, that is exactly what I was asking about

amnonh · 2024-03-13T09:59:04Z

I've checked the cloud; I think 0.1 is a good threshold.
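For readers who want to see what the proposed check amounts to, here is a minimal Python sketch of the ratio and the 0.1 threshold discussed above. It is not the dashboard or alert implementation from the linked PR. The thread leaves the grouping as `by (...)`, so the `instance` label used here is an assumption, as is the `(metric_name, labels, value)` sample format and both function names.

```python
from collections import defaultdict

BLOOM_METRIC = "scylla_sstables_bloom_filter_memory_size"
TOTAL_METRIC = "scylla_memory_total_memory"
THRESHOLD = 0.1  # starting point suggested in this thread


def bloom_filter_fraction(samples, group_label="instance"):
    """Return {group: bloom filter bytes / total memory bytes}.

    `samples` is an iterable of (metric_name, labels_dict, value) tuples.
    This tuple shape and the default `instance` grouping label are
    hypothetical; the thread only says `sum by (...)`.
    """
    bloom = defaultdict(float)
    total = defaultdict(float)
    for name, labels, value in samples:
        key = labels.get(group_label)
        if name == BLOOM_METRIC:
            bloom[key] += value
        elif name == TOTAL_METRIC:
            total[key] += value
    # Like the PromQL division, only groups present on both sides are kept.
    return {k: bloom[k] / total[k] for k in bloom if total.get(k, 0) > 0}


def nodes_over_threshold(samples, threshold=THRESHOLD, group_label="instance"):
    """Return the sorted groups whose bloom filter fraction exceeds `threshold`."""
    fractions = bloom_filter_fraction(samples, group_label)
    return sorted(k for k, f in fractions.items() if f > threshold)
```

Summing per group before dividing mirrors the `sum by (...) (...) / sum by (...) (...)` shape of the expression, so per-shard samples are aggregated before the ratio is taken.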

michoecho added the `enhancement` (New feature or request) label Mar 11, 2024

amnonh added this to the Monitoring 4.7 milestone Mar 11, 2024

amnonh added a commit to amnonh/scylla-grafana-monitoring that referenced this issue Mar 13, 2024

scylla-detailed add bloom filter percentage graph scylladb#2219

5e337fb

amnonh mentioned this issue Mar 13, 2024

scylla-detailed add bloom filter percentage graph #2223

Merged

amnonh added a commit to amnonh/scylla-grafana-monitoring that referenced this issue Mar 13, 2024

scylla-detailed add bloom filter percentage graph scylladb#2219

7d767d1

amnonh closed this as completed in #2223 Mar 13, 2024

amnonh mentioned this issue Mar 19, 2024

monitor bloom filter memory usage #2225

Closed
