Releases: scylladb/scylla-monitoring

Release 4.9.0

23 Jan 09:44
Pre-release

What is new in ScyllaDB-Monitoring 4.9.0 Release

  • Add a summary table for all tables #2373
  • Alternator dashboard - improve the information panel #2365
  • Show all CQL Consistency Levels #2361
  • Show only top-k or bottom-k, not both #2442
  • Add limit-k to the dashboards #2454
  • Switch to Prometheus 3.x #2450
  • Switch to Loki 3.x #2356

Bug Fixes

  • Alternator total ops is empty #2440
  • Write timeouts panel is not filtered by the scheduling group #2432
  • RPC metrics are intermixed #2428
  • CQL Fail should be shown only if a node is in operation mode #2427
  • Tables with no writes are not shown in the keyspace dashboard #2425

Operational changes

  • ScyllaDB plugin executables are making the repo 153MB #2246
  • Make it easy to run local Thanos query #2464
  • Support placing dashboards under a closed support folder #2462
  • Add Quick startup option #2436

Release 4.8.3

20 Nov 12:09

Bug fixes in Release 4.8.3

  • Use the updated panel repeat options to work with the new Grafana scene-based dashboard #2424

Release 4.8.2

12 Nov 10:35

New in Release 4.8.2

  • Add Scylla Manager 3.4 support

Release 4.8.1

30 Sep 05:56

Bug fixes

  • The start-all.sh --target-directory option has an error in its documentation #2398
  • Some panels in the OS dashboard do not respect the DC filter #2396
  • Unknown Alternator OP - BatchGetItemSize #2393
  • Compression-related panels are confusing #2392
  • Using stack graph for active read is confusing #2389
  • Row and Partitions insertions are measured in read/sec #2386
  • The latency legend format is confusing #2383
  • scylla_io_queue_flow_ratio graph is inconvenient #2382
  • Add batch latency and batch size metrics to Alternator dashboard #2380
  • Alternator OPs are not representative of real ops in the case of BatchGetItem and similar batch ops #2374

Release 4.8.0

19 Aug 11:35

New In Release 4.8.0

  • Support for Scylla Manager 3.3 #2339
  • Make the Tablet section collapsible #2329
  • Add panels for network compression #2325
  • Add filters that limit the number of results per panel [breaking-changes] #2319
  • Add a graph for scylla_io_queue_flow_ratio #2306
  • Make the IO-group panel group by iogroup, stream #2305
  • Tooltip now allows scrolling #2209
  • Add metrics for RPC #2104
  • Unify Scylla-Manager status and progress #2009
  • Different aggregation functions for the latency metrics #1741

Bug Fixes

  • Non-Paged CQL Reads Gauge isn't working. #2295
  • "I/O Group All Queue consumption" dashboard uses the wrong type of graph. #2293
  • Increase nodes table column width to display the full IP by default #2302
  • Fix panels description in the advanced dashboard #2290
  • Full page screenshot is broken #2324
  • Make genconfig support IPv6

Operational changes

  • splitBrain alert support for a multi-cluster setup #2304
  • Allow setting local network and docker_param from env file #2035
  • scylla_storage_proxy_coordinator_read_timeouts repeated twice in regexp. #2323
  • Make Prometheus the default data source #2268
  • Deprecated level label #2322
  • The OS dashboard accepts multiple node_exporter jobs #2317
  • Support various Prometheus scrape intervals [breaking-changes] #2345

Breaking Changes

  • The dashboards now support longer Prometheus scrape intervals, which are configurable and passed as a parameter in the Grafana data source configuration.
  • To better handle clusters with high core counts, the dashboards limit the number of series shown by default. You can change that limit from the drop-down menu at the top.
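
As a rough illustration of the series limit described above, the following Python sketch keeps only the k highest-valued (or lowest-valued) series from a set of samples. The function name `limit_series` and the sample data are hypothetical; this shows the idea only and is not how the dashboards implement the limit.

```python
from typing import Dict, List, Tuple


def limit_series(
    samples: Dict[str, float], k: int, bottom: bool = False
) -> List[Tuple[str, float]]:
    """Return the k series with the highest (or, if bottom=True, lowest) values."""
    if k <= 0:
        return []
    # Sort descending for top-k, ascending for bottom-k, then truncate.
    ordered = sorted(samples.items(), key=lambda item: item[1], reverse=not bottom)
    return ordered[:k]


# Hypothetical per-shard latencies in milliseconds (illustrative data only).
samples = {"shard-0": 1.2, "shard-1": 4.8, "shard-2": 0.7, "shard-3": 3.1}

top_two = limit_series(samples, k=2)
bottom_two = limit_series(samples, k=2, bottom=True)
```

With many shards per node, showing only the top or bottom k series keeps a panel readable while still surfacing the outliers.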

Release 4.7.2

08 May 08:26

Bug Fixes in Release 4.7.2

  • Alternator, complete the move to summaries #2278
  • Wrong port number for node manager agents in prometheus/prometheus.consul.yml.template when using Docker #2277

Release 4.7.1

21 Apr 06:37

Bug fixes in 4.7.1

  • The bloom filter alert causes too many false positives #2263
  • Non-token aware queries graph (and gauge) is broken. #2259
  • Service level selection is not carried over between dashboards #2253
  • Hints manager sent annotation uses the wrong metric #2250
  • Add cluster label to manager base metrics #2270

Release 4.7.0

04 Apr 13:49

New in Release 4.7.0

  • Update alternator dashboard #2226
  • Make the default dashboard refresh interval configurable #2220
  • Show scylla_sstables_bloom_filter_memory_size on the detailed dashboard #2219
  • Update Alternator latencies histogram and summaries #2214
  • Combine the Advisor table with the alert table in the overview dashboard #2166
  • Easier method to run multiple monitoring stacks side-by-side #2164
  • Add ethtool metrics to Datadog integration #2163
  • Add tablet metrics to the detailed dashboard #2119, #2111
  • Add storage-related metrics #2044
  • New alert - cluster in split-brain state #1677
  • Enhanced experience with --archive command line flag #2158, #2177
  • The explanation for the unified class group graph is not clear #2178

Bug fixes

  • No closing parenthesis #2229
  • The variable $sg is not defined. #2228
  • Prometheus continues to trigger alerts for a node that has already been removed from scylla_servers.yml #2227
  • read-timeouts in the overview dashboard are breaking when no cdc metrics are reported #2193
  • Manager metrics are inconsistent #2191
  • Version information is cut - although there's plenty of space available in the panel #2189
  • Reads panel does not reflect shards #2171
  • Overview page - no data [write latency, Read timeout by DC] #2162
  • Manager memory metrics interfere with the OS ones #2198
  • The actual interval for calculating metrics is greater than the one specified in evaluation_interval. #2087

Operational changes

  • start-all.sh: optionally skip Alertmanager #2239
  • Allow an easy way to start Prometheus with protobuf support #2155
  • Regex for empty string |$^ in dashboards #2192
  • prometheus/prometheus.yml.template: set evaluation interval to 20s #2185
  • Improved experience when working with Archive #2177
  • start-all.sh: create a file with the parameters of the last run operation #2174
  • Remove the deprecated level label #2160
  • Performance and security enhancements #2154
  • Allow setting local network from env file #2035

scylla-monitoring-4.6.2

12 Feb 13:58
Pre-release

Release 4.6.2

Release 4.6.1

23 Jan 13:39

New in release 4.6.1

  • Alert severity for repair and backup failures changed to warn #2151
  • Update Grafana version to 10.2.3

Bug Fixes

  • Support custom port when using Podman #2152