Improvement: Extend gstat statistics by indexes (null values) #8404

sim1984 · 2025-01-20T11:38:03Z

Currently, gstat outputs the following statistics for the index:

    Index HORSE_IDX_BIRTHDAY (0) 
	Root page: 150238, depth: 2, leaf buckets: 167, nodes: 545113 
	Average node length: 4.94, total dup: 520604, max dup: 27865 
	Average key length: 2.00, compression ratio: 1.90 
	Average prefix length: 3.75, average data length: 0.05 
	Clustering factor: 436641, ratio: 0.80 
	Fill distribution: 
	     0 - 19% = 0 
	    20 - 39% = 1 
	    40 - 59% = 0 
	    60 - 79% = 0 
	    80 - 99% = 166

I propose extending these statistics with the number of NULL values in the keys. This value matters when the index can contain NULLs, because the real selectivity for operations that ignore NULLs (primarily equality) differs from the stored one. The share of NULL values should arguably also be kept in the stored statistics, as selectivity is now. Even so, showing the number of NULLs in the gstat output would help in assessing the real selectivity. The output could look like this:

    Index HORSE_IDX_BIRTHDAY (0) 
	Root page: 150238, depth: 2, leaf buckets: 167, nodes: 545113 
	Average node length: 4.94, total dup: 520604, max dup: 27865 
	Segments: 1, Nulls: 27866
	Average key length: 2.00, compression ratio: 1.90 
	Average prefix length: 3.75, average data length: 0.05 
	Clustering factor: 436641, ratio: 0.80 
	Fill distribution: 
	     0 - 19% = 0 
	    20 - 39% = 1 
	    40 - 59% = 0 
	    60 - 79% = 0 
	    80 - 99% = 166
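To illustrate why the NULL count matters, here is a rough sketch of the arithmetic using the numbers from the output above. It rests on assumptions that are not confirmed in this issue: that `nodes - total dup` gives the number of distinct key values, and that in a single-segment index all NULLs form exactly one of those distinct values. The function name `equality_estimates` is hypothetical and not part of gstat or the engine.

```python
def equality_estimates(nodes, total_dup, nulls):
    """Return (naive, null_aware) average rows per distinct key value.

    Assumption: distinct key values = nodes - total_dup.
    Assumption: single-segment index, so all NULLs share one key value.
    """
    distinct = nodes - total_dup              # distinct values, NULL included
    naive = nodes / distinct                  # estimate that ignores NULLs
    if nulls == 0:
        return naive, naive
    non_null_nodes = nodes - nulls            # rows an equality match can hit
    non_null_distinct = distinct - 1          # drop the single NULL value
    if non_null_distinct <= 0:
        return naive, 0.0                     # index holds only NULLs
    return naive, non_null_nodes / non_null_distinct


# Figures taken from the HORSE_IDX_BIRTHDAY example above.
naive, null_aware = equality_estimates(nodes=545113, total_dup=520604, nulls=27866)
print(f"rows per key, NULLs included: {naive:.2f}")
print(f"rows per key, NULLs excluded: {null_aware:.2f}")
```

Under these assumptions the estimate for an equality predicate drops once the NULL keys are excluded, which is the difference the extra figure would make visible.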

dyemanov · 2025-01-20T11:51:27Z

How is it going to work for compound indices? Count only NULLs in all segments?

sim1984 · 2025-01-20T12:00:59Z

I think it's worth adding the number of segments to these statistics. NULLs would then be counted only for single-segment indexes; for compound indexes, either don't output the value at all, or count only the keys that are NULL in all segments.
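The two counting options for compound indexes can be compared with a small sketch. It works on in-memory tuples standing in for index keys, with `None` standing in for NULL. It is only an illustration of the semantics under discussion, not how gstat reads index pages, and `count_null_keys` is a hypothetical name.

```python
def count_null_keys(keys):
    """Return (all_null, any_null) counts for a list of key tuples.

    all_null: keys where every segment is NULL (the option discussed above).
    any_null: keys where at least one segment is NULL (shown for contrast).
    """
    all_null = sum(1 for key in keys if all(seg is None for seg in key))
    any_null = sum(1 for key in keys if any(seg is None for seg in key))
    return all_null, any_null


# Hypothetical two-segment keys.
sample_keys = [(None, None), (1, None), (None, 2), (3, 4), (None, None)]
all_null, any_null = count_null_keys(sample_keys)
print(f"NULL in all segments: {all_null}, NULL in any segment: {any_null}")
```

For a single-segment index both counts coincide, so the "all segments" rule would cover both cases with one definition.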

sim1984 changed the title ~~ImprovementЖ Extend gstat statistics by indexes (null values)~~ Improvement: Extend gstat statistics by indexes (null values) Jan 20, 2025
