implemented accurate_aggregation

fixed negative docids
updated CBO; added multithreaded performance estimates
SI iterator intersection w/bitmap is now done in the library
manticoresoftware · Dec 1, 2022 · c47052f
1 parent 5cee9b3

Showing 29 changed files with 889 additions and 440 deletions.
diff --git a/cmake/GetColumnar.cmake b/cmake/GetColumnar.cmake
@@ -18,8 +18,8 @@ include ( update_bundle )
 # still can do it for any specific requirements.
 
 # Versions of API headers we are need to build with.
-set ( NEED_COLUMNAR_API 16 )
-set ( NEED_SECONDARY_API 5 )
+set ( NEED_COLUMNAR_API 17 )
+set ( NEED_SECONDARY_API 6 )
 
 
 # Note: we don't build, neither link with columnar. Only thing we expect to get is a few interface headers, aka 'columnar_api'.

diff --git a/manual/Changelog.md b/manual/Changelog.md
@@ -21,7 +21,6 @@
 
   If you are running a replication cluster, you'll need to run `ALTER TABLE <table name> REBUILD SECONDARY` on all the nodes or follow [this instruction](../Securing_and_compacting_an_index/Compacting_an_index.md#Optimizing-clustered-indexes) with just change: run the `ALTER .. REBUILD SECONDARY` instead of the `OPTIMIZE`.
 * `SHOW SETTINGS`
-* `disable_ps_threshold`
 * `max_matches_increase_threshold`
 
 ### Packaging

diff --git a/manual/Searching/Options.md b/manual/Searching/Options.md
@@ -121,23 +121,6 @@ Integer. Max found matches threshold. The value is selected automatically if not
 
 In case Manticore cannot calculate the exact matching documents count you will see `total_relation: gte` in the query [meta information](../Profiling_and_monitoring/SHOW_META.md#SHOW-META), which means that the actual count is **Greater Than or Equal** to the total (`total_found` in `SHOW META` via SQL, `hits.total` in JSON via HTTP). If the total value is precise you'll get `total_relation: eq`.
 
-### disable_ps_threshold
-Integer. Disables pseudo sharding when the number of unique values of a groupby attribute is greater than the threshold. Default is 65536.
-
-When a groupby query is executed, Manticore estimates the number of unique values of groupby attribute using secondary indexes. If that number is greater than the threshold,
-pseudo sharding is disabled in order to increase accuracy of count() and aggregates. Loss of accuracy may occur because pseudo sharding runs queries in several threads,
-each thread getting `max_matches` groups which are merged later. If there are a lot of unique groupby values, each thread may get its own set of groups, and groups present in
-other threads may not make it to `max_matches`. This may be avoided by increasing `max_matches`, but increasing `max_matches` with pseudo_sharding enabled leads to increased
-memory consumption.
-
-Setting `disable_ps_threshold` to a value significantly higher than the default may result in disabling pseudo sharding when you have way too many unique values.
-The more unique values you have the higher is the chance of getting inaccurate results in case one of the pseudo shards just doesn't return a group at all, 
-because after sorting it appears to be on a place greater than the `max_matches` limit.
-
-If `disable_ps_threshold` is set too low, you'll disable pseudo sharding too early and the performance will be suboptimal.
-
-See also [max_matches_increase_threshold](../Searching/Options.md#max_matches_increase_threshold), which can affect the behavior of the `max_matches` option.
-
 ### expand_keywords
 `0` or `1` (`0` by default). Expands keywords with exact forms and/or stars when possible. Refer to [expand_keywords](../Creating_an_index/NLP_and_tokenization/Wildcard_searching_settings.md#expand_keywords) for more details.
 
@@ -194,7 +177,7 @@ Integer. Sets the threshold that `max_matches` can be increased to. Default is 1
 
 Manticore may increase `max_matches` to improve groupby and/or aggregation accuracy when `pseudo_sharding` is enabled and if it detects that the number of unique values
 of groupby attribute is less than this threshold. Loss of accuracy may occur when pseudo sharding executes the query in several threads or RT index performs
-parallel searches in disk chunks. See [disable_ps_threshold](../Searching/Options.md#disable_ps_threshold) for more details.
+parallel searches in disk chunks.
 
 If the number of unique values of groupby attribute is less than the treshold, `max_matches` will be set to this number. Otherwise, default `max_matches` will be used.
 
@@ -261,8 +244,6 @@ The result set is in both cases the same; picking one option or the other may ju
 Limits max number of threads to use for current query processing. Default - no limit (the query can occupy all [threads](../Server_settings/Searchd.md#threads) as defined globally).
 For batch of queries the option must be attached to the very first query in the batch, and it is then applied when working queue is created and then is effective for the whole batch. This option has same meaning as option [max_threads_per_query](../Server_settings/Searchd.md#max_threads_per_query), but applied only to the current query or batch of queries.
 
-See [disable_ps_threshold](../Searching/Options.md#disable_ps_threshold), which can affect the behavior of the `threads` option.
-
 ### token_filter
 Quoted, colon-separated of `library name:plugin name:optional string of settings`. Query-time token filter gets created on search each time full-text invoked by every index involved and let you implement a custom tokenizer that makes tokens according to custom rules.
 ```sql

diff --git a/src/aggregate.cpp b/src/aggregate.cpp
@@ -93,7 +93,9 @@ class AggrColumnar_Traits_T : public AggrFunc_Traits_T<T>
 		{
 			std::string sError; // FIXME! report errors
 			m_pIterator = CreateColumnarIterator ( pColumnar, m_sAttr.cstr(), sError );
-			m_eType = pColumnar->GetType ( m_sAttr.cstr() );
+			columnar::AttrInfo_t tAttrInfo;
+			if ( pColumnar->GetAttrInfo ( m_sAttr.cstr(), tAttrInfo ) )
+				m_eType = tAttrInfo.m_eType;
 		}
 		else
 			m_pIterator.reset();

diff --git a/src/columnarfilter.cpp b/src/columnarfilter.cpp
@@ -49,7 +49,9 @@ void ColumnarFilter_c::SetColumnar ( const columnar::Columnar_i * pColumnar )
 
 	std::string sError; // fixme! report errors
 	m_pIterator = CreateColumnarIterator ( pColumnar, m_sAttrName.cstr(), sError );
-	m_iColumnarCol = pColumnar->GetAttributeId ( m_sAttrName.cstr() );
+	columnar::AttrInfo_t tAttrInfo;
+	if ( pColumnar->GetAttrInfo ( m_sAttrName.cstr(), tAttrInfo ) )
+		m_iColumnarCol = tAttrInfo.m_iId;
 }
 
 

diff --git a/src/columnarrt.cpp b/src/columnarrt.cpp
@@ -528,8 +528,7 @@ class ColumnarRT_c : public ColumnarRT_i
 	columnar::Iterator_i *					CreateIterator ( const std::string & sName, const columnar::IteratorHints_t & tHints, columnar::IteratorCapabilities_t * pCapabilities, std::string & sError ) const override;
 	std::vector<common::BlockIterator_i *>	CreateAnalyzerOrPrefilter ( const std::vector<common::Filter_t> & dFilters, std::vector<int> & dDeletedFilters, const columnar::BlockTester_i & tBlockTester ) const override { return {}; }
 
-	int				GetAttributeId ( const std::string & sName ) const override;
-	common::AttrType_e GetType ( const std::string & sName ) const override;
+	bool			GetAttrInfo ( const std::string & sName, columnar::AttrInfo_t & tInfo ) const override;
 	bool			EarlyReject ( const std::vector<common::Filter_t> & dFilters, const columnar::BlockTester_i & tBlockTester ) const override { return false; }
 	bool			IsFilterDegenerate ( const common::Filter_t & tFilter ) const override { return false; }
 
@@ -597,17 +596,17 @@ columnar::Iterator_i * ColumnarRT_c::CreateIterator ( const std::string & sName,
 }
 
 
-int ColumnarRT_c::GetAttributeId ( const std::string & sName ) const
+bool ColumnarRT_c::GetAttrInfo ( const std::string & sName, columnar::AttrInfo_t & tInfo ) const
 {
 	auto * pFound = m_hAttrs ( sName.c_str() );
-	return pFound ? pFound->second : -1;
-}
+	if ( !pFound )
+		return false;
 
+	tInfo.m_iId = pFound->second;
+	tInfo.m_eType = pFound->first->GetType();
+	tInfo.m_bHasHash = false;
 
-common::AttrType_e ColumnarRT_c::GetType ( const std::string & sName ) const
-{
-	auto * pFound = m_hAttrs ( sName.c_str() );
-	return pFound ? pFound->first->GetType() : common::AttrType_e::NONE;
+	return true;
 }
 
 

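Reviewer note: the hunks above replace the separate `GetAttributeId` and `GetType` getters with a single `GetAttrInfo` call that fills a struct and reports success through its return value. The pattern can be sketched roughly as below. `AttrStorage_c`, its `Add` method, the reduced `AttrType_e` enum, and the `std::unordered_map` storage are hypothetical stand-ins for illustration, not the columnar library's actual types.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical, reduced stand-in for common::AttrType_e.
enum class AttrType_e { NONE, UINT32, STRING };

// Mirrors the three fields the diff assigns in ColumnarRT_c::GetAttrInfo.
// The defaults are an assumption of this sketch.
struct AttrInfo_t
{
	int			m_iId = -1;
	AttrType_e	m_eType = AttrType_e::NONE;
	bool		m_bHasHash = false;
};

// Hypothetical attribute registry: one hash lookup serves id, type and hash flag.
class AttrStorage_c
{
public:
	void Add ( const std::string & sName, AttrType_e eType )
	{
		AttrInfo_t tInfo;
		tInfo.m_iId = (int)m_hAttrs.size();	// ids are assigned in insertion order
		tInfo.m_eType = eType;
		m_hAttrs[sName] = tInfo;
	}

	// Returns false and leaves tInfo untouched when the attribute is unknown.
	bool GetAttrInfo ( const std::string & sName, AttrInfo_t & tInfo ) const
	{
		auto tFound = m_hAttrs.find(sName);
		if ( tFound==m_hAttrs.end() )
			return false;

		tInfo = tFound->second;
		return true;
	}

private:
	std::unordered_map<std::string, AttrInfo_t> m_hAttrs;
};
```

Because a failed lookup leaves the output struct untouched, a caller that wants the old sentinel behaviour (`-1` id, `NONE` type) needs a default-initialized struct or an explicit `else` branch.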
diff --git a/src/costestimate.cpp b/src/costestimate.cpp
@@ -26,16 +26,16 @@ class CostEstimate_c : public CostEstimate_i
 private:
 	static constexpr float SCALE = 1.0f/1000000.0f;
 
-	static constexpr float COST_PUSH					= 12.5f;
+	static constexpr float COST_PUSH					= 6.0f;
 	static constexpr float COST_FILTER					= 8.5f;
-	static constexpr float COST_COLUMNAR_FILTER			= 1.5f;
+	static constexpr float COST_COLUMNAR_FILTER			= 6.0f;
 	static constexpr float COST_INTERSECT				= 5.0f;
 	static constexpr float COST_INDEX_READ_SINGLE		= 1.0f;
-	static constexpr float COST_INDEX_READ_DENSE_BITMAP	= 30.0f;
-	static constexpr float COST_INDEX_READ_SPARSE		= 700.0f;
-	static constexpr float COST_INDEX_UNION_COEFF		= 0.7f;
-	static constexpr float COST_LOOKUP_READ				= 33.0f;
-	static constexpr float COST_INDEX_ITERATOR_INIT		= 200.0f;
+	static constexpr float COST_INDEX_READ_DENSE_BITMAP	= 1.5f;
+	static constexpr float COST_INDEX_READ_SPARSE		= 30.0f;
+	static constexpr float COST_INDEX_UNION_COEFF		= 4.0f;
+	static constexpr float COST_LOOKUP_READ				= 7.0f;
+	static constexpr float COST_INDEX_ITERATOR_INIT		= 150.0f;
 
 	const CSphVector<SecondaryIndexInfo_t> &	m_dSIInfo;
 	const SelectIteratorCtx_t &					m_tCtx;
@@ -55,6 +55,7 @@ class CostEstimate_c : public CostEstimate_i
 	float	CalcIndexCost() const;
 	float	CalcFilterCost ( bool bFromIterator, float fDocsAfterIndexes ) const;
 	float	CalcAnalyzerCost() const;
+	float	CalcMTCost ( float fCost ) const;
 
 	float	CalcGetFilterComplexity ( const SecondaryIndexInfo_t & tSIInfo, const CSphFilterSettings & tFilter ) const;
 	bool	NeedBitmapUnion ( const CSphFilterSettings & tFilter, int64_t iRsetSize ) const;
@@ -73,8 +74,7 @@ CostEstimate_c::CostEstimate_c ( const CSphVector<SecondaryIndexInfo_t> & dSIInf
 bool CostEstimate_c::NeedBitmapUnion ( const CSphFilterSettings & tFilter, int64_t iRsetSize ) const
 {
 	// this needs to be in sync with iterator construction code
-	const size_t BITMAP_ITERATOR_THRESH = 16;
-	const float	BITMAP_RATIO_THRESH = 0.002;
+	const size_t BITMAP_ITERATOR_THRESH = 8;
 
 	bool bFitsIteratorThresh = false;
 	if ( tFilter.m_eType==SPH_FILTER_RANGE )
@@ -90,8 +90,8 @@ bool CostEstimate_c::NeedBitmapUnion ( const CSphFilterSettings & tFilter, int64
 	if ( m_tCtx.m_iCutoff>=0 )
 		iRsetSize = Min ( iRsetSize, m_tCtx.m_iCutoff );
 
-	float fRsetRatio = float ( iRsetSize ) / m_tCtx.m_iTotalDocs;
-	return bFitsIteratorThresh && fRsetRatio >= BITMAP_RATIO_THRESH;
+	const int QUEUE_RSET_THRESH = 4096;
+	return bFitsIteratorThresh && iRsetSize>QUEUE_RSET_THRESH;
 }
 
 
@@ -113,7 +113,8 @@ bool CostEstimate_c::IsWideRange ( const CSphFilterSettings & tFilter ) const
 
 static bool IsSingleValueFilter ( const CSphFilterSettings & tFilter )
 {
-	return tFilter.m_eType==SPH_FILTER_VALUES && tFilter.m_dValues.GetLength()==1;
+	return  ( tFilter.m_eType==SPH_FILTER_VALUES && tFilter.m_dValues.GetLength()==1 ) ||
+			( tFilter.m_eType==SPH_FILTER_STRING && tFilter.m_dStrings.GetLength()==1 );
 }
 
 
@@ -141,12 +142,10 @@ float CostEstimate_c::CalcIndexCost() const
 		const auto & tFilter = m_tCtx.m_dFilters[i];
 		if ( HasSeveralSIIterators(tFilter) )
 		{
-			if ( !NeedBitmapUnion ( tFilter, iDocs ) )
-				fCost += Cost_IndexUnionQueue(iDocs);
-
-			// we fetch real number of iterators from PGM only when we suspect that the query may have a lot of iterators
-			// otherwise we just try to guess
 			uNumIterators = CalcNumSIIterators ( tFilter, iDocs );
+
+			if ( uNumIterators>1 && !NeedBitmapUnion ( tFilter, iDocs ) )
+				fCost += Cost_IndexUnionQueue(iDocs);
 		}
 
 		if ( uNumIterators )
@@ -230,20 +229,70 @@ float CostEstimate_c::CalcAnalyzerCost() const
 		if ( tSIInfo.m_eType!=SecondaryIndexType_e::ANALYZER )
 			continue;
 
+		assert ( m_tCtx.m_pColumnar );
+		columnar::AttrInfo_t tAttrInfo;
+		bool bHasHash = false;
+		if ( m_tCtx.m_pColumnar->GetAttrInfo ( tFilter.m_sAttrName.cstr(), tAttrInfo ) )
+			bHasHash = tAttrInfo.m_bHasHash;
+
 		float fFilterComplexity = CalcGetFilterComplexity ( tSIInfo, tFilter );
 		int64_t iDocs = tSIInfo.m_iRsetEstimate;
 
-		// minmax tree eval
-		fCost += Cost_Filter ( m_tCtx.m_iTotalDocs/1024*1.33f, fFilterComplexity );
+		// filters that process but reject values are 2x faster
+		float fAcceptCoeff = std::min ( float(tSIInfo.m_iRsetEstimate)/m_tCtx.m_iTotalDocs, 1.0f ) / 2.0f + 0.5f;
+		float fTotalCoeff = fFilterComplexity*tAttrInfo.m_fComplexity*fAcceptCoeff;
 
-		// the idea is that minmax rejects most docs and 50% of the remaining docs are filtered out
-		fCost += Cost_ColumnarFilter ( Min ( iDocs*2, m_tCtx.m_iTotalDocs ), fFilterComplexity );
+		if ( bHasHash )
+		{
+			// strings with prebuilt hashes don't have minmax, so we scan the whole index
+			fCost += Cost_ColumnarFilter ( m_tCtx.m_iTotalDocs, fTotalCoeff );
+		}
+		else
+		{
+			// minmax tree eval
+			const int MINMAX_NODE_SIZE = 1024;
+			int iMatchingNodes = ( iDocs + MINMAX_NODE_SIZE - 1 ) / MINMAX_NODE_SIZE;
+			int iTreeLevels = sphLog2 ( m_tCtx.m_iTotalDocs );
+			fCost += Cost_Filter ( iMatchingNodes*iTreeLevels, fFilterComplexity );
+
+			// the idea is that minmax rejects most docs and 50% of the remaining docs are filtered out
+			fCost += Cost_ColumnarFilter ( Min ( iDocs*2, m_tCtx.m_iTotalDocs ), fTotalCoeff );
+		}
 	}
 
 	return fCost;
 }
 
 
+float CostEstimate_c::CalcMTCost ( float fCost ) const
+{
+	if ( m_tCtx.m_iThreads==1 )
+		return fCost;
+
+	int iMaxThreads = sphCpuThreadsCount();
+
+	const float fKPerf = 0.16f;
+	const float fBPerf = 1.38f;
+
+	float fMaxPerfCoeff = fKPerf*iMaxThreads + fBPerf;
+	float fMinCost = fCost/fMaxPerfCoeff;
+
+	if ( m_tCtx.m_iThreads==iMaxThreads )
+		return fMinCost;
+
+	const float fX1 = 1.0f;
+	float fX2 = iMaxThreads;
+	float fY1 = fCost;
+	float fY2 = fMinCost;
+
+	float fA = ( fY2-fY1 ) / ( 1.0f/float(sqrt(fX2)) - 1.0f/float(sqrt(fX1)) );
+	float fB = fY1 - fA / float(sqrt(fX1));
+	float fX = m_tCtx.m_iThreads;
+	float fY = fA/float(sqrt(fX)) + fB;
+
+	return fY;
+}
+
 
 uint32_t CostEstimate_c::CalcNumSIIterators ( const CSphFilterSettings & tFilter, int64_t iDocs ) const
 {
@@ -347,21 +396,29 @@ float CostEstimate_c::CalcQueryCost()
 
 	fCost += Cost_Push ( iDocsToPush );
 
+	if ( iNumAnalyzers || iNumFilters )
+	{
+		assert(!iNumIndexes); // SI always run in a single thread
+		fCost = CalcMTCost(fCost);
+	}
+
 	return fCost;
 }
 
 /////////////////////////////////////////////////////////////////////
 
-SelectIteratorCtx_t::SelectIteratorCtx_t ( const CSphVector<CSphFilterSettings> & dFilters, const CSphVector<FilterTreeItem_t> &	dFilterTree, const CSphVector<IndexHint_t> & dHints, const ISphSchema &	tSchema, const HistogramContainer_c * pHistograms, SI::Index_i * pSI, ESphCollation eCollation, int iCutoff, int64_t iTotalDocs )
+SelectIteratorCtx_t::SelectIteratorCtx_t ( const CSphVector<CSphFilterSettings> & dFilters, const CSphVector<FilterTreeItem_t> & dFilterTree, const CSphVector<IndexHint_t> & dHints, const ISphSchema &	tSchema, const HistogramContainer_c * pHistograms, columnar::Columnar_i * pColumnar, SI::Index_i * pSI, ESphCollation eCollation, int iCutoff, int64_t iTotalDocs, int iThreads )
 	: m_dFilters ( dFilters )
 	, m_dFilterTree ( dFilterTree )
 	, m_dHints ( dHints )
 	, m_tSchema ( tSchema )
 	, m_pHistograms ( pHistograms )
+	, m_pColumnar ( pColumnar )
 	, m_pSI ( pSI )
 	, m_eCollation ( eCollation )
 	, m_iCutoff ( iCutoff )
 	, m_iTotalDocs ( iTotalDocs )
+	, m_iThreads ( iThreads )
 {}
 
 

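Reviewer note: `CalcMTCost` in the diff above fits a curve of the form `cost(x) = A/sqrt(x) + B` through two points: the single-thread cost at `x = 1`, and the cost divided by `0.16*maxThreads + 1.38` at `x = maxThreads`. A standalone sketch of the same arithmetic follows. The function name `EstimateMTCost` is hypothetical, the hardware thread count is passed as a parameter where the original calls `sphCpuThreadsCount()`, and the guard for `iMaxThreads<=1` is an addition of this sketch (without it the fit would divide by zero).

```cpp
#include <cmath>

// Scales a single-thread cost estimate to iThreads worker threads.
// iMaxThreads stands in for sphCpuThreadsCount() from the diff.
float EstimateMTCost ( float fCost, int iThreads, int iMaxThreads )
{
	// added guard (not in the diff): nothing to scale on a single-thread machine
	if ( iThreads==1 || iMaxThreads<=1 )
		return fCost;

	// constants copied from the diff: linear speedup model at full thread count
	const float fKPerf = 0.16f;
	const float fBPerf = 1.38f;

	float fMaxPerfCoeff = fKPerf*iMaxThreads + fBPerf;
	float fMinCost = fCost/fMaxPerfCoeff;
	if ( iThreads==iMaxThreads )
		return fMinCost;

	// fit y = A/sqrt(x) + B through (1, fCost) and (iMaxThreads, fMinCost);
	// with x1 = 1 the original expressions simplify to the two lines below
	float fA = ( fMinCost-fCost ) / ( 1.0f/std::sqrt ( float(iMaxThreads) ) - 1.0f );
	float fB = fCost - fA;

	return fA/std::sqrt ( float(iThreads) ) + fB;
}
```

With these constants the estimate falls monotonically from the single-thread cost towards `fMinCost` as the thread count grows, with diminishing returns per added thread.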
diff --git a/src/costestimate.h b/src/costestimate.h
@@ -42,12 +42,14 @@ struct SelectIteratorCtx_t
 	const CSphVector<IndexHint_t> &			m_dHints;
 	const ISphSchema &						m_tSchema;
 	const HistogramContainer_c *			m_pHistograms = nullptr;
+	columnar::Columnar_i *					m_pColumnar = nullptr;
 	SI::Index_i *							m_pSI = nullptr;
 	ESphCollation							m_eCollation = SPH_COLLATION_DEFAULT;
 	int										m_iCutoff = -1;
 	int64_t									m_iTotalDocs = 0;
+	int										m_iThreads = 1;
 
-			SelectIteratorCtx_t ( const CSphVector<CSphFilterSettings> & dFilters, const CSphVector<FilterTreeItem_t> &	dFilterTree, const CSphVector<IndexHint_t> & dHints, const ISphSchema &	tSchema, const HistogramContainer_c * pHistograms, SI::Index_i * pSI, ESphCollation eCollation, int iCutoff, int64_t iTotalDocs );
+			SelectIteratorCtx_t ( const CSphVector<CSphFilterSettings> & dFilters, const CSphVector<FilterTreeItem_t> &	dFilterTree, const CSphVector<IndexHint_t> & dHints, const ISphSchema &	tSchema, const HistogramContainer_c * pHistograms, columnar::Columnar_i * pColumnar, SI::Index_i * pSI, ESphCollation eCollation, int iCutoff, int64_t iTotalDocs, int iThreads );
 
 	bool	IsEnabled_SI ( const CSphFilterSettings & tFilter ) const;
 	bool	IsEnabled_Analyzer ( const CSphFilterSettings & tFilter ) const;

diff --git a/src/detail/coroutine_impl.h b/src/detail/coroutine_impl.h
@@ -36,7 +36,10 @@ void ClonableCtx_T<REFCONTEXT, CONTEXT, IS_ORDERED>::LimitConcurrency ( int iDis
 
 	auto iContexts = iDistThreads - 1; // one context is always clone-free
 	if ( !iContexts )
+	{
+		m_bSingle = true;
 		return;
+	}
 
 	m_dChildrenContexts.Reset ( iContexts );
 	m_dJobsOrder.Reset ( iContexts );