Add support for intra-segment search concurrency #13542

Merged
merged 45 commits into apache:main on Sep 10, 2024

Conversation

javanna
Contributor

@javanna javanna commented Jul 4, 2024

This PR introduces support for optionally creating slices that target leaf reader context partitions, allowing partitions of the same segment to be searched concurrently. This helps maximize resource usage when searching force-merged indices, or indices with rather big segments, by parallelizing search execution across subsets of the segments being searched.

Note: this PR does not affect the default generation of slices. Segments can be partitioned by overriding the IndexSearcher#slices(List<LeafReaderContext>) method to plug in ad-hoc slice creation. Moreover, the existing IndexSearcher#slices static method now creates segment partitions when the additional allowSegmentsPartitions argument is set to true.

The overall design of this change builds on the existing search concurrency support based on LeafSlice and CollectorManager. A new LeafReaderContextPartition abstraction is introduced that holds a reference to a LeafReaderContext and the range of doc ids it targets. A LeafSlice now targets segment partitions, each identified by a LeafReaderContext instance and a range of doc ids. It is possible for a partition to target a whole segment, and for partitions of different segments to be combined into the same leaf slice freely, hence searched by the same thread. It is not possible for multiple partitions of the same segment to be added to the same leaf slice.

Segment partitions are searched concurrently by leveraging the existing BulkScorer#score(LeafCollector collector, Bits acceptDocs, int min, int max) method, which scores a specific range of documents for a provided LeafCollector, in place of BulkScorer#score(LeafCollector collector, Bits acceptDocs), which scores all documents.
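
To make the building blocks concrete, here is a minimal sketch (not code from this PR) of plugging segment partitions in by overriding IndexSearcher#slices(List<LeafReaderContext>). The LeafSlice constructor shape and the LeafReaderContextPartition factory names are assumptions inferred from the description above, the class name and the 250,000-doc threshold are arbitrary:

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.IndexSearcher;

// Sketch only: splits big segments into two doc id ranges searched concurrently,
// and keeps small segments whole. API names are assumed, not a verified reference.
class PartitioningIndexSearcher extends IndexSearcher {
  PartitioningIndexSearcher(IndexReader reader) {
    super(reader);
  }

  @Override
  protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
    List<LeafSlice> slices = new ArrayList<>();
    for (LeafReaderContext ctx : leaves) {
      int maxDoc = ctx.reader().maxDoc();
      if (maxDoc <= 250_000) {
        // Small segment: a single partition covering the whole segment.
        slices.add(
            new LeafSlice(List.of(LeafReaderContextPartition.createForEntireSegment(ctx))));
      } else {
        // Big segment: two partitions over disjoint doc id ranges, searched concurrently.
        int mid = maxDoc / 2;
        slices.add(
            new LeafSlice(List.of(LeafReaderContextPartition.createFromAndTo(ctx, 0, mid))));
        slices.add(
            new LeafSlice(List.of(LeafReaderContextPartition.createFromAndTo(ctx, mid, maxDoc))));
      }
    }
    return slices.toArray(LeafSlice[]::new);
  }
}
```

Alternatively, as noted above, the existing static IndexSearcher#slices method can create such partitions when the additional boolean argument is set to true.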

Changes that require migration

The migration guide has the following new clarifying items around the contract and the breaking changes required to support intra-segment concurrency:

  • Collector#getLeafCollector may be called multiple times for the same leaf across distinct Collector instances created by a CollectorManager. Logic that relies on getLeafCollector being called once per leaf per search needs updating (see the sketch after this list).
  • A Scorer, ScorerSupplier or BulkScorer may be requested multiple times for the same leaf
  • IndexSearcher#searchLeaf changed its signature to accept the range of doc ids to score
  • BulkScorer#score(LeafCollector, BitSet) is removed in favour of BulkScorer#score(LeafCollector, BitSet, int, int)
  • static IndexSearcher#slices method changed to take an additional trailing boolean argument that optionally enables the creation of segment partitions
  • TotalHitCountCollectorManager now requires that an array of LeafSlices, retrieved via IndexSearcher#getSlices, is provided to its constructor
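
As an illustration of the first item above, here is a hedged sketch (not from the PR) of a CollectorManager that stays correct when the same leaf is seen by multiple Collector instances, for example because the leaf was split into partitions: it de-duplicates by leaf ord in reduce rather than assuming getLeafCollector is called exactly once per leaf per search.

```java
import java.io.IOException;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.CollectorManager;
import org.apache.lucene.search.LeafCollector;
import org.apache.lucene.search.Scorable;
import org.apache.lucene.search.ScoreMode;

// Counts the number of distinct leaves visited during a search. A leaf split into
// multiple partitions, possibly handled by different collectors, is counted once.
class VisitedLeavesCollectorManager
    implements CollectorManager<VisitedLeavesCollectorManager.VisitedLeavesCollector, Integer> {

  static final class VisitedLeavesCollector implements Collector {
    final Set<Integer> leafOrds = new HashSet<>();

    @Override
    public LeafCollector getLeafCollector(LeafReaderContext context) {
      // May be invoked for just a partition of this leaf; record the leaf ord only.
      leafOrds.add(context.ord);
      return new LeafCollector() {
        @Override
        public void setScorer(Scorable scorer) {}

        @Override
        public void collect(int doc) {}
      };
    }

    @Override
    public ScoreMode scoreMode() {
      return ScoreMode.COMPLETE_NO_SCORES;
    }
  }

  @Override
  public VisitedLeavesCollector newCollector() {
    return new VisitedLeavesCollector();
  }

  @Override
  public Integer reduce(Collection<VisitedLeavesCollector> collectors) throws IOException {
    Set<Integer> allLeafOrds = new HashSet<>();
    for (VisitedLeavesCollector collector : collectors) {
      allLeafOrds.addAll(collector.leafOrds);
    }
    // Union, not sum: partitions of the same leaf must not be double counted.
    return allLeafOrds.size();
  }
}
```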

Note: DrillSideways is the only component that does not support intra-segment concurrency and needs considerable work to do so, due to its requirement that the entire set of docs in a segment gets scored in one go.

The default searcher slicing is not affected by this PR, but LuceneTestCase now randomly leverages intra-segment concurrency. An additional newSearcher method is added that takes a Concurrency enum as the last argument in place of the useThreads boolean flag. This makes it possible to disable intra-segment concurrency for DrillSideways-related tests, which support inter-segment concurrency but not intra-segment concurrency.
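
For example, a LuceneTestCase-based test that exercises DrillSideways might request inter-segment concurrency only. The newSearcher call shape follows the diff quoted later in this conversation; the surrounding test scaffolding is an assumption for illustration:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.tests.index.RandomIndexWriter;
import org.apache.lucene.tests.util.LuceneTestCase;

public class MyDrillSidewaysStyleTest extends LuceneTestCase {
  public void testInterSegmentConcurrencyOnly() throws Exception {
    try (Directory dir = newDirectory();
        RandomIndexWriter writer = new RandomIndexWriter(random(), dir)) {
      writer.addDocument(new Document());
      try (IndexReader reader = writer.getReader()) {
        // DrillSideways does not support intra-segment concurrency yet (see #13753),
        // so ask for a searcher that may only parallelize across whole segments.
        IndexSearcher searcher = newSearcher(reader, true, false, Concurrency.INTER_SEGMENT);
        assertEquals(1, searcher.count(new MatchAllDocsQuery()));
      }
    }
  }
}
```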

Next step

While this change introduces support for intra-segment concurrency, it only sets up the foundations of it. There is still a performance penalty for queries that require segment-level computation ahead of time, such as points/range queries. This is an implementation limitation that we expect to improve in future releases, see #13745.

Additionally, we will need to decide what to do about the lack of support for intra-segment concurrency in DrillSideways before we can enable intra-segment slicing by default. See #13753.

More details: issues found

core

I have found the following two issues, both revolving around total hits counting, which have now been addressed by adapting TotalHitCountCollectorManager's behaviour:

  1. IndexSearcher#count / TotalHitCountCollector rely on Weight#count(LeafReaderContext), which now gets called multiple times against the same leaf and leads to excessive counting of hits.

  2. LRUQueryCache caches the return value of Weight#count. When we execute the same query against the same segment multiple times (as part of a single search call), the first execution counts the docs that the first partition holds; subsequent executions should do the same and count the hits in their own partition of the same segment, instead of retrieving the per-segment count from the cache.

Both these issues are addressed by modifying TotalHitCountCollectorManager to return an enhanced version of TotalHitCountCollector that tracks the leaves it sees and whether they early terminated. The goal is to ensure consistency and correctness across the multiple times that a leaf collector for the same leaf is requested.
If a partition of a certain leaf early terminates, all of the other partitions of that same segment should also early terminate and not contribute to the total hits count. If a partition of a certain leaf computes hit counts, all of the other partitions of that same segment should also compute hit counts (as opposed to potentially being early terminated or retrieved from the cache, which only work globally per leaf, and not per leaf partition).
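
The following is a simplified, hedged sketch of that consistency rule, not the actual TotalHitCountCollectorManager code. It assumes Collector#setWeight and Weight#count, which the existing total hit count optimization already relies on; the first partition of a leaf decides whether the leaf is resolved via Weight#count (and early terminates) or counted doc by doc, and every other partition of that leaf follows the same path.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.CollectionTerminatedException;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.LeafCollector;
import org.apache.lucene.search.Scorable;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.Weight;

// Sketch only. The map and the hit counter are shared by all collectors that a
// CollectorManager creates for one search, so the decision taken for the first
// partition of a leaf binds every other partition of that same leaf.
class PartitionAwareHitCountCollector implements Collector {
  private final Map<Integer, Boolean> leafEarlyTerminated; // leaf ord -> resolved via Weight#count?
  private final LongAdder totalHits;
  private Weight weight;

  PartitionAwareHitCountCollector(Map<Integer, Boolean> leafEarlyTerminated, LongAdder totalHits) {
    this.leafEarlyTerminated = leafEarlyTerminated; // e.g. a shared ConcurrentHashMap
    this.totalHits = totalHits;
  }

  @Override
  public void setWeight(Weight weight) {
    this.weight = weight;
  }

  @Override
  public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
    boolean earlyTerminated =
        leafEarlyTerminated.computeIfAbsent(
            context.ord,
            ord -> {
              // First partition of this leaf decides the path for the whole leaf.
              try {
                int leafCount = weight == null ? -1 : weight.count(context);
                if (leafCount != -1) {
                  totalHits.add(leafCount); // whole-leaf count, added exactly once
                  return Boolean.TRUE;
                }
                return Boolean.FALSE;
              } catch (IOException e) {
                throw new UncheckedIOException(e);
              }
            });
    if (earlyTerminated) {
      // The leaf was already fully accounted for: no partition may count it again.
      throw new CollectionTerminatedException();
    }
    // Otherwise every partition counts hits within its own doc id range.
    return new LeafCollector() {
      @Override
      public void setScorer(Scorable scorer) {}

      @Override
      public void collect(int doc) {
        totalHits.increment();
      }
    };
  }

  @Override
  public ScoreMode scoreMode() {
    return ScoreMode.COMPLETE_NO_SCORES;
  }
}
```

A manager would create all of its collectors with the same shared map and LongAdder, and return totalHits.sum() from reduce.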

I have also adjusted CheckHits to not go outside of the bounds of the doc id range of the current slice.

I fixed TestSortRandom's custom ScorerSupplier to not rely on a scorer supplier being pulled once and only once per LeafReaderContext.

facet

I have found additional issues in the facet module:

  1. DrillSideways does not support scoring via the provided range of doc ids: it throws an exception whenever the range is not 0 to Integer.MAX_VALUE. This is not easy to solve. I have disabled intra-segment concurrency in tests when DrillSideways is used.

  2. FacetsCollectorManager and RandomSamplingFacetsCollector#createManager returned multiple MatchingDocs instances that pointed to the same LeafReaderContext, which caused TestStringValueFacetCounts failures. I added an extra round of de-duplication in the reduce method that merges multiple matching docs from multiple partitions of the same segment back into a single per-segment instance (see the sketch below).
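
A hedged sketch of that de-duplication step (not the actual FacetsCollectorManager code; the MatchingDocs field and constructor shapes follow its long-standing API and may differ in the actual version, and score merging is omitted for brevity):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.DocIdSetBuilder;

final class MatchingDocsMerger {
  static List<MatchingDocs> mergePerSegment(List<MatchingDocs> matchingDocs) throws IOException {
    // Group by segment; LeafReaderContext#ord identifies the segment within the reader.
    Map<Integer, List<MatchingDocs>> bySegment = new LinkedHashMap<>();
    for (MatchingDocs md : matchingDocs) {
      bySegment.computeIfAbsent(md.context.ord, k -> new ArrayList<>()).add(md);
    }
    List<MatchingDocs> merged = new ArrayList<>();
    for (List<MatchingDocs> group : bySegment.values()) {
      if (group.size() == 1) {
        merged.add(group.get(0)); // only one partition matched this segment
        continue;
      }
      // Several partitions of the same segment: union their doc id sets and hit counts.
      DocIdSetBuilder builder = new DocIdSetBuilder(group.get(0).context.reader().maxDoc());
      int totalHits = 0;
      for (MatchingDocs md : group) {
        DocIdSetIterator it = md.bits.iterator();
        if (it != null) {
          builder.add(it);
        }
        totalHits += md.totalHits;
      }
      merged.add(new MatchingDocs(group.get(0).context, builder.build(), totalHits, null));
    }
    return merged;
  }
}
```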

Closes #9721

for (LeafReaderContext ctx : sortedLeaves) {
  if (ctx.reader().maxDoc() > maxDocsPerSlice) {
    assert group == null;
    groupedLeaves.add(Collections.singletonList(ctx));
    // if the segment does not fit in a single slice, we split it in multiple partitions of
Contributor
Thank you for moving forward on this issue @javanna!
I had a different strategy in mind for slicing the index. With the current implementation, we deduce the number of slices based on a given per-slice doc count. What if the number of slices was given instead? Each segment would not be divided into n slices, but a slice could straddle over a few segments. I think it gives us better control over the concurrency and leaves us with fewer slices per segment while achieving the same level of concurrency, which is better because it reduces the fixed cost of search that we pay per slice per segment. Are there challenges that make that hard to implement?

Contributor Author
I honestly don't even remember how I divided segments into multiple slices. That's subject to change, and I'd expect those to be the last details to look at. First, make search across segment partitions work.

Member
I'm confused by the slice vs partition terminology (sorry -- I have not yet looked at the code changes!).

Are we aiming to replace the notion of slices (what IndexSearcher implements today) with partitions? In which case, it does seem important that a partition should support putting N small segments together into a single quantum (one work unit for the searcher thread).

Or, will we keep the notion of a slice as a list of partitions in a single quantum, in which case slices append multiple (small) things (partitions) together, while partitions break up a single big segment?

Contributor Author
Thanks for your feedback @mikemccand! You are right to be confused; this is a POC and the details of the abstractions need refining.

Here is how I got to this place: we have slices that target multiple leaf reader contexts, and I wanted a slice to still be able to target multiple segments while at the same time targeting a partition of a segment. The easiest way to do that was to wrap a leaf reader context into a new abstraction that holds a leaf reader context and defines a range of doc ids as well. A partition can still point to the entire segment though.

Technically, a slice is a set of segments, yet a partition of an index, while a leaf reader context partition is a partition of a segment, hence the confusion :) One natural next step could perhaps be to fold the range of doc ids into the leaf reader context, but that is a substantial API change. Otherwise, we could keep the structure I came up with but rename things to clarify the concepts and reduce confusion. Do you have other options in mind?

Contributor
I had worked up a version of this where I modified LeafReaderContext/IndexReaderContext to create a new kind of context that models the range within a segment. I had added interval start/end to LRC, but I suspect a cleaner way would be to make a new thing (IntervalReaderContext or so) and then change APIs to expect IndexReaderContext instead of CompositeReaderContext? If we do it this way it might make it easier to handle some cases like the single-threaded execution you mentioned. But this is more about cleaning up the APIs than making it work and we can argue endlessly about what is neater, so I think your approach to delay such questions makes sense.

Contributor Author
Going back to this, especially regarding @stefanvodita's comment above: it may make sense to introduce a different way to control the number of slices, instead of using the existing maxDocsPerSlice argument. It is kind of nice that we can apply the existing value and partition segments accordingly, but that can also be surprising and perhaps generate way too many partitions. Good thing to get further feedback on.

Contributor
Sorry I haven't been able to keep up with all the updates on the PR. I still think there's a better partitioning [name up for debate] strategy that will make searching more efficient.

@msokolov put it really well:

partitions or intervals (in my view) are a logical division of the index into roughly equal-sized contiguous (in docid space) portions and they overlay the segments arbitrarily

Contributor

@msokolov msokolov Jul 31, 2024

I was thinking about that, and I guess it's not incompatible with this strategy but can be built on top of it. E.g., if we have segments of size 100, 30, 20 and we want size-25 partitions (maybe we have 6 threads), we can implement that using 4 partitions for the 1st segment, 2 for the second one (of size 25 and 5) and 1 for the last, assigning the last two partitions to the same thread. I.e., we can still build slices (collections of partitions) on top.
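
A toy sketch of that arithmetic (illustrative only, not code from this PR):

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionMath {
  record Partition(int segment, int fromDoc, int toDoc) {}

  // Cut each segment into partitions of at most targetDocsPerPartition docs; the
  // resulting partitions can then be grouped into slices and assigned to threads.
  static List<Partition> partition(int[] segmentSizes, int targetDocsPerPartition) {
    List<Partition> partitions = new ArrayList<>();
    for (int segment = 0; segment < segmentSizes.length; segment++) {
      for (int from = 0; from < segmentSizes[segment]; from += targetDocsPerPartition) {
        int to = Math.min(from + targetDocsPerPartition, segmentSizes[segment]);
        partitions.add(new Partition(segment, from, to));
      }
    }
    return partitions;
  }

  public static void main(String[] args) {
    // Segments of size 100, 30 and 20 with a target partition size of 25 yield
    // 4 + 2 + 1 = 7 partitions, e.g. the second segment gets [0,25) and [25,30).
    partition(new int[] {100, 30, 20}, 25).forEach(System.out::println);
  }
}
```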

@shubhamvishu
Contributor

Nice to see this PR @javanna! I know it might be too early to ask (as the changes are not yet consolidated), but curious whether we have any early benchmarking numbers with intra-segment concurrency enabled?

@mikemccand
Member

Thank you for tackling this @javanna! This is long overdue ... it's crazy that a fully optimized (forceMerge(1)) index removes all intra-query concurrency.

IndexSearcher#count / TotalHitCountCollector rely on Weight#count(LeafReaderContext), which now gets called multiple times against the same leaf and leads to excessive counting of hits.

I think queries that do substantial work in Weight/ScorerSupplier.get are also tricky? E.g. points queries (PointRangeQuery) walks the BKD tree to build a bitset. Ideally, if a segment has multiple LeafReaderContextPartitions, we'd only do the BKD intersect once, and then the N threads working on those partitions can share that bitset?

@jpountz
Contributor

jpountz commented Jul 7, 2024

Thanks for looking into this!

IndexSearcher#count / TotalHitCountCollector rely on Weight#count(LeafReaderContext), which now gets called multiple times against the same leaf and leads to excessive counting of hits.
Resolved by synchronizing the call to getLeafCollector, to make sure we don't pull a leaf collector for the same segment multiple times in parallel

I wonder if it should be on IndexSearcher to not call Collector#getLeafCollector multiple times if the first call throws a CollectionTerminatedException (what you implemented) or if it should be on the CollectorManager to no longer assume that Collector#getLeafCollector gets called exactly once per leaf. The latter feels cleaner to me conceptually, and I believe that it would also avoid the issue that you had with LRUQueryCache? The downside is that we'll need to fix all collectors that make this assumption.

@javanna
Contributor Author

javanna commented Jul 8, 2024

Bulk reply to some of the feedback I got:

hi @shubhamvishu ,

I know it might be too early to ask (as the changes are not yet consolidated), but curious whether we have any early benchmarking numbers with intra-segment concurrency enabled?

I have run no benchmarks so far. I focused on making tests happy, which felt like a good initial step. We do know that there will be a substantial gain when searching force-merged indices.

@mikemccand

I think queries that do substantial work in Weight/ScorerSupplier.get are also tricky? E.g. points queries (PointRangeQuery) walks the BKD tree to build a bitset. Ideally, if a segment has multiple LeafReaderContextPartitions, we'd only do the BKD intersect once, and then the N threads working on those partitions can share that bitset?

Yes, definitely, I mentioned this above in a comment / TODO. Removing duplicated effort across multiple slices pointing to the same segment would be great, but I focused on making things work to start with. What you mention is a natural next step. It may end up influencing the design though, depending on where we decide to de-duplicate work: does the notion of segment partition need to be visible to consumers, or can it be an implementation detail?

@jpountz

I wonder if it should be on IndexSearcher to not call Collector#getLeafCollector multiple times if the first call throws a CollectionTerminatedException (what you implemented) or if it should be on the CollectorManager to no longer assume that Collector#getLeafCollector gets called exactly once per leaf. The latter feels cleaner to me conceptually, and I believe that it would also avoid the issue that you had with LRUQueryCache? The downside is that we'll need to fix all collectors that make this assumption.

Agreed, that adjustment is a hack. The change in expectation should be reflected in the Collector API semantics though (rather than CollectorManager?), is that what you meant? For instance, taking the total hit count example, TotalHitCountCollector would need to be adapted to handle the case where getLeafCollector gets called with the same leaf multiple times? It would need to be made synchronized, and it would need to track which leaves were early terminated?

@jpountz
Contributor

jpountz commented Jul 8, 2024

The change in expectation should be reflected in the Collector API semantics though (rather than CollectorManager?), is that what you meant?

I was referring to CollectorManager because we could have two partitions of the same leaf that are collected by two different Collectors of the same CollectorManager. So both collectors would see this leaf only once, but we still need to avoid double-counting this leaf.

It would need to be made synchronized and it would need to track which leaves were early terminated?

Something like that indeed. And only try to apply the Weight#count optimization on the first time that getLeafCollector is called on a leaf to avoid hitting the problem you had with LRUQueryCache.

@msokolov
Contributor

msokolov commented Jul 8, 2024

I wonder if we should tackle the issue with caching / cloning scorers? We have scorers/scorerSuppliers that do a lot of up-front work when created, and we don't want to duplicate that work when two threads search the same segment. OTOH two threads cannot share the same Scorer instance since it has internal state, namely the iterator it advances, and maybe more. This seems like the biggest problem to tackle - maybe it would make sense to start there? OTOH we need some kind of execution framework (this PR) that we can use to exercise such changes - I guess it's a kind of leapfrogging problem, gotta start somewhere. Given that, maybe we create a long-running branch we can use to iterate on this in multiple steps before merging to main?

As for the Scorer cloning, I poked at it and it seems tricky. I think that Weight can cache Scorer and/or ScorerSupplier, and I guess BulkScorer. Then, when asked for a scorer-type object that it already has, it can create a shallowCopy(). Shallow-copying should be implemented so as to do the least amount of work needed to create a new instance that can be iterated independently. This will be up to each Scorer/ScorerSupplier/BulkScorer to figure out -- maybe some of them just return null and let Weight create a new one as it does today.
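
A hedged sketch of that direction (an idea under discussion, not something this PR implements): cache the expensive per-leaf structure once and hand each partition its own cheap iterator over it. Here the cached structure is a bit set, keyed by leaf ord within a single search; a real solution would more likely live inside Weight/ScorerSupplier as described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BitSet;
import org.apache.lucene.util.BitSetIterator;

// Sketch only: the costly load (e.g. a BKD intersect building a bit set) runs once per
// leaf, while each partition gets an independent iterator and keeps its own position.
final class PerLeafBitSetCache {
  interface LeafBitSetLoader {
    BitSet load(LeafReaderContext context) throws IOException;
  }

  private final Map<Integer, BitSet> cache = new ConcurrentHashMap<>(); // keyed by leaf ord
  private final LeafBitSetLoader loader;

  PerLeafBitSetCache(LeafBitSetLoader loader) {
    this.loader = loader;
  }

  /** Returns a fresh iterator per caller; the underlying bit set is computed only once. */
  DocIdSetIterator iterator(LeafReaderContext context) {
    BitSet bits =
        cache.computeIfAbsent(
            context.ord,
            ord -> {
              try {
                return loader.load(context);
              } catch (IOException e) {
                throw new UncheckedIOException(e);
              }
            });
    return new BitSetIterator(bits, bits.approximateCardinality());
  }
}
```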

    groupedLeaves.add(Collections.singletonList(ctx));
    // if the segment does not fit in a single slice, we split it in multiple partitions of
    // equal size
    int numSlices = Math.ceilDiv(ctx.reader().maxDoc(), maxDocsPerSlice);
Contributor
My mental model of the whole slice/partition/segment/interval concept is: existing physical segments (leaves) divide the index into arbitrary sizes. Existing slices (what we have today, not what is called slices in this PR) group segments together. Partitions or intervals (in my view) are a logical division of the index into roughly equal-sized contiguous (in docid space) portions, and they overlay the segments arbitrarily. Then it is the job of IndexSearcher to map this logical division of work onto the underlying physical segments. The main comment here is - let's not confuse ourselves by re-using the word "slice", which already means something else!

Contributor
partitions or intervals (in my view) are a logical division of the index into roughly equal-sized contiguous (in docid space) portions and they overlay the segments arbitrarily

I like the idea of calling them virtual segments in contrast to physical segments. Interval (and less so partition) seems to imply the part about contiguity in docid space, but is that a necessary assumption? It's efficient for it to work that way, but maybe we can allow more flexibility in how we determine these partitions / intervals (virtual segments).

Contributor

@shubhamvishu shubhamvishu Jul 10, 2024

We are changing the meaning of slices from a group of segments to a partition of a segment, if I understand correctly. I think it's ok to change its definition in this PR (we are not adding anything that's conflicting, just changing its meaning). But if it is really creating confusion, maybe renaming it to totalNumLeafPartitions (or something better, as naming is hard) could do the job?

Contributor Author
Thanks for the feedback! I'd suggest postponing detailed reviews around how slices are generated, and naming, for now. I don't plan on addressing these details at the moment; I'd rather focus on functionality and high-level API design (where do we expose the range of doc ids within the different Lucene APIs, what problems could there be with the current approach, finding better solutions for the hack I came up with).

@javanna javanna force-pushed the intra_segment_concurrency branch 6 times, most recently from ef8b961 to 790191f on July 22, 2024 12:06
// An alternative is to document the requirement to not reuse collector managers across
// searches, but that would be a breaking change. Perhaps not needed for now.
seenContexts.clear();
Contributor Author
Because the collector manager holds state, either we clean it in-between searches (reduce is the current place where that can be done, but that may change over time), or we declare that collector managers cannot be re-used across multiple searches (potential breaking change).

I am unsure whether it is necessary to make a breaking change for this; perhaps clearing state in reduce is not that bad.

@javanna javanna force-pushed the intra_segment_concurrency branch 3 times, most recently from 980e966 to 8413125 on July 22, 2024 18:37
@@ -80,7 +80,7 @@ public void test() throws Exception {
// System.out.println("query clone count=" + queryCloneCount);
assertTrue(
"too many calls to IndexInput.clone during TermRangeQuery: " + queryCloneCount,
queryCloneCount < 50);
queryCloneCount <= Math.max(s.getLeafContexts().size(), s.getSlices().length) * 4);
Contributor Author
This is to account for the higher number of slices created while running tests, the number of calls to clone being a factor of that.

Contributor
Hmm, shouldn't the multiplicative factor be the number of leaf context partitions?

Contributor Author
I have no idea; it varied a lot in my tests and I could not align my expectations with what I observed in practice, so I ended up applying a magic factor that seems to be good for all the cases.

Contributor Author
I did more digging on this. Things were much more predictable before because we only have 20 docs and a couple of segments, so 50 was high enough for all cases.

With intra-segment slicing, we end up with more slices, and the number of clone calls is indeed a factor of the number of slices, but there is more to it which depends on the collected terms, and I am not sure how to calculate that exactly or make it predictable. The test does not seem to do that either so far, it just has a high enough value.

I see two clone calls per partition when Weight#scorerSupplier is called. They come from two different places in the IntersectTermsEnum constructor (lines 89 and 98).

I see another two clone calls per partition when ScorerSupplier#bulkScorer is called.

That is where my current factor of 4 comes from. It works in the vast majority of cases. I have a seed (F02FB6F046EE8C80) with 9 partitions but a total of 40 clone calls as part of the search. Those additional 4 calls are rather unpredictable; I think it depends on whether the query is rewritten as a boolean query or not.

After all this analysis though, I realize that I came up with a formula that is more restrictive than the previous fixed value. Here we are searching 9 partitions, so the limit becomes 36 (against a real value of 40), while previously the fixed limit was 50. I think I can increase the factor further without worrying too much. The key is that we are currently limiting the max number of partitions per segment to 5. We will need to adapt things in tests if we remove that limit in the future, which is likely.

@@ -211,7 +211,7 @@ private void doTestScores(int totalHitsThreshold) throws IOException {
}

public void testBoosts() throws IOException {
doTestBoosts(2);
doTestBoosts(1);
Contributor Author
These two changes account for single-document leaf partitions being created in some scenarios, for which we would never be able to early terminate with 2 as the total hits threshold (as there is no partition with more than one doc).

@javanna
Contributor Author

javanna commented Jul 22, 2024

I did some work on this draft PR and made the fixes around hit counting, early termination and caching more future-proof. They are now contained in TotalHitCountCollectorManager, which feels like a good fit.

I need to investigate recurring test failures in TestSortRandom. I am now focusing on adjustments to the ScorerSupplier area, so that multiple partitions can do segment-level work once.

I am getting some facets failures too, which are predictable, given I have not updated the facets code in any way. I need help figuring out a way forward. FacetsCollector has a separate single-segment code path, and I see different assertions tripping if I try to adjust some of that logic. At the end of it I get the "Sub-iterators of ConjunctionDISI are not on the same document!" error, which sounds accurate given that each partition gets its own iterator and they would all be at a different doc indeed. I am not sure how to proceed; we may want to opt out of applying intra-segment concurrency in this case. There's also an issue with DrillSideways, which has a bulk scorer that ignores the provided range of doc ids, and even asserts that the full range is provided. @mikemccand would you have ideas on what to do here?

List of facet test failures
  • org.apache.lucene.facet.TestRandomSamplingFacetsCollector.testRandomSampling (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestRandomSamplingFacetsCollector.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.TestRandomSamplingFacetsCollector.testRandomSampling" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=BD71211916214319 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=256 -Ptests.forceintegervectors=true

  • org.apache.lucene.facet.TestDrillSideways.testBasic (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestDrillSideways.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.TestDrillSideways.testBasic" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=75A056B7DBFA7364 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=default -Ptests.forceintegervectors=false

  • org.apache.lucene.facet.TestDrillSideways.testMultipleRequestsPerDim (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestDrillSideways.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.TestDrillSideways.testMultipleRequestsPerDim" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=BD71211916214319 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=256 -Ptests.forceintegervectors=true

  • org.apache.lucene.facet.TestDrillSideways.testBasicWithCollectorManager (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestDrillSideways.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.TestDrillSideways.testBasicWithCollectorManager" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=75A056B7DBFA7364 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=default -Ptests.forceintegervectors=false

  • org.apache.lucene.facet.TestStringValueFacetCounts.testRandom (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestStringValueFacetCounts.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.TestStringValueFacetCounts.testRandom" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=BD71211916214319 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=256 -Ptests.forceintegervectors=true

  • org.apache.lucene.facet.TestStringValueFacetCounts.testBasicSingleValued (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestStringValueFacetCounts.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.TestStringValueFacetCounts.testBasicSingleValued" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=BD71211916214319 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=256 -Ptests.forceintegervectors=true

  • org.apache.lucene.facet.range.TestRangeFacetCounts.testCustomDoubleValuesSource (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.range.TestRangeFacetCounts.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.range.TestRangeFacetCounts.testCustomDoubleValuesSource" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=BD71211916214319 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=256 -Ptests.forceintegervectors=true

  • org.apache.lucene.facet.TestFacetUtils.testBasic (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.TestFacetUtils.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.TestFacetUtils.testBasic" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=75A056B7DBFA7364 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=default -Ptests.forceintegervectors=false

  • org.apache.lucene.facet.rangeonrange.TestRangeOnRangeFacetCounts.testMultiDimMixedRangeAndNonRangeTaxonomy (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.rangeonrange.TestRangeOnRangeFacetCounts.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.rangeonrange.TestRangeOnRangeFacetCounts.testMultiDimMixedRangeAndNonRangeTaxonomy" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=75A056B7DBFA7364 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=default -Ptests.forceintegervectors=false

  • org.apache.lucene.facet.taxonomy.TestTaxonomyFacetLabels.testBasic (:lucene:facet)
    Test output: /home/runner/work/lucene/lucene/lucene/facet/build/test-results/test/outputs/OUTPUT-org.apache.lucene.facet.taxonomy.TestTaxonomyFacetLabels.txt
    Reproduce with: gradlew :lucene:facet:test --tests "org.apache.lucene.facet.taxonomy.TestTaxonomyFacetLabels.testBasic" -Ptests.jvms=2 -Ptests.jvmargs= -Ptests.seed=75A056B7DBFA7364 -Ptests.gui=true -Ptests.file.encoding=US-ASCII -Ptests.vectorsize=default -Ptests.forceintegervectors=false

It would be good to align and decide on terminology; I have no strong opinions. To me, the LeafReaderContextPartition terminology can coexist with the existing slice terminology. Slices of an index are groups of segments of that index. A leaf partition is a subset of a segment, identified by a LeafReaderContext and a range of doc ids. This is only a starting point and I am happy to hear what ideas others have around this.

The next thing I'd like to start thinking about is whether support for intra-segment concurrency requires breaking changes, as we have a chance to make those with Lucene 10. I wonder specifically if we are going to be able to hide the notion of leaf partition and the range of doc ids like I currently do, or if that will need to become more of a first-class citizen, for instance when we want to remove duplicated work across partitions. I need to do a bit more digging to form an opinion on this.

search(
Arrays.stream(leaves).map(leafReaderContextSlice -> leafReaderContextSlice.ctx).toList(),
weight,
collector);
Contributor Author

@javanna javanna Jul 22, 2024

Question about this: this is used in a single test (TestBooleanQueryVisitSubscorers), which never sets the executor, hence there is only ever going to be a single slice and no parallelism. It is meaningless to do anything better than what I did here, but ScorerIndexSearcher is public and part of the Lucene test framework, so even if we never use the ScorerIndexSearcher(IndexReader, Executor) constructor internally, there may be external users?

I want to double check that this is worth iterating further on, to support concurrent search across leaf partitions.

Contributor Author
I added a check that verifies that only entire segments are scored using this searcher subclass. This is to cover potential existing subclasses of it; we have no such scenario in our own codebase.

javanna added a commit to javanna/lucene that referenced this pull request Jul 23, 2024
…or) method

There's a couple of places in the codebase where we extend IndexSearcher to customize
per leaf behaviour, and in order to do that, we need to override the entire search method
that loops through the leaves. A good example is ScorerIndexSearcher.

Adding a searchLeaf method that provides the per leaf behaviour makes those cases a little
easier to deal with. Also, it paves the road for apache#13542 where we will want to replace
search(List<LeafReaderContext>, Weight, Collector) with something along the lines of
search(LeafReaderContextPartition[], Weight, Collector).
scores = newScores;
}
scores[totalHits] = scorer.score();
scores[doc] = scorer.score();
Contributor Author
@gsmiller could you double check the changes I made around facets collection? Tests are green and they were not before, but it's always good to double check. The changes around scoring here are caused by my recent change to support keepScores set to true in FacetsCollectorManager. I am slightly nervous about these because of the potentially big array needed to keep scores around and then merge them.

Contributor Author
Below, FacetsCollectorManager now has logic to merge back results from multiple FacetsCollectors referring to the same LeafReaderContext, which seems to be the expectation in how we expose results from FacetsCollectorManager. It is not particularly nice, but it requires no changes for users and should not introduce noticeable overhead.

Contributor
Saw you already merged this but I made some time to look at the facet collecting changes anyway. They look correct to me. I share your concern with keeping around large sparse score arrays, but I don't think keepScores is a particularly common use-case (could be wrong...) and we can always look into ways of improving this as a follow-up if necessary. Thanks!

Contributor Author
Thanks for looking! I noticed that we document the scores array as non-sparse and this change makes it indeed sparse. I think I could have been smarter here and kept it non-sparse; I will try to address this. It also feels bad that we allocate such a big array and keep resizing it. We should rather build it in advance, knowing how many docs the leaf has. I will see what I can do about this.

@@ -142,7 +142,7 @@ protected Facets buildFacetsResult(
private IndexSearcher getNewSearcher(IndexReader reader) {
// Do not wrap with an asserting searcher, since DrillSidewaysQuery doesn't
// implement all the required components like Weight#scorer.
IndexSearcher searcher = newSearcher(reader, true, false, random().nextBoolean());
IndexSearcher searcher = newSearcher(reader, true, false, Concurrency.INTER_SEGMENT);
Contributor Author
All the tests that rely on DrillSideways need to explicitly disable intra-segment concurrency, but can use inter-segment concurrency, hence I introduced a way to distinguish between the two.

Contributor Author
I opened #13753 to track this.

Member

@mikemccand mikemccand left a comment

Thank you everyone for persisting on this change! This is long, long overdue for Lucene. I'm happy we are getting a start in 10.0.0.

Also thank you for the numerous comments/javadocs/MIGRATE entries explaining the change so well. I mostly only reviewed those publicly visible docs / API changes.

@mikemccand
Member

Perhaps it is reasonable to try and spend a little time figuring out whether there are foreseeable breaking changes we'll need to make for the next step; I could use help with that! @jpountz do you have a good sense for that?

While I care about backward compatibility for the Query / IndexReader / IndexSearcher APIs, which are user-facing, I consider the Weight/Scorer API internal, so I'm not worried about this: we can break it in a minor if necessary.

+1, but have we marked these APIs as @lucene.internal?

@javanna
Contributor Author

javanna commented Sep 10, 2024

Thanks all for the feedback and reviews! The journey only begins as this PR is getting merged, looking forward to making intra-segment concurrency support shine soon :)

@javanna javanna merged commit fafd6af into apache:main Sep 10, 2024
3 checks passed
@jpountz
Contributor

jpountz commented Sep 11, 2024

have we marked these APIs as @lucene.internal?

No indeed, but we have been effectively treating them as unstable API for as long as I've been a committer on this project. Would be good to align our javadocs with our practices.

javanna added a commit to javanna/lucene that referenced this pull request Sep 16, 2024
We have removed BulkScorer#score(LeafReaderContext, Bits) in main in favour of
BulkScorer#score(LeafCollector collector, Bits acceptDocs, int min, int max) as
part of apache#13542. This commit deprecates the method in 9.x. Internal usages of it
are left untouched as there may be subclasses that override it, which we don't
want to break in a minor release.
Successfully merging this pull request may close these issues.

Decouple within-query concurrency from the index's segment geometry [LUCENE-8675]