FST#Compiler allocates too much memory #12598
Comments
I did an experiment: index 5M random BytesRefs and count the byte usage when the FST is built.

Since the BytesRefs are random, they would share little prefix and suffix, so I tried to mock some common prefixes/suffixes for them, like:
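For illustration only, here is a minimal sketch of one way such prefix/suffix mocking could look in Java; the prefix/suffix pools, term lengths, and class name are assumptions, not the exact code used in the experiment:

```java
import java.util.Random;
import org.apache.lucene.util.BytesRef;

// Hypothetical sketch: generate "random" terms that still share some common
// prefixes and suffixes, roughly mimicking real-world term distributions.
public class MockTerms {
  private static final String[] PREFIXES = {"user_", "doc_", "2023_", ""};
  private static final String[] SUFFIXES = {"_id", ".txt", "_v2", ""};

  public static BytesRef randomTerm(Random random) {
    StringBuilder sb = new StringBuilder();
    // Pick a shared prefix, a random middle part, and a shared suffix.
    sb.append(PREFIXES[random.nextInt(PREFIXES.length)]);
    int middleLength = 3 + random.nextInt(10);
    for (int i = 0; i < middleLength; i++) {
      sb.append((char) ('a' + random.nextInt(26)));
    }
    sb.append(SUFFIXES[random.nextInt(SUFFIXES.length)]);
    return new BytesRef(sb.toString());
  }
}
```

To feed such terms into an FST they would still need to be deduplicated and sorted first, since FST construction requires inputs in sorted order.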
And here is the result:
We allocate a 32 KB block while 99% of cases only need 5 KB. These results roughly match the allocation profile, where we rarely need a second block.
I get similar statistics for wikimediumall, and here are the results:

It seems 1 KB per block is enough here: 99% of cases can be covered by a single block, and we need at most 600+ blocks for a single FST.
Thanks @gf2121 -- this is a great discovery (and thank you https://blunders.io for the awesome integrated profiling in Lucene's nightly benchmarks!).
It's always dangerous to test on random data (even random data that attempts to simulate realistic patterns): you'll draw random conclusions yet act on them as if they were trustworthy! Better to test by indexing true terms that show up in a real-world corpus. Still, I think this issue is indeed compelling -- most FSTs in practice are likely smaller than 32 KB.

But one issue to be aware of: shrinking the block size will worsen the issue linked above, where building & immediately using an FST (in heap, not via save/load) performs poorly when the FST needs more than one block.

Really we need to get away from FST attempting to re-implement an in-memory filesystem, badly! We've seen many problems from this over the years: poor performance reading bytes in reverse, this issue (over-allocating blocks), the above linked issue (poor performance using a just-built FST). The FST compiler really ought to stream its bytes directly to ... But until we fix this "correctly" (stop trying to be a filesystem) ...
FWIW, I was looking into this a bit when I saw this issue come in. Specifically on Solr 8.11, but as far as I can tell the changes in #12604 apply to 8.x as well. In a 30s async-profiler memory allocation profile, I can see that ~10% of the total (4% on the left and ~6% near the middle) is this same FST byte array. I put up a backport PR for lucene-solr branch_8_11: apache/lucene-solr#2677
Description
https://blunders.io/jfr-demo/indexing-4kb-2023.09.25.18.03.36/allocations-drill-down
The allocation profile for the nightly indexing benchmark shows that `FST#ByteStore` occupies a rather high ratio, because we construct a `FST#Compiler` each time `Lucene90BlockTreeTermsWriter$TermsWriter#writeBlocks` is called, which allocates a new 32 KB `byte[]`.

As the profile shows, most of the memory was allocated from `FST#init`, which means `BytesStore` rarely needs another page to store bytes. So I suspect we are using too large a page here. Maybe we should decrease the page size to 8 KB or even smaller?
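To make the page-size discussion concrete, here is a minimal sketch of building an FST with a smaller bytes page than the 32 KB default. It assumes the Lucene 9.x `FSTCompiler.Builder` API and its `bytesPageBits` option; the term list and output values are made up for illustration:

```java
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.IntsRefBuilder;
import org.apache.lucene.util.fst.FST;
import org.apache.lucene.util.fst.FSTCompiler;
import org.apache.lucene.util.fst.PositiveIntOutputs;
import org.apache.lucene.util.fst.Util;

public class SmallPageFst {
  public static void main(String[] args) throws Exception {
    PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
    // bytesPageBits(13) requests 8 KB pages instead of the default 32 KB (2^15).
    FSTCompiler<Long> compiler =
        new FSTCompiler.Builder<>(FST.INPUT_TYPE.BYTE1, outputs)
            .bytesPageBits(13)
            .build();

    IntsRefBuilder scratch = new IntsRefBuilder();
    // Inputs must be added in sorted order.
    String[] terms = {"apple", "banana", "cherry"};
    long ord = 1;
    for (String term : terms) {
      compiler.add(Util.toIntsRef(new BytesRef(term), scratch), ord++);
    }

    FST<Long> fst = compiler.compile();
    System.out.println("FST RAM bytes used: " + fst.ramBytesUsed());
  }
}
```

With 8 KB (or smaller) pages, the common single-block case would allocate far less per `writeBlocks` call, at the cost of more block boundaries for the rare FSTs that outgrow one page.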