Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

chore: cleanup decoder benches #3118

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

westonpace
Copy link
Contributor

This ensures that the setup routines happen inside of the benchmark function. Otherwise datagen will be run for all benchmarks during the collection phase and the system will run out of RAM.

This also changes the routines to use FileMetadataCache::no_cache as it turns out moka doesn't immediately free its cache space on drop and, when used in a tight benchmark loop, this leads to running out of memory.

@github-actions github-actions bot added the chore label Nov 11, 2024
@codecov-commenter
Copy link

codecov-commenter commented Nov 11, 2024

Codecov Report

Attention: Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.

Project coverage is 77.19%. Comparing base (ec76db4) to head (236c728).

Files with missing lines Patch % Lines
rust/lance-encoding-datafusion/src/zone.rs 0.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3118   +/-   ##
=======================================
  Coverage   77.19%   77.19%           
=======================================
  Files         240      240           
  Lines       81517    81523    +6     
  Branches    81517    81523    +6     
=======================================
+ Hits        62927    62935    +8     
- Misses      15383    15384    +1     
+ Partials     3207     3204    -3     
Flag Coverage Δ
unittests 77.19% <81.81%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@westonpace westonpace marked this pull request as ready for review November 13, 2024 16:43
.unwrap();
const NUM_ROWS: u64 = 10000;
// Randomly generated strings are always 12 characters (at the moment)
const NUM_BYTES: u64 = NUM_ROWS * 16;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe a comment 16 comes from 12 bytes data + 4 bytes offset?


let lance_schema =
Arc::new(lance_core::datatypes::Schema::try_from(data.schema().as_ref()).unwrap());
let encoding_strategy = default_encoding_strategy(LanceFileVersion::default());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think fixed size binary encoding is not enabled in LanceFileVersion::default(Lance 2.0)

.unwrap();
let lance_schema =
Arc::new(lance_core::datatypes::Schema::try_from(data.schema().as_ref()).unwrap());
let encoding_strategy = default_encoding_strategy(LanceFileVersion::default());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this intentional that we only run bench_decode against 2.0(LanceFileVersion::default)?

@broccoliSpicy
Copy link
Contributor

related to #2725

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants