Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Dense reader: improve memory consumption for tile structures. #5046

Merged
merged 2 commits into from
Jun 11, 2024

Conversation

KiterLuc
Copy link
Contributor

@KiterLuc KiterLuc commented Jun 5, 2024

This change significantly reduces the memory usage for tile structures (result space tile and tile subarray) by doing three things:

  • Only create the structures for the tiles that will be processed in an internal loop.
  • Using a structure to store the tile subarray with a smaller memory footprint.
  • Adding estimation for the tile structure size when determining the tiles to process for an iteration.

For the first point, a new class was added: IterationTileData to store the tile structurres for an iteration. It is held in a shared pointer so that the compute task can take tile data to finish processing tiles in a thread safe/efficient manner. A problem that arose here is that the tile subarrays were used to get the number of cells when computing the tiles that it can process in an iteration. Since the tile subarrays are heavy to compute, and should be computed in a parallel manner, this posed a problem. To solve this, a new method was added to Subarray to return the cell count of a particular tile (`Subarray::tile_cell_num``). Since the method doesn't allocate any memory, it can run significantly fast and there are no noticeable performance degradation, even in large queries. With this method, we can compute the tiles to process and then compute the tile subarrays later once we know the number of tiles for an iteration.

For the second point, a new class was added: DenseTileSubarray to store the tile subarrays. This class takes much less memory than Subarray and is also much fater to create in memory. Since this is for dense only, the ranges can be types, which also uses less memory as we don't need to store a size. Finally, the computed ranges for a tile can be used in TileCellSlabIter as a const reference, which surely improves performance and reduce small memory allocations.

For the third bullet point, compute_result_space_tiles now tries to estimate the memory for the tile structures as it can be significant when the tile size is small (ie: 1000 cells). I've also changed the code to take into account the filtered data size when computing memory against the memory budget to improve our accuracy.

Note that this PR extracted ReaderBase::compute_tile_domains, which is common code for the new dense reader and the legacy reader to compute the result space tiles. The rest of the code was split between the legacy reader class and the new dense reader class as their implementations diverged significantly.

Finally, I've made a small change to release the filtered data as soon as unfiltering is done to limit the memory footprint even further.

[sc-38387]
[sc-48388]


TYPE: IMPROVEMENT
DESC: Dense reader: improve memory consumption for tile structures.

/* ********************************* */
/* STATIC FUNCTIONS */
/* ********************************* */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this just brings the functions that used to be in reader base into the legacy reader. The implementations between this and the dense reader diverged too much and only the code in the new ReaderBase::compute_tile_domains is still common.

Copy link
Contributor

@DimitrisStaratzis DimitrisStaratzis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. I will add more after our meeting tomorrow

tiledb/sm/subarray/subarray.h Outdated Show resolved Hide resolved
tiledb/sm/subarray/subarray.cc Show resolved Hide resolved
tiledb/sm/query/readers/dense_reader.cc Outdated Show resolved Hide resolved
This change significantly reduces the memory usage for tile structures (result space tile and tile subarray) by doing three things:
- Only create the structures for the tiles that will be processed in an internal loop.
- Using a structure to store the tile subarray with a smaller memory footprint.
- Adding estimation for the tile structure size when determining the tiles to process for an iteration.

For the first point, a new class was added: `IterationTileData` to store the tile structurres for an iteration. It is held in a shared pointer so that the compute task can take tile data to finish processing tiles in a thread safe/efficient manner. A problem that arose here is that the tile subarrays were used to get the number of cells when computing the tiles that it can process in an iteration. Since the tile subarrays are heavy to compute, and should be computed in a parallel manner, this posed a problem. To solve this, a new method was added to Subarray to return the cell count of a particular tile (`Subarray::tile_cell_num``). Since the method doesn't allocate any memory, it can run significantly fast and there are no noticeable performance degradation, even in large queries. With this method, we can compute the tiles to process and then compute the tile subarrays later once we know the number of tiles for an iteration.

For the second point, a new class was added: `DenseTileSubarray` to store the tile subarrays. This class takes much less memory than `Subarray` and is also much fater to create in memory. Since this is for dense only, the ranges can be types, which also uses less memory as we don't need to store a size. Finally, the computed ranges for a tile can be used in `TileCellSlabIter` as a const reference, which surely improves performance and reduce small memory allocations.

For the third bullet point, `compute_result_space_tiles` now tries to estimate the memory for the tile structures as it can be significant when the tile size is small (ie: 1000 cells). I've also changed the code to take into account the filtered data size when computing memory against the memory budget to improve our accuracy.

Note that this PR extracted `ReaderBase::compute_tile_domains`, which is common code for the new dense reader and the legacy reader to compute the result space tiles. The rest of the code was split between the legacy reader class and the new dense reader class as their implementations diverged significantly.

Finally, I've made a small change to release the filtered data as soon as unfiltering is done to limit the memory footprint even further.

[sc-38387]
[sc-48388]

---
TYPE: IMPROVEMENT
DESC: Dense reader: improve memory consumption for tile structures.
@KiterLuc KiterLuc force-pushed the lr/dense-reader-reduce-tile-memory/ch38387 branch from 389e905 to cd8b7aa Compare June 10, 2024 23:19
@KiterLuc KiterLuc merged commit 7f7b15b into dev Jun 11, 2024
60 checks passed
@KiterLuc KiterLuc deleted the lr/dense-reader-reduce-tile-memory/ch38387 branch June 11, 2024 08:55
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants