-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add simplified C++23 chunk_view
#5035
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not reviewed the code itself; absence of comments about it is not endorsement.
What I was looking for is code structure, build/test issues, and naming.
include(unit_test) | ||
|
||
commence(unit_test util) | ||
this_target_sources( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our indent style has been (1) two spaces inside commence
/conclude
(which CLion apparently can't be configured to do automatically) and (2) four spaces for function arguments (which CLion will do).
c6fd503
to
d6a5f35
Compare
d6a5f35
to
0ae3311
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ran the new tests on my local Windows machine and got no checked iterator errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things I noticed. One is from last time.
@@ -0,0 +1,30 @@ | |||
# | |||
# tiledb/stdx/__ranges/CMakeLists.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This directory should be just ranges
; no underscores
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard library header file is called , which, if we mimic, requires a different name for the subdirectory -- LLVM does it this way (with the underscores for the directory name). Alternatively, we can use "ranges.h" for our own ranges and use ranges
for the subdirectory -- but we've mimicked thread
and stop_token
by having our own files named thread
and stop_token
, so there is precedent for range
, notably in the stdx
directory. I think we should be consistent either way (either all without .h
or all with .h
). GCC solves this by putting everything in the ranges
file, which I would prefer not to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but it looks weird, and it's not something we're doing anywhere else in our own code. Thus it needs documentation.
The appropriate place is DIRECTORY.md
, which we've used elsewhere.
I think all comments are addressed now. If not let me know and I'll make sure we follow up.
This PR adds a class to provide a subset of the functionality of the C++23
std::ranges::chunk_view
. This view divides an underlying view into a range of equal-sized chunks (the last one may be shorter than the others). A complete description ofchunk_view
can be found on cppreference.com, which also includes "exposition only" aspects that point to particular implementation approaches. We did not adopt those approaches in most cases, though the public API of the class matches the spec, for the functions that we did implement. This view will be used to partition the original input data from the user query into, well, chunks that will be individually sorted and then written to a TileDB array as a data block. Data blocks are of fixed and uniform byte size that will contain a chunk smaller than or equal to its byte limit.The class assumes a
random_access_range
, which is what we have in our buffers that we are chunking, so I didn't try to generalize any further than that.The class provides a complete iterator interface via
iterator_facade
as well as a complete view interface viaview_interface
. Aspects of those interfaces that rely on C++23 are pended.Unit tests validate that
chunk_view
and its iterator meet the expectedstd::ranges
concepts.The unit tests do not yet test composition of this view with other views (our custom views as well as other C++20 views). This will be fairly extensive and the subject of its own PR.
One todo would be to create a variable chunk class view that chunks up the underlying view into variable-sized pieces, the better to fit into the data blocks. However, the data blocks will be fairly large, so there should not be too much internal fragmentation. We can always manage internal fragmentation by making the chunks some fraction of the block size, so even if there is some variation in chunk size, they can be well-packed into the data blocks.
TYPE: IMPROVEMENT
DESC: Implementation of a
chunk_view
class to provide a subset of C++23 chunk_view, suitable for supporting external sort.