Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

core/btree: small refactoring + documentation tweaks #543

Closed
wants to merge 4 commits into from

Conversation

jussisaurio
Copy link
Collaborator

@jussisaurio jussisaurio commented Dec 24, 2024

small follow up to #539

contains:

  • Variable renaming and comments to btreecursor.insert_into_cell()
  • New utility methods pagecontent.header_size(), pagecontent.cell_pointer_array_size(), pagecontent.unallocated_region_start() and pagecontent.unallocated_region_size()
  • Refactor of btreecursor.compute_free_space() (plus comments and variable renaming)
  • Rename pagecontent.cell_get_raw_pointer_region() to pagecontent.cell_pointer_array_offset_and_size() and remove its usage in btreecursor.defragment_page()

@jussisaurio jussisaurio requested a review from pereman2 December 24, 2024 08:32
core/storage/btree.rs Outdated Show resolved Hide resolved
@jussisaurio jussisaurio changed the title core/btree: some documentation tweaks to insert_into_cell()/compute_free_space() core/btree: some documentation tweaks to insert_into_cell() + refactor compute_free_space() Dec 24, 2024
@jussisaurio jussisaurio changed the title core/btree: some documentation tweaks to insert_into_cell() + refactor compute_free_space() core/btree: refactor compute_free_space() + some documentation tweaks to insert_into_cell() Dec 24, 2024
@jussisaurio jussisaurio force-pushed the btree-cell-doc-tweaks branch from cd45049 to 25338b5 Compare December 24, 2024 17:00
@jussisaurio jussisaurio changed the title core/btree: refactor compute_free_space() + some documentation tweaks to insert_into_cell() core/btree: small refactoring + documentation tweaks Dec 24, 2024
@jussisaurio jussisaurio force-pushed the btree-cell-doc-tweaks branch from 136ae48 to 42ea904 Compare December 24, 2024 17:27
@@ -526,12 +556,7 @@ impl PageContent {
let ncells = self.cell_count();
// the page header is 12 bytes for interior pages, 8 bytes for leaf pages
// this is because the 4 last bytes in the interior page's header are used for the rightmost pointer.
let cell_pointer_array_start = match self.page_type() {
Copy link
Collaborator Author

@jussisaurio jussisaurio Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a bug? doesn't include offset for 1st page in db (which is 100 bytes, due to the database header)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm feels like it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this one in particular probably isn't, because PageContent generally tries to hide the existence of the potential 100 byte offset from the user -- all of the read/write functions e.g. read_u16() add the offset under the hood:

    pub fn read_u16(&self, pos: usize) -> u16 {
        let buf = self.as_ptr();
        u16::from_be_bytes([buf[self.offset + pos], buf[self.offset + pos + 1]])
    }

whereas this quoted code does the following:

let cell_pointer_array_start = (...does NOT include offset...)
let cell_pointer = cell_pointer_array_start + (idx * 2);
let cell_pointer = self.read_u16(cell_pointer) as usize; // <--key part, offset added implicitly by read_u16

so the offset actually MUST not be added here because then it would be double-added :)

there might be a way to refactor this to make it less footgun-y

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'm blindly guessing that your thought process when naming pub fn cell_get_raw_pointer_region() was that 'raw' means that it includes the offset

Copy link
Collaborator Author

@jussisaurio jussisaurio Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still thinking there is a 100-byte offset related error somewhere that is causing CREATE TABLE errors

edit: i no longer think this is necessarily the case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be very intentional about commenting/variable naming when we are dealing with values that do or do not include self.offset (which is always 0 except for the rootpage for which it is 100 due to the db header), because otherwise it gets very confusing. i don't have a great solution for this yet

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be very intentional about commenting/variable naming when we are dealing with values that do or do not include self.offset (which is always 0 except for the rootpage for which it is 100 due to the db header), because otherwise it gets very confusing. i don't have a great solution for this yet

I'm okay with that, I feel like we should strive to never use offset but there are cases where it's necessary.

@penberg penberg closed this in 937779b Dec 27, 2024
@penberg penberg deleted the btree-cell-doc-tweaks branch January 5, 2025 18:17
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants