Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add a --limit option to read-table-multi-threaded #297

Merged

Conversation

nicklan
Copy link
Collaborator

@nicklan nicklan commented Aug 2, 2024

Arrow printing is actually really slow, so for a big table it's useful to be able to just print a few rows and the total count, for testing, or to get a sense of how fast kernel

Copy link
Collaborator

@zachschuermann zachschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! was trying to come up with a way to simplify that section going through the record batches and limiting/truncating/counting rows but couldn't come up with anything else


/// Limit to printing only LIMIT rows.
#[arg(short, long)]
limit: Option<usize>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity why usize instead of u64?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i suppose slice/take/etc for iterators take usize. interesting though since i feel like this communicates length but i always associate usize with indexes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. I had it as a u32, but then I'd just have to cast it below, so I figured make it a usize here

Copy link
Collaborator

@azdavis azdavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one unrelated q

return Ok((Box::new(store), path));
Ok((Box::new(store), path))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have linting in ci to make committing this impossible in the first place?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do, we run rustfmt and clippy. surprised clippy didn't catch this..?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reminding me! It's because our CI didn't run clippy with --all-features so it wasn't checking this (it's behind the cloud feature.

I've updated the workflow file now to run with --all-features. Hopefully that's uncontroversial :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice thanks :)

@nicklan nicklan merged commit 2b64876 into delta-io:main Aug 2, 2024
9 checks passed
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants