Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

no skip_rows option for Excel files #19161

Open
TiburonEl opened this issue Oct 9, 2024 · 3 comments
Open

no skip_rows option for Excel files #19161

TiburonEl opened this issue Oct 9, 2024 · 3 comments
Labels
A-io-excel Area: reading/writing Excel files enhancement New feature or an improvement of an existing feature

Comments

@TiburonEl
Copy link

Description

I'm having trouble importing an Excel file where the first few rows contain merged cells. I considered skipping these rows during import, but unfortunately, there is no skip_rows option available for Excel files in Polars (only for CSV files).

Is there a way to skip these rows? If not, could this functionality be added in a future release?

@TiburonEl TiburonEl added the enhancement New feature or an improvement of an existing feature label Oct 9, 2024
@cmdlineluser
Copy link
Contributor

read_options={"skip_rows": ...} will get passed to the underlying engine1.

Footnotes

  1. http://fastexcel.toucantoco.dev/fastexcel.html#ExcelReader.load_sheet_by_name

@TiburonEl
Copy link
Author

TiburonEl commented Oct 9, 2024

From what I understand, this parameter can only be an integer, meaning I can skip just one row. What if I want to skip multiple rows? Imagine I was to skip row 1,2 and 5th

@avimallu
Copy link
Contributor

skip_rows is typically intended to ignore a certain number of rows before reading a full table in, because of the Excel file author's formatting preferences.

The operation you're describing is more of a filter, which is what I think you should be using with something like .filter(~pl.int_range(0, pl.len()).is_in(rows_to_skip).

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-io-excel Area: reading/writing Excel files enhancement New feature or an improvement of an existing feature
Projects
None yet
Development

No branches or pull requests

4 participants