Implement multiple header_rows #166
In case it helps, I made this function using python-calamine, which does header_row detection.
Hello @deanm0000
Here's an attached Excel file. I'd expect the output to be (assuming polars)
I find that people will often have Excel sheets where they use multiple header rows to make up their column names.
Here's a snippet of how I deal with that, starting from a df generated with python_calamine.
This is just a snippet and doesn't handle duplicate column names, but that's a separate issue.
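For illustration, the idea of merging several header rows into one set of column names could look roughly like the sketch below. It is not the snippet referred to above. It assumes the sheet has already been read into a plain list of rows (lists of cell values), and the names `combine_header_rows`, `sep`, and the `column_N` fallback are hypothetical. Like the original, it makes no attempt to deduplicate column names.

```python
def combine_header_rows(rows, header_rows=2, sep="_"):
    """Join the first `header_rows` rows column-wise into column names.

    `rows` is assumed to be a list of lists of cell values.
    Returns (names, data_rows). Duplicate names are NOT handled.
    """
    header, data = rows[:header_rows], rows[header_rows:]
    width = max((len(r) for r in rows), default=0)
    names = []
    for col in range(width):
        parts = []
        for r in header:
            cell = r[col] if col < len(r) else None
            text = "" if cell is None else str(cell).strip()
            if text:  # skip blank header cells
                parts.append(text)
        # Hypothetical fallback name when every header cell is blank.
        names.append(sep.join(parts) if parts else f"column_{col}")
    return names, data


rows = [
    ["Sales", "Sales", "Region"],
    ["Q1", "Q2", ""],
    [1, 2, "EU"],
]
names, data = combine_header_rows(rows, header_rows=2)
print(names)
```

The resulting names and data rows can then be passed to whatever DataFrame constructor is in use.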
A more advanced version of this might infer header_rows by skipping down (let's say) 10 rows and looking at the types starting there. It would then choose a column which isn't a string, go back to the true row 0, and see how many rows down it needs to go before it no longer sees strings. That count is the inferred header_rows.
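The inference heuristic described above can be sketched as follows. This is only one possible reading of it, again assuming the sheet is a list of rows. The function name `infer_header_rows`, the `probe_row` parameter, and the choice to treat empty (None) cells as still part of the header are assumptions made for illustration.

```python
def infer_header_rows(rows, probe_row=10):
    """Guess how many leading rows are headers.

    Looks at row `probe_row` (or the last row if the sheet is shorter),
    picks the first column holding a non-string value there, then counts
    from row 0 how many rows in that column are strings (or empty).
    Returns None if no non-string column is found.
    """
    if not rows:
        return 0
    probe = rows[min(probe_row, len(rows) - 1)]
    col = next(
        (i for i, v in enumerate(probe)
         if v is not None and not isinstance(v, str)),
        None,
    )
    if col is None:
        return None  # every probed cell is a string: cannot infer
    count = 0
    for r in rows:
        cell = r[col] if col < len(r) else None
        # Assumption: strings and empty cells still belong to the header.
        if cell is None or isinstance(cell, str):
            count += 1
        else:
            break
    return count


sheet = [["name", "amount"], ["", "usd"]] + [[f"n{i}", float(i)] for i in range(12)]
print(infer_header_rows(sheet))
```

A column that contains strings mixed in with its data, or a sheet shorter than the header itself, would defeat this heuristic, so an explicit header_rows argument would still be needed as an override.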