"Could not determine dtype for column X, falling back to string" cluttering console log #326

Open
darrylthom opened this issue Jan 22, 2025 · 5 comments

Comments

@darrylthom

darrylthom commented Jan 22, 2025

I am using Polars' read_excel, which interfaces with the fastexcel wrapper. When a column in my Excel sheet contains only nulls, I get this message with no way to stop it. These messages did not seem to appear in the past.

A use case would be a Comments column in a master data worksheet that I have not needed to use yet. An empty column there is entirely expected, but now I get warnings about falling back to string that clutter the console log and may lead a dev to think the code is producing errors.

Additional context: pola-rs/polars#20832 (comment)

polars-errors.zip

@darrylthom changed the title from Message: "falling back to string" to "Could not determine dtype for column X, falling back to string" cluttering console log on Jan 22, 2025
@artupi

artupi commented Jan 27, 2025

Hi,
I had the same message on one of the computers and no message on the others. After comparing versions, I downgraded fastexcel from 0.12.1 => 0.12.0 and messages disappeared.
Artur

@lukapeschke
Collaborator

Hello, and thank you for the reproduction case 🙂 There seem to be two different issues here:

  1. There's a log message you'd like not to have. For that, you could either change the logger's level, or completely disable it:

    import logging

    # This will disable all messages with a level <= WARNING
    logging.getLogger('fastexcel.types.dtype').setLevel(logging.ERROR)

    # This will completely disable the logger
    logging.getLogger('fastexcel.types.dtype').disabled = True

  2. `fastexcel` tries to determine the dtype for a column that is **not** included in `use_columns`. This indeed seems to be a bug, I'll look into it. Tracked via #327
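
For the first point, if the message should only be hidden around specific reads rather than for the whole process, something along these lines might work. This is only a sketch using the standard library: `quiet_logger` is a hypothetical helper name, not part of fastexcel or Polars, and the only thing taken from this thread is the logger name `fastexcel.types.dtype`.

```python
import logging
from contextlib import contextmanager


@contextmanager
def quiet_logger(name="fastexcel.types.dtype", level=logging.ERROR):
    """Temporarily raise a logger's level, then restore the previous one.

    `quiet_logger` is a hypothetical helper; only the logger name comes
    from the discussion above.
    """
    logger = logging.getLogger(name)
    previous_level = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore the level even if the wrapped code raises.
        logger.setLevel(previous_level)
```

It would then be used as `with quiet_logger(): ...` around the call that reads the workbook, so warnings from that logger stay visible everywhere else.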

@bzm3r

bzm3r commented Feb 10, 2025

@lukapeschke I'd have to introduce logging.getLogger('fastexcel.types.dtype').setLevel(logging.ERROR) statements in a lot of places (e.g. all the notebooks/top-level script files where I will eventually do something that uses FastExcel), right? If I am understanding correctly, that would be quite painful...

@lukapeschke
Collaborator

Hello @darrylthom, a fix for this issue was just merged 🙂 You should now only get the warning for columns that are included in use_columns and whose dtype was not specified or could not be determined. Could you please try again? Windows wheels built from the main branch are available at https://github.com/ToucanToco/fastexcel/actions/runs/13415490754/artifacts/2616948546

@lukapeschke
Collaborator

@bzm3r Python's logging module can also be configured through shared config files: https://docs.python.org/3/library/logging.config.html#logging.config.fileConfig .
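
As a rough illustration of the shared-configuration idea, here is a sketch that uses `logging.config.dictConfig` (the dictionary-based sibling of the `fileConfig` function linked above, swapped in here because it needs no separate file). The module name `project_logging.py` and the function `configure_logging` are assumptions made up for the example; notebooks and scripts would import the module once instead of repeating the `setLevel` call.

```python
# project_logging.py (hypothetical shared module)
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    # Keep loggers that were already created by imported libraries.
    "disable_existing_loggers": False,
    "loggers": {
        # Only messages at ERROR or above pass for this logger.
        "fastexcel.types.dtype": {"level": "ERROR"},
    },
}


def configure_logging():
    """Apply the shared logging configuration (hypothetical helper)."""
    logging.config.dictConfig(LOGGING_CONFIG)


configure_logging()
```

With this layout, a single `import project_logging` at the top of each entry point would apply the same setting everywhere.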

But with the recently merged fix, you should get fewer warnings if you're using use_columns or if you're specifying the dtypes for the different columns in the file.

If you want to try the latest fastexcel version out, wheels are available here:
