BUG: na_values dict form not working on index column #57547
Comments
import io file_contents = """ default_nan_values = set(["NA", "squid"]) try:
Thanks for the report, confirmed on main. Further investigations and PRs to fix are welcome!
take
replacing the
@thomas-intellegens Sorry to bother you, but in the issue post you mention that
In case you remember, was the documentation this one? Otherwise, I cannot find where in the docs such a property is mentioned. Thank you.
Yeah, this was the section I was reading. Many thanks for taking a look at this.
BUG: Na_values dict not working on index column (#57547)
* fix base_parser not setting col_na_values when na_values is a dict containing None
* fix python_parser applying na_values in a column None
* add unit test to test_na_values.py
* update whatsnew
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
I'm trying to read an index column as exact strings while reading the remaining columns as NaN-able numbers or strings. The dict form of na_values seems to be the only way the documentation implies this can be done. However, when I try it, it errors with the message:
This is unhelpful, as the docs imply this should work, and I can't find any other way to turn off NaN detection in the index column without also disabling it in the rest of the table (keeping it there is a hard requirement).
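For anyone with the same requirement on an affected version, one possible workaround is sketched below. It is not taken from this thread: it assumes the index column can be read as an ordinary column first and promoted to the index afterwards. The sample data and the column names `id`, `value` and `label` are hypothetical, as is the helper `read_with_exact_index`; only the `"NA"` and `"squid"` markers come from the example in the report.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names "id", "value" and "label" are
# assumptions made for this sketch, not taken from the report.
CSV_TEXT = """id,value,label
NA,1.5,red
squid,NA,squid
plain,3.0,blue
"""

# Same marker strings as in the report's (truncated) example.
default_nan_values = ["NA", "squid"]


def read_with_exact_index(text, index_name="id"):
    """Read CSV text, keeping `index_name` as literal strings (hypothetical helper)."""
    # Read only the header to learn which columns are not the index.
    header = pd.read_csv(io.StringIO(text), nrows=0)
    data_columns = [c for c in header.columns if c != index_name]

    frame = pd.read_csv(
        io.StringIO(text),
        # NaN markers are listed per data column; the index column is left out.
        na_values={col: default_nan_values for col in data_columns},
        # Without this, the built-in markers (e.g. "NA") would still apply
        # to columns that are missing from the dict.
        keep_default_na=False,
        dtype={index_name: str},
    )
    # Promote the column to the index only after parsing.
    return frame.set_index(index_name)


df = read_with_exact_index(CSV_TEXT)
```

Because `keep_default_na=False` turns off the built-in markers everywhere, any of them that are still wanted in the data columns (the empty string, for example) would have to be added to `default_nan_values` explicitly.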
Expected Behavior
The file should be read without error, producing a DataFrame a bit like the following:
Installed Versions
This has been tested on three versions of pandas (v1.5.2, v2.0.2, and v2.2.0), all with similar results.
pandas : 2.2.0
numpy : 1.26.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 69.0.3
pip : 23.2.1
Cython : None
pytest : 7.4.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : 2.9.9
jinja2 : 3.1.3
IPython : None
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : 0.58.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.11.4
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None