Using Series in DataFrame.astype() does not work correctly #16717

Closed
elDan101 opened this issue Jun 18, 2017 · 2 comments · Fixed by #16725
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions

Comments

@elDan101

Code Sample

import pandas as pd
import numpy as np

df = pd.DataFrame([[1,2],[3,4]], columns=["A", "B"])


df

 	A 	B
0 	1 	2
1 	3 	4


df.astype({"A": np.float64, "B": np.float64})

 	A 	B
0 	1.0 	2.0
1 	3.0 	4.0

df.astype(pd.Series({"A": np.float64, "B": np.float64}))  # passing a Series has no effect: no error and no conversion
 	A 	B
0 	1 	2
1 	3 	4

df.astype(list(pd.Series({"A": np.float64, "B": np.float64})))  # passing an unsupported type (a list) raises an error
[...] TypeError: data type not understood

Problem description

I wanted to cast the columns of a DataFrame using a Series (instead of a dict, as stated in the docs). I found that the dtypes did not change at all and that no error was raised either. Converting the Series to a dict helped (compare the dict call in the code above).
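The workaround mentioned above can be sketched as follows. This is a minimal illustration rather than code from the report: the variable names `dtype_series` and `converted` are made up here, and `Series.to_dict()` is just one way to obtain a plain dict from the Series.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])

# Column name -> target dtype, held in a Series as in the report.
dtype_series = pd.Series({"A": np.float64, "B": np.float64})

# Workaround: hand astype() a plain dict built from the Series,
# so the documented dict code path is used.
converted = df.astype(dtype_series.to_dict())

print(converted.dtypes)
```

Because the dict is built from the Series right before the call, the mapping of column names to dtypes still lives in one place.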

Expected Output

I would expect either the same error that is raised when an unsupported type (e.g. a list) is passed, or the same behaviour as with a dict. At the moment nothing happens.

pandas: 0.20.2
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.0
scipy: None
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

elDan101 changed the title from "Using Series in astype() does not work" to "Using Series in DataFrame.astype() does not work correctly" Jun 18, 2017
@TomAugspurger
Contributor

I think .astype(series[name, dtype]) should work just like a dictionary.

We try isinstance(dtype, collections.Mapping), which is False for a Series (it should maybe be True, see #12056; we aren't 100% compatible with the interface though, since .values isn't callable). Alternatively, we could replace that isinstance check with is_dict_like and it'll work. Interested in submitting a PR?
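The difference between the two checks can be seen in a small sketch. It makes two assumptions that are not part of the comment above: `Mapping` is imported from `collections.abc` (its location in current Python versions), and `is_dict_like` is taken from the public `pandas.api.types` module. The variable names are illustrative only.

```python
from collections.abc import Mapping

import numpy as np
import pandas as pd
from pandas.api.types import is_dict_like

dtype_dict = {"A": np.float64, "B": np.float64}
dtype_series = pd.Series(dtype_dict)

# The strict ABC check accepts the dict but rejects the Series.
dict_is_mapping = isinstance(dtype_dict, Mapping)
series_is_mapping = isinstance(dtype_series, Mapping)

# The duck-typed check accepts both the dict and the Series,
# while still rejecting a plain list of dtypes.
dict_is_dict_like = is_dict_like(dtype_dict)
series_is_dict_like = is_dict_like(dtype_series)
list_is_dict_like = is_dict_like(list(dtype_series))

print(dict_is_mapping, series_is_mapping)
print(dict_is_dict_like, series_is_dict_like, list_is_dict_like)
```

A duck-typed check of this kind is what would let a Series of column names and dtypes take the same code path as a dict.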

TomAugspurger added this to the Next Major Release milestone Jun 18, 2017
TomAugspurger added the Bug, Difficulty Novice, and Dtype Conversions (Unexpected or buggy dtype conversions) labels Jun 18, 2017
@elDan101
Author

> Interested in submitting a PR?

I am too busy at the moment and will be for the near future. But if the issue remains open and I find some time, I will look at it.
