BUG: Setting datetime value using dataframe masking appears to transpose the mask #46294

Open
2 of 3 tasks
agural opened this issue Mar 9, 2022 · 1 comment
Labels
Bug, Datetime, Indexing, Missing-data

agural commented Mar 9, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd

df = pd.DataFrame([
    [0, 1],
    [0, 2]],
    index=[0, 0])
print(df)
mask = df > df.index.to_numpy().reshape(-1, 1)
print(mask)
df[mask] = np.nan
print(df)

df = pd.DataFrame([
    pd.to_datetime(["2000", "2001"]),
    pd.to_datetime(["2000", "2002"])],
    index=pd.to_datetime(["2000", "2000"]))
print(df)
mask = df > df.index.to_numpy().reshape(-1, 1)
print(mask)
df[mask] = pd.NaT
print(df)

Issue Description

The above code produces the following output:

   0  1
0  0  1
0  0  2
       0     1
0  False  True
0  False  True
   0   1
0  0 NaN
0  0 NaN
                    0          1
2000-01-01 2000-01-01 2001-01-01
2000-01-01 2000-01-01 2002-01-01
                0     1
2000-01-01  False  True
2000-01-01  False  True
                    0          1
2000-01-01 2000-01-01 2001-01-01
2000-01-01        NaT        NaT

The last DataFrame printed should have the NaT values in the second column, not in the second row. The integer case above it behaves as expected: NaN appears in the second column, matching the mask.
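To make the "transposed mask" reading concrete, here is a minimal NumPy-only sketch. It rebuilds the same values and mask as plain arrays, then compares assignment through `mask` with assignment through `mask.T`. The variable names are illustrative, and the sketch only shows which positions each mask selects; it does not reproduce pandas internals or identify the cause of the bug.

```python
import numpy as np

# Same values and index as the datetime example, as plain NumPy arrays.
values = np.array(
    [["2000-01-01", "2001-01-01"],
     ["2000-01-01", "2002-01-01"]],
    dtype="datetime64[ns]",
)
index = np.array(["2000-01-01", "2000-01-01"], dtype="datetime64[ns]")

# A (2, 1) column vector broadcasts across the columns of each row.
mask = values > index.reshape(-1, 1)

# Boolean assignment with the mask as written.
intended = values.copy()
intended[mask] = np.datetime64("NaT")

# The same assignment with the transposed mask, for comparison.
transposed = values.copy()
transposed[mask.T] = np.datetime64("NaT")

print(np.isnat(intended))    # NaT positions when the mask is applied as given
print(np.isnat(transposed))  # NaT positions when the mask is transposed
```

With the mask as given, NaT fills the second column. With `mask.T`, NaT fills the second row, which is the pattern in the last DataFrame printed above.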

Expected Behavior

For the last dataframe above, I would expect it to look like this:

                    0   1
2000-01-01 2000-01-01 NaT
2000-01-01 2000-01-01 NaT
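A check along the following lines could turn the expectation into a regression test. This is a sketch under stated assumptions: `nat_follows_mask` is a hypothetical helper, not a pandas API, and the `expected` and `observed` frames are built by hand from the outputs shown in this report rather than produced by masked assignment.

```python
import numpy as np
import pandas as pd


def nat_follows_mask(result: pd.DataFrame, mask: pd.DataFrame) -> bool:
    """Hypothetical helper: True if `result` is missing exactly where `mask` is True."""
    return bool(np.array_equal(result.isna().to_numpy(), mask.to_numpy()))


index = pd.to_datetime(["2000", "2000"])
original = pd.DataFrame(
    {0: pd.to_datetime(["2000", "2000"]), 1: pd.to_datetime(["2001", "2002"])},
    index=index,
)
mask = original > original.index.to_numpy().reshape(-1, 1)

# Expected result from this report, constructed directly.
expected = pd.DataFrame(
    {0: pd.to_datetime(["2000", "2000"]), 1: [pd.NaT, pd.NaT]},
    index=index,
)

# Observed result from this report (pandas 1.4.1), constructed directly.
observed = pd.DataFrame(
    [[pd.Timestamp("2000"), pd.Timestamp("2001")], [pd.NaT, pd.NaT]],
    index=index,
)

print(nat_follows_mask(expected, mask))    # expected frame vs. the mask
print(nat_follows_mask(observed, mask))    # observed frame vs. the mask
print(nat_follows_mask(observed, mask.T))  # observed frame vs. the transposed mask
```

In a real test, `observed` would be replaced by the frame obtained from `df[mask] = pd.NaT`, and the test would assert that the helper returns True.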

Installed Versions

INSTALLED VERSIONS

commit : 06d2301
python : 3.8.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-1022-aws
Version : #23~20.04.1-Ubuntu SMP Mon Nov 15 14:03:19 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.4.1
numpy : 1.19.2
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : 0.29.25
pytest : 6.2.5
hypothesis : None
sphinx : 4.4.0
blosc : None
feather : None
xlsxwriter : 3.0.2
lxml.etree : 4.7.1
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.31.1
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fastparquet : None
fsspec : 2022.01.0
gcsfs : None
matplotlib : 3.5.1
numba : 0.54.1
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.6.2
sqlalchemy : 1.4.27
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : None

agural added the Bug and Needs Triage labels on Mar 9, 2022
mroeschke added the Datetime, Indexing, and Missing-data labels and removed the Needs Triage label on Mar 17, 2022
@MarcSocha

take
