
BUG: scatter_matrix not working with subplots #45885


Open
3 tasks done
ghost opened this issue Feb 9, 2022 · 2 comments

ghost commented Feb 9, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
from pandas.plotting import scatter_matrix
from pandas.plotting import lag_plot
import matplotlib.pyplot as plt

df = pd.DataFrame(dict(
        a=[0, 1, 2, 3, 4, 5], 
        b=[1, 2, 7, 8, 9,10], 
        c=[0, 3, 4, 7, 9, 11], 
        d=[8, 3, 2, 7, 10,13])
    )

fig, ax = plt.subplots(1,2)
ax[0].plot(df.loc[:,'a'], df.loc[:,'b'])
scatter_matrix(df.corr(), diagonal='kde', ax=ax[1])
#lag_plot(df.loc[:,'b'], ax=ax[1])
print(pd.__version__)
plt.show()

Issue Description

There is a bug in scatter_matrix when it is used with subplots and the target Axes is passed through the ax parameter. I checked other functions in pandas.plotting (e.g. lag_plot, shown above as a commented-out line) and they work as expected. scatter_matrix, however, emits the warning "UserWarning: To output multiple subplots, the figure containing the passed axes is being cleared." and clears the whole figure, including the first subplot.

The previous code outputs a single plot as described in the warning:

[Image: Figure_1]
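The behaviour described above can be checked without looking at the plot. The following is a minimal sketch, not part of the original report: it records the warning, then inspects the figure afterwards. It assumes the `Agg` backend so it runs without a display, and it uses `diagonal="hist"` on `df` itself (rather than `diagonal='kde'` on `df.corr()`) only to avoid needing SciPy.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

df = pd.DataFrame({
    "a": [0, 1, 2, 3, 4, 5],
    "b": [1, 2, 7, 8, 9, 10],
    "c": [0, 3, 4, 7, 9, 11],
    "d": [8, 3, 2, 7, 10, 13],
})

fig, ax = plt.subplots(1, 2)
ax[0].plot(df["a"], df["b"])

# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    grid = scatter_matrix(df, diagonal="hist", ax=ax[1])

messages = [str(w.message) for w in caught]
cleared = any("being cleared" in m for m in messages)

print("figure-cleared warning seen:", cleared)
print("shape of returned Axes grid:", grid.shape)
print("first subplot still in figure:", ax[0] in fig.axes)
```

scatter_matrix needs one Axes per pair of columns, so a single Axes cannot hold its output. That appears to be why the figure owning the passed Axes is cleared and refilled with a new grid.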

Expected Behavior

The scatter_matrix should be shown in the second subplot, similar to this image in which the output of lag_plot is used instead:

[Image: Figure_1]

Installed Versions

1.4.0

INSTALLED VERSIONS

commit : bb1f651
python : 3.9.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.13.0-27-generic
Version : #29~20.04.1-Ubuntu SMP Fri Jan 14 00:32:30 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.4.0
numpy : 1.21.2
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.0
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
None

@ghost ghost added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Feb 9, 2022
@mroeschke mroeschke added Visualization plotting and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Feb 11, 2022
@mitlabence mitlabence removed their assignment Apr 26, 2022
@jharris99-git

I am experiencing the same issue over a year later.
[Two images attached]

I am trying to create two subplots of scatter matrices, but what comes out is a stretched version of the 2nd scatter matrix:
[Image attached]

I have tried assigning the axes as a tuple of (ax1, ax2) as well as referencing axes[0] and axes[1] after assigning axes.

fig, (ax1, ax2) = plt.subplots(1, 2, tight_layout=True)
pd.plotting.scatter_matrix(..., ax=ax1)
pd.plotting.scatter_matrix(..., ax=ax2)

or

fig, axes = plt.subplots(1, 2, tight_layout=True)
...
pd.plotting.scatter_matrix(..., ax=axes[0])
pd.plotting.scatter_matrix(..., ax=axes[1])

Both snippets are followed by plt.show(), as set up in my imports. Any news on this bug would be great.
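One possible way to get two scatter matrices side by side is to give each one its own subfigure, so that clearing one does not touch the other. This is a sketch under assumptions, not a confirmed fix: `df1` and `df2` are hypothetical stand-ins for the elided data, the `Agg` backend is chosen so the code runs headless, and Matplotlib must be recent enough to support `Figure.subfigures` and clearing a subfigure.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical data; the frames in the comment above are not shown.
df1 = pd.DataFrame({"x": [0, 1, 2, 3, 4, 5],
                    "y": [1, 2, 7, 8, 9, 10],
                    "z": [0, 3, 4, 7, 9, 11]})
df2 = pd.DataFrame({"x": [8, 3, 2, 7, 10, 13],
                    "y": [5, 4, 6, 1, 2, 3],
                    "z": [2, 9, 4, 6, 1, 7]})

fig = plt.figure(figsize=(10, 5))
left, right = fig.subfigures(1, 2)  # one subfigure per scatter matrix

grids = []
for sfig, frame in zip((left, right), (df1, df2)):
    sax = sfig.add_subplot(111)  # placeholder Axes; pandas replaces it with a grid
    with warnings.catch_warnings():
        # The "figure ... is being cleared" warning is expected here:
        # only the subfigure is cleared, not the whole figure.
        warnings.simplefilter("ignore", UserWarning)
        grids.append(scatter_matrix(frame, diagonal="hist", ax=sax))
```

Call `plt.show()` or `fig.savefig(...)` afterwards as usual. The sketch leaves out `tight_layout=True` from the snippets above; whether it combines well with subfigures has not been checked here.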

@mitlabence
Contributor

Clearing fig can be avoided if one adds a subfigure sfig and assigns an axis sax. Then only sfig is cleared:

import pandas as pd
from pandas.plotting import scatter_matrix
from pandas.plotting import lag_plot
import matplotlib.pyplot as plt

df = pd.DataFrame(dict(
        a=[0, 1, 2, 3, 4, 5], 
        b=[1, 2, 7, 8, 9,10], 
        c=[0, 3, 4, 7, 9, 11], 
        d=[8, 3, 2, 7, 10,13])
    )

fig, ax = plt.subplots(1,2)
subplotspec = ax[1].get_subplotspec()
sfig = fig.add_subfigure(subplotspec)
ax[1].remove()
sax = sfig.add_subplot(111)
ax[0].plot(df.loc[:,'a'], df.loc[:,'b'])
scatter_matrix(df.corr(), diagonal='kde', ax=sax)
#lag_plot(df.loc[:,'b'], ax=sax)
plt.show()

[Image: out]
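The steps of this workaround could be wrapped in a small helper so that an existing Axes is swapped for a subfigure in one call. The name `scatter_matrix_on` is hypothetical and not part of pandas. The sketch assumes the `Agg` backend and uses `diagonal="hist"` on `df` to avoid the SciPy dependency of `diagonal='kde'`.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix


def scatter_matrix_on(ax, frame, **kwargs):
    """Hypothetical helper: draw a scatter matrix in the slot occupied by `ax`.

    `ax` is removed and replaced by a subfigure, so only that subfigure
    is cleared by pandas. Returns the grid of Axes from scatter_matrix.
    """
    fig = ax.get_figure()
    spec = ax.get_subplotspec()
    ax.remove()
    sfig = fig.add_subfigure(spec)
    sax = sfig.add_subplot(111)
    with warnings.catch_warnings():
        # Silence the expected "figure ... is being cleared" warning.
        warnings.simplefilter("ignore", UserWarning)
        return scatter_matrix(frame, ax=sax, **kwargs)


df = pd.DataFrame({
    "a": [0, 1, 2, 3, 4, 5],
    "b": [1, 2, 7, 8, 9, 10],
    "c": [0, 3, 4, 7, 9, 11],
    "d": [8, 3, 2, 7, 10, 13],
})

fig, ax = plt.subplots(1, 2)
ax[0].plot(df["a"], df["b"])
grid = scatter_matrix_on(ax[1], df, diagonal="hist")
```

The Axes passed in is no longer usable after the call, so any later styling has to go through the returned grid.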
