smooth_data interpolates observations and predictions differently #2225

Closed
sethaxen opened this issue Mar 21, 2023 · 3 comments · Fixed by #2300

Comments

@sethaxen
Member

Describe the bug
arviz.stats.stats_utils.smooth_data(y, y_hat) smooths y along the first dimension (data dimension) and y_hat along the second dimension (sample dimension).

To Reproduce
Here's an example where we construct y_hat as identical to y in every draw. While y is successfully interpolated between data points, y_hat is unchanged, since interpolation between identical draws does nothing.

In [1]: import arviz as az

In [2]: import numpy as np

In [3]: import scipy.stats

In [4]: y = scipy.stats.binom.rvs(20, 0.5, size=(10))

In [5]: y_hat = np.tile(y.T, (1000, 1)).T

In [6]: y_interp, y_hat_interp = az.stats.stats_utils.smooth_data(y, y_hat)

In [7]: y
Out[7]: array([11,  6, 10,  6, 11,  8,  8, 12,  8,  9])

In [8]: y_interp
Out[8]: 
array([ 9.42237617,  6.32506153,  9.91551262,  6.02439063, 11.02236472,
        8.03842965,  7.88687051, 11.96915287,  8.35219182,  8.17764022])

In [9]: y_hat
Out[9]: 
array([[11, 11, 11, ..., 11, 11, 11],
       [ 6,  6,  6, ...,  6,  6,  6],
       [10, 10, 10, ..., 10, 10, 10],
       ...,
       [12, 12, 12, ..., 12, 12, 12],
       [ 8,  8,  8, ...,  8,  8,  8],
       [ 9,  9,  9, ...,  9,  9,  9]])

In [10]: y_hat_interp
Out[10]: 
array([[11., 11., 11., ..., 11., 11., 11.],
       [ 6.,  6.,  6., ...,  6.,  6.,  6.],
       [10., 10., 10., ..., 10., 10., 10.],
       ...,
       [12., 12., 12., ..., 12., 12., 12.],
       [ 8.,  8.,  8., ...,  8.,  8.,  8.],
       [ 9.,  9.,  9., ...,  9.,  9.,  9.]])

Expected behavior
I'd expect y_interp and y_hat_interp[:, 0] to be identical.

Additional context
arviz v0.16.0.dev0

@sethaxen
Member Author

@aloctavodia IIRC you implemented this smoothing. Have I correctly identified a bug here, or am I misusing the function?

@OriolAbril
Member

There is this reshape: https://github.com/arviz-devs/arviz/blob/main/arviz/plots/backends/matplotlib/bpvplot.py#L87, which I think makes the first dimension the sample dimension.
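
For illustration, here is a minimal sketch of why the axis order matters. It does not call arviz: `smooth_along_axis` is a hypothetical helper, and the cubic spline and the 0.01 to 0.99 evaluation grid are assumptions chosen to resemble the output in the report, not a copy of the library code.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def smooth_along_axis(values, axis):
    """Hypothetical helper: cubic-spline smoothing along one axis."""
    n = values.shape[axis]
    spline = CubicSpline(np.linspace(0, 1, n), values, axis=axis)
    # Evaluate on a slightly shrunk grid so the values actually change.
    return spline(np.linspace(0.01, 0.99, n))


# Observations taken from the report above.
y = np.array([11, 6, 10, 6, 11, 8, 8, 12, 8, 9])
y_smooth = smooth_along_axis(y, axis=0)

# Layout used in the report: (n_data, n_draws). Smoothing along axis 1
# interpolates between identical draws, so nothing changes.
y_hat_data_first = np.tile(y, (1000, 1)).T
unchanged = smooth_along_axis(y_hat_data_first, axis=1)

# Layout with the sample dimension first: (n_draws, n_data). Smoothing along
# axis 1 now runs across data points, matching what is done to y.
y_hat_draws_first = np.tile(y, (1000, 1))
smoothed = smooth_along_axis(y_hat_draws_first, axis=1)

print(np.allclose(unchanged, y_hat_data_first))
print(np.allclose(smoothed[0], y_smooth))
```

If that assumption holds, passing `y_hat` with shape `(n_draws, n_data)` should give the behaviour expected in the report, with the comparison being against `y_hat_interp[0]` rather than `y_hat_interp[:, 0]`.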

@OriolAbril
Member

I am updating the docstring to include the expected shape info.
