Add support for discrete variables in plot_bpv #1379

aloctavodia · 2020-09-11T18:44:36Z

Fix plot_bpv to work with discrete variables. A similar fix could be applied to plot_loopit

It also adds an option (defaulting to False) to plot the mean square error between the uniform distribution and the marginal p_value distribution.
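For reference, a minimal sketch of what such a mean square error could look like. This is not the code in the PR: the helper name `pvalue_density_mse` is hypothetical, and a histogram stands in for whatever density estimate the plot actually uses.

```python
import numpy as np


def pvalue_density_mse(p_values, bins=20):
    """Mean square error between a histogram density of p-values and Uniform(0, 1).

    Hypothetical helper for illustration only; not part of ArviZ.
    """
    density, _ = np.histogram(p_values, bins=bins, range=(0.0, 1.0), density=True)
    # The Uniform(0, 1) density equals 1 everywhere on [0, 1].
    return float(np.mean((density - 1.0) ** 2))


rng = np.random.default_rng(0)
uniform_like = rng.uniform(size=10_000)   # p-values from a well-calibrated model
skewed = rng.beta(5.0, 1.0, size=10_000)  # p-values piled up near 1

mse_uniform = pvalue_density_mse(uniform_like)
mse_skewed = pvalue_density_mse(skewed)
print(mse_uniform, mse_skewed)
```

A value close to zero means the p-value density is close to uniform; larger values indicate a larger departure.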

Old

New

data = np.random.binomial(n=1, p=0.5, size=1000)
with pm.Model() as momo:
    p = pm.Beta("p", 1., 1.)

data = np.random.binomial(n=1, p=0.5, size=50)
with pm.Model() as momo:
    p = pm.Beta("p", 1., 1.)

data = np.random.binomial(n=1, p=0.5, size=100)
with pm.Model() as momo:
    p = pm.Beta("p", 1., 1.)

data = np.random.binomial(n=1, p=0.5, size=100)
with pm.Model() as momo:
    p = pm.Beta("p", 100., 1.)

data = np.random.binomial(n=1, p=0.8, size=100)
with pm.Model() as momo:
    p = pm.Beta("p", 50, 50)

data = np.random.poisson(2.5, size=1000)
with pm.Model() as momo:
    p = pm.Uniform('p', 0, 10)

data = np.random.poisson(2.5, size=1000)
with pm.Model() as momo:
    p = pm.Uniform('p', 3, 10)

data = np.random.poisson(2.5, size=1000)
data[:100] = 4
with pm.Model() as momo:
    p = pm.Uniform('p', 0, 10)

The two following examples are not affected by this PR, but I add them here for reference.

Normal model, sample size = 1000

Normal model, sample size = 100

codecov · 2020-09-11T19:01:09Z

Codecov Report

Merging #1379 into master will decrease coverage by 0.09%.
The diff coverage is 90.90%.

@@            Coverage Diff             @@
##           master    #1379      +/-   ##
==========================================
- Coverage   91.72%   91.63%   -0.10%     
==========================================
  Files         105      105              
  Lines       10941    10965      +24     
==========================================
+ Hits        10036    10048      +12     
- Misses        905      917      +12

| Impacted Files | Coverage Δ | |
|---|---|---|
| arviz/plots/bpvplot.py | `79.24% <ø> (ø)` | |
| arviz/plots/backends/matplotlib/bpvplot.py | `82.88% <90.00%> (+0.40%)` | ⬆️ |
| arviz/plots/backends/bokeh/bpvplot.py | `83.17% <92.30%> (-0.33%)` | ⬇️ |
| arviz/plots/plot_utils.py | `91.37% <0.00%> (-3.45%)` | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0339151...11615e0.

OriolAbril · 2020-09-11T20:40:25Z

My understanding is that the current loo-pit approximation is not applicable to discrete data. I have had this in mind to implement some day: https://doi.org/10.1111/j.1541-0420.2009.01191.x

Disclaimer: I have yet to review the PR, will do it soon

aloctavodia · 2020-09-11T21:01:42Z

Right, loo-pit is not applicable to discrete data for the same reasons plot_bpv isn't. I will check that paper, thanks!
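To make the shared problem concrete, here is a small sketch under stated assumptions. It uses plain NumPy with simulated Poisson draws standing in for posterior predictive samples, and the variable names are illustrative. With discrete data, ties between observed and predicted values pull the plain marginal p-values away from uniformity even when the model is correct. Randomizing over ties is one standard way to restore uniformity; it is shown here only to illustrate the issue, since the commit log indicates this PR ended up using interpolation instead of randomization.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_draws = 500, 2000

obs = rng.poisson(2.5, size=n_obs)
# Stand-in for posterior predictive draws, shape (draw, observation).
pp = rng.poisson(2.5, size=(n_draws, n_obs))

# Plain marginal p-value: fraction of predictive draws <= the observed value.
# Ties (pp == obs) are counted in full, which biases the values upward.
p_plain = (pp <= obs).mean(axis=0)

# Randomized version: spread the probability mass of ties uniformly.
p_less = (pp < obs).mean(axis=0)
p_equal = (pp == obs).mean(axis=0)
p_random = p_less + rng.uniform(size=n_obs) * p_equal

print(p_plain.mean(), p_random.mean())
```

For a correct model the mean of uniform p-values should be close to 0.5; the plain version sits clearly above that, while the randomized one does not.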

aloctavodia · 2020-09-11T21:35:27Z

It seems I need to think a little bit more about this.

OriolAbril

Will merge after fixing bokeh backend

OriolAbril · 2020-09-23T16:57:15Z

arviz/plots/backends/matplotlib/bpvplot.py

                            tstat_pit_dens.size,
+                            n_ref,


I think this is also needed in the bokeh backend.

ahartikainen · 2020-09-23T17:32:09Z

arviz/plots/backends/matplotlib/bpvplot.py

@@ -83,6 +88,16 @@ def plot_bpv(
        obs_vals = obs_vals.flatten()
        pp_vals = pp_vals.reshape(total_pp_samples, -1)

+        print(obs_vals.shape, pp_vals.shape)


I just want to keep the users informed

ahartikainen · 2020-09-23T17:33:25Z

There is some duplication, but I think we can move things to the main function, maybe in another PR when we have time?

OriolAbril · 2020-09-23T19:09:40Z

arviz/tests/base_tests/test_plots_bokeh.py

-    fake = {"a": np.random.poisson(2.5, 1000)}
-    fake_model = from_dict(posterior_predictive=fake, observed_data=fake)
+    fake_obs = {"a": np.random.poisson(2.5, 100)}
+    fake_pp = {"a": np.random.poisson(2.5, (10, 100))}


As a 2-dimensional array, this will be understood as a (chain, draw) array. You should add a dimension at the beginning.
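A minimal sketch of the suggested change, using NumPy only. The choice of a single chain is an assumption made for illustration; the fix actually committed to the test may differ.

```python
import numpy as np

rng = np.random.default_rng(2)

fake_obs = {"a": rng.poisson(2.5, 100)}

# Shape (10, 100) would be read as 10 chains x 100 draws of a scalar variable.
pp_2d = rng.poisson(2.5, (10, 100))

# Prepend a chain dimension: 1 chain, 10 draws, 100 observations.
fake_pp = {"a": pp_2d[np.newaxis, ...]}

print(fake_obs["a"].shape, fake_pp["a"].shape)
```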

OriolAbril · 2020-09-23T19:10:27Z

arviz/plots/backends/matplotlib/bpvplot.py

+        if pp_vals.ndim > 2:
+            pp_vals = pp_vals.reshape(total_pp_samples, -1)


I think this will break the ArviZ shape convention and interpret the draw dimension as the observations.
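A small NumPy sketch of the concern, assuming `total_pp_samples` is the number of chains times the number of draws (the helper `flatten_pp` is hypothetical, not the backend code):

```python
import numpy as np


def flatten_pp(pp_vals):
    """Collapse (chain, draw, *obs_dims) into (chain * draw, n_obs).

    Assumes the first two axes are chain and draw.
    """
    total_pp_samples = pp_vals.shape[0] * pp_vals.shape[1]
    return pp_vals.reshape(total_pp_samples, -1)


rng = np.random.default_rng(3)
pp_3d = rng.poisson(2.5, (1, 10, 100))  # 1 chain, 10 draws, 100 observations
pp_2d = rng.poisson(2.5, (10, 100))     # read as 10 chains, 100 draws of a scalar

flat_3d = flatten_pp(pp_3d)  # 10 samples of 100 observations
flat_2d = flatten_pp(pp_2d)  # 1000 samples of a single observation

print(flat_3d.shape, flat_2d.shape)
```

Under the convention, the 2-dimensional input flattens to 1000 samples of one observation. Skipping the reshape when `ndim` is 2 leaves a (10, 100) array instead, which downstream code would treat as 10 samples of 100 observations, so the draw axis ends up playing the role of the observations.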

OriolAbril · 2020-09-23T19:21:27Z

Will merge once/if tests pass and release

aloctavodia changed the title ~~fix plot_bpv for discrete variables~~ [WIP] fix plot_bpv for discrete variables Sep 11, 2020

aloctavodia changed the title ~~[WIP] fix plot_bpv for discrete variables~~ Fix plot_bpv for discrete variables Sep 17, 2020

aloctavodia requested a review from OriolAbril September 21, 2020 15:13

aloctavodia added 6 commits September 22, 2020 08:24

fix plot_bpv for discrete variables

e594039

update changelog

54adcc8

add tests

186b4cd

minor change

1fecc13

add samples for discrete variables

46a24eb

use interpolation instead of randomization

0bba9a5

aloctavodia force-pushed the bpv_discrete branch from 9c8431c to 0bba9a5 Compare September 22, 2020 11:24

OriolAbril reviewed Sep 23, 2020

aloctavodia added 4 commits September 23, 2020 09:52

Merge branch 'master' into bpv_discrete

4a6bb6c

add interpolation to bokeh, fix bug in p_value reference distribution shape

40ad344

add interpolation to bokeh, fix bug in p_value reference distribution shape

11c8d81

fix axis interpolation

13c9dcc

OriolAbril reviewed Sep 23, 2020

OriolAbril changed the title ~~Fix plot_bpv for discrete variables~~ Add support for discrete variables in plot_bpv Sep 23, 2020

aloctavodia added 2 commits September 23, 2020 14:28

fix shape reference distribution bokeh

3c8837b

Merge branch 'master' into bpv_discrete

e35e73b

ahartikainen reviewed Sep 23, 2020

aloctavodia added 2 commits September 23, 2020 14:37

remove print

bffb8a3

fix test

976250c

OriolAbril requested changes Sep 23, 2020

fix tests

6f28af8

OriolAbril approved these changes Sep 23, 2020

remove blank line

11615e0

aloctavodia merged commit 3290926 into master Sep 23, 2020

aloctavodia deleted the bpv_discrete branch September 23, 2020 21:19

aloctavodia mentioned this pull request Sep 25, 2020

Boundary corrected KDE values and flag: ppc_loo_pit_overlay() stan-dev/bayesplot#235

Merged
