Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add support for discrete variables in plot_bpv #1379

Merged
merged 16 commits into from
Sep 23, 2020
Merged

Conversation

aloctavodia
Copy link
Contributor

@aloctavodia aloctavodia commented Sep 11, 2020

Fix plot_bpv to work with discrete variables. A similar fix could be applied to plot_loopit

It also add an option (default to false) to plot the mean square error between the uniform distribution and marginal p_value

Old
bpv_p_old
bpv_u_old

New
bpv_p_new
bpv_u_new

data =np.random.binomial(n=1, p=0.5, size=1000)
with pm.Model() as momo:
p = pm.Beta("p", 1., 1.)

bpv_binary_p_00
bpv_binary_u_00

data = np.random.binomial(n=1, p=0.5, size=50)
with pm.Model() as momo:
p = pm.Beta("p", 1., 1.)

bpv_binary_p_00
bpv_binary_u_00

data = np.random.binomial(n=1, p=0.5, size=100)
with pm.Model() as momo:
p = pm.Beta("p", 1., 1.)

bpv_binary_p_00
bpv_binary_u_00

data =np.random.binomial(n=1, p=0.5, size=100)
with pm.Model() as momo:
p = pm.Beta("p", 100., 1.)

bpv_binary_p_00
bpv_binary_u_00

data =np.random.binomial(n=1, p=0.8, size=100)
with pm.Model() as momo:
p = pm.Beta("p", 50, 50)
bpv_binary_p_00
bpv_binary_u_00

data = np.random.poisson(2.5, size=1000)
with pm.Model() as momo:
p = pm.Uniform('p', 0, 10)

bpv_binary_p_00
bpv_binary_u_00

data = np.random.poisson(2.5, size=1000)
with pm.Model() as momo:
p = pm.Uniform('p', 3, 10)

bpv_binary_p_00
bpv_binary_u_00

data = np.random.poisson(2.5, size=1000)
data[:100] = 4
with pm.Model() as momo:
p = pm.Uniform('p', 0, 10)

bpv_binary_p_00
bpv_binary_u_00

The two following

examples are not affected by this PR, but I add them here as reference.

normal model sample size= 1000
bpv_binary_p_00
bpv_binary_u_00

normal model sample size= 100

bpv_binary_p_00
bpv_binary_u_00

@codecov
Copy link

codecov bot commented Sep 11, 2020

Codecov Report

Merging #1379 into master will decrease coverage by 0.09%.
The diff coverage is 90.90%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1379      +/-   ##
==========================================
- Coverage   91.72%   91.63%   -0.10%     
==========================================
  Files         105      105              
  Lines       10941    10965      +24     
==========================================
+ Hits        10036    10048      +12     
- Misses        905      917      +12     
Impacted Files Coverage Δ
arviz/plots/bpvplot.py 79.24% <ø> (ø)
arviz/plots/backends/matplotlib/bpvplot.py 82.88% <90.00%> (+0.40%) ⬆️
arviz/plots/backends/bokeh/bpvplot.py 83.17% <92.30%> (-0.33%) ⬇️
arviz/plots/plot_utils.py 91.37% <0.00%> (-3.45%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0339151...11615e0. Read the comment docs.

@OriolAbril
Copy link
Member

I understood that the current loo-pit approximation is not appliable to discrete data. I had this in mind to implement some day https://doi.org/10.1111/j.1541-0420.2009.01191.x

Disclaimer: I have yet to review the PR, will do it soon

@aloctavodia
Copy link
Contributor Author

Right, loo-pit is not applicable to discrete data for the same reasons plot_bpv isn't . I will check that paper, thanks!

@aloctavodia aloctavodia changed the title fix plot_bpv for discrete variables [WIP] fix plot_bpv for discrete variables Sep 11, 2020
@aloctavodia
Copy link
Contributor Author

It seems I need to think a little bit more about this.

@aloctavodia aloctavodia changed the title [WIP] fix plot_bpv for discrete variables Fix plot_bpv for discrete variables Sep 17, 2020
Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge after fixing bokeh backend

tstat_pit_dens.size,
n_ref,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is also needed in bokeh backend

@OriolAbril OriolAbril changed the title Fix plot_bpv for discrete variables Add support for discrete variables in plot_bpv Sep 23, 2020
@@ -83,6 +88,16 @@ def plot_bpv(
obs_vals = obs_vals.flatten()
pp_vals = pp_vals.reshape(total_pp_samples, -1)

print(obs_vals.shape, pp_vals.shape)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to keep the users informed

@ahartikainen
Copy link
Contributor

There are some duplication, but I think we can move things to the main function maybe in another PR when we have time?

fake = {"a": np.random.poisson(2.5, 1000)}
fake_model = from_dict(posterior_predictive=fake, observed_data=fake)
fake_obs = {"a": np.random.poisson(2.5, 100)}
fake_pp = {"a": np.random.poisson(2.5, (10, 100))}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a 2 dim array this will be understood as a (chain, draw) array, you should add a dimension at the beggining.

Comment on lines 89 to 90
if pp_vals.ndim > 2:
pp_vals = pp_vals.reshape(total_pp_samples, -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will break the ArviZ shape convention and interpret the draw dimension as the observations.

@OriolAbril
Copy link
Member

Will merge once/if tests pass and release

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants