Document a Bayesian approach to automated V&V #1382

Merged 7 commits on Nov 7, 2023

@zmbc (Collaborator) commented Oct 30, 2023

Using Bayesian instead of frequentist hypothesis testing.

Code implementing the statistics and applying this method to domestic migration and immigration is here: ihmeuw/vivarium_census_prl_synth_pop#333

@zmbc added the "meta modeling strategy" label (Docs not related to a single project in particular) on Oct 30, 2023
one for each of the values we want to check in the simulation.
- In these hypothesis tests, the null hypothesis is that the simulation value matches the V&V target;
+ In these hypothesis tests, the null hypothesis is that the simulation value comes from our V&V target distribution
and the alternative hypothesis is that it comes from a prior distribution of bugs/errors;
Member

There is something philosophically interesting here... the alternative hypothesis is that the prior has bugs/errors and they matter. It is possible that there is a bug but it is not caught by this test. But then is it really a bug?

Collaborator

Hmm, I would say that a bug is still a bug because it's something we don't want in the code, even if it doesn't impact the results. For example, if I used a GBD 2018 value instead of 2019, it's wrong but it might not appear wrong in the outputs. But then yes, we're only testing for bugs that matter.

Collaborator Author

Abie, that might be (arguably) the alternative hypothesis we want, but here I am describing what alternative hypothesis we are actually testing. With how I have currently done this, there is a distribution of rates if there is no bug (specified by the V&V target) and a distribution of rates if there is a bug (currently this prior is always the same). The latter can have mass around or at the correct values, which represents the situation you are describing -- a bug that is accidentally right. We still include that as part of the alternative hypothesis.
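A minimal sketch of the comparison described in this comment, not the code from the linked PR. It assumes the checked value is a count of `k` events among `n` simulants, that the V&V target is expressed as a Beta distribution over the rate, and that the bug prior is uniform on [0, 1]. The function names and all numbers are hypothetical.

```python
from math import lgamma


def log_beta(a: float, b: float) -> float:
    """Log of the Beta function B(a, b), computed via log-gamma for stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(k: int, n: int, a: float, b: float) -> float:
    """Log marginal likelihood of k events in n trials when the rate is Beta(a, b).

    The log C(n, k) term is omitted: it is identical under both hypotheses
    and cancels in the Bayes factor.
    """
    return log_beta(k + a, n - k + b) - log_beta(a, b)


def log_bayes_factor(k, n, target=(50.0, 950.0), bug_prior=(1.0, 1.0)) -> float:
    """Log Bayes factor of 'bug' over 'no bug'; positive values favor a bug."""
    return log_marginal(k, n, *bug_prior) - log_marginal(k, n, *target)


# Hypothetical check: 1000 simulants, assumed target rate centered near 5%.
print(log_bayes_factor(50, 1000))   # observed rate at the target
print(log_bayes_factor(200, 1000))  # observed rate far from the target
```

With these assumed numbers, the first call gives a negative value (evidence for the target distribution) and the second a positive one (evidence for the bug prior). The uniform bug prior still puts mass at the correct rate, which corresponds to the "accidentally right" case.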

@zmbc merged commit a0fcf17 into main on Nov 7, 2023
2 checks passed
@zmbc deleted the automated_v_and_v_bayesian branch on November 7, 2023 at 16:54
4 participants