Testing of distributions #357

pitdicker · 2018-03-30T11:49:47Z

We should add some testing for distributions. Unfortunately, since distribution samples are random, this is inherently difficult. The original issue is dhardy#72.

For some direction, see rust-lang/rust#10084 and the random-tests.


vks · 2018-03-30T13:15:37Z

Also see #290.

dhardy · 2018-08-04T10:02:37Z

@vks has some plans for histogram-based testing, though possibly only after PDFs are added.

Separately, we should add some tests of value-stability.

vks · 2018-08-04T10:22:30Z

> @vks has some plans for histogram-based testing, though possibly only after PDFs are added.

See tests/uniformity.rs, which could be generalized to arbitrary distributions (with given PDFs).

dhardy · 2019-11-06T12:48:30Z

@vks could I interest you (or potentially someone else) in adding a utility to rand_distr (perhaps an example) which plots histograms of each distribution from actual samples (not PDFs)? This would at least let us eyeball that distributions are doing roughly the right thing.

vks · 2019-11-06T13:11:02Z

I was thinking about using sparklines for that. I'm interested, but I can't make promises that I'll get to it before the end of the year.
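For illustration, here is a rough sketch of what a sparkline histogram of actual samples could look like. Everything in it is an assumption made for the example: the names `histogram`, `sparkline` and `sample_exp` are hypothetical, and the tiny `SplitMix64` generator is only a stand-in so the snippet has no dependencies. A real utility would draw from `rand` and `rand_distr` instead.

```rust
/// Minimal SplitMix64 generator, used only to keep this sketch
/// dependency-free (assumption: a real utility would use `rand`).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draw `n` samples from Exp(1) by inverse-transform sampling.
fn sample_exp(seed: u64, n: usize) -> Vec<f64> {
    let mut rng = SplitMix64(seed);
    (0..n).map(|_| -(1.0 - rng.next_f64()).ln()).collect()
}

/// Count samples in `buckets` equal-width bins covering [lo, hi);
/// samples outside that range are ignored.
fn histogram(samples: &[f64], lo: f64, hi: f64, buckets: usize) -> Vec<u64> {
    let mut counts = vec![0u64; buckets];
    let width = (hi - lo) / buckets as f64;
    for &x in samples {
        if x >= lo && x < hi {
            let i = (((x - lo) / width) as usize).min(buckets - 1);
            counts[i] += 1;
        }
    }
    counts
}

/// Render counts as a one-line sparkline, scaled to the largest bucket.
fn sparkline(counts: &[u64]) -> String {
    const BARS: [char; 8] = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];
    let max = counts.iter().copied().max().unwrap_or(0);
    counts
        .iter()
        .map(|&c| {
            let level = if max == 0 { 0 } else { (c * 7 / max) as usize };
            BARS[level]
        })
        .collect()
}
```

Calling `sparkline(&histogram(&sample_exp(1, 10_000), 0.0, 4.0, 8))` should give a single line of decaying bars for the exponential case, which is about the level of eyeballing discussed here.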

vks · 2021-05-15T17:31:13Z

Work on resolving this issue started in #1121; more work is needed to extend it to the other distributions.

vks · 2024-04-26T10:20:05Z

I think we should either make this issue more actionable or close it.

For which distributions do we want histogram tests? Sometimes, calculating the PDF is non-trivial and requires special functions.

dhardy · 2024-04-27T06:13:57Z

I suggest the following approach:

1. Pick a representative parameter set (or possibly a small number of parametrisations).
2. Generate a histogram with this parametrisation and manually verify that the result looks reasonable.
3. Potentially lower the number of buckets and samples used by the histogram, and check the result is still similar. (We want tests to be reasonably fast to run.)
4. The test should enforce exact reproduction of this histogram.

This should allow fast and robust testing (within the limits of reproducibility), with the caveat that any value-breaking change must again be visually confirmed.

It should be applicable to all distributions, though may be hard to visualise for multi-dimensional outputs.

It cannot confirm perfect accuracy, especially in degenerate cases.
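The approach above could be sketched roughly as follows. This is only an illustration under stated assumptions: `seeded_histogram` and `check_reference` are hypothetical names, the distribution is Exp(1) sampled by inverse transform, and the small `SplitMix64` generator stands in for a seeded, value-stable RNG from the `rand` ecosystem. No reference values are hard-coded here because they would have to be recorded from a run that was checked visually.

```rust
/// Minimal SplitMix64 generator (assumption: stand-in for a seeded,
/// value-stable RNG; not part of `rand`).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Histogram of `n` Exp(1) samples over [0, hi) with `buckets` equal bins.
/// The same seed always yields the same histogram.
fn seeded_histogram(seed: u64, n: usize, buckets: usize, hi: f64) -> Vec<u64> {
    let mut rng = SplitMix64(seed);
    let mut counts = vec![0u64; buckets];
    for _ in 0..n {
        let x = -(1.0 - rng.next_f64()).ln();
        if x < hi {
            let i = ((x / hi * buckets as f64) as usize).min(buckets - 1);
            counts[i] += 1;
        }
    }
    counts
}

/// Enforce exact reproduction of a stored reference histogram,
/// reporting the first bucket that differs.
fn check_reference(hist: &[u64], reference: &[u64]) -> Result<(), String> {
    if hist.len() != reference.len() {
        return Err(format!("bucket count {} != {}", hist.len(), reference.len()));
    }
    for (i, (h, r)) in hist.iter().zip(reference).enumerate() {
        if h != r {
            return Err(format!("bucket {i}: got {h}, expected {r}"));
        }
    }
    Ok(())
}
```

In a real test the reference would be a constant array committed alongside the test, and it would need to be regenerated and visually re-confirmed after any value-breaking change.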

JamboChen · 2024-09-30T13:22:52Z

I am a university student with a background in statistics and would love to contribute to this repository. I have a few questions regarding this issue. I noticed that histograms are used in the tests for visual inspection by comparing the expected curve with the sample curve.

Would it be possible to use a chi-square test for validation? The process should be similar, but hypothesis testing might provide more convincing results.

I wrote a simple test for this: https://github.com/JamboChen/distr_test. The code is rough, and I'm unsure whether this is the right approach to use here, so I'd appreciate any feedback.
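To make the idea concrete, here is a minimal sketch of a chi-square goodness-of-fit check on binned samples. It is not the code from the linked repository. The names `chi_square_statistic` and `uniform_counts` are hypothetical, the `SplitMix64` generator is a dependency-free stand-in for a `rand` RNG, and the critical value is a hard-coded table entry (df = 9, significance level 0.01) rather than something computed from the chi-square CDF.

```rust
/// Minimal SplitMix64 generator (assumption: stand-in for a `rand` RNG).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Tabulated chi-square critical value for 9 degrees of freedom at
/// significance level 0.01.
const CHI2_CRIT_DF9_A01: f64 = 21.666;

/// Pearson's statistic: sum over bins of (observed - expected)^2 / expected,
/// where expected = total count * bin probability.
fn chi_square_statistic(observed: &[u64], probs: &[f64]) -> f64 {
    assert_eq!(observed.len(), probs.len());
    let n: u64 = observed.iter().sum();
    observed
        .iter()
        .zip(probs)
        .map(|(&o, &p)| {
            let e = n as f64 * p;
            (o as f64 - e).powi(2) / e
        })
        .sum()
}

/// Bin `n` uniform samples on [0, 1) into `bins` equal-width bins.
fn uniform_counts(seed: u64, n: usize, bins: usize) -> Vec<u64> {
    let mut rng = SplitMix64(seed);
    let mut counts = vec![0u64; bins];
    for _ in 0..n {
        let i = ((rng.next_f64() * bins as f64) as usize).min(bins - 1);
        counts[i] += 1;
    }
    counts
}
```

With 10 bins, one would reject the sampler if `chi_square_statistic(&uniform_counts(seed, n, 10), &[0.1; 10])` exceeds `CHI2_CRIT_DF9_A01`. Note that a correct sampler still fails such a check about 1% of the time per seed, so a test suite would probably fix the seed or use a much stricter threshold.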

dhardy · 2024-09-30T14:59:55Z

@JamboChen feel free to take a look at #1494 by @benjamin-lieser implementing a Kolmogorov-Smirnov test.
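For readers unfamiliar with the test, a one-sample Kolmogorov-Smirnov statistic can be sketched as below. This is not the implementation from #1494; the names `ks_statistic`, `ks_critical` and `sample_exp` are hypothetical, the `SplitMix64` generator is a dependency-free stand-in, and the critical value uses the usual large-sample approximation rather than exact tables.

```rust
/// Minimal SplitMix64 generator (assumption: stand-in for a `rand` RNG).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draw `n` samples from Exp(1) by inverse-transform sampling.
fn sample_exp(seed: u64, n: usize) -> Vec<f64> {
    let mut rng = SplitMix64(seed);
    (0..n).map(|_| -(1.0 - rng.next_f64()).ln()).collect()
}

/// Largest distance between the empirical CDF of `samples` and the
/// reference `cdf`, checked just before and just after each sample point.
fn ks_statistic(samples: &[f64], cdf: impl Fn(f64) -> f64) -> f64 {
    let mut xs = samples.to_vec();
    xs.sort_by(|a, b| a.total_cmp(b));
    let n = xs.len() as f64;
    let mut d: f64 = 0.0;
    for (i, &x) in xs.iter().enumerate() {
        let f = cdf(x);
        let above = (i as f64 + 1.0) / n - f; // ECDF after the step
        let below = f - i as f64 / n; // ECDF before the step
        d = d.max(above).max(below);
    }
    d
}

/// Asymptotic critical value sqrt(-ln(alpha / 2) / 2) / sqrt(n);
/// an approximation that is reasonable for large `n`.
fn ks_critical(n: usize, alpha: f64) -> f64 {
    (-(alpha / 2.0).ln() / 2.0).sqrt() / (n as f64).sqrt()
}
```

A sampler would be flagged when `ks_statistic(&samples, cdf)` exceeds `ks_critical(samples.len(), alpha)`, for example with `cdf = |x| 1.0 - (-x).exp()` for Exp(1). Unlike the chi-square approach, this needs the CDF rather than bin probabilities, and in this form it applies to continuous distributions.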

dhardy · 2024-10-17T16:16:55Z

Implemented in #1504 thanks to @JamboChen and @benjamin-lieser.

Failure cases:

Hypergeometric (100, 50, 49): Hypergeo fix #1510
Poisson 1e9
Poisson 1.844e19

dhardy · 2024-11-13T11:39:58Z

We have a PR for the hypergeo issue and #1515 for the Poisson issue, so this can be closed.

pitdicker added the T-distributions label Mar 30, 2018

dhardy mentioned this issue Mar 31, 2018

Testing of distributions dhardy/rand#72

Closed

dhardy mentioned this issue Apr 26, 2018

Implement Bernoulli distribution #411

Merged

dhardy mentioned this issue Aug 4, 2018

Implement triangular distribution #575

Merged

dhardy mentioned this issue Jan 28, 2019

Tracker: Rand 0.7 #715

Closed

22 tasks

dhardy mentioned this issue May 14, 2019

Check usefulness of tests #267

Closed

2 tasks

vks mentioned this issue May 7, 2021

Compare sampled normal distribution to PDF #1121

Merged

saona-raimundo mentioned this issue Apr 24, 2023

ENH: Add the negative binomial distribution to rand_distr. #1296

Closed

dhardy mentioned this issue Apr 25, 2024

Tracker: Closing Old Issues #1432

Closed

6 tasks

dhardy removed the T-distributions label Jul 10, 2024

dhardy mentioned this issue Jul 10, 2024

Optimize Cauchy sampling #493

Closed

dhardy mentioned this issue Oct 17, 2024

Add more Kolmogorov–Smirnov test #1504

Merged

1 task

dhardy assigned JamboChen Oct 17, 2024

dhardy closed this as completed Nov 13, 2024
