Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Convergence of bootstrapping results, nan values for small sample sizes #7

Open
privong opened this issue Jun 22, 2020 · 0 comments
Open

Comments

@privong
Copy link
Owner

privong commented Jun 22, 2020

If the sample size is small (or the number of bootstraps is large), the correlation coefficients can be undefined and return nan values. The use of np.percentile() then returns nan from pymccorrelation(). If there's many nan values this probably suggests the bootstrapping is not well-converged. When looking at the mock dataset to check recovery (#4), the convergence of bootstrapping would be good to consider.

Ultimately, decide if nanpercentile() should be used, optionally with a warning if the size of the dataset is too small for reliable bootstrap error estimation.

There is probably statistics literature about this too...

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant