Improve documentation for sampling without replacement #1210
Comments
Hi there, and thanks for pointing out the poor state of the docs.
So if I understand correctly this issue is about:
|
How did you even find |
The |
I hope you'll excuse my ignorance, but whether a set of samples is drawn with constant weights and whether it is drawn with replacement seem orthogonal to me. I'm very curious to hear why I'm wrong. But I certainly would not have searched for "weighted" when trying to perform a uniform sample without replacement. Or is the idea that I should basically implement |
When drawing without replacement, the probability of each value being sampled is affected by every value already drawn. There are two ways of simulating this: (1) maintaining a list of "taken" values and using rejection sampling (repeating the draw if the sample is already "taken"), or (2) maintaining a representation of the probability of each remaining value being sampled. Method (2) is essentially weighted (or biased) sampling. Thus, in the abstract, the concepts might be considered orthogonal, but when simulating such sampling, they are related.
This is method (2), and works fine where the pool of available values is small, but not so well if e.g. there are a million possible values and you only wish to sample 1000 of them.
But let me ask: is there some particular thing you're still struggling to understand how to implement, or are you merely suggesting parts of the documentation that might be improved? Because I have a suspicion that the documentation on
If you wish to modify the distribution you are sampling from with each sample taken, that is not supported by the |
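For readers comparing the two approaches described above, here is a minimal std-only sketch of both for the uniform case. It is an illustration under stated assumptions, not the `rand` crate's implementation: `XorShift64`, `sample_by_rejection` and `sample_from_pool` are hypothetical names invented for this example, and the toy generator only stands in for a real RNG so the snippet has no dependencies.

```rust
use std::collections::HashSet;

/// Tiny xorshift64 generator: a hypothetical stand-in for a real RNG
/// (not part of the `rand` crate), used only to keep this sketch std-only.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Index in `0..n` (slightly biased by the modulo; acceptable for a sketch).
    fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Method (1): rejection sampling. Remember which indices are taken and
/// redraw on a collision. Cheap when `amount` is much smaller than `length`.
fn sample_by_rejection(rng: &mut XorShift64, length: usize, amount: usize) -> Vec<usize> {
    assert!(amount <= length);
    let mut taken = HashSet::with_capacity(amount);
    let mut out = Vec::with_capacity(amount);
    while out.len() < amount {
        let i = rng.below(length);
        // `insert` returns false if `i` was already drawn, so we just retry.
        if taken.insert(i) {
            out.push(i);
        }
    }
    out
}

/// Method (2), uniform special case: keep an explicit pool of remaining
/// values and move each drawn value out of it (a partial Fisher-Yates
/// shuffle). Needs O(length) memory, so it suits small pools.
fn sample_from_pool(rng: &mut XorShift64, length: usize, amount: usize) -> Vec<usize> {
    assert!(amount <= length);
    let mut pool: Vec<usize> = (0..length).collect();
    for i in 0..amount {
        // Pick uniformly among the values not yet drawn (positions i..length).
        let j = i + rng.below(length - i);
        pool.swap(i, j);
    }
    pool.truncate(amount);
    pool
}
```

Drawing 1000 indices out of a million with `sample_by_rejection` allocates only for the 1000 results, whereas `sample_from_pool` materialises all million candidates first; that is the trade-off the comment above points at. Either function returns indices, which can then be mapped onto the values of interest.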
@OliverEvans96 |
Hello,
I see that sampling without replacement was requested in #596 and implemented in #1013, however the documentation is not very helpful to someone trying to figure out how to use this feature.
So it took me 6 attempts to find an explanation, which is still only sampling indices, not sampling from a distribution.
It would be really awesome if a search for "replacement" in both the docs and the book returned a clear explanation with a usage example.
Thanks!
Oliver