The MCBench benchmark suite is designed for quantitative comparisons of Monte Carlo (MC) samples. It offers a standardized method for evaluating MC sample quality and provides researchers and practitioners with a tool for validating, developing, and refining MC sampling algorithms.
For benchmarking, different metrics are applied to point clouds of both independent and identically distributed (IID) samples and correlated samples generated by MC techniques such as Markov Chain Monte Carlo or Nested Sampling. Through repeated comparisons, test statistics of the metrics are gathered, allowing the quality of the MC samples to be evaluated. A variety of target functions with different complexities and dimensionalities are available, providing a versatile platform for testing the capabilities of sampling algorithms.
MCBench is implemented as a Julia package, but users can run external sampling algorithms of their choice on these test functions and input the resulting samples to obtain detailed metrics that quantify the quality of their samples compared to the IID samples generated by MCBench.
Read more about MCBench at https://arxiv.org/abs/2501.03138
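To illustrate the underlying idea with a toy sketch (this is not the MCBench API; `my_mc_samples` is a hypothetical stand-in for a tested sampler's output): a metric is computed on many batches of IID samples to form a reference distribution of test statistics, and the same metric computed on the tested samples is then located within that distribution.

```julia
using Statistics

# Toy metric: the sample mean of a batch.
metric(x) = mean(x)

# Reference distribution of the metric over 100 IID batches from the target.
reference = [metric(randn(1_000)) for _ in 1:100]

# Hypothetical output of the sampler under test.
my_mc_samples = randn(1_000)

# Standardized deviation of the tested value from the IID reference.
z = (metric(my_mc_samples) - mean(reference)) / std(reference)
```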
To use MCBench you need a Julia installation; we recommend Julia 1.10 or later. Install MCBench via the Julia package manager by running:

```julia
using Pkg
pkg"add https://github.com/tudo-physik-e4/MCBench"
```
- Pick a test case from the list of available target functions.
- Implement these functions in the sampling software of your choice. We provide basic implementations of the listed test cases in Julia, Python (to be used with PyMC) and Stan.
- Generate samples of the target functions with the algorithm you want to benchmark. Save the samples as a `.csv` file with `nparameters` columns and `nsamples` rows (see the sketch after this list).
- Use the Julia package `MCBench` to load your samples and benchmark them against IID samples (which are automatically generated by the package).
- See Using MCBench for an example of how to use the `MCBench` package.
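As a minimal sketch of the expected file layout, here is one way to write samples from your own algorithm to CSV using the Julia standard library. The matrix `my_samples` and the file name are placeholders, and a plain comma-separated matrix without a header row is assumed:

```julia
using DelimitedFiles

# Placeholder for your algorithm's output: 10^4 samples of a 3-parameter
# target, arranged as nsamples rows × nparameters columns.
my_samples = randn(10_000, 3)

writedlm("samples_from_my_algorithm.csv", my_samples, ',')
```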
This is a simple example of how to use the MCBench package: define a test case, choose metrics, and point a `FileBasedSampler` at your sample file.

```julia
using MCBench
using Distributions, LinearAlgebra
using ValueShapes, IntervalSets  # for NamedTupleDist and the -10..10 interval syntax

# Target: a 3D standard normal with uniform bounds on each parameter
f = MvNormal(zeros(3), I(3))
bounds = NamedTupleDist(x = [-10..10 for i in 1:3])
Standard_Normal_3D_Uncorrelated = Testcases(f, bounds, 3, "Normal-3D-Uncorrelated")

# Metrics to evaluate
metrics = [marginal_mean(), marginal_variance(), sliced_wasserstein_distance(), maximum_mean_discrepancy()]

# Externally generated samples to be benchmarked
sampler = FileBasedSampler("samples_from_my_algorithm.csv")
```
Evaluate the metrics both

- for the IID samples (IID samples are generated automatically in the background):

  ```julia
  teststatistics_IID = build_teststatistic(Standard_Normal_3D_Uncorrelated, metrics,
      n=100, n_steps=10^5, n_samples=10^5)
  ```

- and for the MC samples to be tested:

  ```julia
  teststatistics_my_samples = build_teststatistic(Standard_Normal_3D_Uncorrelated, metrics,
      n=100, n_steps=10^5, n_samples=10^5, s=sampler)
  ```
- Overview plot of all selected metrics:

  ```julia
  plot_metrics(Standard_Normal_3D_Uncorrelated, metrics, sampler)
  ```

- Individual metrics:

  ```julia
  plot_teststatistic(Standard_Normal_3D_Uncorrelated, marginal_mean(), sampler, nbins=20)
  ```
The following table contains all test cases currently available in the benchmark suite.
When implementing one of these in the MC sampling framework of your choice, you can use the given testpoints to validate your implementation (see the sketch after the table).
We provide example implementations of the listed test cases to be used with Julia, Python, R and Stan.
This table is not yet complete and will be extended.
| Name | Equation | Parameters | Testpoints | Julia | Python | R | Stan |
|---|---|---|---|---|---|---|---|
| Standard Normal 1D | $f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ | $\mu = 0,\ \sigma = 1$ | | ✅ | ✅ | ✅ | ✅ |
| Standard Normal 2D Uncorrelated | $f(\mathbf{x}) = \frac{1}{2\pi}\, e^{-\frac{1}{2}\mathbf{x}^\top \mathbf{x}}$ | $\boldsymbol{\mu} = \mathbf{0},\ \Sigma = I_2$ | | ✅ | ✅ | ✅ | ✅ |
| Standard Normal 3D Uncorrelated | $f(\mathbf{x}) = (2\pi)^{-3/2}\, e^{-\frac{1}{2}\mathbf{x}^\top \mathbf{x}}$ | $\boldsymbol{\mu} = \mathbf{0},\ \Sigma = I_3$ | | ✅ | ✅ | ✅ | ✅ |
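As a minimal validation sketch for the 1D standard normal case: evaluate your own implementation of the density at a testpoint and compare it against the known target. Here `my_density` is a hypothetical placeholder for your implementation, and the testpoint value stands in for the values given in the table:

```julia
using Distributions

# Hypothetical testpoint; substitute the values given in the table above.
x_test = 1.0

# Reference value from the known target density.
expected = pdf(Normal(0, 1), x_test)

# my_density is a placeholder for your own implementation of the test case.
@assert isapprox(my_density(x_test), expected; rtol=1e-10)
```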
The following metrics are available to compare custom-generated MC samples to IID samples.

- Marginal mean: `marginal_mean()`
- Marginal variance: `marginal_variance()`
- Global mode: `global_mean()`
- Marginal mode: `marginal_mode()`
- Marginal skewness: `marginal_skewness()`
- Marginal kurtosis: `marginal_kurtosis()`
- Chi-squared: `chi_squared()`
- Sliced Wasserstein Distance: `sliced_wasserstein_distance()`
- Maximum Mean Discrepancy: `maximum_mean_discrepancy()`
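Any subset of these metrics can be passed to the same workflow shown above. As a brief sketch, reusing the test case and sampler defined in the usage example, benchmarking only the sliced Wasserstein distance might look like this:

```julia
# Reuses Standard_Normal_3D_Uncorrelated and sampler from the usage example above.
swd = [sliced_wasserstein_distance()]

# Reference test statistics from IID samples, then from the tested MC samples.
build_teststatistic(Standard_Normal_3D_Uncorrelated, swd,
    n=100, n_steps=10^5, n_samples=10^5)
build_teststatistic(Standard_Normal_3D_Uncorrelated, swd,
    n=100, n_steps=10^5, n_samples=10^5, s=sampler)

# Compare the resulting distributions for this single metric.
plot_teststatistic(Standard_Normal_3D_Uncorrelated, sliced_wasserstein_distance(), sampler, nbins=20)
```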